From: Martin D. <mar...@ge...> - 2008-01-06 19:31:47
Paul Ramsey sent me a dump of our SVN repository. The uncompressed dump size is 2.76 GB. After removing uDig except the required dependencies (the GML module has its history in uDig), the dump size is 1.59 GB. After removing a few (not yet all) of the huge test files and every JAR file, the dump size is 1.43 GB. More test files will be removed later; I'm really just starting the cleanup.

Below are the biggest files ever committed to our SVN history. I mean committed with "svn add", not "svn copy" (otherwise the size in the SVN dump is close to 0). I pasted only the first few files, but there are 73 files bigger than 1 MB and 406 files bigger than 100 kB.

As you can see from this extract, we failed, at least partially, to get people to use "svn copy": the same files are added again and again. When we switched from CVS to SVN, we said very loudly not to use graphical SVN interfaces (no TortoiseSVN, no EclipseSVN, command line only) as they were not good at that time. Apparently we failed to convince people. Hopefully those graphical interfaces are better now, but please check with "svn status" from the command line every time you do an SVN operation that you have never done before.
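
The "svn status" check suggested above can also be scripted. The following is only a minimal sketch, not something from the original message: it assumes the usual "svn status" layout, where column 1 holds the item status and column 4 holds a "+" when the item is scheduled for addition with history (that is, it came from "svn copy"). The function name and the sample paths are hypothetical.

```python
def added_without_history(status_output):
    """Return paths that 'svn status' reports as plain adds (no copy history)."""
    suspicious = []
    for line in status_output.splitlines():
        # Column 1: item status ('A' = scheduled for addition).
        # Column 4: '+' when the addition carries history ("svn copy").
        if len(line) > 4 and line[0] == "A" and line[3] != "+":
            # Assumption: the remaining flag columns are blank for new items,
            # so stripping whitespace leaves just the path.
            suspicious.append(line[4:].strip())
    return suspicious


# Hypothetical "svn status" output, for illustration only.
sample = (
    "A       src/NewFile.java\n"
    "A  +    src/CopiedFile.java\n"
    "M       pom.xml\n"
)
print(added_without_history(sample))
```

Any path this reports that was meant to be a copy of an existing file is worth reverting and redoing with "svn copy" before committing.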

    Size  Filename
--------  -----------------------------------------------------------------------
55474027  geotools/trunk/gt/plugin/geotiff/.../002025_0100_010722_l7_01_utm21.tif
55474027  geotools/branches/geotiff_simone/.../002025_0100_010722_l7_01_utm21.tif
12375769  geotools/trunk/gt/plugin/image/.../po_168213_blu_0000000.tif
 8809581  geotools/branches/coverages_branch/branches/.../test-data/W020N90.zip
 8809581  geotools/branches/coverages_branch/trunk/gt/.../testData/W020N90.zip
 8809581  geotools/branches/coverages_branch/trunk/.../test-data/W020N90.zip
 8809572  geotools/trunk/gt/plugin/gtopo30/test/.../testData/W020N90.zip
 8809572  geotools/trunk/gt/plugin/gtopo30/test/.../testData/W020N90.zip
 7549755  geotools/trunk/spike/arcGrid/test/.../arcgrid_test_data.zip.zip
 7549755  geotools/trunk/spike/arcGrid/test/.../arcgrid_test_data.zip
 7549755  geotools/branches/2.3.x/ext/coverage_dev/.../arcgrid_test_data.zip
 7549746  geotools/branches/2.3.x/ext/coverage_dev/.../arcgrid_test_data.zip
 7549746  geotools/trunk/spike/ecw/test/.../test-data/arcgrid_test_data.zip
 7549746  geotools/branches/2.3.x/ext/coverage_dev.../arcgrid_test_data.zip
 6548376  geotools/trunk/gt/doc/C/output/geotools.ps
 4993783  geotools/branches/coverages_branch/.../test-data/fme/roads/roads.xml
 4993783  geotools/branches/coverages_branch/.../test-data/test1/roads.xml
 4993783  geotools/branches/coverages_branch/.../xml/fme/roads/roads.xml

As a side note, the uDig SVN has big files too, especially JAR files (in fact, when I merge GeoTools and uDig into the same list, most of the huge files, except the first two TIFF files, are in the uDig SVN).

I also have questions about some branches. Below is the total space used by some directories. I included a few tags for comparison, so you can see that "svn copy" has a cost close to zero. I don't know why the GeoTools 2.3 tags consume ~300 kB each; I would find it surprising if changing "2.3-SNAPSHOT" to "2.3.1" alone consumed that much space.
But note also the size of the "2.3" and "coverages_branch" branches.

     Size  Directory
---------  ----------------------------------------------------------------------
254368801  geotools/trunk/gt
148524305  geotools/branches/coverages_branch
 64509158  geotools/branches/2.3.x
 22405164  geotools/branches/2.2.x
 13654188  geotools/branches/2.4.x
 12371022  geotools/branches/2.0.x
 11298431  geotools/branches/2.1.x
   326196  geotools/tags/2.3.5
   319696  geotools/tags/2.3.3
   319503  geotools/tags/2.3.2
   318896  geotools/tags/2.3.1
   258548  geotools/tags/2.3.0
   200867  geotools/tags/2.2.1
   124435  geotools/tags/2.1.0
    28141  geotools/tags/2.2.2
     4492  geotools/tags/2.2.0
        0  geotools/tags/2.1.1
        0  geotools/tags/2.3.4

I suspect (but have not verified) that the 2.3 branch was created using "svn copy", as it should be, but that from that point on a lot of code was merged from trunk by copy-and-paste followed by "svn add" from the Eclipse IDE. For example, the 4 MB EPSG.sql file was "svn added" to the 2.3 branch, not "svn copied" from trunk.

For "coverages_branch", I suspect (but have not verified) that the whole directory was "svn added" rather than "svn copied". If this branch is not needed anymore, I would like to drop it completely given the large amount of space it consumes.

    Martin
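
A listing like the file-size table in this message could be produced by scanning the dump file for files added without copy history. The sketch below is not the tool actually used here. It assumes a standard, non-delta "svnadmin dump" stream, in which each record is a block of "Key: value" headers, a blank line, and then "Content-length" bytes of body. The function names are hypothetical. With a dump created using --deltas, "Text-content-length" would be the delta size rather than the file size.

```python
def scan_dump(stream):
    """Yield (text_size, path) for each file added without copy history.

    `stream` must be a binary file-like object positioned at the start
    of an SVN dump.
    """
    while True:
        line = stream.readline()
        if not line:
            break                      # end of dump
        if not line.strip():
            continue                   # blank separator between records
        # Read one header block, terminated by a blank line (or EOF).
        headers = {}
        while line.strip():
            key, _, value = line.decode("utf-8", "replace").partition(":")
            headers[key.strip()] = value.strip()
            line = stream.readline()
        # Skip the record body so file contents are never parsed as headers.
        stream.read(int(headers.get("Content-length", 0)))
        if (headers.get("Node-action") == "add"
                and headers.get("Node-kind") == "file"
                and "Node-copyfrom-path" not in headers):
            yield int(headers.get("Text-content-length", 0)), headers["Node-path"]


def largest_added(stream, limit=20):
    """Return the `limit` largest plain adds as (size, path), biggest first."""
    return sorted(scan_dump(stream), reverse=True)[:limit]
```

It would be called on a dump opened in binary mode, for example largest_added(open("repo.dump", "rb")), where "repo.dump" is a placeholder file name. Nodes carrying a "Node-copyfrom-path" header are skipped, which matches the observation that "svn copy" costs close to nothing in the dump.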