From: Vladimir G. <vga...@ne...> - 2010-03-01 18:11:13
I have now managed to do a timed run of the migration script on the production server, using a data sample that is about 1/60 of the full data set to be migrated. This took 12 min. Extrapolating, the full migration will take about 12 hours. For reference, running the same sample on treebase-dev takes 21.5 min, almost twice as long, so the migration would indeed be faster on the production server.

There is a chance this could be sped up if the PostgreSQL driver allows larger batch sizes for updates (currently, the batch size is set to 30,000 because of a DB2 limitation). I'll look into that briefly.

Other than that, are we ready to do the production run of the migration? Issues to consider:

- Does http://treebase.peabody.yale.edu/treebase/migration/Dec-09/ indeed contain the dataset we want to migrate?
- Are we ok with running the migration against the instance on treebase-prod?
- Is it necessary to do any data cleaning prior to migration, e.g. running the fixtaxonlabels.pl script Bill mentioned this morning?

--Vladimir
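
P.S. For anyone who wants to redo the extrapolation with a different sample, here is a minimal sketch of the arithmetic. It assumes run time scales linearly with data size, which may not hold exactly (indexes, batch effects), and the function and variable names are made up for illustration; they are not part of the migration script.

```python
# Linear extrapolation of the migration run time from a timed sample run.
# Assumption: run time grows proportionally with the amount of data.

def extrapolate_hours(sample_minutes, scale_factor):
    """Estimate the full run time in hours.

    sample_minutes: measured run time of the sample, in minutes
    scale_factor:   how many times larger the full data set is than the sample
    """
    return sample_minutes * scale_factor / 60.0

SCALE_FACTOR = 60  # the sample is about 1/60 of the full data set

prod_hours = extrapolate_hours(12.0, SCALE_FACTOR)  # timed on the production server
dev_hours = extrapolate_hours(21.5, SCALE_FACTOR)   # timed on treebase-dev
slowdown = dev_hours / prod_hours                   # how much slower dev is

print(f"production server: ~{prod_hours:.1f} h")
print(f"treebase-dev:      ~{dev_hours:.1f} h ({slowdown:.2f}x slower)")
```

With the timings above this gives roughly 12 hours on the production server and 21.5 hours on treebase-dev.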