[Treebase-devel] data migration SDSC -> NESCent

Hi all,

Bill, Mark, Val and I just had a conference call on pending issues
pre-beta. We decided there really aren't any: we're ready to start
beta testing. The topic then moved to what to do after testing. One of
the main issues is how we will move the actual data in TreeBASE2 from
the SDSC database instance (i.e. DB2 running on a machine in San
Diego) to the NESCent instance (i.e. PostgreSQL, "PG" below, running
on a machine in Durham).

One possibility is that we use the scripts we've been using to import
TreeBASE1 data into TreeBASE2. Unfortunately, loading the data that
way takes a fair amount of time (think weeks) and human intervention.

A second possibility would be to write a program that, from NESCent,
connects to the SDSC instance through JDBC and loads the data record
by record. This might take a long time too, and it'll depend on the
JDBC connection staying up for that entire time.
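
To make the second option concrete, here is a minimal sketch of a
record-by-record copy. It is written against Python's DB-API rather
than JDBC purely for illustration, with sqlite3 standing in for both
DB2 and PG; copy_table, the batch size, and the assumption that both
sides have identical column order are hypothetical, not anything we
have in the code base. Committing per batch means a dropped connection
only loses the batch in flight, but the copy would still have to be
restarted or resumed by hand.

```python
import sqlite3


def copy_table(src, dst, table, batch_size=1000):
    """Copy all rows of `table` from src to dst in batches.

    Assumes the table already exists on dst with the same column order,
    and that `table` comes from a trusted, hard-coded list (table names
    cannot be passed as bound parameters).
    """
    read = src.cursor()
    read.execute(f"SELECT * FROM {table}")
    # sqlite3 uses "?" placeholders; other drivers may use "%s" instead.
    placeholders = ", ".join("?" for _ in read.description)
    copied = 0
    while True:
        rows = read.fetchmany(batch_size)
        if not rows:
            break
        dst.executemany(f"INSERT INTO {table} VALUES ({placeholders})", rows)
        dst.commit()  # commit per batch so earlier batches survive a failure
        copied += len(rows)
    return copied
```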

To kick off the discussion I'd like to suggest a third possibility: we
implement functionality where each table can be dumped to some
delimited format (CSV, say); the dumps are made available for download
as a compressed archive; the NESCent machine downloads that archive
and loads the tables into PG. It seems to me that we want the dump,
zip and serve functionality anyway, so this would be a good way to
make that happen.
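
As a rough illustration of the dump, zip and load steps, here is a
minimal sketch in Python. It uses sqlite3 as a stand-in for both DB2
and PG, and the function names, the one-CSV-per-table archive layout
and the header row are my assumptions, not an existing TreeBASE2
feature. The download step in between is left out.

```python
import csv
import io
import sqlite3
import zipfile


def dump_tables_to_zip(conn, table_names, archive_path):
    """Write each table as <table>.csv inside one compressed archive.

    `table_names` is assumed to be a trusted, hard-coded list, since
    table names cannot be passed as bound parameters.
    """
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for table in table_names:
            cursor = conn.cursor()
            cursor.execute(f"SELECT * FROM {table}")
            buffer = io.StringIO()
            writer = csv.writer(buffer)
            # First row holds the column names so the loader can map them.
            writer.writerow([col[0] for col in cursor.description])
            writer.writerows(cursor)
            # Note: this buffers a whole table in memory before writing.
            archive.writestr(f"{table}.csv", buffer.getvalue())


def load_tables_from_zip(conn, archive_path):
    """Insert every <table>.csv in the archive into an existing table."""
    with zipfile.ZipFile(archive_path) as archive:
        for name in archive.namelist():
            table = name[: -len(".csv")]
            with archive.open(name) as raw:
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                reader = csv.reader(text)
                columns = next(reader)
                # sqlite3 uses "?" placeholders; other drivers may differ.
                placeholders = ", ".join("?" for _ in columns)
                conn.executemany(
                    f"INSERT INTO {table} ({', '.join(columns)}) "
                    f"VALUES ({placeholders})",
                    reader,
                )
    conn.commit()
```

One caveat with a plain CSV round trip like this: NULLs and empty
strings both end up as empty fields, so a real implementation would
need an explicit NULL marker, and on the PG side a bulk loader such as
COPY would likely be faster than row-by-row inserts.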

Any thoughts?

Thanks,

Rutger

-- 
Dr. Rutger A. Vos
Department of Zoology
University of British Columbia
http://www.nexml.org
http://rutgervos.blogspot.com