From: Jonathan C. <cra...@pc...> - 2003-05-14 18:00:16
Hi Chetna-

Chetna D. Warade wrote:
> I am working with Michael to load GenBank stuff in gus 3.0.
> Right our database is out of tablespace. My question is: Does repetitive
> use of plugin/GBParser resume where it left off or will it try to load
> everything from scratch.

This is something that has to be coded on a per-plugin basis, meaning that unless the authors of the GBParser plugin have explicitly given it the ability to restart cleanly, it won't. Or rather, most plugins will probably run a second time without complaint, but will likely create duplicate rows in the database.

Whether you get duplicates also depends on how the plugin handles commits (also a plugin-specific issue). Most of the plugins that load a large amount of data will commit on a periodic basis (e.g. every 1000 or 10000 entries or rows), so that if a crash occurs at 5500 entries, for example, you would end up with 5000 in the database, assuming a commit frequency of 1000. It also depends on whether the plugin checks for the presence of entries/rows before loading duplicates (a facility that may provide support for, but is not equivalent to, the ability to restart a plugin on the same input files).

> Situation here:
> Due to limited tablespace we could successfully load 18079 rows in the
> database (dots.ExternalNASequence and Dots.NAEntry). I am adding more
> tablespace and then re-run the GBParser on the same GenBank file. At the
> least I expect the primary key failure error for first 18079 rows and
> then GBParser should be able to load the remaining ones.

In general you are unlikely to get primary key errors, since the primary key values are autogenerated, and so the second time the plugin is run it will generate a whole new set of IDs (assuming that it has not been written to handle restarts and/or check whether entries being inserted are already in the database). Again, however, it's something that is plugin-specific.
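To make the commit and duplicate behaviour described above concrete, here is a minimal sketch. It is not the GBParser code: it uses Python and an in-memory SQLite database rather than Perl and the GUS schema, and the table `na_entry`, the function names, the commit frequency and the simulated crash are all assumptions chosen to match the 5500/5000 example.

```python
import sqlite3

COMMIT_EVERY = 1000  # assumed commit frequency, matching the example in the text


def load(conn, entries, fail_at=None):
    """Insert entries, committing every COMMIT_EVERY rows.

    fail_at simulates a crash (e.g. running out of tablespace) just
    before the entry at that index is inserted.
    """
    cur = conn.cursor()
    try:
        for i, accession in enumerate(entries):
            if fail_at is not None and i == fail_at:
                raise RuntimeError("simulated crash")
            # The primary key is autogenerated, so a re-run never collides on it.
            cur.execute("INSERT INTO na_entry (accession) VALUES (?)", (accession,))
            if (i + 1) % COMMIT_EVERY == 0:
                conn.commit()
        conn.commit()
    except RuntimeError:
        conn.rollback()  # rows inserted since the last commit are lost


def demo():
    conn = sqlite3.connect(":memory:")
    # Hypothetical stand-in table; not the real Dots.NAEntry definition.
    conn.execute(
        "CREATE TABLE na_entry (id INTEGER PRIMARY KEY AUTOINCREMENT, accession TEXT)"
    )
    entries = [f"ACC{n:05d}" for n in range(6000)]

    load(conn, entries, fail_at=5500)  # first run crashes at entry 5500
    after_crash = conn.execute("SELECT COUNT(*) FROM na_entry").fetchone()[0]

    load(conn, entries)  # naive re-run on the same input, no existence check
    total = conn.execute("SELECT COUNT(*) FROM na_entry").fetchone()[0]
    distinct = conn.execute(
        "SELECT COUNT(DISTINCT accession) FROM na_entry"
    ).fetchone()[0]
    conn.close()
    return after_crash, total, distinct
```

Under these assumptions, `demo()` leaves 5000 rows after the simulated crash, and the naive re-run raises no primary key error: it simply adds all 6000 entries again under fresh IDs, so the first 5000 end up in the table twice.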
If a table has additional "unique" constraints, for example, and the plugin fails to check whether inserted rows are already in the database, then it is possible for constraint violations to occur when re-running a plugin.

Anyway, the bottom line is that it depends almost entirely on how the GBParser has been implemented, and so your questions are all best answered either by the people who wrote the plugin or by looking at the Perl code directly.

Jonathan