From: Ed R. <ed_...@be...> - 2004-10-21 13:35:23
We had this problem in a couple of other projects I worked on. We solved it by having two ID fields in the subject table: an internal/original ID and an external/updated ID.

Might it not be better to have the initial load of the taxon fill in an original ID and a current ID and then, if the NCBI ID ever changes, update a LastID and a CurrentID field? This way, we can use the original ID to stay consistent internally within the GUS records, and use LastID and CurrentID to keep track of where NCBI is going. All of this would happen in the update routine of the plugin. It would require two new fields in the SRes.taxon table.

-Ed

> From: pi...@pc...
> Date: 2004/10/20 Wed PM 03:47:17 EDT
> To: gus...@li...
> Subject: [Gusdev-gusdev] LoadTaxonomy.pm
>
> The LoadTaxonomy plugin was written to load taxonomy information from tables
> downloaded from NCBI. It was intended that the plugin update and not replace
> rows in the taxonomy tables and that the taxon_ids remain stable. It was
> assumed that NCBI didn't replace their tax_ids but in fact they do although at
> a low rate. This results in duplications in the taxon_ids that represent the
> same taxonomic group. There is a merged.dmp file included in their tar ball
> that contains a list of old to new tax_id mappings and seems to be cumulative.
> I have written them to confirm that all replacements are in the file and that
> it is cumulative.
>
> I would like to add a subroutine to the LoadTaxon plugin that would be called
> first to replace the deprecated ncbi_tax_ids in sres.taxon with their
> replacements and then continue with the plugin as it is. This would require
> the addition of another option for the merged.dmp file. I was not going to
> make this an optional task but I will try to build in a time saver for first
> time use.
>
> Any comments? Please respond quickly as I need to run the plugin ASAP.
>
> -Debbie

Ed Robinson
255 Deerfield Rd
Bogart, GA 30622
(706)425-9181

--Learn more about the face of your neighbor, and less about your own. -Sargent Shriver
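
For anyone weighing the two proposals in this thread, here is a rough sketch of how they could fit together: read the old-to-new mappings from merged.dmp, then move each row's current ID forward while leaving the original ID alone. It is written in Python for brevity and is not the plugin's actual code (LoadTaxonomy.pm is Perl). The names original_id, last_id and current_id are hypothetical stand-ins for the fields suggested above, and the line layout (old ID, then new ID, separated by a tab, a pipe and a tab) is an assumption about the merged.dmp format. Following chains of replacements is a defensive guess, since the thread does not say whether merged.dmp can contain them.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


def parse_merged(lines: Iterable[str]) -> Dict[int, int]:
    """Parse merged.dmp-style lines into {old_tax_id: new_tax_id}.

    Assumed layout per line: old_tax_id <TAB>|<TAB> new_tax_id <TAB>|
    """
    mapping: Dict[int, int] = {}
    for line in lines:
        fields = [f.strip() for f in line.split("|")]
        if len(fields) < 2 or not fields[0] or not fields[1]:
            continue  # skip blank or malformed lines
        mapping[int(fields[0])] = int(fields[1])
    return mapping


def resolve(tax_id: int, merged: Dict[int, int]) -> int:
    """Follow old -> new links until the ID is no longer listed as deprecated."""
    seen = set()
    while tax_id in merged and tax_id not in seen:  # 'seen' guards against cycles
        seen.add(tax_id)
        tax_id = merged[tax_id]
    return tax_id


@dataclass
class TaxonRow:
    # Hypothetical fields, not the real SRes.taxon columns.
    original_id: int               # set once at initial load, never changed
    current_id: int                # where NCBI is now
    last_id: Optional[int] = None  # where NCBI was before the latest change


def apply_merges(rows: List[TaxonRow], merged: Dict[int, int]) -> int:
    """Update last_id/current_id for rows whose NCBI ID was replaced.

    Returns the number of rows changed.
    """
    changed = 0
    for row in rows:
        new_id = resolve(row.current_id, merged)
        if new_id != row.current_id:
            row.last_id = row.current_id
            row.current_id = new_id
            changed += 1
    return changed
```

In a run of the update routine, the mapping would be built once from the merged.dmp file (for example `parse_merged(open(path))`, where `path` is whatever the new plugin option supplies) and `apply_merges` would be called before the rest of the load. Rows whose ID has not been replaced are left untouched, so repeated runs against a cumulative file should be harmless.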