From: Hilmar L. <hl...@ne...> - 2010-01-29 21:52:40
|
Fine with me if Bill is. -hilmar On Jan 29, 2010, at 1:29 PM, Vladimir Gapeyev wrote: > Bill, thanks for the info! I did intend import this data in a way > similar to what you suggest. > > Technically, I see how to do step 4 below, but I am not clear why it > will work. (This is clearly a tangent, in the direction of data > quality, i guess.) As I understood before, the natural key in the > TAXONLABEL table is the pair (study_id, taxonlabel), and I thought the > reason was that two studies, say 981 and 1207, might use the same > label, say "Abelia", but mean different taxa by it. In this case, TB2 > should connect (981, "Abelia") and (1207, "Abelia") to different > variant_ids and taxon_ids. This, however, does not mesh with your > instruction to connect TAXONVARIANT and TAXONLABEL tables based on the > value of TAXONLABEL.taxonlabel alone, relying on the fact that > taxonlabel field is unique within the taxon_labels.tab file. Under > this arrangement, there is no way "Abelia" from different studies can > be connected to different taxonvariants! I actually expected to see a > study_id field in taxon_labels.tab. > > I misunderstood something, but what? > > --VG > > > On Jan 28, 2010, at 11:27 PM, William Piel wrote: > >> >> On Jan 28, 2010, at 6:19 PM, Vladimir Gapeyev wrote: >> >>> Here is one important piece of info I could not find: What is the >>> meaning of each column in the taxon intelligence files >>> (taxon_labels.tab, taxon_variants.tab, taxa.tab)? >> >> >> taxon: >> taxon_id (PK) >> namebankid >> namestring >> ncbi_taxid >> >> taxon_variants: >> taxon_variant_id (PK) >> taxon_id (FK) >> namebankid >> namestring >> fullnamestring >> lexicalqualifier >> >> taxon_labels: >> taxon_label_id (PK) >> taxon_variant_id (FK) >> legacy_id (from TreeBASE1) >> taxon_label (from TreeBASE1 -- useful for mapping output from >> headless Mesquite to TV table) >> >> I thing the process should go something like this: >> >> 1. In the taxonlabel table, update all values in the taxonvariant_id >> column to NULL >> 2. Delete all records in the taxonvariant and taxon tables. >> 3. The first two files can be sucked straight into the taxa and >> taxonvariant tables (the mapping among columns should be fairly >> obvious -- let me know if you have questions) >> 4. For mapping the taxonlabel table with the taxonvariant table, I'd >> suggest that you upload the third file into a temporary table, and >> then write an update statement to do the matching: use the last >> column of the taxon_labels file to match against the taxonlabel >> column in the taxonlabel table. Wherever there is a match, put the >> value of the second column in the taxon_labels file in the >> taxonvariant_id column of the taxonlabel table. Recall that in the >> third file, each value in the last column (the "taxon_label") is >> unique; by contrast, the taxonlabel column in the taxonlabel table >> are not unique. >> 5. Drop the temporary table. >> >> bp >> >> >> >> >> >> >> >> >> ------------------------------------------------------------------------------ >> The Planet: dedicated and managed hosting, cloud storage, colocation >> Stay online with enterprise data centers and the best network in the >> business >> Choose flexible plans and management services without long-term >> contracts >> Personal 24x7 support from experience hosting pros just a phone call >> away. >> http://p.sf.net/sfu/theplanet-com >> _______________________________________________ >> Treebase-devel mailing list >> Tre...@li... >> https://lists.sourceforge.net/lists/listinfo/treebase-devel > > > ------------------------------------------------------------------------------ > The Planet: dedicated and managed hosting, cloud storage, colocation > Stay online with enterprise data centers and the best network in the > business > Choose flexible plans and management services without long-term > contracts > Personal 24x7 support from experience hosting pros just a phone call > away. > http://p.sf.net/sfu/theplanet-com > _______________________________________________ > Treebase-devel mailing list > Tre...@li... > https://lists.sourceforge.net/lists/listinfo/treebase-devel -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- informatics.nescent.org : =========================================================== |