From: Roderic P. <r....@bi...> - 2011-08-15 09:01:34
There are some additional tricks that could be used. Mapping tree names to matrix names could be formulated as a bipartite matching problem, where we have two lists of names and want to find the best matching. See http://iphylo.blogspot.com/2007/09/matching-names-in-phylogeny-data-files.html for more details.

This approach could be extended to, say, matching names in a NEXUS file to those in a publication, or to a GenBank POPSET from a publication. For example, if we have a NEXUS file and a POPSET we could compute the best matching between the two sets of names. Or taxon names and/or accession numbers could be retrieved from the publication. This would also provide the context needed to avoid homonyms, such as matching animal names to plant names.

Regards

Rod

On 15 Aug 2011, at 05:13, Rutger Vos wrote:

>> this calls for easy-to-use NeXML editors. e.g. add the ability to enter
>> Genbank accession numbers in Mesquite, and then save as NeXML, thus
>> preserving "Homo_sapiens" consistently in all alignments and resulting
>> trees, while still communicating the respective accession numbers for each
>> locus. Summer-of-Code project here.
>
> Indeed.
>
>> C- The basic data model of matrix-rows-matching-with-tree-OTUs works for 99%
>> of datasets, but a growing number of studies use BEAST species inference
>> (and other similar methods) where the tree ends in species OTUs, but the
>> alignment has many more haplotype OTUs. -- i.e. there is, on purpose, a
>> complete mismatch between alignment row labels and tree OTUs. Mesquite can
>> handle this using a taxon association table, though I don't know that this
>> is formal NEXUS or just a Mesquite invention. I don't think that NeXML or
>> PhyloML can handle this. This calls for expanding the capabilities of NeXML
>> and PhyloML.
>
> Yes and no. Multiple matrix rows can reference the same otu, but
> that's not quite what we want. Multiple, separately annotatable matrix
> row segments would be a good feature to have, also for TreeBASE's
> needs.
>
> --
> Dr. Rutger A. Vos
> School of Biological Sciences
> Philip Lyle Building, Level 4
> University of Reading
> Reading, RG6 6BX, United Kingdom
> Tel: +44 (0) 118 378 7535
> http://rutgervos.blogspot.com

---------------------------------------------------------
Roderic Page
Professor of Taxonomy
Institute of Biodiversity, Animal Health and Comparative Medicine
College of Medical, Veterinary and Life Sciences
Graham Kerr Building
University of Glasgow
Glasgow G12 8QQ, UK

Email: r....@bi...
Tel: +44 141 330 4778
Fax: +44 141 330 2792
AIM: rod...@ai...
Facebook: http://www.facebook.com/profile.php?id=1112517192
Twitter: http://twitter.com/rdmpage
Blog: http://iphylo.blogspot.com
Home page: http://taxonomy.zoology.gla.ac.uk/rod/rod.html
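
A minimal sketch of the bipartite matching idea mentioned above, for illustration only. It assumes that string similarity from Python's difflib is an acceptable edge weight between a tree name and a matrix name (the linked blog post may well score names differently), and it solves the resulting assignment problem with a textbook Hungarian algorithm. The function names (normalize, similarity, solve_assignment, match_names) are hypothetical and not taken from TreeBASE or any existing tool; names within each list are assumed to be unique.

```python
from difflib import SequenceMatcher


def normalize(name):
    # Assumption: underscores and case differences carry no meaning.
    return name.replace("_", " ").lower().strip()


def similarity(a, b):
    # Similarity in [0, 1]; 1.0 means identical after normalization.
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def solve_assignment(cost):
    """Hungarian algorithm for a rows x cols cost matrix with rows <= cols.

    Returns a list where entry i is the column assigned to row i,
    chosen so that the total cost is minimal.
    """
    n, m = len(cost), len(cost[0])
    inf = float("inf")
    u = [0.0] * (n + 1)      # row potentials
    v = [0.0] * (m + 1)      # column potentials
    p = [0] * (m + 1)        # p[j] = row matched to column j (1-based, 0 = none)
    way = [0] * (m + 1)      # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        # Flip the matching along the augmenting path that was found.
        while j0 != 0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    result = [-1] * n
    for j in range(1, m + 1):
        if p[j] != 0:
            result[p[j] - 1] = j - 1
    return result


def match_names(left, right):
    """Map names in `left` to names in `right`, maximizing total similarity."""
    if not left or not right:
        return {}
    # The solver needs rows <= cols, so put the shorter list on the rows.
    swapped = len(left) > len(right)
    rows, cols = (right, left) if swapped else (left, right)
    cost = [[1.0 - similarity(r, c) for c in cols] for r in rows]
    pairs = {rows[i]: cols[j] for i, j in enumerate(solve_assignment(cost))}
    return {c: r for r, c in pairs.items()} if swapped else pairs


if __name__ == "__main__":
    tree_names = ["Homo_sapiens", "Pan_troglodytes", "Gorilla_gorilla"]
    matrix_names = ["G. gorilla", "H. sapiens", "Pan troglodytes"]
    for tree_name, matrix_name in match_names(tree_names, matrix_names).items():
        print(tree_name, "->", matrix_name)
```

When the two lists differ in length, the surplus names on the longer side are simply left unmatched. A real pipeline would probably also want a minimum-similarity cutoff so that unrelated names (the homonym case) are reported rather than forced into a pairing.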