From: William P. <wil...@ya...> - 2011-04-15 12:43:01
|
On Apr 15, 2011, at 4:14 AM, Roderic Page wrote:
> For large studies the Nexml generation simply times out, so I gave up.

If you still have some ID numbers for those big ones, I'd be happy to test it again. It may have been solved because of some recent changes. But, indeed, I'd like access to a dump too.

bp |
From: Roderic P. <r....@bi...> - 2011-04-15 08:50:00
|
Dear Bill,

Yes, I was being a little glib, and I'm all for the data being available. But I wonder whether, given the rate at which new sequence data is being acquired, many people will redo analyses using more data, rather than reanalyse an older data set.

My comment about trees is that, at the end of the day, it's the one thing that makes TreeBASE unique. From my no doubt biased perspective, it could/should be the place I go to find out "what do we know about the phylogeny of group x?"

Why not just make a CouchDB database right now? I tried, believe me I tried, but a big chunk of TreeBASE didn't make it down the wire. For large studies the Nexml generation simply times out, so I gave up. I guess this is what Rutger's suggestion of having file dumps would address. If every study had a Nexml file sitting on the server then I could just fetch those, rather than hammer the database and get frustrated when it times out.

Given that I have the attention span of a gnat, if something doesn't work I tend to drop it for a while and go on to something else. If I could reliably get all TreeBASE studies in Nexml, I'd make the CouchDB version in a flash.

Regards

Rod

---------------------------------------------------------
Roderic Page
Professor of Taxonomy
Institute of Biodiversity, Animal Health and Comparative Medicine
College of Medical, Veterinary and Life Sciences
Graham Kerr Building
University of Glasgow
Glasgow G12 8QQ, UK

Email: r....@bi...
Tel: +44 141 330 4778
Fax: +44 141 330 2792
AIM: rod...@ai...
Facebook: http://www.facebook.com/profile.php?id=1112517192
Twitter: http://twitter.com/rdmpage
Blog: http://iphylo.blogspot.com
Home page: http://taxonomy.zoology.gla.ac.uk/rod/rod.html |
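The timeout problem described in this message suggests a harvest loop that records failures and moves on, rather than abandoning the whole run. The following is only a minimal sketch under stated assumptions: the names `fetch_url` and `harvest` are hypothetical, the 60-second timeout is an arbitrary choice, and the caller supplies the list of NeXML URLs, since the thread does not say how per-study dumps would be addressed.

```python
"""Sketch: fetch a list of NeXML URLs, setting aside the ones that time out."""
import urllib.request


def fetch_url(url, timeout=60):
    """Download one document; raises an OSError subclass on timeout or network failure."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def harvest(urls, fetcher=fetch_url):
    """Return (fetched, failed): bodies keyed by URL, plus URLs to retry later."""
    fetched, failed = {}, []
    for url in urls:
        try:
            fetched[url] = fetcher(url)
        except OSError:
            # urllib.error.URLError and socket timeouts are both OSError subclasses.
            failed.append(url)
    return fetched, failed
```

The `failed` list can be fed back into `harvest` on a later run, so that one slow study does not block the rest of a mirror.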
From: Roderic P. <r....@bi...> - 2011-04-15 08:37:33
|
Touché!

On 14 Apr 2011, at 23:26, Rutger Vos wrote:
> Ok, that was priceless. |
From: Rutger V. <R....@re...> - 2011-04-14 22:27:00
|
Ok, that was priceless.

> PS - Hmm... Rod, do you know you do have a way of loving-and-then-hating things? :-)
>
> On Apr 14, 2011, at 12:04 PM, Roderic Page wrote:
>> 7. Never, ever mention RDF.
>
> On Apr 14, 2011, at 1:05 PM, Roderic Page wrote:
>> I ... was once an enthusiast [of RDF]
>
> On Apr 14, 2011, at 12:04 PM, Roderic Page wrote:
>> Bonus points for not mentioning XML.
>
> On May 20, 2004, at 7:37 PM, Roderic D. M. Page wrote:
>> [I think TreeBASE should] store data (say, the character states for a taxon) as an XML formatted BLOB.
>
> On Jan 15, 2006, at 4:15 PM, Roderic Page wrote:
>> once one of the major providers adopts LSIDs (my money is on uBio), whatever they adopt will drive standards
>
> On Apr 1, 2009, at 9:20 AM, Roderic D. M. Page wrote:
>> I think that [LSIDs have] been the Achilles heel of biodiversity informatics.
>
> [etc..]

--
Dr. Rutger A. Vos
School of Biological Sciences
Philip Lyle Building, Level 4
University of Reading
Reading
RG6 6BX
United Kingdom
Tel: +44 (0) 118 378 7535
http://www.nexml.org
http://rutgervos.blogspot.com |
From: Vladimir G. <vga...@ne...> - 2011-04-14 21:13:44
|
On Apr 14, 2011, at 4:00 PM, William Piel wrote:
> PS - Hmm... Rod, do you know you do have a way of loving-and-then-hating things? :-)
>
> On Apr 14, 2011, at 12:04 PM, Roderic Page wrote:
>> 7. Never, ever mention RDF.
>
> On Apr 14, 2011, at 1:05 PM, Roderic Page wrote:
>> I ... was once an enthusiast [of RDF]
>
> On Apr 14, 2011, at 12:04 PM, Roderic Page wrote:
>> Bonus points for not mentioning XML.
>
> On May 20, 2004, at 7:37 PM, Roderic D. M. Page wrote:
>> [I think TreeBASE should] store data (say, the character states for a taxon) as an XML formatted BLOB.
>
> On Jan 15, 2006, at 4:15 PM, Roderic Page wrote:
>> once one of the major providers adopts LSIDs (my money is on uBio), whatever they adopt will drive standards
>
> On Apr 1, 2009, at 9:20 AM, Roderic D. M. Page wrote:
>> I think that [LSIDs have] been the Achilles heel of biodiversity informatics.
>
> [etc..]

Evolution? Someone will build a tree from this!

-V |
From: William P. <wil...@ya...> - 2011-04-14 20:00:21
|
On Apr 14, 2011, at 12:04 PM, Roderic Page wrote:
> It's not about publications, it's not about sequences, it's not really about data (OK, a little bit about data), it's about trees

From where I sit, alignments are an important resource for the community. Nobody emails me asking for a tree that is missing from TreeBASE, but I'm always being asked for an alignment that should be in TreeBASE but is not. Typically it is because the author started, but never finished, a submission. Earlier this year I had a case where an author wanted her alignments embargoed for a year post-publication. After several people independently emailed me to request access, I contacted the journal, they convened the board, and they passed a resolution stating that all data must be released immediately. So these are not without value.

Alignments are collections of hypotheses of homology (NCHAR of them per alignment!) that are often difficult to rebuild from scratch -- trees are merely blended summary diagrams of these hypotheses. Plus, retyping a morphological dataset after OCR'ing a PDF is an enormous pain.

On Apr 14, 2011, at 12:04 PM, Roderic Page wrote:
> So I guess I'd do the following:

Is there anything to stop anyone from doing exactly this?

And you could have your CouchDB updated periodically by doing a cron on TreeBASE's OAI-PMH to get the IDs of all new or modified studies (e.g., since April 12th, GMT: http://treebase.org/treebase-web/top/oai?verb=ListIdentifiers&metadataPrefix=oai_dc&from=2011-04-12T00:00:00Z), and then pull down the NeXML for just the trees, convert to JSON, populate the CouchDB, etc. It should be relatively easy to maintain a CouchDB mirror.

bp

PS - Hmm... Rod, do you know you do have a way of loving-and-then-hating things? :-)

On Apr 14, 2011, at 12:04 PM, Roderic Page wrote:
> 7. Never, ever mention RDF.

On Apr 14, 2011, at 1:05 PM, Roderic Page wrote:
> I ... was once an enthusiast [of RDF]

On Apr 14, 2011, at 12:04 PM, Roderic Page wrote:
> Bonus points for not mentioning XML.

On May 20, 2004, at 7:37 PM, Roderic D. M. Page wrote:
> [I think TreeBASE should] store data (say, the character states for a taxon) as an XML formatted BLOB.

On Jan 15, 2006, at 4:15 PM, Roderic Page wrote:
> once one of the major providers adopts LSIDs (my money is on uBio), whatever they adopt will drive standards

On Apr 1, 2009, at 9:20 AM, Roderic D. M. Page wrote:
> I think that [LSIDs have] been the Achilles heel of biodiversity informatics.

[etc..] |
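The first step of the cron job suggested in this message (ask OAI-PMH which studies changed since a given date) might look roughly like the sketch below. The endpoint and query parameters are taken from the URL in the message, and the XML namespace is the one defined by the OAI-PMH 2.0 specification; the function names are hypothetical, and the sketch deliberately ignores resumption tokens, so a long result list would need further requests.

```python
"""Sketch: list TreeBASE records changed since a date, via OAI-PMH ListIdentifiers."""
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Endpoint quoted in the message; assumed to still be the OAI-PMH base URL.
OAI_ENDPOINT = "http://treebase.org/treebase-web/top/oai"
# Element namespace defined by the OAI-PMH 2.0 specification.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def list_identifiers_url(since, endpoint=OAI_ENDPOINT):
    """Build a ListIdentifiers request for records changed since `since` (UTC)."""
    params = {"verb": "ListIdentifiers", "metadataPrefix": "oai_dc", "from": since}
    # Keep ':' unescaped so the URL matches the form used in the message.
    return f"{endpoint}?{urlencode(params, safe=':')}"


def parse_identifiers(xml_text):
    """Return the identifiers of all non-deleted records in a response body."""
    root = ET.fromstring(xml_text)
    identifiers = []
    for header in root.iter(f"{OAI_NS}header"):
        if header.get("status") == "deleted":
            continue  # deleted records have nothing to download
        identifier = header.findtext(f"{OAI_NS}identifier")
        if identifier:
            identifiers.append(identifier.strip())
    return identifiers
```

A cron job would remember the timestamp of its last successful run and pass it as `since`; fetching the NeXML for each identifier and converting it to JSON are separate steps not shown here.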
From: Rutger V. <R....@re...> - 2011-04-14 17:57:07
|
> Distributed editing makes sense to me in that for most things there's always a bigger fish. Does anybody think they can do a better job of providing online management of bibliographic metadata than Zotero or Mendeley? If not, why bother recreating that, just enable people to use those tools and harvest the edits.

Yes, I think that is key. Focus on the core business and delegate everything else. |
From: Roderic P. <r....@bi...> - 2011-04-14 17:05:41
|
Dear Rutger,

Not sure why CouchDB choked, I have databases with millions of JSON documents and it works OK. That said, the JSON you sent was pretty ugly (namespaces, brrr!)

It seems to me that much of what we do is pass around objects, and JSON (without namespaces) is a light-weight way to do this that we can also operate on very simply.

I lost patience with triple stores, partly because there's the overhead of having RDF vocabularies, and the interesting queries are not supported (e.g., spatial queries). I get the idea behind RDF, and was once an enthusiast, but I think there are too many practical issues that make it less than useful.

Distributed editing makes sense to me in that for most things there's always a bigger fish. Does anybody think they can do a better job of providing online management of bibliographic metadata than Zotero or Mendeley? If not, why bother recreating that, just enable people to use those tools and harvest the edits.

So, I'd want to focus on the one thing TreeBASE has that nobody else has, namely the trees. Although, having said that, if Phylota had a decent interface it would be awesome.

Regards

Rod |
From: Rutger V. <R....@re...> - 2011-04-14 16:31:38
|
I tried loading the JSON files I shared with you the other day in couchdb and it turned out I would have to recompile it with more memory or else it chokes.

So the idea is that it's just the metadata that goes into a database, and you like that to be couchdb as opposed to a triple store, right? And you like the idea of distributed editing, so does that mean you also like the idea of distributed searching, along the plug-in idea? With the various projects you've done to annotate/correct/taxon map treebase data, it would be great if those could be plugged into a common, easy-to-use front end. |
From: Roderic P. <r....@bi...> - 2011-04-14 16:04:51
|
So I guess I'd do the following:

1. Separate data entry from data access. SQL may have a place for data entry, but that's it. And MySQL is fine, really.

2. The data access end is a document database like CouchDB which stores metadata (and trees) as JSON

3. Simple query API that more or less wraps CouchDB queries, search by taxon, identifier, geography, or full text.

4. Store data on disk in original format, as well as derived formats as Rutger suggests. Being able to grab dumps in various formats is handy, especially if the data can be reliably obtained.

5. Have a web interface that's simple, easy to use, supports search without asking user whether something is a number or not, use SVG for trees, enable users to log in using Facebook/Twitter/Mendeley

6. Devolve as much editing as possible to other places, e.g. Mendeley for bibliographic stuff

7. Never, ever mention RDF. Bonus points for not mentioning XML.

My sense as an outside observer is that much of the current iteration of TreeBASE has been driven by technology (Postgresql, Tomcat, RDF, Java, XML), not usability. I understand the rationale for the choices (I think), but at the end of the day TreeBASE should be about the trees. It's not about publications, it's not about sequences, it's not really about data (OK, a little bit about data), it's about trees. I should be able to find my trees, find trees from a paper, find trees for a taxon, find trees from a given part of the world, find trees that use a given sequence, find trees that look like my trees.

Read Michael Wolfe's answer to the question "Why is Dropbox more popular than other programs with similar functionality?" and you'll see where I'm coming from

http://www.quora.com/Dropbox/Why-is-Dropbox-more-popular-than-other-programs-with-similar-functionality

Regards

Rod |
From: William P. <wil...@ya...> - 2011-04-14 14:33:46
|
On Apr 14, 2011, at 9:39 AM, Richard Ree wrote:

> Hi folks,
>
> Just joined the list, as I am interested in using TB to develop collaborative methods for synthesizing plant phylogeny, e.g., by grafting clades together and other kinds of agglomeration. So, I am particularly interested in the API and harvesting, especially with respect to taxonomic names and classifications, and GenBank identifiers of sequences.
>
> Speaking of which, how would I go about harvesting the GenBank numbers for a given study and associating them with their alignments? Is that possible with the current API?
>
> I like Rutger's plug-in ideas. In general, it's easier for someone like me to provide a simple web service, rather than contribute directly to TB development.
>
> -Rick

I guess there are various options for this. For example, you could start by finding studies published by a guy named "Ree":

http://purl.org/phylo/treebase/phylows/study/find?query=dcterms.contributor=Ree&format=rss1

Out of that list, pick the first item (S10145), and you could ask for a list of matrices:

http://purl.org/phylo/treebase/phylows/study/find?query=tb.identifier.study=S10145&format=rss1&recordSchema=matrix

And then if you pick one matrix (e.g. M4388), you could ask for the NeXML serialization of it:

http://purl.org/phylo/treebase/phylows/matrix/TB2:M4388?format=nexml

And in the OTU section, you'll find a mapping between "Ruta graveolens" and NCBI's taxid 37565:

<otu about="#otu21609" id="otu21609" label="Ruta graveolens">
  <meta href="http://purl.uniprot.org/taxonomy/37565" id="meta21613" rel="skos:closeMatch" xsi:type="nex:ResourceMeta"/>

Alternatively, you could ask for a list of trees:

http://purl.org/phylo/treebase/phylows/study/find?query=tb.identifier.study=S10145&format=rss1&recordSchema=tree

And then serialize one of the trees:

http://purl.org/phylo/treebase/phylows/tree/TB2:Tr6161?format=nexml

... with the same annotation for Ruta graveolens.

bp |
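For anyone who wants to script the last step, here is a minimal sketch of pulling the label-to-taxid mapping out of a NeXML document once it has been downloaded. It is only an illustration under assumptions: the `SAMPLE` document wraps the OTU fragment quoted above in a made-up minimal envelope, the function name `otu_taxids` is hypothetical, and real TreeBASE output may carry more `meta` elements per OTU than shown here.

```python
import xml.etree.ElementTree as ET

# Prefix of the taxonomy URIs seen in the OTU annotation quoted above.
UNIPROT_PREFIX = "http://purl.uniprot.org/taxonomy/"

# Assumed minimal envelope around the quoted <otu> fragment, for illustration only.
SAMPLE = """<nex:nexml xmlns:nex="http://www.nexml.org/2009"
    xmlns="http://www.nexml.org/2009"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <otus id="otus1">
    <otu about="#otu21609" id="otu21609" label="Ruta graveolens">
      <meta href="http://purl.uniprot.org/taxonomy/37565" id="meta21613"
            rel="skos:closeMatch" xsi:type="nex:ResourceMeta"/>
    </otu>
  </otus>
</nex:nexml>"""


def local_name(tag):
    """Strip the '{namespace}' part that ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def otu_taxids(nexml_text):
    """Return {OTU label: NCBI taxid} for OTUs annotated with skos:closeMatch."""
    root = ET.fromstring(nexml_text)
    mapping = {}
    for elem in root.iter():
        if local_name(elem.tag) != "otu":
            continue
        for child in elem:
            if local_name(child.tag) != "meta":
                continue
            href = child.get("href", "")
            if child.get("rel") == "skos:closeMatch" and href.startswith(UNIPROT_PREFIX):
                mapping[elem.get("label")] = href[len(UNIPROT_PREFIX):]
    return mapping


if __name__ == "__main__":
    print(otu_taxids(SAMPLE))
```

Matching on the local tag name rather than a fixed namespace is a deliberate shortcut so the sketch keeps working whether or not the document uses a prefix for the NeXML namespace.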
From: Hilmar L. <hl...@ne...> - 2011-04-14 14:31:53
|
Could we not, similar to what NCBI includes in their dumps, create simple downloadable mapping tables, for example one with (at least) two columns, one being Genbank accession, and the other being the TB matrix ID (and to make it more useful, columns for TB study, TB taxon, and citation could be added). If we create a script that extracts that out of the database once a week, that might be pretty useful to people. NCBI produces a variety of these, to map accession to taxon, gene ID, etc, and people use them a lot.

-hilmar

--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- informatics.nescent.org :
=========================================================== |
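To make the mapping-table idea concrete, here is a rough sketch of the writing half of such a weekly script. Everything specific in it is an assumption: the column names, the function names, and the example row (the accession `XX000001` is a placeholder, not a real GenBank record). The database query that would supply the rows is deliberately left out, since it depends on the TreeBASE schema.

```python
import csv
import io

# Hypothetical column names for the proposed mapping table.
COLUMNS = ("genbank_accession", "tb_matrix_id", "tb_study_id")


def write_mapping(rows, out):
    """Write de-duplicated, sorted (accession, matrix, study) rows as tab-separated text."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(COLUMNS)
    for row in sorted(set(rows)):
        writer.writerow(row)


def mapping_as_text(rows):
    """Convenience wrapper that returns the table as a string."""
    buffer = io.StringIO()
    write_mapping(rows, buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    # Placeholder rows; in practice these would come from a database query.
    example_rows = [
        ("XX000001", "M4388", "S10145"),
        ("XX000001", "M4388", "S10145"),  # duplicate, dropped on output
    ]
    print(mapping_as_text(example_rows), end="")
```

Sorting and de-duplicating before writing keeps successive weekly dumps easy to diff, which is one reason flat tab-separated files of this kind tend to be convenient for harvesters.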
From: Rutger V. <R....@re...> - 2011-04-14 14:22:38
|
Hi Rick,

if you mean whether we currently write out accession numbers for sequences then no, that's currently not possible. You can do more with NCBI taxonomy identifiers, though, under the current API.

Rutger

--
Dr. Rutger A. Vos
School of Biological Sciences
Philip Lyle Building, Level 4
University of Reading
Reading RG6 6BX
United Kingdom
Tel: +44 (0) 118 378 7535
http://www.nexml.org
http://rutgervos.blogspot.com |
From: Richard R. <rr...@fi...> - 2011-04-14 14:01:47
|
Hi folks,

Just joined the list, as I am interested in using TB to develop collaborative methods for synthesizing plant phylogeny, e.g., by grafting clades together and other kinds of agglomeration. So, I am particularly interested in the API and harvesting, especially with respect to taxonomic names and classifications, and GenBank identifiers of sequences.

Speaking of which, how would I go about harvesting the GenBank numbers for a given study and associating them with their alignments? Is that possible with the current API?

I like Rutger's plug-in ideas. In general, it's easier for someone like me to provide a simple web service, rather than contribute directly to TB development.

-Rick |
From: William P. <wil...@ya...> - 2011-04-14 13:26:03
|
Whoops... forgot to reply to <TreeBASE Devel>.

At any rate, I just noticed that someone from the Czech Republic has uploaded >20 files, each with >60,000 sequences; this might be causing Java heap size issues.

bp

On Apr 14, 2011, at 8:39 AM, William Piel wrote:

> Hi Rod,
>
> Some performance measures should have improved since your earlier efforts with the API. Morphological datasets, for example, should download much faster -- in part seeing as the database has shrunk from ~200GB to 3GB in size.
>
> But of course you're right -- the API needs to be much more efficient, e.g. refactored so that the CQL translates more directly into HQL instead of just hitching in to our web SearchController.
>
> Regarding the user interface usability -- as you know, this is an unfunded open source project -- so you're very welcome to make improvements!
>
> bp |
From: Rutger V. <R....@re...> - 2011-04-14 12:24:56
|
I've been thinking about a redesign lately, and here's what I would do:

- make sure we can export all the metadata from TreeBASE in (nexml) files, i.e. Laurel's GSoC project

- get all the files for all the studies and create a simple folder structure with those files, more or less along the lines of the phylows urls (e.g. phylows/tree/TB2/nexml/T1312), in all file formats we can think of (json, nexus, phylip, newick, phyloxml, fasta, static html, etc...)

- create some mod_rewrite rules to map our urls onto the folder structure (so going from phylows/tree/TB2:T1312?format=nexml to the actual file). Performance will obviously be much, much better.

- also allow harvesting of those files using rsync and ftp. Now we have data dumps.

- create a simple plug in architecture where, given a query string, a remote web service returns a list of hits which it somehow generates from its local, harvested data dump

Here's a use case: Laurel wants to implement BLASTing into TreeBASE. So she periodically harvests everything in phylows/matrix/TB2/fasta with an rsync cron job. She runs formatdb on the fasta sequences, creating a standalone BLAST. She then writes a simple cgi script that accepts a target string (e.g. myblast.cgi?tb2.blast=acgctcgcatcgcatcgactacgac) and returns a list of phylows matrix urls from the matching results.

On the TreeBASE side, we simply add a search widget that delegates the query to Laurel's service and integrates it into our interfaces (graphical, web services).

With an architecture like that it would be so much easier for anyone to add functionality in whatever programming language they like without having to deal with a massive database schema. Of course any of those remote services might have its own little database, but it'd be more along the lines of a three-table SQLite database to store the ITIS taxonomy structure such that we can find all TreeBASE taxa subtended by a given ITIS higher taxon ID (for example).
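As an illustration of the URL-to-folder mapping in the list above, here is a small sketch of the translation a rewrite rule would have to perform, written in Python only so the logic is easy to test. It is not an actual mod_rewrite configuration, the function name `phylows_to_path` is hypothetical, and falling back to nexml when no format is given is an assumption rather than documented PhyloWS behaviour.

```python
from urllib.parse import urlsplit, parse_qs

DEFAULT_FORMAT = "nexml"  # assumption: used when the URL has no format parameter


def phylows_to_path(url):
    """Map .../phylows/<kind>/<NS>:<ID>?format=<fmt> to phylows/<kind>/<NS>/<fmt>/<ID>."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    if "phylows" not in segments:
        raise ValueError("not a phylows URL: %r" % url)
    start = segments.index("phylows")
    if len(segments) < start + 3:
        raise ValueError("expected phylows/<kind>/<NS>:<ID> in %r" % url)
    kind, qualified_id = segments[start + 1], segments[start + 2]
    namespace, sep, local_id = qualified_id.partition(":")
    if not sep or not namespace or not local_id:
        raise ValueError("expected an identifier like TB2:T1312 in %r" % url)
    fmt = parse_qs(parts.query).get("format", [DEFAULT_FORMAT])[0]
    return "/".join(["phylows", kind, namespace, fmt, local_id])


if __name__ == "__main__":
    print(phylows_to_path(
        "http://purl.org/phylo/treebase/phylows/tree/TB2:T1312?format=nexml"))
```

With the example from the list, this yields phylows/tree/TB2/nexml/T1312, i.e. the static file location, so the web server never has to touch the database for that request.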
We'd have to implement some core plug ins ourselves, notably one that extracts the metadata (using RDFA2RDFXML) and sticks that in a triple store so we can search on, say, author names, journals, etc. I think that's scalable because it's only a few dozen triples for each study. The idea is a little bit inspired by DAS, which seems to work quite well: http://www.biodas.org/wiki/Main_Page (I'm leaving out the submission part as an exercise for the reader. Presumably there would have to be a restricted area where properly formatted files are uploaded and made available to reviewers.) Rutger p.s. I've done some harvesting a few weeks back, but unless that's created queries that are still running I'm innocent. On Thu, Apr 14, 2011 at 12:28 PM, Roderic Page <r....@bi...> wrote: > TreeBASE performance has nothing to do with me folks, I pretty much gave up trying to download data from it a few weeks back. Someone really, really needs to rethink the way TreeBASE works, because it's virtually unusable. > > Regards > > Rod > > On 14 Apr 2011, at 04:20, William Piel wrote: > >> >> On Apr 13, 2011, at 9:36 PM, Hilmar Lapp wrote: >> >>> The trees in S11267 seem to silently fail to render in PhyloWidget. >>> All I get is a single dot. At least try the first three: >>> >>> http://www.treebase.org/treebase-web/search/study/trees.html?id=11267 >>> >>> Is this a temporary glitch, a problem with the reconstructed file, or >>> something else? Should I best file this as a bug in the bug tracker? >>> >>> -hilmar >>> >> >> Thanks for the alert. This is actually a bug in PhyloWidget -- if the word "tree" appears in the title of the tree block, PhyloWidget confuses this word with the TREE command, and so goofs up the parsing. >> >> I have removed the word "tree" from the tree block, so it works now. >> >> However, you might need to refresh the PhyloWidget window after loading to get the tree to pull through -- TreeBASE feels quite slow lately; our API must be getting hit by someone (Rod?). 
>> >> bp >> >> _______________________________________________ >> Treebase-devel mailing list >> Tre...@li... >> https://lists.sourceforge.net/lists/listinfo/treebase-devel >> -- Dr. Rutger A. Vos School of Biological Sciences Philip Lyle Building, Level 4 University of Reading Reading RG6 6BX United Kingdom Tel: +44 (0) 118 378 7535 http://www.nexml.org http://rutgervos.blogspot.com |
From: Roderic P. <r....@bi...> - 2011-04-14 11:52:16
|
TreeBASE performance has nothing to do with me folks, I pretty much gave up trying to download data from it a few weeks back. Someone really, really needs to rethink the way TreeBASE works, because it's virtually unusable. Regards Rod On 14 Apr 2011, at 04:20, William Piel wrote: > > On Apr 13, 2011, at 9:36 PM, Hilmar Lapp wrote: > >> The trees in S11267 seem to silently fail to render in PhyloWidget. >> All I get is a single dot. At least try the first three: >> >> http://www.treebase.org/treebase-web/search/study/trees.html?id=11267 >> >> Is this a temporary glitch, a problem with the reconstructed file, or >> something else? Should I best file this as a bug in the bug tracker? >> >> -hilmar >> > > Thanks for the alert. This is actually a bug in PhyloWidget -- if the word "tree" appears in the title of the tree block, PhyloWidget confuses this word with the TREE command, and so goofs up the parsing. > > I have removed the word "tree" from the tree block, so it works now. > > However, you might need to refresh the PhyloWidget window after loading to get the tree to pull through -- TreeBASE feels quite slow lately; our API must be getting hit by someone (Rod?). > > bp
--------------------------------------------------------- Roderic Page Professor of Taxonomy Institute of Biodiversity, Animal Health and Comparative Medicine College of Medical, Veterinary and Life Sciences Graham Kerr Building University of Glasgow Glasgow G12 8QQ, UK Email: r....@bi... Tel: +44 141 330 4778 Fax: +44 141 330 2792 AIM: rod...@ai... Facebook: http://www.facebook.com/profile.php?id=1112517192 Twitter: http://twitter.com/rdmpage Blog: http://iphylo.blogspot.com Home page: http://taxonomy.zoology.gla.ac.uk/rod/rod.html |
From: William P. <wil...@ya...> - 2011-04-14 03:20:18
|
On Apr 13, 2011, at 9:36 PM, Hilmar Lapp wrote: > The trees in S11267 seem to silently fail to render in PhyloWidget. > All I get is a single dot. At least try the first three: > > http://www.treebase.org/treebase-web/search/study/trees.html?id=11267 > > Is this a temporary glitch, a problem with the reconstructed file, or > something else? Should I best file this as a bug in the bug tracker? > > -hilmar > Thanks for the alert. This is actually a bug in PhyloWidget -- if the word "tree" appears in the title of the tree block, PhyloWidget confuses this word with the TREE command, and so goofs up the parsing. I have removed the word "tree" from the tree block, so it works now. However, you might need to refresh the PhyloWidget window after loading to get the tree to pull through -- TreeBASE feels quite slow lately; our API must be getting hit by someone (Rod?). bp |
From: Hilmar L. <hl...@ne...> - 2011-04-14 01:36:40
|
The trees in S11267 seem to silently fail to render in PhyloWidget. All I get is a single dot. At least try the first three: http://www.treebase.org/treebase-web/search/study/trees.html?id=11267 Is this a temporary glitch, a problem with the reconstructed file, or something else? Should I best file this as a bug in the bug tracker? -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- informatics.nescent.org : =========================================================== |
From: Rutger V. <rut...@gm...> - 2011-03-25 18:05:02
|
Bio::Phylo reads LINK and TITLE tags correctly and uses them to link blocks to one another. On Fri, Mar 25, 2011 at 6:01 PM, William Piel <wil...@ya...> wrote: > > On Mar 25, 2011, at 11:51 AM, Rutger Vos wrote: > >> The additions are necessary because they specify the relationships >> between multiple taxa blocks, characters blocks and tree blocks. >> They're not part of the core standard per se, though - I think they >> were introduced by mesquite. Maybe ape should switch to nexml instead? > > > According to this: > > https://www.nescent.org/wg_phyloinformatics/Supporting_NEXUS#Testing_conformance_of_NEXUS_files > > Both MacClade and Mesquite are "level III parsers" for validating NEXUS files. Neither has a problem with TITLE and LINK commands. PAUP does not object to them either. I haven't tested Bio::NEXUS, though, which is the other level III parser that is recommended. > > bp > -- Dr. Rutger A. Vos School of Biological Sciences Philip Lyle Building, Level 4 University of Reading Reading RG6 6BX United Kingdom Tel: +44 (0) 118 378 7535 http://www.nexml.org http://rutgervos.blogspot.com |
From: William P. <wil...@ya...> - 2011-03-25 18:01:35
|
On Mar 25, 2011, at 11:51 AM, Rutger Vos wrote: > The additions are necessary because they specify the relationships > between multiple taxa blocks, characters blocks and tree blocks. > They're not part of the core standard per se, though - I think they > were introduced by mesquite. Maybe ape should switch to nexml instead? According to this: https://www.nescent.org/wg_phyloinformatics/Supporting_NEXUS#Testing_conformance_of_NEXUS_files Both MacClade and Mesquite are "level III parsers" for validating NEXUS files. Neither has a problem with TITLE and LINK commands. PAUP does not object to them either. I haven't tested Bio::NEXUS, though, which is the other level III parser that is recommended. bp |
From: Rutger V. <rut...@gm...> - 2011-03-25 15:52:02
|
The additions are necessary because they specify the relationships between multiple taxa blocks, characters blocks and tree blocks. They're not part of the core standard per se, though - I think they were introduced by mesquite. Maybe ape should switch to nexml instead? On Fri, Mar 25, 2011 at 3:25 PM, Hilmar Lapp <hl...@ne...> wrote: > FYI - the NEXUS download from TreeBASE seems to contain stuff that > aren't understood by some (rather widely used) NEXUS parsers. I don't > know whether these are part of the core standard or were added later > as a "flavor", but it may be beneficial to get rid of them if they > aren't crucial. > > Thoughts? > > (BTW the readNexus() function in the phylobase package - which can > read this - is based on NCL. So presumably that means that in NCL's > implementation of the standard this is correct. It's just that not > everyone uses NCL :-) > > -hilmar > > Begin forwarded message: > >> From: Emmanuel Paradis <Emm...@ir...> >> Date: March 25, 2011 12:00:36 AM EDT >> To: François Michonneau <fra...@gm...> >> Cc: R-s...@r-... >> Subject: Re: [R-sig-phylo] reading nexus file from treebase? >> >> Scott, >> >> readNexus (phylobase) can read your tree but not read.nexus (ape). >> The problem is the two lines inserted within the TREES block: >> >> BEGIN TREES; >> TITLE Tb10793; <<<<<< >> LINK TAXA = M4787; <<<<<< >> TREE Fig._3c = [&R] >> >> If you delete them, it's OK. FigTree also cannot read this file. >> >> Cheers, >> >> Emmanuel >> >> François Michonneau wrote on 25/03/2011 00:35: >>> Hi Scott, >>> Which version of phylobase are you using and which architecture? I >>> can >>> read the file on my machine. >>> Cheers, >>> -- François >>> On Thu, Mar 24, 2011 at 12:39, Scott Chamberlain <myr...@gm... >>> >wrote: >>>> Hello, >>>> >>>> I can't get read.nexus (ape) or readNexus (phylobase) to read >>>> nexus files >>>> downloaded from treebase with URLs parsed from xml files. 
I can't >>>> manually >>>> edit each file as I want to read a lot of these files. Is there an >>>> easy fix? >>>> One of the files is copied below. >>>> >>>> Thanks! >>>> Scott Chamberlain >>>> Rice University, EEB Dept. >>>> >>>> >>>> >>>> >>>> #NEXUS >>>> >>>> [!This data set was downloaded from TreeBASE, a relational >>>> database of >>>> phylogenetic knowledge. TreeBASE has been supported by the NSF, >>>> Harvard >>>> University, Yale University, SDSC and UC Davis. Please do not >>>> remove this >>>> acknowledgment from the Nexus file. >>>> >>>> >>>> Downloaded on March 24, 2011; 16:32 GMT >>>> >>>> TreeBASE (cc) 1994-2008 >>>> >>>> Study reference: >>>> Brown R., & Yang Z. 2010. Bayesian Dating of Shallow Phylogenies >>>> with a >>>> Relaxed Clock. >>>> Systematic Biology, 59(2): 119-131. >>>> >>>> TreeBASE Study URI: >>>> http://purl.org/phylo/treebase/phylows/study/TB2:S10165] >>>> >>>> BEGIN TAXA; >>>> TITLE M4787; >>>> DIMENSIONS NTAX=16; >>>> TAXLABELS >>>> Chalcides_coeruleopunctatus_E2806.20 >>>> Chalcides_coeruleopunctatus_E2806.22 >>>> Chalcides_manueli_E2506.1 >>>> Chalcides_mionecton_mionecton_E2506.10 >>>> Chalcides_mionecton_mionecton_E2506.12 >>>> Chalcides_mionecton_trifasciatus_E2506.18 >>>> Chalcides_polylepis_E14124.1 >>>> Chalcides_polylepis_E14124.2 >>>> Chalcides_polylepis_E2506.21 >>>> Chalcides_sexlineatus_bistriatus_E2806.6 >>>> Chalcides_sexlineatus_sexlineatus_E2806.8 >>>> Chalcides_simonyi_E3007.2 >>>> Chalcides_sphenopsiformis_E8121.26 >>>> Chalcides_sphenopsiformis_E8121.27 >>>> Chalcides_viridanus_E2806.10 >>>> Chalcides_viridanus_E2806.14 >>>> ; >>>> END; >>>> >>>> BEGIN TREES; >>>> TITLE Tb10793; >>>> LINK TAXA = M4787; >>>> TREE Fig._3c = [&R] >>>> 
((Chalcides_sphenopsiformis_E8121.26,Chalcides_sphenopsiformis_E8121.27),(((Chalcides_viridanus_E2806.10,Chalcides_viridanus_E2806.14),((Chalcides_sexlineatus_bistriatus_E2806.6,Chalcides_sexlineatus_sexlineatus_E2806.8),(Chalcides_coeruleopunctatus_E2806.22,Chalcides_coeruleopunctatus_E2806.20))),(Chalcides_simonyi_E3007.2,((Chalcides_mionecton_trifasciatus_E2506.18,(Chalcides_mionecton_mionecton_E2506.12,Chalcides_mionecton_mionecton_E2506.10)),(Chalcides_manueli_E2506.1,(Chalcides_polylepis_E14124.1,(Chalcides_polylepis_E14124.2,Chalcides_polylepis_E2506.21))))))); >>>> [! TreeBASE tree URI: >>>> http://purl.org/phylo/treebase/phylows/tree/TB2:Tr6136] >>>> >>>> >>>> END; >>>> >>>> >>>> [[alternative HTML version deleted]] >>>> >>>> _______________________________________________ >>>> R-sig-phylo mailing list >>>> R-s...@r-... >>>> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo >>>> >>> [[alternative HTML version deleted]] >>> ------------------------------------------------------------------------ >>> _______________________________________________ >>> R-sig-phylo mailing list >>> R-s...@r-... >>> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo >> >> -- >> Emmanuel Paradis >> IRD, Jakarta, Indonesia >> http://ape.mpl.ird.fr/ >> >> _______________________________________________ >> R-sig-phylo mailing list >> R-s...@r-... >> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- informatics.nescent.org : > =========================================================== >
-- Dr. Rutger A. Vos School of Biological Sciences Philip Lyle Building, Level 4 University of Reading Reading RG6 6BX United Kingdom Tel: +44 (0) 118 378 7535 http://www.nexml.org http://rutgervos.blogspot.com |
From: Hilmar L. <hl...@ne...> - 2011-03-25 15:26:05
|
FYI - the NEXUS download from TreeBASE seems to contain stuff that aren't understood by some (rather widely used) NEXUS parsers. I don't know whether these are part of the core standard or were added later as a "flavor", but it may be beneficial to get rid of them if they aren't crucial. Thoughts? (BTW the readNexus() function in the phylobase package - which can read this - is based on NCL. So presumably that means that in NCL's implementation of the standard this is correct. It's just that not everyone uses NCL :-) -hilmar Begin forwarded message: > From: Emmanuel Paradis <Emm...@ir...> > Date: March 25, 2011 12:00:36 AM EDT > To: François Michonneau <fra...@gm...> > Cc: R-s...@r-... > Subject: Re: [R-sig-phylo] reading nexus file from treebase? > > Scott, > > readNexus (phylobase) can read your tree but not read.nexus (ape). > The problem is the two lines inserted within the TREES block: > > BEGIN TREES; > TITLE Tb10793; <<<<<< > LINK TAXA = M4787; <<<<<< > TREE Fig._3c = [&R] > > If you delete them, it's OK. FigTree also cannot read this file. > > Cheers, > > Emmanuel > > François Michonneau wrote on 25/03/2011 00:35: >> Hi Scott, >> Which version of phylobase are you using and which architecture? I >> can >> read the file on my machine. >> Cheers, >> -- François >> On Thu, Mar 24, 2011 at 12:39, Scott Chamberlain <myr...@gm... >> >wrote: >>> Hello, >>> >>> I can't get read.nexus (ape) or readNexus (phylobase) to read >>> nexus files >>> downloaded from treebase with URLs parsed from xml files. I can't >>> manually >>> edit each file as I want to read a lot of these files. Is there an >>> easy fix? >>> One of the files is copied below. >>> >>> Thanks! >>> Scott Chamberlain >>> Rice University, EEB Dept. >>> >>> >>> >>> >>> #NEXUS >>> >>> [!This data set was downloaded from TreeBASE, a relational >>> database of >>> phylogenetic knowledge. TreeBASE has been supported by the NSF, >>> Harvard >>> University, Yale University, SDSC and UC Davis. 
Please do not >>> remove this >>> acknowledgment from the Nexus file. >>> >>> >>> Downloaded on March 24, 2011; 16:32 GMT >>> >>> TreeBASE (cc) 1994-2008 >>> >>> Study reference: >>> Brown R., & Yang Z. 2010. Bayesian Dating of Shallow Phylogenies >>> with a >>> Relaxed Clock. >>> Systematic Biology, 59(2): 119-131. >>> >>> TreeBASE Study URI: >>> http://purl.org/phylo/treebase/phylows/study/TB2:S10165] >>> >>> BEGIN TAXA; >>> TITLE M4787; >>> DIMENSIONS NTAX=16; >>> TAXLABELS >>> Chalcides_coeruleopunctatus_E2806.20 >>> Chalcides_coeruleopunctatus_E2806.22 >>> Chalcides_manueli_E2506.1 >>> Chalcides_mionecton_mionecton_E2506.10 >>> Chalcides_mionecton_mionecton_E2506.12 >>> Chalcides_mionecton_trifasciatus_E2506.18 >>> Chalcides_polylepis_E14124.1 >>> Chalcides_polylepis_E14124.2 >>> Chalcides_polylepis_E2506.21 >>> Chalcides_sexlineatus_bistriatus_E2806.6 >>> Chalcides_sexlineatus_sexlineatus_E2806.8 >>> Chalcides_simonyi_E3007.2 >>> Chalcides_sphenopsiformis_E8121.26 >>> Chalcides_sphenopsiformis_E8121.27 >>> Chalcides_viridanus_E2806.10 >>> Chalcides_viridanus_E2806.14 >>> ; >>> END; >>> >>> BEGIN TREES; >>> TITLE Tb10793; >>> LINK TAXA = M4787; >>> TREE Fig._3c = [&R] >>> ((Chalcides_sphenopsiformis_E8121.26,Chalcides_sphenopsiformis_E8121.27),(((Chalcides_viridanus_E2806.10,Chalcides_viridanus_E2806.14),((Chalcides_sexlineatus_bistriatus_E2806.6,Chalcides_sexlineatus_sexlineatus_E2806.8),(Chalcides_coeruleopunctatus_E2806.22,Chalcides_coeruleopunctatus_E2806.20))),(Chalcides_simonyi_E3007.2,((Chalcides_mionecton_trifasciatus_E2506.18,(Chalcides_mionecton_mionecton_E2506.12,Chalcides_mionecton_mionecton_E2506.10)),(Chalcides_manueli_E2506.1,(Chalcides_polylepis_E14124.1,(Chalcides_polylepis_E14124.2,Chalcides_polylepis_E2506.21))))))); >>> [! 
TreeBASE tree URI: >>> http://purl.org/phylo/treebase/phylows/tree/TB2:Tr6136] >>> >>> >>> END; >>> >>> >>> [[alternative HTML version deleted]] >>> >>> _______________________________________________ >>> R-sig-phylo mailing list >>> R-s...@r-... >>> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo >>> >> [[alternative HTML version deleted]] >> ------------------------------------------------------------------------ >> _______________________________________________ >> R-sig-phylo mailing list >> R-s...@r-... >> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo > > -- > Emmanuel Paradis > IRD, Jakarta, Indonesia > http://ape.mpl.ird.fr/ > > _______________________________________________ > R-sig-phylo mailing list > R-s...@r-... > https://stat.ethz.ch/mailman/listinfo/r-sig-phylo -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- informatics.nescent.org : =========================================================== |
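The workaround Emmanuel describes in the forwarded message (deleting the TITLE and LINK lines from the TREES block) can be automated for a batch of downloaded files. Below is a minimal sketch in Python, assuming each TITLE or LINK command sits on its own line as in the TreeBASE file quoted above; the function name `strip_title_and_link` is hypothetical, and the sketch is not a full NEXUS parser.

```python
import re

# Line-based patterns; NEXUS commands are case-insensitive.
_BEGIN_TREES = re.compile(r"^\s*BEGIN\s+TREES\s*;", re.IGNORECASE)
_END_BLOCK = re.compile(r"^\s*END(BLOCK)?\s*;", re.IGNORECASE)
_TITLE_OR_LINK = re.compile(r"^\s*(TITLE|LINK)\b[^;]*;\s*$", re.IGNORECASE)


def strip_title_and_link(nexus_text):
    """Return nexus_text without TITLE/LINK commands inside TREES blocks.

    Commands in other blocks (e.g. the TITLE in the TAXA block) are kept.
    Assumes one command per line, as in TreeBASE's NEXUS output.
    """
    kept = []
    in_trees = False
    for line in nexus_text.splitlines():
        if _BEGIN_TREES.match(line):
            in_trees = True
        elif in_trees and _END_BLOCK.match(line):
            in_trees = False
        elif in_trees and _TITLE_OR_LINK.match(line):
            continue  # drop the command that read.nexus (ape) trips over
        kept.append(line)
    return "\n".join(kept) + "\n"
```

The cleaned text can then be written back to disk and passed to read.nexus. Note that this discards the information linking the TREES block to its TAXA block, which matters for files containing more than one taxa block.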
From: Rutger V. <rut...@gm...> - 2011-03-14 13:47:12
|
---------- Forwarded message ----------
From: Google Calendar <cal...@go...>
Date: Sat, Mar 12, 2011 at 5:05 PM
Subject: Reminder: deadline ievobio @ Fri Mar 18, 2011 (Rutger Vos)
To: Rutger Vos <rut...@gm...>

deadline ievobio

The Call for Abstracts for full talks is now open for the 2011 conference on Informatics for Phylogenetics, Evolution, and Biodiversity (iEvoBio), at http://ievobio.org/ocs/index.php/ievobio/2011. See below for instructions.

Accepted talks will be about 15-20 minutes in length and will be presented during the full talk sessions in the morning of each of the two conference days, following the day's keynote presentation.

Submitted talks should be in the area of informatics aimed at advancing research in phylogenetics, evolution, and biodiversity, including new tools, cyberinfrastructure development, large-scale data analysis, and visualization.

Submissions consist of a title and an abstract at most 1 page long. The abstract should provide an overview of the talk's subject. As the number of program slots for full talks is limited, the abstract should give enough detail so reviewers can decide whether the submission merits a full talk or whether it should be moved to one of the Lightning Talk sessions. If the subject of the talk is a specific software component for use by the research community, the abstract must state the license and give the URL where the source code is available so reviewers can verify that the open-source requirement(*) is met.

The deadline for submission is March 18, 2011.
We intend to notify authors of accepted talks before early registration for iEvoBio (and Evolution) ends. Further instructions for submission are at the following URL: http://ievobio.org/ocs/index.php/ievobio/2011/schedConf/cfp

Full talks are 1 of 5 kinds of contributed content that iEvoBio will feature. The other 4 are: 1) Lightning talks (5 mins long), 2) Challenge entries, 3) Software bazaar demonstrations, and 4) Birds-of-a-Feather gatherings. The Call for Challenge entries is already open (see http://ievobio.org/challenge.html). The calls for contribution to the other 3 sessions will open later, and will remain open until shortly before the conference or until the respective track fills up.

More details about the program and guidelines for contributing content are available at http://ievobio.org. You can also find continuous updates on the conference's Twitter feed at http://twitter.com/iEvoBio, or subscribe to the low-traffic iEvoBio announcements mailing list at http://groups.google.com/group/ievobio-announce

iEvoBio is sponsored by the US National Evolutionary Synthesis Center (NESCent) in partnership with the Society for the Study of Evolution (SSE) and the Society of Systematic Biologists (SSB). Additional support has been provided by the Encyclopedia of Life (EOL).
The iEvoBio 2011 Organizing Committee:
Rob Guralnick (University of Colorado at Boulder) (Co-chair)
Cynthia Parr (Encyclopedia of Life) (Co-chair)
Dawn Field (UK National Environmental Research Center)
Mark Holder (University of Kansas)
Hilmar Lapp (NESCent)
Rod Page (University of Glasgow)

(*) iEvoBio and its sponsors are dedicated to promoting the practice and philosophy of Open Source software development (see http://www.opensource.org/docs/definition.php) and reuse within the research community. For this reason, if a submitted talk concerns a specific software system for use by the research community, that software must be licensed with a recognized Open Source License (see http://www.opensource.org/licenses/), and be available for download, including source code, by a tar/zip file accessed through ftp/http or through a widely used version control system like cvs, Subversion, git, Bazaar, or Mercurial. Authors of full talks who cannot meet this requirement at the time of submission should state their intentions, and are advised that the requirement must be met by June 19, 2011, at the latest.

*When* Fri Mar 18, 2011
*Calendar* Rutger Vos
*Who* • Rutger Vos - organizer

-- Dr. Rutger A. Vos School of Biological Sciences Philip Lyle Building, Level 4 University of Reading Reading RG6 6BX United Kingdom Tel: +44 (0) 118 378 7535 http://www.nexml.org http://rutgervos.blogspot.com |