From: William P. <wil...@ya...> - 2011-04-14 20:00:21
On Apr 14, 2011, at 12:04 PM, Roderic Page wrote:
> It's not about publications, it's not about sequences, it's not really about data (OK, a little bit about data), it's about trees

From where I sit, alignments are an important resource for the community. Nobody emails me asking for a tree that is missing from TreeBASE, but I'm always being asked for an alignment that should be in TreeBASE but is not. Typically it is because the author started, but never finished, a submission. Earlier this year I had a case where an author wanted her alignments embargoed for a year post-publication. After several people independently emailed me to request access, I contacted the journal, they convened the board, and they passed a resolution stating that all data must be released immediately. So these are not without value.

Alignments are collections of hypotheses of homology (NCHAR of them per alignment!) that are often difficult to rebuild from scratch -- trees are merely blended summary diagrams of these hypotheses. Plus, retyping a morphological dataset after OCR'ing a PDF is an enormous pain.

On Apr 14, 2011, at 12:04 PM, Roderic Page wrote:
> So I guess I'd do the following:

Is there anything to stop anyone from doing exactly this? And you could have your CouchDB updated periodically by running a cron job against TreeBASE's OAI-PMH interface to get the IDs of all new or modified studies (e.g., since April 12th, GMT: http://treebase.org/treebase-web/top/oai?verb=ListIdentifiers&metadataPrefix=oai_dc&from=2011-04-12T00:00:00Z), and then pull down the NeXML for just the trees, convert to JSON, populate the CouchDB, etc. It should be relatively easy to maintain a CouchDB mirror.

bp

PS - Hmm... Rod, do you know you have a way of loving-and-then-hating things? :-)

On Apr 14, 2011, at 12:04 PM, Roderic Page wrote:
> 7. Never, ever mention RDF.

On Apr 14, 2011, at 1:05 PM, Roderic Page wrote:
> I ... was once an enthusiast [of RDF]

On Apr 14, 2011, at 12:04 PM, Roderic Page wrote:
> Bonus points for not mentioning XML.

On May 20, 2004, at 7:37 PM, Roderic D. M. Page wrote:
> [I think TreeBASE should] store data (say, the character states for a taxon) as an XML formatted BLOB.

On Jan 15, 2006, at 4:15 PM, Roderic Page wrote:
> once one of the major providers adopts LSIDs (my money is on uBio), whatever they adopt will drive standards

On Apr 1, 2009, at 9:20 AM, Roderic D. M. Page wrote:
> I think that [LSIDs have] been the Achilles heel of biodiversity informatics.

[etc..]
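
For what it's worth, the OAI-PMH polling step suggested earlier in this message could look roughly like the Python sketch below. It is only a sketch under assumptions: the endpoint and query parameters are the ones in the URL quoted above, the function names are made up for illustration, and the later steps (fetching the NeXML, converting to JSON, writing to CouchDB) are left out because their URLs and document layout are not given here.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Standard OAI-PMH 2.0 XML namespace.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}
# Endpoint taken from the URL quoted in the message.
BASE_URL = "http://treebase.org/treebase-web/top/oai"


def list_identifiers_url(since, base_url=BASE_URL):
    """Build a ListIdentifiers request for records changed since `since` (UTC)."""
    params = {"verb": "ListIdentifiers", "metadataPrefix": "oai_dc", "from": since}
    # Keep ':' unescaped so the URL matches the form used in the message.
    return base_url + "?" + urllib.parse.urlencode(params, safe=":")


def parse_identifiers(xml_data):
    """Return (identifiers, resumption_token) from a ListIdentifiers response.

    Headers flagged status="deleted" are skipped; the token is None when
    the response is the last (or only) page.
    """
    root = ET.fromstring(xml_data)
    ids = []
    for header in root.iterfind(".//oai:header", OAI_NS):
        if header.get("status") == "deleted":
            continue
        ident = header.findtext("oai:identifier", default="", namespaces=OAI_NS)
        if ident.strip():
            ids.append(ident.strip())
    token = root.findtext(".//oai:resumptionToken", default="", namespaces=OAI_NS)
    return ids, (token.strip() or None)


def harvest(since, base_url=BASE_URL):
    """Collect all identifiers changed since `since`, following resumption tokens.

    Makes network requests; intended to be called from a cron-driven script.
    """
    url = list_identifiers_url(since, base_url)
    found = []
    while url:
        with urllib.request.urlopen(url) as response:
            ids, token = parse_identifiers(response.read())
        found.extend(ids)
        if token:
            # Follow-up requests carry only the verb and the resumption token.
            query = urllib.parse.urlencode(
                {"verb": "ListIdentifiers", "resumptionToken": token}
            )
            url = base_url + "?" + query
        else:
            url = None
    return found
```

A cron job would call `harvest()` with the timestamp of its previous run and hand the returned IDs to whatever fetches and converts the NeXML. Skipping deleted headers is a choice made for this sketch; a real mirror would probably want to remove those records from the CouchDB as well.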