From: Rutger V. <R....@re...> - 2011-11-08 14:52:48
> The "low-hanging fruit" version that Bill describes would mean just
> putting a text blob into a NeXML "submission" or "miapa_checklist"
> element designed specifically for this purpose. Given external
> vocabulary support, NeXML can support something a bit better than
> this, which is to have a "submission" or "miapa_checklist" bag filled
> with RDF-like triples (using NeXML's scheme for this). A further step
> might be to build some of the logical structure of the MIAPA checklist
> into the NeXML schema, though this raises the question of whether it
> all belongs in a "miapa-checklist" element or should be distributed in
> various places in the file (e.g., alignment method with characters,
> tree method with tree, author data at the top level, etc).

I don't think a text blob is truly a low-hanging fruit if that means it
is some sort of opaque flat text syntax that would have to be parsed
separately from NeXML syntax. XML element structures can be the
values/objects of semantic annotations so I would suggest that as the
entry-level (but still suboptimal) way of doing this. Since parts of
MIAPA metadata are applicable to specific types of data objects (e.g.
alignment parameters apply to multiple sequence alignments) of which
there can be multiples in a single document, I would strongly suggest
distributing checklist elements to relevant positions in a document.

> 3. If we want to build in support for measuring MIAPA conformance
> (i.e., this submission gets a 3.2 out of 7 checklist items), then
> there must be some kind of standardized grammar so that a machine can
> detect whether or not a record has specified a particular checklist
> element, e.g., alignment method. A text blob will not suffice for
> this.

So the more granular the annotations are, the better - though I think it
unrealistically optimistic to expect us to be able to validate
individual annotation values (e.g. do these alignment parameters make
sense?). But at least we'd be able to report presence/absence.

> 4. None of this addresses where we are going to get controlled
> vocabularies to specify alignment methods, for instance. Several
> people have tried to address this, and there are resources out there
> that have some elements of the desired vocabulary (mygrid services
> ontology; O'Meara's treetapper resource; CDAO). It's easy to start
> this but hard to finish. As Bill mentioned, it was a goal of CIPRES,
> too. Every time someone tries to do this, they end up with a hornet's
> nest. But maybe that is due to the lack of a clear target-- which
> perhaps is remedied by having a miapa checklist and an auto-submission
> problem to solve.

That seems to be the way things are done around here :-)

> 5. Is it problematic that MEGA is not open-source, e.g., with respect
> to devoting resources to working with a non-open-source? According to
> Sudhir (I asked him specifically about this) "the source code for the
> computational core is available upon request and permission is granted
> to use the computational core of MEGA for personal research and
> testing only", but that the GUI is based on proprietary components and
> the source code is not available. Would this prevent us from working
> with MEGA programmers at a NESCent hackathon, for instance? Would we
> ask Sudhir to open-source the submission component of the code as a
> separate module?

I can't imagine that's a showstopper. Either the submission component is
factored out as a separate module - which is probably the best way to go
about it anyway from MEGA's design p.o.v. - or any other hackers are
just going to have to promise not to look at the proprietary GUI
components, which are irrelevant here anyway. Or something.

Rutger

> On Nov 6, 2011, at 7:50 PM, Hilmar Lapp wrote:
>
>> Hi Arlin,
>>
>> I spoke with Sudhir earlier this year at the ISMB conference about
>> pretty much the same thing. The Dryad-TreeBASE interface isn't secret
>> in any way [1,2], and as Bill points out is quite limited in what it
>> achieves.
>>
>> In the ABI grant proposal we submitted in July [3], we actually
>> propose to create precisely such a submission API that 3rd party
>> applications can use to submit richly annotated data to TreeBASE
>> directly, and indeed we propose to build on the Dryad/TreeBASE
>> handshaking interface to accomplish this. If Sudhir has resources
>> available to prototype this now, at the end of TreeBASE or MEGA or
>> both, that'd be terrific, and I'd be happy to help as far as I can to
>> facilitate that better.
>>
>> BTW I also spoke with Sudhir about possibly supporting NeXML from
>> within MEGA, and he appeared very open to that - he said that
>> essentially all he needs is someone who can help by providing the
>> guidance on NeXML implementation. MEGA supporting NeXML wouldn't help
>> with TreeBASE submission right now, but I imagine that the envisioned
>> programmable submission API would certainly rely on NeXML.
>>
>> -hilmar
>>
>> [1] https://datadryad.org/wiki/TreeBASE_Submission_Integration
>> [2] https://datadryad.org/wiki/BagIt_Handshaking
>> [3] http://www.evoio.org/wiki/ABI_2011_proposal
>>
>> On Nov 4, 2011, at 7:53 AM, Arlin Stoltzfus wrote:
>>
>>> Hello all. Yesterday I had a talk with Sudhir Kumar, author of MEGA,
>>> which probably is responsible for more published trees than any other
>>> phylogeny inference package (not necessarily the most trees among the
>>> phylogeny elite represented in TreeBASE). I discovered that MEGA has
>>> a graphical name-reconciling interface for users to align mismatched
>>> OTU names between tree and alignment files-- this is a common problem
>>> and a barrier to re-use that I have encountered personally multiple
>>> times.
>>>
>>> He suggested the idea that, to facilitate effective archiving, it
>>> might be useful to have a way for phylogeny applications to
>>> generate a submission in TreeBASE, providing metadata such as
>>> software version and run conditions.
>>>
>>> Probably you have heard this suggestion before (I heard it earlier
>>> this week from Joseph Hughes in regard to BEAST).
>>>
>>> I mentioned that TreeBASE has a top-secret interface that Dryad uses
>>> to submit NEXUS files, and that this could be the basis for a
>>> submission interface for other applications. My understanding is
>>> that this is done via web-services, and that the user gets a link to
>>> a temporary submission that must be completed interactively. I hope
>>> I didn't give the wrong impression.
>>>
>>> Anyway, Sudhir was very interested in this. He said that he has
>>> programmers with time to work on this kind of thing. If the MEGA
>>> team prototyped a direct-submission interface, they could write a
>>> brief paper about it, and maybe we could get other developers
>>> together to hash out the metadata terms to support, based on the
>>> recent MIAPA exercise at TDWG. If we could get MEGA and the top 3
>>> TreeBASE programs (PAUP, MB, RAXML-- right?), that would cover a
>>> very large segment of users.
>>>
>>> I realize that this approach might not be the best way to promote
>>> archiving in the long-term. However, it might be more effective in
>>> the short term, and we might learn a lot from it.
>>>
>>> I'd like to hear any thoughts you have on this. Would this be a
>>> useful exercise? What are the disadvantages? How could it fit
>>> into a larger strategy?
>>>
>>> Arlin
>>> -------
>>> Arlin Stoltzfus (ar...@um...)
>>> Fellow, IBBR; Adj. Assoc. Prof., UMCP; Research Biologist, NIST
>>> IBBR, 9600 Gudelsky Drive, Rockville, MD
>>> tel: 240 314 6208; web: www.molevol.org
>>>
>>> _______________________________________________
>>> Treebase-devel mailing list
>>> Tre...@li...
>>> https://lists.sourceforge.net/lists/listinfo/treebase-devel
>>
>> --
>> ===========================================================
>> : Hilmar Lapp -:- Durham, NC -:- informatics.nescent.org :
>> ===========================================================
>
> -------
> Arlin Stoltzfus (ar...@um...)
> Fellow, IBBR; Adj. Assoc. Prof., UMCP; Research Biologist, NIST
> IBBR, 9600 Gudelsky Drive, Rockville, MD
> tel: 240 314 6208; web: www.molevol.org

--
Dr. Rutger A. Vos
School of Biological Sciences
Philip Lyle Building, Level 4
University of Reading
Reading, RG6 6BX, United Kingdom
Tel: +44 (0) 118 378 7535
http://rutgervos.blogspot.com