From: William P. <wil...@ya...> - 2011-06-14 15:03:25
Interesting solution. MIAPA will definitely need to tackle analysis info, even if only at a rudimentary level (i.e. what produced what from what).

TreeBASE used to have only analysis records. Analysis step records were introduced with the idea that submitters might want to describe more complex analyses in a multi-step fashion (e.g. took matrix x and produced set of trees y, took set of trees y and produced consensus tree z, etc.). But alas, very few submitters have taken advantage of this; most just have one step per analysis. Yet having multiple steps adds a slightly greater mouse-clicking burden.

Maybe we should consider abandoning the multi-step design and collapsing it down to single-step analysis entries? How might MIAPA come to an opinion on matters like this?

bp

On Jun 13, 2011, at 8:37 AM, Rutger Vos wrote:

> Hi all,
>
> over the weekend I did some experimentation with how additional
> metadata having to do with phylogenetic analyses stored by TreeBASE
> could be serialized. Attached is the result as produced by a test case
> that I committed to the TreeBASE source.
>
> For context, here is how TreeBASE sees the world: every submission to
> TreeBASE consists of the results of one or more analyses. Each
> analysis consists of one or more analysis steps. For each step, we
> store the "algorithm" (e.g. neighbor joining) and the "software" (e.g.
> PAUP). Optional additional metadata can consist of a textual
> description of the algorithm, a version number and URL of the software
> and a text string containing analysis step commands (perhaps something
> like a PAUP block).
>
> Every analysis step has input and output data. These data can be trees
> and matrices. The set of taxa in the input must be a superset of the
> taxa in the output (i.e. some sort of taxon pruning is allowed, but
> new taxa cannot be introduced during an analysis step). All data
> that's accessible to third parties (i.e. all public, non-embargoed
> data) must be the input or output of at least one analysis step, i.e.
> we don't allow orphaned data in completed submissions.
>
> In the attached example, I'm annotating the study (i.e. the root of
> the nexml document) to specify the permanent URLs of any associated
> analyses, and I annotate those analysis URLs with their respective
> analysis steps, specifying their PURLs and any additional metadata as
> described above. This is shown in lines 3-13.
>
> Then, for every data object I specify for which analysis step(s) it is
> the input and/or output (a data object can be both input and output if
> analysis steps are chained together). This is shown in line 448 for a
> character state matrix and line 1849 for a tree.
>
> This is all highly experimental but I figured I'd share it as a
> discussion piece for refining actual implementation of MIAPA
> annotations.
>
> Rutger
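
For readers who want something concrete to argue over, the data model described in the quoted message (analyses made of steps, each step with input and output data, no new taxa introduced by a step, no orphaned public data) might be sketched in Python roughly as below. This is only an illustration under stated assumptions: every class, field, and function name here is hypothetical and is not taken from the TreeBASE source or the attached NeXML example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataObject:
    """A tree or matrix (hypothetical class); `taxa` holds its taxon labels."""
    name: str
    kind: str  # "matrix" or "tree"
    taxa: frozenset


@dataclass
class AnalysisStep:
    """One step: an algorithm run in some software on inputs, giving outputs."""
    algorithm: str
    software: str
    inputs: list = field(default_factory=list)
    outputs: list = field(default_factory=list)

    def introduces_new_taxa(self):
        # Input taxa must be a superset of output taxa: pruning is allowed,
        # but a step may not introduce taxa absent from its inputs.
        in_taxa = frozenset().union(*(d.taxa for d in self.inputs))
        out_taxa = frozenset().union(*(d.taxa for d in self.outputs))
        return not out_taxa <= in_taxa


@dataclass
class Analysis:
    """An analysis is an ordered list of one or more steps."""
    steps: list = field(default_factory=list)


def orphaned(public_data, analyses):
    """Return names of public data objects not used by any analysis step."""
    used = set()
    for analysis in analyses:
        for step in analysis.steps:
            used.update(d.name for d in step.inputs)
            used.update(d.name for d in step.outputs)
    return [d.name for d in public_data if d.name not in used]


# Made-up example mirroring "matrix x -> trees y -> consensus tree z".
x = DataObject("x", "matrix", frozenset({"A", "B", "C"}))
y = DataObject("y", "tree", frozenset({"A", "B", "C"}))
z = DataObject("z", "tree", frozenset({"A", "B"}))   # one taxon pruned
w = DataObject("w", "tree", frozenset({"A", "D"}))   # not used by any step

search = AnalysisStep("parsimony", "PAUP", inputs=[x], outputs=[y])
consensus = AnalysisStep("consensus", "PAUP", inputs=[y], outputs=[z])
invalid = AnalysisStep("consensus", "PAUP", inputs=[z], outputs=[w])

analysis = Analysis(steps=[search, consensus])
orphans = orphaned([x, y, z, w], [analysis])
```

In this sketch `y` is both the output of the first step and the input of the second, which is the chaining case mentioned in the quoted message. Collapsing to single-step entries would amount to each `Analysis` holding exactly one `AnalysisStep`, at the cost of no longer recording intermediate objects such as `y` as a link between steps.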