Re: [Treebase-devel] analysis metadata in TreeBASE, some experiments


I don't believe the two things are exclusive - a single step is just a
"big step" that can be refined into a sequence of smaller steps if one
wants to. 
I believe Maryam is currently looking at the "leaves" of this hierarchical
tree that decomposes the analysis (by looking at the individual tools that
can be used in the analysis).
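
As a rough sketch of that idea (the Step class and its field names are hypothetical, not anything in TreeBASE or MIAPA), a step can be modeled as a node that is either a leaf or a sequence of finer-grained steps, so a single-step analysis is just a tree of depth one:

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    """Hypothetical analysis step; an empty substeps list marks a leaf."""
    name: str
    substeps: list = field(default_factory=list)

    def leaves(self):
        """Return the finest-grained steps under this one, in order."""
        if not self.substeps:
            return [self]
        return [leaf for s in self.substeps for leaf in s.leaves()]


# One "big step" refined into two smaller ones (names are illustrative).
big = Step("infer consensus tree", [Step("tree search"), Step("consensus")])
print([s.name for s in big.leaves()])  # ['tree search', 'consensus']
```

An unrefined step returns itself from leaves(), so the single-step and multi-step views can coexist in one structure.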

Enrico

-- 
Dept. Computer Science,
New Mexico State University
MSC CS, Box 30001, Las Cruces, NM 88003
Voice: 575-646-6239   Fax: 575-646-1002

On 6/14/11 9:03 AM, "William Piel" <wil...@ya...> wrote:

>
>Interesting solution. MIAPA will definitely need to tackle analysis info,
>even if it's at a rudimentary level (like what produced what from what).
>TreeBASE used to only have analysis records -- the analysis step records
>were introduced with the idea that submitters might want the ability to
>describe more complex analyses in a multi-step fashion (e.g. took matrix
>x and produced set of trees y, took set of trees y and produced consensus
>tree z, etc). But alas, very few submitters have taken advantage of this
>-- most just have one step per analysis. Yet having multiple steps adds a
>slightly greater mouse-clicking burden.  Maybe we should consider
>abandoning the multi-step design, and collapsing it down to single-step
>analysis entries? How might MIAPA come to an opinion on matters like this?
>
>bp
>
>On Jun 13, 2011, at 8:37 AM, Rutger Vos wrote:
>
>> Hi all,
>> 
>> over the weekend I did some experimentation with how additional
>> metadata having to do with phylogenetic analyses stored by TreeBASE
>> could be serialized. Attached is the result as produced by a test case
>> that I committed to the TreeBASE source.
>> 
>> For context, here is how TreeBASE sees the world: every submission to
>> TreeBASE consists of the results of one or more analyses. Each
>> analysis consists of one or more analysis steps. For each step, we
>> store the "algorithm" (e.g. neighbor joining) and the "software" (e.g.
>> PAUP). Optional additional metadata can consist of a textual
>> description of the algorithm, a version number and URL of the software
>> and a text string containing analysis step commands (perhaps something
>> like a PAUP block).
>> 
>> Every analysis step has input and output data. These data can be trees
>> and matrices. The set of taxa in the input must be a superset of the
>> taxa in the output (i.e. some sort of taxon pruning is allowed, but
>> new taxa cannot be introduced during an analysis step). All data
>> that's accessible to third parties (i.e. all public, non-embargoed
>> data) must be the input or output of at least one analysis step, i.e.
>> we don't allow orphaned data in completed submissions.
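
The model and the two rules described above (output taxa must be a subset of input taxa; no orphaned public data) could be sketched roughly as follows. This is only an illustration: the class and function names are made up for the example and do not reflect the actual TreeBASE schema or code.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataObject:
    """Hypothetical tree or matrix with the set of taxa it contains."""
    label: str
    kind: str          # "tree" or "matrix"
    taxa: frozenset


@dataclass
class AnalysisStep:
    algorithm: str     # e.g. "neighbor joining"
    software: str      # e.g. "PAUP"
    inputs: list = field(default_factory=list)
    outputs: list = field(default_factory=list)
    commands: str = ""  # optional, e.g. a PAUP block

    def taxa_ok(self):
        """True if the input taxa are a superset of the output taxa."""
        in_taxa = frozenset().union(*(d.taxa for d in self.inputs))
        out_taxa = frozenset().union(*(d.taxa for d in self.outputs))
        return in_taxa >= out_taxa


@dataclass
class Analysis:
    steps: list = field(default_factory=list)


def orphans(public_data, analyses):
    """Return public data objects that no step uses as input or output."""
    used = {d for a in analyses for s in a.steps for d in s.inputs + s.outputs}
    return [d for d in public_data if d not in used]


# Illustrative chain: matrix x -> trees y -> consensus tree z (with pruning).
x = DataObject("matrix x", "matrix", frozenset({"A", "B", "C"}))
y = DataObject("trees y", "tree", frozenset({"A", "B", "C"}))
z = DataObject("consensus z", "tree", frozenset({"A", "B"}))
analysis = Analysis([
    AnalysisStep("neighbor joining", "PAUP", [x], [y]),
    AnalysisStep("consensus", "PAUP", [y], [z]),
])
assert all(step.taxa_ok() for step in analysis.steps)
assert orphans([x, y, z], [analysis]) == []
```

Here "trees y" is both the output of the first step and the input of the second, which is how chained steps share a data object.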
>> 
>> In the attached example, I'm annotating the study (i.e. the root of
>> the nexml document) to specify the permanent URLs of any associated
>> analyses, and I annotate those analysis URLs with their respective
>> analysis steps, specifying their PURLs and any additional metadata as
>> described above. This is shown in lines 3-13.
>> 
>> Then, for every data object I specify for which analysis step(s) it is
>> the input and/or output (a data object can be both input and output if
>> analysis steps are chained together). This is shown in line 448 for a
>> character state matrix and line 1849 for a tree.
>> 
>> This is all highly experimental but I figured I'd share it as a
>> discussion piece for refining the actual implementation of MIAPA
>> annotations.
>> 
>> Rutger
>
>