Upon reflection I realize that this is, for me, actually a new objection to mzML.  My original problem with the reliance on CV/OBO is that an XML parser for it looks something like this:
 
for each element {
    if (element.name == "cvParam") {
        a whole bunch of hand-rolled logic to pick this apart
    } else {
        there isn't much else
    }
}
 
That's not really an XML parser, therefore I conclude that mzML isn't really XML.  But I have previously beaten that horse to death.  
 
Now we have something new not to like: it's impossible to write a parser that's even remotely future-proof.  Or maybe it's not new, and I just missed it before.  Either way, this all looks increasingly ill-conceived to me.  Sorry to be such a downer.
 
Hey, the horse just twitched:  by placing cvParam information in attributes of the elements of a conventionally structured XML schema (à la mzXML) we can make use of the OBO work without adding a lot of unwanted complexity to software systems that aren't really interested in it.  An mzML that integrates well with OBO-aware systems is an excellent idea, but an mzML that demands you BE an OBO-aware system seems less likely to achieve widespread adoption.
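
A rough sketch of the attribute-based layout, under stated assumptions: the `instrument` element and its `model`, `modelAccession`, and `serialNumber` attributes are made up for illustration and come from neither mzXML nor mzML. Only the accession MS:1000554 for "LCQ Deca" is taken from the examples quoted later in this thread. The point is that a reader with no interest in the CV can ignore the accession attribute entirely, while an OBO-aware reader can still pick it up.

```python
import xml.etree.ElementTree as ET

# Hypothetical element: conventional attributes carry the data, and the
# CV accession rides along as one more (optional) attribute.
SAMPLE = (
    '<instrument model="LCQ Deca" modelAccession="MS:1000554" '
    'serialNumber="23433"/>'
)


def read_instrument(xml_text):
    """Plain reader: uses only the conventional attributes, no CV needed."""
    elem = ET.fromstring(xml_text)
    return {"model": elem.get("model"), "serial": elem.get("serialNumber")}


def read_model_accession(xml_text):
    """OBO-aware reader: additionally picks up the CV accession."""
    return ET.fromstring(xml_text).get("modelAccession")


print(read_instrument(SAMPLE))
print(read_model_accession(SAMPLE))
```

With a layout like this, the attribute names could be declared in an ordinary schema, so the plain reader is the kind of thing a code generator could produce.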
 
I do understand the desire to maintain an ontology instead of an ontology and an XML schema, but I'm not sure we can really get away with it.  By having a schema that offloads most of its work to an external ontology, we're just pushing the work that having a proper schema saves onto the folks creating the readers and writers, making their job much more complicated than it ought to be - you can't autogenerate a parser or serializer without a fully realized schema.  I think we risk them deciding that mzXML and mzData aren't really all that broken after all.
 
Brian


From: psidev-ms-dev-bounces@lists.sourceforge.net [mailto:psidev-ms-dev-bounces@lists.sourceforge.net] On Behalf Of Matthew Chambers
Sent: Tuesday, August 07, 2007 11:57 AM
To: psidev-ms-dev@lists.sourceforge.net
Subject: Re: [Psidev-ms-dev] cvParams using name attribute as value

In addition to Mike’s and Brian’s concerns, I am wondering how “LCQ Deca” is called a “term/concept?”  “Instrument model” is the closest relevant term/concept as I understand those words.  Is the cvParam not capable of controlling both the name and possible values of its definitions?  Also, why are the different instrument models part of the CV anyway?  It seems that the CV should support controlling both terms and the values (or instances) of those terms:

“LCQ Deca” IS A VALID INSTANCE OF “thermo finnigan” IS A “thermo fisher scientific” IS A “instrument model”

 I don’t really understand the middle two jumps either, i.e. why are they redundant?

 


From: Eric Deutsch [mailto:edeutsch@systemsbiology.org]
Sent: Tuesday, August 07, 2007 12:13 PM
To: Matthew Chambers; psidev-ms-dev@lists.sourceforge.net
Subject: RE: [Psidev-ms-dev] cvParams using name attribute as value

 

Hi Matt, the agreed-upon rule here is that the cvParams should always refer to the most detailed concept, and the value attribute should *only* be filled if there is a scalar value associated with the concept that cannot be in the CV itself.  So:

 

<cvParam cvLabel="MS" accession="MS:1000554" name="LCQ Deca" value=""/>

<cvParam cvLabel="MS" accession="MS:1000529" name="Instrument Serial Number" value="23433"/>

 

So for the first, the term/concept is “LCQ Deca”.  From the CV, one can learn that an “LCQ Deca” IS A “instrument model”, and so there’s no need (and it is perhaps a little dangerous) to put “LCQ Deca” as a value of “instrument model”.

 

However, “instrument serial number” is the most specific concept in the CV, and thus the actual SN is the value.

 

This was discussed at some length, and it is the new way of doing things, which will be uniform across all PSI and FuGE implementations. At least, that is my understanding. This does mean that parsers need to be a little smarter and be “CV-aware”. The parser/interpreter can no longer assume that there will be a term “instrument model” and look for its value.  But rather, the parser/interpreter must now look to see if any of the terms provided are a child of “instrument model” in the CV.
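
A minimal sketch of what such a “CV-aware” lookup might involve, under stated assumptions: the `IS_A` table is hand-copied from the IS A chain quoted elsewhere in this thread and keyed by term name for readability, whereas a real reader would load the relationships from the OBO file and key them by accession. The `instrument` wrapper element is hypothetical, and the sketch assumes each term has a single parent, which OBO does not guarantee.

```python
import xml.etree.ElementTree as ET

# is_a links copied by hand from the chain quoted in this thread.
# Assumption: one parent per term; a real OBO term may have several.
IS_A = {
    "LCQ Deca": "thermo finnigan",
    "thermo finnigan": "thermo fisher scientific",
    "thermo fisher scientific": "instrument model",
}


def is_descendant(term, ancestor):
    """Walk is_a links upward from term; True if ancestor is reached."""
    seen = set()
    while term in IS_A and term not in seen:
        seen.add(term)  # guard against accidental cycles in the table
        term = IS_A[term]
        if term == ancestor:
            return True
    return False


def find_child_terms(xml_text, ancestor):
    """Return names of cvParams whose term is a child of ancestor."""
    root = ET.fromstring(xml_text)
    return [
        p.get("name")
        for p in root.iter("cvParam")
        if is_descendant(p.get("name"), ancestor)
    ]


# The two cvParams are the ones shown above; the wrapper is hypothetical.
SAMPLE = """<instrument>
  <cvParam cvLabel="MS" accession="MS:1000554" name="LCQ Deca" value=""/>
  <cvParam cvLabel="MS" accession="MS:1000529" name="Instrument Serial Number" value="23433"/>
</instrument>"""

print(find_child_terms(SAMPLE, "instrument model"))  # ['LCQ Deca']
```

The serial number param is not matched because its term has no path to “instrument model” in the table; a reader that wants it would still look it up directly by its own term and read the value attribute.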

 

Regards,

Eric

 

 

 


From: psidev-ms-dev-bounces@lists.sourceforge.net [mailto:psidev-ms-dev-bounces@lists.sourceforge.net] On Behalf Of Matthew Chambers
Sent: Tuesday, August 07, 2007 9:40 AM
To: psidev-ms-dev@lists.sourceforge.net
Subject: [Psidev-ms-dev] cvParams using name attribute as value

 

I’m a little confused about the parameters which use the accession number as a kind of value instead of the accession number identifying a variable and then using the value attribute to assign the value.  I don’t understand why:

<cvParam cvLabel="MS" accession="MS:1000130" name="Positive Scan" value=""/> (from mzML)

Is preferable to:

<cvParam cvLabel="psi" accession="PSI:1000037" name="Polarity" value="positive"/> (from mzData)

 

There are other examples of this as well.  What’s the logic here?

 

-Matt Chambers