From: Eric D. <ede...@sy...> - 2007-08-08 06:35:22

Thank you all for the lively discussion.

One proposal I once made in Lyon (which was roundly dismissed I believe)
was something like this: instead of:

<cvParam cvLabel="MS" accession="MS:1000554" name="LCQ Deca" value=""/>

Have:

<cvParam cvLabel="MS" parentAccession="MS:1000031"
accession="MS:1000554" name="LCQ Deca" value=""/>

Thus the parser can easily be coded to know that any cvParam with a
parentAccession="MS:1000031" is going to be an instrument model whether
or not it's in the CV. The mzML semantic validator tool would, of
course, check all this. The main argument against this was the potential
for inconsistency, I seem to recall.
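
A minimal sketch of how a reader could use such an attribute, in Python
with the standard-library ElementTree. The parentAccession attribute is
only the proposal described above, and the <instrument> wrapper element
and the MS:9999999 term are made up for illustration:

```python
import xml.etree.ElementTree as ET

# Accession of the parent term used in the example above.
INSTRUMENT_MODEL = "MS:1000031"

# Hypothetical fragment: parentAccession is only the proposed attribute,
# <instrument> is an invented wrapper, and MS:9999999 stands in for a
# model that is not (yet) in the reader's copy of the CV.
FRAGMENT = """
<instrument>
  <cvParam cvLabel="MS" parentAccession="MS:1000031"
           accession="MS:1000554" name="LCQ Deca" value=""/>
  <cvParam cvLabel="MS" parentAccession="MS:1000031"
           accession="MS:9999999" name="Some Future Model" value=""/>
</instrument>
"""

def instrument_models(xml_text):
    """Return (accession, name) for each cvParam tagged as an instrument model."""
    root = ET.fromstring(xml_text)
    return [
        (param.get("accession"), param.get("name"))
        for param in root.iter("cvParam")
        if param.get("parentAccession") == INSTRUMENT_MODEL
    ]

models = instrument_models(FRAGMENT)
```

Both entries are recognized as instrument models without any CV lookup,
including the one whose accession the reader has never seen.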

The decision was made to make individual models CV terms to avoid
problems like:

<cvParam cvLabel="MS" accession="MS:1000031" name="instrument model"
value="LCQ Deca"/>
<cvParam cvLabel="MS" accession="MS:1000031" name="instrument model"
value="LCQ DECA"/>
<cvParam cvLabel="MS" accession="MS:1000031" name="instrument model"
value="LTQ FT"/>
<cvParam cvLabel="MS" accession="MS:1000031" name="instrument model"
value="LTQ-FT"/>
<cvParam cvLabel="MS" accession="MS:1000031" name="instrument model"
value="LTQFT"/>

I would argue that your code snippet below would better look like:

#define MS_CV_POLARITY_TYPE "MS:1000037"
if( element.parent == "spectrumDescription" ) {
   for each child {
      if( child.name == "cvParam" ) {
         // if the term is a polarity type
         if( cv.isChildOf(child.attrs['accession'], MS_CV_POLARITY_TYPE) )
            spectrum.polarity = cv.getName(child.attrs['accession']);
      }
   }
}

Note that the cvParam name (should that be "positive" or "Positive" or
"positive polarity" or "Polarity" or "polarity"?) is not in the code,
just MS:1000037, which can be considered final.

This does require a CV class and some methods:
cv.loadFromFile()
cv.isChildOf()
cv.getName()

but this is not really complicated.
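
To give a rough idea of the size of such a class, here is a sketch in
Python. It assumes the CV is available as OBO-style [Term] stanzas with
id, name and is_a lines. The method names are Python spellings of the
three above, load_from_text is an extra helper, and the two child terms
and all term names in the sample are invented for illustration (only
MS:1000037 comes from the snippet above):

```python
class CV:
    """Tiny in-memory controlled vocabulary built from OBO-style [Term] stanzas."""

    def __init__(self):
        self.names = {}    # accession -> term name
        self.parents = {}  # accession -> list of is_a parent accessions

    def load_from_text(self, text):
        in_term, acc = False, None
        for raw in text.splitlines():
            line = raw.strip()
            if line.startswith("["):
                # A new stanza begins; only [Term] stanzas are kept.
                in_term, acc = (line == "[Term]"), None
            elif in_term and line.startswith("id:"):
                acc = line[3:].strip()
                self.parents.setdefault(acc, [])
            elif acc and line.startswith("name:"):
                self.names[acc] = line[5:].strip()
            elif acc and line.startswith("is_a:"):
                # Drop the trailing "! comment" allowed after the parent id.
                self.parents[acc].append(line[5:].split("!")[0].strip())

    def load_from_file(self, path):
        with open(path, encoding="utf-8") as handle:
            self.load_from_text(handle.read())

    def is_child_of(self, acc, ancestor):
        """True if ancestor is reachable from acc by following is_a links."""
        seen, stack = set(), [acc]
        while stack:
            current = stack.pop()
            for parent in self.parents.get(current, []):
                if parent == ancestor:
                    return True
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return False

    def get_name(self, acc):
        return self.names[acc]


# Invented sample vocabulary; MS:9000001 and MS:9000002 are placeholders.
SAMPLE_OBO = """
[Term]
id: MS:1000037
name: polarity

[Term]
id: MS:9000001
name: positive
is_a: MS:1000037 ! polarity

[Term]
id: MS:9000002
name: negative
is_a: MS:1000037 ! polarity
"""

cv = CV()
cv.load_from_text(SAMPLE_OBO)
```

With this, the polarity test in the snippet above reduces to
cv.is_child_of(accession, "MS:1000037").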

Take cover!
Eric

________________________________
From: psi...@li... [mailto:psi...@li...] On Behalf Of Matthew Chambers
Sent: Tuesday, August 07, 2007 1:43 PM
To: psi...@li...
Subject: Re: [Psidev-ms-dev] cvParams using name attribute as value

As long as the name/value paradigm is used, the loop doesn't get much
more complicated than:
if( element.parent == "spectrumDescription" ) {
   for each child {
      if( child.name == "cvParam" ) {
         if( child.attrs['name'] == "Polarity" )
            spectrum.polarity = child.attrs['value'];
      }
   }
}

But if you have to do:
if( element.parent == "spectrumDescription" ) {
   for each child {
      if( child.name == "cvParam" ) {
         if( child.attrs['name'] == "Positive" )
            spectrum.polarity = "positive";
         else if( child.attrs['name'] == "Negative" )
            spectrum.polarity = "negative";
      }
   }
}
...parsers will be painful to write and adoption will suffer because of
it, I think.  Not to mention the fact that the idea of adding these
things that should really be values as "terms" in the vocabulary is
indeed not future-proof.  In the future, there might be another IS_A
relationship for "LCQ Deca" so that merely by seeing LCQ Deca you won't
know that you're looking at an instrument model parameter.  Of course,
the accession number would tell you uniquely, but then you'll have two
accession numbers in the vocabulary with the name "LCQ Deca."  Yuck!

I think values for terms should be given a special relationship in the
CV; they shouldn't be given an "IS_A" relationship, with the parser
expected to look up the implication of that relationship every time a
value-as-term is encountered.

-Matt

________________________________
From: psi...@li... [mailto:psi...@li...] On Behalf Of Brian Pratt
Sent: Tuesday, August 07, 2007 3:00 PM
To: psi...@li...
Subject: Re: [Psidev-ms-dev] cvParams using name attribute as value

Upon reflection, I realize that this is, for me, actually a new
objection to mzML.  My original problem with the reliance on CV/OBO is
that an XML parser for it looks something like this:

for each element {
    if (element.name == "cvParam") then {
       a whole bunch of handrolled logic to pick this apart
    } else {
       there isn't much else
    }
}

That's not really an XML parser, therefore I conclude that mzML isn't
really XML.  But I have previously beaten that horse to death.

Now we have something new not to like: it's impossible to write a parser
that's even remotely future-proof.  Or maybe it's not new, and I just
missed it before.  Either way, this all looks increasingly ill-conceived
to me.  Sorry to be such a downer.

Hey, the horse just twitched:  by placing cvParam information in
attributes of the elements of a conventionally structured XML schema
(à la mzXML) we can make use of the OBO work without adding a lot of
unwanted complexity to software systems that aren't really interested in
it.  An mzML that integrates well with OBO-aware systems is an excellent
idea, but an mzML that demands you BE an OBO-aware system seems less
likely to achieve widespread adoption.

I do understand the desire to maintain an ontology instead of an
ontology and an XML schema, but I'm not sure we can really get away with
it.  By having a schema that offloads most of its work to an external
ontology, we're just pushing the work that having a proper schema saves
onto the folks creating the readers and writers, making their job much
more complicated than it ought to be - you can't autogenerate a parser
or serializer without a fully realized schema.  I think we risk them
deciding that mzXML and mzData aren't really all that broken after all.

Brian