<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html;charset=ISO-8859-1" http-equiv="Content-Type">
<title></title>
</head>
<body bgcolor="#ffffff" text="#000000">
Moving on to an appropriate subject so we can reform the Church of
Controlled Vocabulary... :)<br>
<br>
<br>
Eric Deutsch wrote:
<blockquote
cite="mid:5BEF622F935E3E4186527FA178FA85F8044B0AC8@..."
type="cite">
<style>
<!--
/* Font Definitions */
@font-face
{font-family:Wingdings;
panose-1:5 0 0 0 0 0 0 0 0 0;}
@font-face
{font-family:Tahoma;
panose-1:2 11 6 4 3 5 4 4 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Times New Roman";}
a:link, span.MsoHyperlink
{color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{color:purple;
text-decoration:underline;}
span.EmailStyle17
{mso-style-type:personal;
font-family:Arial;
color:windowtext;}
span.EmailStyle18
{mso-style-type:personal;
font-family:Arial;
color:navy;}
span.m1
{color:blue;}
span.t1
{color:#990000;}
span.EmailStyle21
{mso-style-type:personal-reply;
font-family:Arial;
color:navy;}
@page Section1
{size:8.5in 11.0in;
margin:1.0in 1.25in 1.0in 1.25in;}
div.Section1
{page:Section1;}
/* List Definitions */
@list l0
{mso-list-id:1639410469;
mso-list-type:hybrid;
mso-list-template-ids:1919608944 1959066812 67698691 67698693 67698689 67698691 67698693 67698689 67698691 67698693;}
@list l0:level1
{mso-level-start-at:0;
mso-level-number-format:bullet;
mso-level-text:-;
mso-level-tab-stop:.5in;
mso-level-number-position:left;
text-indent:-.25in;
font-family:Arial;
mso-fareast-font-family:"Times New Roman";}
ol
{margin-bottom:0in;}
ul
{margin-bottom:0in;}
-->
</style>
<div class="Section1">
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;">Hi everyone,
I’ve taken some time to
think carefully about what Brian says and here is my attempt at
focusing the
discussion:<o:p></o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;"><o:p> </o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;">- First:
yes, there are several problems
in the CV's is_a and part_of relationships. We agreed at the CV meeting that we will
tackle
this to try to make it uniform.<o:p></o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;"><o:p> </o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;">- Here are
two rules within the CV that
may hold true and should be documented:<o:p></o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;"> - if a
term’s direct parent
is a “xxxx attribute”, then it must furnish a value within the
cvParam element, else it cannot<o:p></o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;"> - if a
term has children, then it
cannot be specified as a cvParam (except as a category/parent in option
C)<o:p></o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;"> Is this
correct? Counter examples?<o:p></o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;"><o:p> </o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;">- Regarding
the reflectron example, I
think the CV should look like this, even though it does not quite look like this yet:<o:p></o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;"> -
“reflectron on” is_a “reflectron
state” is_a “analyzer attribute”<o:p></o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;"> -
“reflectron off” is_a
“reflectron state” is_a “analyzer attribute”<o:p></o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;"><o:p> </o:p></span></font></p>
</div>
</blockquote>
These points do not address the more significant issue: the CV is
apparently incapable of defining types for categories with uncontrolled
values, and there is no automatic way to distinguish between a category
and a controlled value (i.e. an accession number that represents a
category vs. an accession number that represents a value). I suggest
the convention Angel mentions in his reply to this post, where
categories have a pure PART_OF relationship and controlled values have
an IS_A relationship to their parent category. I still don't know how
to encapsulate the type information for uncontrolled values in the CV,
though. Perhaps each type (real, integer, string, etc.) could be given
a special accession number that indicates the type and also tells
the validator/parser to take the value from the
name/text attribute instead of the accession attribute? But then I'm
not sure how to assign that accession number to the uncontrolled
classes, because each type would have an IS_A relationship to multiple
categories.<br>
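<br>
To make the type idea concrete, here is a minimal Python sketch of what a validator could do if the CV carried a value type for each uncontrolled category. The VALUE_TYPES table, its type labels, and the check_value function are all hypothetical; the CV stores nothing like this today, and the only accession used is one quoted in this thread.<br>
<pre>
```python
# Hypothetical table: accession of an uncontrolled category -> value type.
# The type labels are an assumption; the current CV stores no such information.
VALUE_TYPES = {
    "MS:1000285": "real",  # "total ion current", quoted elsewhere in this thread
}

# One parser per type label; float and int raise ValueError on malformed text.
PARSERS = {
    "real": float,
    "integer": int,
    "string": str,
}


def check_value(accession, text):
    """Return True if text parses as the type declared for accession."""
    type_label = VALUE_TYPES.get(accession)
    if type_label is None:
        # No declared type: this sketch cannot validate the value.
        return False
    try:
        PARSERS[type_label](text)
    except ValueError:
        return False
    return True
```
</pre>
With this table, check_value("MS:1000285", "1.66755e+007") succeeds and check_value("MS:1000285", "high") fails. It still leaves open how the type would be attached to each category inside the CV itself.<br>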
<br>
<br>
<blockquote
cite="mid:5BEF622F935E3E4186527FA178FA85F8044B0AC8@..."
type="cite">
<div class="Section1">
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;">- Thus
cvParams would be used like this:<o:p></o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;"> Option A:
<cvParam
cvLabel="MS" accession="MS:1000105" name="reflectron off"
value="" /><o:p></o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;"> Option C+:
<cvParam name="reflectron
off" cvLabel="MS" accession="MS:1000105" parentAccession="MS:1000021"/><o:p></o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;"><o:p> </o:p></span></font></p>
</div>
</blockquote>
I will regurgitate my preferred version of Option C:<br>
Option E: <cvParam name="reflectron state" valueName="off"
accession="MS:1000021" valueAccession="MS:1000105"/><br>
Same information, but IMO more intuitive and human-readable, and it avoids
the potentially nasty pitfall of defining what a "parent" is (i.e. is
it one level up the CV branch, all the way up, or part of the way up?).<br>
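<br>
The ambiguity of "parent" is easy to see in a small Python sketch. The IS_A dictionary below is a toy stand-in for the CV that follows the hierarchy Eric proposes above ("reflectron off" is_a "reflectron state" is_a "analyzer attribute"); the dictionary and the ancestors helper are illustrative assumptions, not part of any real tool.<br>
<pre>
```python
# Toy is_a table, child term -> parent term, following the hierarchy
# proposed earlier in this thread. It is not read from the real CV file.
IS_A = {
    "reflectron off": "reflectron state",
    "reflectron on": "reflectron state",
    "reflectron state": "analyzer attribute",
}


def ancestors(term):
    """Return every term reachable from term by following is_a links."""
    chain = []
    while term in IS_A:
        term = IS_A[term]
        chain.append(term)
    return chain


# Each entry is a candidate for what parentAccession could mean.
print(ancestors("reflectron off"))
```
</pre>
Even in this three-level toy there are two candidates for the parent of "reflectron off", so a parentAccession attribute would need a rule for which one to pick.<br>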
<br>
<br>
<blockquote
cite="mid:5BEF622F935E3E4186527FA178FA85F8044B0AC8@..."
type="cite">
<div class="Section1">
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;">- Brian
proposed:<o:p></o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;">
<reflectronState
accession="MS:1000021" off/><o:p></o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;"> This does
not seem like well-formed
XML to me. Or is it?? I assume he meant this:<o:p></o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;">
<reflectronState
accession="MS:1000105" name="reflectron off"/><o:p></o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;"><o:p> </o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;">- If so, the
real dilemma is between:<o:p></o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;"> 1)
<cvParam name="reflectron
off" cvLabel="MS" accession="MS:1000105" parentAccession="MS:1000021"/><o:p></o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;"> 2)
<reflectronState
accession="MS:1000105" name="reflectron off"/><o:p></o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;"> Brian,
would you agree that these
are the two sides? They both seem fully complete to me. If I’ve got it
wrong, then the rest would seem premature, but I’ll press on believing
I’ve
got it right. Because by creating an element in the schema
<reflectronState>, this automatically takes the place of {
cvLabel="MS"
parentAccession="MS:1000021" }</span></font></p>
</div>
</blockquote>
Yes, that is the real dilemma. I cast my vote for going either ALL
CV or ALL schema. I don't like the idea of mixing the two. I am a bit
confused though and Brian will need to clarify: he previously suggested
that the entire schema would be hand-rolled and the CV would be
generated FROM the schema. Would that mean that accession numbers
would be assigned in the schema and propagated into the CV? I don't
recall Brian proposing the <reflectronState ...> method while
still filling in the schema from a separately maintained CV - that
would be too much hassle.<br>
<br>
No matter which route we take though, we should have a fully
descriptive XML schema in order to allow standard XML tools to do the
semantic validation. In the case of the CV, that schema will be
auto-generated every time the CV changes. In the case of the
hand-rolled schema, it'll be completely self-contained.<br>
<br>
<br>
<blockquote
cite="mid:5BEF622F935E3E4186527FA178FA85F8044B0AC8@..."
type="cite">
<div class="Section1">
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;"><o:p></o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;">- So for
option 1, we’re essentially
at that right now (we would need to adjust option A to option 1, but
it’s
close)<o:p></o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;"><o:p> </o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;">- For option
2, we would need to find all
the CV terms that we think deserve to be promoted to element status and
add
them to schema. I don’t know how many there are, but there would be
lots.
The schema would increase in size many fold.<o:p></o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;"><o:p> </o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;">- A further
complication is where does
this element go? Does it go in the instrument description section? Or
could the
reflectron be turned on and off for different spectra and thus go in
the scan
element? I have no idea. If we put it in the schema, we’ve got to get
it
right now. If we don’t, then the schema will have to be updated to fix
it.<o:p></o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;"><o:p> </o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;">- The
current state is a flexible (some
might say lazy or dangerous) way. We acknowledge that we don’t have all
the CV terms and we’re not exactly sure where some will be used, so we
leave it open. No example instance document yet has reflectron state
information in it. I’d be delighted if someone could provide one.</span></font></p>
</div>
</blockquote>
No matter which way we go (CV with auto-generated schema or hand-rolled
schema, cvParams or explicit elements), changing an element's valid
location from one part of the document to another will break backward
compatibility with the semantic validation, as well as breaking all but
the smartest parsers. We should definitely try to avoid moving terms
around once we've released the spec!<br>
<br>
<br>
<blockquote
cite="mid:5BEF622F935E3E4186527FA178FA85F8044B0AC8@..."
type="cite">
<div class="Section1">
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;"><o:p></o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;">- So what we
can do today is provide a
term “reflectron off” that almost no one really cares much about
and let someone out there who does care write some mzML with this
annotation in
it. When this document is checked against the semantic validator, the
validator
will complain that you’ve used a child term of “reflectron state”
in a place where it’s not allowed. But the writer insists that it
should
be allowed there. The PSI-MS WG is persuaded it should be. So we update
the
semantic validator and the CV perhaps and these new documents are
written out
with reflectron state information and validate. Most software doesn’t
care a hoot about the reflectron state and that cvParam can be safely
ignored
or dumbly displayed to the user in case the user cares. All the above
can
happen without a rev of the schema.<o:p></o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;"><o:p> </o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;">- But that’s
the same thing as
updating the schema except in name, you say. Perhaps.<o:p></o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;"><o:p> </o:p></span></font></p>
</div>
</blockquote>
I also say it's the same as updating the schema, because the schema
DOES have to be updated when the CV is updated in order to reflect the
new changes. Right now we have a pretty useless schema because it is
inadequate for semantic validation or for writing a parser.<br>
<br>
<br>
<blockquote
cite="mid:5BEF622F935E3E4186527FA178FA85F8044B0AC8@..."
type="cite">
<div class="Section1">
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;">- So, I hope
I have helped this discussion
rather than confused it. Clearly the current schema has a big element
of
flexibility/power/danger in it. Some would believe that this will allow
us to
improve the format in minor ways without schema revision and provide a
way for
producers to express their data with annotations that make sense to
them. The
only thing standing between flexibility and utter mayhem is the
semantic
validator. Perhaps in some sense, this is half XML schema and half
pseudo RDF.
Can we pull it off or are we lunatics for trying it?<o:p></o:p></span></font></p>
</div>
</blockquote>
We need to re-evaluate the idea that the schema should be perpetually
unchanging. To me, that is an illogical and contradictory requirement
when we also have the requirement to do semantic validation with an
ever-changing CV. Why should we be afraid of schema revisions? We
should, more specifically, be afraid of removing existing terms,
shifting them from one part of the spec to another, and adding new
features (like new compression types for the peak lists, new precision
types, etc.). And I hope everyone can see that these fears should
exist for both a CV-based schema and a hand-rolled schema.<br>
<br>
<br>
<blockquote
cite="mid:5BEF622F935E3E4186527FA178FA85F8044B0AC8@..."
type="cite">
<div class="Section1">
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;">- I am
clearly biased here, but I try to
keep an open mind.<o:p></o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;"><o:p> </o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;">- To my
mind, the most important
unconsidered problem that Brian brings up is the data type problem.
Consider
the example:<o:p></o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;">
<cvParam cvLabel="MS"
accession="MS:1000285" name="total ion current"
value="1.66755e+007" parentAccession="MS:1000499"/><o:p></o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;"> Brian’s
proposed alternative
is (I hope I’m right):<o:p></o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;">
<spectrumAttribute accession="MS:1000285"
name="total ion current" value="1.66755e+007"/><o:p></o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;"> In
principle, this second way would
allow me to specify a data type and let XML validators enforce it.
However, this
may not quite work either, because what if I want:<o:p></o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;">
<spectrumAttribute accession="MS:1009999"
name="spectrum subjective quality" value="10"/><o:p></o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;">To be
allowed? All spectrumAttributes
would have to have the same data type for that to work. The example is
pretty
contrived. Unless every single attribute got its own element like:<o:p></o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;">
<totalIonCurrent value="1.66755e+007"/><o:p></o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;"><o:p> </o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;">- The latter
here is fully specified and
concrete. But if we get anything wrong or want to add anything, then we
have to
release a new version of the schema. One possible option is to fully
specify in
schema everything we can think of now, and then for new or later things
use
cvParam. If we do that, then we’re still needing to apply semantic
validation so we’ve only half-solved the problem. Finally, a
dangerous door may be opening. If we want to expand this duality, we
have a possible
“more than one way to do it” problem. Some might choose to use the
cvParam, and some the schema element. The only thing that could prevent
that is
the semantic validator again.</span></font></p>
</div>
</blockquote>
No duality should be possible. A category should either be done with
an element or with a cvParam, and I prefer that all categories should
be done with one or the other instead of a mix of the two. But
certainly no single category should have both an element and a cvParam
method for specifying its value.<br>
<br>
<br>
<blockquote
cite="mid:5BEF622F935E3E4186527FA178FA85F8044B0AC8@..."
type="cite">
<div class="Section1">
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;"><o:p></o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;"><o:p> </o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;">- I wonder
whether we can add a nice
method of datatype validation to option 1 above? Any ideas?<o:p></o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;"><o:p> </o:p></span></font></p>
</div>
</blockquote>
First we have to get data type specification into the CV that is
complete and comprehensible to machines (so we can auto-generate a
schema from the CV). Let's figure that out first. :) And if we CAN'T
do that, we are pretty much forced to go with a hand-rolled schema
because at that point I see very little reason to use the OBO CV at all.<br>
<br>
<br>
<blockquote
cite="mid:5BEF622F935E3E4186527FA178FA85F8044B0AC8@..."
type="cite">
<div class="Section1">
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;">I had hoped
to focus the discussion, but
rereading it, all I did was shake the already-opened can of worms.<o:p></o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;"><o:p> </o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;">Let the
commentary ensue.<o:p></o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;"><o:p> </o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;">Regards,<o:p></o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;">Eric<o:p></o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;"><o:p> </o:p></span></font></p>
<div
style="border-style: none none none solid; border-color: -moz-use-text-color -moz-use-text-color -moz-use-text-color blue; border-width: medium medium medium 1.5pt; padding: 0in 0in 0in 4pt;">
<div>
<div class="MsoNormal" style="text-align: center;" align="center"><font
face="Times New Roman" size="3"><span style="font-size: 12pt;">
<hr tabindex="-1" align="center" size="2" width="100%"></span></font></div>
<p class="MsoNormal"><b><font face="Tahoma" size="2"><span
style="font-size: 10pt; font-family: Tahoma; font-weight: bold;">From:</span></font></b><font
face="Tahoma" size="2"><span
style="font-size: 10pt; font-family: Tahoma;">
psidev-ms-dev-bounces@...
[mailto:psidev-ms-dev-bounces@...] <b><span
style="font-weight: bold;">On Behalf Of </span></b>Brian Pratt<br>
<b><span style="font-weight: bold;">Sent:</span></b> Monday, October
08, 2007
11:38 AM<br>
<b><span style="font-weight: bold;">To:</span></b> 'Mass spectrometry
standard
development'<br>
<b><span style="font-weight: bold;">Subject:</span></b>
[Psidev-ms-dev] MANIFESTO
TIME! (was RE: more is_a vs. part_of errors?)</span></font><o:p></o:p></p>
</div>
<p class="MsoNormal"><font face="Times New Roman" size="3"><span
style="font-size: 12pt;"><o:p> </o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;">Eh, it’s
even more broken than I
thought. I’ve amended my amendments inline below, new changes in
double parentheses. <o:p></o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;"><o:p> </o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;">After a day
or so of messing with this, it is
now:<o:p></o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;"><o:p> </o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;">MANIFESTO
TIME!<o:p></o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;"><o:p> </o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;">RESOLVED:<o:p></o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;">The mzML
specification process should be
schema-centric, and the CV should be generated from the schema (should
be a
fairly simple matter of XSLT, since XSD is itself XML). <o:p></o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;"><o:p> </o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;">REASON 1:
THE CV-CENTRIC APPROACH IS ERROR
PRONE.<o:p></o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;">The kinds of
inheritance errors shown
below are, if not actually impossible, much harder to make in the
context of a
W3C schema when using readily available software tools to create and
maintain
the schema.<o:p></o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;"><o:p> </o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;">REASON 2:
OBO/CV IS AN INSUFFICIENT TOOL
FOR THE JOB OF PRODUCING A READILY AND THOROUGHLY VALIDATABLE DATA
FORMAT.<o:p></o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;">CV
apparently provides no means for
specifying range or formatting of instance values. An “isolation
width” (</span></font><font face="Courier New" size="2"><span
style="font-size: 10pt; font-family: 'Courier New';">MS:1000023) </span></font><font
color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;">could
happily have a value of “-2”, “2”,
“two”, or “extra sprinkles, please”. You could
(and should) certainly put some text in the description along the lines
of
“this is a non-negative floating point value” but that’s no
help to a validating parser. XSD on the other hand has standardized
syntax for enforcing precisely these kinds of restrictions, meaning
that
validating parsers and code generators (for both read and write) don’t
need any special-purpose logic added. <o:p></o:p></span></font></p>
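<p class="MsoNormal">For comparison, this is the kind of special-purpose check a parser has to carry when the restriction lives only in the description text. It is a minimal Python sketch that assumes "isolation width" is meant to be a non-negative floating point value, as the example description above suggests; the function name is made up for illustration.</p>
<pre>
```python
import math


def is_non_negative_float(text):
    """Return True if text is a finite floating-point number that is >= 0."""
    try:
        value = float(text)
    except ValueError:
        return False
    return math.isfinite(value) and value >= 0.0


# The four candidate values from the paragraph above.
for candidate in ["-2", "2", "two", "extra sprinkles, please"]:
    print(repr(candidate), is_non_negative_float(candidate))
```
</pre>
<p class="MsoNormal">Of the four candidates only "2" passes. Every reader and writer would need its own copy of a check like this, whereas a restriction declared in the XSD would be enforced by any validating parser.</p>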
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;"><o:p> </o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;">There are a
handful of places where value
range restrictions have been attempted in the MS CV, but these are
awkward
because of the tools. The reflectron_state, for example, has two
children
“on” and “off”, but this only confuses things, since
these are not *<b><span style="font-weight: bold;">values</span></b>*
of
reflectron state but rather *<b><span style="font-weight: bold;">are</span></b>*
reflectron states, a distinction which may be meaningless in English
but
significant when attempting to create a data structure. Picture how
this looks in an instance doc:<o:p></o:p></span></font></p>
<p class="MsoNormal" style="text-indent: 0.5in;"><font color="black"
face="Courier New" size="2"><span
style="font-size: 10pt; font-family: 'Courier New'; color: black;"><cvParam
cvLabel="MS" accession="MS:1000105" name="off" value="" /><o:p></o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;">I can’t
think of anything nice to
say about that. Better it should read:</span></font><font color="navy"
face="Arial"><span style="font-family: Arial; color: navy;"><o:p></o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;">
</span></font><font color="black" face="Courier New" size="2"><span
style="font-size: 10pt; font-family: 'Courier New'; color: black;"><reflectronState
accession="MS:1000021" off/><o:p></o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;"><o:p> </o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;"><o:p> </o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;">CONCLUSION:
THE CV WORK TO DATE IS
IMPORTANT AND USEFUL, BUT SHOULD BE RECAST AS SCHEMA WORK<o:p></o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;">The CV
should not attempt to be a
replacement for the schema - it just hasn’t got the requisite
mechanisms
to do the job. The information CV can convey is only a subset of the
information that is needed to fully specify a data format. The
information in the CV as it stands should be folded into the mzML
schema, and
maintained therein moving forward. An actual OBO/CV file can be
generated
as needed. <o:p></o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;"><o:p> </o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;">- Brian<o:p></o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;"><o:p> </o:p></span></font></p>
<p class="MsoNormal"><font color="navy" face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial; color: navy;"><o:p> </o:p></span></font></p>
</div>
</div>
</blockquote>
<br>
</body>
</html>