From: Brian P. <bri...@in...> - 2008-02-12 22:46:16
|
I think that's not quite right. arrayLength needs to remain an attribute of BinaryDataArray, because not every BinaryDataArray element in a spectrum will necessarily contain the same number of entries as an mz or intensity array: not all BinaryDataArray elements are guaranteed (as I understand mzML, which is but dimly) to be mz or intensity. You'll need to write it again as an attribute of spectrum, something like mzintPairsCount if you don't like PeaksCount.

-----Original Message-----
From: psi...@li... [mailto:psi...@li...] On Behalf Of Eric Deutsch
Sent: Tuesday, February 12, 2008 1:28 PM
To: Mass spectrometry standard development
Cc: Eric Deutsch
Subject: Re: [Psidev-ms-dev] binaryArrayData lengths

So there seems to be broad consensus (4 for 4;) that moving the arrayLength up a little higher is a good idea. So instead of:

<spectrum id="S19" scanNumber="19" msLevel="1">
  <spectrumDescription>
    ...
  </spectrumDescription>
  <binaryDataArray arrayLength="1313" encodedLength="5433" dataProcessingRef="Xcalibur Processing">
    ...
    <binary>AAAAwDsGeUAAAAD...</binary>
  </binaryDataArray>
  <binaryDataArray arrayLength="1313" encodedLength="4892">
    ...
    <binary>AAAAAIBJxk...</binary>
  </binaryDataArray>
</spectrum>

We will have: !!!!!!!!!!!!!!!!!!

<spectrum id="S19" scanNumber="19" msLevel="1" arrayLength="1313">
  <spectrumDescription>
    ...
  </spectrumDescription>
  <binaryDataArray encodedLength="5433" dataProcessingRef="Xcalibur Processing">
    ...
    <binary>AAAAwDsGeUAAAAD...</binary>
  </binaryDataArray>
  <binaryDataArray encodedLength="4892">
    ...
    <binary>AAAAAIBJxk...</binary>
  </binaryDataArray>
</spectrum>

Agreed?

> -----Original Message-----
> From: psi...@li... [mailto:psidev-ms-dev-bo...@li...] On Behalf Of Matthew Chambers
> Sent: Wednesday, February 06, 2008 10:49 AM
> To: Mass spectrometry standard development
> Subject: Re: [Psidev-ms-dev] binaryArrayData lengths
>
> I agree that the primary data arrays should probably be treated as special in the schema so it's clear that they are paired values and thus peak count could move into the spectrum element or spectrumDescription. There should still be options to have additional arrays that aren't the same as the main arrays (for example, an additional set of arrays, one for a subset of the m/zs and the other for peak charge information).
>
> -Matt
>
> Kessner, Darren E. wrote:
> > Any other comments regarding <binaryArrayData> lengths?
> >
> >> (from Rune)
> >> If they have to be equal size, then that size ought to be specified in the spectrumDescription.
> >
> > I agree -- I would like to encode the length in <spectrum> somewhere (either attribute or cvParam) so that:
> > 1) it's clear that the arrays are of equal size
> > 2) Readers don't have to peek into the attributes of the first <binaryArrayData> to get the info
> >
> > I need this right now for the MSData RAMP adapter code, so I'll encode it as a <userParam> until a decision has been made on the specification.
> >
> > Darren
>
> -------------------------------------------------------------------------
> This SF.net email is sponsored by: Microsoft
> Defy all challenges. Microsoft(R) Visual Studio 2008.
> http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
> _______________________________________________
> Psidev-ms-dev mailing list
> Psi...@li...
> https://lists.sourceforge.net/lists/listinfo/psidev-ms-dev
|
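The point raised in this message (a spectrum-level count with room for arrays that have their own length) can be illustrated with a rough Python sketch. It is not taken from any mzML library: the function names are hypothetical, and it assumes every `<binary>` payload is uncompressed, little-endian, 64-bit floating point, which the thread does not state.

```python
import base64
import struct


def encode_array(values):
    """Pack floats as little-endian 64-bit doubles and base64-encode them."""
    raw = struct.pack("<%dd" % len(values), *values)
    return base64.b64encode(raw).decode("ascii")


def decode_array(encoded):
    """Inverse of encode_array: return the doubles stored in one payload."""
    raw = base64.b64decode(encoded)
    if len(raw) % 8 != 0:
        raise ValueError("payload is not a whole number of 64-bit values")
    return list(struct.unpack("<%dd" % (len(raw) // 8), raw))


def check_spectrum_lengths(default_length, arrays):
    """Return the names of arrays whose element count is not the expected one.

    `arrays` maps a (hypothetical) array name to (payload, own_length).
    own_length is None when the array relies on the spectrum-level count,
    and an integer when the array declares its own length.
    """
    mismatched = []
    for name, (payload, own_length) in arrays.items():
        expected = default_length if own_length is None else own_length
        if len(decode_array(payload)) != expected:
            mismatched.append(name)
    return mismatched
```

With this shape, the m/z and intensity arrays can share the count stored on the spectrum, while an extra array (for example charge information for a subset of peaks) passes the check only if it declares its own length.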
From: Joshua T. <jt...@sy...> - 2008-02-12 22:15:40
|
Hi Eric,

You missed my strong vote for "B)".

-Josh

Eric Deutsch wrote:
> Hi everyone, I'm trying to see if we can get to some consensus on some of these ongoing threads. Regarding the "unknown instrument" problem, I think there has been some confusion, so let me see if I can clarify and ask for a final round of opinions. I agree with Fredrik's comments below that his examples below are *not* what is intended. Here is what I believe Lennart intended:
>
> A)
> <cvParam cvLabel="MS" accession="MS:1000031" name="instrument model" value=""/>
>
> Or the other alternative is to create a term for unknown:
>
> B)
> <cvParam cvLabel="MS" accession="MS:1099931" name="unknown instrument model" value=""/>
> (where the number is obviously made up by me right now, but would be in the CV)
>
> So those are the choices. Putting something in the value attribute is not an option as Fredrik concludes below.
>
> Benefits of A)
> - No need to litter the CV with "xxx unknown" terms
> - Happenstance very easy for the existing validator software to accommodate
> - Somewhat counterintuitive and thus dissuades laziness
> Drawbacks of A)
> - Somewhat counterintuitive and awkward
>
> Benefits of B)
> - Very intuitive and straightforward: the concept of what instrument generated these spectra is captured by the concept "sorry, I just don't know which instrument it was"
> Drawbacks of B)
> - Opens the door to perhaps needing to sprinkle other unknowns in the CV
> - Is a little more inviting to users to be lazy and claim they don't know, when with a little more effort they could find out and report properly (because "unknown" is not an *obvious* option)
> - Would require more development in the validator to properly handle a special term like this.
>
> Based on the feedback I saw so far, Lennart, Luisa and Angel like A. Matt seemed more in favor of B. No clear reads on others.
>
> I myself prefer B. To me it feels like A is a convenient but counterintuitive trick to working around the problem. B feels like the right solution even if it facilitates laziness. I don't think that will be a big problem. I'm sure we can come up with some syntax for the validator to permit or disallow "ambiguity terms" as desired.
>
> So, what say ye?
>
>> From: psi...@li... [mailto:psidev-ms-dev-
>>
>> Hi Lennart, Josh, Matt and others,
>>
>> If the top level term is allowed it will be possible to define not only instrument value='unknown', but also instruments that are not in the CV by putting something in the value field:
>> <cvParam cvLabel="MS" accession="MS:1000031" name="instrument model" value="The new mass spec not in CV"/>
>> <cvParam cvLabel="MS" accession="MS:1000031" name="instrument model" value="unknown"/>
>> Instead of the intended:
>> <cvParam cvLabel="MS" accession="MS:1000189" name="q-tof ultima" value=""/>
>> I'm not so sure that this is wanted. Especially since unknown could be written as 'not known', 'not specified' etcetera. It makes sense to have a CV term for 'unknown', but it would be quite a few 'unknown' terms to add to the CV to get one for each required category in the mzML schema... At some places it would be enough with just 'unknown' (source, detector etc), but at other places it must be specified what is unknown!
>>
>> Anyway, I am still for usage of top level elements :-) , see line 16 at:
>> http://trac.thep.lu.se/trac/fp6-prodac/browser/trunk/mzML/FF_070504_MSMS_5B.mzML
>>
>> cheers
>>
>> Fredrik
>>
>> Joshua Tasman skrev:
>>> I'm with Matt on this one, and like his solution. There are unfortunately lots of real use cases (combining dta, mgfs) where the information will really be unknown, and we should accurately represent the lack of information. If it's not too much effort to add a little more code to the validator, I would much prefer the accurate addition of an "unknown" term. There has been so much effort getting the CV and document to line up with reality, it looks very strange to me to force this ontological 'hack' by allowing the category to appear as a value, as Matt has said.
>>>
>>> Josh
>>>
>>> Matthew Chambers wrote:
>>>> Lennart Martens wrote:
>>>>> Hi Matt, and Colleagues,
>>>>>
>>>>>> I don't really prefer one to the other very much, but I don't see how the parent term would be easier to validate ("all but X children of a term" doesn't make sense to me, do you mean "all children of a term except X"?)
>>>>>
>>>>> You are right; I provided bad shorthand for: 'all children of a term, except X (and Y, and Z, ... -- potentially).
>>>>>
>>>>> The reason why it is easier to validate is due to the way the validator mapping file is designed, e.g. (example verbatim from current 0.99.1 mapping file):
>>>>>
>>>>> <CvTerm termAccession="MS:1000031" useTerm="false" termName="instrument model" isRepeatable="false" scope="/mzML/instrumentList/instrument" allowChildren="true" cvIdentifier="MS"></CvTerm>
>>>>>
>>>>> this means that although all children of term 'MS:1000031 -- instrument model' are allowed (allowChildren="true"), the term itself is not allowed (useTerm="false"). By flipping this latter boolean, we can allow the parent term, thus separating between MIAPE requirements (current configuration) and the 'usable mzML requirements' (flipped boolean as explained above) -- for the instrument model at least.
>>>>
>>>> OK, so it's an implementation thing. That's fine.
>>>>
>>>>>> What about data converted from DTAs or MGFs where the user doesn't even remember (or never knew) what kind of instrument it came from?
>>>>>
>>>>> When the instrument is really unknown (which is unfortunate and constitutes dramatic metadata loss whichever way you look at it), the proposed scenario (usage of toplevel term) provides solace. For all other scenarios (where an incentive to adapt convertor software or report the development of a new instrument is concerned), the relative obscurity of the 'fix' might contribute to 'going the extra mile' (upgrading the convertor, mailing in the new instrument name).
>>>>
>>>> While the toplevel term does provide some solace, it is obscure enough that a casual user might look at it and think that something was wrong because it does not intuitively make sense for the category to appear as a value. What about this alternative: provide an "unknown instrument" term with a unique accession #, but make the term name something like "unknown (instrument type not specified or not in CV)". That would be intuitive but still eye-catching (and it would be the eye-catching part that implementors would want to minimize, because it makes them look bad). ;)
>>>>
>>>> -Matt
|
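The useTerm / allowChildren behaviour described in the quoted mapping-file entry can be sketched in a few lines of Python. This is only an illustration of the two booleans, not the actual validator code: the class and function names are hypothetical, and the parent map is a toy that hangs the two models directly under MS:1000031, which flattens the real CV hierarchy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CvRule:
    term_accession: str
    use_term: bool        # may the term itself appear in the file?
    allow_children: bool  # may any descendant of the term appear?


# Toy child -> parent map (simplified; the real CV has intermediate terms).
PARENT_OF = {
    "MS:1000189": "MS:1000031",  # q-tof ultima
    "MS:1000554": "MS:1000031",  # LCQ Deca
}


def is_descendant(accession, ancestor, parent_of):
    """Walk up the parent chain and report whether `ancestor` is reached."""
    current = parent_of.get(accession)
    while current is not None:
        if current == ancestor:
            return True
        current = parent_of.get(current)
    return False


def is_allowed(accession, rule, parent_of):
    """Apply one mapping rule to one cvParam accession."""
    if accession == rule.term_accession:
        return rule.use_term
    return rule.allow_children and is_descendant(
        accession, rule.term_accession, parent_of
    )


# Current (MIAPE-style) configuration versus the "flipped boolean" of option A.
STRICT = CvRule("MS:1000031", use_term=False, allow_children=True)
RELAXED = CvRule("MS:1000031", use_term=True, allow_children=True)
```

Under this reading, option A amounts to validating against `RELAXED`, while option B keeps `STRICT` and would add a new "unknown" term to the CV as one more child of the instrument model term.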
From: Eric D. <ede...@sy...> - 2008-02-12 22:14:40
|
I agree with Darren. For profile (aka continuous) mode data, each element in the array is not a peak, so I would not want to label it such.

We use the term encodedLength to refer to the length of the string after base64 encoding. It seems like a natural thing to call this concept arrayLength. Something like arrayElementCount could work, too. But I'm currently still in favor of arrayLength unless someone has a more elegant name.

Thanks,
Eric

> -----Original Message-----
> From: psi...@li... [mailto:psidev-ms-dev-bo...@li...] On Behalf Of Kessner, Darren E.
> Sent: Tuesday, February 12, 2008 1:49 PM
> To: Mass spectrometry standard development
> Subject: Re: [Psidev-ms-dev] binaryArrayData lengths
>
> I have an objection to the use of "peak" or "peakCount", since this has the alternate meaning of "local maximum" (as opposed to "data point").
>
> Darren
>
> -----Original Message-----
> From: psi...@li... [mailto:psi...@li...] On Behalf Of Mike Coleman
> Sent: Tuesday, February 12, 2008 1:42 PM
> To: Mass spectrometry standard development
> Subject: Re: [Psidev-ms-dev] binaryArrayData lengths
>
> I'm in favor of this change. Would the interpretation be a little more obvious if this were called something like "peakCount" instead of "arrayLength"?
>
> Mike
>
> On Feb 12, 2008 3:27 PM, Eric Deutsch <ede...@sy...> wrote:
> > <spectrum id="S19" scanNumber="19" msLevel="1" arrayLength="1313">
>
> IMPORTANT WARNING: This message is intended for the use of the person or entity to which it is addressed and may contain information that is privileged and confidential, the disclosure of which is governed by applicable law. If the reader of this message is not the intended recipient, or the employee or agent responsible for delivering it to the intended recipient, you are hereby notified that any dissemination, distribution or copying of this information is STRICTLY PROHIBITED.
>
> If you have received this message in error, please notify us immediately by calling (310) 423-6428 and destroy the related message. Thank You for your cooperation.
|
From: Eric D. <ede...@sy...> - 2008-02-12 22:07:59
|
> From: psi...@li... [mailto:psidev-ms-dev-
>
> Hi all,
>
> Just wanted a clarification on the encoding of the instrument manufacturer.
>
> In the example tiny*.mzML, we have:
>
> <instrument>
>   <cvParam cvLabel="MS" accession="MS:1000554" name="LCQ Deca" value=""/>
>   ...
> </instrument>
>
> In the CV we have the following branch:
> - "instrument description"
>   - "model by vendor"
>     - "Thermo Fisher Scientific"
>       - "Thermo Finnigan"
>         - "LCQ Deca"

This is the way it was many months ago. But with the release of 0.99.1 in November, it became:

- "instrument"
  -(has) "instrument model"
    -(is) "Thermo Fisher Scientific instrument model"
      -(is) "Thermo Finnigan instrument model"
        -(is) "LCQ Deca"

Please ensure that you're using the latest CV from the dev web page. It hasn't changed since November, but quite a few things changed with that release. We'll be adding some minor things soon, too (including some of your requests).

> Is this the intended procedure for determining the instrument manufacturer?
> 1) look for a cvParam child of "model by vendor" <find "LCQ Deca">
> 2) walk up the branch until you get to the immediate child of "model by vendor" <walk back to "Thermo Fisher Scientific">

I guess I would imagine the following logic:

getVendor("MS:1000554") // "LCQ Deca"
would:
- getTermParent("MS:1000554")
- regexp s/ instrument model//

> Or do we want to encode the manufacturer as a separate CV term in the <instrument> element?

I think we decided that separately encoding models and vendors was unnecessarily intricate. We do not have a concept for manufacturer/vendor. However, the models are organized in a predictable way in the CV to allow the above regexp logic.

> One other thing, the Thermo tree looks like:
> - "Thermo Fisher Scientific"
>   - "Finnigan MAT"
>     - some instruments
>   - "Thermo Electron"
>     - one instrument
>   - "Thermo Finnigan"
>     - some instruments
>   - "Thermo Scientific"
>     - more instruments
>
> Perhaps this tree should be flattened like the other vendors CV trees?

Perhaps. I have no special attachment to the current layout if it seems overly burdensome. The general idea was that since Thermo* has evolved considerably, it made sense to categorize the instruments by the most recent entity name that manufactured them and then lump all those under the most recent umbrella company name, which may likely change over time given past history. I think the reasoning was that "Thermo Scientific never really made an LCQ; that was a different older company." Maybe we're splitting hairs here or maybe this seems like a reasonable way to build in some scheme to gracefully handle the case when, say, two totally different vendors merge a couple years from now.

I'm happy with the way it is now, but don't feel super strongly about it.

Eric

> Darren
>
> Darren Kessner
> Scientific Programmer
> Dar...@cs...
> 310-423-9538
>
> Spielberg Family Center for Applied Proteomics
> Cedars-Sinai Medical Center
> http://www.sfcap.cshs.org/
|
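The getVendor logic outlined in this message (take the parent term, strip " instrument model") can be written out as a small Python sketch. The term table is a hand-made excerpt for illustration: only MS:1000554 and MS:1000031 are accessions quoted in this thread, and the "EX:" identifiers are placeholders for terms whose real accessions are not given here. The second function is an assumption of mine, covering the "walk up to the umbrella company" variant raised in the question.

```python
import re

# accession -> (term name, parent accession); "EX:" ids are placeholders.
TERMS = {
    "MS:1000031": ("instrument model", None),
    "EX:0001": ("Thermo Fisher Scientific instrument model", "MS:1000031"),
    "EX:0002": ("Thermo Finnigan instrument model", "EX:0001"),
    "MS:1000554": ("LCQ Deca", "EX:0002"),
}

SUFFIX = re.compile(r" instrument model$")


def get_term_parent(accession):
    """Return the parent accession of a term, or None at the top."""
    return TERMS[accession][1]


def get_vendor(accession):
    """Vendor as named by the immediate parent term of an instrument model."""
    parent = get_term_parent(accession)
    if parent is None:
        raise ValueError("term has no parent: " + accession)
    return SUFFIX.sub("", TERMS[parent][0])


def get_umbrella_vendor(accession, root="MS:1000031"):
    """Vendor as named by the ancestor sitting directly below `root`."""
    current = accession
    parent = get_term_parent(current)
    while parent is not None and parent != root:
        current, parent = parent, get_term_parent(parent)
    if parent is None or current == accession:
        raise ValueError("not an instrument model below " + root)
    return SUFFIX.sub("", TERMS[current][0])
```

For "LCQ Deca" the first function yields the entity that built the instrument and the second the current umbrella company; in a flattened vendor tree the two would coincide.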
From: Kessner, D. E. <Dar...@cs...> - 2008-02-12 21:49:20
|
I have an objection to the use of "peak" or "peakCount", since this has the alternate meaning of "local maximum" (as opposed to "data point").

Darren

-----Original Message-----
From: psi...@li... [mailto:psi...@li...] On Behalf Of Mike Coleman
Sent: Tuesday, February 12, 2008 1:42 PM
To: Mass spectrometry standard development
Subject: Re: [Psidev-ms-dev] binaryArrayData lengths

I'm in favor of this change. Would the interpretation be a little more obvious if this were called something like "peakCount" instead of "arrayLength"?

Mike

On Feb 12, 2008 3:27 PM, Eric Deutsch <ede...@sy...> wrote:
> <spectrum id="S19" scanNumber="19" msLevel="1" arrayLength="1313">
|
From: Mike C. <tu...@gm...> - 2008-02-12 21:42:05
|
I'm in favor of this change. Would the interpretation be a little more obvious if this were called something like "peakCount" instead of "arrayLength"?

Mike

On Feb 12, 2008 3:27 PM, Eric Deutsch <ede...@sy...> wrote:
> <spectrum id="S19" scanNumber="19" msLevel="1" arrayLength="1313">
|
From: Matthew C. <mat...@va...> - 2008-02-12 21:38:12
|
I can live with that (or "length" or "peakCount").

-Matt

Eric Deutsch wrote:
> So there seems to be broad consensus (4 for 4;) that moving the arrayLength up a little higher is a good idea. So instead of:
>
> <spectrum id="S19" scanNumber="19" msLevel="1">
>   <spectrumDescription>
>     ...
>   </spectrumDescription>
>   <binaryDataArray arrayLength="1313" encodedLength="5433" dataProcessingRef="Xcalibur Processing">
>     ...
>     <binary>AAAAwDsGeUAAAAD...</binary>
>   </binaryDataArray>
>   <binaryDataArray arrayLength="1313" encodedLength="4892">
>     ...
>     <binary>AAAAAIBJxk...</binary>
>   </binaryDataArray>
> </spectrum>
>
> We will have: !!!!!!!!!!!!!!!!!!
>
> <spectrum id="S19" scanNumber="19" msLevel="1" arrayLength="1313">
>   <spectrumDescription>
>     ...
>   </spectrumDescription>
>   <binaryDataArray encodedLength="5433" dataProcessingRef="Xcalibur Processing">
>     ...
>     <binary>AAAAwDsGeUAAAAD...</binary>
>   </binaryDataArray>
>   <binaryDataArray encodedLength="4892">
>     ...
>     <binary>AAAAAIBJxk...</binary>
>   </binaryDataArray>
> </spectrum>
>
> Agreed?
>
>> -----Original Message-----
>> From: psi...@li... [mailto:psidev-ms-dev-bo...@li...] On Behalf Of Matthew Chambers
>> Sent: Wednesday, February 06, 2008 10:49 AM
>> To: Mass spectrometry standard development
>> Subject: Re: [Psidev-ms-dev] binaryArrayData lengths
>>
>> I agree that the primary data arrays should probably be treated as special in the schema so it's clear that they are paired values and thus peak count could move into the spectrum element or spectrumDescription. There should still be options to have additional arrays that aren't the same as the main arrays (for example, an additional set of arrays, one for a subset of the m/zs and the other for peak charge information).
>>
>> -Matt
>>
>> Kessner, Darren E. wrote:
>>> Any other comments regarding <binaryArrayData> lengths?
>>>
>>>> (from Rune)
>>>> If they have to be equal size, then that size ought to be specified in the spectrumDescription.
>>>
>>> I agree -- I would like to encode the length in <spectrum> somewhere (either attribute or cvParam) so that:
>>> 1) it's clear that the arrays are of equal size
>>> 2) Readers don't have to peek into the attributes of the first <binaryArrayData> to get the info
>>>
>>> I need this right now for the MSData RAMP adapter code, so I'll encode it as a <userParam> until a decision has been made on the specification.
>>>
>>> Darren
|
From: Kessner, D. E. <Dar...@cs...> - 2008-02-12 21:37:42
|
Yes, it's closed -- synonyms are fine.

Darren

-----Original Message-----
From: psi...@li... [mailto:psi...@li...] On Behalf Of Eric Deutsch
Sent: Tuesday, February 12, 2008 1:36 PM
To: Mass spectrometry standard development
Cc: Eric Deutsch
Subject: Re: [Psidev-ms-dev] CV param readability

Hi Darren, is this issue closed or still open? There is the suggestion to use the shortest exact synonym for your purpose. Is that adequate?

Regarding:

> >> I would like to propose using standard acronyms in the CV term names
> >> when it is clear what they mean.

I believe it is the policy of the PSI CV designers (not just the MS CV) that acronyms should NOT be used as term names, but rather should be synonyms. Can anyone confirm that?

Thanks,
Eric

> From: psi...@li... [mailto:psidev-ms-dev-
>
> The OBO has "exact_synonyms" like this:
>
> [Term]
> id: MS:1000079
> name: fourier transform ion cyclotron resonance mass spectrometer
> def: "A mass spectrometer based on the principle of ion cyclotron resonance in which an ion in a magnetic field moves in a circular orbit at a frequency characteristic of its m/z value. Ions are coherently excited to a larger radius orbit using a pulse of radio frequency energy and their image charge is detected on receiver plates as a time domain signal. Fourier transformation of the time domain signal results in a frequency domain signal which is converted to a mass spectrum based in the inverse relationship between frequency and m/z." [PSI:MS]
> exact_synonym: "FT_ICR" []
> is_a: MS:1000443 ! mass analyzer type
>
> Darren, I suggest you parse both the term name and its synonyms into a set for that term, and choose from it the shortest string to put in the enum. :)
>
> -Matt
>
> Joshua Tasman wrote:
> > Hi Darren,
> >
> > Speaking only for myself, I think that the "name" attribute should be optional in the file and not interfere with validation. I've never understood why the text string needs to exactly match the CV for validation; someone on the list had brought up other languages, etc. But I think it came up on the list before, and requiring strict mapping between accession numbers and text string seemed to be important for the format.
> >
> > At the least, acronyms would require additional 'mapping files' or something similar to be added to the specification, and the validator to be updated. Maybe someone more familiar with these tasks could step in. Maybe the CV could be expanded so that every entry had an additional "acronym" field. This brings up other questions, like would uniqueness be enforced, etc?
> >
> > Josh
> >
> > Kessner, Darren E. wrote:
> >> I would like to propose using standard acronyms in the CV term names when it is clear what they mean.
> >>
> >> We currently have:
> >>
> >> <cvParam cvLabel="MS" accession="MS:1000075" name="matrix assisted laser desorption ionization" value=""/>
> >>
> >> <cvParam cvLabel="MS" accession="MS:1000079" name="fourier transform ion cyclotron resonance mass spectrometer" value=""/>
> >>
> >> I think this is more readable:
> >>
> >> <cvParam cvLabel="MS" accession="MS:1000075" name="MALDI" value=""/>
> >>
> >> <cvParam cvLabel="MS" accession="MS:1000079" name="FT-ICR MS" value=""/>
> >>
> >> The full name can still be available in the term description field.
> >>
> >> I have an ulterior motive for this -- in the code generation of the MSData library, the above terms become constants:
> >>
> >> MS_matrix_assisted_laser_desorption_ionization = 1000075,
> >>
> >> MS_fourier_transform_ion_cyclotron_resonance_mass_spectrometer = 1000079,
> >>
> >> But I think the following is more programmer-friendly:
> >>
> >> MS_MALDI = 1000075,
> >>
> >> MS_FT_ICR_MS = 1000079,
> >>
> >> Darren
> >>
> >> Darren Kessner
> >> Scientific Programmer
> >> Dar...@cs... <mailto:Dar...@cs...>
> >> 310-423-9538
> >>
> >> Spielberg Family Center for Applied Proteomics
> >> Cedars-Sinai Medical Center
> >> http://www.sfcap.cshs.org/
|
From: Eric D. <ede...@sy...> - 2008-02-12 21:36:07
|
Hi Darren, is this issue closed or still open? There is the suggestion to use the shortest exact synonym for your purpose. Is that adequate?

Regarding:

> >> I would like to propose using standard acronyms in the CV term names
> >> when it is clear what they mean.

I believe it is the policy of the PSI CV designers (not just the MS CV) that acronyms should NOT be used as the term name, but rather should be a synonym. Can anyone confirm that?

Thanks,
Eric

> From: psi...@li... [mailto:psidev-ms-dev-
>
> The OBO has "exact_synonyms" like this:
>
> [Term]
> id: MS:1000079
> name: fourier transform ion cyclotron resonance mass spectrometer
> def: "A mass spectrometer based on the principle of ion cyclotron resonance in which an ion in a magnetic field moves in a circular orbit at a frequency characteristic of its m/z value. Ions are coherently excited to a larger radius orbit using a pulse of radio frequency energy and their image charge is detected on receiver plates as a time domain signal. Fourier transformation of the time domain signal results in a frequency domain signal which is converted to a mass spectrum based in the inverse relationship between frequency and m/z." [PSI:MS]
> exact_synonym: "FT_ICR" []
> is_a: MS:1000443 ! mass analyzer type
>
> Darren, I suggest you parse both the term name and its synonyms into a set for that term, and choose from it the shortest string to put in the enum. :)
>
> -Matt |
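Matt's suggestion (collect the term name and its exact synonyms, then take the shortest string for the enum) could be sketched roughly as follows. This is only an illustration in Python, not the MSData code generator: the stanza is the one quoted in the message with the definition left out, and the helper names `candidate_names`, `enum_identifier` and `accession_number` are made up for this example.

```python
import re

# OBO stanza from the message above ("def:" line omitted for brevity).
STANZA = '''[Term]
id: MS:1000079
name: fourier transform ion cyclotron resonance mass spectrometer
exact_synonym: "FT_ICR" []
is_a: MS:1000443 ! mass analyzer type
'''


def candidate_names(stanza):
    """Collect the term name and all exact synonyms from one [Term] stanza."""
    names = set()
    for line in stanza.splitlines():
        if line.startswith("name:"):
            names.add(line[len("name:"):].strip())
        elif line.startswith("exact_synonym:"):
            match = re.search(r'"([^"]*)"', line)
            if match:
                names.add(match.group(1))
    return names


def enum_identifier(stanza, prefix="MS_"):
    """Pick the shortest candidate and turn it into an identifier."""
    # Ties are broken alphabetically so the result is deterministic.
    shortest = min(candidate_names(stanza), key=lambda s: (len(s), s))
    return prefix + re.sub(r"\W+", "_", shortest).strip("_")


def accession_number(stanza):
    """Return the numeric part of the 'id:' line."""
    match = re.search(r"^id:\s*\w+:(\d+)", stanza, flags=re.MULTILINE)
    return int(match.group(1))


# Emit one enum line in the style Darren showed.
print(f"{enum_identifier(STANZA)} = {accession_number(STANZA)},")
```

The sketch does not address uniqueness: two terms sharing the same shortest synonym would produce the same identifier, which is the question Josh raised about an "acronym" field.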
From: Angel P. <an...@ma...> - 2008-02-12 21:34:18
|
+1 agreed
-angel

On Feb 12, 2008 4:27 PM, Eric Deutsch <ede...@sy...> wrote:
> So there seems to be broad consensus (4 for 4;) that moving the
> arrayLength up a little higher is a good idea. So instead of:
>
> <spectrum id="S19" scanNumber="19" msLevel="1">
>   <spectrumDescription>
>     ...
>   </spectrumDescription>
>   <binaryDataArray arrayLength="1313" encodedLength="5433"
>                    dataProcessingRef="Xcalibur Processing">
>     ...
>     <binary>AAAAwDsGeUAAAAD...</binary>
>   </binaryDataArray>
>   <binaryDataArray arrayLength="1313" encodedLength="4892">
>     ...
>     <binary>AAAAAIBJxk...</binary>
>   </binaryDataArray>
> </spectrum>
>
> We will have:
>                                                !!!!!!!!!!!!!!!!!!
> <spectrum id="S19" scanNumber="19" msLevel="1" arrayLength="1313">
>   <spectrumDescription>
>     ...
>   </spectrumDescription>
>   <binaryDataArray encodedLength="5433"
>                    dataProcessingRef="Xcalibur Processing">
>     ...
>     <binary>AAAAwDsGeUAAAAD...</binary>
>   </binaryDataArray>
>   <binaryDataArray encodedLength="4892">
>     ...
>     <binary>AAAAAIBJxk...</binary>
>   </binaryDataArray>
> </spectrum>
>
> Agreed?

--
Angel Pizarro
Director, ITMAT Bioinformatics Facility
806 Biological Research Building
421 Curie Blvd.
Philadelphia, PA 19104-6160
215-573-3736 |
From: Eric D. <ede...@sy...> - 2008-02-12 21:27:54
|
So there seems to be broad consensus (4 for 4;) that moving the arrayLength up a little higher is a good idea. So instead of:

<spectrum id="S19" scanNumber="19" msLevel="1">
  <spectrumDescription>
    ...
  </spectrumDescription>
  <binaryDataArray arrayLength="1313" encodedLength="5433"
                   dataProcessingRef="Xcalibur Processing">
    ...
    <binary>AAAAwDsGeUAAAAD...</binary>
  </binaryDataArray>
  <binaryDataArray arrayLength="1313" encodedLength="4892">
    ...
    <binary>AAAAAIBJxk...</binary>
  </binaryDataArray>
</spectrum>

We will have:
                                               !!!!!!!!!!!!!!!!!!
<spectrum id="S19" scanNumber="19" msLevel="1" arrayLength="1313">
  <spectrumDescription>
    ...
  </spectrumDescription>
  <binaryDataArray encodedLength="5433"
                   dataProcessingRef="Xcalibur Processing">
    ...
    <binary>AAAAwDsGeUAAAAD...</binary>
  </binaryDataArray>
  <binaryDataArray encodedLength="4892">
    ...
    <binary>AAAAAIBJxk...</binary>
  </binaryDataArray>
</spectrum>

Agreed?

> -----Original Message-----
> From: psi...@li... [mailto:psidev-ms-dev-
> bo...@li...] On Behalf Of Matthew Chambers
> Sent: Wednesday, February 06, 2008 10:49 AM
> To: Mass spectrometry standard development
> Subject: Re: [Psidev-ms-dev] binaryArrayData lengths
>
> I agree that the primary data arrays should probably be treated as special in the schema so it's clear that they are paired values and thus peak count could move into the spectrum element or spectrumDescription. There should still be options to have additional arrays that aren't the same as the main arrays (for example, an additional set of arrays, one for a subset of the m/zs and the other for peak charge information).
>
> -Matt
>
> Kessner, Darren E. wrote:
> > Any other comments regarding <binaryArrayData> lengths?
> >
> >> (from Rune)
> >> If they have to be equal size, then
> >> that size ought to be specified in the spectrumDescription.
> >
> > I agree -- I would like to encode the length in <spectrum> somewhere
> > (either attribute or cvParam) so that:
> > 1) it's clear that the arrays are of equal size
> > 2) Readers don't have to peek into the attributes of the first
> > <binaryArrayData> to get the info
> >
> > I need this right now for the MSData RAMP adapter code, so I'll encode
> > it as a <userParam> until a decision has been made on the specification.
> >
> > Darren |
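To make the reader-side effect of the proposal concrete, here is a minimal Python sketch of how a reader might obtain the array length under either layout. It is an illustration only: the two XML fragments are cut-down versions of the examples in the message, the function name `array_length` is made up, and a real file may declare an XML namespace, which this sketch ignores.

```python
import xml.etree.ElementTree as ET

# Proposed layout: one arrayLength on <spectrum>.
PROPOSED = """
<spectrum id="S19" scanNumber="19" msLevel="1" arrayLength="1313">
  <binaryDataArray encodedLength="5433"/>
  <binaryDataArray encodedLength="4892"/>
</spectrum>
"""

# Current layout: arrayLength repeated on each <binaryDataArray>.
CURRENT = """
<spectrum id="S19" scanNumber="19" msLevel="1">
  <binaryDataArray arrayLength="1313" encodedLength="5433"/>
  <binaryDataArray arrayLength="1313" encodedLength="4892"/>
</spectrum>
"""


def array_length(spectrum):
    """Return the number of points in the spectrum's paired arrays."""
    # Proposed layout: the length sits directly on <spectrum>.
    length = spectrum.get("arrayLength")
    if length is not None:
        return int(length)
    # Current layout: peek into every <binaryDataArray> and insist they agree.
    lengths = {
        int(array.get("arrayLength"))
        for array in spectrum.findall("binaryDataArray")
        if array.get("arrayLength") is not None
    }
    if len(lengths) != 1:
        raise ValueError("arrays are missing arrayLength or disagree on it")
    return lengths.pop()


print(array_length(ET.fromstring(PROPOSED)))
print(array_length(ET.fromstring(CURRENT)))
```

With the attribute on `<spectrum>`, the equal-size requirement is implicit and the fallback branch is not needed; arrays of a different length, as in Matt's example, would need some other mechanism that this sketch does not cover.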
From: Angel P. <an...@ma...> - 2008-02-12 20:20:32
|
On Feb 12, 2008 2:52 PM, Eric Deutsch <ede...@sy...> wrote:
> Hi everyone, I'm trying to see if we can get to some consensus on some
> of these ongoing threads. Regarding the "unknown instrument" problem, I
> think there has been some confusion, so let me see if I can clarify and
> ask for a final round of opinions. I agree with Fredrik's comments
> below that his examples below are *not* what is intended. Here is what I
> believe Lennart intended:
>
> A)
> <cvParam cvLabel="MS" accession="MS:1000031" name="instrument model" value=""/>
>
> Or the other alternative is to create a term for unknown:
>
> B)
> <cvParam cvLabel="MS" accession="MS:1099931" name="unknown instrument model" value=""/>
> (where the number is obviously made up by me right now, but would be in
> the CV)
>
> So those are the choices. Putting something in the value attribute is
> not an option as Fredrik concludes below.
>
> Benefits of A)
> - No need to litter the CV with "xxx unknown" terms
> - Happens to be very easy for the existing validator software to accommodate
> - Somewhat counterintuitive and thus dissuades laziness
> Drawbacks of A)
> - Somewhat counterintuitive and awkward

It's not counterintuitive to me, but I don't care much which way we go. E.g. I am not invested in this scheme the way that the validator folks might be.

-angel |
From: Matthew C. <mat...@va...> - 2008-02-12 20:06:25
|
Hi all,

Eric, you judge correctly that I prefer B, but there is a third option (just for completeness):

C)
Yes, it's blank on purpose. :)

It's conceivable that parameters which are acceptably unknown (in at least non-MIAPE mode, but possibly MIAPE as well) could just be absent from the parameter group in which they would be valid.

Benefits of C)
- No need to litter the CV with "xxx unknown" terms
- Should be easy for the existing validator to accommodate
- Intuitive

Drawbacks of C)
- Could be confused for "I didn't think that parameter belonged in that section" or "I didn't think that parameter was applicable in my circumstances;" i.e. "unknown" would become synonymous with "n/a", which is clearly undesirable.

-Matt |
From: Eric D. <ede...@sy...> - 2008-02-12 19:52:06
|
Hi everyone, I'm trying to see if we can get to some consensus on some of these ongoing threads. Regarding the "unknown instrument" problem, I think there has been some confusion, so let me see if I can clarify and ask for a final round of opinions. I agree with Fredrik's comments below that his examples below are *not* what is intended. Here is what I believe Lennart intended: A) <cvParam cvLabel="MS" accession="MS:1000031" name="instrument model" value=""/> Or the other alternative is to create a term for unknown: B) <cvParam cvLabel="MS" accession="MS:1099931" name="unknown instrument model" value=""/> (where the number is obviously made up by me right now, but would be in the CV) So those are the choices. Putting something in the value attribute is not an option as Fredrik concludes below. Benefits of A) - No need to litter the CV with "xxx unknown" terms - Happenstance very easy for the existing validator software to accommodate - Somewhat counterintuitive and thus dissuades laziness Drawbacks of A) - Somewhat counterintuitive and awkward Benefits of B) - Very intuitive and straightforward: the concept of what instrument generated these spectra is captured by the concept "sorry, I just don't know which instrument it was" Drawbacks of B) - Opens the door to perhaps needing to sprinkle other unknowns in the CV - Is a little more inviting to users to be lazy and claim they don't know, when with a little more effort they could find out and report properly (because "unknown" is not an *obvious* option) - Would require more development in the validator to properly handle a special term like this. Based on the feedback I saw so far, Lennart, Luisa and Angel like A. Matt seemed more in favor of B. No clear reads on others. I myself prefer B. To me it feels like A is a convenient but counterintuitive trick to working around the problem. B feels like the right solution even if it facilitates laziness. I don't think that will be a big problem. 
I'm sure we can come up with some syntax for the validator to permit or disallow "ambiguity terms" as desired. So, what say ye? > From: psi...@li... [mailto:psidev-ms-dev- > > Hi Lennart, Josh, Matt and others, > > If the top level term is allowed it will be possible to define not only > instrument value='unknown', but also instruments that are not in the CV > by putting something in the value field: > <cvParam cvLabel="MS" accession="MS:1000031" name="instrument model" > value="The new mass spec not in CV"/> > <cvParam cvLabel="MS" accession="MS:1000031" name="instrument model" > value="unknown"/> > Instead of the intended: > <cvParam cvLabel="MS" accession="MS:1000189" name="q-tof ultima" > value=""/> > I'm not so sure that this is wanted. Especially since unknown could be > written as 'not known', 'not specified' etcetera. It make sense to have > a CV term for 'unknown', but it would be quite a few 'unknown' terms to > add to the CV to get one for each required category in the mzML > schema...At some places it would be enough with just 'unknown' > (source,detector etc), but at other places it must be specified what is > unknown! > > Anyway, I am still for usage of top level elements :-) , see line 16 at: > http://trac.thep.lu.se/trac/fp6- > prodac/browser/trunk/mzML/FF_070504_MSMS_5B.mzML > > cheers > > Fredrik > > Joshua Tasman skrev: > > I'm with Matt on this one, and like his solution. There are > unfortunately lots of real use cases (combining dta, mgfs) where the > information will really be unknown, and we should accurately represent the > lack of information. If it's not too much effort to add a little more > code to the validator, I would much prefer the accurate addition of an > "unknown" term. There has been so much effort getting the CV and document > to line up with reality, it looks very strange to me to force this > ontological 'hack' by allowing the category to appear as a value, as Matt > has said. 
> > Josh
> >
> > Matthew Chambers wrote:
> >> Lennart Martens wrote:
> >>> Hi Matt, and Colleagues,
> >>>
> >>>> I don't really prefer one to the other very much, but I don't see how
> >>>> the parent term would be easier to validate ("all but X children of a
> >>>> term" doesn't make sense to me, do you mean "all children of a term
> >>>> except X"?)
> >>>
> >>> You are right; I provided bad shorthand for: 'all children of a term,
> >>> except X (and Y, and Z, ... -- potentially)'.
> >>>
> >>> The reason why it is easier to validate is due to the way the
> >>> validator mapping file is designed, e.g. (example verbatim from current
> >>> 0.99.1 mapping file):
> >>>
> >>> <CvTerm termAccession="MS:1000031" useTerm="false"
> >>> termName="instrument model" isRepeatable="false"
> >>> scope="/mzML/instrumentList/instrument" allowChildren="true"
> >>> cvIdentifier="MS"></CvTerm>
> >>>
> >>> this means that although all children of term 'MS:1000031 -- instrument
> >>> model' are allowed (allowChildren="true"), the term itself is not
> >>> allowed (useTerm="false"). By flipping this latter boolean, we can allow
> >>> the parent term, thus separating between MIAPE requirements (current
> >>> configuration) and the 'usable mzML requirements' (flipped boolean as
> >>> explained above) -- for the instrument model at least.
> >>
> >> OK, so it's an implementation thing. That's fine.
> >>
> >>>> What about data converted from DTAs or MGFs
> >>>> where the user doesn't even remember (or never knew) what kind of
> >>>> instrument it came from?
> >>>
> >>> When the instrument is really unknown (which is unfortunate and
> >>> constitutes dramatic metadata loss whichever way you look at it), the
> >>> proposed scenario (usage of toplevel term) provides solace. For all
> >>> other scenarios (where an incentive to adapt convertor software or
> >>> report the development of a new instrument is concerned), the relative
> >>> obscurity of the 'fix' might contribute to 'going the extra mile'
> >>> (upgrading the convertor, mailing in the new instrument name).
> >>
> >> While the toplevel term does provide some solace, it is obscure enough
> >> that a casual user might look at it and think that something was wrong
> >> because it does not intuitively make sense for the category to appear as
> >> a value. What about this alternative: provide an "unknown instrument"
> >> term with a unique accession #, but make the term name something like
> >> "unknown (instrument type not specified or not in CV)". That would be
> >> intuitive but still eye-catching (and it would be the eye-catching part
> >> that implementors would want to minimize, because it makes them look
> >> bad). ;)
> >>
> >> -Matt
> >>
> >> -------------------------------------------------------------------------
> >> This SF.net email is sponsored by: Microsoft
> >> Defy all challenges. Microsoft(R) Visual Studio 2008.
> >> http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
> >> _______________________________________________
> >> Psidev-ms-dev mailing list
> >> Psi...@li...
> >> https://lists.sourceforge.net/lists/listinfo/psidev-ms-dev |
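The useTerm / allowChildren rule described in the quoted message can be sketched roughly as follows. This is only an illustration of the logic as explained in the thread, not the actual validator code: the class `CvTermRule`, the functions, and the tiny `PARENTS` map are all hypothetical, and "LCQ Deca" (MS:1000554) is placed directly under "instrument model" purely to keep the example short.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CvTermRule:
    """Hypothetical in-memory form of one <CvTerm> mapping rule."""
    term_accession: str
    use_term: bool
    allow_children: bool


# Hypothetical CV fragment: child accession -> parent accession.
PARENTS = {
    "MS:1000554": "MS:1000031",  # simplified: real CV has vendor levels in between
}


def is_descendant(accession, ancestor, parents):
    """Walk up the parent links and report whether `ancestor` is reached."""
    current = parents.get(accession)
    while current is not None:
        if current == ancestor:
            return True
        current = parents.get(current)
    return False


def is_allowed(accession, rule, parents):
    """Apply the rule: the term itself needs use_term, children need allow_children."""
    if accession == rule.term_accession:
        return rule.use_term
    return rule.allow_children and is_descendant(accession, rule.term_accession, parents)


# "MIAPE" configuration as quoted above, and the flipped-boolean variant.
miape_rule = CvTermRule("MS:1000031", use_term=False, allow_children=True)
basic_rule = CvTermRule("MS:1000031", use_term=True, allow_children=True)
```

With `miape_rule` the parent term MS:1000031 is rejected while its children pass; with `basic_rule` both are accepted, which is the separation between the two validation levels that the message describes.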
From: Kessner, D. E. <Dar...@cs...> - 2008-02-11 20:21:29
|
Hi all,

Just wanted a clarification on the encoding of the instrument manufacturer.

In the example tiny*.mzML, we have:

<instrument>
  <cvParam cvLabel="MS" accession="MS:1000554" name="LCQ Deca" value=""/>
  ...
</instrument>

In the CV we have the following branch:

- "instrument description"
  - "model by vendor"
    - "Thermo Fisher Scientific"
      - "Thermo Finnigan"
        - "LCQ Deca"

Is this the intended procedure for determining the instrument manufacturer?

1) look for a cvParam child of "model by vendor" <find "LCQ Deca">
2) walk up the branch until you get to the immediate child of "model by vendor" <walk back to "Thermo Fisher Scientific">

Or do we want to encode the manufacturer as a separate CV term in the <instrument> element?

One other thing, the Thermo tree looks like:

- "Thermo Fisher Scientific"
  - "Finnigan MAT"
    - some instruments
  - "Thermo Electron"
    - one instrument
  - "Thermo Finnigan"
    - some instruments
  - "Thermo Scientific"
    - more instruments

Perhaps this tree should be flattened like the other vendors' CV trees?

Darren

Darren Kessner
Scientific Programmer
Dar...@cs...
310-423-9538

Spielberg Family Center for Applied Proteomics
Cedars-Sinai Medical Center
http://www.sfcap.cshs.org/

IMPORTANT WARNING: This message is intended for the use of the person or entity to which it is addressed and may contain information that is privileged and confidential, the disclosure of which is governed by applicable law. If the reader of this message is not the intended recipient, or the employee or agent responsible for delivering it to the intended recipient, you are hereby notified that any dissemination, distribution or copying of this information is STRICTLY PROHIBITED. If you have received this message in error, please notify us immediately by calling (310) 423-6428 and destroy the related message. Thank You for your cooperation. |
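The two-step procedure asked about above (find the model term, then walk up to the immediate child of "model by vendor") could be sketched like this. The `PARENT` map and the function name are hypothetical; the map only mirrors the branch quoted in the message and uses term names instead of accessions for readability.

```python
# Hypothetical child -> parent map mirroring the branch quoted in the message.
PARENT = {
    "LCQ Deca": "Thermo Finnigan",
    "Thermo Finnigan": "Thermo Fisher Scientific",
    "Thermo Fisher Scientific": "model by vendor",
    "model by vendor": "instrument description",
}


def find_manufacturer(term, parent, root="model by vendor"):
    """Return the ancestor of `term` (or `term` itself) that is an immediate
    child of `root`, or None if `term` is not below `root`."""
    current = term
    seen = set()  # guards against accidental cycles in the map
    while current in parent and current not in seen:
        seen.add(current)
        if parent[current] == root:
            return current
        current = parent[current]
    return None


manufacturer = find_manufacturer("LCQ Deca", PARENT)
```

Here `manufacturer` ends up as "Thermo Fisher Scientific". Note that with the nested Thermo tree the intermediate level ("Thermo Finnigan") is skipped over, which is one reason the question of flattening the tree matters.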
From: Eric D. <ede...@sy...> - 2008-02-08 08:26:55
|
Hi everyone, the minutes from Tuesday's telecon are posted on the psidev site by Lennart:

http://psidev.info/index.php?q=node/315

Below is a revised list of things to do. This does not include the lively discussion on the list on Wednesday.

Thanks to those who volunteered to pick up certain tasks at the call. If there are other volunteers, we would be grateful!

Regards,
Eric

------------------------------------

mzML Aims:
-----------------------
- Have all format changes complete by the Toledo meeting (hopefully sooner)
- Mop up documentation and loose threads at the meeting.
- 100.00% done by ASMS

Schedule:
-----------------------
Jan 25: mzML reviews returned. Official community review complete.
Feb 5: mzML telecon 9:00am PST
Feb 19: mzML telecon 9:00am PST
Mar 4: mzML telecon 9:00am PST
Mar 17: US HUPO meeting
Mar 25: mzML telecon 9:00am PST
Apr 8: mzML telecon 9:00am PST
Apr 23: PSI meeting in Toledo
May
Jun 1-5: ASMS - Must be done and advertising it here!

News items:
-----------------------
- Official community reviews are in and sent around
- Abstract was submitted for ASMS. May be selected as a week-long display

To do list:
-----------------------

Schema changes:
---------------
- Incorporate Phil's suggested schema typing changes of 1/24
- Figure out how to implement datatype validation in cvParams
- For consistency binaryDataArray should be in List????
- Address suggestions from Darren 1/22
- Fix instances in spec doc of instrumentType instead of instrument, etc.
- Fix spectrumRef to point to id instead of scanNumber in example docs
- Get full name "Proteomics Standards Initiative Mass Spectrometry Ontology" in obo file
- Replace <referenceableParamGroup> with <paramGroup>
- Remove <instrumentSoftwareRef> and use <softwareRef>
- Change to:
    <cv id="MS" ... >
    ...
    <cvParam cvRef="MS" ...>
- <sourceFile id="1" sourceFileName="tiny1.RAW" sourceFileLocation="file://F:/data/Exp01" >
  should be shortened to:
  <sourceFile id="1" name="tiny1.RAW" location="file://F:/data/Exp01" >
- Address the Mallick lab need for multiple precursors
    Allow multiple ionSelection elements
    allow terms for precursorIntensity score confidence
    see snippet on email thread 11/26
- Rune suggests allowing a range for the precursor like for MS^E (or technically there's always a window). See 12/7
- There was a discussion on 11/22 - 11/24 that essentially boils down to: Can we encode the MS inclusion list in an mzML file?
- Ask Randy how his "engineers are currently encoding chromatograms into mzData 1.05 using supplemental data vectors (not pretty)"
- Invite Mike MacCoss to help with chromatograms (via Parag)
- Decide on the open issue discussed in the spec doc regarding cvParam attributes
- What to do for sourceFile when the source is really a directory of files for the run instead of a single file?
- Can the arrayLengths for <binaryDataArray> ever be different within one <spectrum>? If not, maybe it should be specified only once somewhere?
- SpectrumDescription changes from Randy 2/5

Example files:
---------------
- Fix spectrumRef in examples
- mzML <--> MIAPE-MS mapping assessment. Build on work by Pierre-Alain & Frederik
- Get some of JimS's example files into a public area
- Work with Waters to get MS^E examples made
- We need to develop a good MALDI example file with spot ids
- We need to develop a good example of a file created from individual dtas
- We need to develop a good example of a file that contains summed scans
- Randy will provide a list of things that we don't handle yet that he thinks we should for subsequent followup

Address reviewer comments:
---------------
- Address reviewer points in a document
- Angel's comments to the reviews
- Address the blunt criticisms summarized by Angel on 1/14

Validator:
---------------
- Make validator enforce this ascending scanNumber rule
- Update validator to check datatypes
- Update validator to 0.99.2
- Set up both basic and MIAPE-MS validation levels

CV work:
---------------
- Figure out where we left off on CV
- Add "scan event" or similar to CV
- Need to get the relevant CV part into all vendors' hands to update
- Various other CV open items to address
- Can we make the CV have the distinction between categories and terms?
- What do we do in a case like with MassWolf where it cannot know the instrument model?
- Coordinate submission of PSI-MS to OBO Foundry via Chris Mungall
- Add "unknown instrument"
- Get both Kermit Murray and David Sparkman involved in the CV
- Need to make a crystal clear new term submission path

Documentation:
---------------
- Clarify in spec doc that the binary data arrays are base64 encoded
- Should we get the indexing documented as an appendix?
- Include checksum definition
- Improve scanNumber ascending requirement in documentation
- Address locale issues in spec doc (offending example from mzData):
    <spectrumInstrument msLevel="1" mzRangeStart="75,00" mzRangeStop="1000,00">
    <cvParam cvLabel="psi" accession="PSI:1000038" name="TimeInMinutes" value="0,033" />
- Document why we chose not to encode SRM data as chromatograms
- Get the information encoded in the validator's mapping file into the spec
- State that not all mzML files need to be MIAPE-MS compliant. There will be a basic set of requirements and a second mapping file for full MIAPE-MS compliance
- Can the arrayLengths for <binaryDataArray> ever be different within one <spectrum>?

Related software:
---------------
- Update ReAdW and Wolf for mzML 0.99.2
- Add support for mzML 0.99.2 to mzWiff and Hunter
- Finish off other converter loose ends
- Fix current indexing and binary encoding bugs reported by Darren 1/28
- Darren Kessner's msData C++ library reads/will read mzML
- Brian Pratt is implementing RAMP parser for mzML using Darren's library
- Get TPP / ISB workflow working with mzML
- Brian suggests a single C/C++ codebase with SWIG bindings to minimize implementation differences
- Pierre-Alain will be building an mzML reader into his system
- Jim Shofstahl already has an mzML -> SRF converter that then feeds into SEQUEST
- Randy will definitely be reading mzML files into his data system by ASMS
- It would be useful to have converters that could prompt the user for information that is not available
- Other software? |
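To illustrate the locale item in the Documentation list above: values such as "75,00" or "0,033" come from writers that format numbers with a locale using a comma as decimal separator, and a strict reader cannot parse them. The sketch below is only an illustration of that failure mode and of locale-independent writing; the function names `read_decimal` and `write_decimal` are hypothetical and not part of any mzML tool.

```python
def read_decimal(text):
    """Parse a decimal that uses '.' as separator; reject anything else."""
    try:
        return float(text)
    except ValueError:
        raise ValueError(f"not a locale-independent decimal: {text!r}") from None


def write_decimal(value, digits=2):
    """Format a number with '.' as separator regardless of the process locale.

    The 'f' presentation type does not consult the locale (only 'n' does).
    """
    return format(value, f".{digits}f")


ok_value = read_decimal("75.00")      # parses fine
try:
    read_decimal("75,00")             # the offending mzData style
    comma_rejected = False
except ValueError:
    comma_rejected = True
```

A converter that always writes through something like `write_decimal` would not produce the comma-separated values quoted in the list, whatever locale the machine is set to.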
From: Fredrik L. <Fre...@im...> - 2008-02-07 18:09:42
|
Yes, consider the following examples:

<cvParam cvLabel="MS" accession="MS:1000079" name="q-tof micro" value=""/>
<cvParam cvLabel="MS" accession="MS:1000079" name="customized FT-ICR" value=""/>

In my view these are not valid, since the accession numbers and names are not matching. If the mzML definition allows anything to be written into the name field it has to be clearly specified. I would guess that most people that inspect the file manually would only look at the name field, while software should use the accession field, this could clearly lead to errors.

Could a possible definition of a valid cvParam be that the name field must match any of the exact synonyms in the CV version that it was made from?

Fredrik

Eric Deutsch wrote:
> The simple Perl XML validator that is distributed in the kit does check
> to see if the cv param names match the id's. It would be nice if this
> functionality could be added to the main validator. |
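The check Fredrik proposes (the name must agree with the accession, allowing exact synonyms) might look roughly like the sketch below. Everything here is an assumption for illustration: the `CV` dictionary is a hand-written excerpt, the synonyms "MALDI" and "FT-ICR MS" are taken from Darren's proposal in this thread rather than from the actual CV file, and the sketch also accepts the primary term name, which goes slightly beyond the literal wording of the proposal.

```python
# Hypothetical CV excerpt: accession -> (term name, set of exact synonyms).
CV = {
    "MS:1000075": ("matrix assisted laser desorption ionization", {"MALDI"}),
    "MS:1000079": (
        "fourier transform ion cyclotron resonance mass spectrometer",
        {"FT-ICR MS"},
    ),
}


def check_cv_param(accession, name, cv):
    """Classify a cvParam as 'ok', 'name mismatch' or 'unknown accession'."""
    if accession not in cv:
        return "unknown accession"
    term_name, synonyms = cv[accession]
    if name == term_name or name in synonyms:
        return "ok"
    return "name mismatch"


first_example = check_cv_param("MS:1000079", "q-tof micro", CV)
acronym_example = check_cv_param("MS:1000075", "MALDI", CV)
```

Under these assumptions the two cvParams in Fredrik's message are flagged as "name mismatch", while an acronym that is listed as an exact synonym passes. The comparison is case-sensitive here; whether capitalization should matter is a separate decision.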
From: Eric D. <ede...@sy...> - 2008-02-07 17:21:15
|
The simple Perl XML validator that is distributed in the kit does check to see if the cv param names match the id's. It would be nice if this functionality could be added to the main validator.

> -----Original Message-----
> From: psi...@li... [mailto:psidev-ms-dev-bo...@li...] On Behalf Of jtasman
> Sent: Thursday, February 07, 2008 7:51 AM
> To: len...@eb...; Mass spectrometry standard development
> Subject: Re: [Psidev-ms-dev] CV param readability
>
> Great, thanks for clearing that up. I must have been remembering a
> previous version of the validator.
>
> Josh
>
> Lennart Martens wrote:
> > Hi Josh,
> >
> >> Yes, but the official validator will complain if the 'name' string
> >> doesn't exactly match the CV, capitalization included.
> >
> > This is incorrect (I actually tested it with version 0.99.1 of the
> > semantic mzML validator we wrote and distributed in the kit).
> >
> > The tool validates solely on the CV accession numbers, because this is
> > the only 'fixed quantity' (names have exact synonyms for instance), and
> > the 'name' attribute is mainly in the standard format for readability
> > reasons.
> >
> > So I think it's acceptable if people use acronyms instead of full
> > versions, especially if they are present as exact synonyms in the CV.
> > If the correct CV accession number is used for a term, there should
> > never be a problem, since it has been decided throughout the PSI a long
> > time ago (and I think we can all agree on this as well) that the
> > accession number takes precedence over the name at all times. |
From: Kessner, D. E. <Dar...@cs...> - 2008-02-07 16:56:43
|
Thank you to everyone for clearing up the issue for me!

Darren

-----Original Message-----
From: psi...@li... [mailto:psi...@li...] On Behalf Of jtasman
Sent: Thursday, February 07, 2008 7:51 AM
To: len...@eb...; Mass spectrometry standard development
Subject: Re: [Psidev-ms-dev] CV param readability

Great, thanks for clearing that up. I must have been remembering a previous version of the validator.

Josh

Lennart Martens wrote:
> Hi Josh,
>
>> Yes, but the official validator will complain if the 'name' string
>> doesn't exactly match the CV, capitalization included.
>
> This is incorrect (I actually tested it with version 0.99.1 of the
> semantic mzML validator we wrote and distributed in the kit).
>
> The tool validates solely on the CV accession numbers, because this is
> the only 'fixed quantity' (names have exact synonyms for instance), and
> the 'name' attribute is mainly in the standard format for readability
> reasons.
>
> So I think it's acceptable if people use acronyms instead of full
> versions, especially if they are present as exact synonyms in the CV.
> If the correct CV accession number is used for a term, there should
> never be a problem, since it has been decided throughout the PSI a long
> time ago (and I think we can all agree on this as well) that the
> accession number takes precedence over the name at all times.
>
>> Kessner, Darren E. wrote:
>>> I wasn't thinking about validation, since I'm ignoring the 'name'
>>> attribute.
>>>
>>> This is solely a readability issue, for the mzML and for MSData client
>>> code.
>>>
>>> Darren
>>>
>>> -----Original Message-----
>>> From: psi...@li...
>>> [mailto:psi...@li...] On Behalf Of Joshua Tasman
>>> Sent: Wednesday, February 06, 2008 12:00 PM
>>> To: Mass spectrometry standard development
>>> Subject: Re: [Psidev-ms-dev] CV param readability
>>>
>>> Hi Darren,
>>>
>>> Speaking only for myself, I think that the "name" attribute should be
>>> optional in the file and not interfere with validation. I've never
>>> understood why the text string needs to exactly match the CV for
>>> validation; someone on the list had brought up other languages, etc.
>>> But I think it came up on the list before, and requiring strict mapping
>>> between accession numbers and text string seemed to be important for the
>>> format.
>>>
>>> At the least, acronyms would require additional 'mapping files' or
>>> something similar to be added to the specification, and the validator to
>>> be updated. Maybe someone more familiar with these tasks could step in.
>>> Maybe the CV could be expanded so that every entry had an additional
>>> "acronym" field. This brings up other questions, like would uniqueness
>>> be enforced, etc?
>>>
>>> Josh
>>>
>>> Kessner, Darren E. wrote:
>>>> I would like to propose using standard acronyms in the CV term names
>>>> when it is clear what they mean.
>>>>
>>>> We currently have:
>>>>
>>>> <cvParam cvLabel="MS" accession="MS:1000075" name="matrix assisted laser desorption ionization" value=""/>
>>>> <cvParam cvLabel="MS" accession="MS:1000079" name="fourier transform ion cyclotron resonance mass spectrometer" value=""/>
>>>>
>>>> I think this is more readable:
>>>>
>>>> <cvParam cvLabel="MS" accession="MS:1000075" name="MALDI" value=""/>
>>>> <cvParam cvLabel="MS" accession="MS:1000079" name="FT-ICR MS" value=""/>
>>>>
>>>> The full name can still be available in the term description field.
>>>>
>>>> I have an ulterior motive for this -- in the code generation of the
>>>> MSData library, the above terms become constants:
>>>>
>>>> MS_matrix_assisted_laser_desorption_ionization = 1000075,
>>>> MS_fourier_transform_ion_cyclotron_resonance_mass_spectrometer = 1000079,
>>>>
>>>> But I think the following is more programmer-friendly:
>>>>
>>>> MS_MALDI = 1000075,
>>>> MS_FT_ICR_MS = 1000079,
>>>>
>>>> Darren
>>>>
>>>> Darren Kessner
>>>> Scientific Programmer
>>>> Dar...@cs... <mailto:Dar...@cs...>
>>>> 310-423-9538
>>>>
>>>> Spielberg Family Center for Applied Proteomics
>>>> Cedars-Sinai Medical Center
>>>> http://www.sfcap.cshs.org/ |
From: jtasman <jt...@sy...> - 2008-02-07 15:51:18
|
Great, thanks for clearing that up. I must have been remembering a previous version of the validator.

Josh
|
From: Lennart M. <len...@eb...> - 2008-02-07 10:31:17
|
Hi Josh,

> Yes, but the official validator will complain if the 'name' string doesn't exactly match the CV, capitalization included.

This is incorrect (I actually tested it with version 0.99.1 of the semantic mzML validator we wrote and distributed in the kit).

The tool validates solely on the CV accession numbers, because this is the only 'fixed quantity' (names have exact synonyms for instance), and the 'name' attribute is mainly in the standard format for readability reasons.

So I think it's acceptable if people use acronyms instead of full versions, especially if they are present as exact synonyms in the CV. If the correct CV accession number is used for a term, there should never be a problem, since it has been decided throughout the PSI a long time ago (and I think we can all agree on this as well) that the accession number takes precedence over the name at all times.

> Kessner, Darren E. wrote:
>> I wasn't thinking about validation, since I'm ignoring the 'name'
>> attribute.
>>
>> This is solely a readability issue, for the mzML and for MSData client
>> code.
>>
>> Darren
|
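Lennart's description of accession-only validation can be illustrated with a small sketch. This is not the code of the official semantic validator; the CV excerpt, the synonym sets, and the function name check_cv_param are assumptions made up for this example. It only shows the stated policy: the accession decides validity, and a differing 'name' is at most reported.

```python
import xml.etree.ElementTree as ET

# Hypothetical CV excerpt for illustration: accession -> (name, exact synonyms).
CV = {
    "MS:1000075": ("matrix assisted laser desorption ionization", {"MALDI"}),
    "MS:1000079": (
        "fourier transform ion cyclotron resonance mass spectrometer",
        {"FT_ICR"},
    ),
}


def check_cv_param(xml_text, cv=CV):
    """Return (is_valid, note); validity depends on the accession alone."""
    param = ET.fromstring(xml_text)
    accession = param.get("accession")
    if accession not in cv:
        return False, f"unknown accession {accession}"
    name, synonyms = cv[accession]
    given = param.get("name")
    if given == name or given in synonyms:
        return True, "name matches CV"
    # A differing name is reported but never fails validation.
    return True, f"name '{given}' differs from CV name '{name}'"


print(check_cv_param(
    '<cvParam cvLabel="MS" accession="MS:1000075" name="MALDI" value=""/>'
))
```

Under this policy an acronym in the 'name' attribute passes as long as the accession is one the CV knows.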
From: Kessner, D. E. <Dar...@cs...> - 2008-02-07 00:07:56
|
Great -- thanks, Matt. I hadn't seen that.

Darren

-----Original Message-----
From: psi...@li... [mailto:psi...@li...] On Behalf Of Matthew Chambers
Sent: Wednesday, February 06, 2008 12:06 PM
To: Mass spectrometry standard development
Subject: Re: [Psidev-ms-dev] CV param readability

The OBO has "exact_synonyms" like this:

[Term]
id: MS:1000079
name: fourier transform ion cyclotron resonance mass spectrometer
def: "A mass spectrometer based on the principle of ion cyclotron resonance in which an ion in a magnetic field moves in a circular orbit at a frequency characteristic of its m/z value. Ions are coherently excited to a larger radius orbit using a pulse of radio frequency energy and their image charge is detected on receiver plates as a time domain signal. Fourier transformation of the time domain signal results in a frequency domain signal which is converted to a mass spectrum based in the inverse relationship between frequency and m/z." [PSI:MS]
exact_synonym: "FT_ICR" []
is_a: MS:1000443 ! mass analyzer type

Darren, I suggest you parse both the term name and its synonyms into a set for that term, and choose from it the shortest string to put in the enum. :)

-Matt

Joshua Tasman wrote:
> Hi Darren,
>
> Speaking only for myself, I think that the "name" attribute should be optional in the file and not interfere with validation. I've never understood why the text string needs to exactly match the CV for validation; someone on the list had brought up other languages, etc. But I think it came up on the list before, and requiring strict mapping between accession numbers and text string seemed to be important for the format.
>
> At the least, acronyms would require additional 'mapping files' or something similar to be added to the specification, and the validator to be updated. Maybe someone more familiar with these tasks could step in. Maybe the CV could be expanded so that every entry had an additional "acronym" field. This brings up other questions, like would uniqueness be enforced, etc?
>
> Josh
>
> Kessner, Darren E. wrote:
>> I would like to propose using standard acronyms in the CV term names
>> when it is clear what they mean.
>>
>> We currently have:
>>
>> <cvParam cvLabel="MS" accession="MS:1000075" name="matrix assisted laser desorption ionization" value=""/>
>> <cvParam cvLabel="MS" accession="MS:1000079" name="fourier transform ion cyclotron resonance mass spectrometer" value=""/>
>>
>> I think this is more readable:
>>
>> <cvParam cvLabel="MS" accession="MS:1000075" name="MALDI" value=""/>
>> <cvParam cvLabel="MS" accession="MS:1000079" name="FT-ICR MS" value=""/>
>>
>> The full name can still be available in the term description field.
>>
>> I have an ulterior motive for this -- in the code generation of the
>> MSData library, the above terms become constants:
>>
>> MS_matrix_assisted_laser_desorption_ionization = 1000075,
>> MS_fourier_transform_ion_cyclotron_resonance_mass_spectrometer = 1000079,
>>
>> But I think the following is more programmer-friendly:
>>
>> MS_MALDI = 1000075,
>> MS_FT_ICR_MS = 1000079,
>>
>> Darren
>>
>> Darren Kessner
>> Scientific Programmer
>> Dar...@cs... <mailto:Dar...@cs...>
>> 310-423-9538
>>
>> Spielberg Family Center for Applied Proteomics
>> Cedars-Sinai Medical Center
>> http://www.sfcap.cshs.org/
|
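Matt's suggestion above (collect the term name and its exact synonyms into a set, then pick the shortest string for the enum) could look roughly like the sketch below. It is not the MSData code generator; the function names, the "MS_" prefix handling, and the rule that non-identifier characters become underscores are assumptions for illustration, and the stanza omits the long def: line for brevity.

```python
import re

# Abbreviated copy of the [Term] stanza quoted above (def: line omitted).
OBO_STANZA = '''[Term]
id: MS:1000079
name: fourier transform ion cyclotron resonance mass spectrometer
exact_synonym: "FT_ICR" []
is_a: MS:1000443 ! mass analyzer type
'''


def candidate_names(stanza):
    """Collect the term name and every exact synonym from one [Term] stanza."""
    names = set()
    for line in stanza.splitlines():
        if line.startswith("name:"):
            names.add(line[len("name:"):].strip())
        elif line.startswith("exact_synonym:"):
            # The synonym text is the first double-quoted string on the line.
            match = re.search(r'"([^"]*)"', line)
            if match:
                names.add(match.group(1))
    return names


def accession_number(stanza):
    """Return the numeric part of the 'id: MS:NNNNNNN' line."""
    for line in stanza.splitlines():
        if line.startswith("id:"):
            return int(line.split(":")[-1])
    raise ValueError("stanza has no id line")


def enum_entry(stanza, prefix="MS_"):
    """Build '<prefix><shortest name> = <accession>,' for the generated enum."""
    # Ties on length are broken alphabetically so the output is deterministic.
    shortest = min(candidate_names(stanza), key=lambda s: (len(s), s))
    identifier = prefix + re.sub(r"\W+", "_", shortest).strip("_")
    return f"{identifier} = {accession_number(stanza)},"


print(enum_entry(OBO_STANZA))
```

One thing a real generator would still have to decide is what to do when the shortest strings of two different terms collide, which is the uniqueness question Josh raised.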
From: Joshua T. <jt...@sy...> - 2008-02-06 22:51:36
|
Thanks for the input and useful example, Fredrik.

You've identified an important question to focus on: the intended usage of the schema with extra-CV items, like new machines or software tools.

I personally feel that the spec should be as complete as possible, and not encourage adding extra information in a non-validating way. What I'd like to see is a very active and well publicized mechanism for getting new terms added to the CV in a timely way, after the standardization is met.

The model I'd imagine is that companies/institutions beta-test their tools with modified mzML such as you created, and perhaps modify the validator for in-house usage. When the time comes to share these files externally, they would contact the standards group to add their term -- hopefully a painless and routine issue taking no more than a week or two.

(Again speaking only for myself, I would have preferred a much more strongly typed format, such as almost entirely element-based XML with traditional XML schema validation, rather than using cvParam and an external validator. Since we're doing this the cvParam way, I'd still like to keep things strictly conforming to the format.)

This is just my two cents. I'd like to hear from Lennart and everyone else too :)

Josh
|
From: Fredrik L. <Fre...@im...> - 2008-02-06 21:41:10
|
Hi Lennart, Josh, Matt and others, If the top level term is allowed it will be possible to define not only instrument value='unknown', but also instruments that are not in the CV by putting something in the value field: <cvParam cvLabel="MS" accession="MS:1000031" name="instrument model" value="The new mass spec not in CV"/> <cvParam cvLabel="MS" accession="MS:1000031" name="instrument model" value="unknown"/> Instead of the intended: <cvParam cvLabel="MS" accession="MS:1000189" name="q-tof ultima" value=""/> I'm not so sure that this is wanted. Especially since unknown could be written as 'not known', 'not specified' etcetera. It make sense to have a CV term for 'unknown', but it would be quite a few 'unknown' terms to add to the CV to get one for each required category in the mzML schema...At some places it would be enough with just 'unknown' (source,detector etc), but at other places it must be specified what is unknown! Anyway, I am still for usage of top level elements :-) , see line 16 at: http://trac.thep.lu.se/trac/fp6-prodac/browser/trunk/mzML/FF_070504_MSMS_5B.mzML cheers Fredrik Joshua Tasman skrev: > I'm with Matt on this one, and like his solution. There are unfortunately lots of real use cases (combining dta, mgfs) where the information will really be unknown, and we should accurately represent the lack of information. If it's not too much effort to add a little more code to the validator, I would much prefer the accurate addition of an "unknown" term. There has been so much effort getting the CV and document to line up with reality, it looks very strange to me to force this ontological 'hack' by allowing the category to appear as a value, as Matt has said. 
>
> Josh
>
>
> Matthew Chambers wrote:
>> Lennart Martens wrote:
>>> Hi Matt, and Colleagues,
>>>
>>>> I don't really prefer one to the other very much, but I don't see how
>>>> the parent term would be easier to validate ("all but X children of a
>>>> term" doesn't make sense to me, do you mean "all children of a term
>>>> except X"?)
>>>
>>> You are right; I provided bad shorthand for: 'all children of a term,
>>> except X (and Y, and Z, ... -- potentially)'.
>>>
>>> The reason why it is easier to validate is due to the way the
>>> validator mapping file is designed, e.g. (example verbatim from current
>>> 0.99.1 mapping file):
>>>
>>> <CvTerm termAccession="MS:1000031" useTerm="false"
>>> termName="instrument model" isRepeatable="false"
>>> scope="/mzML/instrumentList/instrument" allowChildren="true"
>>> cvIdentifier="MS"></CvTerm>
>>>
>>> This means that although all children of term 'MS:1000031 -- instrument
>>> model' are allowed (allowChildren="true"), the term itself is not
>>> allowed (useTerm="false"). By flipping this latter boolean, we can allow
>>> the parent term, thus separating between MIAPE requirements (current
>>> configuration) and the 'usable mzML requirements' (flipped boolean as
>>> explained above) -- for the instrument model at least.
>>
>> OK, so it's an implementation thing. That's fine.
>>
>>>> What about data converted from DTAs or MGFs
>>>> where the user doesn't even remember (or never knew) what kind of
>>>> instrument it came from?
>>>
>>> When the instrument is really unknown (which is unfortunate and
>>> constitutes dramatic metadata loss whichever way you look at it), the
>>> proposed scenario (usage of toplevel term) provides solace.
>>> For all other scenarios (where an incentive to adapt convertor software or
>>> report the development of a new instrument is concerned), the relative
>>> obscurity of the 'fix' might contribute to 'going the extra mile'
>>> (upgrading the convertor, mailing in the new instrument name).
>>
>> While the toplevel term does provide some solace, it is obscure enough
>> that a casual user might look at it and think that something was wrong
>> because it does not intuitively make sense for the category to appear as
>> a value. What about this alternative: provide an "unknown instrument"
>> term with a unique accession #, but make the term name something like
>> "unknown (instrument type not specified or not in CV)". That would be
>> intuitive but still eye-catching (and it would be the eye-catching part
>> that implementors would want to minimize, because it makes them look
>> bad). ;)
>>
>> -Matt
|
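For readers following the useTerm / allowChildren discussion in the message above, here is a minimal Python sketch of how such a mapping rule could be evaluated. It is not the validator's actual code: the MappingRule class, the is_allowed function, and the tiny PARENT_OF table are hypothetical, and the direct parent link from MS:1000189 to MS:1000031 is an assumption made only to keep the example small.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MappingRule:
    """Hypothetical stand-in for one <CvTerm> entry of the mapping file."""
    term_accession: str
    use_term: bool        # may the term itself be used?
    allow_children: bool  # may descendants of the term be used?


# Toy child -> parent table (assumption: the real CV hierarchy is larger and
# may place intermediate terms between these two accessions).
PARENT_OF = {"MS:1000189": "MS:1000031"}


def is_descendant(accession, ancestor, parent_of=PARENT_OF):
    """Walk up the parent chain and report whether `ancestor` is reached."""
    current = parent_of.get(accession)
    while current is not None:
        if current == ancestor:
            return True
        current = parent_of.get(current)
    return False


def is_allowed(accession, rule, parent_of=PARENT_OF):
    """Apply the two booleans of a rule to a single accession."""
    if accession == rule.term_accession:
        return rule.use_term
    return rule.allow_children and is_descendant(
        accession, rule.term_accession, parent_of
    )


# Current configuration quoted above: children allowed, parent term rejected.
miape_rule = MappingRule("MS:1000031", use_term=False, allow_children=True)
# "Flipped boolean" configuration: the parent term is accepted as well.
relaxed_rule = MappingRule("MS:1000031", use_term=True, allow_children=True)
```

With these toy definitions, is_allowed("MS:1000031", miape_rule) is rejected while is_allowed("MS:1000031", relaxed_rule) is accepted, which is the whole difference between the two configurations under discussion.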
From: Joshua T. <jt...@sy...> - 2008-02-06 20:36:45
|
Yes, but the official validator will complain if the 'name' string doesn't exactly match the CV, capitalization included.

-Josh

Kessner, Darren E. wrote:
> I wasn't thinking about validation, since I'm ignoring the 'name'
> attribute.
>
> This is solely a readability issue, for the mzML and for MSData client
> code.
>
>
> Darren
>
>
> -----Original Message-----
> From: psi...@li...
> [mailto:psi...@li...] On Behalf Of Joshua
> Tasman
> Sent: Wednesday, February 06, 2008 12:00 PM
> To: Mass spectrometry standard development
> Subject: Re: [Psidev-ms-dev] CV param readability
>
> Hi Darren,
>
> Speaking only for myself, I think that the "name" attribute should be
> optional in the file and not interfere with validation. I've never
> understood why the text string needs to exactly match the CV for
> validation; someone on the list had brought up other languages, etc.
> But I think it came up on the list before, and requiring strict mapping
> between accession numbers and text string seemed to be important for the
> format.
>
> At the least, acronyms would require additional 'mapping files' or
> something similar to be added to the specification, and the validator to
> be updated. Maybe someone more familiar with these tasks could step in.
> Maybe the CV could be expanded so that every entry had an additional
> "acronym" field. This brings up other questions, like would uniqueness
> be enforced, etc?
>
> Josh
>
>
> Kessner, Darren E. wrote:
>> I would like to propose using standard acronyms in the CV term names
>> when it is clear what they mean.
>>
>> We currently have:
>>
>> <cvParam cvLabel="MS" accession="MS:1000075" name="matrix assisted laser desorption ionization" value=""/>
>> <cvParam cvLabel="MS" accession="MS:1000079" name="fourier transform ion cyclotron resonance mass spectrometer" value=""/>
>>
>> I think this is more readable:
>>
>> <cvParam cvLabel="MS" accession="MS:1000075" name="MALDI" value=""/>
>> <cvParam cvLabel="MS" accession="MS:1000079" name="FT-ICR MS" value=""/>
>>
>> The full name can still be available in the term description field.
>>
>> I have an ulterior motive for this -- in the code generation of the
>> MSData library, the above terms become constants:
>>
>> MS_matrix_assisted_laser_desorption_ionization = 1000075,
>> MS_fourier_transform_ion_cyclotron_resonance_mass_spectrometer = 1000079,
>>
>> But I think the following is more programmer-friendly:
>>
>> MS_MALDI = 1000075,
>> MS_FT_ICR_MS = 1000079,
>>
>> Darren
>>
>> Darren Kessner
>> Scientific Programmer
>> Dar...@cs... <mailto:Dar...@cs...>
>> 310-423-9538
>>
>> Spielberg Family Center for Applied Proteomics
>> Cedars-Sinai Medical Center
>> http://www.sfcap.cshs.org/
>>
>> IMPORTANT WARNING: This message is intended for the use of the person or
>> entity to which it is addressed and may contain information that is
>> privileged and confidential, the disclosure of which is governed by
>> applicable law. If the reader of this message is not the intended
>> recipient, or the employee or agent responsible for delivering it to the
>> intended recipient, you are hereby notified that any dissemination,
>> distribution or copying of this information is STRICTLY PROHIBITED.
>>
>> If you have received this message in error, please notify us immediately
>> by calling (310) 423-6428 and destroy the related message. Thank You for
>> your cooperation.
|
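The two points in this exchange (the validator's exact, case-sensitive name check, and the constants derived from term names) can be illustrated with a small Python sketch. This is neither the official validator nor the MSData code generator: the CV_NAMES table and the functions name_matches_cv and constant_name are hypothetical, and the rule of replacing every run of non-alphanumeric characters with an underscore is an assumption inferred from the constant names quoted in the thread.

```python
import re

# Toy excerpt of accession -> official term name, taken from the examples
# quoted in the thread (the real CV is far larger).
CV_NAMES = {
    "MS:1000075": "matrix assisted laser desorption ionization",
    "MS:1000079": "fourier transform ion cyclotron resonance mass spectrometer",
}


def name_matches_cv(accession, name, cv_names=CV_NAMES):
    """Exact, case-sensitive comparison of a cvParam name with the CV name."""
    return cv_names.get(accession) == name


def constant_name(prefix, term_name):
    """Derive an identifier such as MS_FT_ICR_MS from a CV term name.

    Assumption: runs of non-alphanumeric characters collapse to a single
    underscore, which reproduces the constants shown in the thread.
    """
    body = re.sub(r"[^0-9A-Za-z]+", "_", term_name).strip("_")
    return f"{prefix}_{body}"
```

Under these assumptions, constant_name("MS", "FT-ICR MS") yields MS_FT_ICR_MS, while name_matches_cv("MS:1000075", "MALDI") fails because the acronym is not the exact CV name. That is the tension the thread is about: shorter names read better in code, but only pass validation if the CV itself adopts them.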