From: Darren K. <dke...@ya...> - 2008-02-19 15:35:03

Actually, my comment about dataProcessing was limited to the uses of the software during processing. I think the addition of the cvParam for the general software type is useful (and in fact I'm using it in the latest msdata code). If nothing else, it provides for a much more straightforward translation from mzXML. Without it, encoding the mzXML software type is much more awkward.

Darren

On Tue, 19 Feb 2008 4:20 am, Lennart Martens wrote:
> Hi Eric, hi PSI MS Enthusiasts,
>
>> I read that this discussion was deemed moot. Play-by-play below.
>> Lennart, should we remove your new cvParam entry location to remove
>> temptation to use it, or leave it in?
>
> I'll schedule it for removal, and will do so in the version I'll try to
> build after the phone con tonight (or this morning :)).
>
> Cheers,
>
> lnnrt.

From: Fredrik L. <Fre...@im...> - 2008-02-19 15:28:56

Hi All,

In QTOF files from Waters with mixed MS1 and MS2 data we have several parallel 'functions', with data being recorded into separate files. The scan numbers are only unique within each function. In the raw data folder we thus have several different spectra with the same scan number (but different source files).

When converting this into an mzML file it would be good to keep the original scan numbers, which are useful for traceability, but to generate unique spectrum ids. I thus propose that the requirement for unique scanNumbers within an mzML file be removed. However, spectra should not be repeated within the file, so this would NOT be applicable to the dta to mzML conversion use case.

Would such a change generate problems for the readers? How is this solved in MassWolf?

Regards

Fredrik

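The proposal above (keep the original per-function scan numbers, but generate unique spectrum ids) can be sketched as follows. This is only an illustration under stated assumptions: the `F<function>_S<scan>` id format and the function name `assign_spectrum_ids` are hypothetical and are not taken from the mzML schema or from any converter.

```python
from collections import Counter


def assign_spectrum_ids(spectra):
    """Build one id per spectrum from (function_number, scan_number) pairs.

    The scan number is kept unchanged for traceability; uniqueness comes
    from combining it with the function number. The id format is an
    assumption made for this sketch only.
    """
    ids = [f"F{function}_S{scan}" for function, scan in spectra]
    # A repeated (function, scan) pair would mean a repeated spectrum,
    # which the proposal explicitly rules out.
    repeated = [i for i, count in Counter(ids).items() if count > 1]
    if repeated:
        raise ValueError(f"spectrum repeated within file: {repeated}")
    return ids


# Two functions whose scan numbers overlap, as in the Waters example.
raw_spectra = [(1, 1), (2, 1), (1, 2), (2, 2)]
spectrum_ids = assign_spectrum_ids(raw_spectra)
print(spectrum_ids)
```

With ids built this way, a reader that indexes on spectrum id is unaffected by scan numbers repeating across functions; a reader that indexes on scanNumber alone would still need to change.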
From: Coleman, M. <MK...@St...> - 2008-02-19 15:27:46

I'm strongly in favor of (b), i.e., keeping that charge state information. If the instrument software, or some other software upstream of the search engine has reason to believe that the charge for a particular spectrum is +2 or +3 but not +1, or +2 but not +1 or +3, or whatever, the search engine ought to be able to make use of this information.

As a practical matter, the spectrum format we currently use here (ms2, very similar to dta) efficiently encodes this information, so not having it in mzML would be at least a minor argument for not converting. (We could, of course, simply duplicate the entire spectrum in this case, but this would further bloat the output, and still lose some important information.)

Mike

> -----Original Message-----
> From: psi...@li...
> [mailto:psi...@li...] On Behalf Of Fredrik Levander
> Sent: Tuesday, February 19, 2008 9:04 AM
> To: Mass spectrometry standard development
> Subject: Re: [Psidev-ms-dev] DTA to mzML conversion
>
> Hi dta fans,
>
> I agree completely with 1 and 2. For 3 (several possible charge states),
> there seem to be two possibilities:
> a) Do not write the charge state at all into the mzML in cases where
> there are multiple guesses.
> b) Put all the proposed values into one precursor. See lines 206-207 at:
> http://trac.thep.lu.se/trac/fp6-prodac/browser/trunk/mzML/ADH071030_002.mzML?rev=26
>
> Anyone else who would prefer either of a or b? At least some search
> engines would try both 2+ and 3+ if there is no charge state given in
> the file, so maybe solution a is better? Or does b have advantages?
>
> Fredrik

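Option (b), one spectrum carrying every proposed charge state, can be sketched as a grouping step over the DTA file names. This is a rough illustration, assuming the common `base.firstscan.lastscan.charge.dta` naming pattern; the function name `group_dta_by_scan` and the example file names are hypothetical.

```python
from collections import defaultdict


def group_dta_by_scan(filenames):
    """Map each first-scan number to the sorted list of proposed charges.

    Assumes names of the form base.firstscan.lastscan.charge.dta, so that
    two DTA files differing only in charge collapse into a single entry.
    """
    charges_by_scan = defaultdict(set)
    for name in filenames:
        parts = name.split(".")
        if len(parts) < 5 or parts[-1].lower() != "dta":
            raise ValueError(f"unexpected DTA file name: {name}")
        first_scan = int(parts[-4])
        charge = int(parts[-2])
        charges_by_scan[first_scan].add(charge)
    return {scan: sorted(c) for scan, c in sorted(charges_by_scan.items())}


files = [
    "run.0010.0010.2.dta",  # same scan written twice because the
    "run.0010.0010.3.dta",  # charge could not be determined
    "run.0011.0011.1.dta",
]
charges = group_dta_by_scan(files)
print(charges)
```

Under option (a) a converter would write no charge for scan 10; under option (b) it would write both 2 and 3 into the single precursor, which keeps the "+2 or +3 but not +1" information available to the search engine.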
From: Fredrik L. <Fre...@im...> - 2008-02-19 15:05:36

Hi dta fans,

I agree completely with 1 and 2. For 3 (several possible charge states), there seem to be two possibilities:
a) Do not write the charge state at all into the mzML in cases where there are multiple guesses.
b) Put all the proposed values into one precursor. See lines 206-207 at:
http://trac.thep.lu.se/trac/fp6-prodac/browser/trunk/mzML/ADH071030_002.mzML?rev=26

Anyone else who would prefer either of a or b? At least some search engines would try both 2+ and 3+ if there is no charge state given in the file, so maybe solution a is better? Or does b have advantages?

Fredrik

Eric Deutsch wrote:
> Hi everyone, regarding list dta to mzML conversion, here are my
> thoughts:
>
> 1) The current rule is that scanNumbers must be unique within a file and
> always increasing, although not necessarily sequentially. IDs must be
> unique within a file. I don't think this should change for conversion
> from dta.
>
> 2) I would only encode the spectrum once, since as you say it is just
> one spectrum.
>
> 3) I don't even see why you need two precursors. When we convert dta to
> mzXML, duplicates were dropped and the actual observed precursor mass
> was put in the mzXML. Thus you are "losing" the information that the
> spectrum could be charge 2 or 3. However, this information was guessed
> in the first place, and most software I know that extracts a spectrum
> with no charge information will apply some rules to decide on what
> charges to search. So, I suggest that the conversion from dta to mzML is
> just the reverse of mzML to dta. One spectrum per scan. If only 1 charge
> (dta file) is provided, encode it at the user's discretion. If more than
> 1 charge (dta file) is provided, encode the spectrum without any charge
> information. For LCQ data, it would probably be reasonable to not encode
> *any* charge information in the mzML file at all. Because it doesn't
> come with any in the first place.
>
> We will be adding the functionality for multiple precursors anyway for
> the case when you have multiple peaks in your selection window as seen,
> e.g., in an orbitrap. I suppose there's no reason you couldn't take
> advantage of that to encode both the 2+ and 3+ although I wouldn't
> recommend it.
>
> Eric
>
>> -----Original Message-----
>> From: psi...@li...
>> [mailto:psidev-ms-dev-bo...@li...] On Behalf Of Fredrik Levander
>> Sent: Thursday, February 14, 2008 9:55 AM
>> To: Mass spectrometry standard development
>> Subject: Re: [Psidev-ms-dev] DTA to mzML conversion
>>
>> Hi Matt and Rune,
>>
>> Thanks for the comments. I agree that the important information is the
>> scan number, since this is what you would like to look up in the raw
>> data file. And it doesn't make much sense to have the scan repeated
>> twice in the file, so I think we'll go for solution 2 and just keep the
>> sourceFileRef to one of the files.
>> However, since we do have unique spectrum ids there should not be any
>> real need to stick to the unique scan number requirement from what I got
>> from the indexing discussion, even if it is still in the specs (?).
>> Couldn't there be cases when data is collected in different channels
>> where the scan numbers are the same in different channels?
>>
>> Regards
>>
>> Fredrik
>>
>> Matthew Chambers wrote:
>>> Hi Fredrik,
>>>
>>> Our group has a converter that does this conversion (to mzXML or mzData
>>> currently, not yet mzML, but they all have the same uniqueness
>>> constraints on scan numbers and they all support multiple precursors at
>>> least in theory); we went with solution 2 because solution 1 is invalid
>>> for all the XML formats (i.e. it would need a schema change and that
>>> change isn't likely to happen, whereas multiple sourceFileRefs would be
>>> understandable). As I understand it, sourceFileRef is optional
>>> ("<xs:attribute name="sourceFileRef" type="xs:anyURI" use="optional">"),
>>> so if you can't or don't want to encode it correctly, just don't include
>>> it. Our converter doesn't even bother to include the sourceFileRefs to
>>> the DTAs, it's not helpful information IMO. As long as the conversion is
>>> done without data loss, get it over with and then have mercy on your
>>> filesystem by deleting the DTAs. ;)
>>>
>>> -Matt
>>>
>>> Fredrik Levander wrote:
>>>> Hi All,
>>>>
>>>> In the Proteios platform we're including converters from some peak list
>>>> formats to mzData, and now also to mzML. It is clearly not optimal with
>>>> such conversion since instrument settings etcetera are lost. However, I
>>>> guess there will be need for such converters if someone wants to use
>>>> their old instruments with manufacturer peak picking algorithms.
>>>>
>>>> There are sample files generated from DTAs and ProteinLynx by the
>>>> converters (0.99.1) at:
>>>> http://trac.thep.lu.se/trac/fp6-prodac/browser/trunk/mzML
>>>>
>>>> The converters will be part of the new release of the Proteios Software
>>>> Environment, but if anyone would like to try them with their files,
>>>> there is a standalone package (mzMLconverters.zip) at the address above
>>>> which should work under Windows/Linux/OSX with Java 1.5 or higher.
>>>>
>>>> Please notice that the output files are not schematically valid since
>>>> some terms are still missing in the CV.
>>>>
>>>> For the conversion of multiple DTA files to one mzML file there is a
>>>> small problem which is related to how lcq_dta generates dta files: If
>>>> the charge state of the precursor can not be determined, a spectrum can
>>>> result in two DTA files which are identical apart from the precursor.
>>>> There are two solutions on how to handle this:
>>>> 1) Two spectra, with the same scanNumber but different spectrum Ids (The
>>>> solution used by the current converter)
>>>> 2) One spectrum, two precursors. However, this will not work with the
>>>> current schema since there can only be one sourceFileRef for a spectrum.
>>>> Do you all think solution 1 is fine, or is there a better solution?
>>>> Solution 2 seems to need schema changes.
>>>> Other comments are also welcome
>>>>
>>>> Thanks,
>>>>
>>>> Fredrik

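Rule 1 in the quoted message (ids unique within a file, scanNumbers always increasing but not necessarily sequential) is simple enough to check mechanically. The sketch below is not the PSI validator; it works on plain `(spectrum_id, scan_number)` tuples in file order, and the function name `check_scan_numbers` is hypothetical.

```python
def check_scan_numbers(spectra):
    """Return a list of rule violations for spectra given in file order.

    spectra: list of (spectrum_id, scan_number) tuples.
    Strictly increasing scan numbers also guarantee that they are unique,
    so a single comparison with the previous value covers both conditions.
    """
    problems = []
    seen_ids = set()
    previous_scan = None
    for spectrum_id, scan in spectra:
        if spectrum_id in seen_ids:
            problems.append(f"duplicate id {spectrum_id!r}")
        seen_ids.add(spectrum_id)
        if previous_scan is not None and scan <= previous_scan:
            problems.append(
                f"scanNumber {scan} does not increase after {previous_scan}"
            )
        previous_scan = scan
    return problems


# Gaps are allowed, so this passes.
ok = check_scan_numbers([("S1", 1), ("S2", 3), ("S3", 7)])
# Solution 1 from the thread: two spectra sharing scanNumber 10.
bad = check_scan_numbers([("S10a", 10), ("S10b", 10)])
print(ok, bad)
```

The second call shows why solution 1 is rejected under the current rule: the ids are distinct, but the repeated scanNumber is reported.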
From: Lennart M. <len...@eb...> - 2008-02-19 12:19:52

Hi Eric, hi PSI MS Enthusiasts,

> I read that this discussion was deemed moot. Play-by-play below.
> Lennart, should we remove your new cvParam entry location to remove
> temptation to use it, or leave it in?

I'll schedule it for removal, and will do so in the version I'll try to build after the phone con tonight (or this morning :)).

Cheers,

lnnrt.

From: Rune S. P. <mai...@ph...> - 2008-02-19 11:17:49

Hi

Eric Deutsch wrote:
> > Example files:
> > ---------------
> > - Work with Waters to get MS^E examples made

Actually, since I work with MS^E data, I have some example data in mzXML and mzData.

Normally the way MS^E works is by switching between survey and fragmentation scans. The m/z range of the survey scan, which is (normally) also the precursor range, is typically wide: 50-2000 or 300-2000 or so. The important thing is that the scanning range is the same long enough to get an elution profile (but normally fixed throughout the run).

So with the typical experiment, it is sufficient to exclude the precursor info in the fragmentation scans, since the precursors are whatever is seen in the survey scan. Fortunately for me, my data follows this typical scheme. So I just exclude the precursor info.

I have attached an example file, which is a small cutout of a larger file.

--
Regards
Rune

From: Eric D. <ede...@sy...> - 2008-02-19 09:14:45

Hi everyone, here is a revised agenda, news, and to do list for discussion at the call in 8 hr. See Lennart's message 15 hr ago for dial info.

------------------------
Agenda for Feb 19, 9am PST
-------
- Darren new pwiz/msdata release
- Lennart's 0.99.9_SNAPSHOT schema changes
- Eric's CV changes
- Discussion: placement of arrayLength attribute
- Discussion: unknown instrument model
- Discussion: msLevel, scanNumber, chromatograms

Schedule:
-----------------------
Jan 25: mzML reviews returned. Official community review complete.
Feb 5: mzML telecon 9:00am PST
Feb 19: mzML telecon 9:00am PST
Mar 4: mzML telecon 9:00am PST
Mar 17: US HUPO meeting
Mar 25: mzML telecon 9:00am PST
Apr 8: mzML telecon 9:00am PST
Apr 23: PSI meeting in Toledo
May
Jun 1-5: ASMS - Must be done and advertising it here!

News items:
-----------------------
- ASMS Abstract was selected as a week-long display poster
- Ongoing effort to get a presentation at Computer Applications Interest Group workshop at ASMS
- Darren released a new pwiz/msdata snapshot on Feb 18

To do list:
-----------------------

Schema changes:
---------------
- Incorporate Phil's suggested schema typing changes of 1/24
- Figure out how to implement datatype validation in cvParams
- For consistency binaryDataArray should be in List????
- Address suggestions from Darren 1/22
- Fix instances in spec doc of instrumentType instead of instrument, etc.
- Fix spectrumRef to point to id instead of scanNumber in example docs
- Get full name "Proteomics Standards Initiative Mass Spectrometry Ontology" in obo file
- Replace <referenceableParamGroup> with <paramGroup>
- Remove <instrumentSoftwareRef> and use <softwareRef>
- Change to:
    <cv id="MS" ... >
    ...
    <cvParam cvRef="MS" ...>
- <sourceFile id="1" sourceFileName="tiny1.RAW" sourceFileLocation="file://F:/data/Exp01" >
  should be shortened to:
  <sourceFile id="1" name="tiny1.RAW" location="file://F:/data/Exp01" >
- Address the Mallick lab need for multiple precursors
    Allow multiple ionSelection elements
    allow terms for precursorIntensity score confidence
    see snippet on email thread 11/26
- Rune suggests allowing a range for the precursor like for MS^E (or technically there's always a window). See 12/7
- There was a discussion on 11/22 - 11/24 that essentially boils down to: Can we encode the MS inclusion list in an mzML file?
- Ask Randy how his "engineers are currently encoding chromatograms into mzData 1.05 using supplemental data vectors (not pretty)"
- Invite Mike MacCoss to help with chromatograms (via Parag)
- Decide on the open issue discussed in the spec doc regarding cvParam attributes
- What to do for sourceFile when the source is really a directory of files for the run instead of a single file?
- Can the arrayLengths for <binaryDataArray> ever be different within one <spectrum>? If not, maybe it should be specified only once somewhere?
- SpectrumDescription changes from Randy 2/5

Example files:
---------------
- Fix spectrumRef in examples
- mzML <--> MIAPE-MS mapping assessment. Build on work by Pierre-Alain & Frederik
- Get some of JimS's example files into a public area
- Work with Waters to get MS^E examples made
- We need to develop a good MALDI example file with spot ids
- We need to develop a good example of a file created from individual dtas
- Examine and fold in to distro example at: http://trac.thep.lu.se/trac/fp6-prodac/browser/trunk/mzML
- We need to develop a good example of a file that contains summed scans
- Randy will provide a list of things that we don't handle yet that he thinks we should for subsequent followup

Address reviewer comments:
---------------
- Address reviewer points in a document
- Angel's comments to the reviews
- Address the blunt criticisms summarized by Angel on 1/14

Validator:
---------------
- Make validator enforce this ascending scanNumber rule
- Update validator to check datatypes
- Update validator to 0.99.2
- Set up both basic and MIAPE-MS validation levels
- Adjust the validator so that it will complain if cvParam names do not match the accession (or a synonym thereof)

CV work:
---------------
- Figure out where we left off on CV
- Add "scan event" or similar to CV
- Need to get the relevant CV part into all vendors hands to update
- Various other CV open items to address
- Can we make the CV have the distinction between categories and terms?
- What do we do in a case like with MassWolf where it cannot know the instrument model?
- Coordinate submission of PSI-MS to OBO Foundry via Chris Mungall
- Add "unknown instrument"
- Get both Kermit Murray and David Sparkman involved in the CV
- Need to make a crystal clear new term submission path

Documentation:
---------------
- Clarify in spec doc that the binary data arrays are base64 encoded
- Should we get the indexing documented as an appendix?
- Include checksum definition
- Improve scanNumber ascending requirement in documentation
- Address locale issues in spec doc (offending example from mzData)
    <spectrumInstrument msLevel="1" mzRangeStart="75,00" mzRangeStop="1000,00">
    <cvParam cvLabel="psi" accession="PSI:1000038" name="TimeInMinutes" value="0,033" />
- Document why we chose not to encode SRM data as chromatograms
- Get the information encoded in the validators mapping file into the spec
- State that not all mzML files need to be MIAPE-MS compliant. There will be a basic set of requirements and a second mapping file for full MIAPE-MS compliance
- Can the arrayLengths for <binaryDataArray> ever be different within one <spectrum>?

Related software:
---------------
- Update ReAdW and Wolf for mzML 0.99.2
- Add support for mzML 0.99.2 to mzWiff and Hunter
- Finish off other converter loose ends
- Fix current indexing and binary encoding bugs reported by Darren 1/28
- Darren Kessner's msData C++ library reads/will read mzML
- Brian Pratt is implementing RAMP parser for mzML using Darren's library
- Get TPP / ISB workflow working with mzML
- Brian suggests a single C/C++ codebase with SWIG bindings to minimize implementation differences
- Pierre-Alain will be building an mzML reader into his system
- Jim Shofstahl already has an mzML -> SRF converter that then feeds into SEQUEST
- Randy will definitely be reading mzML files into his data system by ASMS
- It would be useful to have converters that could prompt the user for information that is not available
- Converter from ProteinLynx Global Server XML to mzML within Proteios is ready for release when 0.99.2 is finalized.
- Fredrik is working on a PKL or DTA -> mzML converter
- Other software?

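The documentation item about base64-encoded binary data arrays, and the open question about arrayLengths, can be illustrated with a small round trip. This is a sketch only: it assumes little-endian IEEE floats at 32 or 64 bit precision, which is an assumption here and not a quotation from the 0.99 spec, and the helper names `encode_array` and `decode_array` are hypothetical.

```python
import base64
import struct


def encode_array(values, precision=64):
    """Pack floats as little-endian binary and return base64 text."""
    code = "d" if precision == 64 else "f"
    packed = struct.pack(f"<{len(values)}{code}", *values)
    return base64.b64encode(packed).decode("ascii")


def decode_array(text, precision=64):
    """Reverse of encode_array: base64 text back to a list of floats."""
    raw = base64.b64decode(text)
    size = 8 if precision == 64 else 4
    code = "d" if precision == 64 else "f"
    return list(struct.unpack(f"<{len(raw) // size}{code}", raw))


mz = [100.0, 200.5, 300.25]
intensity = [10.0, 20.0, 5.0]

encoded_mz = encode_array(mz, precision=64)
encoded_intensity = encode_array(intensity, precision=32)

decoded_mz = decode_array(encoded_mz, precision=64)
decoded_intensity = decode_array(encoded_intensity, precision=32)

# The arrayLength question: both arrays of one spectrum decode to the
# same number of values even though their byte lengths differ.
same_length = len(decoded_mz) == len(decoded_intensity)
print(encoded_mz, same_length)
```

Because the text is a plain byte encoding, it also sidesteps the locale problem listed above: no decimal separator is ever written for the array values.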
From: Eric D. <ede...@sy...> - 2008-02-19 09:14:37
|
> -----Original Message-----
> From: psi...@li... [mailto:psidev-ms-dev-
> bo...@li...] On Behalf Of Lennart Martens
> Sent: Thursday, February 14, 2008 3:24 AM
> To: Mass spectrometry standard development
> Subject: Re: [Psidev-ms-dev] a few more CV name issues
>
> Hi Darren,
>
> > 1) exact_synonym same as name (except for capitalization):
> >
> > [Term]
> > id: MS:1000114
> > name: microchannel plate detector
> > def: ...
> > exact_synonym: "Microchannel Plate Detector" []
> > exact_synonym: "multichannel plate" []
> > is_a: MS:1000026 ! detector type
>
> I believe this one is intentional so I'm unsure about whether to record
> it as 'to correct' (thoughts anyone?), but the rest is certainly flagged
> now!

I don't see why this would make sense. I think it should be deleted. I just
did. Anyone yell if it should be restored.

This brings up another issue though. At one point last year, we went through
an effort to change all terms to lower case (except proper names like
"Waters" and acronyms). But it looks like this change was not applied to
exact_synonyms. It should be.

Regarding Darren's:

> id: MS:1000580
> name: MSn spectrum
> def: ...
> exact_synonym: "Multiple-Stage Mass Spectrometry" []

synonym collision, I changed to:

exact_synonym: "multiple-stage mass spectrometry spectrum" []

Please let me know if you object.

Yes, any term with a '?' should be revisited.

Thanks,
Eric

>
> Thanks!
>
> Cheers,
>
> lnnrt.
>
> > 2) near name collision with exact_synonym:
> >
> > [Term]
> > id: MS:1000270
> > name: multiple stage mass spectrometry
> > def: ...
> > exact_synonym: "MSn" []
> > is_a: MS:1000445 ! sequential m/z separation method ?
> >
> > [Term]
> > id: MS:1000580
> > name: MSn spectrum
> > def: ...
> > exact_synonym: "Multiple-Stage Mass Spectrometry" []
> > is_a: MS:1000524 ! data file content
> > is_a: MS:1000559 ! spectrum type
> >
> > 3) some term names have ? at end -- I assume this is to flag for
> > reconsideration
>
> -------------------------------------------------------------------------
> This SF.net email is sponsored by: Microsoft
> Defy all challenges. Microsoft(R) Visual Studio 2008.
> http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
> _______________________________________________
> Psidev-ms-dev mailing list
> Psi...@li...
> https://lists.sourceforge.net/lists/listinfo/psidev-ms-dev
|
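[Editorial sketch, not part of the original thread.] The two checks discussed in this message (an exact_synonym that merely repeats the term name, and an exact_synonym that was missed by the lower-casing effort) could be automated roughly as follows. This is a minimal sketch under stated assumptions: the stanza parser is hand-rolled and only understands the `id:`, `name:` and `exact_synonym:` lines quoted above, and the function names `parse_terms` and `check_synonyms` are hypothetical. Proper names and acronyms would show up as false positives and need manual review.

```python
import re

# Two stanzas copied from the message above (def: lines omitted).
OBO_TEXT = '''
[Term]
id: MS:1000114
name: microchannel plate detector
exact_synonym: "Microchannel Plate Detector" []
exact_synonym: "multichannel plate" []
is_a: MS:1000026 ! detector type

[Term]
id: MS:1000580
name: MSn spectrum
exact_synonym: "Multiple-Stage Mass Spectrometry" []
is_a: MS:1000524 ! data file content
'''

SYNONYM_RE = re.compile(r'^exact_synonym:\s*"([^"]*)"')


def parse_terms(text):
    """Yield (id, name, [synonyms]) for each [Term] stanza (hypothetical helper)."""
    for stanza in text.split("[Term]")[1:]:
        term_id, name, synonyms = None, None, []
        for line in stanza.strip().splitlines():
            line = line.strip()
            if line.startswith("id:"):
                term_id = line[3:].strip()
            elif line.startswith("name:"):
                name = line[5:].strip()
            else:
                match = SYNONYM_RE.match(line)
                if match:
                    synonyms.append(match.group(1))
        yield term_id, name, synonyms


def check_synonyms(text):
    """Return (term id, synonym, reason) for each suspicious exact_synonym."""
    issues = []
    for term_id, name, synonyms in parse_terms(text):
        for syn in synonyms:
            if syn.lower() == name.lower():
                issues.append((term_id, syn, "same as name"))
            elif syn != syn.lower():
                # Acronyms and proper names also land here; review by hand.
                issues.append((term_id, syn, "not lower case"))
    return issues


for issue in check_synonyms(OBO_TEXT):
    print(issue)
```

A real check would more likely use an existing OBO parser; the point here is only that both conditions are cheap to test mechanically before each CV release.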
From: Eric D. <ede...@sy...> - 2008-02-19 09:14:35
|
I have removed "peak processing" as a synonym of all three. "peak
processing" seems sufficiently vague that this probably doesn't make sense
for any of them. Perhaps "peak processing" could be a parent term of peak
picking, smoothing, etc., if someone cares about this.

> -----Original Message-----
> From: psi...@li... [mailto:psidev-ms-dev-
> bo...@li...] On Behalf Of Darren Kessner
> Sent: Wednesday, February 13, 2008 5:55 AM
> To: Mass spectrometry standard development
> Subject: [Psidev-ms-dev] exact_synonym: "peak processing"
>
> In psi-ms.obo:
>
> exact_synonym: "peak processing"
> occurs in multiple terms.
>
> I assume it's a copy/paste error.
>
> [Term]
> id: MS:1000035
> name: peak picking
> def: ...
> exact_synonym: "peak processing" []
> is_a: MS:1000543 ! data processing action
>
> [Term]
> id: MS:1000592
> name: smoothing
> def: ...
> exact_synonym: "peak processing" []
> is_a: MS:1000543 ! data processing action
>
> [Term]
> id: MS:1000593
> name: baseline reduction
> def: ...
> exact_synonym: "peak processing" []
> is_a: MS:1000543 ! data processing action
|
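[Editorial sketch, not part of the original thread.] The copy/paste problem Darren reports (one exact_synonym attached to several terms) can be found by grouping term ids per synonym. The following is a rough illustration only: it assumes the simple line layout quoted above, and `shared_synonyms` is a hypothetical helper name.

```python
from collections import defaultdict

# The three stanzas quoted in the message above (def: lines omitted).
OBO_TEXT = '''
[Term]
id: MS:1000035
name: peak picking
exact_synonym: "peak processing" []
is_a: MS:1000543 ! data processing action

[Term]
id: MS:1000592
name: smoothing
exact_synonym: "peak processing" []
is_a: MS:1000543 ! data processing action

[Term]
id: MS:1000593
name: baseline reduction
exact_synonym: "peak processing" []
is_a: MS:1000543 ! data processing action
'''


def shared_synonyms(text):
    """Map each exact_synonym used by more than one term to the term ids."""
    owners = defaultdict(list)
    current_id = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("id:"):
            current_id = line[3:].strip()
        elif line.startswith("exact_synonym:"):
            # The synonym is the text between the first pair of double quotes.
            synonym = line.split('"')[1].lower()
            owners[synonym].append(current_id)
    return {syn: ids for syn, ids in owners.items() if len(ids) > 1}


print(shared_synonyms(OBO_TEXT))
```

Comparing in lower case is a design choice here, so that a capitalized copy of the same synonym is still reported as a collision.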
From: Eric D. <ede...@sy...> - 2008-02-19 07:50:13
|
> When are we going to open the cvParam-format can of worms?

Hi Matt, it's on the to do list. I think we had tentatively planned to
discuss this week. But since Randy has delivered some fresh worms of a
different sort, maybe we best table it until next time. I say we tackle it
after these other items if there's time.

Thanks,
Eric

> -----Original Message-----
> From: psi...@li... [mailto:psidev-ms-dev-
> bo...@li...] On Behalf Of Matthew Chambers
> Sent: Monday, February 18, 2008 10:53 AM
> To: Mass spectrometry standard development
> Subject: Re: [Psidev-ms-dev] Teleconference Tuesday 19 Feb 2008
>
> Is there a reason to accommodate non-spectral data inside spectrum
> elements? If the file should be able to handle non-spectral data, then I
> think we should have other kinds of elements instead of introducing
> strange logic about deciding whether a spectrum is really spectrum or
> not based on its MS level. Working out the other data representations
> would take time, though. It's worth discussing in the teleconference.
>
> As for the scanNumber vs. scan element question, I'm a bit confused
> about that so I'd also like to cover it tomorrow.
>
> When are we going to open the cvParam-format can of worms?
>
> -Matt
>
>
> Randy Julian wrote:
> > I'd like to get a couple of schema items on the agenda tomorrow.
> >
> > I've been asking about a possible change in the schema regarding
> > msLevel. As an alternative to moving the attribute, or making it
> > optional, I would like to propose that we allow non-MS channels acquired
> > by the MS data system and stored in the raw file to be marked as
> > msLevel=0. This would require a change to the specification document
> > but would allow software to ignore non-spectral content (whatever it
> > might be) if the level is not at least 1.
> >
> > Another approach which is also consistent with the rest of the schema is
> > to make the attribute a cvParam like the axis names. This would require
> > a schema change and shift the validation of msLevel to the validator
> > program. If there is strong support for a required msLevel attribute in
> > the current location, we could still represent the other signals with
> > the suggestion above.
> >
> > Also, I haven't heard back about the relationship between the 'scan'
> > number attributes and the scan elements. Has anyone looked at this yet?
> > Can we also discuss how this is supposed to work tomorrow?
> >
> > Thanks,
> > Randy
> >
> >
> > -----Original Message-----
> > From: psi...@li...
> > [mailto:psi...@li...] On Behalf Of
> > Lennart Martens
> > Sent: Monday, February 18, 2008 1:07 PM
> > To: Mass spectrometry standard development
> > Subject: [Psidev-ms-dev] Teleconference Tuesday 19 Feb 2008
> >
> > Dear PSI-MS Enthousiasts,
> >
> > The next telephone conference for the PSI-MS development group will take
> > place on Tuesday, 19 february 2008.
> >
> > The phone conference will take place at the time indicated below (please
> > find a location near you ):
> >
> > http://www.timeanddate.com/worldclock/fixedtime.html?day=19&month=2&year=2008&hour=17&min=0&sec=0&p1=0
> >
> > phone numbers are:
> >
> > + Germany: 08001012079
> > + Switzerland: 0800000860
> > + UK: 08081095644
> > + USA: 1-866-314-3683
> > + Generic international: +44 2083222500 (UK number)
> >
> > access code: 297427
> >
> > You can also view these details online on the PSI website:
> >
> > http://www.psidev.info/index.php?q=node/313
> >
> > Best regards,
> >
> > lnnrt.
|
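[Editorial sketch, not part of the original thread.] Randy's msLevel=0 proposal quoted above says that software could ignore non-spectral content whenever the level is not at least 1. A reader following that convention might filter roughly like this. The fragment is invented for illustration (only the element and attribute names come from the thread), no XML namespace is assumed, and treating a missing msLevel as 0 is a choice made here, not something the thread decides.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment using the draft element and attribute names.
FRAGMENT = """
<spectrumList>
  <spectrum id="S19" scanNumber="19" msLevel="1"/>
  <spectrum id="UV01" scanNumber="4300" msLevel="0"/>
  <spectrum id="S20" scanNumber="20" msLevel="2"/>
</spectrumList>
"""


def mass_spectra(xml_text):
    """Return only the <spectrum> elements whose msLevel is at least 1."""
    root = ET.fromstring(xml_text.strip())
    return [
        spectrum
        for spectrum in root.iter("spectrum")
        # Assumption: a missing msLevel is treated like msLevel=0.
        if int(spectrum.get("msLevel", "0")) >= 1
    ]


ids = [spectrum.get("id") for spectrum in mass_spectra(FRAGMENT)]
print(ids)
```

Matt's objection is visible in the sketch: whether an element called spectrum really is one has to be decided from an attribute value, where separate element names would let a reader skip the other data by name.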
From: Eric D. <ede...@sy...> - 2008-02-19 07:47:29
|
Hi Randy, regarding:

> Also, I haven't heard back about the relationship between the 'scanNumber'
> attributes and the scan elements. Has anyone looked at this yet?
> Can we also discuss how this is supposed to work tomorrow?

scanNumber in the <spectrum> element is perhaps somewhat misplaced. It does
belong in <scan>, yes. BUT, since the scan number has historically been an
important handle for identifying the spectrum and for use in the mzXML
indexing scheme, it has been promoted to the top.

    <spectrum id="S19" scanNumber="19" msLevel="1">
      <cvParam cvLabel="MS" accession="MS:1000580" name="MSn spectrum" value=""/>
      <spectrumDescription>
        <scan instrumentRef="LCQ Deca" ???scanNumber="19"???>

If we move it into <scan>, this may be more semantically correct, but it
will almost surely make the parser software writers' jobs more difficult.
This may be a case of function over beauty. This is somewhat mitigated by
the fact that we now have an id attribute.

I'm not inclined to change it because I'm lazy and because I suspect it will
make Darren and Brian sad. But we can discuss.

If we allow the previous chromatograms in <spectrum> we then may be faced
with generating bogus scan numbers to keep the validator happy. If it's in
scan, and there is no scan, that's less of a problem.

If we move scanNumber into <scan>, what's to prevent msLevel following suit?

http://blog.b92.net/arhiva/files/images/can%20of%20worms.jpg

Eric

> -----Original Message-----
> From: psi...@li... [mailto:psidev-ms-dev-
> bo...@li...] On Behalf Of Randy Julian
> Sent: Monday, February 18, 2008 10:22 AM
> To: Mass spectrometry standard development
> Subject: Re: [Psidev-ms-dev] Teleconference Tuesday 19 Feb 2008
>
> I'd like to get a couple of schema items on the agenda tomorrow.
>
> I've been asking about a possible change in the schema regarding
> msLevel. As an alternative to moving the attribute, or making it
> optional, I would like to propose that we allow non-MS channels acquired
> by the MS data system and stored in the raw file to be marked as
> msLevel=0. This would require a change to the specification document
> but would allow software to ignore non-spectral content (whatever it
> might be) if the level is not at least 1.
>
> Another approach which is also consistent with the rest of the schema is
> to make the attribute a cvParam like the axis names. This would require
> a schema change and shift the validation of msLevel to the validator
> program. If there is strong support for a required msLevel attribute in
> the current location, we could still represent the other signals with
> the suggestion above.
>
> Also, I haven't heard back about the relationship between the 'scan'
> number attributes and the scan elements. Has anyone looked at this yet?
> Can we also discuss how this is supposed to work tomorrow?
>
> Thanks,
> Randy
>
>
> -----Original Message-----
> From: psi...@li...
> [mailto:psi...@li...] On Behalf Of
> Lennart Martens
> Sent: Monday, February 18, 2008 1:07 PM
> To: Mass spectrometry standard development
> Subject: [Psidev-ms-dev] Teleconference Tuesday 19 Feb 2008
>
> Dear PSI-MS Enthousiasts,
>
> The next telephone conference for the PSI-MS development group will take
> place on Tuesday, 19 february 2008.
>
> The phone conference will take place at the time indicated below (please
> find a location near you ):
>
> http://www.timeanddate.com/worldclock/fixedtime.html?day=19&month=2&year=2008&hour=17&min=0&sec=0&p1=0
>
> phone numbers are:
>
> + Germany: 08001012079
> + Switzerland: 0800000860
> + UK: 08081095644
> + USA: 1-866-314-3683
> + Generic international: +44 2083222500 (UK number)
>
> access code: 297427
>
> You can also view these details online on the PSI website:
>
> http://www.psidev.info/index.php?q=node/313
>
> Best regards,
>
> lnnrt.
|
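[Editorial sketch, not part of the original thread.] To make the parser-writer concern in this message concrete: if scanNumber could appear either on <spectrum> or on the nested <scan>, an indexer would have to look in both places. The sketch below assumes the draft element names from the message, no XML namespace, and a hypothetical scanNumber attribute on <scan> (shown with question marks in the original example); `scan_number` and `index_by_scan_number` are illustrative names.

```python
import xml.etree.ElementTree as ET

# S19 carries scanNumber on <spectrum>; S20 carries it on <scan> (hypothetical).
FRAGMENT = """
<spectrumList>
  <spectrum id="S19" scanNumber="19" msLevel="1">
    <spectrumDescription>
      <scan instrumentRef="LCQ Deca"/>
    </spectrumDescription>
  </spectrum>
  <spectrum id="S20" msLevel="2">
    <spectrumDescription>
      <scan instrumentRef="LCQ Deca" scanNumber="20"/>
    </spectrumDescription>
  </spectrum>
</spectrumList>
"""


def scan_number(spectrum):
    """Return the scan number as an int, or None if neither location has it."""
    value = spectrum.get("scanNumber")
    if value is None:
        # Fall back to the nested <scan> element, if there is one.
        scan = spectrum.find("spectrumDescription/scan")
        if scan is not None:
            value = scan.get("scanNumber")
    return int(value) if value is not None else None


def index_by_scan_number(xml_text):
    """Build a {scan number: spectrum id} lookup table."""
    root = ET.fromstring(xml_text.strip())
    return {scan_number(s): s.get("id") for s in root.iter("spectrum")}


index = index_by_scan_number(FRAGMENT)
print(index)
```

With the attribute kept on <spectrum>, the fallback branch disappears and a streaming parser can index each spectrum from its start tag alone, which is the "function over beauty" trade-off described above.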
From: Eric D. <ede...@sy...> - 2008-02-19 07:35:40
|
[resent with better subject; delete previous]

Hi everyone, thank you for this discussion on msLevel and chromatograms.
Let's indeed discuss at the call. Here is the discussion so far followed by
my thoughts:

msLevel discussion:
- Randy proposes that msLevel attribute be optional
- Or perhaps make it a cvParam instead of a <spectrum> attribute
- Or perhaps allow msLevel=0 for non-MS spectra
- Matt suggests using different kinds of elements besides <spectrum>
- Randy says we need to handle the non-mass spectra recorded on instrument
  analog channels
- Matt asks where the detailed proposal is?
- Matt suggests <uvChannel>, <chromatogram> rather than shoving it in
  <spectrum>
- Randy reiterates that these are important use cases and small changes
  should be made to accommodate
- Angel agrees
- Matt says that <spectrum> is not generic enough to hold non mass spectra.
  call it runItem
- Randy asks about how to store ADC or PDA data?
- And how do we store the "original time axis from a TOF or an FT
  instrument?"
- Randy and Matt go back and forth on various items
- Matt thinks every <spectrum> should have a m/z axis
- Randy is not dismayed by loss of string arrays for labeling items in the
  array
- Matt suggests NULL terminated strings for the string array or BSTR array

I would like to add my thoughts:

1) Regarding ADC or PDA or time axis data for TOF and FT instrument, all I
can do is reiterate that I have no experience with such data, have never
seen such data myself, and don't know where to start, and don't feel like I
have the time to investigate this myself. So I have often asked Randy and
others to come with examples (preferably suggested mzML examples of how to
encode the information so we can chew on it). No example will be coming from
me, but I'm happy to entertain one from you. I think both Randy and Jim S
agreed to come up with examples. We await them eagerly!

2) I think our current <spectrum> and flexible binary arrays were an attempt
to leave the door open for this kind of data if someone finds it. I'd
contend that spectrum is pretty general. We all think spectrum = mass
spectrum, but there's also electromagnetic spectrum, power spectrum,
frequency spectrum. Maybe time spectrum = chromatogram is stretching it a
bit, but not far in my mind. But this does make Randy's point that requiring
a msLevel attribute means that we need to allow msLevel=0 for non-standard
stuff.

3) What's wrong with this?

    <spectrum id="UV01" scanNumber="4300" msLevel="0" arrayLength="5404">
      <cvParam cvLabel="MS" accession="MS:1099580" name="ultraviolet chromatogram" value=""/>
      <spectrumDescription>
        <cvParam cvLabel="MS" accession="MS:1099127" name="flux capacitor detector" value=""/>
        <cvParam cvLabel="MS" accession="MS:1099343" name="detector wavelength" value="112.23"
                 unitAccession="MS:1009938" unitName="nanometer"/>
      </spectrumDescription>
      <binaryDataArray encodedLength="3433" dataProcessingRef="Xcalibur Processing">
        <cvParam cvLabel="MS" accession="MS:1000521" name="32-bit float" value=""/>
        <cvParam cvLabel="MS" accession="MS:1000576" name="zlib compression" value=""/>
        <cvParam cvLabel="MS" accession="MS:1099514" name="time array" value=""
                 unitAccession="MS:1000038" unitName="minute"/>
        <binary>AAAAwDsGeUpAAAAAAOejAADAOg6cQA==</binary>
      </binaryDataArray>
      <binaryDataArray encodedLength="3433">
        <cvParam cvLabel="MS" accession="MS:1000521" name="32-bit float" value=""/>
        <cvParam cvLabel="MS" accession="MS:1000576" name="zlib compression" value=""/>
        <cvParam cvLabel="MS" accession="MS:1000515" name="intensity array" value=""/>
        <binary>AAAAAIBJAAAAABIhAAAAAAMysQA==</binary>
      </binaryDataArray>
    </spectrum>

(except that I said I wouldn't produce any examples) (perhaps okay because
no thought went into it).

I look forward to the discussion.

Eric

> -----Original Message-----
> From: psi...@li... [mailto:psidev-ms-dev-
> bo...@li...]
On Behalf Of Randy Julian > Sent: Monday, February 18, 2008 12:46 PM > To: Matthew Chambers; Mass spectrometry standard development > Subject: Re: [Psidev-ms-dev] Teleconference Tuesday 19 Feb 2008 > > I think we've hit at some of the key points for the discussion tomorrow. > > > What is your recommendation for storing ADC (or PDA) data? > > Also, does the current idea for the data vectors support storing the > original time axis from a TOF or an FT instrument? > > Thanks, > Randy > > -----Original Message----- > From: Matthew Chambers [mailto:mat...@va...] > Sent: Monday, February 18, 2008 3:19 PM > To: Mass spectrometry standard development > Cc: Randy Julian > Subject: Re: [Psidev-ms-dev] Teleconference Tuesday 19 Feb 2008 > > > > Randy Julian wrote: > > I originally presented a draft of mzData 1.1 which had chromatogram > > elements in it, and it worked just fine for all sorts of acquisitions > an > > instrument can perform in addition to acquiring a spectrum. I > > appreciate that this suggestion also created some other difficulties > > (like multiple ways to store the same data), and I dropped the draft > as > > a serious suggestion in favor of a merger between mzData and mzXML. > > > Yes, as I understand the term, a chromatogram is a generic concept for > any data stored with time as one axis. > > > "Analog Channel" is a nickname for the typical analog-to-digital > > converters available on most mass spectrometers for recording data > from > > external devices which generate either a voltage or current output. > > These ADC inputs, and everything else recorded by the data system, > > undergo digitization. And yes, historically detectors were mostly > > analog, but over the past decade or so, they are increasingly pulse > > counting systems with all sorts of signal processing possibilities. > > Most people don't consider pulse counting systems to be analog... > > > OK. 
I can't say I like that nickname to refer to an extra/auxiliary data > > channel, but so be it. > > > We have already gone to generic vectors where the name (like mz and > > intensity) have to be provided in a cvParam. It is easy already to > name > > the vectors anything you like. This is important, especially since we > > got rid of the supplemental data vectors for holding things like > > individual peak annotations, and alternative processing of the > spectrum > > (like digital filtering, etc.). This is all really good, and pretty > > generic already. I'm not suggesting that we complicate things more > with > > specialization, but acknowledge the generalization which is already > > present and needed to record common extensions to the base use case. > > > > Because of the generic, unnamed vectors, a display program will > already > > have to sort out what it's looking at when it reads each vector. They > > are not ordered, for example, and there is not a schema-enforced > > requirement that there are always two - or even that they are named at > > all. I'm suggesting that since a robust viewing program is going to > > have to do a lot of checking to determine how the vectors are used in > > the current scheme, we would not have to do much to make the schema > much > > more broadly applicable. Since the schema is being considered for use > > in metabolomics and other small molecule work, I think this is > > important. > > > Yes, the vectors are generic, but their parent element is not > (<spectrum*>), so the only thing they should be generic for are things > within the domain of the "spectrum" concept. You are suggesting that we > take away the (intuitive) attribute requirements of a spectrum so that > it can be used as a generic concept. I am not at all opposed to the idea > > of a generic concept at the level of <spectrum> in the data hierarchy, > I'm just opposed to the idea that such a concept be called a "spectrum". 
> > If you were to suggest that we rename the spectrum element to something > generic like "runItem" (and spectrumList possibly to "runItemList") I > could live with that. It looks silly, but it wouldn't be flat-out wrong > and counter-intuitive! :) I would prefer to keep the spectrum element > and add a generic sibling concept instead, though. > > -Matt > > > Randy > > > > -----Original Message----- > > From: Matthew Chambers [mailto:mat...@va...] > > Sent: Monday, February 18, 2008 2:23 PM > > To: Mass spectrometry standard development > > Cc: Randy Julian > > Subject: Re: [Psidev-ms-dev] Teleconference Tuesday 19 Feb 2008 > > > > Have you previously made a detailed proposal about what the > > representation of these non-MS signals should look like? And to my > > (limited) knowledge, calling them "analog" signals is rather > misleading, > > > > because by necessity they must be digitized to be represented > digitally. > > > > :) Don't MS signals come from analog detectors as well? > > > > It sounds like you either want a specialized way to encode each > > non-spectral data type, or a generic way to encode any non-spectral > data > > > > type. In the former case, the schema and the validator mapping would > > define semantics for which data axes are allowed in which data type > > (e.g. "mz vs. intensity in a <spectrum>", "time vs. intensity in a > > <chromatogram>", "x vs. y in a <uvChannel>", etc.), and in the latter > > case, there would be a generic <channel> element which would have a > > variable set of binary data arrays and the names/types of those arrays > > > would be determined by the file creator. Or both approaches could be > > combined. But either (or both) approaches are superior to trying to > > shove generic "channel" data into a <spectrum> element IMO. 
Like you > > said, it should be possible for readers which only care about spectral > > > data to easily skip the non-spectral data and that would be vastly > more > > intuitive if there were other element names to put the non-spectra > data > > in. > > > > -Matt > > > > > > Randy Julian wrote: > > > >> Matt, > >> > >> I'm only talking about data which is collected by the mass > >> > > spectrometer > > > >> data system in conjunction with the mass spectral experiment. > >> > >> When we did LC-LC experiments in my lab, we would sometimes put a UV > >> detector between the two columns, and collect data on analog channels > >> recorded by XCalibur. Most instruments have this capability. > >> > >> Since there seems to be resistance to the whole idea of a > >> > > <chromatogram> > > > >> element (which I appreciate), it leaves open the question about what > >> > > to > > > >> do with data collected by the data system during the LC-MS > experiment. > >> > >> I don't understand why we don't want to acknowledge that almost all > MS > >> data systems can be used to collect analog signals during experiments > >> along with spectra. This is simple stuff, and very useful. I don't > >> want to lose this use case, and we've no place else to put this data. > >> > >> Randy > >> > >> > >> -----Original Message----- > >> From: psi...@li... > >> [mailto:psi...@li...] On Behalf Of > >> Matthew Chambers > >> Sent: Monday, February 18, 2008 1:53 PM > >> To: Mass spectrometry standard development > >> Subject: Re: [Psidev-ms-dev] Teleconference Tuesday 19 Feb 2008 > >> > >> Is there a reason to accommodate non-spectral data inside spectrum > >> elements? If the file should be able to handle non-spectral data, > then > >> > > I > > > >> think we should have other kinds of elements instead of introducing > >> strange logic about deciding whether a spectrum is really spectrum or > > >> not based on its MS level. Working out the other data representations > > >> would take time, though. 
It's worth discussing in the teleconference. > >> > >> As for the scanNumber vs. scan element question, I'm a bit confused > >> about that so I'd also like to cover it tomorrow. > >> > >> When are we going to open the cvParam-format can of worms? > >> > >> -Matt > >> > >> > >> Randy Julian wrote: > >> > >> > >>> I'd like to get a couple of schema items on the agenda tomorrow. > >>> > >>> I've been asking about a possible change in the schema regarding > >>> msLevel. As an alternative to moving the attribute, or making it > >>> optional, I would like to propose that we allow non-MS channels > >>> > >>> > >> acquired > >> > >> > >>> by the MS data system and stored in the raw file to be marked as > >>> msLevel=0. This would require a change to the specification > document > >>> but would allow software to ignore non-spectral content (whatever it > >>> might be) if the level is not at least 1. > >>> > >>> Another approach which is also consistent with the rest of the > schema > >>> > >>> > >> is > >> > >> > >>> to make the attribute a cvParam like the axis names. This would > >>> > >>> > >> require > >> > >> > >>> a schema change and shift the validation of msLevel to the validator > >>> program. If there is strong support for a required msLevel > attribute > >>> > >>> > >> in > >> > >> > >>> the current location, we could still represent the other signals > with > >>> the suggestion above. > >>> > >>> Also, I haven't heard back about the relationship between the 'scan' > >>> number attributes and the scan elements. Has anyone looked at this > >>> > >>> > >> yet? > >> > >> > >>> Can we also discuss how this is supposed to work tomorrow? > >>> > >>> Thanks, > >>> Randy > >>> > >>> > >>> -----Original Message----- > >>> From: psi...@li... > >>> [mailto:psi...@li...] 
On Behalf Of > >>> Lennart Martens > >>> Sent: Monday, February 18, 2008 1:07 PM > >>> To: Mass spectrometry standard development > >>> Subject: [Psidev-ms-dev] Teleconference Tuesday 19 Feb 2008 > >>> > >>> Dear PSI-MS Enthousiasts, > >>> > >>> > >>> The next telephone conference for the PSI-MS development group will > >>> > >>> > >> take > >> > >> > >>> place on Tuesday, 19 february 2008. > >>> > >>> The phone conference will take place at the time indicated below > >>> > >>> > >> (please > >> > >> > >>> find a location near you ): > >>> > >>> > >>> > >>> > > > http://www.timeanddate.com/worldclock/fixedtime.html?day=19&month=2&year > > > >> > >> > >>> =2008&hour=17&min=0&sec=0&p1=0 > >>> > >>> phone numbers are: > >>> > >>> + Germany: 08001012079 > >>> > >>> + Switzerland: 0800000860 > >>> > >>> + UK: 08081095644 > >>> > >>> + USA: 1-866-314-3683 > >>> > >>> + Generic international: +44 2083222500 (UK number) > >>> > >>> access code: 297427 > >>> > >>> > >>> You can also view these details online on the PSI website: > >>> > >>> http://www.psidev.info/index.php?q=node/313 > >>> > >>> > >>> Best regards, > >>> > >>> lnnrt. > >>> > >>> > >>> > >>> > > > ------------------------------------------------------------------------ > > > >> > >> > >>> - > >>> This SF.net email is sponsored by: Microsoft > >>> Defy all challenges. Microsoft(R) Visual Studio 2008. > >>> http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ > >>> _______________________________________________ > >>> Psidev-ms-dev mailing list > >>> Psi...@li... > >>> https://lists.sourceforge.net/lists/listinfo/psidev-ms-dev > >>> > >>> > >>> > >>> > > > ------------------------------------------------------------------------ > > > >> - > >> > >> > >>> This SF.net email is sponsored by: Microsoft > >>> Defy all challenges. Microsoft(R) Visual Studio 2008. 
> >>> http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ > >>> _______________________________________________ > >>> Psidev-ms-dev mailing list > >>> Psi...@li... > >>> https://lists.sourceforge.net/lists/listinfo/psidev-ms-dev > >>> > >>> > >>> > >>> > >> > > > ------------------------------------------------------------------------ > > > >> - > >> This SF.net email is sponsored by: Microsoft > >> Defy all challenges. Microsoft(R) Visual Studio 2008. > >> http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ > >> _______________________________________________ > >> Psidev-ms-dev mailing list > >> Psi...@li... > >> https://lists.sourceforge.net/lists/listinfo/psidev-ms-dev > >> > >> > >> > > > > > > ------------------------------------------------------------------------ - > This SF.net email is sponsored by: Microsoft > Defy all challenges. Microsoft(R) Visual Studio 2008. > http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ > _______________________________________________ > Psidev-ms-dev mailing list > Psi...@li... > https://lists.sourceforge.net/lists/listinfo/psidev-ms-dev ------------------------------------------------------------------------ - This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ _______________________________________________ Psidev-ms-dev mailing list Psi...@li... https://lists.sourceforge.net/lists/listinfo/psidev-ms-dev |
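[Editorial sketch, not part of the original thread.] The example <binaryDataArray> elements in Eric's message describe their payload with cvParams ("32-bit float", "zlib compression") and carry it as base64 text. A round trip for such a payload could look like the following. Assumptions are labeled in the comments: little-endian byte order is assumed here and is not stated in the thread, the helper names are hypothetical, and the values are made up because the <binary> strings in the message are admitted placeholders.

```python
import base64
import struct
import zlib


def encode_array(values, compress=True):
    """Pack floats as 32-bit values, optionally zlib-compress, then base64."""
    # Assumption: little-endian ("<") byte order.
    raw = struct.pack("<%df" % len(values), *values)
    if compress:
        raw = zlib.compress(raw)
    return base64.b64encode(raw).decode("ascii")


def decode_array(text, compressed=True):
    """Reverse encode_array: base64-decode, decompress, unpack 32-bit floats."""
    raw = base64.b64decode(text)
    if compressed:
        raw = zlib.decompress(raw)
    count = len(raw) // 4  # four bytes per 32-bit float
    return list(struct.unpack("<%df" % count, raw))


# Made-up time axis in minutes; these values are exact in 32-bit floats.
times = [0.0, 0.5, 1.0, 1.5]
encoded = encode_array(times)
decoded = decode_array(encoded)
print(encoded)
print(decoded)
```

Because the array name ("time array", "intensity array") lives in a cvParam rather than in the element name, the same decoding path serves mass spectra and the chromatogram-style data debated in this thread; only the interpretation of the decoded numbers differs.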
From: Eric D. <ede...@sy...> - 2008-02-19 07:32:40
|
Hi everyone, thank you for this discussion on msLevel and chromatograms.
Let's indeed discuss at the call. Here is the discussion so far followed by
my thoughts:

msLevel discussion:
- Randy proposes that msLevel attribute be optional
- Or perhaps make it a cvParam instead of a <spectrum> attribute
- Or perhaps allow msLevel=0 for non-MS spectra
- Matt suggests using different kinds of elements besides <spectrum>
- Randy says we need to handle the non-mass spectra recorded on instrument
  analog channels
- Matt asks where the detailed proposal is?
- Matt suggests <uvChannel>, <chromatogram> rather than shoving it in
  <spectrum>
- Randy reiterates that these are important use cases and small changes
  should be made to accommodate
- Angel agrees
- Matt says that <spectrum> is not generic enough to hold non mass spectra.
  call it runItem
- Randy asks about how to store ADC or PDA data?
- And how do we store the "original time axis from a TOF or an FT
  instrument?"
- Randy and Matt go back and forth on various items
- Matt thinks every <spectrum> should have a m/z axis
- Randy is not dismayed by loss of string arrays for labeling items in the
  array
- Matt suggests NULL terminated strings for the string array or BSTR array

I would like to add my thoughts:

1) Regarding ADC or PDA or time axis data for TOF and FT instrument, all I
can do is reiterate that I have no experience with such data, have never
seen such data myself, and don't know where to start, and don't feel like I
have the time to investigate this myself. So I have often asked Randy and
others to come with examples (preferably suggested mzML examples of how to
encode the information so we can chew on it). No example will be coming from
me, but I'm happy to entertain one from you. I think both Randy and Jim S
agreed to come up with examples. We await them eagerly!

2) I think our current <spectrum> and flexible binary arrays were an attempt
to leave the door open for this kind of data if someone finds it. I'd
contend that spectrum is pretty general. We all think spectrum = mass
spectrum, but there's also electromagnetic spectrum, power spectrum,
frequency spectrum. Maybe time spectrum = chromatogram is stretching it a
bit, but not far in my mind. But this does make Randy's point that requiring
a msLevel attribute means that we need to allow msLevel=0 for non-standard
stuff.

3) What's wrong with this?

    <spectrum id="UV01" scanNumber="4300" msLevel="0" arrayLength="5404">
      <cvParam cvLabel="MS" accession="MS:1099580" name="ultraviolet chromatogram" value=""/>
      <spectrumDescription>
        <cvParam cvLabel="MS" accession="MS:1099127" name="flux capacitor detector" value=""/>
        <cvParam cvLabel="MS" accession="MS:1099343" name="detector wavelength" value="112.23"
                 unitAccession="MS:1009938" unitName="nanometer"/>
      </spectrumDescription>
      <binaryDataArray encodedLength="3433" dataProcessingRef="Xcalibur Processing">
        <cvParam cvLabel="MS" accession="MS:1000521" name="32-bit float" value=""/>
        <cvParam cvLabel="MS" accession="MS:1000576" name="zlib compression" value=""/>
        <cvParam cvLabel="MS" accession="MS:1099514" name="time array" value=""
                 unitAccession="MS:1000038" unitName="minute"/>
        <binary>AAAAwDsGeUpAAAAAAOejAADAOg6cQA==</binary>
      </binaryDataArray>
      <binaryDataArray encodedLength="3433">
        <cvParam cvLabel="MS" accession="MS:1000521" name="32-bit float" value=""/>
        <cvParam cvLabel="MS" accession="MS:1000576" name="zlib compression" value=""/>
        <cvParam cvLabel="MS" accession="MS:1000515" name="intensity array" value=""/>
        <binary>AAAAAIBJAAAAABIhAAAAAAMysQA==</binary>
      </binaryDataArray>
    </spectrum>

(except that I said I wouldn't produce any examples) (perhaps okay because
no thought went into it).

I look forward to the discussion.

Eric

> -----Original Message-----
> From: psi...@li... [mailto:psidev-ms-dev-
> bo...@li...]
On Behalf Of Randy Julian > Sent: Monday, February 18, 2008 12:46 PM > To: Matthew Chambers; Mass spectrometry standard development > Subject: Re: [Psidev-ms-dev] Teleconference Tuesday 19 Feb 2008 > > I think we've hit at some of the key points for the discussion tomorrow. > > > What is your recommendation for storing ADC (or PDA) data? > > Also, does the current idea for the data vectors support storing the > original time axis from a TOF or an FT instrument? > > Thanks, > Randy > > -----Original Message----- > From: Matthew Chambers [mailto:mat...@va...] > Sent: Monday, February 18, 2008 3:19 PM > To: Mass spectrometry standard development > Cc: Randy Julian > Subject: Re: [Psidev-ms-dev] Teleconference Tuesday 19 Feb 2008 > > > > Randy Julian wrote: > > I originally presented a draft of mzData 1.1 which had chromatogram > > elements in it, and it worked just fine for all sorts of acquisitions > an > > instrument can perform in addition to acquiring a spectrum. I > > appreciate that this suggestion also created some other difficulties > > (like multiple ways to store the same data), and I dropped the draft > as > > a serious suggestion in favor of a merger between mzData and mzXML. > > > Yes, as I understand the term, a chromatogram is a generic concept for > any data stored with time as one axis. > > > "Analog Channel" is a nickname for the typical analog-to-digital > > converters available on most mass spectrometers for recording data > from > > external devices which generate either a voltage or current output. > > These ADC inputs, and everything else recorded by the data system, > > undergo digitization. And yes, historically detectors were mostly > > analog, but over the past decade or so, they are increasingly pulse > > counting systems with all sorts of signal processing possibilities. > > Most people don't consider pulse counting systems to be analog... > > > OK. 
I can't say I like that nickname to refer to an extra/auxiliary data > > channel, but so be it. > > > We have already gone to generic vectors where the name (like mz and > > intensity) have to be provided in a cvParam. It is easy already to > name > > the vectors anything you like. This is important, especially since we > > got rid of the supplemental data vectors for holding things like > > individual peak annotations, and alternative processing of the > spectrum > > (like digital filtering, etc.). This is all really good, and pretty > > generic already. I'm not suggesting that we complicate things more > with > > specialization, but acknowledge the generalization which is already > > present and needed to record common extensions to the base use case. > > > > Because of the generic, unnamed vectors, a display program will > already > > have to sort out what it's looking at when it reads each vector. They > > are not ordered, for example, and there is not a schema-enforced > > requirement that there are always two - or even that they are named at > > all. I'm suggesting that since a robust viewing program is going to > > have to do a lot of checking to determine how the vectors are used in > > the current scheme, we would not have to do much to make the schema > much > > more broadly applicable. Since the schema is being considered for use > > in metabolomics and other small molecule work, I think this is > > important. > > > Yes, the vectors are generic, but their parent element is not > (<spectrum*>), so the only thing they should be generic for are things > within the domain of the "spectrum" concept. You are suggesting that we > take away the (intuitive) attribute requirements of a spectrum so that > it can be used as a generic concept. I am not at all opposed to the idea > > of a generic concept at the level of <spectrum> in the data hierarchy, > I'm just opposed to the idea that such a concept be called a "spectrum". 
> > If you were to suggest that we rename the spectrum element to something > generic like "runItem" (and spectrumList possibly to "runItemList") I > could live with that. It looks silly, but it wouldn't be flat-out wrong > and counter-intuitive! :) I would prefer to keep the spectrum element > and add a generic sibling concept instead, though. > > -Matt > > > Randy > > > > -----Original Message----- > > From: Matthew Chambers [mailto:mat...@va...] > > Sent: Monday, February 18, 2008 2:23 PM > > To: Mass spectrometry standard development > > Cc: Randy Julian > > Subject: Re: [Psidev-ms-dev] Teleconference Tuesday 19 Feb 2008 > > > > Have you previously made a detailed proposal about what the > > representation of these non-MS signals should look like? And to my > > (limited) knowledge, calling them "analog" signals is rather > misleading, > > > > because by necessity they must be digitized to be represented > digitally. > > > > :) Don't MS signals come from analog detectors as well? > > > > It sounds like you either want a specialized way to encode each > > non-spectral data type, or a generic way to encode any non-spectral > data > > > > type. In the former case, the schema and the validator mapping would > > define semantics for which data axes are allowed in which data type > > (e.g. "mz vs. intensity in a <spectrum>", "time vs. intensity in a > > <chromatogram>", "x vs. y in a <uvChannel>", etc.), and in the latter > > case, there would be a generic <channel> element which would have a > > variable set of binary data arrays and the names/types of those arrays > > > would be determined by the file creator. Or both approaches could be > > combined. But either (or both) approaches are superior to trying to > > shove generic "channel" data into a <spectrum> element IMO. 
Like you > > said, it should be possible for readers which only care about spectral > > > data to easily skip the non-spectral data and that would be vastly > more > > intuitive if there were other element names to put the non-spectra > data > > in. > > > > -Matt > > > > > > Randy Julian wrote: > > > >> Matt, > >> > >> I'm only talking about data which is collected by the mass > >> > > spectrometer > > > >> data system in conjunction with the mass spectral experiment. > >> > >> When we did LC-LC experiments in my lab, we would sometimes put a UV > >> detector between the two columns, and collect data on analog channels > >> recorded by XCalibur. Most instruments have this capability. > >> > >> Since there seems to be resistance to the whole idea of a > >> > > <chromatogram> > > > >> element (which I appreciate), it leaves open the question about what > >> > > to > > > >> do with data collected by the data system during the LC-MS > experiment. > >> > >> I don't understand why we don't want to acknowledge that almost all > MS > >> data systems can be used to collect analog signals during experiments > >> along with spectra. This is simple stuff, and very useful. I don't > >> want to lose this use case, and we've no place else to put this data. > >> > >> Randy > >> > >> > >> -----Original Message----- > >> From: psi...@li... > >> [mailto:psi...@li...] On Behalf Of > >> Matthew Chambers > >> Sent: Monday, February 18, 2008 1:53 PM > >> To: Mass spectrometry standard development > >> Subject: Re: [Psidev-ms-dev] Teleconference Tuesday 19 Feb 2008 > >> > >> Is there a reason to accommodate non-spectral data inside spectrum > >> elements? If the file should be able to handle non-spectral data, > then > >> > > I > > > >> think we should have other kinds of elements instead of introducing > >> strange logic about deciding whether a spectrum is really spectrum or > > >> not based on its MS level. Working out the other data representations > > >> would take time, though. 
It's worth discussing in the teleconference. > >> > >> As for the scanNumber vs. scan element question, I'm a bit confused > >> about that so I'd also like to cover it tomorrow. > >> > >> When are we going to open the cvParam-format can of worms? > >> > >> -Matt > >> > >> > >> Randy Julian wrote: > >> > >> > >>> I'd like to get a couple of schema items on the agenda tomorrow. > >>> > >>> I've been asking about a possible change in the schema regarding > >>> msLevel. As an alternative to moving the attribute, or making it > >>> optional, I would like to propose that we allow non-MS channels > >>> > >>> > >> acquired > >> > >> > >>> by the MS data system and stored in the raw file to be marked as > >>> msLevel=0. This would require a change to the specification > document > >>> but would allow software to ignore non-spectral content (whatever it > >>> might be) if the level is not at least 1. > >>> > >>> Another approach which is also consistent with the rest of the > schema > >>> > >>> > >> is > >> > >> > >>> to make the attribute a cvParam like the axis names. This would > >>> > >>> > >> require > >> > >> > >>> a schema change and shift the validation of msLevel to the validator > >>> program. If there is strong support for a required msLevel > attribute > >>> > >>> > >> in > >> > >> > >>> the current location, we could still represent the other signals > with > >>> the suggestion above. > >>> > >>> Also, I haven't heard back about the relationship between the 'scan' > >>> number attributes and the scan elements. Has anyone looked at this > >>> > >>> > >> yet? > >> > >> > >>> Can we also discuss how this is supposed to work tomorrow? > >>> > >>> Thanks, > >>> Randy > >>> > >>> > >>> -----Original Message----- > >>> From: psi...@li... > >>> [mailto:psi...@li...] 
On Behalf Of > >>> Lennart Martens > >>> Sent: Monday, February 18, 2008 1:07 PM > >>> To: Mass spectrometry standard development > >>> Subject: [Psidev-ms-dev] Teleconference Tuesday 19 Feb 2008 > >>> > >>> Dear PSI-MS Enthousiasts, > >>> > >>> > >>> The next telephone conference for the PSI-MS development group will > >>> > >>> > >> take > >> > >> > >>> place on Tuesday, 19 february 2008. > >>> > >>> The phone conference will take place at the time indicated below > >>> > >>> > >> (please > >> > >> > >>> find a location near you ): > >>> > >>> > >>> > >>> > > > http://www.timeanddate.com/worldclock/fixedtime.html?day=19&month=2&year > > > >> > >> > >>> =2008&hour=17&min=0&sec=0&p1=0 > >>> > >>> phone numbers are: > >>> > >>> + Germany: 08001012079 > >>> > >>> + Switzerland: 0800000860 > >>> > >>> + UK: 08081095644 > >>> > >>> + USA: 1-866-314-3683 > >>> > >>> + Generic international: +44 2083222500 (UK number) > >>> > >>> access code: 297427 > >>> > >>> > >>> You can also view these details online on the PSI website: > >>> > >>> http://www.psidev.info/index.php?q=node/313 > >>> > >>> > >>> Best regards, > >>> > >>> lnnrt. > >>> > >>> > >>> > >>> > > > ------------------------------------------------------------------------ > > > >> > >> > >>> - > >>> This SF.net email is sponsored by: Microsoft > >>> Defy all challenges. Microsoft(R) Visual Studio 2008. > >>> http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ > >>> _______________________________________________ > >>> Psidev-ms-dev mailing list > >>> Psi...@li... > >>> https://lists.sourceforge.net/lists/listinfo/psidev-ms-dev > >>> > >>> > >>> > >>> > > > ------------------------------------------------------------------------ > > > >> - > >> > >> > >>> This SF.net email is sponsored by: Microsoft > >>> Defy all challenges. Microsoft(R) Visual Studio 2008. 
|
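Editorial sketch of Randy's msLevel=0 proposal from the thread above: a reader that only cares about mass spectra skips any <spectrum> whose msLevel is below 1. The fragment and the helper name mass_spectra are hypothetical; the attribute names follow the draft example quoted in this thread, not necessarily any released schema. Treating a missing msLevel as 0 is an assumption made here for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment using the draft attribute names quoted in this thread.
DOC = """
<spectrumList>
  <spectrum id="S1" scanNumber="4299" msLevel="1"/>
  <spectrum id="UV01" scanNumber="4300" msLevel="0"/>
  <spectrum id="S2" scanNumber="4301" msLevel="2"/>
</spectrumList>
"""

def mass_spectra(root):
    """Yield only <spectrum> elements whose msLevel is at least 1."""
    for spectrum in root.iter("spectrum"):
        # Assumption: a missing msLevel is treated like 0 (non-MS data).
        if int(spectrum.get("msLevel", "0")) >= 1:
            yield spectrum

root = ET.fromstring(DOC)
ms_ids = [s.get("id") for s in mass_spectra(root)]
print(ms_ids)
```

Under Matt's alternative (separate elements such as <chromatogram>), the same reader would need no msLevel check at all, because non-spectral data would never appear under the "spectrum" tag.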
From: Eric D. <ede...@sy...> - 2008-02-19 06:41:11
|
Hi Matt, I don't know the answer to this exactly, but I will say that back after the EBI workshop most ids and refs were xs:anyURI, and XMLSpy was happy to allow that "1" was an xs:anyURI, but my later attempts to validate the file with Xerces yielded angry errors that "1" was not an xs:anyURI, and I changed some things back to string. So I don't know what the answer here is, but before we start making more things xs:anyURI, let's please test with the Xerces validator to make sure it's okay, or that we're okay with how those attributes are filled. An item for the to-do list, I guess.

Thanks,
Eric

> -----Original Message-----
> From: psi...@li... [mailto:psidev-ms-dev-bo...@li...] On Behalf Of Matthew Chambers
> Sent: Thursday, February 14, 2008 7:28 AM
> To: Mass spectrometry standard development
> Subject: [Psidev-ms-dev] Schema error for SoftwareType
>
> Just saw what may be an error in either the documentation or in the schema:
>
> <xs:complexType name="SourceFileType"> has an id attribute:
> <xs:attribute name="id" type="xs:string" use="required">
>
> <xs:attribute name="sourceFileRef" type="xs:anyURI" use="optional"> is supposed to be a URI, but it should reference the 'id' which is a string? That doesn't make sense.
> <xs:documentation>This attribute can optionally reference the 'id' of the appropriate SourceFileType.</xs:documentation>
>
> Actually, looking a little closer, a lot of the Ref types (but not all) use xs:anyURI to point to an id attribute which is a string. What is the rationale for this? Out-of-file referencing?
>
> -Matt
|
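Editorial sketch related to the id/ref question above: whichever datatype the schema ends up declaring for the Ref attributes, a reader or validator can still check separately that each sourceFileRef matches the 'id' of some source file entry. The element name sourceFile, the sample fragment, and the helper dangling_refs are hypothetical and only illustrate the check; they are not taken from the schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; only the 'id' and 'sourceFileRef' attribute names
# come from the thread, the element names are illustrative.
DOC = """
<mzML>
  <sourceFile id="1"/>
  <sourceFile id="sf2"/>
  <spectrum id="S1" sourceFileRef="1"/>
  <spectrum id="S2" sourceFileRef="sf3"/>
</mzML>
"""

def dangling_refs(root, ref_attr="sourceFileRef", target_tag="sourceFile"):
    """Return (element id, ref value) pairs whose ref matches no target id."""
    known_ids = {el.get("id") for el in root.iter(target_tag)}
    return [
        (el.get("id"), el.get(ref_attr))
        for el in root.iter()
        if el.get(ref_attr) is not None and el.get(ref_attr) not in known_ids
    ]

root = ET.fromstring(DOC)
problems = dangling_refs(root)
print(problems)
```

The check compares the values as plain strings, so it behaves the same whether the attributes are declared as xs:string or xs:anyURI.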
From: Eric D. <ede...@sy...> - 2008-02-19 06:33:36
|
I read that this discussion was deemed moot. Play-by-play below. Lennart, should we remove your new cvParam entry location to remove temptation to use it, or leave it in?

Software type discussion:
- Darren points out that mzXML: <software type="acquisition" can no longer be encoded
- Lennart adds an allowable cvParam within <software> which can specify a software type
- Angel suggests the CV/ontology should be used to fix this with something like "Xcalibur" is_a "acquisition software"
- Lennart points out that some software might have different functions or be of different type in different contexts
- Lennart goes on to say that the CV term could have multiple parents then
- Angel agrees that's fine
- Matt says that "Xcalibur" is a good example of a multi-function software
- Matt suggests more than one CV entry for each application in the Xcalibur suite
- Or, Matt suggests, have a separate set of entries for software type and then complex validator logic (which he labels bad)
- Darren says that controlling function in the CV is not a good idea. Likes Lennart's cvParam mechanism
- Fredrik says that software is referenced by a dataProcessing and you can describe how the software was used there
- Darren concurs that Fredrik is right and this discussion is moot

> -----Original Message-----
> From: psi...@li... [mailto:psidev-ms-dev-bo...@li...] On Behalf Of Kessner, Darren E.
> Sent: Thursday, February 14, 2008 9:55 AM
> To: Mass spectrometry standard development
> Subject: Re: [Psidev-ms-dev] mzML 0.99.9 SNAPSHOT software::type attribute
>
> Thank you for pointing that out, Fredrik. Please ignore my last post.
>
> A couple of general CV terms for "type" should be fine. Everything I was talking about should go in <dataProcessing>.
>
> Darren
>
> -----Original Message----- > From: psi...@li... > [mailto:psi...@li...] 
On Behalf Of > Fredrik Levander > Sent: Thursday, February 14, 2008 9:36 AM > To: Mass spectrometry standard development > Subject: Re: [Psidev-ms-dev] mzML 0.99.9 SNAPSHOT software::type > attribute > > Isn't the actual usage of the software described under dataProcessing? > If the same software suite was used for two processing steps it can be > defined using two separate dataProcessing elements. I think what Angel > proposes should work out fine. Ok, there may be some ambiguities s for > converting mzML to mzXML if a piece of software belongs to several > software types categories, but apart from that I see no problem. Maybe a > > CV term for 'data acquisition' could be added as a MS:1000543 ( data > processing action), though? Or doesn't data acquisition qualify as > 'processing'? > > Fredrik > > Matthew Chambers skrev: > > I share your concern Lennart. AFAIK, Xcalibur is the name of a library > > > or suite of applications, it's not a single program. It could be > called > > once for instrument control (acquisition) and another time for peak > > picking & export to XML (currently only mzData). There may be other > > processing options in the future. It probably either needs more than > one > > entry in the CV (one per application in the Xcalibur suite) or we need > a > > separate group of CV terms to annotate software purpose. The former > > route would probably be more CV-friendly and intuitive. The latter > route > > would require all the semantic logic about which software CV terms are > > > able to be used for which purpose to be in the validator or in client > > software, a bad idea IMO. > > > > -Matt > > > > > > Lennart Martens wrote: > > > >> Hi Angel, > >> > >> > >> > >> > >>> I would have thought the ontology entry for XCalibur would have > >>> qualified it as acquisition software (e.g. this should have been > encoded > >>> into the CV element and hence referencing the accession MS:1000532 > would > >>> suffice to identify it as acquisition software.) 
> >>> > >>> > >> Seems like a very reasonable suggestion to me. Currently not > implemented > >> in the CV, but I'll make another tentative note on CV development. > >> > >> One thing that I just thought of: what if a piece of software can > >> perform multiple functions (i.e.: 'acquisition' as well as > 'peakpicking' > >> -- doable in the CV through simple multi-parenting), but is used in > only > >> one capacity (say 'acquisition') while another piece of software is > used > >> for the other functionality (e.g., I used 'Mascot Distiller' for > >> 'peakpicking'). > >> > >> Do we want to keep track of such things, and is this possibly an > >> argument against CV encoding here? > >> > >> > >> Cheers, > >> > >> lnnrt. > >> > >> > >> > >>> On Thu, Feb 14, 2008 at 6:21 AM, Lennart Martens > >>> <len...@eb... <mailto:len...@eb...>> > wrote: > >>> > >>> Hi Darren, hi PSI-MS enthousiasts, > >>> > >>> > >>> I have included the ability to use cvParams in the 'software' > element in > >>> a new schema iteration as per your suggestion. > >>> Find it here: > >>> > >>> > http://www.ebi.ac.uk/~lmartens/mzML/20080214_mzML0.99.9_SNAPSHOT.xsd > >>> > <http://www.ebi.ac.uk/%7Elmartens/mzML/20080214_mzML0.99.9_SNAPSHOT.xsd> > >>> > >>> Kessner, Darren E. wrote: > >>> > Hi all, > >>> > > >>> > > >>> > > >>> > Please excuse me if this has been discussed before. 
> >>> > > >>> > > >>> > > >>> > In mzXML, the <software> element is encoded as follows: > >>> > > >>> > <software type="acquisition" > >>> > > >>> > name="Xcalibur" > >>> > > >>> > version="1.3 alpha 8"/> > >>> > > >>> > > >>> > > >>> > In mzML, we have: > >>> > > >>> > <software id="Xcalibur"> > >>> > > >>> > <softwareParam cvLabel="MS" accession="MS:1000532" > >>> > name="Xcalibur" version="2.0.5"/> > >>> > > >>> > </software> > >>> > > >>> > > >>> > > >>> > Note that the name and version are encodable, but there is no > >>> convenient > >>> > place to save the "type" attribute, since the <software> > element does > >>> > not have <cvParam> or <userParam> sub-elements. > >>> > >>> > >> > ------------------------------------------------------------------------ > - > >> This SF.net email is sponsored by: Microsoft > >> Defy all challenges. Microsoft(R) Visual Studio 2008. > >> http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ > >> _______________________________________________ > >> Psidev-ms-dev mailing list > >> Psi...@li... > >> https://lists.sourceforge.net/lists/listinfo/psidev-ms-dev > >> > >> > >> > > > > > ------------------------------------------------------------------------ > - > > This SF.net email is sponsored by: Microsoft > > Defy all challenges. Microsoft(R) Visual Studio 2008. > > http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ > > _______________________________________________ > > Psidev-ms-dev mailing list > > Psi...@li... > > https://lists.sourceforge.net/lists/listinfo/psidev-ms-dev > > > > ------------------------------------------------------------------------ > - > This SF.net email is sponsored by: Microsoft > Defy all challenges. Microsoft(R) Visual Studio 2008. > http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ > _______________________________________________ > Psidev-ms-dev mailing list > Psi...@li... 
|
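Editorial sketch of the multi-parent CV idea discussed above (Angel's "Xcalibur" is_a "acquisition software", plus Lennart's point that a term could have several parents). The tiny ontology below is hypothetical: only the "Xcalibur" is_a "acquisition software" relation is mentioned in the thread, and the other term names are placeholders, not real CV entries.

```python
# Hypothetical is_a relations; a term may list more than one parent.
IS_A = {
    "Xcalibur": ["acquisition software", "data processing software"],
    "acquisition software": ["software"],
    "data processing software": ["software"],
    "software": [],
}

def ancestors(term, is_a=IS_A):
    """Collect every term reachable from 'term' by following is_a links."""
    seen = set()
    stack = list(is_a.get(term, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(is_a.get(parent, []))
    return seen

def has_type(term, software_type):
    """True if the CV says 'term' can act as 'software_type'."""
    return software_type in ancestors(term)

print(has_type("Xcalibur", "acquisition software"))
```

A lookup like this only says what a program is capable of. As Fredrik notes in the thread, how the software was actually used in a given run is described under dataProcessing, which is why the discussion was deemed moot.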
From: Eric D. <ede...@sy...> - 2008-02-19 06:10:21
|
Hi everyone, regarding dta to mzML conversion, here are my thoughts:

1) The current rule is that scanNumbers must be unique within a file and always increasing, although not necessarily sequentially. IDs must be unique within a file. I don't think that should change for conversion from dta.

2) I would only encode the spectrum once, since as you say it is just one spectrum.

3) I don't even see why you need two precursors. When we converted dta to mzXML, duplicates were dropped and the actual observed precursor mass was put in the mzXML. Thus you are "losing" the information that the spectrum could be charge 2 or 3. However, this information was guessed in the first place, and most software I know that extracts a spectrum with no charge information will apply some rules to decide on what charges to search.

So, I suggest that the conversion from dta to mzML is just the reverse of mzML to dta. One spectrum per scan. If only 1 charge (dta file) is provided, encode it at the user's discretion. If more than 1 charge (dta file) is provided, encode the spectrum without any charge information. For LCQ data, it would probably be reasonable to not encode *any* charge information in the mzML file at all, because it doesn't come with any in the first place.

We will be adding the functionality for multiple precursors anyway for the case when you have multiple peaks in your selection window as seen, e.g., in an orbitrap. I suppose there's no reason you couldn't take advantage of that to encode both the 2+ and 3+, although I wouldn't recommend it.

Eric

> -----Original Message-----
> From: psi...@li... [mailto:psidev-ms-dev-bo...@li...] On Behalf Of Fredrik Levander
> Sent: Thursday, February 14, 2008 9:55 AM
> To: Mass spectrometry standard development
> Subject: Re: [Psidev-ms-dev] DTA to mzML conversion
>
> Hi Matt and Rune,
>
> Thanks for the comments.
I agree that the important information is the > scan number, since this is what you would like to look up in the raw > data file. And it doesn't make much sense to have the scan repeated > twice in the file, so I think we'll go for solution 2 and just keep the > sourceFileRef to one of the files. > However, since we do have unique spectrum ids there should not be any > real need to stick to the unique scan number requirement from what I got > from the indexing discussion, even if it is still in the specs (?). > Couldn't there be cases when data is collected in different channels > where the scan numbers are the same in different channels? > > Regards > > Fredrik > > Matthew Chambers skrev: > > Hi Fredrik, > > > > Our group has a converter that does this conversion (to mzXML or mzData > > currently, not yet mzML, but they all have the same uniqueness > > constraints on scan numbers and they all support multiple precursors at > > least in theory); we went with solution 2 because solution 1 is invalid > > for all the XML formats (i.e. it would need a schema change and that > > change isn't likely to happen, whereas multiple sourceFileRefs would be > > understandable). As I understand it, sourceFileRef is optional > > ("<xs:attribute name="sourceFileRef" type="xs:anyURI" use="optional">"), > > so if you can't or don't want to encode it correctly, just don't include > > it. Our converter doesn't even bother to include the sourceFileRefs to > > the DTAs, it's not helpful information IMO. As long as the conversion is > > done without data loss, get it over with and then have mercy on your > > filesystem by deleting the DTAs. ;) > > > > -Matt > > > > > > Fredrik Levander wrote: > > > >> Hi All, > >> > >> In the Proteios platform we're including converters from some peak list > >> formats to mzData, and now also to mzML. It is clearly not optimal with > >> such conversion since instrument settings etcetera are lost. 
However, I > >> guess there will be need for such converters if someone wants to use > >> their old instruments with manufacturer peak picking algorithms. > >> > >> There are sample files generated from DTAs and ProteinLynx by the > >> converters (0.99.1) at: > >> http://trac.thep.lu.se/trac/fp6-prodac/browser/trunk/mzML > >> > >> The converters will be part of the new release of the Proteios Software > >> Environment, but if anyone would like to try them with their files, > >> there is a standalone package (mzMLconverters.zip) at the address above > >> which should work under Windows/Linux/OSX with Java 1.5 or higher. > >> > >> Please notice that the output files are not schematically valid since > >> some terms are still missing in the CV. > >> > >> For the conversion of multiple DTA files to one mzML file there is a > >> small problem which is related to how lcq_dta generates dta files: If > >> the charge state of the precursor can not be determined, a spectrum can > >> result in two DTA files which are identical apart from the precursor. > >> There are two solutions on how to handle this: > >> 1) Two spectra, with the same scanNumber but different spectrum Ids > (The > >> solution used by the current converter) > >> 2) One spectrum, two precursors. However, this will not work with the > >> current schema since there can only be one sourceFileRef for a > spectrum. > >> Do you all think solution 1 is fine, or is there a better solution? > >> Solution 2 seems to need schema changes. > >> Other comments are also welcome > >> > >> Thanks, > >> > >> Fredrik > >> > >> ----------------------------------------------------------------------- > -- > >> This SF.net email is sponsored by: Microsoft > >> Defy all challenges. Microsoft(R) Visual Studio 2008. > >> http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ > >> _______________________________________________ > >> Psidev-ms-dev mailing list > >> Psi...@li... 
|
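Editorial sketch of the rule Eric suggests in this thread: one spectrum per scan, keep the charge only when a single dta file exists for that scan, and drop charge information when several dta files (e.g. 2+ and 3+ guesses) share the scan. It assumes the common dta naming pattern base.startScan.endScan.charge.dta; that pattern, the sample filenames, and the helper name group_dta_by_scan are assumptions for illustration, not something stated in the thread.

```python
from collections import defaultdict

def group_dta_by_scan(filenames):
    """Map each start scan number to a single charge, or None if ambiguous."""
    charges = defaultdict(set)
    for name in filenames:
        # Assumed pattern: base.startScan.endScan.charge.dta
        # (the base name may itself contain dots, so index from the end).
        parts = name.split(".")
        start_scan = int(parts[-4])
        charge = int(parts[-2])
        charges[start_scan].add(charge)
    return {
        scan: (next(iter(found)) if len(found) == 1 else None)
        for scan, found in sorted(charges.items())
    }

files = [
    "run1.0100.0100.2.dta",  # same scan exported twice with guessed charges
    "run1.0100.0100.3.dta",
    "run1.0101.0101.1.dta",
]
scan_charges = group_dta_by_scan(files)
print(scan_charges)
```

A converter built this way writes each scan number once, which keeps the scanNumber uniqueness rule intact without needing two spectrum entries or two precursors.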
From: Eric D. <ede...@sy...> - 2008-02-19 05:40:01
|
Hi everyone, by my tally, we now stand like this on the unknown cv param topic:

A) <cvParam cvLabel="MS" accession="MS:1000031" name="instrument model" value=""/>

B) <cvParam cvLabel="MS" accession="MS:1099931" name="unknown instrument model" value=""/>

C)

Votes for A: 3 (Lennart, Luisa and Angel)
Votes for B: 6 (Matt, Josh, Eric, Darren, Marc, Randy)

Any other votes/comments?

Thanks,
Eric

> -----Original Message-----
> From: psi...@li... [mailto:psidev-ms-dev-bo...@li...] On Behalf Of Randy Julian
> Sent: Wednesday, February 13, 2008 1:29 PM
> To: Mass spectrometry standard development
> Subject: Re: [Psidev-ms-dev] Some additional 'Unknown instrument' CVparameter thoughts
>
> Eric,
>
> I am in favor of 'B', I think mzML validity and MIAPE compliance are two different things. If a journal won't let you publish your old converted .dta files because you cannot remember the instrument doesn't mean you can't continue to search them internal to your group or share them with another group.
>
> Randy
>
> -----Original Message-----
> From: psi...@li... [mailto:psi...@li...] On Behalf Of Eric Deutsch
> Sent: Tuesday, February 12, 2008 2:52 PM
> To: Mass spectrometry standard development
> Cc: Eric Deutsch
> Subject: Re: [Psidev-ms-dev] Some additional 'Unknown instrument' CVparameter thoughts
>
> Hi everyone, I'm trying to see if we can get to some consensus on some of these ongoing threads. Regarding the "unknown instrument" problem, I think there has been some confusion, so let me see if I can clarify and ask for a final round of opinions. I agree with Fredrik's comments below that his examples below are *not* what is intended.
Here is what I > believe Lennart intended: > > A) > <cvParam cvLabel="MS" accession="MS:1000031" name="instrument model" > value=""/> > > Or the other alternative is to create a term for unknown: > > B) > <cvParam cvLabel="MS" accession="MS:1099931" name="unknown instrument > model" value=""/> > (where the number is obviously made up by me right now, but would be in > the CV) > > So those are the choices. Putting something in the value attribute is > not an option as Fredrik concludes below. > > Benefits of A) > - No need to litter the CV with "xxx unknown" terms > - Happenstance very easy for the existing validator software to > accommodate > - Somewhat counterintuitive and thus dissuades laziness > Drawbacks of A) > - Somewhat counterintuitive and awkward > > Benefits of B) > - Very intuitive and straightforward: the concept of what instrument > generated these spectra is captured by the concept "sorry, I just don't > know which instrument it was" > Drawbacks of B) > - Opens the door to perhaps needing to sprinkle other unknowns in the CV > - Is a little more inviting to users to be lazy and claim they don't > know, when with a little more effort they could find out and report > properly (because "unknown" is not an *obvious* option) > - Would require more development in the validator to properly handle a > special term like this. > > Based on the feedback I saw so far, Lennart, Luisa and Angel like A. > Matt seemed more in favor of B. No clear reads on others. > > I myself prefer B. To me it feels like A is a convenient but > counterintuitive trick to working around the problem. B feels like the > right solution even if it facilitates laziness. I don't think that will > be a big problem. I'm sure we can come up with some syntax for the > validator to permit or disallow "ambiguity terms" as desired. > > So, what say ye? > > > > > > From: psi...@li... 
> [mailto:psidev-ms-dev- > > > > Hi Lennart, Josh, Matt and others, > > > > If the top level term is allowed it will be possible to define not > only > > instrument value='unknown', but also instruments that are not in the > CV > > by putting something in the value field: > > <cvParam cvLabel="MS" accession="MS:1000031" name="instrument model" > > value="The new mass spec not in CV"/> > > <cvParam cvLabel="MS" accession="MS:1000031" name="instrument model" > > value="unknown"/> > > Instead of the intended: > > <cvParam cvLabel="MS" accession="MS:1000189" name="q-tof ultima" > > value=""/> > > I'm not so sure that this is wanted. Especially since unknown could be > > written as 'not known', 'not specified' etcetera. It make sense to > have > > a CV term for 'unknown', but it would be quite a few 'unknown' terms > to > > add to the CV to get one for each required category in the mzML > > schema...At some places it would be enough with just 'unknown' > > (source,detector etc), but at other places it must be specified what > is > > unknown! > > > > Anyway, I am still for usage of top level elements :-) , see line 16 > at: > > http://trac.thep.lu.se/trac/fp6- > > prodac/browser/trunk/mzML/FF_070504_MSMS_5B.mzML > > > > cheers > > > > Fredrik > > > > Joshua Tasman skrev: > > > I'm with Matt on this one, and like his solution. There are > > unfortunately lots of real use cases (combining dta, mgfs) where the > > information will really be unknown, and we should accurately represent > the > > lack of information. If it's not too much effort to add a little more > > code to the validator, I would much prefer the accurate addition of an > > "unknown" term. There has been so much effort getting the CV and > document > > to line up with reality, it looks very strange to me to force this > > ontological 'hack' by allowing the category to appear as a value, as > Matt > > has said. 
> > > > > > Josh > > > > > > > > > Matthew Chambers wrote: > > > > > >> Lennart Martens wrote: > > >> > > >>> Hi Matt, and Colleagues, > > >>> > > >>> > > >>> > > >>> > > >>>> I don't really prefer one to the other very much, but I don't see > how > > >>>> the parent term would be easier to validate ("all but X children > of a > > >>>> term" doesn't make sense to me, do you mean "all children of a > term > > >>>> except X"?) > > >>>> > > >>>> > > >>> You are right; I provided bad shorthand for: 'all children of a > term, > > >>> except X (and Y, and Z, ... -- potentially). > > >>> > > >>> The reason why it it is easier to validate is due to the way the > > >>> validator mapping file is designed, e.g. (example verbatim from > > current > > >>> 0.99.1 mapping file): > > >>> > > >>> <CvTerm termAccession="MS:1000031" useTerm="false" > > >>> termName="instrument model" isRepeatable="false" > > >>> scope="/mzML/instrumentList/instrument" allowChildren="true" > > >>> cvIdentifier="MS"></CvTerm> > > >>> > > >>> this means that although all children of term 'MS:1000031 -- > > instrument > > >>> model' are allowed (allowChildren="true"), the term itself is not > > >>> allowed (useTerm="false"). By flipping this latter boolean, we can > > allow > > >>> the parent term, thus separating between MIAPE requirements > (current > > >>> configuration) and the 'usable mzML requirements' (flipped boolean > as > > >>> explained above) -- for the instrument model at least. > > >>> > > >>> > > >> OK, so it's an implementation thing. That's fine. > > >> > > >> > > >>>> What about data converted from DTAs or MGFs > > >>>> where the user doesn't even remember (or never knew) what kind of > > >>>> instrument it came from? > > >>>> > > >>>> > > >>> When the instrument is really unknown (which is unfortunate and > > >>> constitutes dramatic metadata loss whichever way you look at it), > the > > >>> proposed scenario (usage of toplevel term) provides solace. 
For > all > > >>> other scenarios (where an incentive to adapt convertor software or > > >>> report the development of a new instrument is concerned), the > relative > > >>> obscurity of the 'fix' might contribute to 'going the extra mile' > > >>> (upgrading the convertor, mailing in the new instrument name). > > >>> > > >>> > > >> While the toplevel term does provide some solace, it is obscure > enough > > >> that a casual user might look at it and think that something was > wrong > > >> because it does not intuitively make sense for the category to > appear > > as > > >> a value. What about this alternative: provide an "unknown > instrument" > > >> term with a unique accession #, but make the term name something > like > > >> "unknown (instrument type not specified or not in CV)". That would > be > > >> intuitive but still eye-catching (and it would be the eye-catching > part > > >> that implementors would want to minimize, because it makes them > look > > >> bad). ;) > > >> > > >> -Matt > > >> > > >> > ----------------------------------------------------------------------- > > -- > > >> This SF.net email is sponsored by: Microsoft > > >> Defy all challenges. Microsoft(R) Visual Studio 2008. > > >> http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ > > >> _______________________________________________ > > >> Psidev-ms-dev mailing list > > >> Psi...@li... > > >> https://lists.sourceforge.net/lists/listinfo/psidev-ms-dev > > >> > > > > > > > ------------------------------------------------------------------------ > > - > > > This SF.net email is sponsored by: Microsoft > > > Defy all challenges. Microsoft(R) Visual Studio 2008. > > > http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ > > > _______________________________________________ > > > Psidev-ms-dev mailing list > > > Psi...@li... 
|
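As a rough illustration of the mapping-file behaviour Lennart describes in the message above, here is a minimal Python sketch of how a validator might evaluate the useTerm and allowChildren flags. The is_allowed function and the PARENT_OF table are hypothetical (neither the real validator code nor the CV hierarchy appears in this thread), and treating "q-tof ultima" as a direct child of "instrument model" is a simplifying assumption.

```python
import xml.etree.ElementTree as ET

# Rule copied from the 0.99.1 mapping file quoted in the thread.
RULE_XML = """<CvTerm termAccession="MS:1000031" useTerm="false"
 termName="instrument model" isRepeatable="false"
 scope="/mzML/instrumentList/instrument" allowChildren="true"
 cvIdentifier="MS"></CvTerm>"""

# Hypothetical, simplified slice of the CV hierarchy: child -> parent.
PARENT_OF = {"MS:1000189": "MS:1000031"}


def is_allowed(rule, accession, parent_of):
    """Return True if `accession` satisfies the rule (assumed semantics)."""
    term = rule.get("termAccession")
    if accession == term:
        # The category term itself is only valid when useTerm is true.
        return rule.get("useTerm") == "true"
    # Otherwise walk up the hierarchy looking for the rule's term.
    parent = parent_of.get(accession)
    while parent is not None:
        if parent == term:
            return rule.get("allowChildren") == "true"
        parent = parent_of.get(parent)
    return False


miape_rule = ET.fromstring(RULE_XML)    # current configuration
relaxed_rule = ET.fromstring(RULE_XML)  # option A: flip the boolean
relaxed_rule.set("useTerm", "true")
```

Under these assumptions, option A amounts to validating against relaxed_rule instead of miape_rule, while option B would leave the rule untouched and add a new child term to the CV.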
From: Darren K. <dar...@cs...> - 2008-02-19 03:35:42
|
Hi all, We've posted the latest pwiz/msdata snapshot for anyone interested. The RAMP adapter is pretty much finished, and I've added a text->CV translator that is used in mzXML <-> mzML conversion. All info is now preserved in mzXML -> mzML -> mzXML conversion, with userParams used if an appropriate cvParam cannot be found. You can try out the msconvert.exe Windows binary included in the package. A "binary array size" userParam is still being used for the common m/z-intensity array length, until consensus can be reached on this issue, hopefully during the conference call... http://www.sfcap.cshs.org/private/pwiz_src_msdata_080215.zip psidev/pwiz Darren |
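The fallback Darren mentions (use a cvParam when a matching CV term exists, otherwise keep the text in a userParam) could look roughly like the Python sketch below. This is not pwiz's actual translator: the translate function, the KNOWN_TERMS table and the "msModel" field name are hypothetical, and the single accession shown is the one quoted elsewhere on this list.

```python
# Hypothetical lookup table from free text to CV accessions.
KNOWN_TERMS = {"q-tof ultima": "MS:1000189"}


def translate(field, text):
    """Map a free-text mzXML value to a cvParam, or fall back to a userParam."""
    key = text.strip().lower()
    accession = KNOWN_TERMS.get(key)
    if accession is not None:
        return {"element": "cvParam", "cvLabel": "MS",
                "accession": accession, "name": key, "value": ""}
    # No CV term found: keep the original text so nothing is lost.
    return {"element": "userParam", "name": field, "value": text}


known = translate("msModel", "Q-Tof Ultima")
unknown = translate("msModel", "Prototype X")
```

Keeping the untranslated text verbatim in the userParam is what would let a later mzML -> mzXML conversion restore the original value.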
From: Matt C. <mat...@va...> - 2008-02-19 02:21:20
|
Hi Eric,

Your summary looks fine to me. Let me clarify my proposal though:

1) add arrayLength attribute as a global in the spectrum hierarchy, and specify in the spec that the m/z and intensity arrays must be that length
2) make arrayLength attribute on binaryDataArray optional, and specify in the spec that the attribute can be used to override the global arrayLength attribute (but it would be semantically invalid to do so for the m/z or intensity array)

This is far less complicated than other parts of the schema, like specifying scans vs. acquisitions, so I'm confident implementors would be able to grasp it. Implementors who wish to keep things simple and only care about the primary data arrays can simply always ignore a binaryDataArray's arrayLength attribute along with the extra arrays. Just implementing support for the extra arrays would be complicated, regardless of whether or not they are allowed to be different in length.

-Matt

Eric Deutsch wrote:
> Hi everyone, thank you for the lively discussion on this arrayLength topic. I have summarized the discussion as this:
>
> - Eric proposed putting arrayLength= attr in <spectrum> and gives example
> - Angel agrees
> - Brian disagrees (or maybe just points out that this formatting goes against our intent as perceived by him?)
> - Matt proposes primaryArrayLength or mzIntArrayLength and allowing other arrays of different length in addition to the primary one
> - Angel asks for rational examples of non-same-length arrays
> - Eric pleads for keeping it simple
> - Matt proposes some complex multi-dim data: charge assignments, isotope number, peak label
> - Angel is not impressed with Matt's examples
> - Matt defends his complex use case although it will "probably never be seen"
> - Rune tries to dissuade such "user-defined craziness"
> - Marc likes the idea of being able to annotate a subset of peaks "b6, y6, y7-H2O"
> - Rune offers that arrayLength= could be under <spectrumDescription>
> - Lennart counters that the annotations that Marc suggests do not belong in mzML since they are interpretation, not raw mass spec output
> - Marc concedes that mzML is not the right place for the above example, but still likes the idea of being able to make such annotations
> - Matt still lobbies for allowing multiple parallel X-axes and corresponding Y-axes
> - Fredrik votes for keeping it simple with 1 fixed array size
> - Randy has no problem with "moving arrayLength up"
> [end of thread]
>
> I apologize that I have bluntly oversimplified the elegant arguments put forth, but I needed to see the whole discussion in a series of one-liners.
>
> I tally up the votes like this:
>
> In favor of Eric's proposal: 6 (Eric, Angel, Rune, Lennart, Fredrik, Randy)
> In favor of expanding the schema to handle multiple groups of arrays with different lengths between groups: 3 (Brian?, Matt, Marc)
>
> Have I inaccurately pegged anyone? Anyone else want to change their vote or weigh in anew? I suspect we may not all agree on what to do here and may just have to go with the majority.
>
> Speaking for only myself, I have not yet seen an example that I find compelling enough to complexify the schema to handle it.
I can still > envision one mzML file containing profile spectra and a second mzML file > after peak picking that contains the centroided peaks along with a > charge array and even an isotope number for each picked peak, with some > agreed-upon, documented NULL value within the array for missing > information. This is already fully supported in mzML and not something > that is even possible in mzXML and mzData, so we're already extending > the capability in a clear but simple way IMHO. > > Well? > > > > >> -----Original Message----- >> From: psi...@li... >> > [mailto:psidev-ms-dev- > >> bo...@li...] On Behalf Of Fredrik Levander >> Sent: Wednesday, February 13, 2008 11:13 AM >> To: Mass spectrometry standard development >> Subject: Re: [Psidev-ms-dev] binaryArrayData lengths >> >> As Eric concluded, a problem with arrays of different lengths is that >> > you > >> would normally want pairs (or higher) of data, i.e. an m/z and charge >> state pair. This would require two m/z arrays in the set if there >> > would be > >> one set of m/z-intensity pairs and another set of different length >> > with > >> m/z-charge state. Using the current schema structure it would not be >> possible to determine which m/z array belong to which other array. OK, >> > you > >> could identify pairs by looking at the arrayLength of the different >> > arrays > >> and use that for pairing, but it seems suboptimal to me. >> Also, if the spectrum element represents a list of picked peaks I >> > think > >> you would have charge assignments for all the peaks, even if some >> > would be > >> zero or another dummy value if the assignment failed. >> If the spectrum element represents a profile spectrum I cannot see the >> > use > >> for a set of binary arrays of different lengths. By definition the >> spectrum has to be either profile or centroid (peak list), so there >> shouldn't be a mixture of profile / centroid data in one spectrum. 
>> >> So, I also vote for binary arrays of the same length for a spectrum. >> >> Fredrik >> >> ----- Original Message ----- >> From: Matthew Chambers <mat...@va...> >> Date: Wednesday, February 13, 2008 5:12 pm >> Subject: Re: [Psidev-ms-dev] binaryArrayData lengths >> >> >>> It's true that identification output doesn't belong in mzML, but >>> peak >>> charge state assignments and isotope assignments (to name two >>> examples) >>> do not fall under that umbrella. Such annotation does belong in the >>> mzML >>> IMO, either in the same file or in a new one, it doesn't really >>> matter. >>> And such advanced annotation is unlikely to be available for every >>> peak >>> (much less every data point for profile data!). I fail to see the >>> harm >>> of allowing the length attribute of binaryDataArrays to be >>> optional, and >>> if not present for a given binaryDataArray, readers would be >>> instructed >>> to treat it the same as the required length attribute (given as an >>> attribute on the corresponding spectrum element). As for how this >>> will >>> allow for user-defined craziness, "userParam" does already allow >>> for >>> that, but binary data cannot be encoded in a userParam to my >>> knowledge. >>> -Matt >>> >>> >>> Lennart Martens wrote: >>> >>>> Hi Marc, >>>> >>>> >>>> >>>> >>>>> i like that idea of being able to annotate a small subset of the >>>>> >>> peaks >>> >>>>> in a spectrum. >>>>> This is e.g. needed when assigning ion types for MS/MS: b1, b2, >>>>> >>> ..., y1, >>> >>>>> y2, ..., y7-H2O, ... >>>>> Most of the peaks are simply noise and so only a minority of >>>>> >>> peaks will >>> >>>>> have an annotation. >>>>> Using a full-sized array would be possible, but a waste of space. >>>>> In my opinion, there should be a recommended way to do such a >>>>> >>> thing. >>> >>>>> What do you suggest? >>>>> >>>>> Before i forget: Is it possible to annotate peaks with strings? 
>>>>> Otherwise we would have to use some kind of dictionary to assign >>>>> >>> ion >>> >>>>> type an integer index. >>>>> >>>>> >>>> The annotation of a mass spectrum with fragment ion types and >>>> >>> indices >>> >>>> presents a significant amount of processing of the original mass >>>> >>> spec >>> >>>> data, as well as a certain type of 'inference' (uncertainty, and >>>> >>> often >>> >>>> ambiguity!) that has nothing to do with the mass spectrometer, >>>> >>> but >>> >>>> relates to an identification algorithm of some description. >>>> >>>> As such, I don't think we want to annotate this information in >>>> >>> mzML at >>> >>>> all, or encourage people to do so. The scope of mzML should >>>> >>> remain >>> >>>> limited to the instrument output (with possibly some signal >>>> >>> processing >>> >>>> done by the instrument software). >>>> >>>> Fragment ion annotation should therefore be held elsewhere, and >>>> >>> the PSI >>> >>>> is actually creating analysisXML for the purpose of recording >>>> identification algorithm output (such as fragment ion >>>> >>> assignment). >>> >>>> analysisXML will link back to the mzML files used as input, and >>>> >>> through >>> >>>> this link, peak annotation can be extracted. >>>> >>>> >>>> Cheers, >>>> >>>> lnnrt. >>>> >>>> >>>>> -Marc >>>>> > |
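Matt's two-part proposal at the top of this message (a spectrum-level arrayLength, optionally overridden per binaryDataArray but never for the m/z or intensity array) can be sketched in Python as follows. This is only an illustration under stated assumptions: the name attribute used to tell arrays apart is hypothetical (the schema under discussion names arrays through a cvParam), and effective_lengths is not part of any existing reader.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical spectrum: real arrays are named via cvParam.
SPECTRUM_XML = """<spectrum id="S1" arrayLength="4">
  <binaryDataArray name="m/z"/>
  <binaryDataArray name="intensity"/>
  <binaryDataArray name="charge" arrayLength="2"/>
</spectrum>"""

PRIMARY_ARRAYS = {"m/z", "intensity"}


def effective_lengths(spectrum):
    """Resolve each array's length: local override first, else the global value."""
    default = int(spectrum.get("arrayLength"))
    lengths = {}
    for array in spectrum.findall("binaryDataArray"):
        name = array.get("name")
        override = array.get("arrayLength")
        length = default if override is None else int(override)
        # Point 2 of the proposal: primary arrays may not deviate.
        if name in PRIMARY_ARRAYS and length != default:
            raise ValueError(f"{name} array may not override arrayLength")
        lengths[name] = length
    return lengths


lengths = effective_lengths(ET.fromstring(SPECTRUM_XML))
```

A reader that only cares about the primary arrays can, as Matt notes, read the spectrum-level value and skip every per-array attribute.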
From: Matt C. <mat...@va...> - 2008-02-19 02:08:20
|
OK, I understand. Actually though, it is entirely possible (and not very complicated) to encode string information in binary data arrays like we do for the arrays of reals, so even that use case is not lost (however, it does tie in with the array length discussion). Either a BSTR array (length followed by char data) or NULL terminated strings would work. -Matt Randy Julian wrote: >>> If we go with a generic data element with a binary vector, the only >>> > use > >>> case we lose is providing labels to individual peaks using non-binary >>> data (like strings). This did not get much use in mzData, so maybe >>> doesn't matter much. >>> >>> >> I have no idea how you arrived at this last statement. It seems to be >> more related to the previous discussion about allowing data arrays of >> differing length. >> >> -Matt >> > > All I meant was that the supplemental data vector in mzData was > implemented differently than the current data array in mzML. The > supplemental data vector of mzData allowed non base64 encoded data (most > of the xsd types). What I meant by "labels" were strings like: "a-ion" > or "b-ion" or "+2". I don't think not being able to do this is much of > a loss. > > Randy > |
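The two string encodings Matt mentions can be sketched in Python as below. The byte layout is an assumption made for illustration (ASCII text, a 4-byte little-endian length prefix for the BSTR-like variant, base64 as for the numeric arrays); none of these function names come from mzML or any existing library.

```python
import base64
import struct


def encode_nul_terminated(labels):
    """Concatenate labels, each followed by a NUL byte, then base64-encode."""
    raw = b"".join(label.encode("ascii") + b"\x00" for label in labels)
    return base64.b64encode(raw).decode("ascii")


def decode_nul_terminated(text):
    raw = base64.b64decode(text)
    # The final split element is the empty remainder after the last NUL.
    return [chunk.decode("ascii") for chunk in raw.split(b"\x00")[:-1]]


def encode_length_prefixed(labels):
    """BSTR-like layout: 4-byte little-endian length, then the characters."""
    parts = []
    for label in labels:
        data = label.encode("ascii")
        parts.append(struct.pack("<I", len(data)) + data)
    return base64.b64encode(b"".join(parts)).decode("ascii")


def decode_length_prefixed(text):
    raw = base64.b64decode(text)
    labels, offset = [], 0
    while offset < len(raw):
        (size,) = struct.unpack_from("<I", raw, offset)
        offset += 4
        labels.append(raw[offset:offset + size].decode("ascii"))
        offset += size
    return labels


labels = ["b6", "", "y7-H2O"]  # empty string marks an unlabelled peak
```

Either layout keeps one entry per peak, which is why the choice ties back to the array length discussion: an unlabelled peak still costs at least one byte (NUL-terminated) or four bytes (length-prefixed).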
From: Randy J. <rkj...@in...> - 2008-02-19 01:04:45
|
>> If we go with a generic data element with a binary vector, the only use >> case we lose is providing labels to individual peaks using non-binary >> data (like strings). This did not get much use in mzData, so maybe >> doesn't matter much. >> >I have no idea how you arrived at this last statement. It seems to be >more related to the previous discussion about allowing data arrays of >differing length. > >-Matt All I meant was that the supplemental data vector in mzData was implemented differently than the current data array in mzML. The supplemental data vector of mzData allowed non base64 encoded data (most of the xsd types). What I meant by "labels" were strings like: "a-ion" or "b-ion" or "+2". I don't think not being able to do this is much of a loss. Randy |
From: Eric D. <ede...@sy...> - 2008-02-19 01:04:30
|
Hi everyone, thank you for the lively discussion on this arrayLength topic. I have summarized the discussion as this:

- Eric proposed putting arrayLength= attr in <spectrum> and gives example
- Angel agrees
- Brian disagrees (or maybe just points out that this formatting goes against our intent as perceived by him?)
- Matt proposes primaryArrayLength or mzIntArrayLength and allowing other arrays of different length in addition to the primary one
- Angel asks for rational examples of non-same-length arrays
- Eric pleads for keeping it simple
- Matt proposes some complex multi-dim data: charge assignments, isotope number, peak label
- Angel is not impressed with Matt's examples
- Matt defends his complex use case although it will "probably never be seen"
- Rune tries to dissuade such "user-defined craziness"
- Marc likes the idea of being able to annotate a subset of peaks "b6, y6, y7-H2O"
- Rune offers that arrayLength= could be under <spectrumDescription>
- Lennart counters that the annotations that Marc suggests do not belong in mzML since they are interpretation, not raw mass spec output
- Marc concedes that mzML is not the right place for the above example, but still likes the idea of being able to make such annotations
- Matt still lobbies for allowing multiple parallel X-axes and corresponding Y-axes
- Fredrik votes for keeping it simple with 1 fixed array size
- Randy has no problem with "moving arrayLength up"
[end of thread]

I apologize that I have bluntly oversimplified the elegant arguments put forth, but I needed to see the whole discussion in a series of one-liners.

I tally up the votes like this:

In favor of Eric's proposal: 6 (Eric, Angel, Rune, Lennart, Fredrik, Randy)
In favor of expanding the schema to handle multiple groups of arrays with different lengths between groups: 3 (Brian?, Matt, Marc)

Have I inaccurately pegged anyone? Anyone else want to change their vote or weigh in anew?
I suspect we may not all agree on what to do here and may just have to go with the majority. Speaking for only myself, I have not yet seen an example that I find compelling enough to complexify the schema to handle it. I can still envision one mzML file containing profile spectra and a second mzML file after peak picking that contains the centroided peaks along with a charge array and even an isotope number for each picked peak, with some agreed-upon, documented NULL value within the array for missing information. This is already fully supported in mzML and not something that is even possible in mzXML and mzData, so we're already extending the capability in a clear but simple way IMHO. Well? > -----Original Message----- > From: psi...@li... [mailto:psidev-ms-dev- > bo...@li...] On Behalf Of Fredrik Levander > Sent: Wednesday, February 13, 2008 11:13 AM > To: Mass spectrometry standard development > Subject: Re: [Psidev-ms-dev] binaryArrayData lengths > > As Eric concluded, a problem with arrays of different lengths is that you > would normally want pairs (or higher) of data, i.e. an m/z and charge > state pair. This would require two m/z arrays in the set if there would be > one set of m/z-intensity pairs and another set of different length with > m/z-charge state. Using the current schema structure it would not be > possible to determine which m/z array belong to which other array. OK, you > could identify pairs by looking at the arrayLength of the different arrays > and use that for pairing, but it seems suboptimal to me. > Also, if the spectrum element represents a list of picked peaks I think > you would have charge assignments for all the peaks, even if some would be > zero or another dummy value if the assignment failed. > If the spectrum element represents a profile spectrum I cannot see the use > for a set of binary arrays of different lengths. 
By definition the > spectrum has to be either profile or centroid (peak list), so there > shouldn't be a mixture of profile / centroid data in one spectrum. > > So, I also vote for binary arrays of the same length for a spectrum. > > Fredrik > > ----- Original Message ----- > From: Matthew Chambers <mat...@va...> > Date: Wednesday, February 13, 2008 5:12 pm > Subject: Re: [Psidev-ms-dev] binaryArrayData lengths > > > It's true that identification output doesn't belong in mzML, but > > peak > > charge state assignments and isotope assignments (to name two > > examples) > > do not fall under that umbrella. Such annotation does belong in the > > mzML > > IMO, either in the same file or in a new one, it doesn't really > > matter. > > And such advanced annotation is unlikely to be available for every > > peak > > (much less every data point for profile data!). I fail to see the > > harm > > of allowing the length attribute of binaryDataArrays to be > > optional, and > > if not present for a given binaryDataArray, readers would be > > instructed > > to treat it the same as the required length attribute (given as an > > attribute on the corresponding spectrum element). As for how this > > will > > allow for user-defined craziness, "userParam" does already allow > > for > > that, but binary data cannot be encoded in a userParam to my > > knowledge. > > -Matt > > > > > > Lennart Martens wrote: > > > Hi Marc, > > > > > > > > > > > >> i like that idea of being able to annotate a small subset of the > > peaks > > >> in a spectrum. > > >> This is e.g. needed when assigning ion types for MS/MS: b1, b2, > > ..., y1, > > >> y2, ..., y7-H2O, ... > > >> Most of the peaks are simply noise and so only a minority of > > peaks will > > >> have an annotation. > > >> Using a full-sized array would be possible, but a waste of space. > > >> In my opinion, there should be a recommended way to do such a > > thing. > > >> What do you suggest? 
> > >> > > >> Before i forget: Is it possible to annotate peaks with strings? > > >> Otherwise we would have to use some kind of dictionary to assign > > ion > > >> type an integer index. > > >> > > > > > > The annotation of a mass spectrum with fragment ion types and > > indices > > > presents a significant amount of processing of the original mass > > spec > > > data, as well as a certain type of 'inference' (uncertainty, and > > often > > > ambiguity!) that has nothing to do with the mass spectrometer, > > but > > > relates to an identification algorithm of some description. > > > > > > As such, I don't think we want to annotate this information in > > mzML at > > > all, or encourage people to do so. The scope of mzML should > > remain > > > limited to the instrument output (with possibly some signal > > processing > > > done by the instrument software). > > > > > > Fragment ion annotation should therefore be held elsewhere, and > > the PSI > > > is actually creating analysisXML for the purpose of recording > > > identification algorithm output (such as fragment ion > > assignment). > > > analysisXML will link back to the mzML files used as input, and > > through > > > this link, peak annotation can be extracted. > > > > > > > > > Cheers, > > > > > > lnnrt. > > > > > >> -Marc > > >> > > >> ----------------------------------------------------------------- > > -------- > > >> This SF.net email is sponsored by: Microsoft > > >> Defy all challenges. Microsoft(R) Visual Studio 2008. > > >> http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ > > >> _______________________________________________ > > >> Psidev-ms-dev mailing list > > >> Psi...@li... > > >> https://lists.sourceforge.net/lists/listinfo/psidev-ms-dev > > >> > > > > > > > > > ------------------------------------------------------------------ > > ------- > > > This SF.net email is sponsored by: Microsoft > > > Defy all challenges. Microsoft(R) Visual Studio 2008. 
|
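The same-length approach Eric favours in this message, with a documented placeholder for peaks that have no charge assignment, might be read back as in the Python sketch below. The sentinel value 0 is an assumption (Fredrik's quoted message suggests "zero or another dummy value"), and pair_peaks is a hypothetical helper rather than part of any mzML reader.

```python
NO_CHARGE = 0  # assumed, agreed-upon placeholder for "no assignment"


def pair_peaks(mz, intensity, charge):
    """Zip equal-length arrays into peaks, mapping the sentinel to None."""
    if not (len(mz) == len(intensity) == len(charge)):
        raise ValueError("all arrays of one spectrum must share one length")
    return [
        (m, i, None if z == NO_CHARGE else z)
        for m, i, z in zip(mz, intensity, charge)
    ]


peaks = pair_peaks(
    [400.2, 401.2, 512.3],
    [1500.0, 300.0, 980.0],
    [2, NO_CHARGE, 1],
)
```

Because every array has one entry per peak, pairing is purely positional and no extra bookkeeping is needed to decide which m/z array an annotation belongs to, at the cost of storing a placeholder for each unannotated peak.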
From: Matthew C. <mat...@va...> - 2008-02-18 22:38:47
|
Randy Julian wrote: > ADC data with a time stamp could be stored under a <chromatogram> > element, but if the data were coming from a positional encoding system > (like might be used for DESI or DART), then this would not be a > chromatogram. > > I too am in favor of a generically named data location with named > vectors that work like the current <binaryDataArray> element. > > TOF instruments have a true time-domain x-axis, and FT could first have > a time-domain (for the FID), which could then be transformed into a > frequency-domain axis prior to calibration into a m/z axis. All of > these are a legitimate spectrum axes (from a journal editors point of > view). PDA are also two dimensional (like MS) with the independent > variable normally given in terms of wavelength. > Technically you're right, any axis which could be transformed to the m/z axis could be considered a legitimate spectrum axis, but practically speaking I think the mzML standard should semantically (not schematically, given the current form of the schema) require that every spectrum have an m/z array and an intensity array. I was under the impression that this was already the case with the mapping file that the validator uses. Am I mistaken? I think it's absurd to expect every MS application to be able to understand the time representation of m/zs, the frequency representation, and the m/z representation, so we should stick to the m/z common denominator. Therefore, if a file writer added those other axes (frequency/time/wavelength) they would be "auxiliary" (in the spectrum element). Those axes might be the primary axis in a different (either specific or generic) element type, though (e.g. a pda element). > When you say 'auxiliary' array, do you mean just another > <binaryDataArray> within <spectrum>? There is no specific "primary > independent variable" indicated for <spectrum> in the current schema - > it looks like it has to be assumed based on an interpretation of a > cvParam. 
> If we go with a generic data element with a binary vector, the only use
> case we lose is providing labels to individual peaks using non-binary
> data (like strings). This did not get much use in mzData, so maybe
> doesn't matter much.

I have no idea how you arrived at this last statement. It seems to be more related to the previous discussion about allowing data arrays of differing length.

-Matt
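The two points being separated here (string labels per peak versus arrays of differing length) can be illustrated with a small sketch. It assumes a hypothetical `GenericDataItem` that holds only numeric arrays keyed by a name, where the name stands in for the cvParam that labels each array. Neither the class nor its methods come from the mzML schema; they exist only to show that numeric arrays pack directly into bytes while string labels do not, and that the equal-length question is a separate check.

```python
import struct
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class GenericDataItem:
    """Hypothetical container of named numeric arrays (not an mzML type)."""
    arrays: Dict[str, List[float]] = field(default_factory=dict)

    def add_array(self, name: str, values: list) -> None:
        # Only numbers can be packed as fixed-width binary; strings cannot.
        for v in values:
            if isinstance(v, bool) or not isinstance(v, (int, float)):
                raise TypeError(f"array {name!r} holds non-numeric data")
        self.arrays[name] = [float(v) for v in values]

    def same_length(self) -> bool:
        # True when every stored array has the same number of points.
        return len({len(v) for v in self.arrays.values()}) <= 1

    def encode(self, name: str) -> bytes:
        # Little-endian 64-bit floats, one per data point.
        values = self.arrays[name]
        return struct.pack(f"<{len(values)}d", *values)


item = GenericDataItem()
item.add_array("m/z", [100.0, 200.5])
item.add_array("intensity", [10, 20])
print(item.same_length(), len(item.encode("m/z")))
```

Calling `add_array("label", ["b2", "y3"])` on such an item raises `TypeError`, which is the lost use case; adding an array with a different number of points merely makes `same_length()` return `False`, which is the other discussion.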
From: Randy J. <rkj...@in...> - 2008-02-18 21:49:59
ADC data with a time stamp could be stored under a <chromatogram> element, but if the data were coming from a positional encoding system (like might be used for DESI or DART), then this would not be a chromatogram. I too am in favor of a generically named data location with named vectors that work like the current <binaryDataArray> element.

TOF instruments have a true time-domain x-axis, and FT could first have a time-domain (for the FID), which could then be transformed into a frequency-domain axis prior to calibration into an m/z axis. All of these are legitimate spectrum axes (from a journal editor's point of view). PDA data are also two-dimensional (like MS), with the independent variable normally given in terms of wavelength.

When you say 'auxiliary' array, do you mean just another <binaryDataArray> within <spectrum>? There is no specific "primary independent variable" indicated for <spectrum> in the current schema - it looks like it has to be assumed based on an interpretation of a cvParam.

If we go with a generic data element with a binary vector, the only use case we lose is providing labels to individual peaks using non-binary data (like strings). This did not get much use in mzData, so maybe it doesn't matter much.

Randy
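The point about the independent variable having to be inferred from a cvParam can be sketched as follows. This is only an illustration under stated assumptions: the axis names and the `describe` function are hypothetical, and a real reader would match controlled-vocabulary terms rather than plain strings. The sketch shows a reader guessing what a set of named arrays represents, and giving up when zero or several candidate axes are present.

```python
# Hypothetical axis names; real files would carry controlled-vocabulary terms.
INDEPENDENT_AXES = {
    "m/z": "mass spectrum",
    "time": "chromatogram",
    "wavelength": "PDA spectrum",
    "position": "positional scan",
}


def describe(array_names):
    """Guess what a set of named arrays represents from its axis names."""
    found = [name for name in array_names if name in INDEPENDENT_AXES]
    if len(found) != 1:
        # No candidate axis, or more than one: the reader cannot decide.
        return "unknown"
    return INDEPENDENT_AXES[found[0]]


print(describe(["time", "intensity"]))
print(describe(["m/z", "time", "intensity"]))
```

Note that this naive mapping would read a TOF time-domain axis as a chromatogram, which is exactly the ambiguity at issue when nothing in the schema marks one array as the primary independent variable.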
From: Matthew C. <mat...@va...> - 2008-02-18 21:03:44
If non-spectral data is intended to be supported by mzML, I like the idea of a generic spectrum-level element in which the data array terms are much more flexible than they are in spectrum elements. Just to clarify, would ADC and PDA data always be chromatographic (time vs. something else)? If so, I am partial to your <chromatogram> element. If the independent variable might not be time, then <chromatogram> wouldn't be generic enough. If non-spectral data is not intended to be supported by mzML, then it should not be acceptable to hack in support for such data by manipulating the spectrum element. The frequency domain for TOF and FT instruments is an interesting question as well; I don't think it would make sense as the primary independent variable of a <spectrum> element, but it could be an auxiliary array. -Matt Randy Julian wrote: > I think we've hit at some of the key points for the discussion tomorrow. > > > What is your recommendation for storing ADC (or PDA) data? > > Also, does the current idea for the data vectors support storing the > original time axis from a TOF or an FT instrument? > > Thanks, > Randy > > -----Original Message----- > From: Matthew Chambers [mailto:mat...@va...] > Sent: Monday, February 18, 2008 3:19 PM > To: Mass spectrometry standard development > Cc: Randy Julian > Subject: Re: [Psidev-ms-dev] Teleconference Tuesday 19 Feb 2008 > > > > Randy Julian wrote: > >> I originally presented a draft of mzData 1.1 which had chromatogram >> elements in it, and it worked just fine for all sorts of acquisitions >> > an > >> instrument can perform in addition to acquiring a spectrum. I >> appreciate that this suggestion also created some other difficulties >> (like multiple ways to store the same data), and I dropped the draft >> > as > >> a serious suggestion in favor of a merger between mzData and mzXML. >> >> > Yes, as I understand the term, a chromatogram is a generic concept for > any data stored with time as one axis. 
> > >> "Analog Channel" is a nickname for the typical analog-to-digital >> converters available on most mass spectrometers for recording data >> > from > >> external devices which generate either a voltage or current output. >> These ADC inputs, and everything else recorded by the data system, >> undergo digitization. And yes, historically detectors were mostly >> analog, but over the past decade or so, they are increasingly pulse >> counting systems with all sorts of signal processing possibilities. >> Most people don't consider pulse counting systems to be analog... >> >> > OK. I can't say I like that nickname to refer to an extra/auxiliary data > > channel, but so be it. > > >> We have already gone to generic vectors where the name (like mz and >> intensity) have to be provided in a cvParam. It is easy already to >> > name > >> the vectors anything you like. This is important, especially since we >> got rid of the supplemental data vectors for holding things like >> individual peak annotations, and alternative processing of the >> > spectrum > >> (like digital filtering, etc.). This is all really good, and pretty >> generic already. I'm not suggesting that we complicate things more >> > with > >> specialization, but acknowledge the generalization which is already >> present and needed to record common extensions to the base use case. >> >> Because of the generic, unnamed vectors, a display program will >> > already > >> have to sort out what it's looking at when it reads each vector. They >> are not ordered, for example, and there is not a schema-enforced >> requirement that there are always two - or even that they are named at >> all. I'm suggesting that since a robust viewing program is going to >> have to do a lot of checking to determine how the vectors are used in >> the current scheme, we would not have to do much to make the schema >> > much > >> more broadly applicable. 
Since the schema is being considered for use >> in metabolomics and other small molecule work, I think this is >> important. >> >> > Yes, the vectors are generic, but their parent element is not > (<spectrum*>), so the only thing they should be generic for are things > within the domain of the "spectrum" concept. You are suggesting that we > take away the (intuitive) attribute requirements of a spectrum so that > it can be used as a generic concept. I am not at all opposed to the idea > > of a generic concept at the level of <spectrum> in the data hierarchy, > I'm just opposed to the idea that such a concept be called a "spectrum". > > If you were to suggest that we rename the spectrum element to something > generic like "runItem" (and spectrumList possibly to "runItemList") I > could live with that. It looks silly, but it wouldn't be flat-out wrong > and counter-intuitive! :) I would prefer to keep the spectrum element > and add a generic sibling concept instead, though. > > -Matt > > >> Randy >> >> -----Original Message----- >> From: Matthew Chambers [mailto:mat...@va...] >> Sent: Monday, February 18, 2008 2:23 PM >> To: Mass spectrometry standard development >> Cc: Randy Julian >> Subject: Re: [Psidev-ms-dev] Teleconference Tuesday 19 Feb 2008 >> >> Have you previously made a detailed proposal about what the >> representation of these non-MS signals should look like? And to my >> (limited) knowledge, calling them "analog" signals is rather >> > misleading, > >> because by necessity they must be digitized to be represented >> > digitally. > >> :) Don't MS signals come from analog detectors as well? >> >> It sounds like you either want a specialized way to encode each >> non-spectral data type, or a generic way to encode any non-spectral >> > data > >> type. In the former case, the schema and the validator mapping would >> define semantics for which data axes are allowed in which data type >> (e.g. "mz vs. intensity in a <spectrum>", "time vs. 
intensity in a >> <chromatogram>", "x vs. y in a <uvChannel>", etc.), and in the latter >> case, there would be a generic <channel> element which would have a >> variable set of binary data arrays and the names/types of those arrays >> > > >> would be determined by the file creator. Or both approaches could be >> combined. But either (or both) approaches are superior to trying to >> shove generic "channel" data into a <spectrum> element IMO. Like you >> said, it should be possible for readers which only care about spectral >> > > >> data to easily skip the non-spectral data and that would be vastly >> > more > >> intuitive if there were other element names to put the non-spectra >> > data > >> in. >> >> -Matt >> >> >> Randy Julian wrote: >> >> >>> Matt, >>> >>> I'm only talking about data which is collected by the mass >>> >>> >> spectrometer >> >> >>> data system in conjunction with the mass spectral experiment. >>> >>> When we did LC-LC experiments in my lab, we would sometimes put a UV >>> detector between the two columns, and collect data on analog channels >>> recorded by XCalibur. Most instruments have this capability. >>> >>> Since there seems to be resistance to the whole idea of a >>> >>> >> <chromatogram> >> >> >>> element (which I appreciate), it leaves open the question about what >>> >>> >> to >> >> >>> do with data collected by the data system during the LC-MS >>> > experiment. > >>> I don't understand why we don't want to acknowledge that almost all >>> > MS > >>> data systems can be used to collect analog signals during experiments >>> along with spectra. This is simple stuff, and very useful. I don't >>> want to lose this use case, and we've no place else to put this data. >>> >>> Randy >>> >>> >>> -----Original Message----- >>> From: psi...@li... >>> [mailto:psi...@li...] 
>>> On Behalf Of Matthew Chambers
>>> Sent: Monday, February 18, 2008 1:53 PM
>>> To: Mass spectrometry standard development
>>> Subject: Re: [Psidev-ms-dev] Teleconference Tuesday 19 Feb 2008
>>>
>>> Is there a reason to accommodate non-spectral data inside spectrum
>>> elements? If the file should be able to handle non-spectral data, then I
>>> think we should have other kinds of elements instead of introducing
>>> strange logic about deciding whether a spectrum is really a spectrum or
>>> not based on its MS level. Working out the other data representations
>>> would take time, though. It's worth discussing in the teleconference.
>>>
>>> As for the scanNumber vs. scan element question, I'm a bit confused
>>> about that so I'd also like to cover it tomorrow.
>>>
>>> When are we going to open the cvParam-format can of worms?
>>>
>>> -Matt
>>>
>>> Randy Julian wrote:
>>>
>>>> I'd like to get a couple of schema items on the agenda tomorrow.
>>>>
>>>> I've been asking about a possible change in the schema regarding
>>>> msLevel. As an alternative to moving the attribute, or making it
>>>> optional, I would like to propose that we allow non-MS channels acquired
>>>> by the MS data system and stored in the raw file to be marked as
>>>> msLevel=0. This would require a change to the specification document
>>>> but would allow software to ignore non-spectral content (whatever it
>>>> might be) if the level is not at least 1.
>>>>
>>>> Another approach which is also consistent with the rest of the schema is
>>>> to make the attribute a cvParam like the axis names. This would require
>>>> a schema change and shift the validation of msLevel to the validator
>>>> program.
>>>> If there is strong support for a required msLevel attribute in
>>>> the current location, we could still represent the other signals with
>>>> the suggestion above.
>>>>
>>>> Also, I haven't heard back about the relationship between the 'scan'
>>>> number attributes and the scan elements. Has anyone looked at this yet?
>>>> Can we also discuss how this is supposed to work tomorrow?
>>>>
>>>> Thanks,
>>>> Randy
>>>>
>>>> -----Original Message-----
>>>> From: psi...@li...
>>>> [mailto:psi...@li...] On Behalf Of
>>>> Lennart Martens
>>>> Sent: Monday, February 18, 2008 1:07 PM
>>>> To: Mass spectrometry standard development
>>>> Subject: [Psidev-ms-dev] Teleconference Tuesday 19 Feb 2008
>>>>
>>>> Dear PSI-MS Enthusiasts,
>>>>
>>>> The next telephone conference for the PSI-MS development group will take
>>>> place on Tuesday, 19 February 2008.
>>>>
>>>> The phone conference will take place at the time indicated below (please
>>>> find a location near you):
>>>>
>>>> http://www.timeanddate.com/worldclock/fixedtime.html?day=19&month=2&year=2008&hour=17&min=0&sec=0&p1=0
>>>>
>>>> phone numbers are:
>>>>
>>>> + Germany: 08001012079
>>>> + Switzerland: 0800000860
>>>> + UK: 08081095644
>>>> + USA: 1-866-314-3683
>>>> + Generic international: +44 2083222500 (UK number)
>>>>
>>>> access code: 297427
>>>>
>>>> You can also view these details online on the PSI website:
>>>>
>>>> http://www.psidev.info/index.php?q=node/313
>>>>
>>>> Best regards,
>>>>
>>>> lnnrt.
>>>>
>>>> -------------------------------------------------------------------------
>>>> This SF.net email is sponsored by: Microsoft
>>>> Defy all challenges. Microsoft(R) Visual Studio 2008.
>>>> http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
>>>> _______________________________________________
>>>> Psidev-ms-dev mailing list
>>>> Psi...@li...
>>>> https://lists.sourceforge.net/lists/listinfo/psidev-ms-dev
 |
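For readers following the thread, the two options under discussion (marking non-MS channels as msLevel=0 inside <spectrum>, versus giving them a separate sibling element) can be compared from the point of view of a reader that only wants spectra. The following is a minimal sketch, not the schema itself: the XML fragments, the <channelList>/<channel> names, and both helper functions are hypothetical and exist only to illustrate the trade-off.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment for the msLevel=0 proposal: the UV trace is stored
# as a <spectrum> and flagged as non-spectral by msLevel="0".
MSLEVEL_ZERO_DOC = """
<run>
  <spectrumList>
    <spectrum id="S1" msLevel="1"/>
    <spectrum id="S2" msLevel="2"/>
    <spectrum id="UV1" msLevel="0"/>
  </spectrumList>
</run>
"""

# Hypothetical fragment for the sibling-element proposal: the UV trace lives
# in its own element, so <spectrum> only ever holds spectra.
SIBLING_DOC = """
<run>
  <spectrumList>
    <spectrum id="S1" msLevel="1"/>
    <spectrum id="S2" msLevel="2"/>
  </spectrumList>
  <channelList>
    <channel id="UV1"/>
  </channelList>
</run>
"""


def spectra_by_mslevel(xml_text):
    """Return ids of <spectrum> entries whose msLevel is at least 1.

    Under the msLevel=0 proposal every reader needs this extra check.
    """
    root = ET.fromstring(xml_text)
    return [
        s.get("id")
        for s in root.iter("spectrum")
        if int(s.get("msLevel", "0")) >= 1
    ]


def spectra_by_element(xml_text):
    """Return ids of all <spectrum> entries.

    Under the sibling-element proposal the element name alone is enough.
    """
    root = ET.fromstring(xml_text)
    return [s.get("id") for s in root.iter("spectrum")]


if __name__ == "__main__":
    print(spectra_by_mslevel(MSLEVEL_ZERO_DOC))
    print(spectra_by_element(SIBLING_DOC))
```

In both cases the reader ends up with the same two spectra; the difference is whether the filtering rule lives in every reader (the msLevel check) or in the document structure (the element name).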