From: Kessner, D. E. <Dar...@cs...> - 2008-01-22 16:24:54
|
Just a note from Parag, to explain my mention of "wading through our legal department's bureaucracy":

"it WILL be fully open source and distributable - but as it is a new package we need to get it cleared first."

Darren

________________________________
From: psi...@li... [mailto:psi...@li...] On Behalf Of Kessner, Darren E.
Sent: Monday, January 21, 2008 3:48 PM
To: psi...@li...
Subject: [Psidev-ms-dev] MSData: C++ library for mzML reading/writing

Hi all,

I just wanted to introduce a C++ library I've been working on for handling mzML (among other things).

Some background: Our group (Spielberg Family Center for Applied Proteomics, Cedars-Sinai Medical Center) has a number of software tools for analyzing RAW and mzXML data files. This library (MSData) is the next version of our data file access layer. The MSData library implements a C++ representation of mzML.

Current implemented functionality:

- compile-time parsing of the psi-ms.obo controlled vocabulary file to generate C++ code for typesafe use of the controlled vocabulary terms
- mzML <-> MSData data structure mapping, including reading/writing mzML XML fragments from/to C++ iostreams
- diff functionality for each MSData data structure
- binary data array encoding with 64-32 bit conversions and endianization
- abstract interface to spectrum binary data, to allow lazy evaluation of binary data backed by data files

Next steps:

- mzXML reading/writing, RAW reading
- simple analysis interface for accessing scan data and common meta-data fields

I put a source package here:

http://www.sfcap.cshs.org/private

login: psidev
pass: pwiz

We will be making this library available open source, as soon as we finish wading through our legal department's bureaucracy.

Please have a look, and feel free to email me with any questions, comments, or suggestions!

Darren

Darren Kessner
Scientific Programmer
Dar...@cs...
310-423-9538

Spielberg Family Center for Applied Proteomics
Cedars-Sinai Medical Center
http://www.sfcap.cshs.org/

IMPORTANT WARNING: This message is intended for the use of the person or entity to which it is addressed and may contain information that is privileged and confidential, the disclosure of which is governed by applicable law. If the reader of this message is not the intended recipient, or the employee or agent responsible for delivering it to the intended recipient, you are hereby notified that any dissemination, distribution or copying of this information is STRICTLY PROHIBITED. If you have received this message in error, please notify us immediately by calling (310) 423-6428 and destroy the related message. Thank You for your cooperation. |
From: Kessner, D. E. <Dar...@cs...> - 2008-01-22 16:14:33
|
Hi all,

I've collected some notes regarding the mzML spec:

1) There are references in the specification document to InstrumentType, SampleType, etc. that I assume mean <instrument> element, <sample> element, etc., though this is not explicitly stated anywhere.

2) The <precursor> element has a spectrumRef attribute that is supposed to refer to the id attribute of a <spectrum>. However, the <precursor> element in tiny1.mzML0.99.1.mzML appears to refer to a scanNumber, not id. Which is the intended attribute to reference (I assume 'id')?

3) The <cv> element has the attribute fullName="Proteomics Standards Initiative Mass Spectrometry Ontology". This text does not appear in psi-ms.obo - perhaps it should? Basically, I think it would be useful to have some identifier that appears in both psi-ms.obo and in mzML files generated with that psi-ms.obo. Or even better, an id and a version, just like the <softwareParam> elements, but in the psi-ms.obo it could appear in the header.

4) Regarding <softwareParam> elements, is there a reason not to use two of the more general <cvParam> elements, one to specify the software, and one to specify the version?

5) Element reference naming consistency -- in many cases, there is an element name and a corresponding (either attribute or element) name for a reference to it:

    <instrument>  <-- instrumentRef
    <sourceFile>  <-- sourceFileRef
    <spectrum>    <-- spectrumRef

But there are a few exceptions:

    <referenceableParamGroup>  <-- paramGroupRef
    <software>                 <-- softwareRef AND instrumentSoftwareRef

Suggestions:

    Replace <referenceableParamGroup> with <paramGroup>
    Remove <instrumentSoftwareRef> and use <softwareRef>

Since the id attribute is usually used for references, we could also have:

    <cv id="MS" ... >
    ...
    <cvParam cvRef="MS" ...>

There is also some redundancy in the naming of <sourceFile> attributes:

    <sourceFile id="1" sourceFileName="tiny1.RAW" sourceFileLocation="file://F:/data/Exp01" >

could be shortened to:

    <sourceFile id="1" name="tiny1.RAW" location="file://F:/data/Exp01" >

Darren

Darren Kessner
Scientific Programmer
Dar...@cs...
310-423-9538

Spielberg Family Center for Applied Proteomics
Cedars-Sinai Medical Center
http://www.sfcap.cshs.org/ |
From: Kessner, D. E. <Dar...@cs...> - 2008-01-21 23:48:34
|
Hi all,

I just wanted to introduce a C++ library I've been working on for handling mzML (among other things).

Some background: Our group (Spielberg Family Center for Applied Proteomics, Cedars-Sinai Medical Center) has a number of software tools for analyzing RAW and mzXML data files. This library (MSData) is the next version of our data file access layer. The MSData library implements a C++ representation of mzML.

Current implemented functionality:

- compile-time parsing of the psi-ms.obo controlled vocabulary file to generate C++ code for typesafe use of the controlled vocabulary terms
- mzML <-> MSData data structure mapping, including reading/writing mzML XML fragments from/to C++ iostreams
- diff functionality for each MSData data structure
- binary data array encoding with 64-32 bit conversions and endianization
- abstract interface to spectrum binary data, to allow lazy evaluation of binary data backed by data files

Next steps:

- mzXML reading/writing, RAW reading
- simple analysis interface for accessing scan data and common meta-data fields

I put a source package here:

http://www.sfcap.cshs.org/private

login: psidev
pass: pwiz

We will be making this library available open source, as soon as we finish wading through our legal department's bureaucracy.

Please have a look, and feel free to email me with any questions, comments, or suggestions!

Darren

Darren Kessner
Scientific Programmer
Dar...@cs...
310-423-9538

Spielberg Family Center for Applied Proteomics
Cedars-Sinai Medical Center
http://www.sfcap.cshs.org/ |
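The "binary data array encoding with 64-32 bit conversions and endianization" item can be illustrated with a minimal sketch. This is not MSData's code: the names `toFloat32`, `hostIsLittleEndian`, `encode` and `demo` are hypothetical, IEEE 754 floats are assumed, and the base64 step that mzML applies to binary arrays is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Narrow 64-bit samples to 32-bit floats (lossy: precision is reduced).
std::vector<float> toFloat32(const std::vector<double>& in) {
    std::vector<float> out;
    out.reserve(in.size());
    for (double d : in) out.push_back(static_cast<float>(d));
    return out;
}

// Detect the byte order of the machine running the code.
bool hostIsLittleEndian() {
    const std::uint16_t probe = 1;
    unsigned char first = 0;
    std::memcpy(&first, &probe, 1);
    return first == 1;
}

// Copy values into a byte buffer in the requested byte order,
// reversing each element's bytes only when the host order differs.
template <typename T>
std::vector<unsigned char> encode(const std::vector<T>& values, bool bigEndian) {
    const bool swap = (bigEndian == hostIsLittleEndian());
    std::vector<unsigned char> out(values.size() * sizeof(T));
    for (std::size_t i = 0; i < values.size(); ++i) {
        unsigned char* p = out.data() + i * sizeof(T);
        std::memcpy(p, &values[i], sizeof(T));
        if (swap) std::reverse(p, p + sizeof(T));
    }
    return out;
}

// Usage: 1.0f is 0x3F800000 in IEEE 754, so its big-endian bytes are 3F 80 00 00.
void demo() {
    const std::vector<float> narrowed = toFloat32({1.0, 2.5});
    const std::vector<unsigned char> bytes = encode(narrowed, true);
    assert(bytes.size() == 8);
    assert(bytes[0] == 0x3F && bytes[1] == 0x80 && bytes[2] == 0x00 && bytes[3] == 0x00);
}
```

Working on raw bytes rather than on swapped `float` values avoids ever holding a byte-reversed bit pattern in a floating-point variable.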
From: Brian P. <bri...@in...> - 2008-01-15 18:12:43
|
Why not just define an INSTRUMENT_UNKNOWN value so readers can know that you don't know?

Brian

-----Original Message-----
From: psi...@li... [mailto:psi...@li...] On Behalf Of Joshua Tasman
Sent: Tuesday, January 15, 2008 10:06 AM
To: Mass spectrometry standard development
Subject: Re: [Psidev-ms-dev] Instrument info required

Good point, Rune. I wrote that code and am still looking for a solution. Many of the other converters have similar issues. If it's a requirement, users will need to specify a parameter at conversion time; if not, I can change the default to leaving it blank instead of using a dummy value.

Josh

Rune Schjellerup Philosof wrote:
> Hello all
>
> I think it is a problem that it is required to specify an instrument.
> Generally you should be careful about requiring too much information:
> if the information isn't readily available, incorrect dummy values will
> probably be put in its place.
> Instrument information is not always available, for instance with
> Waters files.
> Masswolf has no way of getting the instrument model, as can be seen
> from the excerpt below from MassLynxInterface.cpp, revision 2441:
>
>     // TODO: set this correctly
>     // set dummy instrument info:
>     instrumentInfo_.instrumentModel_ = Q_TOF_MICRO;
>     instrumentInfo_.instrumentName_ = "Q-TOF Micro";
>     instrumentInfo_.ionSource_ = ESI;
>     instrumentInfo_.analyzerList_.push_back(TOFMS);
>     instrumentInfo_.detector_ = DETECTOR_UNDEF;
>
> --
> Best regards,
> Rune Schjellerup Philosof

_______________________________________________
Psidev-ms-dev mailing list
Psi...@li...
https://lists.sourceforge.net/lists/listinfo/psidev-ms-dev |
From: Joshua T. <jt...@sy...> - 2008-01-15 18:05:49
|
Good point, Rune. I wrote that code and am still looking for a solution. Many of the other converters have similar issues. If it's a requirement, users will need to specify a parameter at conversion time; if not, I can change the default to leaving it blank instead of using a dummy value.

Josh

Rune Schjellerup Philosof wrote:
> Hello all
>
> I think it is a problem that it is required to specify an instrument.
> Generally you should be careful about requiring too much information:
> if the information isn't readily available, incorrect dummy values will
> probably be put in its place.
> Instrument information is not always available, for instance with
> Waters files.
> Masswolf has no way of getting the instrument model, as can be seen
> from the excerpt below from MassLynxInterface.cpp, revision 2441:
>
>     // TODO: set this correctly
>     // set dummy instrument info:
>     instrumentInfo_.instrumentModel_ = Q_TOF_MICRO;
>     instrumentInfo_.instrumentName_ = "Q-TOF Micro";
>     instrumentInfo_.ionSource_ = ESI;
>     instrumentInfo_.analyzerList_.push_back(TOFMS);
>     instrumentInfo_.detector_ = DETECTOR_UNDEF;
>
> --
> Best regards,
> Rune Schjellerup Philosof |
From: Rune S. P. <mai...@ph...> - 2008-01-15 10:43:16
|
Hello all

I think it is a problem that it is required to specify an instrument. Generally you should be careful about requiring too much information: if the information isn't readily available, incorrect dummy values will probably be put in its place.

Instrument information is not always available, for instance with Waters files. Masswolf has no way of getting the instrument model, as can be seen from the excerpt below from MassLynxInterface.cpp, revision 2441:

    // TODO: set this correctly
    // set dummy instrument info:
    instrumentInfo_.instrumentModel_ = Q_TOF_MICRO;
    instrumentInfo_.instrumentName_ = "Q-TOF Micro";
    instrumentInfo_.ionSource_ = ESI;
    instrumentInfo_.analyzerList_.push_back(TOFMS);
    instrumentInfo_.detector_ = DETECTOR_UNDEF;

--
Best regards,
Rune Schjellerup Philosof |
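One way to avoid the dummy values in the excerpt is an explicit "unknown" value, along the lines of the INSTRUMENT_UNKNOWN idea raised in the replies. The following is only a sketch under assumptions: the enum, the `InstrumentInfo` struct, and the `describe` and `demo` functions are hypothetical and are not taken from MassLynxInterface.cpp; only the names Q_TOF_MICRO and "Q-TOF Micro" come from the excerpt.

```cpp
#include <cassert>
#include <string>

// Hypothetical enum: INSTRUMENT_UNKNOWN is the value proposed in the thread,
// Q_TOF_MICRO is the only model named in the excerpt.
enum InstrumentModel { INSTRUMENT_UNKNOWN, Q_TOF_MICRO };

// Hypothetical record: the model stays "unknown" unless a converter sets it.
struct InstrumentInfo {
    InstrumentModel model;
    std::string name;
    InstrumentInfo() : model(INSTRUMENT_UNKNOWN) {}
};

// Readers can tell "not known" apart from a real model.
std::string describe(const InstrumentInfo& info) {
    if (info.model == INSTRUMENT_UNKNOWN) return "instrument model: unknown";
    return "instrument model: " + info.name;
}

// Usage: a converter that cannot read the model leaves the default in place.
void demo() {
    InstrumentInfo fromVendorFile;
    assert(describe(fromVendorFile) == "instrument model: unknown");

    InstrumentInfo known;
    known.model = Q_TOF_MICRO;
    known.name = "Q-TOF Micro";
    assert(describe(known) == "instrument model: Q-TOF Micro");
}
```

The default constructor makes "unknown" the state a converter gets when it does nothing, so a wrong model can only appear if code sets one on purpose.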
From: sneumann <sne...@ip...> - 2008-01-15 09:36:27
|
Hi,

We've just experienced a problem with an mzData converter, and that kind of bug might easily slip into mzML converters, so we'd like to mention that here.

We have a PC setup for our Qstar with Analyst and wiff2mzData converter, which works fine. Recently we set up a new PC, which now has German settings as default. Guess what, in the XML we find

    <spectrumInstrument msLevel="1" mzRangeStart="75,00" mzRangeStop="1000,00">
      <cvParam cvLabel="psi" accession="PSI:1000038" name="TimeInMinutes" value="0,033" />

In the mzdata.xsd value is a string, so (syntactically) 0,033 is fine:

    <xs:attribute name="value" type="xs:string" use="optional">

and mzRange is defined as float:

    <xs:attribute name="mzRangeStart" type="xs:float" use="optional"/>

The character encoding is defined in the first line of the XML files,

    <?xml version="1.0" encoding="utf-8"?>

but how are different locales (supposed to be) handled?

I know this is a very implementation-related question, but I'd suggest to at least mention this kind of issue in the documentation.

Yours,
Steffen

--
IPB Halle                    AG Massenspektrometrie & Bioinformatik
Dr. Steffen Neumann          http://www.IPB-Halle.DE
Weinberg 3                   http://msbi.bic-gh.de
06120 Halle                  Tel. +49 (0) 345 5582 - 1470
                                  +49 (0) 345 5582 - 0
sneumann(at)IPB-Halle.DE     Fax. +49 (0) 345 5582 - 1409 |
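A converter can guard against this by formatting and parsing numbers with a fixed locale instead of the system default. The sketch below is an illustration only, not code from wiff2mzData or any other converter mentioned here; `formatInvariant`, `parseInvariant` and `demo` are hypothetical names. It relies on `std::locale::classic()`, the standard "C" locale, which always uses '.' as the decimal point.

```cpp
#include <cassert>
#include <iomanip>
#include <locale>
#include <sstream>
#include <string>

// Format a value with '.' as decimal separator, whatever the global locale is.
std::string formatInvariant(double value, int decimals) {
    std::ostringstream out;
    out.imbue(std::locale::classic());
    out << std::fixed << std::setprecision(decimals) << value;
    return out.str();
}

// Parse a value written with '.' as decimal separator.
// Returns false and leaves 'value' untouched if any text is left over,
// which is what happens with a decimal comma such as "75,00".
bool parseInvariant(const std::string& text, double& value) {
    std::istringstream in(text);
    in.imbue(std::locale::classic());
    double parsed = 0.0;
    in >> parsed;
    if (in.fail() || !in.eof()) return false;
    value = parsed;
    return true;
}

// Usage with the values from the message above.
void demo() {
    assert(formatInvariant(0.033, 3) == "0.033");
    assert(formatInvariant(1000.0, 2) == "1000.00");
    double v = 0.0;
    assert(parseInvariant("75.00", v) && v == 75.0);
    assert(!parseInvariant("75,00", v));
}
```

Imbuing each stream keeps the fix local; it does not change the process-wide locale, so other code that formats numbers through different routes would need the same care.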
From: Kessner, D. E. <Dar...@cs...> - 2007-12-07 21:51:04
|
Testing to see if my posting issue has been resolved...please disregard.

Darren |
From: Benison, T. <The...@cs...> - 2007-12-07 21:20:57
|
|
From: Rune S. P. <mai...@ph...> - 2007-12-07 09:16:00
|
Hi

It would be nice to be able to specify a range for a precursor. This could for instance be used when recording MS^E data. Actually it's also relevant for single precursor selection, although the information is rarely in the vendor raw format (as far as I know).

--
Regards
Rune |
From: Kessner, D. E. <Dar...@cs...> - 2007-11-28 23:29:49
|
<second try>

Hi all,

Eric asked me to post this to the list as a use case for mzML.

First some background - we (Parag Mallick's lab SFCAP at Cedars-Sinai) have a tool that does a recalculation of precursor m/z values for ms2 spectra by analyzing the associated FT survey scan. During this analysis, we often find multiple species in a small window around the reported precursor. We would like to report all these precursors, preferably with scores, for use downstream during the database search.

Two possibilities of encoding this information come to mind:

1) Adding the multiple precursors as additional <precursor> elements in the <precursorList>. I'm not sure if this is the intended use of the <precursorList>.

2) Adding multiple <ionSelection> elements to a single <precursor> element:

    <precursor spectrumRef="19">
      <ionSelection>
        <cvParam cvLabel="MS" accession="MS:1000040" name="m/z" value="445.34"/>
        <cvParam cvLabel="MS" accession="MS:1000041" name="charge state" value="2"/>
      </ionSelection>
      <ionSelection>
        <cvParam cvLabel="MS" accession="MS:1000040" name="m/z" value="444.00"/>
        <cvParam cvLabel="MS" accession="MS:1000041" name="charge state" value="1"/>
      </ionSelection>
      <activation>
        ...
      </activation>
    </precursor>

Adding this second <ionSelection> element causes validation to fail with the online validator.

We would like to report an associated score with each precursor m/z value, but I'm not sure what the preferred way is to do that.

    <ionSelection>
      <cvParam cvLabel="MS" accession="MS:1000040" name="m/z" value="444.00"/>
      <cvParam cvLabel="MS" accession="MS:1000041" name="charge state" value="1"/>
      <cvParam cvLabel="MS" accession="MS:9999999" name="score" value=".89"/>   <-- ????
    </ionSelection>

Darren

Darren Kessner
Scientific Programmer
Dar...@cs...
310-423-9538

Spielberg Family Center for Applied Proteomics
Cedars-Sinai Medical Center
http://www.sfcap.cshs.org/ |
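Option 2 above maps naturally onto a precursor that owns a list of ion selections. The following sketch shows one possible in-memory shape and a writer for the cvParam layout used in the message. All type and function names (`IonSelection`, `Precursor`, `toXml`, `demo`) are hypothetical, the `<activation>` element and the score are left out, and `spectrumRef` is written without XML escaping to keep the sketch short.

```cpp
#include <cassert>
#include <iomanip>
#include <locale>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical model of one candidate precursor ion.
struct IonSelection {
    double mz;
    int chargeState;
};

// Hypothetical model of <precursor>: several candidate ions share one
// activation step, so they are grouped under a single precursor.
struct Precursor {
    std::string spectrumRef;
    std::vector<IonSelection> ionSelections;
};

// Write the precursor with one <ionSelection> per candidate ion.
std::string toXml(const Precursor& p) {
    std::ostringstream out;
    out.imbue(std::locale::classic());          // always '.' as decimal point
    out << std::fixed << std::setprecision(2);  // two decimals, e.g. 444.00
    out << "<precursor spectrumRef=\"" << p.spectrumRef << "\">\n";
    for (const IonSelection& ion : p.ionSelections) {
        out << "  <ionSelection>\n"
            << "    <cvParam cvLabel=\"MS\" accession=\"MS:1000040\" name=\"m/z\" value=\""
            << ion.mz << "\"/>\n"
            << "    <cvParam cvLabel=\"MS\" accession=\"MS:1000041\" name=\"charge state\" value=\""
            << ion.chargeState << "\"/>\n"
            << "  </ionSelection>\n";
    }
    out << "</precursor>\n";
    return out.str();
}

// Usage with the two ions from the example above.
void demo() {
    Precursor p;
    p.spectrumRef = "19";
    p.ionSelections.push_back(IonSelection{445.34, 2});
    p.ionSelections.push_back(IonSelection{444.00, 1});
    const std::string xml = toXml(p);
    assert(xml.find("name=\"m/z\" value=\"445.34\"") != std::string::npos);
    assert(xml.find("name=\"m/z\" value=\"444.00\"") != std::string::npos);
}
```

Keeping the ions in a `std::vector` preserves their order, which leaves room for either reading of the list (ranked candidates or an unordered set) without changing the structure.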
From: Joshua T. <jt...@sy...> - 2007-11-27 22:32:03
|
Hi all,

The ISB-produced prototype Thermo and Waters mzML converters have been updated with some small changes to produce valid mzML 0.99.1 output.

http://tools.proteomecenter.org/software/mzMLKit/mzMLKit-0.99.1-Thermo.zip
http://tools.proteomecenter.org/software/mzMLKit/mzMLKit-0.99.1-Waters.zip

(Also referenced from http://www.psidev.info/index.php?q=node/257)

As always, comments welcome.

Josh |
From: Matthew C. <mat...@va...> - 2007-11-27 16:22:56
|
I like this approach, and a "score" attribute/element should probably be a userParam instead of a cvParam. Precursor scoring is too far away from basic peak processing to justify inclusion in the mzML CV, I think. The m/z attribute should be required; the intensity and charge state attributes should be optional.

-Matt

Fredrik Levander wrote:
> Since I cannot see the scenario when someone is using the precursor
> element without giving an m/z value, I definitely agree that m/z and
> charge state should be promoted to schema attributes or elements
> (non-required though). Intensity too, even if one could argue that an
> intensity unit would be appropriate. The less verbose option with
> attributes should work fine, or are the CV accession numbers needed
> (these could be in the documentation)? Furthermore, multiple
> ionSelection elements will have to be allowed. Currently only one
> ionSelection element is allowed. But it makes more sense to repeat
> 'ionSelection' than 'precursor', since the activation parameters are the
> same. The result could be something like this:
>
>     <precursor spectrumRef="12">
>       <ionSelection mz="123.232" intensity="1034" charge_state="1" />
>       <ionSelection mz="123.263" intensity="534" charge_state="2" />
>       <ionSelection mz="124.784" intensity="739" />
>       <activation>
>         <cvParam cvLabel="MS" accession="MS:1000133" name="collision-induced dissociation" value=""/>
>         <cvParam cvLabel="MS" accession="MS:1000045" name="collision energy" value="26.00" unitAccession="MS:1000137" unitName="Electron Volt"/>
>       </activation>
>     </precursor>
>
> However, the current description of the ionSelection element indicates
> that 'type of ion selection' should be given there, and in this case
> that shouldn't be repeated with every detected peak. But isn't type of
> ion selection, for example 'by intensity' or 'by charge state',
> typically a global parameter?
>
> I also agree that it makes a lot of sense to produce one mzML file
> before processing and to cite this one in the processed file along with
> processing parameters as proposed.
>
> Fredrik |
From: Fredrik L. <Fre...@im...> - 2007-11-27 15:56:08
|
Since I cannot think of a scenario in which someone would use the precursor element without giving an m/z value, I definitely agree that m/z and charge state should be promoted to schema attributes or elements (non-required, though). Intensity too, even if one could argue that an intensity unit would be appropriate. The less verbose option with attributes should work fine, or are the CV accession numbers needed? (These could be in the documentation.)

Furthermore, multiple ionSelection elements will have to be allowed. Currently only one ionSelection element is allowed. But it makes more sense to repeat 'ionSelection' than 'precursor', since the activation parameters are the same. The result could be something like this:

<precursor spectrumRef="12">
  <ionSelection mz="123.232" intensity="1034" charge_state="1" />
  <ionSelection mz="123.263" intensity="534" charge_state="2" />
  <ionSelection mz="124.784" intensity="739" />
  <activation>
    <cvParam cvLabel="MS" accession="MS:1000133" name="collision-induced dissociation" value=""/>
    <cvParam cvLabel="MS" accession="MS:1000045" name="collision energy" value="26.00" unitAccession="MS:1000137" unitName="Electron Volt"/>
  </activation>
</precursor>

However, the current description of the ionSelection element indicates that 'type of ion selection' should be given there, and in this case that shouldn't be repeated with every detected peak. But isn't the type of ion selection, for example 'by intensity' or 'by charge state', typically a global parameter?

I also agree that it makes a lot of sense to produce one mzML file before processing and to cite this one in the processed file along with processing parameters, as proposed.

Fredrik

Angel Pizarro wrote:
> OK, so we are all agreed that spectral processing is in the domain of
> mzML, so let's talk structure, specifically the whole precursor as a
> list assumption that we have been using.
>
> Does a processing step automatically mean a new mzML file that cites
> the source file and the software (with parameters) that produced the
> current file ? IMHO this is the best option.
>
> I actually think Darren's original proposal of multiple ionSelection
> tags is a good one. One question: are these ionSelection tags an
> ordered set or a bag? Also I would vote that this is a place in the
> model to promote some cvParams to named tags or attributes:
>
> <ionSelection mz="123.232" charge_state="1" />
>
> or
>
> <ionSelection >
>   <mz cvLabel="MS" accession="MS:1000040" value="444.00" />
>   <chargeState cvLabel="MS" accession="MS:1000041" value="2"/>
> </ionSelection>
>
> Also about that "score" cvParam, it is probably the best to put the
> value along with the ionSelection as a cvParam, but the topic leads to
> our previous discussion of the CV and how one where to get new leaves
> into the CV as methods are developed (and what does this do to the
> parsers ...) Big can o' worms and should be discussed in a separate
> thread I maybe?
>
> -angel
|
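To make the attribute-based proposal above concrete, here is a minimal Python sketch of how a reader might consume it. It assumes the hypothetical layout from Fredrik's example (an `ionSelection` element carrying `mz`, `intensity` and `charge_state` attributes, with intensity and charge optional); that layout is only a suggestion from this thread and not part of any released mzML schema. The function name `read_ion_selections` is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment following the attribute-based proposal in this thread.
# It is an illustration only and is not valid against a released mzML schema.
FRAGMENT = """
<precursor spectrumRef="12">
  <ionSelection mz="123.232" intensity="1034" charge_state="1" />
  <ionSelection mz="123.263" intensity="534" charge_state="2" />
  <ionSelection mz="124.784" intensity="739" />
  <activation>
    <cvParam cvLabel="MS" accession="MS:1000133"
             name="collision-induced dissociation" value=""/>
  </activation>
</precursor>
"""


def read_ion_selections(xml_text):
    """Return one dict per ionSelection; missing optional attributes become None."""
    root = ET.fromstring(xml_text)
    selections = []
    for node in root.findall("ionSelection"):
        intensity = node.get("intensity")
        charge = node.get("charge_state")
        selections.append({
            "mz": float(node.get("mz")),
            "intensity": float(intensity) if intensity is not None else None,
            "charge": int(charge) if charge is not None else None,
        })
    return selections


if __name__ == "__main__":
    for selection in read_ion_selections(FRAGMENT):
        print(selection)
```

Because the elements are read in document order, the result is an ordered list; whether the order carries meaning (ordered set versus bag) is exactly the open question raised in the thread.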
From: Angel P. <an...@ma...> - 2007-11-27 14:17:58
|
OK, so we are all agreed that spectral processing is in the domain of mzML, so let's talk structure, specifically the whole "precursor as a list" assumption that we have been using.

Does a processing step automatically mean a new mzML file that cites the source file and the software (with parameters) that produced the current file? IMHO this is the best option.

I actually think Darren's original proposal of multiple ionSelection tags is a good one. One question: are these ionSelection tags an ordered set or a bag? Also I would vote that this is a place in the model to promote some cvParams to named tags or attributes:

<ionSelection mz="123.232" charge_state="1" />

or

<ionSelection >
  <mz cvLabel="MS" accession="MS:1000040" value="444.00" />
  <chargeState cvLabel="MS" accession="MS:1000041" value="2"/>
</ionSelection>

Also, about that "score" cvParam: it is probably best to put the value along with the ionSelection as a cvParam, but the topic leads to our previous discussion of the CV and how one would get new leaves into the CV as methods are developed (and what does this do to the parsers ...). Big can o' worms, and it should maybe be discussed in a separate thread?

-angel

On Nov 27, 2007 5:01 AM, Eric Deutsch <ede...@sy...> wrote:
> I also agree with these sentiments. Although the primary intended use
> case for mzML is a faithful, open representation of what was produced by
> the instrument control software, in practice mzML should be a convenient
> vehicle for feeding data into search engines. If mzML does not capture
> the kind of data and information that users want to feed into their
> search engine, then those people will continue to use mzXML for this
> purpose and I don't think we want to encourage that.
>
> I could easily see a workflow where one could use a "converter" program
> to convert 123.RAW to 123.mzML without any conversion flags on and then
> also convert 123.RAW to 123_proc.mzML, wherein the spectra have been
> centroided, deisotoped, more accurate precursor masses calculated, etc.
> with proper documentation of these modifications in the header. It would
> then be 123_proc.mzML that was fed into the search engine and used for
> all downstream analysis. If we don't facilitate something like this, we
> won't have solved the multiple format problem.
>
> Eric

--
Angel Pizarro
Director, Bioinformatics Facility
Institute for Translational Medicine and Therapeutics
University of Pennsylvania
806 BRB II/III
421 Curie Blvd.
Philadelphia, PA 19104-6160

P: 215-573-3736
F: 215-573-9004 |
From: Eric D. <ede...@sy...> - 2007-11-27 10:01:37
|
I also agree with these sentiments. Although the primary intended use case for mzML is a faithful, open representation of what was produced by the instrument control software, in practice mzML should be a convenient vehicle for feeding data into search engines. If mzML does not capture the kind of data and information that users want to feed into their search engine, then those people will continue to use mzXML for this purpose and I don't think we want to encourage that.

I could easily see a workflow where one could use a "converter" program to convert 123.RAW to 123.mzML without any conversion flags on and then also convert 123.RAW to 123_proc.mzML, wherein the spectra have been centroided, deisotoped, more accurate precursor masses calculated, etc. with proper documentation of these modifications in the header. It would then be 123_proc.mzML that was fed into the search engine and used for all downstream analysis. If we don't facilitate something like this, we won't have solved the multiple format problem.

Eric

> From: psi...@li... [mailto:psidev-ms-dev-bo...@li...] On Behalf Of Matthew Chambers
> Sent: Monday, November 26, 2007 2:28 PM
> To: Mass spectrometry standard development
> Subject: Re: [Psidev-ms-dev] multiple precursor reporting
>
> Indeed, we must be careful where we draw the line. We should keep in
> mind that centroiding, charge deconvolution, and deisotoping are
> certainly kinds of peak processing and they are already integrated into
> all 3 Mass Spec XML formats.
>
> -Matt
>
> Darren Kessner wrote:
> > In regard to mzML, I expect that this sort of preprocessing will
> > become part of the normal data pipeline, so I hope that a hard line is
> > not drawn between acquisition data and analysis in the definition of
> > the format.
> >
> > Darren
|
From: Matthew C. <mat...@va...> - 2007-11-26 22:28:02
|
Indeed, we must be careful where we draw the line. We should keep in mind that centroiding, charge deconvolution, and deisotoping are certainly kinds of peak processing and they are already integrated into all 3 Mass Spec XML formats. -Matt Darren Kessner wrote: > > > In regard to mzML, I expect that this sort of preprocessing will > become part of the normal data pipeline, so I hope that a hard line is > not drawn between acquisition data and analysis in the definition of > the format. > > > Darren > |
From: Darren K. <dke...@ya...> - 2007-11-26 22:14:55
|
(Sorry -- my work email is having issues posting to this list) Just to flesh some things out: 1) ReAdW (the ISB Thermo RAW->mzXML conversion program) does do some minimal analysis in reporting the precursor m/z. In addition, Thermo's software itself does thresholding and possibly some other preprocessing when binary data arrays in the RAW file are accessed. 2) In our preprocessing, we take the original RAW or converted mzXML file, and produce a preprocessed mzXML file, with better results in our downstream analysis. In regard to mzML, I expect that this sort of preprocessing will become part of the normal data pipeline, so I hope that a hard line is not drawn between acquisition data and analysis in the definition of the format. Darren |
From: Matthew C. <mat...@va...> - 2007-11-26 20:47:32
|
That's a neat idea, Fredrik, but it would mean that MS/MS search engines would have to parse any MS1 that led to an MS2 (and somehow cache it if the MS1 led to multiple MS2s), which is even worse than jumping around the file to find the MS1. The charge state array is interesting too, but as I understand it, charge state determination is even more ambiguous than monoisotopic peak determination if the selection window is crowded. And although your approach would get rid of the need to store anything with the MS2 but the actual selection window itself, some processors might want to give the individual peaks in the selection window multiple charge states due to uncertainty in the determination.

-Matt

Fredrik Levander wrote:
> This is certainly an interesting case, and the mzML format is not
> perfect for it. To separate acquisition parameters from analysis
> results, one (xsd valid ) possibility would be that the ionSelection
> element is used for representing only the selection window. In the
> precursor element there is a direct reference to the MS scan that was
> used for the ion selection. This referenced spectrum can in the mzML
> file be represented as a peak list with all the peaks and their accurate
> masses in the selection window. One could imagine three binary arrays,
> with m/z, intensity and charge, respectively. This would clearly
> distinguish the machine selection window and the post-run peak analysis.
> However, this is not an optimal solution for search engines which would
> have to jump around in the file to find the precursor masses to consider.
>
> Fredrik
|
From: Fredrik L. <Fre...@im...> - 2007-11-26 20:34:51
|
This is certainly an interesting case, and the mzML format is not perfect for it. To separate acquisition parameters from analysis results, one (XSD-valid) possibility would be to use the ionSelection element for representing only the selection window. In the precursor element there is a direct reference to the MS scan that was used for the ion selection. This referenced spectrum can be represented in the mzML file as a peak list with all the peaks and their accurate masses in the selection window. One could imagine three binary arrays, with m/z, intensity and charge, respectively. This would clearly distinguish the machine selection window from the post-run peak analysis. However, this is not an optimal solution for search engines, which would have to jump around in the file to find the precursor masses to consider.

Fredrik

Matthew Chambers wrote:
> I think most writers of the nearest-instrument-format mzML will have
> done some processing step(s) to determine the precursor mass to write
> into the file. To take the Thermo RAW format as an example, it seems to
> only be able to store one precursor m/z and charge state per MS2, but
> since MS2s are fragmented from a relatively wide m/z isolation window
> (e.g. +/- 2.5 Da/z), it's entirely possible in a complex sample for a
> MS2 to be a multiplex (aka chimeric) spectrum in that it represents the
> fragmentation of multiple precursors of different masses and charge
> states. So storing one precursor for such a spectrum is utterly
> inadequate if high mass accuracy is used and expected. Add to that the
> fact that Thermo's processing step(s) are not optimal (in other words:
> Thermo's monoisotopic mass determination and charge state determination
> leave room for improvement).
>
> You may be aware that it is common practice in shotgun proteomics to
> determine if an MS2+ came from a singly or multiply charged precursor,
> and if multiply charged, to treat it as both +2 and +3 (and on some
> data, even higher charge states). With higher mass accuracy instruments
> coming online and in the absence of better precursor estimation from the
> vendor software, it will be increasingly common practice to treat a scan
> as coming from multiple precursor masses, not just precursor charge
> states. The multiple precursor masses can be due to uncertain isotopic
> variants in the precursor's isotopic envelope and/or due to multiple
> precursor species in the same isolation window.
>
> I too would like your take on how to represent this in mzML.
>
> Thanks,
> Matt
|
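Fredrik's suggestion (keep only the selection window with the MS2, and look up candidate precursors in the referenced survey scan) can be sketched in a few lines of Python. This is an illustration under assumptions: the three parallel lists stand in for decoded m/z, intensity and charge arrays of the referenced spectrum, the peak values are invented, and `peaks_in_window` is a hypothetical helper, not part of any mzML library.

```python
def peaks_in_window(mz_array, intensity_array, charge_array, center, half_width):
    """Return (m/z, intensity, charge) tuples for peaks inside the selection window.

    The three arrays are assumed to be parallel and already decoded from the
    survey (MS1) spectrum that the precursor element refers to.
    """
    low = center - half_width
    high = center + half_width
    return [
        (mz, intensity, charge)
        for mz, intensity, charge in zip(mz_array, intensity_array, charge_array)
        if low <= mz <= high
    ]


if __name__ == "__main__":
    # Made-up survey-scan peaks; 0 marks an undetermined charge state.
    mz = [440.10, 444.00, 445.34, 450.20]
    intensity = [1200.0, 534.0, 1034.0, 90.0]
    charge = [1, 1, 2, 0]

    # Selection window of +/- 2.5 around 445.0, as in the Thermo example.
    for peak in peaks_in_window(mz, intensity, charge, center=445.0, half_width=2.5):
        print(peak)
```

The sketch also shows the cost Matt points out: a search engine would need the survey spectrum in hand (or cached) for every MS2 it processes, and a single charge value per peak leaves no room for reporting several possible charge states.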
From: Matthew C. <mat...@va...> - 2007-11-26 20:05:59
|
I think most writers of the nearest-instrument-format mzML will have done some processing step(s) to determine the precursor mass to write into the file. To take the Thermo RAW format as an example, it seems to only be able to store one precursor m/z and charge state per MS2, but since MS2s are fragmented from a relatively wide m/z isolation window (e.g. +/- 2.5 Da/z), it's entirely possible in a complex sample for a MS2 to be a multiplex (aka chimeric) spectrum in that it represents the fragmentation of multiple precursors of different masses and charge states. So storing one precursor for such a spectrum is utterly inadequate if high mass accuracy is used and expected. Add to that the fact that Thermo's processing step(s) are not optimal (in other words: Thermo's monoisotopic mass determination and charge state determination leave room for improvement).

You may be aware that it is common practice in shotgun proteomics to determine if an MS2+ came from a singly or multiply charged precursor, and if multiply charged, to treat it as both +2 and +3 (and on some data, even higher charge states). With higher mass accuracy instruments coming online and in the absence of better precursor estimation from the vendor software, it will be increasingly common practice to treat a scan as coming from multiple precursor masses, not just precursor charge states. The multiple precursor masses can be due to uncertain isotopic variants in the precursor's isotopic envelope and/or due to multiple precursor species in the same isolation window.

I too would like your take on how to represent this in mzML.

Thanks,
Matt

Angel Pizarro wrote:
> Interesting. Here is how understand matters (keep in mind I don't
> actually perform experiments)
>
> <precursorList> I thought was a list of selection windows for a MS^n
> experiment. In other words a MS2 would only have one precursor
> selection window, MS3 would have two, etc. etc.
>
> The experiment as described actually sounds to me like there is a FT
> MS1 scan independent of the selection window for the MS2 spectrum.
> You then run an analysis to determine a more accurate precursor for
> the MS2 spectra, making a set of relationships and assigning a score
> to the likely candidates.
>
> That to me sounds like a processing step or analysis and not part of
> the data acquisition experiment. So before we start discussing
> structure, is this use (e.g. spectral processing) a role of the mzML
> format? And if so is this encoded as a new file, or as part of an
> export from the original data format ( e.g. should the intermediate
> original format to mzml be output in the process) ?
>
> -angel
|
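The practice Matt describes, treating one multiply charged precursor as both +2 and +3, amounts to converting a single observed m/z into several candidate neutral masses. Below is a small Python sketch of that arithmetic. It assumes positive-mode ions carrying one proton per charge and uses an approximate proton mass; the function names are hypothetical and the m/z value is borrowed from the example elsewhere in this thread.

```python
PROTON_MASS = 1.007276  # Da, approximate; assumes protonated positive-mode ions


def neutral_mass(mz, charge):
    """Neutral monoisotopic mass implied by an observed m/z and an assumed charge."""
    return charge * (mz - PROTON_MASS)


def candidate_masses(mz, charges=(2, 3)):
    """Enumerate (charge, neutral mass) pairs a search engine might try."""
    return [(z, neutral_mass(mz, z)) for z in charges]


if __name__ == "__main__":
    for z, mass in candidate_masses(445.34):
        print(f"charge +{z}: neutral mass {mass:.4f} Da")
```

Reporting several precursor m/z values per MS2, as discussed in this thread, would multiply this list again: each (m/z, charge) combination becomes one candidate for the database search.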
From: Angel P. <an...@ma...> - 2007-11-26 19:18:33
|
Interesting. Here is how I understand matters (keep in mind I don't actually perform experiments):

<precursorList> I thought was a list of selection windows for a MS^n experiment. In other words, an MS2 would only have one precursor selection window, an MS3 would have two, etc.

The experiment as described actually sounds to me like there is an FT MS1 scan independent of the selection window for the MS2 spectrum. You then run an analysis to determine a more accurate precursor for the MS2 spectra, making a set of relationships and assigning a score to the likely candidates.

That to me sounds like a processing step or analysis and not part of the data acquisition experiment. So before we start discussing structure, is this use (e.g. spectral processing) a role of the mzML format? And if so, is this encoded as a new file, or as part of an export from the original data format (e.g. should the intermediate original format to mzML be output in the process)?

-angel

On Nov 26, 2007 2:01 PM, Eric Deutsch <ede...@sy...> wrote:
> [Posted on behalf of Darren Kessner]
>
> Hi all,
>
> First some background - we (Parag Mallick's lab SFCAP at Cedars-Sinai)
> have a tool that does a recalculation of precursor m/z values for ms2
> spectra by analyzing the associated FT survey scan. During this analysis,
> we often find multiple species in a small window around the reported
> precursor. We would like to report all these precursors, preferably with
> scores, for use downstream during the database search.
>
> Two possibilities of encoding this information come to mind:
>
> 1) Adding the multiple precursors as additional <precursor> elements
> in the <precursorList>. I'm not sure if this is the intended use of the
> <precursorList>.
>
> 2) Adding multiple <ionSelection> elements to a single <precursor>
> element:
>
> <precursor spectrumRef="19">
>   <ionSelection>
>     <cvParam cvLabel="MS" accession="MS:1000040" name="m/z" value="445.34"/>
>     <cvParam cvLabel="MS" accession="MS:1000041" name="charge state" value="2"/>
>   </ionSelection>
>   <ionSelection>
>     <cvParam cvLabel="MS" accession="MS:1000040" name="m/z" value="444.00"/>
>     <cvParam cvLabel="MS" accession="MS:1000041" name="charge state" value="1"/>
>   </ionSelection>
>   <activation>
>     ...
>   </activation>
> </precursor>
>
> Adding this second <ionSelection> element causes validation to fail with
> the online validator.
>
> We would like to report an associated score with each precursor m/z value,
> but I'm not sure what the preferred way is to do that.
>
> <ionSelection>
>   <cvParam cvLabel="MS" accession="MS:1000040" name="m/z" value="444.00"/>
>   <cvParam cvLabel="MS" accession="MS:1000041" name="charge state" value="1"/>
>   <cvParam cvLabel="MS" accession="MS:9999999" name="score" value=".89"/> <-- ????
> </ionSelection>
>
> Darren
>
> Darren Kessner
> Scientific Programmer
> Dar...@cs...
> 310-423-9538
>
> Spielberg Family Center for Applied Proteomics
> Cedars-Sinai Medical Center
> http://www.sfcap.cshs.org/

--
Angel Pizarro
Director, Bioinformatics Facility
Institute for Translational Medicine and Therapeutics
University of Pennsylvania
806 BRB II/III
421 Curie Blvd.
Philadelphia, PA 19104-6160

P: 215-573-3736
F: 215-573-9004 |
From: Eric D. <ede...@sy...> - 2007-11-26 19:01:36
|
[Posted on behalf of Darren Kessner]

Hi all,

First some background: we (Parag Mallick's lab, SFCAP at Cedars-Sinai)
have a tool that recalculates precursor m/z values for ms2 spectra by
analyzing the associated FT survey scan. During this analysis, we often
find multiple species in a small window around the reported precursor.
We would like to report all of these precursors, preferably with scores,
for use downstream during the database search.

Two possibilities for encoding this information come to mind:

1) Adding the multiple precursors as additional <precursor> elements in
   the <precursorList>. I'm not sure if this is the intended use of the
   <precursorList>.

2) Adding multiple <ionSelection> elements to a single <precursor>
   element:

    <precursor spectrumRef="19">
      <ionSelection>
        <cvParam cvLabel="MS" accession="MS:1000040" name="m/z" value="445.34"/>
        <cvParam cvLabel="MS" accession="MS:1000041" name="charge state" value="2"/>
      </ionSelection>
      <ionSelection>
        <cvParam cvLabel="MS" accession="MS:1000040" name="m/z" value="444.00"/>
        <cvParam cvLabel="MS" accession="MS:1000041" name="charge state" value="1"/>
      </ionSelection>
      <activation>
        ...
      </activation>
    </precursor>

Adding this second <ionSelection> element causes validation to fail with
the online validator.

We would also like to report an associated score with each precursor m/z
value, but I'm not sure what the preferred way to do that is:

    <ionSelection>
      <cvParam cvLabel="MS" accession="MS:1000040" name="m/z" value="444.00"/>
      <cvParam cvLabel="MS" accession="MS:1000041" name="charge state" value="1"/>
      <cvParam cvLabel="MS" accession="MS:9999999" name="score" value=".89"/>  <-- ????
    </ionSelection>

Darren

Darren Kessner
Scientific Programmer
Dar...@cs...
310-423-9538

Spielberg Family Center for Applied Proteomics
Cedars-Sinai Medical Center
http://www.sfcap.cshs.org/

IMPORTANT WARNING: This message is intended for the use of the person or
entity to which it is addressed and may contain information that is
privileged and confidential, the disclosure of which is governed by
applicable law. If the reader of this message is not the intended
recipient, or the employee or agent responsible for delivering it to the
intended recipient, you are hereby notified that any dissemination,
distribution or copying of this information is STRICTLY PROHIBITED. If
you have received this message in error, please notify us immediately by
calling (310) 423-6428 and destroy the related message. Thank You for
your cooperation.
|
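To make Darren's option 2 concrete, here is a minimal Python sketch that builds a <precursor> element with several <ionSelection> children and reads the candidates back. It is not a validated mzML fragment: Darren reports that a second <ionSelection> currently fails validation, and carrying the score in a userParam named "precursor score" is purely an assumption made here, since no CV term for a score is given in the message. The helper names and the 0.95 score are made up for illustration.

```python
import xml.etree.ElementTree as ET


def cv_param(parent, accession, name, value):
    """Append a cvParam child that uses the MS controlled-vocabulary label."""
    return ET.SubElement(parent, "cvParam", {
        "cvLabel": "MS",
        "accession": accession,
        "name": name,
        "value": str(value),
    })


def build_precursor(spectrum_ref, candidates):
    """Build one <precursor> holding an <ionSelection> per (mz, charge, score)."""
    precursor = ET.Element("precursor", {"spectrumRef": str(spectrum_ref)})
    for mz, charge, score in candidates:
        selection = ET.SubElement(precursor, "ionSelection")
        cv_param(selection, "MS:1000040", "m/z", f"{mz:.2f}")
        cv_param(selection, "MS:1000041", "charge state", charge)
        # Assumption: the score travels as a userParam with a hypothetical
        # name, because the thread does not settle on a CV accession for it.
        ET.SubElement(selection, "userParam",
                      {"name": "precursor score", "value": f"{score:.2f}"})
    return precursor


def read_candidates(precursor):
    """Return the (mz, charge, score) tuples stored under a <precursor>."""
    result = []
    for selection in precursor.findall("ionSelection"):
        values = {param.get("name"): param.get("value") for param in selection}
        result.append((float(values["m/z"]),
                       int(values["charge state"]),
                       float(values["precursor score"])))
    return result


# Illustrative values: m/z and charge follow the message, 0.95 is invented.
precursor = build_precursor(19, [(445.34, 2, 0.95), (444.00, 1, 0.89)])
print(ET.tostring(precursor, encoding="unicode"))
print(read_candidates(precursor))
```

A downstream search tool could then iterate over the candidates in score order instead of relying on the single instrument-reported precursor.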
From: Steffen N. <sne...@ip...> - 2007-11-24 19:57:06
|
Hi,

On Thursday, 22.11.2007, at 16:22 +0100, Toorn, H.W.P. van den (Henk) wrote:
> workflow would be: do an MS run, pick the peaks you are interested in,
> rerun the MS, use the list of picked peaks to do further fragmentation.

Just one question: isn't the MS1 data of the second file almost identical
to the first MS1-only run? If so, why would you want to keep the first
one then? Or is it more accurate if you don't have these MS2 scans in
between?

Yours,
Steffen
|
From: Matt C. <mat...@va...> - 2007-11-23 16:26:51
|
*cough cough*

This is a good example of why the standard should support multiple runs
in a single file. I can't imagine that analysisXML would be suited to
handle this unless it too supports multiple runs? And even then, the
separate mzML files would have to be externally referenced instead of
being included in the file itself, since the standards are not supposed
to overlap in purpose.

-Matt

Fredrik Levander wrote:
> Hi Henk, Randy and others,
>
> I think that normally you will produce two separate mzML files from that
> workflow. The first one will represent all the MS spectra collected in
> the first run, and the second one will contain a mixture of MS scans and
> MS/MS scans from the run that is performed with an inclusion list (pick
> list). The second file would look similar to the file:
> http://psidev.cvs.sourceforge.net/*checkout*/psidev/psi/psi-ms/mzML/instanceFile/1min.mzML
> where some spectra are MS (level one) and some MSMS (level two). The
> picked masses are found under 'precursor' for the MS/MS spectra. In
> addition, probably the complete inclusion list should be given as
> cvParams or userParams in a 'referencableParamGroup' to specify which
> peaks the instrument was programmed to look for.
>
> One could imagine that you construct a third mzML file which is
> assembled from the first two files, but I'm not sure if that is allowed
> within the standard, since only one 'run' can be specified. What would
> be the preferred way to accomplish this? analysisXML or mzML? Has anyone
> created an mzML file from multiple runs?
>
> Regards
>
> Fredrik
>
> Randy Julian wrote:
>
>> Hi Henk,
>>
>> mzML was designed for the application you described. Take a look at the
>> specification document:
>>
>> http://www.psidev.info/index.php?q=node/303
>>
>> In this public comment release, the spectrum element allows multiple
>> binary arrays to be stored. The main ones would be m/z and intensity.
>> The thought was there could be others - like picked peaks. We have
>> wrestled with allowing human readable arrays and I think the group
>> concluded they would be too confusing. There are many ways to do human
>> readable arrays and that violates the goal of minimizing 'ways to
>> represent the same thing' in the standard - a very good goal.
>>
>> This means that you will either have to encode the peak list in binary,
>> or you could use either the cvParam or userParam elements. I would
>> recommend that we adopt a standard nomenclature for picked peaks and
>> represent this in cvParams for situations where there are not too many.
>>
>> The fragmentation spectra can be stored directly and are best
>> represented in the binaryDataArray - this is what it was meant for. If
>> you have a large number of picked peaks, this binary array is also the
>> best way to store this type of data.
>>
>> As for 'fragments' of mzML, the spectrum element does have an ID
>> attribute. In theory, this means that each is uniquely identified in
>> the file and could be returned as part of a query (I'm thinking XQuery
>> style extraction from the document). While the spectrum element is not
>> self contained, it is 'identifiable' so is a candidate for a return
>> value from an XQuery or an LSID request - I don't think we have
>> gotten that far - any suggestions?
>>
>> Read through the specification and let us know if you think it's unclear
>> on how the standard could do what you want. We are at the point where
>> external readers are needed.
>>
>> Thanks,
>> Randy Julian
>>
>> -----Original Message-----
>> From: psi...@li...
>> [mailto:psi...@li...] On Behalf Of Toorn,
>> H.W.P. van den (Henk)
>> Sent: Thursday, November 22, 2007 10:23 AM
>> To: Mass spectrometry standard development
>> Subject: [Psidev-ms-dev] mzML pick file question
>>
>> Dear developers,
>>
>> I have some questions concerning the mzML format.
>> We have some collaborators who are forced to use MS-peak pick files in
>> order to target peaks for MS-MS in a later run. To be more clear, the
>> workflow would be: do an MS run, pick the peaks you are interested in,
>> rerun the MS, use the list of picked peaks to do further fragmentation.
>>
>> My questions are: would it be possible to store such picked peaks in a
>> part of the mzML file, together with the original MS-spectra and the
>> resulting MS-MS fragmentations? Are there any obvious ways that
>> fragments of the mzML files could be used as an intermediary file
>> format?
>>
>> Thanks in advance,
>> Henk van den Toorn
>>
>
|
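Randy's suggestion of storing a large picked-peak list in a binary array can be sketched in Python as follows. The sketch assumes the array content is base64-encoded, little-endian, 64-bit floats; the precision and byte order actually required should be taken from the specification document linked in the thread. The function names and all peak values are hypothetical and only illustrate the round trip.

```python
import base64
import struct


def encode_array(values, fmt="d"):
    """Pack numbers as little-endian floats and return base64 text.

    fmt is a struct type code: "d" for 64-bit, "f" for 32-bit floats.
    """
    raw = struct.pack(f"<{len(values)}{fmt}", *values)
    return base64.b64encode(raw).decode("ascii")


def decode_array(text, fmt="d"):
    """Reverse encode_array: base64 text back to a list of floats."""
    raw = base64.b64decode(text)
    count = len(raw) // struct.calcsize(f"<{fmt}")
    return list(struct.unpack(f"<{count}{fmt}", raw))


# Made-up pick list: m/z values and intensities of the selected peaks.
picked_mz = [444.00, 445.34, 612.75]
picked_intensity = [1200.0, 560.0, 98.5]

mz_text = encode_array(picked_mz)
intensity_text = encode_array(picked_intensity)

# Each string would become the text content of one binary array element.
print(mz_text)
print(decode_array(mz_text))
print(decode_array(intensity_text))
```

For a short pick list, the cvParam or userParam route Randy mentions stays human readable; the binary form mainly pays off when there are many peaks.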