From: Eric D. <ede...@sy...> - 2009-05-11 22:08:51
|
Hi everyone, the next PSI Mass Spectrometry Standards Working Group call will be Tuesday 8am PDT:

http://www.timeanddate.com/worldclock/fixedtime.html?day=12&month=5&year=2009&hour=16&min=0&sec=0&p1=136

08:00 San Francisco
11:00 New York
16:00 London
17:00 Geneva

+ Germany: 08001012079
+ Switzerland: 0800000860
+ UK: 08081095644
+ USA: 1-866-314-3683
Generic international: +44 2083222500 (UK number)

access code: 297427

Agenda:

1) mzML 1.1.0
- PAB submitted a nice World HUPO poster abstract
- What's the final with empty binary? <binary></binary> or <binary/> or??
- David Creasy's problem MSXML4.0 DOM Parser dta_example.mzML:460,3: The keyref 'Bioworks' does not resolve to a key...
- David Creasy's problem: <xs:selector xpath=".//dx:spectrum | .//dx:chromatogram ...
- Confirm that David Creasy is happy with the example file validity before full release
- Marc's two suggested "fraction" terms
- "mzML - MS:1000033 - deisotoping" saga
- Luisa has some other items. See email 2009-05-05 10:46pm PDT
- Update documentation for profile & peak list representation
- What is the official mzML 1.1 review timescale?

Apparent mzML Timeline:
- mzML 1.1.0 resubmitted 2009-03-27
- Announcement of beginning of 30 day public comment period: 2009-04-17

Only feedback we have is:
- What can be done about imzML?
- mzML 1.1.0 SRM encoding can inflate RAW->mzML 700x. Allow just chromatograms to fix this!

----
2) Need to update the mzML implementations catalog

----
3) imzML alignment

----
4) MIAPE-MS revision
- Have revised document to discuss at ASMS

----
5) TraML development
|
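On the empty-binary agenda item, it may help to note that an XML parser reports the same content for both spellings, so the choice mostly affects how the example files look. A minimal sketch using Python's standard library; the helper name `binary_payload` is made up for illustration and is not part of any mzML tool:

```python
import xml.etree.ElementTree as ET


def binary_payload(xml_fragment: str) -> str:
    """Return the text content of a <binary> element, or '' when it is empty.

    Hypothetical helper for illustration only.
    """
    element = ET.fromstring(xml_fragment)
    # ElementTree reports None for an element without text in either spelling.
    return element.text or ""


if __name__ == "__main__":
    long_form = binary_payload("<binary></binary>")
    short_form = binary_payload("<binary/>")
    print(long_form == short_form)
```

Whether a schema validator accepts both forms is a separate question that this sketch does not test.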
From: Eric D. <ede...@sy...> - 2009-05-18 20:44:50
|
Hi everyone, the next PSI Mass Spectrometry Standards Working Group call will be Tuesday 8am PDT:

http://www.timeanddate.com/worldclock/fixedtime.html?day=19&month=5&year=2009&hour=16&min=0&sec=0&p1=136

08:00 San Francisco
11:00 New York
16:00 London
17:00 Geneva

+ Germany: 08001012079
+ Switzerland: 0800000860
+ UK: 08081095644
+ USA: 1-866-314-3683
Generic international: +44 2083222500 (UK number)

access code: 297427

Agenda:

1) mzML 1.1.0
- Plans for ASMS poster
- David Creasy's problem: <xs:selector xpath=".//dx:spectrum | .//dx:chromatogram ...
- Confirm that David Creasy is happy with the example file validity before full release
- Marc's two suggested "fraction" terms
- "mzML - MS:1000033 - deisotoping" saga
- Luisa has some other items. See email 2009-05-05 10:46pm PDT
- Update documentation for profile & peak list representation
- mzML 1.1 review is officially over. Address the feedback items
- Align with imzML. Does Andreas have a new mapping file that we can merge with ours?
- mzML 1.1.0 SRM encoding can inflate RAW->mzML 700x. Allow just chromatograms to fix this!
- Minor low-level schema format items from David Creasy found during Mascot implementation
- Obsolete duplicate "spectrum title" and "neutral loss"

----
2) Need to update the mzML implementations catalog

----
4) MIAPE-MS revision
- Have revised document to discuss at ASMS

----
5) TraML development
- 0.2 has been released on web site
- Matt or Darren is implementing in ProteoWizard?
- Jim is implementing
- ISB is implementing
- ASMS poster draft
|
From: Angel P. <an...@ma...> - 2009-05-19 14:22:47
|
On Mon, May 18, 2009 at 4:44 PM, Eric Deutsch <ede...@sy...> wrote:

> Hi everyone, the next PSI Mass Spectrometry Standards Working Group call
> will be Tuesday 8am PDT:
>
> ----
> 2) Need to update the mzML implementations catalog

I am developing non-validating ActionScript libraries for both mzML and TraML. I will certainly have the mzML one done by ASMS (it already parses mzXML), but probably not the TraML one.

-angel

> ----
> 4) MIAPE-MS revision
> - Have revised document to discuss at ASMS
>
> ----
> 5) TraML development
> - 0.2 has been released on web site
> - Matt or Darren is implementing in ProteoWizard?
> - Jim is implementing
> - ISB is implementing
> - ASMS poster draft

--
Angel Pizarro
Director, ITMAT Bioinformatics Facility
806 Biological Research Building
421 Curie Blvd.
Philadelphia, PA 19104-6160
215-573-3736
|
From: Andreas R. <And...@an...> - 2009-05-19 17:32:23
|
Dear all,

unfortunately I was not able to make the conference call today.

About the alignment of imzML: We had a project meeting last week and made a few modifications to the obo file. We are implementing those right now. The mapping file is under construction as well. I will let you know as soon as we are finished.

BTW, does it really make sense to merge the mapping files (now) as long as we have two different obo files?

Cheers,
Andreas

PS: I will be at ASMS as well.

> - Update documentation for profile & peak list representation
> - mzML 1.1 review is officially over. Address the feedback items
> - Align with imzML. Does Andreas have a new mapping file that we can merge with ours?
> - mzML 1.1.0 SRM encoding can inflate RAW->mzML 700x. Allow just chromatograms to fix this!
> - Minor low-level schema format items from David Creasy found during Mascot implementation
|
From: Andreas R. <And...@an...> - 2009-05-19 17:50:04
|
Sorry, I misunderstood the mapping file request. Marc just enlightened me. We will provide a mapping file for the terms that were moved from our imzML obo to the mzML obo.

Andreas

From: Andreas Römpp [mailto:And...@an...]
Sent: Tuesday, 19 May 2009 19:32
To: Mass spectrometry standard development
Subject: Re: [Psidev-ms-dev] PSI-MSS WG Tuesday call reminder

> Dear all,
>
> unfortunately I was not able to make the conference call today.
>
> About the alignment of imzML: We had a project meeting last week and made a few modifications to the obo file. We are implementing those right now. The mapping file is under construction as well. I will let you know as soon as we are finished.
>
> BTW, does it really make sense to merge the mapping files (now) as long as we have two different obo files?
>
> Cheers,
> Andreas
>
> PS: I will be at ASMS as well.
|
From: Eric D. <ede...@sy...> - 2009-05-25 15:36:55
|
Hi everyone, the next PSI Mass Spectrometry Standards Working Group call will be Tuesday 8am PDT:

http://www.timeanddate.com/worldclock/fixedtime.html?day=19&month=5&year=2009&hour=16&min=0&sec=0&p1=136

08:00 San Francisco
11:00 New York
16:00 London
17:00 Geneva

+ Germany: 08001012079
+ Switzerland: 0800000860
+ UK: 08081095644
+ USA: 1-866-314-3683
Generic international: +44 2083222500 (UK number)

access code: 297427

Agenda:

1) mzML 1.1.0

----
2) Need to update the mzML implementations catalog

----
3) MIAPE-MS revision
- Have revised document to discuss at ASMS

----
4) TraML development
|
From: Pierre-Alain B. <pie...@is...> - 2009-05-25 18:27:17
|
Hi all,

I might miss the call (travelling). I will make sure Eric gets the revised MIAPE-MS proposal document before ASMS.

Pierre-Alain
|
From: Darren K. <da...@pr...> - 2009-05-26 15:02:14
|
Hey Eric,

I'm taking care of some car trouble, so I can't be on the conference call this morning.

Darren
|
From: Eric D. <ede...@sy...> - 2009-06-08 22:42:30
|
Hi everyone, the next PSI Mass Spectrometry Standards Working Group call will be Tuesday 8am PDT:

http://www.timeanddate.com/worldclock/fixedtime.html?day=09&month=6&year=2009&hour=16&min=0&sec=0&p1=136

08:00 San Francisco
11:00 New York
16:00 London
17:00 Geneva

+ Germany: 08001012079
+ Switzerland: 0800000860
+ UK: 08081095644
+ USA: 1-866-314-3683
Generic international: +44 2083222500 (UK number)

access code: 297427

Agenda:

1) mzML 1.1.0
- Released!
- Allowed binarydata data types
-

2) TraML development
- Feedback from ASMS
- Implementations
- cvParams vs attributes
|
From: Eric D. <ede...@sy...> - 2009-06-09 16:02:13
|
Present: Marc, Jim, Matt, Eric, Lennart, Pierre-Alain

1) mzML 1.1.0
- Released!
+ The whitespace issue in the xsd was resolved before release
- Allowed binarydata data types
+ Add back in 32- and 64-bit integer. Those terms should be unobsoleted. They were there
+ Add string array: null-terminated array of strings. Must have as many nulls as elements.
+ Matt will add these to the CV
+ It will be implemented somehow in ProteoWizard and OpenMS
+ Matt will add another CV type, binarydatatype, and then annotate mzArray, IntensityArray, and chargeArray with the appropriate types
- ASMS
+ There is a group that just put together a unified format for ion mobility mass spec. Matt and Eric met him, and we will follow up
+ Also had discussion with ANiML. Being done through ASTM
- ASMS might help out with CV
+ David Sparkman may help us out.
+ Eric will update MSS WG page
+ Eric will email Juan Antonio about top page
+ Marc will double check with Andreas Römpp on units addition and then add them as is

2) TraML development
- Feedback from ASMS
- Implementations
+ ProteoWizard has some implementation. OpenMS does as well, by Andreas. Jim is working on something
- cvParams vs attributes
+ Problem with attributes is lack of units specification
+ Problem with attributes is default value ambiguity in C++
+ Change transition name to id of type xsd:string
+ Apply rule: any attribute that is not an id or a Ref should be switched to cvParam
+ How does mzIdentML handle b9-18^2 ? Try to do the same?
+ What about string values?
+ Matt is advocating more cvParams, Pierre-Alain as well. Jim as well.
+ normalizationStandard should be cvParams H-PINS
+ Eric will make another rev based on this.

+ Meet again next week same time
|
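For the string array item in the minutes (null-terminated strings, as many nulls as elements), here is a rough sketch of what packing and unpacking could look like before and after base64 encoding. The function names are hypothetical, and ASCII-only content is an assumption; none of this is taken from ProteoWizard or OpenMS.

```python
import base64


def encode_string_array(values):
    """Pack strings as null-terminated ASCII, then base64-encode the bytes.

    Hypothetical helper: one trailing null per element, so the number of
    nulls equals the number of elements.
    """
    for value in values:
        if "\x00" in value:
            raise ValueError("elements must not contain a null character")
    raw = b"".join(value.encode("ascii") + b"\x00" for value in values)
    return base64.b64encode(raw).decode("ascii")


def decode_string_array(text):
    """Reverse encode_string_array; empty strings are preserved."""
    raw = base64.b64decode(text)
    if raw and not raw.endswith(b"\x00"):
        raise ValueError("string array must end with a null terminator")
    # Splitting on null leaves one empty piece after the final terminator.
    return [piece.decode("ascii") for piece in raw.split(b"\x00")[:-1]]


if __name__ == "__main__":
    encoded = encode_string_array(["b9", "", "y4"])
    print(decode_string_array(encoded))
```

Requiring a terminator after every element (rather than a separator between elements) is what lets an empty string in the middle or at the end survive the round trip.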
From: Matthew C. <mat...@va...> - 2009-06-11 15:26:56
|
I just committed the binary data array/type changes to the CV:
- added key-value relationships to specify which binary data types are valid for which binary data arrays
- improved definitions for binary data types
- added new binary data type for null-terminated ASCII strings

-Matt
|
From: Stein, S. E. Dr. <ste...@ni...> - 2009-06-11 15:33:45
|
Congrats on a new version.....

However, I wanted to again state what I think is a defect in the standard - the inability to accept an ASCII peak list. This prevents us from using mzML as the format for libraries or reference data.

--- 1 is different than 1.0 and different than 1.00 ....

This difference, to some, is non-trivial and changes the meaning of reference data when converted to binary.

Also, the ability to read the data directly is nice for those who want to do it.

I suppose adding it in 1.2 would do too much damage - but I just felt that I should bring it up again, as our needs have not changed.

-Steve Stein
|
From: Matthew C. <mat...@va...> - 2009-06-11 19:43:19
|
Hi Steve,

Thanks for bringing up the great schism of mzData, dataXML, and mzML. ;) I'm not sure what you mean by "1 is different than 1.0 and different than 1.00": when a computer parses these numbers into floating point (or fixed point, for that matter), they are not different. Mathematically, they aren't different. So why should they be treated differently for a reference/library format?

For any of you interested in one of the many historical discussions on this topic, I dug one up at:
http://sourceforge.net/mailarchive/forum.php?thread_name=S375284AbWJEL1o%2F20061005112745Z%2B11766%40ams006.ftl.affinity.com&forum_name=psidev-ms-dev

To me, the many internal references in mzML mean that it shouldn't be considered a light-weight format that simple scripts could parse: reading mzML with software takes a substantial API. Thus the only remaining benefit of an ASCII peak representation (AFAIK) is human readability of peak lists, and that's not enough to convince me that we should incur the "more than one way to do it" penalty.

However, NIST library folks have a quite straightforward way to meet the "human readability" requirement: XML comments. There's no reason you can't put what looks like an MGF peak list in an XML comment with every mzML spectrum (although presumably not profile-mode ones!). For example:

<spectrum index="1" id="scan=20" defaultArrayLength="10">
...
<!--
m/z intensity
123.4 0.12
234.5 12.3
345.6 23.4
456.7 345.6
567.8 45.3
678.9 34.2
789.0 123.4
890.1 4567.8
901.2 345.6
1234.5 4.56
-->
<binaryDataArrayList count="2">
... base64 arrays are still required ...
</binaryDataArrayList>
</spectrum>

Thoughts?
-Matt
|
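The disagreement over "1", "1.0" and "1.00" can be made concrete. As binary floating point the three parse to the same value, which is Matt's point; as decimal text they record different numbers of written digits, which is one possible reading of Steve's concern about reference data. A small Python sketch of both views (the function names are illustrative only):

```python
from decimal import Decimal


def equal_as_floats(texts):
    """True when every text parses to the same binary floating-point value."""
    values = [float(text) for text in texts]
    return all(value == values[0] for value in values)


def written_decimal_places(text):
    """Number of digits written after the decimal point, via Decimal."""
    exponent = Decimal(text).as_tuple().exponent
    return max(0, -exponent)


if __name__ == "__main__":
    texts = ["1", "1.0", "1.00"]
    print(equal_as_floats(texts))                      # True
    print([written_decimal_places(t) for t in texts])  # [0, 1, 2]
```

A binary array keeps only the value, so any record of how many digits were originally written would have to travel separately from the peak data.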
From: Mike C. <tu...@gm...> - 2009-06-11 20:53:03
|
On Thu, Jun 11, 2009 at 2:41 PM, Matthew Chambers <mat...@va...> wrote:
> The many internal references in mzML to me means that it shouldn't be
> considered a light-weight format that simple scripts could parse:
> reading mzML with software takes a substantial API.

I hope this is not true--I would be quite disappointed if mzML could not be easily parsed by simple scripts.

> Thus the only
> remaining benefit for ASCII peak representation (AFAIK) is human
> readability of peak lists [...]

If one starts with the assumption that mzML is in its best form and should not be changed, this conclusion follows directly. But if we're trying to decide whether mzML should be changed, this seems a bit like begging the question.

> However, NIST library folks have a quite straight-forward way to meet
> the "human readability" requirement: XML comments. There's no reason you
> can't put what looks like an MGF peak list in an XML comment with every
> mzML spectrum (although presumably not profile-mode ones!).

I think this would be worse than the status quo. If this change is to be made, though, may I suggest that the ASCII peaks be used in the "real" XML and that the binary peaks go in the comments? :-)

Mike
|
From: Matthew C. <mat...@va...> - 2009-06-11 21:09:29
|
With binary data, the same representation works for centroided and profile data points. I really hope you're not suggesting ASCII storage of profile mode data, where 9-12 bytes per X sample (12345.678901) would not be unusual? It's all the overhead of double precision floats (a constant 8 bytes) without the vastly higher dynamic range, and it takes much longer to parse.

Using mzML to store the raw data for the libraries would be a great improvement over the status quo (assorted custom relational databases and ASCII archives?). There actually would be a standard representation. :) Coming up with a standard representation for application-specific annotations would be another challenge for a possibly separate format, but the raw data we can already handle. And with a reasonably optimized representation for profile mode, storing consensus profile spectra could become a reasonable approach for spectral libraries.

Although I do wish there was an XML-friendly 8-bit text encoding standard, like the yenc encoding used on news://alt.bin.*, which we could choose instead of base64 to achieve practically no encoding bloat.

-Matt

Mike Coleman wrote:
> On Thu, Jun 11, 2009 at 2:41 PM, Matthew Chambers <mat...@va...> wrote:
>
>> However, NIST library folks have a quite straight-forward way to meet
>> the "human readability" requirement: XML comments. There's no reason you
>> can't put what looks like an MGF peak list in an XML comment with every
>> mzML spectrum (although presumably not profile-mode ones!).
>
> I think this would be worse than the status quo. If this change is to
> be made, though, may I suggest that the ASCII peaks be used in the
> "real" XML and that the binary peaks go in the comments? :-)
>
> Mike
|
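The byte counts in this message can be checked with a few lines of Python. This is only a sketch under stated assumptions: six decimal places for the ASCII form, a single space as separator, little-endian doubles, no compression, and a made-up helper name `size_report`.

```python
import base64
import struct


def size_report(values, ascii_format="{:.6f}"):
    """Compare storage sizes for one array of samples (illustrative only)."""
    ascii_text = " ".join(ascii_format.format(value) for value in values)
    # "<d" means little-endian 64-bit float; the byte order is an assumption here.
    packed = struct.pack("<%dd" % len(values), *values)
    encoded = base64.b64encode(packed)
    return {
        "ascii_bytes": len(ascii_text),
        "raw_double_bytes": len(packed),
        "base64_bytes": len(encoded),
    }


if __name__ == "__main__":
    samples = [12345.678901 + 0.01 * i for i in range(3000)]
    for name, size in size_report(samples).items():
        print(name, size)
```

The raw doubles take a constant 8 bytes each, while base64 adds roughly a third on top of that, which is the encoding bloat the yenc remark refers to. Parsing time and numeric precision are not measured by this sketch.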
From: Angel P. <an...@ma...> - 2009-06-11 20:23:59
|
Hi Steve,

Is your question whether we can successfully round-trip the numbers? E.g. go from an ASCII format to mzML and back to the originating ASCII format and get exactly the same numbers? I believe that when we pack the numbers and unpack them (at least in my non-validating Ruby implementations) the numbers and significance are completely the same. E.g. 1.005 === 1.005 and not 1.005000000000001

-angel

On Thu, Jun 11, 2009 at 11:33 AM, Stein, Stephen E. Dr. <ste...@ni...> wrote:

> Congrats on a new version…
>
> However, I wanted to again state what I think is a defect in the standard -
> the inability to accept an ASCII peak list. This prevents us from using mzML
> as the format for libraries or reference data.
>
> --- 1 is different than 1.0 and different than 1.00 …
>
> This difference, to some, is non-trivial and changes the meaning of
> reference data when converted to binary.
>
> Also, the ability to read the data is nice for those who want to do it.
>
> I suppose adding it to 1.2 would do too much damage - but I just
> felt that I should bring it up again as our needs have not changed.
>
> -Steve Stein
>
> ------------------------------
>
> *From:* Eric Deutsch [mailto:ede...@sy...]
> *Sent:* Tuesday, June 09, 2009 12:02 PM
> *To:* 'Mass spectrometry standard development'
> *Cc:* 'Eric Deutsch'
> *Subject:* Re: [Psidev-ms-dev] PSI-MSS WG Tuesday call reminder
>
> Present: Marc, Jim, Matt, Eric, Lennart, Pierre-Alain
>
> 1) mzML 1.1.0
> - Released!
>   + The whitespace issue in the xsd was resolved beforehand
> - Allowed binarydata data types
>   + Add back in 32- and 64-bit integer. Those terms should be unobsoleted. They were there
>   + Add string array: null-terminated array of strings. Must have as many nulls as elements.
>   + Matt will add these to the CV
>   + It will be implemented somehow in ProteoWizard and OpenMS
>   + Matt will add another CV type, binarydatatype, and then annotate mzArray, IntensityArray, and chargeArray with the appropriate types
> - ASMS
>   + There is a group that just put together a "unified" format for ion mobility mass spec. Matt and Eric met him, and we will follow up
>   + Also had discussion with ANiML. Being done through ASTM
> - ASMS might help out with CV
>   + David Sparkman may help us out.
>   + Eric will update MSS WG page
>   + Eric will email Juan Antonio about top page
>   + Marc will double check with Andreas Römpp on units addition and then add them as is
>
> 2) TraML development
> - Feedback from ASMS
> - Implementations
>   + ProteoWizard has some implementation. OpenMS does as well, by Andreas. Jim is working on something
> - cvParams vs attributes
>   + Problem with attributes is lack of units specification
>   + Problem with attributes is default value ambiguity in C++
>   + Change transition name to id of type xsd:string
>   + Apply rule: any attribute that is not an id or a Ref should be switched to cvParam
>   + How does mzIdentML handle b9-18^2 ? Try to do the same?
>   + What about string values?
>   + Matt is advocating more cvParams, Pierre-Alain as well. Jim as well.
>   + normalizationStandard should be cvParams (H-PINS)
>   + Eric will make another rev based on this.
>
>   + Meet again next week same time
>
> ------------------------------
>
> *From:* Eric Deutsch [mailto:ede...@sy...]
> *Sent:* Monday, June 08, 2009 3:41 PM
> *To:* 'Mass spectrometry standard development'
> *Cc:* 'Eric Deutsch'
> *Subject:* PSI-MSS WG Tuesday call reminder
>
> Hi everyone, the next PSI Mass Spectrometry Standards Working Group call
> will be Tuesday 8am PDT:
>
> http://www.timeanddate.com/worldclock/fixedtime.html?day=09&month=6&year=2009&hour=16&min=0&sec=0&p1=136
>
> 08:00 San Francisco
> 11:00 New York
> 16:00 London
> 17:00 Geneva
>
> + Germany: 08001012079
> + Switzerland: 0800000860
> + UK: 08081095644
> + USA: 1-866-314-3683
> Generic international: +44 2083222500 (UK number)
>
> access code: 297427
>
> Agenda:
>
> 1) mzML 1.1.0
> - Released!
> - Allowed binarydata data types
>
> 2) TraML development
> - Feedback from ASMS
> - Implementations
> - cvParams vs attributes
>
> ------------------------------------------------------------------------------
> Crystal Reports - New Free Runtime and 30 Day Trial
> Check out the new simplified licensing option that enables unlimited
> royalty-free distribution of the report engine for externally facing
> server and web deployment.
> http://p.sf.net/sfu/businessobjects
> _______________________________________________
> Psidev-ms-dev mailing list
> Psi...@li...
> https://lists.sourceforge.net/lists/listinfo/psidev-ms-dev

--
Angel Pizarro
Director, ITMAT Bioinformatics Facility
806 Biological Research Building
421 Curie Blvd.
Philadelphia, PA 19104-6160
215-573-3736
|
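The round-trip question raised in the message above can be checked with a small sketch. The helper name `roundtrip_text` is hypothetical, and Python's shortest-repr float printing stands in for whatever formatting a real converter would use.

```python
import struct

def roundtrip_text(text, fmt="<d"):
    """Parse decimal text, pack it as binary, unpack it, and print it again."""
    value = float(text)
    (unpacked,) = struct.unpack(fmt, struct.pack(fmt, value))
    return repr(unpacked)

print(roundtrip_text("1.005"))        # the digits survive a 64-bit round trip
print(roundtrip_text("1.00"))         # trailing zeros are not stored in the double
print(roundtrip_text("1.005", "<f"))  # a 32-bit float carries fewer digits
```

Under these assumptions the numeric value survives the 64-bit round trip, but the number of trailing zeros written in the original text does not, which is the distinction the thread goes on to discuss.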
From: Mike C. <tu...@gm...> - 2009-06-11 20:40:32
|
I took it to mean that with "1", "1.5", "1.50", one gets an implied level of precision. That is, "1.5" is generally understood to mean 1.5 +/- 0.05. If I give you the IEEE float 1.5, much less is implied about the precision of this value, unless it's explicitly stated elsewhere. (If you have a whole set of these, then you probably can work out the equivalent precision, but this is a bit of a stretch.) Mike On Thu, Jun 11, 2009 at 3:23 PM, Angel Pizarro<an...@ma...> wrote: > Is your question whether we can successfully round-trip the numbers? Eg. go > from an ascii format to mzML back to originating ascii format and get the > same exact numbers? I believe that when we pack the numbers and unpack them > (at least in my non-validating ruby implementations) the numbers and > significance are completely the same. E.g. 1.005 === 1.005 and not > 1.005000000000001 > -angel |
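The "implied precision" reading described in the message above can be made concrete with a short sketch. It assumes the convention that a written value is good to half a unit in its last digit; the function name `implied_half_width` is made up for illustration.

```python
from decimal import Decimal

def implied_half_width(text):
    """Half a unit in the last written digit, e.g. '1.5' -> 0.05."""
    exponent = Decimal(text).as_tuple().exponent
    return Decimal(1).scaleb(exponent) / 2

for text in ("1", "1.5", "1.50"):
    print(text, "+/-", implied_half_width(text))

# As binary floats, the last two are indistinguishable.
print(float("1.5") == float("1.50"))
```

The decimal text keeps the number of written digits, while the float comparison shows that this information is gone once the value is stored as an IEEE number.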
From: Brian P. <bri...@in...> - 2009-06-11 21:17:01
|
The goal of "round trip" is best served by the binary representation. Keep in mind that these values come off the machine as IEEE floats, not tidy human readable representations. The value you think of as "1.5" is actually a bit pattern that may well have the value 1.5000001, but (assuming a chain of conversion that never attempts a human readable representation) it's the same bit pattern that came off the mass spec, so it's the "right" one.

- Brian
|
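To illustrate the bit-pattern point above, here is a minimal sketch assuming 32-bit little-endian floats and base64 as the transport encoding; `float32_bits` is a hypothetical helper name.

```python
import base64
import struct

def float32_bits(value):
    """Return the 32-bit IEEE 754 pattern of a value as an integer."""
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    return bits

print(hex(float32_bits(1.5)))        # exactly representable
print(hex(float32_bits(1.5000001)))  # one unit in the last place above 1.5

# A binary transport such as base64 hands back the identical bytes.
stored = struct.pack("<f", 1.5000001)
restored = base64.b64decode(base64.b64encode(stored))
print(restored == stored)
```

As long as no step converts the value to decimal text and back, the stored bytes are the same ones that were written.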
From: Stein, S. E. Dr. <ste...@ni...> - 2009-06-11 21:27:53
|
Yes, that is what I had in mind - you get drilled in that when you take a lab course in Chemistry or Physics (maybe it has been dropped in recent years). It is a poor person's way of providing error limits (the lowest significant figure contains the precision of measurement).

It is true that it only affects 10% of values, but that's enough for me to be concerned. I suppose we could put ASCII in a comment field, but physical quantities do have precisions, and stuffing measured values in those floating formats loses some of it.

Sorry to say, this problem generally affects binary representations of measured values - one reason why I have liked the ASCII nature of XML - and hate to lose it.

-Steve
|
From: Matthew C. <mat...@va...> - 2009-06-11 22:00:16
|
No measurements I'm aware of in proteomic mass spec use more than 15 base 10 digits, which is the number of decimal digits that double precision floats can represent without precision loss. That means that even if a value goes in as something like 1.005 (which, unlike 1.5, can't be represented exactly in binary), then as long as we round to the 15th digit we don't lose precision. As others have said, we can thus "round-trip" 15 digits. We get this high degree of fidelity to the source data without all the assumptions involved with the ASCII representation: if I use doubles consistently then I'm always providing 15 significant digits. And if we did need more than 15, ASCII would still be a very inefficient encoding. You'd want to use arbitrary precision fixed or floating point binary types, which can't be computed on very easily or efficiently, but they are the Right Way to achieve arbitrary precision (i.e. no unspecified assumptions, well defined byte width, fast parsing).

So in fact, you can preserve this "poor person's" significant digits encoding: if the software is doing its job, then it will go out the same way it came in! The real nastiness with floating point is when the precision loss accumulates every time an arithmetic operation happens on a cumulative sum or product.

-Matt
|
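Both claims in the message above, the 15-digit round trip and the accumulation of error in running sums, can be tried out with a short sketch. The sample strings are arbitrary and `to_15_digits` is a hypothetical helper.

```python
import math

def to_15_digits(text):
    """Store decimal text as a double, then print it with 15 significant digits."""
    return "%.15g" % float(text)

samples = ["1.005", "12345.678901", "0.1", "445.120025"]
for s in samples:
    print(s, "->", to_15_digits(s))

# Error accumulates when many inexact values are summed naively.
naive = sum([0.1] * 10)
careful = math.fsum([0.1] * 10)  # tracks the lost low-order bits
print(naive == 1.0, careful == 1.0)
```

Note that this only shows the values coming back digit for digit when they have no trailing zeros; text such as "1.50" would come back as "1.5" unless the number of written digits is recorded separately.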
From: Pierre-Alain B. <pie...@is...> - 2009-06-12 10:04:36
|
One question to Steve and others.
Reading mzML, as well as any other file, has to be done with an editor, be it a simple text editor or a more elaborate viewer.

Wouldn't a more elaborate XML viewer/editor that knows how to read binary data and round it if needed be an ideal "straight" reader of mzML, instead of a plain text viewer?
I know, and I myself also like, to "call back" values with a defined number of digits, as they were entered. And it's up to the software design to "not interpret" what I have entered. But today, it's relatively easy to get an XML reader that could "translate" the binary arrays into an "mz Intensity" two-column format with appropriate rounding if necessary, so that it looks exactly as if it were an ASCII table (don't forget that in mzML the mz and intensity arrays are separate and anyway have to be interpreted to look like a two-column ASCII table).
If the answer is OK, then we could stay with the binary format, taking care of the "precision issue" via the graphical view, and therefore be compatible with the ASCII precision.

This sounds like a way to turn the technical question into a more philosophical, "ergonomic" one, but it is probably worthwhile at this stage.

Pierre-Alain
|
From: Fredrik L. <Fre...@im...> - 2009-06-12 13:01:41
|
Wouldn't it make sense to add an optional CV term for the number of significant digits in a binary array? This way it would be easy to get back to the ASCII representation if a peak list with x number of decimals was converted to mzML. It might not be so useful for conversion of raw data, but if a peak list has been rounded to a certain number of decimals, that's information which shouldn't be thrown away when converting to mzML. The info could also be used by a viewer to show the right number of decimals.

Fredrik
|
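A minimal sketch of how a viewer might use such a per-array precision annotation, combined with the two-column view suggested earlier in the thread. The parameters `mz_decimals` and `intensity_decimals` are hypothetical stand-ins for the proposed CV term, which the thread only suggests and does not define.

```python
def format_peaks(mz, intensity, mz_decimals, intensity_decimals):
    """Render two decoded arrays as 'mz intensity' text rows.

    mz_decimals and intensity_decimals are assumed to come from a
    (hypothetical) precision annotation stored with each binary array.
    """
    return [
        "%.*f %.*f" % (mz_decimals, m, intensity_decimals, i)
        for m, i in zip(mz, intensity)
    ]

rows = format_peaks([1.0, 445.12], [100.0, 20.5], 2, 1)
print("\n".join(rows))
```

With the number of decimals stored alongside the array, the text a user sees can match the original peak list even though the values themselves are held as binary doubles.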
From: Stein, S. E. Dr. <ste...@ni...> - 2009-06-12 13:14:40
|
that would be a nice addition - also allow ppm representation - more complex precision representations can be delayed for future versions. -----Original Message----- From: Fredrik Levander [mailto:Fre...@im...] Sent: Friday, June 12, 2009 8:28 AM To: Mass spectrometry standard development Subject: Re: [Psidev-ms-dev] PSI-MSS WG Tuesday call reminder Wouldn't it make sense to add an optional CV term for the number of significant digits in a binary array? This way it would be easy to get back to the ASCII representation if a peak list with x number of decimals was converted to mzML. It might not be so useful for conversion of raw data, but if a peak list have been rounded to a certain number of decimals, that's information which shouldn't been thrown away when converting to mzML. The info could also be used for a viewer to show the right number of decimals. Fredrik Pierre-Alain Binz wrote: > One question to Steve and others. > reading mzML, as well as any othe files, has to be done with an > editor, being a simple text editor or a more elaborated viewer. > > Would a more elaborated XML viewer/editor that knows how to read > binary data and round it if needed not be an ideal "straight" reader > of mzML instead of using a more plain text viewer? > I know and myself also like to "call back" values with a defined > number of digits, as they were entered. And it's up to the software > design to "not interpret" what I have entered. But today, it's > relatively easy to get a XML reader that could "translate" the binary > arrays in a "mz Intensity" two column format with appropriate rounding > if necessary, so that it looks exactly as if it was an ascii table > (don't forget that in mzML the mz and intensity arrays are separate > and anyway have to be interpreted to look like a 2 column ascii table. 
> If the answer is OK, then we could stay with binary format, taking > care of the "precision issue" via the graphical view, and be therefore > compatible with the ascii precision. > > This sounds like a way to bring the technical question to a more > phylosophical, "ergonomic" one, but probably worth at that stage. > > Pierre-Alain > > Matthew Chambers wrote: >> No measurements I'm aware of in proteomic mass spec use more than 15 >> base 10 digits, which is the number of digits that double precision >> floats can represent without precision loss. That means that even if a >> value goes in as 1.5 (which can't be represented exactly), then as long >> as we round to the 15th digit we don't lose precision. As others have >> said, we can thus "round-trip" 15 digits. We get this high degree of >> fidelity to the source data without all the assumptions involved with >> the ASCII representation: I use doubles consistently then I'm always >> providing 15 significant digits. And if we did need more than 15, then >> ASCII is still a very inefficient encoding. You'd want to use arbitrary >> precision fixed or floating point binary types, which can't be computed >> on very easily or efficiently, but they are the Right Way to achieve >> arbitrary precision (i.e. no unspecified assumptions, well defined byte >> width, fast parsing). >> >> So in fact, you can preserve this "poor person's" significant digits >> encoding: if the software is doing its job, then it will go out the same >> way it came in! The real nastiness with floating point is when the >> precision loss accumulates every time an arithmetic operation happens on >> a cumulative sum or product. >> >> -Matt >> >> >> Stein, Stephen E. Dr. wrote: >> >>> Yes, that is what I had in mind - you get drilled in that when you take a lab course in Chemistry or Physics (maybe it has been dropped in recent years). 
It is a poor person's way of providing error limits (the lowest significant figure contains the precision of measurement).
>>>
>>> It is true that it only affects 10% of values, but that's enough for me to be concerned. I suppose we could put ASCII in a comment field, but physical quantities do have precisions, and stuffing measured values in those floating formats loses some of it.
>>>
>>> Sorry to say, this problem generally affects binary representations of measured values - one reason why I have liked the ASCII nature of XML - and hate to lose it.
>>>
>>> -Steve
>>>
>>> -----Original Message-----
>>> From: Mike Coleman [mailto:tu...@gm...]
>>> Sent: Thursday, June 11, 2009 4:41 PM
>>> To: Mass spectrometry standard development
>>> Subject: Re: [Psidev-ms-dev] PSI-MSS WG Tuesday call reminder
>>>
>>> I took it to mean that with "1", "1.5", "1.50", one gets an implied
>>> level of precision. That is, "1.5" is generally understood to mean
>>> 1.5 +/- 0.05. If I give you the IEEE float 1.5, much less is implied
>>> about the precision of this value, unless it's explicitly stated
>>> elsewhere. (If you have a whole set of these, then you probably can
>>> work out the equivalent precision, but this is a bit of a stretch.)
>>>
>>> Mike
>>>
>>>
>>> On Thu, Jun 11, 2009 at 3:23 PM, Angel Pizarro<an...@ma...> wrote:
>>>
>>>> Is your question whether we can successfully round-trip the numbers? E.g. go
>>>> from an ascii format to mzML back to originating ascii format and get the
>>>> same exact numbers? I believe that when we pack the numbers and unpack them
>>>> (at least in my non-validating ruby implementations) the numbers and
>>>> significance are completely the same. E.g. 1.005 === 1.005 and not
>>>> 1.005000000000001
>>>> -angel

------------------------------------------------------------------------------
Crystal Reports - New Free Runtime and 30 Day Trial
Check out the new simplified licensing option that enables unlimited
royalty-free distribution of the report engine for externally facing
server and web deployment.
http://p.sf.net/sfu/businessobjects
_______________________________________________
Psidev-ms-dev mailing list
Psi...@li...
https://lists.sourceforge.net/lists/listinfo/psidev-ms-dev
|
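Angel's round-trip question can be tried out directly. The sketch below is in Python rather than Ruby and assumes an mzML-style encoding of little-endian 64-bit floats in uncompressed base64; the helper names pack_doubles, unpack_doubles and render are hypothetical and not taken from any mzML library.

```python
import base64
import struct


def pack_doubles(values):
    """Encode floats as little-endian 64-bit doubles, then base64 (assumed layout)."""
    raw = struct.pack("<%dd" % len(values), *values)
    return base64.b64encode(raw).decode("ascii")


def unpack_doubles(text):
    """Reverse pack_doubles: base64-decode, then unpack little-endian doubles."""
    raw = base64.b64decode(text)
    return list(struct.unpack("<%dd" % (len(raw) // 8), raw))


def render(value, digits=15):
    """Render a double with at most `digits` significant digits."""
    return format(value, ".%dg" % digits)


# ASCII -> double -> packed binary -> double -> ASCII
ascii_in = ["1.005", "445.120025", "1.50"]
restored = unpack_doubles(pack_doubles([float(s) for s in ascii_in]))
ascii_out = [render(v) for v in restored]
print(ascii_out)  # ['1.005', '445.120025', '1.5']
```

Under these assumptions the values come back unchanged, including 1.005, which has no exact binary representation (1.5, by contrast, does). What does not come back is the trailing zero of "1.50": the implied precision Mike describes is not stored in the double.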
From: Matt C. <mat...@va...> - 2009-06-12 13:23:59
|
Now this I can agree with, especially with ppm representation when appropriate. But don't the instrument's mass resolution and related CV terms convey this information? And if someone doesn't write those at all or can't write them in a machine-readable numeric representation, it seems unlikely they will have done a proper job of rounding m/z values. This is kind of the reason I was opposed to using strings to represent mass resolution, but I was overruled. Perhaps we should revisit that? It makes sense to me because it's a less redundant placement of this precision information.

Steve, do you agree with using XML comments to actually show human-readable peak lists in the mzML? That seems like an orthogonal issue to the precision one.

-Matt

Stein, Stephen E. Dr. wrote:
> that would be a nice addition - also allow ppm representation - more complex precision representations can be delayed for future versions.
>
> -----Original Message-----
> From: Fredrik Levander [mailto:Fre...@im...]
> Sent: Friday, June 12, 2009 8:28 AM
> To: Mass spectrometry standard development
> Subject: Re: [Psidev-ms-dev] PSI-MSS WG Tuesday call reminder
>
> Wouldn't it make sense to add an optional CV term for the number of
> significant digits in a binary array? This way it would be easy to get
> back to the ASCII representation if a peak list with x number of
> decimals was converted to mzML. It might not be so useful for conversion
> of raw data, but if a peak list has been rounded to a certain number of
> decimals, that's information which shouldn't be thrown away when
> converting to mzML. The info could also be used for a viewer to show the
> right number of decimals.
>
> Fredrik
>
> Pierre-Alain Binz wrote:
>
>> One question to Steve and others.
>> reading mzML, as well as any other files, has to be done with an
>> editor, being a simple text editor or a more elaborate viewer.
>>
>> Would a more elaborate XML viewer/editor that knows how to read
>> binary data and round it if needed not be an ideal "straight" reader
>> of mzML instead of using a more plain text viewer?
>> I know and myself also like to "call back" values with a defined
>> number of digits, as they were entered. And it's up to the software
>> design to "not interpret" what I have entered. But today, it's
>> relatively easy to get an XML reader that could "translate" the binary
>> arrays into a "mz Intensity" two column format with appropriate rounding
>> if necessary, so that it looks exactly as if it was an ascii table
>> (don't forget that in mzML the mz and intensity arrays are separate
>> and anyway have to be interpreted to look like a 2 column ascii table).
>> If the answer is OK, then we could stay with binary format, taking
>> care of the "precision issue" via the graphical view, and be therefore
>> compatible with the ascii precision.
>>
>> This sounds like a way to bring the technical question to a more
>> philosophical, "ergonomic" one, but probably worth it at this stage.
>>
>> Pierre-Alain
>>
>> Matthew Chambers wrote:
>>
>>> No measurements I'm aware of in proteomic mass spec use more than 15
>>> base 10 digits, which is the number of digits that double precision
>>> floats can represent without precision loss. That means that even if a
>>> value goes in as 1.5 (which can't be represented exactly), then as long
>>> as we round to the 15th digit we don't lose precision. As others have
>>> said, we can thus "round-trip" 15 digits. We get this high degree of
>>> fidelity to the source data without all the assumptions involved with
>>> the ASCII representation: if I use doubles consistently then I'm always
>>> providing 15 significant digits. And if we did need more than 15, then
>>> ASCII is still a very inefficient encoding. You'd want to use arbitrary
>>> precision fixed or floating point binary types, which can't be computed
>>> on very easily or efficiently, but they are the Right Way to achieve
>>> arbitrary precision (i.e. no unspecified assumptions, well defined byte
>>> width, fast parsing).
>>>
>>> So in fact, you can preserve this "poor person's" significant digits
>>> encoding: if the software is doing its job, then it will go out the same
>>> way it came in! The real nastiness with floating point is when the
>>> precision loss accumulates every time an arithmetic operation happens on
>>> a cumulative sum or product.
>>>
>>> -Matt
>>>
>>>
>>> Stein, Stephen E. Dr. wrote:
>>>
>>>> Yes, that is what I had in mind - you get drilled in that when you take a lab course in Chemistry or Physics (maybe it has been dropped in recent years). It is a poor person's way of providing error limits (the lowest significant figure contains the precision of measurement).
>>>>
>>>> It is true that it only affects 10% of values, but that's enough for me to be concerned. I suppose we could put ASCII in a comment field, but physical quantities do have precisions, and stuffing measured values in those floating formats loses some of it.
>>>>
>>>> Sorry to say, this problem generally affects binary representations of measured values - one reason why I have liked the ASCII nature of XML - and hate to lose it.
>>>>
>>>> -Steve
>>>>
>>>> -----Original Message-----
>>>> From: Mike Coleman [mailto:tu...@gm...]
>>>> Sent: Thursday, June 11, 2009 4:41 PM
>>>> To: Mass spectrometry standard development
>>>> Subject: Re: [Psidev-ms-dev] PSI-MSS WG Tuesday call reminder
>>>>
>>>> I took it to mean that with "1", "1.5", "1.50", one gets an implied
>>>> level of precision. That is, "1.5" is generally understood to mean
>>>> 1.5 +/- 0.05. If I give you the IEEE float 1.5, much less is implied
>>>> about the precision of this value, unless it's explicitly stated
>>>> elsewhere. (If you have a whole set of these, then you probably can
>>>> work out the equivalent precision, but this is a bit of a stretch.)
>>>>
>>>> Mike
>>>>
>>>>
>>>> On Thu, Jun 11, 2009 at 3:23 PM, Angel Pizarro<an...@ma...> wrote:
>>>>
>>>>> Is your question whether we can successfully round-trip the numbers? E.g. go
>>>>> from an ascii format to mzML back to originating ascii format and get the
>>>>> same exact numbers? I believe that when we pack the numbers and unpack them
>>>>> (at least in my non-validating ruby implementations) the numbers and
>>>>> significance are completely the same. E.g. 1.005 === 1.005 and not
>>>>> 1.005000000000001
>>>>> -angel
|
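Fredrik's suggestion of recording the number of decimals alongside a binary array could work roughly as follows. This is only a sketch in Python: no such CV term is assumed to exist, and count_decimals and restore_ascii are hypothetical helper names. It uses decimals rather than significant digits, since that is what a rounded peak list typically fixes.

```python
def count_decimals(ascii_values):
    """Largest number of digits after the decimal point in the input strings."""
    return max(len(s.partition(".")[2]) for s in ascii_values)


def restore_ascii(values, decimals):
    """Render doubles with a recorded number of decimals, as a viewer might."""
    return ["%.*f" % (decimals, v) for v in values]


peak_list = ["445.120", "446.125", "447.130"]
decimals = count_decimals(peak_list)       # hypothetical annotation stored with the array
mz_array = [float(s) for s in peak_list]   # what the binary array would hold
print(restore_ascii(mz_array, decimals))   # ['445.120', '446.125', '447.130']
```

A single per-array value like this only restores the original text when every peak in the list was written with the same number of decimals.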
From: Stein, S. E. Dr. <ste...@ni...> - 2009-06-12 14:53:35
|
Matt,

Resolution depends on instrument, tuning and settings - I don't know the current state of reporting such information (or its reliability) in current instruments.

We have long held all of our data in ASCII form (not just MS) - if you want flexibility and accuracy, this is the only path without inventing a new data structure. Error limits and annotation can be added as we like (peak labeling, for example). We will consider using comments - but I suspect no one will know they are there but us.

Note that our focus is quite different from others - we are dealing with data that we have processed, perhaps heavily. I still ask for an optional ASCII data representation for reference data.

-Steve
|
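The two kinds of error limit discussed in this thread, the precision implied by the last written digit and a relative tolerance in ppm, can be compared with a small Python sketch. The function names are hypothetical, and the "half a unit in the last digit" rule is the convention Mike describes ("1.5" meaning 1.5 +/- 0.05), not something defined by mzML.

```python
from decimal import Decimal


def implied_half_width(ascii_value):
    """Half a unit in the last written digit, e.g. "1.5" -> 0.05."""
    exponent = Decimal(ascii_value).as_tuple().exponent
    return Decimal(5).scaleb(exponent - 1)


def ppm_to_absolute(ascii_mz, ppm):
    """Convert a relative tolerance in ppm to an absolute m/z tolerance."""
    return Decimal(ascii_mz) * Decimal(ppm) / Decimal(1000000)


for text in ["1", "1.5", "1.50"]:
    print(text, "+/-", implied_half_width(text))

# A 10 ppm tolerance at m/z 500 is the same width as three written decimals.
print(ppm_to_absolute("500.000", 10) == implied_half_width("500.000") * 10)
```

The sketch works on the ASCII strings because the implied limit depends on how the number was written, which is exactly the information a plain double does not carry.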