From: Stein, S. E. Dr. <ste...@ni...> - 2009-06-12 13:14:40
That would be a nice addition - also allow ppm representation - more
complex precision representations can be delayed for future versions.

-----Original Message-----
From: Fredrik Levander [mailto:Fre...@im...]
Sent: Friday, June 12, 2009 8:28 AM
To: Mass spectrometry standard development
Subject: Re: [Psidev-ms-dev] PSI-MSS WG Tuesday call reminder

Wouldn't it make sense to add an optional CV term for the number of
significant digits in a binary array? This way it would be easy to get
back to the ASCII representation if a peak list with x number of
decimals was converted to mzML. It might not be so useful for
conversion of raw data, but if a peak list has been rounded to a
certain number of decimals, that's information which shouldn't be
thrown away when converting to mzML. The info could also be used by a
viewer to show the right number of decimals.

Fredrik

Pierre-Alain Binz wrote:
> One question to Steve and others.
> Reading mzML, like any other file, has to be done with an
> editor, whether a simple text editor or a more elaborate viewer.
>
> Wouldn't a more elaborate XML viewer/editor that knows how to read
> binary data and round it if needed be an ideal "straight" reader
> of mzML, instead of a plain text viewer?
> I know, and I myself also like, to "call back" values with a defined
> number of digits, as they were entered. And it's up to the software
> design to "not interpret" what I have entered. But today, it's
> relatively easy to get an XML reader that could "translate" the binary
> arrays into an "mz Intensity" two-column format with appropriate
> rounding if necessary, so that it looks exactly as if it were an ASCII
> table (don't forget that in mzML the mz and intensity arrays are
> separate and in any case have to be interpreted to look like a
> two-column ASCII table).
> If the answer is OK, then we could stay with the binary format, taking
> care of the "precision issue" via the graphical view, and therefore be
> compatible with the ASCII precision.
>
> This sounds like a way of turning the technical question into a more
> philosophical, "ergonomic" one, but it is probably worth doing at this
> stage.
>
> Pierre-Alain
>
> Matthew Chambers wrote:
>> No measurements I'm aware of in proteomic mass spec use more than 15
>> base 10 digits, which is the number of digits that double precision
>> floats can represent without precision loss. That means that even if a
>> value goes in as 1.5 (which can't be represented exactly), then as long
>> as we round to the 15th digit we don't lose precision. As others have
>> said, we can thus "round-trip" 15 digits. We get this high degree of
>> fidelity to the source data without all the assumptions involved with
>> the ASCII representation: if I use doubles consistently, then I'm
>> always providing 15 significant digits. And if we did need more than
>> 15, then ASCII is still a very inefficient encoding. You'd want to use
>> arbitrary precision fixed or floating point binary types, which can't
>> be computed on very easily or efficiently, but they are the Right Way
>> to achieve arbitrary precision (i.e. no unspecified assumptions, well
>> defined byte width, fast parsing).
>>
>> So in fact, you can preserve this "poor person's" significant digits
>> encoding: if the software is doing its job, then it will go out the
>> same way it came in! The real nastiness with floating point is when
>> the precision loss accumulates every time an arithmetic operation
>> happens on a cumulative sum or product.
>>
>> -Matt
>>
>>
>> Stein, Stephen E. Dr. wrote:
>>
>>> Yes, that is what I had in mind - you get drilled in that when you
>>> take a lab course in Chemistry or Physics (maybe it has been dropped
>>> in recent years).
>>> It is a poor person's way of providing error limits (the lowest
>>> significant figure contains the precision of measurement).
>>>
>>> It is true that it only affects 10% of values, but that's enough for
>>> me to be concerned. I suppose we could put ASCII in a comment field,
>>> but physical quantities do have precisions, and stuffing measured
>>> values into those floating formats loses some of it.
>>>
>>> Sorry to say, this problem generally affects binary representations
>>> of measured values - one reason why I have liked the ASCII nature of
>>> XML - and hate to lose it.
>>>
>>> -Steve
>>>
>>> -----Original Message-----
>>> From: Mike Coleman [mailto:tu...@gm...]
>>> Sent: Thursday, June 11, 2009 4:41 PM
>>> To: Mass spectrometry standard development
>>> Subject: Re: [Psidev-ms-dev] PSI-MSS WG Tuesday call reminder
>>>
>>> I took it to mean that with "1", "1.5", "1.50", one gets an implied
>>> level of precision. That is, "1.5" is generally understood to mean
>>> 1.5 +/- 0.05. If I give you the IEEE float 1.5, much less is implied
>>> about the precision of this value, unless it's explicitly stated
>>> elsewhere. (If you have a whole set of these, then you probably can
>>> work out the equivalent precision, but this is a bit of a stretch.)
>>>
>>> Mike
>>>
>>>
>>> On Thu, Jun 11, 2009 at 3:23 PM, Angel Pizarro<an...@ma...> wrote:
>>>
>>>> Is your question whether we can successfully round-trip the numbers?
>>>> E.g. go from an ASCII format to mzML and back to the originating
>>>> ASCII format and get exactly the same numbers? I believe that when
>>>> we pack the numbers and unpack them (at least in my non-validating
>>>> Ruby implementations) the numbers and significance are completely
>>>> the same. E.g. 1.005 === 1.005 and not 1.005000000000001
>>>> -angel

------------------------------------------------------------------------------
Crystal Reports - New Free Runtime and 30 Day Trial
Check out the new simplified licensing option that enables unlimited
royalty-free distribution of the report engine for externally facing
server and web deployment.
http://p.sf.net/sfu/businessobjects
_______________________________________________
Psidev-ms-dev mailing list
Psi...@li...
https://lists.sourceforge.net/lists/listinfo/psidev-ms-dev
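
The round-trip question raised in this thread can be checked with a short Python sketch. It is not taken from any of the messages and is only an illustration under stated assumptions: values are stored as 8-byte IEEE 754 doubles, the byte order is chosen arbitrarily for the demo, and the function names `roundtrip_via_double`, `decimals`, and `restore_ascii` are hypothetical. The recorded decimal count stands in for the optional CV term proposed in the thread. The thread does not show that such a term exists in mzML.

```python
import struct


def roundtrip_via_double(text: str, digits: int = 15) -> str:
    """Pack a decimal string as an 8-byte IEEE 754 double, unpack it,
    and format it back with at most `digits` significant digits."""
    packed = struct.pack("<d", float(text))   # "<d": little-endian double (assumed for the demo)
    (value,) = struct.unpack("<d", packed)
    return f"{value:.{digits}g}"              # "g" drops trailing zeros


def decimals(text: str) -> int:
    """Number of digits after the decimal point in the ASCII form."""
    return len(text.partition(".")[2])


def restore_ascii(text: str) -> str:
    """Round-trip through a double, then re-apply the recorded decimal
    count (hypothetical metadata kept alongside the binary array)."""
    n = decimals(text)
    (value,) = struct.unpack("<d", struct.pack("<d", float(text)))
    return f"{value:.{n}f}"


if __name__ == "__main__":
    for s in ["1", "1.5", "1.50", "1.005"]:
        print(f"{s!r:>8} -> 15 digits: {roundtrip_via_double(s)!r:>8}"
              f"  with decimal count: {restore_ascii(s)!r}")
```

With 15 significant digits the numeric value of "1.005" comes back as "1.005", but "1.50" comes back as "1.5": the value survives, while the implied precision carried by the trailing zero does not unless the decimal count is stored separately.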