From: Pierre-Alain B. <pie...@is...> - 2009-06-12 10:04:36
One question to Steve and others. Reading mzML, like any other file, has to be done with an editor, be it a simple text editor or a more elaborate viewer. Wouldn't a more elaborate XML viewer/editor that knows how to read binary data, and round it if needed, be an ideal "straight" reader of mzML, instead of a plain text viewer?

I know the wish, and share it myself, to "call back" values with a defined number of digits, as they were entered. And it's up to the software design to "not interpret" what I have entered. But today it's relatively easy to get an XML reader that could "translate" the binary arrays into an "mz Intensity" two-column format, with appropriate rounding if necessary, so that it looks exactly as if it were an ASCII table (don't forget that in mzML the mz and intensity arrays are separate and have to be interpreted anyway to look like a two-column ASCII table).

If the answer is OK, then we could stay with the binary format, taking care of the "precision issue" via the graphical view, and therefore be compatible with the ASCII precision. This sounds like a way to turn the technical question into a more philosophical, "ergonomic" one, but it is probably worth it at this stage.

Pierre-Alain

Matthew Chambers wrote:
> No measurements I'm aware of in proteomic mass spec use more than 15
> base 10 digits, which is the number of digits that double precision
> floats can represent without precision loss. That means that even if a
> value goes in as 1.5 (which can't be represented exactly), then as long
> as we round to the 15th digit we don't lose precision. As others have
> said, we can thus "round-trip" 15 digits. We get this high degree of
> fidelity to the source data without all the assumptions involved with
> the ASCII representation: if I use doubles consistently then I'm always
> providing 15 significant digits. And if we did need more than 15, then
> ASCII is still a very inefficient encoding.
> You'd want to use arbitrary precision fixed or floating point binary
> types, which can't be computed on very easily or efficiently, but they
> are the Right Way to achieve arbitrary precision (i.e. no unspecified
> assumptions, well defined byte width, fast parsing).
>
> So in fact, you can preserve this "poor person's" significant digits
> encoding: if the software is doing its job, then it will go out the same
> way it came in! The real nastiness with floating point is when the
> precision loss accumulates every time an arithmetic operation happens on
> a cumulative sum or product.
>
> -Matt
>
> Stein, Stephen E. Dr. wrote:
>
>> Yes, that is what I had in mind - you get drilled in that when you
>> take a lab course in Chemistry or Physics (maybe it has been dropped
>> in recent years). It is a poor person's way of providing error limits
>> (the lowest significant figure contains the precision of measurement).
>>
>> It is true that it only affects 10% of values, but that's enough for
>> me to be concerned. I suppose we could put ASCII in a comment field,
>> but physical quantities do have precisions, and stuffing measured
>> values in those floating formats loses some of it.
>>
>> Sorry to say, this problem generally affects binary representations
>> of measured values - one reason why I have liked the ASCII nature of
>> XML - and hate to lose it.
>>
>> -Steve
>>
>> -----Original Message-----
>> From: Mike Coleman [mailto:tu...@gm...]
>> Sent: Thursday, June 11, 2009 4:41 PM
>> To: Mass spectrometry standard development
>> Subject: Re: [Psidev-ms-dev] PSI-MSS WG Tuesday call reminder
>>
>> I took it to mean that with "1", "1.5", "1.50", one gets an implied
>> level of precision. That is, "1.5" is generally understood to mean
>> 1.5 +/- 0.05. If I give you the IEEE float 1.5, much less is implied
>> about the precision of this value, unless it's explicitly stated
>> elsewhere.
>> (If you have a whole set of these, then you probably can
>> work out the equivalent precision, but this is a bit of a stretch.)
>>
>> Mike
>>
>> On Thu, Jun 11, 2009 at 3:23 PM, Angel Pizarro<an...@ma...> wrote:
>>
>>> Is your question whether we can successfully round-trip the numbers? Eg. go
>>> from an ascii format to mzML back to originating ascii format and get the
>>> same exact numbers? I believe that when we pack the numbers and unpack them
>>> (at least in my non-validating ruby implementations) the numbers and
>>> significance are completely the same. E.g. 1.005 === 1.005 and not
>>> 1.005000000000001
>>> -angel
>>
>> ------------------------------------------------------------------------------
>> Crystal Reports - New Free Runtime and 30 Day Trial
>> Check out the new simplified licensing option that enables unlimited
>> royalty-free distribution of the report engine for externally facing
>> server and web deployment.
>> http://p.sf.net/sfu/businessobjects
>> _______________________________________________
>> Psidev-ms-dev mailing list
>> Psi...@li...
>> https://lists.sourceforge.net/lists/listinfo/psidev-ms-dev
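
For readers following the thread, the round trip discussed above (ASCII values packed into a binary array, then shown again as an "mz Intensity" two-column table rounded to 15 significant digits) can be sketched roughly as follows. This is only an illustration under stated assumptions: the arrays are taken to be uncompressed, little-endian, 64-bit floats in base64, and the function names (encode_doubles, decode_doubles, two_column) and the sample values are made up for the example, not taken from any mzML library.

```python
import base64
import struct


def encode_doubles(values):
    """Pack floats as little-endian 64-bit doubles and base64-encode them.

    Assumption: no compression, 64-bit precision.
    """
    raw = struct.pack("<%dd" % len(values), *values)
    return base64.b64encode(raw).decode("ascii")


def decode_doubles(text):
    """Reverse of encode_doubles: base64 text back to a list of floats."""
    raw = base64.b64decode(text)
    count = len(raw) // 8  # 8 bytes per double
    return list(struct.unpack("<%dd" % count, raw))


def two_column(mz_b64, intensity_b64, digits=15):
    """Render two separate binary arrays as tab-separated 'mz intensity' rows.

    Each value is rounded to `digits` significant digits for display.
    """
    mz = decode_doubles(mz_b64)
    intensity = decode_doubles(intensity_b64)
    if len(mz) != len(intensity):
        raise ValueError("mz and intensity arrays differ in length")
    return ["%.*g\t%.*g" % (digits, m, digits, i) for m, i in zip(mz, intensity)]


# Hypothetical values as they might appear in an ASCII peak list.
ascii_mz = ["1.005", "445.12003", "1500.5"]
ascii_intensity = ["120", "35000", "0.25"]

mz_b64 = encode_doubles([float(s) for s in ascii_mz])
intensity_b64 = encode_doubles([float(s) for s in ascii_intensity])

rows = two_column(mz_b64, intensity_b64)
for row in rows:
    print(row)

# Trailing zeros are not stored in the double, so "1.50" comes back as "1.5".
trailing = "%.15g" % float("1.50")
```

In this sketch 1.005 is displayed as 1.005 again rather than 1.005000000000001, which is the behaviour Angel and Matt describe. The last line shows Steve's and Mike's concern: the double itself does not remember that "1.50" was entered with three significant figures, so that information would have to be carried separately (or supplied by the viewer) if it matters.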