From: Randy J. <rkj...@in...> - 2006-10-05 11:27:53
There was concern in the NBT review of the mzData manuscript that the format was not specifically designed for either quantitation or 'raw' data. Quite the opposite is true - it handles these better than it handles a 'peak list'.

Given the broad scope we are going for, I think mzData 2.0 needs to cover both of Mike's suggestions. The representation should allow an ASCII list representation, _and_ a base64 list option. Within each of these, the _desired_ precision should be used.

If you want to make some kind of 21CFR11 claim regarding GLP or GCP for clinical data (metabolites, proteins or biomarker analyses) then the ability to represent 'raw' data is critical and part of the current design. It is the simple case of 'represent a single tandem MS spectrum of a single peptide at only the precision of the m/z calibration' that is harder than it needs to be with the current representation.

During the Washington PSI meeting a proposal was made to re-introduce the ASCII data representation that was dropped at the PSI meeting in Nice.

What does everyone think of this idea?

Randy

-----Original Message-----
From: psi...@li... [mailto:psi...@li...] On Behalf Of Mike Coleman
Sent: Wednesday, October 04, 2006 3:13 PM
To: Angel Pizarro
Cc: Psi...@li...
Subject: Re: [Psidev-ms-dev] Why base64?

[This message seems to have been bounced by Sourceforge, so I'm resending it. I'm sorry to see that apparently they are having serious email problems these days. See today's Slashdot article at http://it.slashdot.org/article.pl?sid=06/10/04/1324214. (Apparently the problem isn't limited to email coming from gmail accounts.) ]

On 9/28/06, Mike Coleman <tu...@gm...> wrote:
> Makes sense. To put it in other words, there are two questions here:
>
> 1. Are the values represented as base64-encoded bitstrings or as ASCII text?
>
> 2. Should the values be rounded to the precision of the instrument
> (probably plus a digit, etc.), or should an arbitrary number of
> figures be used? Again, this isn't about losing information, as we're
> only discussing rounding away noise.
>
> These two questions are entirely orthogonal, as far as I can see, and
> it would be possible to allow both options for both questions, if this
> were seen as being worthwhile. The one interaction is that if you use
> the ASCII text encoding, rounding the figures will make the mzData
> file smaller.
>
> Regarding ambiguity, the ASCII text representation would allow
> differing whitespace (which produces no semantic difference). I guess
> the base64 encoding also allows differing surrounding whitespace.
>
> With respect to the base64 encoding, one corner case comes to mind.
> Are special IEEE values like NaN, the infinities, negative zero, etc.,
> allowed? If so, what should the interpretation be?
>
> Mike
>
>
> The example code I mentioned:
>
> /* gcc -g -O2 -ffloat-store -o ieee-test ieee-test.c */
>
> /* strtof is GNU/C99 */
> #define _GNU_SOURCE
>
> #include <assert.h>
> #include <errno.h>
> #include <limits.h>
> #include <stdio.h>
> #include <stdlib.h>
>
>
> union bits {
>   unsigned int u;
>   float f;
> };
>
>
> int
> main(void) {
>   unsigned int i;
>   union bits x, x2;
>   int zeros_seen = 0;
>
>   assert(sizeof x.u == sizeof x.f);
>   assert((void *) &x.u == (void *) &x.f);
>
>   /* try every 32-bit pattern; stop when i wraps back around to 0 */
>   for (i=0; ; i++) {
>     char buf[128];
>
>     if (i == 0)
>       if (++zeros_seen > 1)
>         break;
>
> #if 0
>     if (!(i % 100000))
>       putc('.', stderr);
> #endif
>
>     x.u = i;
>     if (x.f != x.f)
>       continue;                 /* skip error values */
>
>     /* print with 9 significant digits, then parse the text back */
>     sprintf(buf, "%.8e", x.f);
>
>     errno = 0;
>     x2.f = strtof(buf, 0);
>     if (errno == ERANGE) {
>       printf("strtof error for %s\n", buf);
>       continue;
>     }
>
>     if (x2.u != x.u)
>       printf("bit difference for %s (%u != %u)\n", buf, x2.u, x.u);
>   }
>   return 0;
> }
>