From: Pierre-Alain B. <pie...@is...> - 2006-10-05 12:14:04
I am in favour of allowing a spectrum, a peak list, or even a chromatogram to be represented in more than one manner ONLY if these representations are easy and straightforward to generate and to parse AND if there is a good (or, better, a blocking) reason to do so.
We need to avoid optional things that leave any implementation open to interpretation and misunderstanding.

So yes, but only if the two formats are strictly and clearly described and discriminated (a specification issue).

Pierre-Alain

Randy Julian wrote:

>There was concern in the NBT review of the mzData manuscript that the format
>was not specifically designed for either quantitation or 'raw' data.
>Quite the opposite is true - it handles these better than it handles a 'peak
>list'.
>
>Given the broad scope we are going for, I think mzData 2.0 needs to cover
>both of Mike's suggestions.
>
>The representation should allow an ASCII list representation, _and_ a base64
>list option. Within each of these, the _desired_ precision should be used.
>If you want to make some kind of 21CFR11 claim regarding GLP or GCP for
>clinical data (metabolites, proteins or biomarker analyses) then the ability
>to represent 'raw' data is critical and part of the current design.
>
>It is the simple case of 'represent a single tandem MS spectrum of a single
>peptide at only the precision of the m/z calibration' that is harder than it
>needs to be with the current representation.
>
>During the Washington PSI meeting a proposal was made to re-introduce the
>ASCII data representation that was dropped at the PSI meeting in Nice. What
>does everyone think of this idea?
>
>Randy
>
>-----Original Message-----
>From: psi...@li...
>[mailto:psi...@li...] On Behalf Of Mike
>Coleman
>Sent: Wednesday, October 04, 2006 3:13 PM
>To: Angel Pizarro
>Cc: Psi...@li...
>Subject: Re: [Psidev-ms-dev] Why base64?
>
>[This message seems to have been bounced by Sourceforge, so I'm
>resending it. I'm sorry to see that apparently they are having
>serious email problems these days. See today's Slashdot article at
>http://it.slashdot.org/article.pl?sid=06/10/04/1324214. (Apparently
>the problem isn't limited to email coming from gmail accounts.) ]
>
>On 9/28/06, Mike Coleman <tu...@gm...> wrote:
>
>>Makes sense. To put it in other words, there are two questions here:
>>
>>1. Are the values represented as base64-encoded bitstrings or as ASCII
>>text?
>>
>>2. Should the values be rounded to the precision of the instrument
>>(probably plus a digit, etc.), or should an arbitrary number of
>>figures be used? Again, this isn't about losing information, as we're
>>only discussing rounding away noise.
>>
>>These two questions are entirely orthogonal, as far as I can see, and
>>it would be possible to allow both options for both questions, if this
>>were seen as being worthwhile. The one interaction is that if you use
>>the ASCII text encoding, rounding the figures will make the mzData
>>file smaller.
>>
>>Regarding ambiguity, the ASCII text representation would allow
>>differing whitespace (which produces no semantic difference). I guess
>>the base64 encoding also allows differing surrounding whitespace.
>>
>>With respect to the base64 encoding, one corner case comes to mind.
>>Are special IEEE values like NaN, the infinities, negative zero, etc.,
>>allowed? If so, what should the interpretation be?
>>
>>Mike
>>
>>
>>The example code I mentioned:
>>
>>/* gcc -g -O2 -ffloat-store -o ieee-test ieee-test.c */
>>
>>/* strtof is GNU/C99 */
>>#define _GNU_SOURCE
>>
>>#include <assert.h>
>>#include <errno.h>
>>#include <limits.h>
>>#include <stdio.h>
>>#include <stdlib.h>
>>
>>
>>/* view the same 32 bits either as an integer or as a float */
>>union bits {
>>  unsigned int u;
>>  float f;
>>};
>>
>>
>>int
>>main(void) {
>>  unsigned int i;
>>  union bits x, x2;
>>  int zeros_seen = 0;
>>
>>  assert(sizeof x.u == sizeof x.f);
>>  assert((void *) &x.u == (void *) &x.f);
>>
>>  /* walk every 32-bit pattern; stop when i wraps around to 0 again */
>>  for (i=0; ; i++) {
>>    char buf[128];
>>
>>    if (i == 0)
>>      if (++zeros_seen > 1)
>>        break;
>>
>>#if 0
>>    if (!(i % 100000))
>>      putc('.', stderr);
>>#endif
>>
>>    x.u = i;
>>    if (x.f != x.f)
>>      continue;                 /* skip error values (NaNs) */
>>
>>    /* 9 significant digits as text, then parse the text back */
>>    sprintf(buf, "%.8e", x.f);
>>
>>    errno = 0;
>>    x2.f = strtof(buf, 0);
>>    if (errno == ERANGE) {
>>      printf("strtof error for %s\n", buf);
>>      continue;
>>    }
>>
>>    /* report any value whose bits did not survive the round trip */
>>    if (x2.u != x.u)
>>      printf("bit difference for %s (%u != %u)\n", buf, x2.u, x.u);
>>  }
>>
>>  return 0;
>>}
>>
>
>-------------------------------------------------------------------------
>Take Surveys. Earn Cash. Influence the Future of IT
>Join SourceForge.net's Techsay panel and you'll get the chance to share your
>opinions on IT & business topics through brief surveys -- and earn cash
>http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
>_______________________________________________
>Psidev-ms-dev mailing list
>Psi...@li...
>https://lists.sourceforge.net/lists/listinfo/psidev-ms-dev

--
Dr. Pierre-Alain Binz
Swiss Institute of Bioinformatics
Proteome Informatics Group
1, Rue Michel Servet
CH-1211 Geneve 4
Switzerland
- - - - - - - - - - - - - - - - -
Tel: +41-22-379 50 50
Fax: +41-22-379 58 58
Pie...@is...
http://www.expasy.org/people/Pierre-Alain.Binz.html