From: Brian P. <bri...@in...> - 2006-09-20 19:21:52
> If I understand correctly, with properly implemented numeric I/O
> routines (in libc), you can have a 1-1 mapping between the internal and
> ASCII representation, so that it is possible to round trip without
> introducing error.

Well, no, but this is something a lot of folks don't realize. For an
example (previously cited by Randy), consider "0.1"; see
http://www.yoda.arachsys.com/csharp/floatingpoint.html for an explanation.

> One additional note: We seem to be assuming that mass specs all already
> do IEEE FP. Is this actually true?

AFAIK, yes. That's a wheel that nobody has cared to reinvent for some
time now.

- Brian

> -----Original Message-----
> From: Coleman, Michael [mailto:MK...@St...]
> Sent: Wednesday, September 20, 2006 11:17 AM
> To: bri...@in...; psi...@li...
> Subject: RE: [Psidev-ms-dev] Why base64?
>
>
> Brian Pratt:
>
> > Accuracy: Mass spec data in its raw form is generally stored
> > in binary formats, since mass specs are front ended by binary
> > computers. Conversion to and from base 10 human readable
> > representations introduces error. It's best to hold the data at its
> > original precision and translate out to human readable format
> > at whatever precision is deemed useful for eyeballing.
>
> This is a complicated topic and I don't claim to be an expert by any
> means. Here's my understanding.
>
> Error is present, and we want to avoid amplifying it. If, for example,
> the instrument has an internal IEEE FP value 1234.56789012345 and we
> know that its precision is only +/- 0.1, then there's no particular
> benefit (nor harm) to reporting this as anything beyond 1234.6 or
> 1234.57. The 0.00089012345 is more or less noise.
>
> As a practical matter, it might be more efficient to move the IEEE bits
> directly from the instrument to the mzData file. A cost of doing this,
> though, is that this format is not human-readable.
>
> An alternative would be to fully represent the IEEE bits as a number.
> If I understand correctly, with properly implemented numeric I/O
> routines (in libc), you can have a 1-1 mapping between the internal and
> ASCII representation, so that it is possible to round trip without
> introducing error. This *would* make the textual representation larger,
> and it's not clear that it really makes sense to do this, because of the
> noise issue (above).
>
> One additional note: We seem to be assuming that mass specs all already
> do IEEE FP. Is this actually true?
>
> > File size: Sure, you can make files smaller by throwing away
> > precision, but as you begin to desire higher precision base64 quickly
> > becomes much more efficient.
>
> Just to confirm, I agree that discarding *real* precision is
> unacceptable. (By "real", I mean what's being physically measured, not
> bits that are an artifact of the IEEE representation.)
>
> Mike
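
For anyone who wants to poke at the round-trip question discussed above, here is a minimal Python sketch (not from the thread). It assumes IEEE 754 doubles and correctly rounded decimal conversion, which Python provides but not every libc is guaranteed to. The helper names roundtrip_text and roundtrip_base64 are made up for illustration. It shows that the double nearest to "0.1" is not exactly 0.1, that a short decimal rendering loses bits, that 17 significant digits parse back to the same double, and that base64 over the raw 8 bytes round-trips in 12 characters.

```python
import base64
import struct
from decimal import Decimal


def roundtrip_text(x: float, digits: int) -> float:
    """Format x with `digits` significant decimal digits, then parse it back."""
    return float(f"{x:.{digits}g}")


def roundtrip_base64(x: float) -> float:
    """Pack x as a little-endian IEEE 754 double, base64 it, and decode it back."""
    encoded = base64.b64encode(struct.pack("<d", x))
    return struct.unpack("<d", base64.b64decode(encoded))[0]


value = 0.1

# The stored double is the nearest binary fraction to 0.1, not 0.1 itself.
print(Decimal(value) == Decimal("0.1"))  # False

# 17 significant digits are enough to recover the same double from text.
print(roundtrip_text(value, 17) == value)  # True

# The base64 route never goes through decimal at all.
print(roundtrip_base64(value) == value)  # True

# The example value from the thread: a short rendering drops bits,
# a 17-digit rendering does not.
mz = 1234.56789012345
print(roundtrip_text(mz, 7) == mz)   # False
print(roundtrip_text(mz, 17) == mz)  # True

# Size of one base64-encoded double (8 bytes -> 12 characters).
print(len(base64.b64encode(struct.pack("<d", mz))))  # 12
```

Whether the extra digits are worth carrying is the separate "noise" question raised in the quoted message; the sketch only addresses whether the bits survive the trip.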