From: Randy J. <rkj...@in...> - 2006-10-06 13:48:48
In the mass spectrometry community there is a long history of building spectral databases which benefit from direct readability. Historically these have been plain ASCII representations, including things like JCAMP-DX. I think this list would agree that it would be better to use a HUPO format for a peptide database. mzData could provide desirable additional instrument parameter information and a consistent mechanism for dealing with MS data across the proteomics community.

To choose a numeric representation which causes groups like the NIST to use another format to receive and deliver data would be a loss. Instrument vendors are now providing exports to mzData, and I think it is critical that these exports be usable to submit data to mass spectral databases like those used by the MS community for years.

If the cost is a little more code in the parser to deal with one more 'choice' element (of which we have many), then that seems small compared to the consequence of the NIST not being able to use the standard to deliver results to the community, thus requiring us to have a completely different parser to read yet another MS format.

Randy

===

Steve wrote:

... In our library, for example, we want the users to see the values that we put there, so we use ASCII. It would be very desirable for us if the same were offered in the XML's - otherwise we will have to go non-standard. ...

-Steve Stein

===

Later Mike wrote:

that touches on this issue. Also, an example on that page suggests another possibility for the encoding of peaklists that I prefer to those discussed so far:

  <peaklist>
    <peak mz="234.56" i="789" />
    <peak mz="3456.43" i="2" />
    <peak mz="3457.22" i="234" />
  </peaklist>

This would have the virtue of being highly accessible to eyeball and quick-and-dirty scripts as well. It would also clearly compress well. And it keeps the peak data within the realm of XML.
It would be conceivable, I think, to use XSLT to create a table of peak data or even an SVG image of the spectrum, for example, since everything would be living in XML-land.

> ...A standard that provides n>1 ways
> to state the same thing is n times as difficult to implement and maintain,
> which reduces vendor enthusiasm by a factor of n (squared?), which hinders
> widespread adoption. ...

I generally agree with this, and in particular, I suspect that if the specification allowed both representations, most vendors would probably only produce base64 output. For this reason, if the textual representation is preferred, maybe the base64 alternative should be deprecated and marked for removal in a future version.

However, I think that there is still an advantage to having the textual alternative in the specification, even if instrument vendors never produce it. It would allow those of us who prefer the textual format to convert to it in a standard way that coordinates with the mzData standard.
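
To make the trade-off discussed in this thread concrete, here is a minimal Python sketch that reads the textual <peaklist> example quoted above and converts the m/z values to and from a base64-packed binary form. The element and attribute names (peaklist, peak, mz, i) come from the quoted example only and are not part of any published schema. The packing used here (little-endian 32-bit floats) and the helper names are assumptions made for illustration, not the mzData definition.

```python
import base64
import struct
import xml.etree.ElementTree as ET

# Textual peak list as quoted in the thread (hypothetical element names).
PEAKLIST_XML = """
<peaklist>
  <peak mz="234.56" i="789" />
  <peak mz="3456.43" i="2" />
  <peak mz="3457.22" i="234" />
</peaklist>
"""


def parse_text_peaklist(xml_text):
    """Return a list of (mz, intensity) float pairs from the textual form."""
    root = ET.fromstring(xml_text.strip())
    return [(float(p.get("mz")), float(p.get("i"))) for p in root.iter("peak")]


def encode_base64(values, fmt="<f"):
    """Pack values with struct (assumed: little-endian float32) and base64-encode."""
    raw = b"".join(struct.pack(fmt, v) for v in values)
    return base64.b64encode(raw).decode("ascii")


def decode_base64(text, fmt="<f"):
    """Inverse of encode_base64: base64-decode and unpack one value per item."""
    raw = base64.b64decode(text)
    size = struct.calcsize(fmt)
    return [struct.unpack(fmt, raw[k:k + size])[0] for k in range(0, len(raw), size)]


peaks = parse_text_peaklist(PEAKLIST_XML)
mz_values = [mz for mz, _ in peaks]

encoded = encode_base64(mz_values)
decoded = decode_base64(encoded)

for original, restored in zip(mz_values, decoded):
    # The restored value is close to, but generally not identical with,
    # the decimal number that was written in the text form.
    print(original, repr(restored))
```

The textual form is read back exactly as the numbers that were typed in, which is the property Steve asks for. The 32-bit binary form only returns the nearest representable float, so a library that wants users to see precisely the stored values would need either the text form or a wider binary precision.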