From: Angel P. <an...@ma...> - 2007-10-04 20:09:49
On 10/4/07, Brian Pratt <bri...@in...> wrote:
>
> These are interesting questions about how folks will use the format. I'm
> not comfortable with the idea that the format is intended for repositories
> instead of processing. I'd think you'd want a repository to contain exactly
> the same artifacts that were processed lest anyone wonder later what
> differences may have existed in the various representations of the data.

I think we agree here but are coming from different perspectives. In my mind, for a repository to have the most accurate representation of the data, the standard has to be designed for data archival and flexible experimental annotation. Data processing routines would then take that format and do whatever they need for peak detection, noise reduction, baseline correction, etc. to give a final set of values (which typically go into the search algorithms). All of the intermediate steps in the processing should, in theory, be representable in the same format.

I think that mzML as it stands is able to track the data and the processes that were applied to it, but it will certainly not be the most efficient way to represent the data *as the processing is being done*. A special-purpose format for the algorithm at hand will always win in terms of engineering ease / speed / performance / interoperability (within a set of tools). This, I think, is at the heart of the whole discussion, and is why I think cvParam is always getting hammered on the list.

So while it seems that we are talking at cross purposes, I really don't think we are.

-angel