From: Jimmy E. <jk...@gm...> - 2007-10-05 02:30:48
Angel, counter to what you're suggesting, I do believe that mzML was developed to at least try to be an operational format as well. Otherwise, there would be no need for a scan index with file offset pointers in the wrapper schema, no? The primary reason mzXML was developed was to replace native MS binary data files with something transparent and platform-neutral (and to be an operational format for the tools that consume these files). Obviously everyone imagines mzML addressing many, and it looks like sometimes different and non-overlapping, use cases. My short-sighted personal interest is to see mzML address the operational raw-file-replacement use case succinctly, without any adverse complexities that would make its adoption for this use case difficult. Otherwise, Angel's proposal of mzML -> SRF, mzML -> mgf, and I dare say mzML -> mzXML, is going to end up being reality for some subset of users. And in the world of these users, why bother going native -> mzML -> XYZ if the native files are around and you can do native -> XYZ?

Sorry I can't contribute to the cvParam talk here because I don't even know what that is! :)

On 10/4/07, Angel Pizarro <an...@ma...> wrote:
> On 10/4/07, Brian Pratt <bri...@in...> wrote:
> > Hi Angel,
> >
> > I fear I may be misunderstanding your point, though? It might be read as
> > implying, for example, that converting from mzML back to mzXML for the
> > purposes of ASAPRatio and its elution profiling is a proper thing to do,
> > but I don't expect that's what you meant to say. Can you clarify?
>
> Yep, that's exactly what I was proposing, but maybe ASAPRatio is a bad
> example, since ASAPRatio is open source and controlled by the TPP folks ;)
> A better example would be SEQUEST and Bioworks, which use a binary file
> format for storing processed peaks and the result in one file. The
> conversion would be mzML -> RAW/SRF -> whatever you want here. The
> pay-off for Bioworks to do something like this is fine-tuned random access
> for spectral processing.
> Plus, the code investment in supporting mzML is relatively small and
> restricted to getting in/out of their format.
>
> Actually, I take it back: ASAPRatio is a good example, b/c using this
> model of translating an archive format to/from operational formats allows
> the ISB to put its development effort into newer algorithms and prevents
> older projects from being put out to pasture.
>
> -angel
>
> > Thanks,
> >
> > Brian
> >
> > ________________________________
> > From: psi...@li... [mailto:psi...@li...] On Behalf Of Angel Pizarro
> > Sent: Thursday, October 04, 2007 1:10 PM
> > To: Mass spectrometry standard development
> > Subject: Re: [Psidev-ms-dev] honey vs vinegar
> >
> > On 10/4/07, Brian Pratt <bri...@in...> wrote:
> > > These are interesting questions about how folks will use the format.
> > > I'm not comfortable with the idea that the format is intended for
> > > repositories instead of processing. I'd think you'd want a repository
> > > to contain exactly the same artifacts that were processed, lest anyone
> > > wonder later what differences may have existed in the various
> > > representations of the data.
> >
> > I think we agree here but are coming from different perspectives. In my
> > mind, in order for a repository to have the most accurate representation
> > of the data, the standard has to be purposed for data archival and
> > flexible experimental annotation. Data processing routines would then
> > take that format and do whatever they will for peak detection, noise
> > reduction, baseline correction, etc. to give a final set of values
> > (which typically go into the search algorithms). All of the intermediate
> > steps in the processing should, in theory, be representable by the same
> > format.
> > I think that mzML as it stands is able to track the data and the
> > processes that were applied to it, but it will certainly not be the most
> > efficient way to represent the data *as the processing is being done*. A
> > special-purpose format for the algorithm at hand will always win in
> > terms of engineering ease / speed / performance / interoperability
> > (within a set of tools).
> >
> > This, I think, is at the heart of the whole discussion, and why cvParam
> > is always getting hammered on the list. So while it seems that we are
> > talking at cross purposes, I really don't think we are.
> >
> > -angel
> >
> > -------------------------------------------------------------------------
> > This SF.net email is sponsored by: Splunk Inc.
> > Still grepping through log files to find problems? Stop.
> > Now Search log events and configuration files using AJAX and a browser.
> > Download your FREE copy of Splunk now >> http://get.splunk.com/
> > _______________________________________________
> > Psidev-ms-dev mailing list
> > Psi...@li...
> > https://lists.sourceforge.net/lists/listinfo/psidev-ms-dev
>
> --
> Angel Pizarro
> Director, Bioinformatics Facility
> Institute for Translational Medicine and Therapeutics
> University of Pennsylvania
> 806 BRB II/III
> 421 Curie Blvd.
> Philadelphia, PA 19104-6160
>
> P: 215-573-3736
> F: 215-573-9004
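[Editor's note] Jimmy's point about the scan index with file offset pointers is what makes a flat file usable "operationally": a reader can seek straight to one scan instead of parsing the whole file. A minimal sketch of the idea, in Python, with an entirely hypothetical tag layout (this is not actual mzML or mzXML syntax):

```python
# Sketch only: an offset index recorded at write time enables O(1) random
# access at read time. Tag layout and peak encoding here are made up.
import io

def write_scans_with_index(scans):
    """Write scans sequentially, recording the byte offset of each one."""
    buf = io.BytesIO()
    index = {}                            # scan number -> byte offset
    for num, peaks in scans:
        index[num] = buf.tell()           # remember where this scan starts
        buf.write(f"<scan num={num}>{peaks}</scan>\n".encode("utf-8"))
    return buf, index

def read_scan(buf, index, num):
    """Jump straight to one scan without parsing the rest of the file."""
    buf.seek(index[num])                  # seek via the offset index
    return buf.readline().decode("utf-8").rstrip("\n")

buf, index = write_scans_with_index([(1, "100:5"), (2, "200:9"), (3, "150:2")])
print(read_scan(buf, index, 2))           # -> <scan num=2>200:9</scan>
```

In the real formats the index lives in a wrapper schema at the end of the file, but the mechanism is the same: without it, every consumer would have to stream the entire document to find one spectrum.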