From: Mike C. <tu...@gm...> - 2007-10-04 18:37:35
|
As far as I'm concerned, the only thing mzML had to do to be a wild success was to provide a standard replacement for the current vendor-specific, secret RAW file formats, anything further being gravy. This is clearly going to happen and I think those responsible should pat themselves on the back for a job well done. That said, it's not yet clear how things will play out beyond that. Will users/programmers/shops choose to keep their data in mzML format and develop lots of programs to deal with that format? Or will they choose to immediately "rip" mzML files into some other format that they perceive to be simpler, better, or more familiar? Each shop will be forced to make this decision eventually, and developers will also have to make it for each program that they write. I think most people would prefer the mzML-everywhere alternative, but there is no Microsoft here to force the decision, so mzML must win by being as appealing as possible. To me, this means keeping things as simple and intuitive as possible, and keeping them as decoupled as possible from other systems and programs. Ideally, mzML would be even "fun" to use. I know that this is a lot to ask for. I'll happily take all the gravy I can get. Mike |
From: Matthew C. <mat...@va...> - 2007-10-04 19:16:50
|
Mike Coleman wrote: > As far as I'm concerned, the only thing mzML had to do to be a wild > success was to provide a standard replacement for the current > vendor-specific, secret RAW file formats, anything further being > gravy. This is clearly going to happen and I think those responsible > should pat themselves on the back for a job well done. > > I am under the impression that /replacing/ the vendor-specific raw formats was never the intent of this specification. By replacing, I assume you mean that the people who run MS instruments would choose to store all their data in mzML instead of the raw instrument formats (i.e. the mzML would be archived and the raw formats deleted). I do not understand how such a thing would even be possible without intense cooperation, dedication, and commitment from all of the vendors. I don't think we have that and I think it's not realistic to expect it. I am under the impression that the intent of this specification is to provide a way to exchange the most significant metadata and data of MS runs. By that intent, the current spec looks great (excepting the current cvParam controversy). > That said, it's not yet clear how things will play out beyond that. > Will users/programmers/shops choose to keep their data in mzML format > and develop lots of programs to deal with that format? Or will they > choose to immediately "rip" mzML files into some other format that > they perceive to be simpler, better, or more familiar? > If a software group develops support for reading mzML, developing a writer should be a piece of cake and I see no incentive for them to create and write their own redundant format when they already developed a reader for mzML. > Each shop will be forced to make this decision eventually, and > developers will also have to make it for each program that they write. 
> I think most people would prefer the mzML-everywhere alternative, but > there is no Microsoft here to force the decision, so mzML must win by > being as appealing as possible. > I don't think there is a lot of controversy over using a single MS data exchange format (excepting the current cvParam one); it's when the analysisXml standard nears completion that software groups will really get serious headaches trying to decide what format to store their analysis results in. :) > To me, this means keeping things as simple and intuitive as possible, > and keeping them as decoupled as possible from other systems and > programs. Ideally, mzML would be even "fun" to use. > > Agreed. I will add that, to me, intuitive means that values are not stored in a 'name' attribute with no explicit category context. -Matt |
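[Editor's note: Matt's objection to values stored in a bare 'name' attribute can be made concrete with a short sketch. Everything below is hypothetical and illustrative only; the XML fragments are simplified stand-ins, not the actual mzML schema, and the accession numbers are used for illustration.]

```python
import xml.etree.ElementTree as ET

# Both fragments are hypothetical and purely illustrative -- they are
# not drawn from the actual mzML schema.
explicit = '<scan msLevel="2" basePeakMz="445.34"/>'
cv_style = ('<scan>'
            '<cvParam accession="MS:1000511" name="ms level" value="2"/>'
            '<cvParam accession="MS:1000504" name="base peak m/z" value="445.34"/>'
            '</scan>')

# With explicit attributes, the category context is the attribute name:
ms_level = int(ET.fromstring(explicit).get("msLevel"))

# With cvParam, a reader must match accession (or name) strings just to
# learn what each value means -- the "no explicit category context" issue:
params = {p.get("accession"): p.get("value")
          for p in ET.fromstring(cv_style).findall("cvParam")}
ms_level2 = int(params["MS:1000511"])

assert ms_level == ms_level2 == 2
```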
From: Brian P. <bri...@in...> - 2007-10-04 19:36:11
|
These are interesting questions about how folks will use the format. I'm not comfortable with the idea that the format is intended for repositories instead of processing. I'd think you'd want a repository to contain exactly the same artifacts that were processed lest anyone wonder later what differences may have existed in the various representations of the data. Seems to me the format has to be suitable for processing first and foremost or it's not likely to end up in a repository at all. - Brian |
From: Angel P. <an...@ma...> - 2007-10-04 20:09:49
|
On 10/4/07, Brian Pratt <bri...@in...> wrote: > > These are interesting questions about how folks will use the format. I'm > not comfortable with the idea that the format is intended for repositories > instead of processing. I'd think you'd want a repository to contain > exactly > the same artifacts that were processed lest anyone wonder later what > differences may have existed in the various representations of the data. I think we agree here but are coming from different perspectives. In my mind, in order for a repository to have the most accurate representation of the data, the standard has to be purposed for data archival and flexible experimental annotation. Data processing routines would then take that format and do whatever they will for peak detection, noise reduction, baseline correction, etc. to give a final set of values (that typically go into the search algorithms). All of the intermediate steps in the processing should in theory be representable by the same format. I think that mzML as it stands is able to track the data and the processes that were applied to it, but it will certainly not be the most efficient way to represent the data *as the processing is being done*. A special-purpose format for the algorithm at hand will always win in terms of engineering ease / speed / performance / interoperability (within a set of tools). This I think is at the heart of the whole discussion, and why I think cvParam is always getting hammered on the list. So while it seems that we are talking at cross purposes, I really don't think we are. -angel |
From: Brian P. <bri...@in...> - 2007-10-04 21:13:37
|
Hi Angel, I don't think anyone meant to say that mzML should represent the data as the processing is being done; that's normally some in-memory representation. There's just concern that the cvParam approach makes getting the data out of the file and into data structures for processing more complex than it needs to be. I fear I may be misunderstanding your point, though? It might be read as implying, for example, that converting from mzML back to mzXML for the purposes of ASAPRatio and its elution profiling is a proper thing to do, but I don't expect that's what you meant to say. Can you clarify? Thanks, Brian |
From: Mike C. <tu...@gm...> - 2007-10-04 21:56:06
|
On 10/4/07, Brian Pratt <bri...@in...> wrote: > I'm > not comfortable with the idea that the format is intended for repositories > instead of processing. I'd think you'd want a repository to contain exactly > the same artifacts that were processed lest anyone wonder later what > differences may have existed in the various representations of the data. If you're talking about mzML files vs (say) ms2 files, it makes sense to me to archive the mzML file and then specify that version X of mzML-to-ms2 was used to prepare the spectra for search. If you're talking about mzML files vs RAW files, I'd still prefer to archive the mzML files, even though they are conceptually downstream from the RAW files. Although both files are produced via magical processes (secret vendor software), at least the mzML file follows a standard and can be read and understood without further magic. Mike |
From: Brian P. <bri...@in...> - 2007-10-04 22:27:57
|
Agreed on both counts. I'm just making the case for a format that requires as few conversion steps as possible in an analysis pipeline, since each is an opportunity for introduction of error. In some cases (input to closed-source tools) another file format conversion is unavoidable, but in all others it would be best if mzML were a format that lends itself to easy and fast conversion directly to data structures by the tool (that is, easy to write and maintain parsers for). This is in response to a perceived argument along the lines of "it's ok if it's kind of hard to parse efficiently, just convert it to some special-purpose format that better suits the performance needs of the tool in question", which just strikes me as the wrong approach. - Brian |
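[Editor's note: Brian's "easy and fast conversion directly to data structures" point can be sketched briefly. The example below is hypothetical: the element names are simplified stand-ins, not the real mzML schema, though the base64-packed binary array is the same general idea mzML uses.]

```python
import base64
import io
import struct
import xml.etree.ElementTree as ET

# Hypothetical, simplified document -- element names are illustrative
# stand-ins, not the actual mzML schema.
mz = [100.0, 200.5, 300.25]
packed = base64.b64encode(struct.pack("<3d", *mz)).decode()
doc = f'<run><spectrum id="s1"><binary>{packed}</binary></spectrum></run>'

# Stream each spectrum straight into an in-memory structure, clearing
# elements as we go so memory stays flat even for very large files.
peaks = {}
for _, elem in ET.iterparse(io.StringIO(doc), events=("end",)):
    if elem.tag == "spectrum":
        raw = base64.b64decode(elem.find("binary").text)
        count = len(raw) // 8                      # 8 bytes per 64-bit float
        peaks[elem.get("id")] = list(struct.unpack(f"<{count}d", raw))
        elem.clear()

assert peaks["s1"] == [100.0, 200.5, 300.25]
```

The simpler this decode step is to write and keep correct, the fewer shops will feel the need to "rip" to another format first.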
From: Mike C. <tu...@gm...> - 2007-10-04 21:40:37
|
On 10/4/07, Matthew Chambers <mat...@va...> wrote: > I am under the impression that /replacing/ the vendor-specific raw > formats was never the intent of this specification. By replacing, I > assume you mean that the people who run MS instruments would choose to > store all their data in mzML instead of the raw instrument formats (i.e. > the mzML would be archived and the raw formats deleted). It's possible that I have missed the plot. This is what *I* am hoping to get out of the spec, anyway. It appears to me that mzML files will have all of the information in them that I care about, and since they are standardized and relatively readable, in that sense they would be superior to RAW files, which are ad hoc and opaque. > > That said, it's not yet clear how things will play out beyond that. > > Will users/programmers/shops choose to keep their data in mzML format > > and develop lots of programs to deal with that format? Or will they > > choose to immediately "rip" mzML files into some other format that > > they perceive to be simpler, better, or more familiar? > > > If a software group develops support for reading mzML, developing a > writer should be a piece of cake and I see no incentive for them to > create and write their own redundant format when they already developed > a reader for mzML. To be a little more concrete, our current pipeline uses ms2 files. Our current primary search program does not accept mzML input and we do not have source code for the program, which means that we cannot adapt it to do so. Also, it appears that for our spectra, mzML files may be somewhat larger than the corresponding ms2 files. They're also not as human-readable. Plus other issues like what we're discussing today. So, there may be notable pluses and minuses to a full-scale conversion to mzML, versus a rip-to-ms2 approach, which would be cheap and simple in the short run. I'm taking a wait-and-see approach for now. 
> I don't think there is a lot of controversy over using a single MS data > exchange format (excepting the current cvParam one) I don't think there will be shouting matches, no. Perhaps it would be more like the IPv6 conversion that was--and is still--just around the corner. > it's when the > analysisXml standard nears completion that software groups will really > get serious headaches trying to decide what format to store their > analysis results in. :) I've been too afraid of this to look. :-) Mike |
From: Angel P. <an...@ma...> - 2007-10-05 00:28:49
|
On 10/4/07, Brian Pratt <bri...@in...> wrote: > > Hi Angel, > > I fear I may be misunderstanding your point, though? It might be read as > implying, for example, that converting from mzML back to mzXML for the > purposes of ASAPRatio and its elution profiling is a proper thing to do, but > I don't expect that's what you meant to say. Can you clarify? > Yep, that's exactly what I was proposing, but maybe ASAPRatio is a bad example since ASAPRatio is open source and controlled by the TPP folks ;) A better example would be Sequest and Bioworks, which use a binary file format for storing processed peaks and the result in one file. The conversion would be mzML -> RAW/SRF -> SRF -> whatever you want here. The pay-off for Bioworks to do something like this is fine-tuned random access for spectral processing. Plus the code investment in supporting mzML is relatively small and restricted to in/out of their format. Actually, I take it back: ASAPRatio is a good example b/c using this model of translating an archive format to/from operational formats allows the ISB to put its development effort on newer algorithms, and prevent older projects from being put out to pasture. -angel -- Angel Pizarro Director, Bioinformatics Facility Institute for Translational Medicine and Therapeutics University of Pennsylvania 806 BRB II/III 421 Curie Blvd. Philadelphia, PA 19104-6160 P: 215-573-3736 F: 215-573-9004 |
From: Brian P. <bri...@in...> - 2007-10-05 00:46:51
|
Well, ASAPRatio (like most ISB tools) uses the RAMP API, which, we are promised, will read mzML, so it's just a recompile. I just hope it's not a big performance hit due to the nature of the mzML format. For closed-source tools, yes, at least until the vendors get on board, intermediate conversion steps are a fact of life. But each translation step introduces the possibility of error (I'm a very suspicious guy when it comes to software: more code means more bugs) and is best avoided when possible, so starting with a format that pretty much expects you to convert away from it for operational purposes spooks me. Bad for throughput, too. - Brian |
From: Jimmy E. <jk...@gm...> - 2007-10-05 02:30:48
|
Angel, counter to what you're suggesting, I do believe that mzML was developed to at least try to be an operational format also. Otherwise, there would not be a need for a scan index with file offset pointers in the wrapper schema, no? The primary reason why mzXML was developed was to replace native MS binary data files with something transparent and platform neutral (and be an operational format for tools that consume these files). Obviously everyone imagines mzML to address many (and, it seems, sometimes different and non-inclusive) use cases. My short-sighted personal interest is to see mzML address the operational raw file replacement use case succinctly w/o any adverse complexities to make its adoption for this use case difficult. Otherwise Angel's proposal of mzML->SRF, mzML->mgf, and I dare say mzML->mzXML is going to end up being reality for some subset of users. And in the world of these users, why bother going from native->mzML->XYZ if native files are around and you can do native->XYZ? Sorry I can't contribute to the cvParam talk here because I don't even know what that is! :) |
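[Editor's note: Jimmy's point about the scan index with file offset pointers comes down to cheap random access. A minimal sketch of the offset-index idea, under a hypothetical flat "SCAN <id> ..." record layout rather than the mzML wrapper schema itself:]

```python
import os

# Record the byte offset of each scan while writing, then seek directly
# to any scan without reading the whole file. The file layout here is
# hypothetical and only illustrates the offset-index technique.
path = "scans_demo.txt"
offsets = {}
with open(path, "wb") as f:
    for scan_id in range(1000):
        offsets[scan_id] = f.tell()            # byte offset of this record
        f.write(f"SCAN {scan_id} intensity={scan_id * 1.5}\n".encode())

with open(path, "rb") as f:
    f.seek(offsets[742])                       # jump straight to scan 742
    line = f.readline().decode()

assert line.split()[1] == "742"
os.remove(path)
```

An index like this is what lets a tool pull one spectrum out of a multi-gigabyte file without a full parse.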
From: Angel P. <an...@ma...> - 2007-10-05 12:20:23
|
On 10/4/07, Jimmy Eng <jk...@gm...> wrote: > > Angel, counter to what you're suggesting, I do believe that mzML was > developed to at least try and be an operational format also. > Otherwise, there would not be a need for a scan index with file offset > pointers in the wrapper schema, no? Very true. And I hope that decent performance comes from the APIs written for the format. I am playing devil's advocate here. Call me a pessimist, but I don't think any instrument manufacturer is going to use mzML as their native format (or at least it will take on the order of 4 or more years for this to happen). If vendors do adopt it as the native format, great! I would be more than ecstatic, but I am not holding my breath. Vendors, please correct me if I am making a wrong assumption here. Silence == agreement ;) > The primary reason why mzXML was developed was to replace native MS > binary data files with something transparent and platform neutral (and > be an operational format for tools that consume these files). > Obviously everyone imagines mzML to address many, and it looks like > sometimes different & non-inclusive, use cases. My short sighted > personal interest is to see mzML address the operational raw file > replacement use case succinctly w/o any adverse complexities to make > its adoption for this use case difficult. Otherwise Angel's proposal > of mzML->SRF, mzML->mgf, and I dare say mzML->mzXML is going to end up > being reality for some subset of users. And in the world of these > users, why bother going from native->mzML->XYZ if native files are > around and you can do native->XYZ? My hope is that by having mzML in the middle, we can reliably say SRF == MGF, where with the current situation of native -> XYZ, we just can't make that claim. Also, it is my hope to reduce the burden on 3rd-party vendors by having mzML be the officially supported format for input. -angel |