From: <cod...@go...> - 2008-11-27 22:09:03
Comment #42 on issue 42 by matthew....@vanderbilt.edu: Issues with the CV
http://code.google.com/p/psi-pi/issues/detail?id=42

Hi David, sorry for the long reply to follow...

RE: spectrumID reference
I'm aware of the decision to use mzML's id as the spectrumID but I'm bringing the point back up because the issue of non-mzML inputs was not discussed at the time (AFAIK). I do not see the justification for using the id instead of the nativeID when the latter must always exist for any input format whereas the former only makes sense from an actual mzML file.

RE: MGF ids
Having CV terms for various format attributes is not a terrible thing, but I worry because the scope is potentially much bigger than MGF->DAT->analysisXML. All of the non-mzML input formats that could potentially be used to generate an intermediate search result format and then converted to analysisXML will more often than not have this problem. Trying to account for the various transformations of the identifiers that could happen from this translation seems like a lost cause to me. The exception would be very specific pipelines where the inputs and outputs are tightly controlled and in those cases, userParams seem more appropriate than cvParams. Even in the case of MGF->DAT->analysisXML, some of your MGF inputs may be completely lacking in title, rt, and scan attributes, because they're all optional, so without an index it's all screwed! :(

Just think of the combinations:
modern vendor formats: Thermo RAW, Waters RAW, WIFF, YEP, BAF, FID, MassHunter, Shimadzu
open formats: mzML, mzXML, mzData, MGF, DTA, MS[12], PKL
search result formats: pepXML, SQT, OUT, SRF, DAT, X! Tandem

As I understand it, your specific use case is: take existing DAT files that were searched from MGFs with (unique?) title/RT/scan attributes and convert to analysisXML in a way that a generic reader can directly go back to the MGF data.
The generic version of that use case is: take existing search results in any format that were searched from any spectra format and convert to analysisXML in a way that a generic reader can directly go back to the data in the input spectra format.

Supporting the specific use case and not the generic one makes me cringe a bit, which is why I chimed in on the issue. Can't users just re-search their data and output directly to analysisXML with the index attribute intact? :P
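To make the point about optional MGF attributes concrete, here is a rough Python sketch. It is only an illustration under assumptions: the function names (read_mgf_headers, find_by_title) and the example data are hypothetical, the parsing is deliberately simplified, and none of it comes from analysisXML or any existing library. It shows that a spectrum block without TITLE, RTINSECONDS, or SCANS can still be addressed by its position in the file, while a lookup by title cannot reach it.

```python
# Hypothetical helper names; simplified MGF parsing for illustration only.
OPTIONAL_KEYS = ("TITLE", "RTINSECONDS", "SCANS")


def read_mgf_headers(text):
    """Return one dict per BEGIN IONS/END IONS block, in file order.

    Each dict holds the 0-based position of the block ("index") plus the
    optional attributes, which are None when the block does not define them.
    """
    spectra = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line == "BEGIN IONS":
            current = {"index": len(spectra)}
            current.update({key: None for key in OPTIONAL_KEYS})
        elif line == "END IONS":
            if current is not None:
                spectra.append(current)
            current = None
        elif current is not None and "=" in line:
            key, value = line.split("=", 1)
            key = key.strip().upper()
            if key in OPTIONAL_KEYS:
                current[key] = value.strip()
    return spectra


def find_by_title(spectra, title):
    """Return the indexes of all blocks whose TITLE equals `title`."""
    return [s["index"] for s in spectra if s["TITLE"] == title]


# Made-up example: the second block has no TITLE, RTINSECONDS, or SCANS.
EXAMPLE_MGF = """\
BEGIN IONS
TITLE=run1.scan.10
PEPMASS=500.25
100.0 1.0
END IONS
BEGIN IONS
PEPMASS=612.30
CHARGE=2+
150.0 2.0
END IONS
"""

spectra = read_mgf_headers(EXAMPLE_MGF)
untitled = [s["index"] for s in spectra if s["TITLE"] is None]
```

In this made-up input the second block can only be referenced by its position, which is the situation a generic reader would face whenever the optional attributes are missing or not unique.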