From: Wilfred H T. <Ta...@ap...> - 2008-06-18 00:36:38
Any thoughts regarding the comments below? For the last point, I would like to propose to add the following terms to the controlled vocabulary:

[Term]
id: MS:9999999
name: MS1 scan
def: "The specific scan function or process that records a MS1 spectrum." [PSI:MS]
is_a: MS:1000020 ! scanning method

[Term]
id: MS:9999999
name: MSn scan
def: "The specific scan function or process that records a MSn spectrum." [PSI:MS]
is_a: MS:1000020 ! scanning method

[Term]
id: MS:9999999
name: precursor ion spectrum
def: "Spectrum generated by scanning precursor m/z while monitoring a fixed product m/z" [PSI:MS]
is_a: MS:1000524 ! data file content
is_a: MS:1000559 ! spectrum type

[Term]
id: MS:9999999
name: constant neutral loss spectrum
def: "Spectrum generated by scanning precursor m/z and product m/z simultaneously, maintaining a constant difference between the two" [PSI:MS]
is_a: MS:1000524 ! data file content
is_a: MS:1000559 ! spectrum type

Thanks,
Wilfred

Wilfred H Tang/FOS/PEC
05/24/2008 07:03 PM

To psi...@li...
cc
Subject mzML comments

I recently looked over the mzML format draft and have a few comments.

Regarding the schema (http://www.sbeams.org/tmp/mzML0.99.12.html):

* For the "cvParam Mapping Rules," I suggest that the number of "musts" be decreased significantly (or changed to "mays"). For example, for the element <sourceFile>, there's a rule saying: "MUST supply a *child* term of MS:1000561 (data file checksum type) one or more times." This sort of information does not seem to be critical for downstream processing of mzML files and thus doesn't deserve "must" status. There are quite a few similar examples.

* For the elements <fileDescription>, <sourceFileList>, <sourceFile>, etc., would it make sense to change the name to be more general (such as dataSource) to reflect the fact that not all instrument data is stored in files? For example, for the Applied Biosystems|MDS Sciex instruments, some instruments (such as the QSTAR or QTRAP systems) store data in .wiff files, while other instruments (such as the 4700 and 4800 systems) store data in an Oracle database.

Regarding the controlled vocabulary ( http://psidev.cvs.sourceforge.net/*checkout*/psidev/psi/psi-ms/mzML/controlledVocabulary/psi-ms.obo ):

* Under "spectrum"-->"spectrum representation", the only 2 choices are "centroid mass spectrum" and "profile mass spectrum". That doesn't adequately capture the full range of spectrum representations. For example in addition to centroiding, the data could be de-isotoped, smoothed, converted to +1 charge, etc. Also, having just the 2 choices of centroid vs. profile is inconsistent with the software processing options listed under "data transformation"-->"data processing action", where a much wider range of options are listed ("baseline reduction", "charge deconvolution", "deisotoping", etc.). It seems desirable to expand the choices under "spectrum"-->"spectrum representation" to at least be consistent. Alternatively, maybe a slightly different categorization might make sense - something like raw data vs. processed data (full data vs. reduced data).

* Under "spectrum"-->"spectrum type", some triple quad-type scans are missing - precursor ion (scan Q1, fixed Q3), neutral loss (scan Q1 and Q3 together with a constant difference between Q1 and Q3).

Please accept my apologies in advance if any of these topics have already been discussed/resolved previously, as I am new to this discussion.

Thanks,
Wilfred
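
For anyone who wants to check the proposed stanzas mechanically, here is a minimal Python sketch that reads [Term] entries in the layout used in the proposal and lists the parents each term claims via is_a. The function name parse_terms and the PROPOSED_TERMS sample are hypothetical and only for illustration. It is a simplified reader, not a full OBO parser, and the MS:9999999 ids are the placeholders from the proposal.

```python
# Hypothetical helper: a simplified reader for OBO [Term] stanzas.
# It handles only the "tag: value" lines used in the proposal above.

PROPOSED_TERMS = """\
[Term]
id: MS:9999999
name: MS1 scan
def: "The specific scan function or process that records a MS1 spectrum." [PSI:MS]
is_a: MS:1000020 ! scanning method

[Term]
id: MS:9999999
name: precursor ion spectrum
def: "Spectrum generated by scanning precursor m/z while monitoring a fixed product m/z" [PSI:MS]
is_a: MS:1000524 ! data file content
is_a: MS:1000559 ! spectrum type
"""


def parse_terms(text):
    """Return one dict per [Term] stanza; 'is_a' holds a list of parent ids."""
    terms = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("["):
            # Start a new stanza; ignore stanza types other than [Term].
            current = {"is_a": []} if line == "[Term]" else None
            if current is not None:
                terms.append(current)
            continue
        if current is None:
            continue
        # Split on the first colon only, since ids such as MS:1000020 contain one.
        tag, _, value = line.partition(":")
        value = value.strip()
        if tag == "is_a":
            # Drop the trailing "! human-readable name" comment.
            current["is_a"].append(value.split("!")[0].strip())
        else:
            current[tag] = value
    return terms


if __name__ == "__main__":
    for term in parse_terms(PROPOSED_TERMS):
        print(term["name"], "->", ", ".join(term["is_a"]))
```

Splitting on the first colon only is deliberate, because the accession numbers themselves contain a colon.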