From: Thaman C. <th...@ut...> - 2011-09-03 20:33:14
Hi Experts,

I am somewhat lost in the beginning phase of my first project in the Proteomics department. My main task is designing a database for the various file formats used in proteomics research (RAW, WIFF, MGF, pepXML, DTA). So far the main formats are RAW, WIFF, MGF and DTA, which are not XML based, and pepXML, which is.

Obviously, information overlaps across the different file formats and experiments. An analysis can be made from a single file (one set of experimental measurements) or by comparing sets of files. What is missing is an understanding of recurring patterns within and between experiments. A database is needed to query experiments (files) with varied parameters; file-based searching is not efficient enough to answer our questions.

What kind of search term do we use across experiments? The monoisotopic mass. "Have we seen this monoisotopic mass before in other experiments?" is the main question in our research.

I have started by converting RAW to mzML. I went through the mzML documentation available from HUPO and am trying to design a relational schema manually from the mzML schema. Although mzML is quite well documented, I have to confess that some things are not clear to me.

Questions
--------------
1) Where is information like the "monoisotopic mass" recorded in the mzML file? Is it the precursor charge attribute in the spectrum element? Or am I missing something?
2) Can I be sure that all RAW files, and the mzML files converted from them, contain "monoisotopic mass" information regardless of vendor?
3) Do all the other mentioned files (WIFF, pepXML, DTA, MGF) also contain "monoisotopic mass" information?

Please guide me!

Regards,
RawProt
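The lookup described above ("have we seen this monoisotopic mass before?") can be sketched as a single indexed table queried with a mass tolerance. This is only a minimal illustration using SQLite; the table name `precursor`, its columns, the helper names and the 10 ppm default tolerance are all assumptions made for the example, not part of mzML or of any existing schema.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open (or create) a database with one hypothetical precursor table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS precursor (
               id          INTEGER PRIMARY KEY,
               source_file TEXT NOT NULL,   -- file the mass was read from
               scan        TEXT,            -- scan/spectrum identifier, if any
               mono_mass   REAL NOT NULL    -- monoisotopic mass in Da
           )"""
    )
    # An index on the mass column keeps range queries fast as the table grows.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_mono_mass ON precursor (mono_mass)"
    )
    return conn


def add_precursor(conn, source_file, scan, mono_mass):
    """Store one precursor mass together with where it came from."""
    conn.execute(
        "INSERT INTO precursor (source_file, scan, mono_mass) VALUES (?, ?, ?)",
        (source_file, scan, mono_mass),
    )


def seen_before(conn, mono_mass, tol_ppm=10.0):
    """Return all stored precursors within +/- tol_ppm of mono_mass."""
    delta = mono_mass * tol_ppm / 1e6
    rows = conn.execute(
        "SELECT source_file, scan, mono_mass FROM precursor "
        "WHERE mono_mass BETWEEN ? AND ? ORDER BY mono_mass",
        (mono_mass - delta, mono_mass + delta),
    )
    return rows.fetchall()
```

Whether one flat table is enough depends on how much of the experimental metadata has to be queried as well; the sketch only covers the mass lookup itself.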
From: Martin E. <mar...@ru...> - 2011-09-09 15:34:43
Dear Thaman Chand!

Thanks for your interest and your contribution! I believe that if you dig deeper into the field, you will find that information in the format descriptions or in example files. With current spectrometers, masses are usually monoisotopic; with older instruments, the parent/precursor mass was sometimes reported as "average isotopic".

Only briefly (and only as far as I know; others may correct or supplement this):
- MGF has an optional MASS tag stating the mass type; PEPMASS is the precursor mass.
- DTA files themselves contain the precursor mass in the first line, but do NOT contain the mass type; that is given elsewhere, e.g. in sequest.params in the same folder (mass_type_parent tag).
- RAW and WIFF are instrument files, usually containing profile data.
- mzML can hold profile data OR a peak list; in a peak list the mass type is probably given as a CV term, but that should be answered by the mzML experts.

Best regards!
Martin

From: Thaman Chand [mailto:th...@ut...]
Sent: Saturday, 3 September 2011 22:18
To: psi...@li...
Subject: [Psidev-ms-dev] Does mzML, WIFF, MGF, DTA, and pepXML contains "mono-isotopic mass weight Information"?
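The MGF and DTA notes in the reply can be illustrated with a small reader. This is a rough sketch, not a full parser: it assumes Mascot-style MGF, where a file-level `MASS=` line names the mass type and `PEPMASS=` inside each `BEGIN IONS` block holds the observed precursor m/z (optionally followed by an intensity), and it assumes the first line of a DTA file holds the singly protonated mass (MH+) and the charge. The function names and the positive-ion mass conversion are choices made for this example.

```python
import io

PROTON = 1.007276  # approximate proton mass in Da


def parse_charge(text):
    """Turn a CHARGE value such as '2+' or '3-' into a signed integer."""
    token = text.split()[0]
    sign = -1 if token.endswith("-") else 1
    return sign * int(token.rstrip("+-"))


def parse_mgf(handle):
    """Yield one dict per spectrum with mass type, PEPMASS and charge."""
    mass_type = None          # file-level MASS tag; stays None if absent
    in_ions = False
    pepmass = charge = None
    for raw in handle:
        line = raw.strip()
        if line == "BEGIN IONS":
            in_ions, pepmass, charge = True, None, None
        elif line == "END IONS":
            if pepmass is not None:
                yield {"mass_type": mass_type,
                       "pepmass": pepmass,
                       "charge": charge}
            in_ions = False
        elif "=" in line:
            key, value = line.split("=", 1)
            key = key.strip().upper()
            if key == "MASS" and not in_ions:
                mass_type = value.strip()
            elif key == "PEPMASS" and in_ions:
                pepmass = float(value.split()[0])  # ignore optional intensity
            elif key == "CHARGE" and in_ions:
                charge = parse_charge(value)


def parse_dta_header(handle):
    """Return (MH+, charge) from the first line of a DTA file."""
    fields = handle.readline().split()
    return float(fields[0]), int(fields[1])


def neutral_mass_from_mz(mz, charge):
    """Neutral mass from m/z for a positive ion with the given charge."""
    return (mz - PROTON) * charge


def neutral_mass_from_mh(mh):
    """Neutral mass from a singly protonated mass (MH+)."""
    return mh - PROTON
```

Note that neither reader can tell on its own whether a value is monoisotopic or average: for MGF that depends on the optional MASS tag being present, and for DTA on the accompanying parameter file, as described above.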