From: David O. <do...@eb...> - 2011-10-11 12:48:41
Dear all,

Sorry about the really long mail. I've divided the things we had to do into two blocks: changes that I considered for the next release ("Changes and additions" section) and pending changes related to points not yet clarified ("Pending things").

Attached to this mail is a release candidate (the new version number will be 3.12.0). Please check the changes and additions. I expect everything is OK. If you don't agree with the changes, please email the list and we'll change whatever is needed. As soon as everything is agreed, I'll publish the new version.

=====================================================
Changes and additions
=====================================================

- Two new terms proposed by Steve Robles (last mail Oct 10th, "Request for new terms"): MS:1001878 ("Offset Voltage") and MS:1001879 ("In-source collision-induced dissociation"). Eric Deutsch pointed out a possible overlap between the first one (MS:1001878 -> "Offset Voltage") and the two terms MS:1000876 ("cone voltage") and MS:1000877 ("tube lens"). There were no further mails on this point. I should add that the last term's name ("tube lens") would maybe be better as "tube lens voltage" (change not included in the temporary obo file). The two new terms:

[Term]
id: MS:1001878
name: Offset Voltage
def: "The potential difference between two adjacent interface voltages affecting in-source collision induced dissociation." [PSI:MS]
is_a: MS:1000482 ! source attribute
relationship: has_units UO:0000218 ! volt

[Term]
id: MS:1001879
name: In-source collision-induced dissociation
def: "The dissociation of an ion as a result of collisional excitation during ion transfer from an atmospheric pressure ion source and the mass spectrometer vacuum." [PSI:MS]
is_a: MS:1000044 ! dissociation method

- Change in definition for MS:1000932 ("...MDS SCIEX TripleTOF 5500..." changed to "...MDS SCIEX TripleTOF 5600...") (last mail Oct 10th, "Request for new terms").

- A definition for the term MS:1000672 (name: Cliquid) (last mail Oct 10th, "Request for new terms"). It is an Applied Biosystems software package. Browsing, I found that the definition could maybe be something like this:

def: "AB SCIEX or Applied Biosystems software for data analysis and quantitation." [PSI:MS]

I found the most complete description of the software at this link:
http://www.mass-spec-capital.com/product/cliquid-software-applied-biosystems-abi-unit-life-technologies-2001-18662.html

- New term added for the mz5 format (last mail Oct 10th, "Request for new terms"):

[Term]
id: MS:1001880
name: mz5 file
def: "mz5 file format, modeled after mzML, developed by the Steen Lab." [PSI:MS]
is_a: MS:1000560 ! mass spectrometer file format

- Space removed in the 1000139 name: "4000 Q TRAP" -> "4000 QTRAP" (last mail Oct 10th, "Request for new terms"). Does this cause a problem with the obsoleted term of the same name (id: MS:1000870)? No problems reported.

- Adding 3 transition validation attributes (last mail Oct 10th, "Request for new terms"):

[Term]
id: MS:1001881
name: transition validation attribute
def: "Attributes of the quality of a transition that affect its selection as appropriate." [PSI:MS]
relationship: part_of MS:1000908 ! transition

[Term]
id: MS:1001882
name: coefficient of variation
def: "Variation of a set of signal measurements calculated as the standard deviation relative to the mean." [PSI:MS]
xref: value-type:xsd\:float "The allowed value-type for this CV term."
is_a: MS:1001881 ! transition validation attribute

[Term]
id: MS:1001883
name: signal-to-noise ratio
def: "Unitless number providing the ratio of the total measured intensity of a signal relative to the estimated noise level for that signal." [PSI:MS]
xref: value-type:xsd\:float "The allowed value-type for this CV term."
is_a: MS:1001881 ! transition validation attribute

- Adding 1 command-line parameter term as suggested by Magnus (last mail Oct 10th, "Request for new terms"):

[Term]
id: MS:1001884
name: command-line parameters
def: "Parameters string passed to a command-line interface software application." [PSI:MS]
xref: value-type:xsd\:string "The allowed value-type for this CV term."
is_a: MS:1000630 ! data processing parameter

- Two terms (MS:1001843 and MS:1001844, peak intensity and area, now children of MS:1000042 ! peak intensity) proposed by Eric Deutsch about “peak area” and “peak height” (last email Oct 10th, "PSI-MS CV: peak intensity area terms"). Related to this, the term MS:1001845 (also "peak area") has been obsoleted because it duplicated MS:1001844.

=====================================================
Pending things
=====================================================

Pending things 1 -> Some comments in Eric Deutsch's mail (Oct 10th) not yet implemented in the controlled vocabulary and still to be discussed:

2) Is there really a difference between “peak area” and “XIC area”? I suspect not. If there really is an intended subtle difference (e.g. “peak area” takes into account background removal, which “XIC area” is irrespective of background) then we should define this.

3) It seems to me that all these term names are incomplete and potentially misleading. For example, the “peak area” term is not really for the concept of “peak area”; it is the concept of “quantifying signal by measuring peak area”. One can infer this by knowing the parent, but if our goal is to create term names that can stand on their own, I think these should be clarified. It will be tempting to users to use the “peak area” term to provide a measurement of a peak area.

4) For 1859, normalized to what?

5) For 1130, can peptides have an area? Mass specs don’t see peptides, they see peptide ions. And they see them as peaks. So it would seem that “peptide raw area” is a badly named term that probably when decomposed means the same as “peak area”. Or, if I’m wrong, can we improve the definition?

Pending things 2 -> Some comments by David Ovelleiro (mail Sept 27th, "Possible need of changing some things under "identification result details" (MS:1001405)") not yet implemented in the controlled vocabulary:

- comment 1: there are two terms, MS:1001362 and MS:1001114, which, in addition to being children of their respective parents (MS:1001116 and MS:1001105 respectively), are also direct children of MS:1001405. The problem I see here is that the two parents are themselves direct descendants of MS:1001405. Is this not redundant and unnecessary? My proposal is to remove the direct is_a relationship to MS:1001405. Please check the picture "screen1.jpg" for a more graphical description.

- comment 2: the term "Mascot query number" (MS:1001528) is a direct child of "spectrum identification result details" (MS:1001405). Don't you think that this term would be better placed under "search engine specific score" (MS:1001153) (child to the previous MS:1001528)?

- comment 3: the terms related to the "False Discovery Rate" are, in my opinion, somewhat confusing at this point. I attach a screenshot called screen2.jpg to illustrate what I'm saying. At least two of the terms ("pep:global FDR" and "prot:global FDR") seem misplaced to me. Maybe they should work like "local FDR", with a single term called "global FDR" that is a child of both the "peptide" and "protein" "identification confidence metric" terms (MS:1001198 and MS:1001092). Or maybe two terms could be used (the way it is now), but as children of MS:1001198 and MS:1001092, and with the prefixes "pep:" and "prot:" replaced by the proper "peptide" and "protein".

Two replies (Eric Deutsch and David Creasy, mails Sept 27th) seem to support the first point. The second point was rejected. The third point was extended in Eric Deutsch's mail.

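To make comment 1 above more concrete, here is a minimal Python sketch of how such redundant direct is_a links could be detected. It is only an illustration: the small graph below encodes just the relationships described in comment 1 (it is not read from the obo file), and the names `IS_A`, `ancestors` and `redundant_is_a` are made up for this example.

```python
def ancestors(term, parents, seen=None):
    """Return every term reachable from `term` by following is_a edges."""
    if seen is None:
        seen = set()
    for parent in parents.get(term, ()):
        if parent not in seen:
            seen.add(parent)
            ancestors(parent, parents, seen)
    return seen


def redundant_is_a(parents):
    """Return (child, parent) edges already implied by another direct parent."""
    redundant = []
    for child, direct in sorted(parents.items()):
        for parent in sorted(direct):
            others = direct - {parent}
            # The edge is redundant if some other direct parent already
            # reaches `parent` through its own is_a chain.
            if any(parent in ancestors(other, parents) for other in others):
                redundant.append((child, parent))
    return redundant


# Hypothetical toy graph: term -> set of direct is_a parents,
# taken only from the description in comment 1.
IS_A = {
    "MS:1001362": {"MS:1001116", "MS:1001405"},
    "MS:1001114": {"MS:1001105", "MS:1001405"},
    "MS:1001116": {"MS:1001405"},
    "MS:1001105": {"MS:1001405"},
}

for child, parent in redundant_is_a(IS_A):
    print(f"{child} is_a {parent} is redundant")
```

With this toy input, the two edges flagged are exactly the direct links from MS:1001362 and MS:1001114 to MS:1001405 that comment 1 proposes to remove.
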
Pending things 3 -> The "modification specificity N-term/C-term" related terms were NOT modified following the proposal in Martin Eisenacher's mail (Sept 1st). In the mail sent by David Ovelleiro (Sept 16th), changes related to modification specificity were put on hold until more consensus was reached. This should be clarified (I'm pretty sure a final solution for this is needed).

=================================================================================

Thanks for your attention.

--
David Ovelleiro
Bioinformatician
PRIDE Group
Proteomics Services Team, PANDA Group
EMBL European Bioinformatics Institute
Wellcome Trust Genome Campus
Hinxton, Cambridge, UK CB10 1SD