From: Jones, A. <And...@li...> - 2008-08-01 10:26:05
> If this is describing three Y-H20 ions, 3, 8 and 10 (i.e. all of the
> Y-H20 ions for this peptide identification) then the attribute
> value="3" on the cvParam element should be removed - or have I
> misunderstood how this works?

Correct, my mistake. The example says we have found y3-H2O, y8-H2O and y10-H2O, so the cvParam should not have had the value attribute:

<Fragmentation>
  <IonType>
    <cvParam cvLabel="Waters" accession="PLGS:00035" name="y ion -H2O"/>
    <FragArrayIndex values = "3 8 10"/>
    <FragArray Measure_ref = "m1" values = "379.2215 457.12345 540.234"/>
    <FragArray Measure_ref = "m2" values = "1382.0 2055.5 340.0"/>
    <!-- and so on for other measures as defined in the FragmentationTable -->
  </IonType>
  <IonType>
    <cvParam cvLabel="Waters" accession="PLGS:00032" name="b ion"/>
    <FragArrayIndex values = "2 12 14"/>
    <FragArray Measure_ref = "m1" values = "560.153 859.111 945.653"/>
    <FragArray Measure_ref = "m2" values = "502.0 330.5 559.5"/>
    <!-- and so on for other measures as defined in the FragmentationTable -->
  </IonType>
</Fragmentation>

> Please excuse me for stating the obvious, but... there is no reason
> why the pointers m1, m2, m3, m4 could not be more human readable, so
> changed in this example to mz, inten, mz_error, ret_error for example.
> (To help implementors understand the mechanism).

Good suggestion.

Cheers
Andy

> -----Original Message-----
> From: phi...@go... [mailto:phi...@go...] On
> Behalf Of Phil Jones @ EBI
> Sent: 01 August 2008 11:23
> To: Jones, Andy; psi...@li...
> Subject: Re: [Psidev-pi-dev] Fragmentation Ions
>
> Hi Andy,
>
> This looks really good - both flexible and compact.
>
> Just to clarify - in your example:
>
> <IonType>
>   <cvParam cvLabel="Waters" accession="PLGS:00035" name="y ion -H2O" value="3"/>
>   <FragArrayIndex values = "3 8 10"/>
>   <FragArray Measure_ref = "m1" values = "379.2215 457.1234 540.234"/>
>   <FragArray Measure_ref = "m2" values = "1382.0 2055.5 340.0"/>
>   <!-- and so on for other measures as defined in the FragmentationTable -->
> </IonType>
>
> If this is describing three Y-H20 ions, 3, 8 and 10 (i.e. all of the
> Y-H20 ions for this peptide identification) then the attribute
> value="3" on the cvParam element should be removed - or have I
> misunderstood how this works?
>
> Please excuse me for stating the obvious, but... there is no reason
> why the pointers m1, m2, m3, m4 could not be more human readable, so
> changed in this example to mz, inten, mz_error, ret_error for example.
> (To help implementors understand the mechanism).
>
> best regards,
>
> Phil.
>
> 2008/8/1 Jones, Andy <And...@li...>:
> > Hi all,
> >
> > Here's a proposal for fragmentation ions as discussed on the call that's halfway
> > between using cvParams for all values and using an array based encoding. I think
> > it's pretty flexible and concise.
> >
> > First up, set up a FragmentationTable for the entire list of the spectra, which says
> > the kinds of measures you're going to report lower down:
> >
> > <SpectrumIdentificationList id="MASCOT_results">
> >   <FragmentationTable>
> >     <Measures>
> >       <Measure id = "m1">
> >         <cvParam cvLabel="Waters" accession="PLGS:00024" name="product ion m/z"/>
> >       </Measure>
> >       <Measure id = "m2">
> >         <cvParam cvLabel="Waters" accession="PLGS:00025" name="product ion intensity"/>
> >       </Measure>
> >       <Measure id = "m3">
> >         <cvParam cvLabel="Waters" accession="PLGS:00026" name="product ion m/z error"/>
> >       </Measure>
> >       <Measure id = "m4">
> >         <cvParam cvLabel="Waters" accession="PLGS:00027" name="product ion retention time error"/>
> >       </Measure>
> >     </Measures>
> >   </FragmentationTable>
> >
> > Then for each SpectrumIdentificationItem, you reference back to these Measures
> >
> > <SpectrumIdentificationItem id="SEQ_spec1_pep1" Peptide_ref="prot1_pep1" chargeState="1">
> >   <PeptideEvidence id="PE1_SEQ_spec1_pep1" start="67" pre="-" end="79" isDecoy="false" />
> >
> >   ...
> >
> >   <Fragmentation>
> >     <IonType>
> >       <cvParam cvLabel="Waters" accession="PLGS:00035" name="y ion -H2O" value="3"/>
> >       <FragArrayIndex values = "3 8 10"/>
> >       <FragArray Measure_ref = "m1" values = "379.2215 457.1234 540.234"/>
> >       <FragArray Measure_ref = "m2" values = "1382.0 2055.5 340.0"/>
> >       <!-- and so on for other measures as defined in the FragmentationTable -->
> >     </IonType>
> >     <IonType>
> >       <cvParam cvLabel="Waters" accession="PLGS:00032" name="b ion" value="4"/>
> >       <FragArrayIndex values = "2 12 14"/>
> >       <FragArray Measure_ref = "m1" values = "560.153 859.111 945.653"/>
> >       <FragArray Measure_ref = "m2" values = "502.0 330.5 559.5"/>
> >       <!-- and so on for other measures as defined in the FragmentationTable -->
> >     </IonType>
> >   </Fragmentation>
> >
> > Each array contains space separated values (i.e. an xsd:list). The FragArrayIndex
> > tells you which ions you've found, i.e. for the second IonType we have b2, b12 and
> > b14, which have the m/z and intensity values in the m1 and m2 arrays. This will
> > save a lot of space if there are many ions of the same type in each array and I
> > think it is fairly easy to read as well. Slightly more space could be saved by
> > defining the ion types in the FragmentationTable but not much really once you've
> > added a reference back up to it.
> >
> > Cheers
> > Andy
> >
> >> -----Original Message-----
> >> From: psi...@li... [mailto:psidev-pi-dev-bo...@li...] On Behalf Of Matthew Chambers
> >> Sent: 18 July 2008 16:00
> >> To: psi...@li...
> >> Subject: Re: [Psidev-pi-dev] Fragment Ions in analysisXML - how it is currently
> >> handled in PRIDE (Issue 28)
> >>
> >> I also agree that anything beyond an array is far too verbose. To answer
> >> this question, I think we need to decide the scope of the problem. What
> >> do we want fragment ion information to represent? I think analysis
> >> software is too diverse to use it for anything more than basic
> >> annotation, but basic annotation is important. If there are ways people
> >> want it to be usable beyond that, speak up. :)
> >>
> >> For basic annotation, all I think is needed is the fragment type, series
> >> number, charge state, and possibly any modification like a neutral loss
> >> or radical. The array can be an attribute or text node. We can use a
> >> grammar for each term, where each term represents an ion and terms are
> >> space delimited. The grammar might look like:
> >> <a|b|c|x|y|z><# between 1 and peptide_length>[<+|-><formula>][,<+|-><charge>]
> >> We could make the charge part mandatory or if it was optional, assume a
> >> +1 charge (or possibly allow the charge to be based on the polarity of
> >> the source scan?). I assume there is a standard chemical formula format
> >> that can be represented compactly in ASCII text, but I don't know it.
> >> An example to show how compact it could be:
> >> fragmentIons="b3 y7,+2 b4 y5 y4 b7-H2O y3 y2 b7-H2O,+2 y3 y2"
> >>
> >> For basic annotation, the masses are not necessary I think. Expected
> >> mass can be recomputed if all the label metadata is complete and
> >> regular, and the observed mass is unimportant for annotation (IMO).
> >>
> >> -Matt
> >>
> >>
> >> David Creasy wrote:
> >> > Hi Phil,
> >> >
> >> > Just to be sure I've not misunderstood... from below, each fragment ion
> >> > takes approx 500 bytes. Let's assume a conservative average of 20
> >> > fragment matches per spectrum and a modest search with 100k spectra.
> >> > Assuming that we just report fragment matches for the top match for each
> >> > spectrum, this would result in a file that is 500 x 20 x 100,000 bytes = 1 GB.
> >> > If we reported fragment matches for the top 10 matches for each
> >> > spectrum, this would be 10 GB. Is this reasonable and acceptable?
> >> >
> >> > David
> >> >
> >> >
> >> > Phil Jones @ EBI wrote:
> >> >
> >> >> Hi,
> >> >>
> >> >> Regarding Issue 28
> >> >> <http://code.google.com/p/psi-pi/issues/detail?id=28> "support
> >> >> reporting of fragment ions"
> >> >>
> >> >> As a suggestion of how this might be tackled:
> >> >>
> >> >> The latest development version of the PRIDE database includes a very
> >> >> simple mechanism for recording fragment ion information, illustrated
> >> >> below. (Please note - made up data.)
> >> >>
> >> >> In this example, CV terms are used to define the type of ion and
> >> >> related information / annotation. Note that this is even simpler than
> >> >> the suggestion made by Andy above - no attempt is made here to indicate
> >> >> which residue has been called for each fragment ion - it is just
> >> >> listing the ions.
> >> >>
> >> >> Also note that while the PeptideItem is referencing the mass spectrum
> >> >> (which is reported in detail in the associated mzData file), the
> >> >> individual fragment ions are just reporting the m/z value and not
> >> >> attempting to make any kind of hard reference to the spectrum.
> >> >>
> >> >> As you can see, this has been developed in collaboration with Waters,
> >> >> with output from the ProteinLynx Global Server. (Actual values /
> >> >> sequence have been changed).
> >> >>
> >> >> One possible change would be to make the m/z value an attribute of the
> >> >> FragmentIon element, as this value will be mandatory and required to
> >> >> relate the fragment ion to the correct peak on the mass spectrum. The
> >> >> CV used for the annotation would also need to be part of the PI CV ??
> >> >>
> >> >> Note that in the existing model, there are other terms available, to
> >> >> allow any kind of fragment ion to be described (not just B and Y ions)
> >> >>
> >> >> In the context of analysisXML, the <FragmentIon/> elements would be
> >> >> children of a <SpectrumIdentificationResultItem/>
> >> >>
> >> >> best regards,
> >> >>
> >> >> Phil.
> >> >>
> >> >> <PeptideItem>
> >> >>   <Sequence>LFQQSQWTREVFSNSCK</Sequence>
> >> >>   <Start>435</Start>
> >> >>   <End>460</End>
> >> >>   <SpectrumReference>123</SpectrumReference>
> >> >>   <FragmentIon>
> >> >>     <cvParam cvLabel="Waters" accession="PLGS:00032" name="b ion" value="3"/>
> >> >>     <cvParam cvLabel="Waters" accession="PLGS:00024" name="product ion m/z" value="379.2215"/>
> >> >>     <cvParam cvLabel="Waters" accession="PLGS:00025" name="product ion intensity" value="1382.0"/>
> >> >>     <cvParam cvLabel="Waters" accession="PLGS:00026" name="product ion m/z error" value="-7.1543"/>
> >> >>     <cvParam cvLabel="Waters" accession="PLGS:00027" name="product ion retention time error" value="0.0207"/>
> >> >>   </FragmentIon>
> >> >>   <FragmentIon>
> >> >>     <cvParam cvLabel="Waters" accession="PLGS:00032" name="b ion" value="4"/>
> >> >>     <cvParam cvLabel="Waters" accession="PLGS:00024" name="product ion m/z" value="534.2811"/>
> >> >>     <cvParam cvLabel="Waters" accession="PLGS:00025" name="product ion intensity" value="1242.0"/>
> >> >>     <cvParam cvLabel="Waters" accession="PLGS:00026" name="product ion m/z error" value="-8.2315"/>
> >> >>     <cvParam cvLabel="Waters" accession="PLGS:00027" name="product ion retention time error" value="0.0029"/>
> >> >>   </FragmentIon>
> >> >>   <FragmentIon>
> >> >>     <cvParam cvLabel="Waters" accession="PLGS:00031" name="y ion" value="3"/>
> >> >>     <cvParam cvLabel="Waters" accession="PLGS:00024" name="product ion m/z" value="394.1813"/>
> >> >>     <cvParam cvLabel="Waters" accession="PLGS:00025" name="product ion intensity" value="1917.0"/>
> >> >>     <cvParam cvLabel="Waters" accession="PLGS:00026" name="product ion m/z error" value="-14.7098"/>
> >> >>     <cvParam cvLabel="Waters" accession="PLGS:00027" name="product ion retention time error" value="-0.0013"/>
> >> >>   </FragmentIon>
> >> >>   <FragmentIon>
> >> >>     <cvParam cvLabel="Waters" accession="PLGS:00035" name="y ion -H2O" value="3"/>
> >> >>     <cvParam cvLabel="Waters" accession="PLGS:00024" name="product ion m/z" value="367.1669"/>
> >> >>     <cvParam cvLabel="Waters" accession="PLGS:00025" name="product ion intensity" value="345.0"/>
> >> >>     <cvParam cvLabel="Waters" accession="PLGS:00026" name="product ion m/z error" value="-18.767"/>
> >> >>     <cvParam cvLabel="Waters" accession="PLGS:00027" name="product ion retention time error" value="0.0025"/>
> >> >>   </FragmentIon>
> >> >>   <additional>
> >> >>     <cvParam cvLabel="Waters" accession="PLGS:00014" name="precursor mass" value="1971.9194"/>
> >> >>     <cvParam cvLabel="Waters" accession="PLGS:00015" name="precursor intensity" value="181349.0"/>
> >> >>     <cvParam cvLabel="Waters" accession="PLGS:00016" name="precursor error in ppm" value="0.8043"/>
> >> >>     <cvParam cvLabel="Waters" accession="PLGS:00017" name="precursor retention time in minutes" value="57.3537"/>
> >> >>     <cvParam cvLabel="Waters" accession="PLGS:00019" name="product ion mass RMS error" value="14.5969"/>
> >> >>     <cvParam cvLabel="Waters" accession="PLGS:00020" name="product ion retention time RMS error" value="0.0093"/>
> >> >>     <cvParam cvLabel="Waters" accession="PLGS:00021" name="weighted average charge state" value="2.2"/>
> >> >>     <cvParam cvLabel="Waters" accession="PLGS:00039" name="pass one match" value="" />
> >> >>   </additional>
> >> >> </PeptideItem>
> >> >>
> >> >>
> >> >> --
> >> >> Phil Jones
> >> >> Senior Software Engineer
> >> >> PRIDE Project Team
> >> >> PANDA Group, EMBL-EBI
> >> >> Wellcome Trust Genome Campus
> >> >> Hinxton, Cambridge, CB10 1SD
> >> >> UK.
> >> >>
> >> >> Work phone: +44 1223 492662 (NEW NUMBER)
> >> >> Skype: philip-jones
> >> >>
> >> >> _______________________________________________
> >> >> Psidev-pi-dev mailing list
> >> >> Psi...@li...
> >> >> https://lists.sourceforge.net/lists/listinfo/psidev-pi-dev
>
> --
> Phil Jones
> Senior Software Engineer
> PRIDE Project Team
> PANDA Group, EMBL-EBI
> Wellcome Trust Genome Campus
> Hinxton, Cambridge, CB10 1SD
> UK.
>
> Work phone: +44 1223 492662 (NEW NUMBER)
> Skype: philip-jones
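
For implementors, here is a rough sketch of how the array encoding in the corrected example at the top of this message might be read back into one record per ion. It only illustrates the mechanism discussed in the thread: FragArrayIndex gives the series numbers, and each FragArray holds the values for one Measure at the same positions. The function name decode_fragmentation and the dictionary layout are made up for illustration and are not part of any schema or library; Python's standard xml.etree.ElementTree is used for parsing.

```python
import xml.etree.ElementTree as ET

# The corrected example from the reply above (no value attribute on cvParam).
FRAGMENTATION_XML = """
<Fragmentation>
  <IonType>
    <cvParam cvLabel="Waters" accession="PLGS:00035" name="y ion -H2O"/>
    <FragArrayIndex values = "3 8 10"/>
    <FragArray Measure_ref = "m1" values = "379.2215 457.12345 540.234"/>
    <FragArray Measure_ref = "m2" values = "1382.0 2055.5 340.0"/>
  </IonType>
  <IonType>
    <cvParam cvLabel="Waters" accession="PLGS:00032" name="b ion"/>
    <FragArrayIndex values = "2 12 14"/>
    <FragArray Measure_ref = "m1" values = "560.153 859.111 945.653"/>
    <FragArray Measure_ref = "m2" values = "502.0 330.5 559.5"/>
  </IonType>
</Fragmentation>
"""


def decode_fragmentation(xml_text):
    """Hypothetical helper: expand each IonType into one dict per ion."""
    ions = []
    root = ET.fromstring(xml_text)
    for ion_type in root.findall("IonType"):
        name = ion_type.find("cvParam").get("name")
        # Space separated list of series numbers, e.g. "3 8 10" -> [3, 8, 10].
        index_text = ion_type.find("FragArrayIndex").get("values")
        indices = [int(v) for v in index_text.split()]

        # One array per Measure_ref, aligned position by position with indices.
        arrays = {}
        for arr in ion_type.findall("FragArray"):
            values = [float(v) for v in arr.get("values").split()]
            if len(values) != len(indices):
                raise ValueError("FragArray length does not match FragArrayIndex")
            arrays[arr.get("Measure_ref")] = values

        for pos, index in enumerate(indices):
            ion = {"type": name, "index": index}
            for measure_ref, values in arrays.items():
                ion[measure_ref] = values[pos]
            ions.append(ion)
    return ions


if __name__ == "__main__":
    for ion in decode_fragmentation(FRAGMENTATION_XML):
        print(ion)
```

The length check reflects the assumption that every FragArray must have exactly as many entries as FragArrayIndex; the thread does not state this explicitly, but the positional pairing of b2, b12 and b14 with the m1 and m2 values implies it.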