From: Darren K. <Dar...@cs...> - 2008-05-02 20:28:39
Hi Eric,

Thanks for the nice summary.

I agree with Fredrik's proposal to add:

  sourceFileRef + externalNativeID

since the sourceFileRef may refer to a native data file, in which case
externalSpectrumID won't make sense.

I believe Matt's case is covered by the two possibilities:

  sourceFileRef + externalNativeID    (original file is native)
  sourceFileRef + externalSpectrumID  (original file is mzML)

Darren


On May 2, 2008, at 12:28 PM, Fredrik Levander wrote:

> Hi Eric and others,
>
> All nice thoughts and proposals. I would also add an example 4, which is:
>
> Example 4: (the current spectrum is a sum of two scans which had
> nativeIDs func1scan19 and func1scan20, the source file is found in
> the source file list)
>
> <acquisitionList count="2">
>   <cvParam cvLabel="MS" accession="MS:1000571" name="sum of spectra"/>
>   <acquisition nativeID="func1scan19" sourceFileRef="rawFile1" />
>   <acquisition nativeID="func1scan20" sourceFileRef="rawFile1" />
> </acquisitionList>
>
> It would all be covered by the following xsd:
>
> <xs:complexType name="AcquisitionType">
>   <xs:annotation>
>     <xs:documentation>Scan or acquisition from original file used to
>     create this peak list. A spectrumRef or an externalSpectrumID or
>     nativeID plus sourceFileRef should be given.</xs:documentation>
>   </xs:annotation>
>   <xs:complexContent>
>     <xs:extension base="dx:ParamGroupType">
>       <xs:attribute name="number" type="xs:int" use="required">
>         <xs:annotation>
>           <xs:documentation>A number for this acquisition.</xs:documentation>
>         </xs:annotation>
>       </xs:attribute>
>       <xs:attribute name="spectrumRef" type="xs:IDREF" use="optional">
>         <xs:annotation>
>           <xs:documentation>This attribute must reference the 'id'
>           attribute of the appropriate spectrum.</xs:documentation>
>         </xs:annotation>
>       </xs:attribute>
>       <xs:attribute name="nativeID" type="xs:string" use="optional">
>         <xs:annotation>
>           <xs:documentation>This attribute references the native
>           spectrum identifier in a raw file.</xs:documentation>
>         </xs:annotation>
>       </xs:attribute>
>       <xs:attribute name="externalSpectrumID" type="xs:string" use="optional">
>         <xs:annotation>
>           <xs:documentation>This attribute must reference the 'id'
>           attribute of the appropriate spectrum in an external mzML
>           file.</xs:documentation>
>         </xs:annotation>
>       </xs:attribute>
>       <xs:attribute name="sourceFileRef" type="xs:IDREF" use="optional">
>         <xs:annotation>
>           <xs:documentation>This attribute must reference the 'id'
>           attribute of the appropriate sourceFile.</xs:documentation>
>         </xs:annotation>
>       </xs:attribute>
>     </xs:extension>
>   </xs:complexContent>
> </xs:complexType>
>
> The semantic validation would be to check that at least one of the
> attributes spectrumRef, externalSpectrumID (+sourceFileRef) or nativeID
> is given.
>
> Or maybe there is even more to consider?
>
> Fredrik
>
>
> Eric Deutsch wrote:
>> Hi Fredrik and everyone, thank you for thinking about these last few
>> problems. It seems that there are several different ways in which one
>> might want to reference the source scans for a summed scan. Based on
>> what's been said so far, here are my thoughts and proposal. What do you
>> think?
>>
>> Currently for acquisitionList we have (in XML/XSD hybrid):
>> ---------------
>> <acquisitionList count="2">
>>   <cvParam cvLabel="MS" accession="MS:1000571" name="sum of spectra"/>
>>   <acquisition number="xs:int" sourceFileRef="xs:IDREF" spectrumRef="xs:IDREF">
>>     <cvParam cvLabel="MS" (optional child of scan attribute)/>
>>   </acquisition>
>>   <acquisition number="xs:int" sourceFileRef="xs:IDREF" spectrumRef="xs:IDREF"/>
>> </acquisitionList>
>> ---------------
>>
>> Fredrik suggests spectrumRef -> xs:string
>> externalSpectrumID="xs:string"
>>
>> This brought up the question of how sourceFileRef would reference itself
>> if everything were in the same file.
>>
>> Rune points out that we want nativeID references, like for Waters:
>> function1scan2, func1scan2, 1.2 or ....
>>
>> Darren suggests:
>> externalSpectrumID="URI"
>> externalNativeID="xs:string"
>>
>> -------------------------------------------
>>
>> So, I suggest something like this (in XML/XSD hybrid):
>>
>> <acquisitionList count="2">
>>   <cvParam cvLabel="MS" accession="MS:1000571" name="sum of spectra"/>
>>   (three possible options:)
>>   <acquisition spectrumRef="xs:IDREF"/>
>>   <acquisition nativeID="xs:string"/>
>>   <acquisition sourceFileRef="xs:IDREF" externalSpectrumID="xs:string"/>
>> </acquisitionList>
>>
>> ---
>>
>> Example 1: (the current spectrum is a sum of two scans which are also
>> present in the current file as ids S57 and S58.)
>>
>> <acquisitionList count="2">
>>   <cvParam cvLabel="MS" accession="MS:1000571" name="sum of spectra"/>
>>   <acquisition spectrumRef="S57"/>
>>   <acquisition spectrumRef="S58"/>
>> </acquisitionList>
>>
>> ---
>>
>> Example 2: (the current spectrum is a sum of two scans which had
>> nativeIDs func1scan19 and func1scan20, the exact location of which
>> are not specifiable)
>>
>> <acquisitionList count="2">
>>   <cvParam cvLabel="MS" accession="MS:1000571" name="sum of spectra"/>
>>   <acquisition nativeID="func1scan19"/>
>>   <acquisition nativeID="func1scan20"/>
>> </acquisitionList>
>>
>> ---
>>
>> Example 3: (the current spectrum is a sum of two scans which are
>> explicitly referenced externally by a specific file previously
>> defined in the current document and with IDs in that other file)
>>
>> <acquisitionList count="2">
>>   <cvParam cvLabel="MS" accession="MS:1000571" name="sum of spectra"/>
>>   <acquisition sourceFileRef="mzMLsF01" externalSpectrumID="S57"/>
>>   <acquisition sourceFileRef="mzMLsF01" externalSpectrumID="S58"/>
>> </acquisitionList>
>>
>> ---
>>
>> Also okay is 1+2:
>>
>>   <acquisition spectrumRef="S57" nativeID="func1scan19"/>
>>   <acquisition spectrumRef="S58" nativeID="func1scan20"/>
>>
>> ---
>>
>> Also okay is 2+3:
>>
>>   <acquisition nativeID="func1scan19" sourceFileRef="mzMLsF01"
>>                externalSpectrumID="S57"/>
>>   <acquisition nativeID="func1scan20" sourceFileRef="mzMLsF01"
>>                externalSpectrumID="S58"/>
>>
>> ---
>>
>> Thus, all four possible attributes are optional, and we would rely on
>> the semantic validator to enforce:
>>
>>       spectrumRef alone
>>    OR nativeID alone
>>    OR spectrumRef AND nativeID
>>    OR sourceFileRef AND externalSpectrumID
>>    OR sourceFileRef AND externalSpectrumID AND nativeID
>>
>> What do you think? A little unpleasant to have several different
>> options, but I don't see how we could practically exclude any of the
>> options.
>>
>>
>> As a related side note, Matt also asks if we can handle the case where
>> MS1 scans have been stripped out of a file, but the MS2 scans still
>> need to say something useful about their precursor scan (IDREF not
>> possible).
>>
>> I have not checked this, but we should spend some time thinking about
>> that once we have solved this problem.
>>
>> Thanks,
>> Eric
>>
>>
>>> -----Original Message-----
>>> From: psi...@li... [mailto:psidev-ms-dev-bo...@li...]
>>> On Behalf Of Darren Kessner
>>> Sent: Friday, May 02, 2008 9:13 AM
>>> To: Mass spectrometry standard development
>>> Subject: Re: [Psidev-ms-dev] Spectra from summed acquisitions in
>>> mzML 0.99.10
>>>
>>> I think it's a bad idea to have a spectrumRef to a spectrum that isn't
>>> in the file. We should be consistent in the use of internal
>>> references, all of which use IDREF so that dangling references can be
>>> caught during XML validation.
>>>
>>> External references, including those to spectra that have been removed
>>> for space reasons, should be indicated specifically (e.g. with
>>> externalSpectrumID or externalNativeID) so that the reader (human or
>>> software) knows that they'll have to do some extra work to find the
>>> referent.
>>>
>>> Darren
>>>
>>>
>>> On May 2, 2008, at 6:17 AM, Matt Chambers wrote:
>>>
>>>> Perhaps we should follow the same logic of the spectrum element
>>>> itself, where id is required but nativeID is optional. Thus, id is
>>>> required for a spectrum reference and a reference to the nativeID
>>>> would be optional (but recommended!). We can't use the same attribute
>>>> to point to either id or nativeID because then it won't be known
>>>> which is being referred to, unless I'm missing something in Fredrik's
>>>> proposal.
>>>>
>>>> To deal with IDs that aren't in the file, I agree with Fredrik. Any
>>>> time we have a reference that can be to a non-existent or external
>>>> element, we can't use IDREF. We either need to switch to xlink (in
>>>> which case IDREF would probably still be ok) or fall back to string.
>>>> However, since I think that acquisitions should be able to reference
>>>> spectra in the current file, we shouldn't change it to
>>>> externalSpectrumID. We should just document that all spectrumRef and
>>>> spectrumNativeID(Ref?) attributes may refer to a spectrum in the
>>>> current file or in another one, and in the former case the spectrum
>>>> might not actually be there (i.e. MSn spectra referencing precursor
>>>> spectra that have been stripped out to conserve space).
>>>>
>>>> -Matt
>>>>
>>>>
>>>> Rune Schjellerup Philosof wrote:
>>>>
>>>>> I think the format by which external file elements are referenced
>>>>> should be defined.
>>>>> For instance, for a reference to a Waters raw file, should that be
>>>>> function1scan2, func1scan2, 1.2 or ....
>>>>>
>>>>> --
>>>>> Rune
>>>>>
>>>>> Fredrik Levander wrote:
>>>>>
>>>>>> Hi All,
>>>>>>
>>>>>> There is an issue with the current mzML schema (0.99.10) when it
>>>>>> comes to referencing the origin of acquisitions in an
>>>>>> acquisitionList of summed spectra. In most cases the referenced
>>>>>> spectra will not be in the current mzML file, and thus the
>>>>>> spectrumRef cannot be of the type xs:IDREF, but should be xs:string
>>>>>> to also cover references to spectrum IDs in other mzML files, or
>>>>>> native IDs in vendor files.
>>>>>>
>>>>>> The following would cover both internal and external spectrum
>>>>>> referencing:
>>>>>>
>>>>>> <xs:complexType name="AcquisitionType">
>>>>>>   <xs:annotation>
>>>>>>     <xs:documentation>Scan or acquisition from original raw file
>>>>>>     used to create this peak list, as specified in
>>>>>>     sourceFile.</xs:documentation>
>>>>>>   </xs:annotation>
>>>>>>   <xs:complexContent>
>>>>>>     <xs:extension base="dx:ParamGroupType">
>>>>>>       <xs:attribute name="number" type="xs:int" use="required">
>>>>>>         <xs:annotation>
>>>>>>           <xs:documentation>A number for this acquisition.</xs:documentation>
>>>>>>         </xs:annotation>
>>>>>>       </xs:attribute>
>>>>>>       <xs:attribute name="spectrumRef" type="xs:string" use="required">
>>>>>>         <xs:annotation>
>>>>>>           <xs:documentation>This attribute must reference the 'id'
>>>>>>           attribute of the appropriate spectrum if found within an
>>>>>>           mzML file, or the native spectrum identifier in a raw file
>>>>>>           in another format.</xs:documentation>
>>>>>>         </xs:annotation>
>>>>>>       </xs:attribute>
>>>>>>       <xs:attribute name="sourceFileRef" type="xs:IDREF" use="required">
>>>>>>         <xs:annotation>
>>>>>>           <xs:documentation>This attribute must reference the 'id'
>>>>>>           attribute of the appropriate sourceFile. It can also refer
>>>>>>           to the present mzML file.</xs:documentation>
>>>>>>         </xs:annotation>
>>>>>>       </xs:attribute>
>>>>>>     </xs:extension>
>>>>>>   </xs:complexContent>
>>>>>> </xs:complexType>
>>>>>>
>>>>>> However, I am not sure if there are any use cases when both the
>>>>>> summed spectrum and the original spectra are found in the same
>>>>>> file. If not, the spectrumRef attribute should maybe be renamed to
>>>>>> 'externalSpectrumID' or something else, since 'spectrumRef' somehow
>>>>>> indicates that the spectrum is in the present file.
>>>>>>
>>>>>> If there is a need to reference spectra within the file, the
>>>>>> following may be an alternative (I think Darren proposed something
>>>>>> similar):
>>>>>>
>>>>>> <xs:complexType name="AcquisitionType">
>>>>>>   <xs:annotation>
>>>>>>     <xs:documentation>Scan or acquisition from original file used
>>>>>>     to create this peak list. Either a spectrumRef or an
>>>>>>     externalSpectrumID plus sourceFileRef should be
>>>>>>     given.</xs:documentation>
>>>>>>   </xs:annotation>
>>>>>>   <xs:complexContent>
>>>>>>     <xs:extension base="dx:ParamGroupType">
>>>>>>       <xs:attribute name="number" type="xs:int" use="required">
>>>>>>         <xs:annotation>
>>>>>>           <xs:documentation>A number for this acquisition.</xs:documentation>
>>>>>>         </xs:annotation>
>>>>>>       </xs:attribute>
>>>>>>       <xs:attribute name="spectrumRef" type="xs:IDREF" use="optional">
>>>>>>         <xs:annotation>
>>>>>>           <xs:documentation>This attribute must reference the 'id'
>>>>>>           attribute of the appropriate spectrum.</xs:documentation>
>>>>>>         </xs:annotation>
>>>>>>       </xs:attribute>
>>>>>>       <xs:attribute name="externalSpectrumID" type="xs:string" use="optional">
>>>>>>         <xs:annotation>
>>>>>>           <xs:documentation>This attribute must reference the 'id'
>>>>>>           attribute of the appropriate spectrum if found within an
>>>>>>           external mzML file, or the native spectrum identifier in a
>>>>>>           raw file in another format.</xs:documentation>
>>>>>>         </xs:annotation>
>>>>>>       </xs:attribute>
>>>>>>       <xs:attribute name="sourceFileRef" type="xs:IDREF" use="optional">
>>>>>>         <xs:annotation>
>>>>>>           <xs:documentation>This attribute must reference the 'id'
>>>>>>           attribute of the appropriate sourceFile. It can also refer
>>>>>>           to the present mzML file.</xs:documentation>
>>>>>>         </xs:annotation>
>>>>>>       </xs:attribute>
>>>>>>     </xs:extension>
>>>>>>   </xs:complexContent>
>>>>>> </xs:complexType>
>>>>>>
>>>>>> So, main question: are there any use cases with the original scans
>>>>>> and the summed spectrum in the same file, and is there in this case
>>>>>> a need to distinguish clearly between external and internal
>>>>>> referencing (second schema alternative)?
>>>>>>
>>>>>> A minor point is also that the documentation of the spectrum
>>>>>> attribute nativeID should be updated to something like:
>>>>>>
>>>>>> The native identifier for the spectrum, used by the acquisition
>>>>>> software. If the spectrum is reconstructed from more than one
>>>>>> spectrum, the native identifier of the first acquisition in time
>>>>>> should be used.
>>>>>>
>>>>>> Regards
>>>>>>
>>>>>> Fredrik
>>>> _______________________________________________
>>>> Psidev-ms-dev mailing list
>>>> Psi...@li...
>>>> https://lists.sourceforge.net/lists/listinfo/psidev-ms-dev
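
The semantic checks discussed in this thread could be sketched roughly as below. This is only an illustration, not part of any existing mzML validator: the function name check_acquisition_list and the ALLOWED table are hypothetical, the elements are assumed to carry no XML namespace (as in the examples above), and externalNativeID is assumed to exist as proposed in the reply at the top. The allowed combinations are Eric's five plus the sourceFileRef + externalNativeID pair. The lookups for spectrumRef and sourceFileRef mimic what IDREF validation would catch, namely references that do not resolve inside the current file.

```python
import xml.etree.ElementTree as ET

# Attribute combinations taken from the thread. The first five are Eric's
# list; the last one (sourceFileRef + externalNativeID) is the proposed
# addition and is an assumption here, not an agreed part of the schema.
ALLOWED = {
    frozenset({"spectrumRef"}),
    frozenset({"nativeID"}),
    frozenset({"spectrumRef", "nativeID"}),
    frozenset({"sourceFileRef", "externalSpectrumID"}),
    frozenset({"sourceFileRef", "externalSpectrumID", "nativeID"}),
    frozenset({"sourceFileRef", "externalNativeID"}),
}

# Only these attributes take part in the combination check.
REFERENCE_ATTRS = {
    "spectrumRef",
    "nativeID",
    "externalSpectrumID",
    "externalNativeID",
    "sourceFileRef",
}


def check_acquisition_list(acq_list, spectrum_ids=(), source_file_ids=()):
    """Return a list of problem descriptions; an empty list means no problems.

    acq_list        -- an <acquisitionList> element (no namespace assumed)
    spectrum_ids    -- ids of spectra present in the current file
    source_file_ids -- ids of sourceFile entries in the current file
    """
    problems = []
    acquisitions = acq_list.findall("acquisition")

    # The count attribute should match the number of child acquisitions.
    declared = acq_list.get("count")
    if declared is not None and declared != str(len(acquisitions)):
        problems.append(
            f"count={declared} but {len(acquisitions)} acquisition elements found"
        )

    for index, acq in enumerate(acquisitions, start=1):
        present = frozenset(REFERENCE_ATTRS & set(acq.attrib))
        if present not in ALLOWED:
            problems.append(
                f"acquisition {index}: attribute combination "
                f"{sorted(present)} is not allowed"
            )

        # Internal references must resolve within the current file.
        ref = acq.get("spectrumRef")
        if ref is not None and ref not in spectrum_ids:
            problems.append(f"acquisition {index}: spectrumRef '{ref}' not in file")

        src = acq.get("sourceFileRef")
        if src is not None and src not in source_file_ids:
            problems.append(f"acquisition {index}: sourceFileRef '{src}' not in file")

    return problems


if __name__ == "__main__":
    example_3 = ET.fromstring(
        '<acquisitionList count="2">'
        '<cvParam cvLabel="MS" accession="MS:1000571" name="sum of spectra"/>'
        '<acquisition sourceFileRef="mzMLsF01" externalSpectrumID="S57"/>'
        '<acquisition sourceFileRef="mzMLsF01" externalSpectrumID="S58"/>'
        "</acquisitionList>"
    )
    for problem in check_acquisition_list(example_3, source_file_ids={"mzMLsF01"}):
        print(problem)
```

Note that under this table Example 4 as originally written (nativeID + sourceFileRef) would be reported, since the reply at the top expresses that case as sourceFileRef + externalNativeID instead. Whether nativeID + sourceFileRef should also be accepted is left open in the thread.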