From: Fredrik L. <Fre...@im...> - 2008-05-02 19:28:37
Hi Eric and others,

All nice thoughts and proposals. I would also add an Example 4:

Example 4: (the current spectrum is a sum of two scans which had
nativeIDs func1scan19 and func1scan20; the source file is found in the
source file list)

<acquisitionList count="2">
  <cvParam cvLabel="MS" accession="MS:1000571" name="sum of spectra"/>
  <acquisition nativeID="func1scan19" sourceFileRef="rawFile1" />
  <acquisition nativeID="func1scan20" sourceFileRef="rawFile1" />
</acquisitionList>

It would all be covered by the following xsd:

<xs:complexType name="AcquisitionType">
  <xs:annotation>
    <xs:documentation>Scan or acquisition from original file used to
    create this peak list. A spectrumRef or an externalSpectrumID or
    nativeID plus sourceFileRef should be given.</xs:documentation>
  </xs:annotation>
  <xs:complexContent>
    <xs:extension base="dx:ParamGroupType">
      <xs:attribute name="number" type="xs:int" use="required">
        <xs:annotation>
          <xs:documentation>A number for this acquisition.</xs:documentation>
        </xs:annotation>
      </xs:attribute>
      <xs:attribute name="spectrumRef" type="xs:IDREF" use="optional">
        <xs:annotation>
          <xs:documentation>This attribute must reference the 'id'
          attribute of the appropriate spectrum.</xs:documentation>
        </xs:annotation>
      </xs:attribute>
      <xs:attribute name="nativeID" type="xs:string" use="optional">
        <xs:annotation>
          <xs:documentation>This attribute references the native spectrum
          identifier in a raw file.</xs:documentation>
        </xs:annotation>
      </xs:attribute>
      <xs:attribute name="externalSpectrumID" type="xs:string" use="optional">
        <xs:annotation>
          <xs:documentation>This attribute must reference the 'id'
          attribute of the appropriate spectrum in an external mzML
          file.</xs:documentation>
        </xs:annotation>
      </xs:attribute>
      <xs:attribute name="sourceFileRef" type="xs:IDREF" use="optional">
        <xs:annotation>
          <xs:documentation>This attribute must reference the 'id'
          attribute of the appropriate sourceFile.</xs:documentation>
        </xs:annotation>
      </xs:attribute>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>

The semantic validation would be to check that at least one of the
attributes spectrumRef, externalSpectrumID (+sourceFileRef) or nativeID
is given. Or maybe there is even more to consider?

Fredrik

Eric Deutsch wrote:
> Hi Fredrik and everyone, thank you for thinking about these last few
> problems. It seems that there are several different ways in which one
> might want to reference the source scans for a summed scan. Based on
> what's been said so far, here are my thoughts and proposal. What do you
> think?
>
> Currently for acquisitionList we have (in XML/XSD hybrid):
> ---------------
> <acquisitionList count="2">
>   <cvParam cvLabel="MS" accession="MS:1000571" name="sum of spectra"/>
>   <acquisition number="xs:int" sourceFileRef="xs:IDREF" spectrumRef="xs:IDREF">
>     <cvParam cvLabel="MS" (optional child of scan attribute)/>
>   </acquisition>
>   <acquisition number="xs:int" sourceFileRef="xs:IDREF" spectrumRef="xs:IDREF"/>
> </acquisitionList>
> ---------------
>
> Fredrik suggests spectrumRef -> xs:string
>                  externalSpectrumID="xs:string"
>
> This brought up the question of how would sourceFileRef reference itself
> if everything were in the same file?
>
> Rune points out that we want nativeID references, like for Waters:
> function1scan2, func1scan2, 1.2 or ....
>
> Darren suggests:
>   externalSpectrumID="URI"
>   externalNativeID="xs:string"
>
> -------------------------------------------
>
> So, I suggest something like this (in XML/XSD hybrid):
>
> <acquisitionList count="2">
>   <cvParam cvLabel="MS" accession="MS:1000571" name="sum of spectra"/>
>   (three possible options:)
>   <acquisition spectrumRef="xs:IDREF"/>
>   <acquisition nativeID="xs:string"/>
>   <acquisition sourceFileRef="xs:IDREF" externalSpectrumID="xs:string"/>
> </acquisitionList>
>
> ---
>
> Example 1: (the current spectrum is a sum of two scans which are also
> present in the current file as ids S57 and S58.)
>
> <acquisitionList count="2">
>   <cvParam cvLabel="MS" accession="MS:1000571" name="sum of spectra"/>
>   <acquisition spectrumRef="S57"/>
>   <acquisition spectrumRef="S58"/>
> </acquisitionList>
>
> ---
>
> Example 2: (the current spectrum is a sum of two scans which had
> nativeIDs func1scan19 and func1scan20, the exact location of which
> is not specifiable)
>
> <acquisitionList count="2">
>   <cvParam cvLabel="MS" accession="MS:1000571" name="sum of spectra"/>
>   <acquisition nativeID="func1scan19"/>
>   <acquisition nativeID="func1scan20"/>
> </acquisitionList>
>
> ---
>
> Example 3: (the current spectrum is a sum of two scans which are
> explicitly referenced externally by a specific file previously
> defined in the current document and with IDs in that other file)
>
> <acquisitionList count="2">
>   <cvParam cvLabel="MS" accession="MS:1000571" name="sum of spectra"/>
>   <acquisition sourceFileRef="mzMLsF01" externalSpectrumID="S57"/>
>   <acquisition sourceFileRef="mzMLsF01" externalSpectrumID="S58"/>
> </acquisitionList>
>
> ---
>
> Also okay is 1+2:
>
>   <acquisition spectrumRef="S57" nativeID="func1scan19"/>
>   <acquisition spectrumRef="S58" nativeID="func1scan20"/>
>
> ---
>
> Also okay is 2+3:
>
>   <acquisition nativeID="func1scan19" sourceFileRef="mzMLsF01" externalSpectrumID="S57"/>
>   <acquisition nativeID="func1scan20" sourceFileRef="mzMLsF01" externalSpectrumID="S58"/>
>
> ---
>
> Thus, all four possible attributes are optional, and we would rely on
> the semantic validator to enforce:
>
>   spectrumRef alone
>   OR nativeID alone
>   OR spectrumRef AND nativeID
>   OR sourceFileRef AND externalSpectrumID
>   OR sourceFileRef AND externalSpectrumID AND nativeID
>
> What do you think? A little unpleasant to have several different
> options, but I don't see how we could practically exclude any of the
> options.
>
>
> As a related side note, Matt also asks if we can handle the case where
> MS1 scans have been stripped out of a file, but the MS2 scans still
> need to say something useful about their precursor scan (IDREF not
> possible).
>
> I have not checked this, but we should spend some time thinking about
> that once we have solved this problem.
>
> Thanks,
> Eric
>
>
>> -----Original Message-----
>> From: psi...@li... [mailto:psidev-ms-dev-bo...@li...] On Behalf Of Darren Kessner
>> Sent: Friday, May 02, 2008 9:13 AM
>> To: Mass spectrometry standard development
>> Subject: Re: [Psidev-ms-dev] Spectra from summed acquisitions in mzML 0.99.10
>>
>> I think it's a bad idea to have a spectrumRef to a spectrum that isn't
>> in the file. We should be consistent in the use of internal
>> references, all of which use IDREF so that dangling references can be
>> caught during XML validation.
>>
>> External references, including those to spectra that have been removed
>> for space reasons, should be indicated specifically (e.g. with
>> externalSpectrumID or externalNativeID) so that the reader (human or
>> software) knows that they'll have to do some extra work to find the
>> referent.
>>
>>
>> Darren
>>
>>
>> On May 2, 2008, at 6:17 AM, Matt Chambers wrote:
>>
>>> Perhaps we should follow the same logic of the spectrum element
>>> itself, where id is required but nativeID is optional.
>>> Thus, id is required for a spectrum reference and a reference to the
>>> nativeID would be optional (but recommended!). We can't use the same
>>> attribute to point to either id or nativeID because then it won't be
>>> known which is being referred to, unless I'm missing something in
>>> Fredrik's proposal.
>>>
>>> To deal with IDs that aren't in the file, I agree with Fredrik. Any
>>> time we have a reference that can be to a non-existent or external
>>> element, we can't use IDREF. We either need to switch to xlink (in
>>> which case IDREF would probably still be ok) or fall back to string.
>>> However, since I think that acquisitions should be able to reference
>>> spectra in the current file, we shouldn't change it to
>>> externalSpectrumID. We should just document that all spectrumRef and
>>> spectrumNativeID(Ref?) attributes may refer to a spectrum in the
>>> current file or in another one, and in the former case the spectrum
>>> might not actually be there (i.e. MSn spectra referencing precursor
>>> spectra that have been stripped out to conserve space).
>>>
>>> -Matt
>>>
>>>
>>> Rune Schjellerup Philosof wrote:
>>>
>>>> I think the format by which external file elements are referenced
>>>> should be defined.
>>>> For instance, a reference to a Waters raw file, should that be
>>>> function1scan2, func1scan2, 1.2 or ....
>>>>
>>>> --
>>>> Rune
>>>>
>>>> Fredrik Levander wrote:
>>>>
>>>>> Hi All,
>>>>>
>>>>> There is an issue with the current mzML schema (0.99.10) when it
>>>>> comes to referencing the origin of acquisitions in an
>>>>> acquisitionList of summed spectra. In most cases the referenced
>>>>> spectra will not be in the current mzML file, and thus the
>>>>> spectrumRef cannot be of the type xs:IDREF, but should be xs:string
>>>>> to also cover references to spectrum IDs in other mzML files, or
>>>>> native IDs in vendor files.
>>>>>
>>>>> The following would cover both internal and external spectrum
>>>>> referencing:
>>>>>
>>>>> <xs:complexType name="AcquisitionType">
>>>>>   <xs:annotation>
>>>>>     <xs:documentation>Scan or acquisition from original raw file
>>>>>     used to create this peak list, as specified in
>>>>>     sourceFile.</xs:documentation>
>>>>>   </xs:annotation>
>>>>>   <xs:complexContent>
>>>>>     <xs:extension base="dx:ParamGroupType">
>>>>>       <xs:attribute name="number" type="xs:int" use="required">
>>>>>         <xs:annotation>
>>>>>           <xs:documentation>A number for this acquisition.</xs:documentation>
>>>>>         </xs:annotation>
>>>>>       </xs:attribute>
>>>>>       <xs:attribute name="spectrumRef" type="xs:string" use="required">
>>>>>         <xs:annotation>
>>>>>           <xs:documentation>This attribute must reference the 'id'
>>>>>           attribute of the appropriate spectrum if found within an
>>>>>           mzML file, or the native spectrum identifier in a raw file
>>>>>           in another format.</xs:documentation>
>>>>>         </xs:annotation>
>>>>>       </xs:attribute>
>>>>>       <xs:attribute name="sourceFileRef" type="xs:IDREF" use="required">
>>>>>         <xs:annotation>
>>>>>           <xs:documentation>This attribute must reference the 'id'
>>>>>           attribute of the appropriate sourceFile. It can also refer
>>>>>           to the present mzML file.</xs:documentation>
>>>>>         </xs:annotation>
>>>>>       </xs:attribute>
>>>>>     </xs:extension>
>>>>>   </xs:complexContent>
>>>>> </xs:complexType>
>>>>>
>>>>> However, I am not sure if there are any use cases when both the
>>>>> summed spectrum and the original spectra are found in the same
>>>>> file. If not, the spectrumRef attribute should maybe be renamed to
>>>>> 'externalSpectrumID' or something else, since 'spectrumRef' somehow
>>>>> indicates that the spectrum is in the present file.
>>>>>
>>>>> If there is a need to reference spectra within the file the
>>>>> following may be an alternative (I think Darren proposed something
>>>>> similar):
>>>>>
>>>>> <xs:complexType name="AcquisitionType">
>>>>>   <xs:annotation>
>>>>>     <xs:documentation>Scan or acquisition from original file used to
>>>>>     create this peak list. Either a spectrumRef or an
>>>>>     externalSpectrumID plus sourceFileRef should be
>>>>>     given.</xs:documentation>
>>>>>   </xs:annotation>
>>>>>   <xs:complexContent>
>>>>>     <xs:extension base="dx:ParamGroupType">
>>>>>       <xs:attribute name="number" type="xs:int" use="required">
>>>>>         <xs:annotation>
>>>>>           <xs:documentation>A number for this acquisition.</xs:documentation>
>>>>>         </xs:annotation>
>>>>>       </xs:attribute>
>>>>>       <xs:attribute name="spectrumRef" type="xs:IDREF" use="optional">
>>>>>         <xs:annotation>
>>>>>           <xs:documentation>This attribute must reference the 'id'
>>>>>           attribute of the appropriate spectrum.</xs:documentation>
>>>>>         </xs:annotation>
>>>>>       </xs:attribute>
>>>>>       <xs:attribute name="externalSpectrumID" type="xs:string" use="optional">
>>>>>         <xs:annotation>
>>>>>           <xs:documentation>This attribute must reference the 'id'
>>>>>           attribute of the appropriate spectrum if found within an
>>>>>           external mzML file, or the native spectrum identifier in a
>>>>>           raw file in another format.</xs:documentation>
>>>>>         </xs:annotation>
>>>>>       </xs:attribute>
>>>>>       <xs:attribute name="sourceFileRef" type="xs:IDREF" use="optional">
>>>>>         <xs:annotation>
>>>>>           <xs:documentation>This attribute must reference the 'id'
>>>>>           attribute of the appropriate sourceFile. It can also refer
>>>>>           to the present mzML file.</xs:documentation>
>>>>>         </xs:annotation>
>>>>>       </xs:attribute>
>>>>>     </xs:extension>
>>>>>   </xs:complexContent>
>>>>> </xs:complexType>
>>>>>
>>>>> So, main question: are there any use cases with the original scans
>>>>> and the summed spectrum in the same file, and is there in this case
>>>>> a need to distinguish clearly between external and internal
>>>>> referencing (second schema alternative)?
>>>>>
>>>>> A minor point is also that the documentation of the spectrum
>>>>> attribute nativeID should be updated to something like:
>>>>>
>>>>> The native identifier for the spectrum, used by the acquisition
>>>>> software. If the spectrum is reconstructed from more than one
>>>>> spectrum, the native identifier of the first acquisition in time
>>>>> should be used.
>>>>>
>>>>> Regards
>>>>>
>>>>> Fredrik
>>>>>
>>>
>>> -------------------------------------------------------------------------
>>> This SF.net email is sponsored by the 2008 JavaOne(SM) Conference
>>> Don't miss this year's exciting event. There's still time to save $100.
>>> Use priority code J8TL2D2.
>>> http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone
>>> _______________________________________________
>>> Psidev-ms-dev mailing list
>>> Psi...@li...
>>> https://lists.sourceforge.net/lists/listinfo/psidev-ms-dev
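
For reference, the attribute rules discussed in this thread could be checked by a semantic validator along the following lines. This is only a rough Python sketch, not an agreed implementation. The function name check_acquisition_list is hypothetical, the elements are assumed to carry no XML namespace (as in the examples above), and the allowed set is Eric's list plus the nativeID + sourceFileRef pairing from Example 4.

```python
import xml.etree.ElementTree as ET

# The four reference attributes discussed for <acquisition>.
REFERENCE_ATTRS = ("spectrumRef", "nativeID", "externalSpectrumID", "sourceFileRef")

# Accepted combinations: Eric's list, plus nativeID + sourceFileRef
# from Example 4 (an assumption; the thread had not settled this).
ALLOWED = {
    frozenset({"spectrumRef"}),
    frozenset({"nativeID"}),
    frozenset({"spectrumRef", "nativeID"}),
    frozenset({"sourceFileRef", "externalSpectrumID"}),
    frozenset({"sourceFileRef", "externalSpectrumID", "nativeID"}),
    frozenset({"nativeID", "sourceFileRef"}),
}


def check_acquisition_list(xml_text):
    """Return a list of error strings; an empty list means every
    <acquisition> child uses an accepted attribute combination."""
    root = ET.fromstring(xml_text)
    errors = []
    # Only direct <acquisition> children are inspected; cvParam is ignored.
    for index, acq in enumerate(root.findall("acquisition"), start=1):
        present = frozenset(a for a in REFERENCE_ATTRS if acq.get(a) is not None)
        if present not in ALLOWED:
            errors.append(
                f"acquisition {index}: unsupported combination {sorted(present)}"
            )
    return errors
```

Keeping the accepted combinations in one set means that adding or removing an option, as debated above, is a one-line change that leaves the XSD untouched.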