From: Eric D. <ede...@sy...> - 2008-03-04 08:48:30
So it seems like the final suggestion is something like:

<nativeScanRefFormat>
  <cvParam cvLabel="MS" accession="MS:1099580" name="Masswolf format nativeScanReference"/>
</nativeScanRefFormat>

<spectrum index="18" id="S2,4,6" >
  <scan nativeScanReference="2,4,6">
  </scan>
</spectrum>

<offset index="18" id="S19" nativeScanReference="2,4,6">1234</offset>

--------------

For Thermo, we would have:

<nativeScanRefFormat>
  <cvParam cvLabel="MS" accession="MS:1099581" name="Thermo format nativeScanReference"/>
</nativeScanRefFormat>

<spectrum index="18" id="S19" >
  <scan nativeScanReference="19">
  </scan>
</spectrum>

<offset index="18" id="S19" nativeScanReference="19">1234</offset>

--------------

The one thing that concerns me is that there isn't much backward
compatibility here. It would be nice to preserve one attribute that
behaves in the same way it always did. If we made index start with 1
instead of 0, then that is probably as close to the traditional
"scanNumber" as we can hope for. Would that offend you, Darren?

--------------

My summary of the discussion goes like this:

New thread on acquisitionNumbers
- Darren posts example on what this would look like
- Matt suggests that there should be no acquisitionNumber in <index>
- Darren counters that having scanNumber aka acquisitionNumber in <index>
  is critical
- Darren proffers:
    <offset id="S17" externalID="17">4826</offset>
  with externalID interpreted according to some other metadatum
  (original source file type, instrument vendor, something else...)
- Matt: Why do you need to know scan number at open time?
- Darren: the point is we *know* the scan number and need to seek to it
- Mike brings up subsetting of one mzML file to another
- Matt brings up the externalID idea
- Darren offers a way to either annotate the assumption that
  externalID=scan number or have an optional scanNumber attribute.
  Neither is liked
- Rune asks why preserving Thermo scan number is important
- Darren says that he has tools that also go back to a RAW file with a
  given scan number. So preserving the scan number is important.
- Darren provides a dump of a huge number of obscure bits of
  configuration data available for each scan in Thermo RAW format.
  How to encode in mzML? cvParams? userParams?
- Matt likes externalID instead of scanNumber and describes some
  possible naming conventions
- Josh suggests that we need to encode a "native scan reference" to be
  able to go back to the vendor software
- Darren agrees that native unique identifier needs to be in the index
  and the rest in <scan>
- Josh suggests a slew of optional vendor-specific attributes in <index>
- Darren is okay with that, but also suggests using cvParams in the
  <index> to define the meaning of externalID in a specific context,
  different for each vendor
- Josh points out that some vendors have multi-part keys instead of a
  single one so this makes the above a lot trickier
- Darren wonders if the multi-part information needs to be in each
  <offset> tag or could be global to the index?
    <index externalIDTypeAccession1="cycle" externalIDTypeAccession2="scan">
      <offset id="S19" externalID="(19,123)">4826</offset>
- Josh agrees, although suggests nativeScanID
- Darren votes for nativeID
- Josh is fine with that
- Matt seems in agreement
- so the suggestions seemed to be:
    <index>
      <externalIDTypeList count="2">
        <cvParam .../>
        <cvParam .../>
      </externalIDTypeList>
      ...
      <offset id="S19" nativeID="(19,123)">4826</offset>
      ...
    </index>

--
Unique scan numbers thread
- Josh points out that MassWolf just renumbers the scans
  But there will be a "native scan reference" in mzXML
- Matt asks for details
- Josh suggests something like:
    <nativeScanRefFormat> containing ordered list
      <cvParam cvLabel="MS" accession="MS:1099580" name="scan cycle number" value=""/>
      <cvParam cvLabel="MS" accession="MS:1099581" name="scan function number" value=""/>
      <cvParam cvLabel="MS" accession="MS:1099582" name="scan number" value=""/>
    </nativeScanRefFormat>
    <spectrum>
      <scan ... nativeScanRef="(2,4,6)">
      </scan>
    </spectrum>
- Matt suggests just having a single cvParam to describe
  "MassWolf nativeID format"

--
Related to the above is a <scan> nativeID thread
- Darren suggests:
    <offset id="S19" nativeID="19">1234</offset>
  It would also be convenient, and consistent, to have nativeID in <scan>:
    <spectrum index="0" id="S19" nativeID="19">
    ...
    </spectrum>
- Matt suggests that <offset> should be <spectrum_offset>
- Darren says the important part of the discussion is that nativeID is
  in both <spectrum> and <index>
-

> -----Original Message-----
> From: psi...@li... [mailto:psidev-ms-dev-
> bo...@li...] On Behalf Of Matthew Chambers
> Sent: Monday, March 03, 2008 11:52 AM
> To: Mass spectrometry standard development
> Subject: Re: [Psidev-ms-dev] Unique scan numbers
>
> To be honest, for the mzML approach, I would prefer a single CV term
> describing the format and the axes it corresponds to. I see no reason to
> allow formats with arbitrary axes in a controlled nativeID system. I'm
> happy to restrict that capability to the arbitrary id string. Perhaps
> there is a reason though and I'm not seeing it.
>
> For mzXML, the axes definition block makes more sense to me. I would
> vote against flanking the id with parentheses though as that kind of
> makes them look like Cartesian coordinates. :)
>
> -Matt
>
>
> Joshua Tasman wrote:
> > Hi Matt,
> >
> > After the discussion here last week with you and Darren, it seemed an
> > efficient way to deal with this would be to have each scan contain a
> > string, and the header would have some description on how to parse this.
> >
> > Off the top of my head, you could have something in the head like:
> >
> > mzML-ish:
> > in header:
> > <nativeScanRefFormat> containing ordered list
> > <cvParam with cv term for first axis />
> > <cvParam with cv term for first axis />
> > <cvParam with cv term for first axis />
> > </nativeScanRefFormat>
> >
> > in spectrum: a string representation like "(1st,2nd,3rd)"
> >
> > mzXML-ish:
> > header:
> > <nativeScanRefFormat Vendor="VendorX"> containing ordered list
> > <axis name="cycle">
> > <axis name="function">
> > <axis name="scan">
> > </nativeScanRefFormat>
> >
> > ...
> > <scan
> > ...
> > nativeScanRef="(2,4,6)">
> > </scan>
> >
> > What do you think?
> >
> > Josh
> >
> >
> >
> > Matthew Chambers wrote:
> >
> >> Hi Josh,
> >>
> >> What design are you planning for the "native scan reference" in mzXML?
> >> It seems the same issues I just posted about in response to Darren will
> >> apply to the mzXML design as well.
> >>
> >> -Matt
> >>
> >>
> >> Joshua Tasman wrote:
> >>
> >>> Hi Fredrik,
> >>>
> >>> Catching up: massWolf simply renumbers all scans starting with "1" in
> >>> the mzXML output. Like I said in a different post, we'll be adding a
> >>> "native scan reference" to mzXML for each vendor software type.
> >>>
> >>> Josh
> >>>
> >>>
> >>> Fredrik Levander wrote:
> >>>
> >>>
> >>>> Hi All,
> >>>>
> >>>> In QTOF files from Waters with mixed MS1 and MS2 data we have several
> >>>> parallel 'functions' with data being recorded into separate files. The
> >>>> scan numbers are only unique within each function. In the raw data
> >>>> folder we thus have several different spectra with the same scan number
> >>>> (but different source files).
> >>>> When converting this into an mzML file it
> >>>> would be good to keep the original scan numbers which are useful for
> >>>> traceability, but to generate unique spectrum ids. I thus propose that
> >>>> the requirement for unique scanNumbers within an mzML file is removed.
> >>>> However, spectra should not be repeated within the file, so this would
> >>>> NOT be applicable to the dta to mzML conversion use case.
> >>>> Would such a change generate problems for the readers?
> >>>> How is this solved in MassWolf?
> >>>>
> >>>>
> >>>> Regards
> >>>>
> >>>> Fredrik
> >>>>
> >>>> -------------------------------------------------------------------------
> >>>> This SF.net email is sponsored by: Microsoft
> >>>> Defy all challenges. Microsoft(R) Visual Studio 2008.
> >>>> http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
> >>>> _______________________________________________
> >>>> Psidev-ms-dev mailing list
> >>>> Psi...@li...
> >>>> https://lists.sourceforge.net/lists/listinfo/psidev-ms-dev
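
P.S. To make the nativeScanReference suggestion at the top of this message a bit more concrete, here is a rough Python sketch of how a reader might use the single format cvParam to interpret the reference strings and look spectra up by them. Several things here are assumptions for illustration only: the <run> wrapper element, the AXES_BY_ACCESSION table (which axes each placeholder accession implies), and the function names. None of this is settled schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from the placeholder accessions used in this thread
# to the axes each native reference is made of (an assumption, not a spec).
AXES_BY_ACCESSION = {
    "MS:1099580": ("cycle", "function", "scan"),  # Masswolf-style, multi-part
    "MS:1099581": ("scan",),                      # Thermo-style, single number
}


def parse_native_ref(ref, axes):
    """Split a reference such as "2,4,6" into a dict keyed by axis name."""
    # Tolerate the parenthesised form "(2,4,6)" that was also discussed.
    parts = [int(p) for p in ref.strip("()").split(",")]
    if len(parts) != len(axes):
        raise ValueError(f"expected {len(axes)} parts, got {len(parts)}: {ref!r}")
    return dict(zip(axes, parts))


def build_native_index(xml_text):
    """Map each nativeScanReference string to (spectrum index, parsed parts)."""
    root = ET.fromstring(xml_text)
    accession = root.find("nativeScanRefFormat/cvParam").get("accession")
    axes = AXES_BY_ACCESSION[accession]
    index = {}
    for spectrum in root.iter("spectrum"):
        ref = spectrum.find("scan").get("nativeScanReference")
        index[ref] = (int(spectrum.get("index")), parse_native_ref(ref, axes))
    return index


# The <run> wrapper is made up so the fragment parses as one document.
EXAMPLE = """
<run>
  <nativeScanRefFormat>
    <cvParam cvLabel="MS" accession="MS:1099580" name="Masswolf format nativeScanReference"/>
  </nativeScanRefFormat>
  <spectrum index="18" id="S2,4,6">
    <scan nativeScanReference="2,4,6"/>
  </spectrum>
</run>
"""

if __name__ == "__main__":
    print(build_native_index(EXAMPLE))
```

The point of the sketch is only that a single format-level cvParam is enough for a reader to decide how to split the string, which is what Matt argued for; an ordered list of per-axis cvParams would replace the lookup table but leave the rest unchanged.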