From: Matthew C. <mat...@va...> - 2009-07-09 22:52:14
|
I concur. This should be given a forced value in the schema equal to the schema's version. I think we have a chance to use our revision number, which indicates that the new schema version will cause no problems with backward compatibility (unless of course you wrote the wrong version, which is the whole point of the revision).

-Matt

Marc Sturm wrote:
> Hi all,
>
> we just noticed that the 'version' attribute of the 'mzML' element is
> mandatory but can be left empty. We should either force a correct
> version with a regular expression, with an enum, or at least set the
> minimum length to 5. Any opinions?
>
> -Marc |
From: Marc S. <stu...@gm...> - 2009-07-08 12:55:17
|
Hi all,

we just noticed that the 'version' attribute of the 'mzML' element is mandatory but can be left empty. We should either force a correct version with a regular expression, with an enum, or at least set the minimum length to 5. Any opinions?

-Marc |
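The regular-expression option Marc mentions can be sketched outside the schema as a small check. This is only an illustration: the pattern (three dot-separated integers, which also satisfies the "minimum length 5" idea for a value such as 1.1.0) and the function name are assumptions, not something the thread or the mzML schema defines.

```python
import re

# Assumed pattern: three dot-separated non-negative integers, e.g. "1.1.0".
# The thread does not fix the exact expression; this is one plausible choice.
VERSION_PATTERN = re.compile(r"[0-9]+\.[0-9]+\.[0-9]+")


def is_valid_version(value: str) -> bool:
    """Return True if the whole string looks like major.minor.revision."""
    return VERSION_PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ("1.1.0", "", "1.1"):
        print(repr(candidate), is_valid_version(candidate))
```

An enum, as Matt suggests in his reply, would be stricter still, since it would pin the attribute to the one version the schema itself describes.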
From: Eric D. <ede...@sy...> - 2009-07-07 02:47:48
|
Hi everyone, I have become ill this evening, so there will be no PSI MSS WG call tomorrow. Thanks, Eric |
From: Matthew C. <mat...@va...> - 2009-07-06 15:14:10
|
The Agilent test data included with ProteoWizard in both native and mzML format (it's used as a unit test) should serve nicely as example mzML data. Currently all the data is QQQ. SRM and SIM are currently supported, and there's an NL example too, but Reader_Agilent doesn't support that yet (thus the file is just a single TIC). Some of the SRM data is mixed with MS2 data, although I don't think they are MIDAS experiments. Reader_Agilent also supports precursor ion spectra, but that example was too big so I nixed it.

Agilent has approved us using and distributing their example data. I wish other vendors had similar redistributable test data sets. I should be getting a couple of good Thermo example files from Mike MacCoss soon as well - an LTQ example with MS1-3 and zoom scans, and a TSQ example with SRM, SIM, PI, and NL. Not a very sensible test, but very good for a unit test. ;)

Here's the download for Agilent:
http://proteowizard.svn.sourceforge.net/viewvc/proteowizard/trunk/pwiz/pwiz/data/vendor_readers/Agilent/Reader_Agilent_Test.data.tar.bz2

If anybody on the list has small Agilent QTOF data (less than a megabyte after compression) I'd like to include that for testing and example purposes as well.

Thanks,
Matt

Fredrik Levander wrote:
> Hi All,
>
> Just a quick note to tell you that the hand crafted SRM example has been
> updated to use chromatograms instead of spectra, according to the 1.1.0
> specification.
> Please tell me if you spot any errors. Also, the binary arrays in this
> example do not make sense, so it would be nice to have a real SRM
> example file available online (pwiz conversion?)
> The file (also linked from the spec site):
> http://dev.thep.lu.se/fp6-prodac/browser/trunk/mzML/scans/MRM_example_1.1.0.mzML?format=raw
>
> Regards
>
> Fredrik |
From: Fredrik L. <Fre...@im...> - 2009-07-06 14:42:24
|
Hi All,

Just a quick note to tell you that the hand crafted SRM example has been updated to use chromatograms instead of spectra, according to the 1.1.0 specification.
Please tell me if you spot any errors. Also, the binary arrays in this example do not make sense, so it would be nice to have a real SRM example file available online (pwiz conversion?)
The file (also linked from the spec site):
http://dev.thep.lu.se/fp6-prodac/browser/trunk/mzML/scans/MRM_example_1.1.0.mzML?format=raw

Regards

Fredrik |
From: Steffen N. <sne...@ip...> - 2009-07-03 08:35:57
|
On Tue, 2009-06-30 at 14:20 -0500, Matthew Chambers wrote:
> Is it reasonable to determine which files in these
> sources are used by the APIs and put that information in the CV
> definition for the source types - possibly in a machine-readable way?

I like that idea, but the information might not be constant, i.e. it may be software version dependent. So (hypothetically) between version 7 and 8 of CompassXtract that could change.

Even worse, it might be contextually dependent: this could vary depending on whether you obtain raw data as measured by the machine or the (externally) calibrated data, which might include additional files.

Yours,
Steffen

--
IPB Halle                    AG Massenspektrometrie & Bioinformatik
Dr. Steffen Neumann          http://www.IPB-Halle.DE
Weinberg 3                   http://msbi.bic-gh.de
06120 Halle                  Tel. +49 (0) 345 5582 - 1470
                                  +49 (0) 345 5582 - 0
sneumann(at)IPB-Halle.DE     Fax. +49 (0) 345 5582 - 1409 |
From: Matthew C. <mat...@va...> - 2009-07-01 20:15:46
|
Lest any of you think I was joking about how messy these directory-based sources can be, look at these delightful examples: Directory of \bsalgconcc12_1-A,3_01_692.d 10/02/2008 03:07 PM <DIR> 080519_lgpepadducts_ms3_flowramp_4_mam_692.m 07/28/2008 12:28 PM 223 893f91d9-b133-4529-af43-6da496be4766_1.mcf 07/28/2008 12:28 PM 15,360 893f91d9-b133-4529-af43-6da496be4766_1.mcf_idx 06/16/2008 02:24 PM 4,175 Analysis.0.DataAnalysis.method 06/16/2008 02:24 PM 314,142 Analysis.0.result_c 07/28/2008 12:28 PM 4,562 Analysis.1.DataAnalysis.method 07/28/2008 12:28 PM 408,878 Analysis.1.result_c 07/29/2008 03:04 PM 4,175 Analysis.2.DataAnalysis.method 07/29/2008 03:04 PM 349,561 Analysis.2.result_c 07/29/2008 03:04 PM 282 Analysis.content 05/20/2008 08:35 AM 51,221,174 Analysis.mzData 05/20/2008 08:36 AM 1,197 ANALYSIS.MZXML 05/19/2008 04:47 PM 39,906,451 Analysis.yep 07/25/2008 01:36 PM 44 BackgroundLineNeg.ami 07/25/2008 01:36 PM 424,040 BackgroundLinePos.ami 07/25/2008 01:36 PM 44 BackgroundProfNeg.ami 07/25/2008 01:36 PM 44 BackgroundProfPos.ami 05/19/2008 03:27 PM 502 bsalgconcc12_1-A,3_01_692.hdx 05/19/2008 04:47 PM 226,000 bsalgconcc12_1-A,3_01_692.unt 07/28/2008 12:28 PM 486 Calibrator.ami 01/18/2008 02:26 PM 4,781 CapLCMSRC.hss 05/20/2008 08:36 AM 624 CompassXport.log 07/25/2008 01:36 PM 57 DensViewNeg.ami 07/25/2008 01:36 PM 57 DensViewNegBgnd.ami 07/25/2008 01:36 PM 85,283,161 DensViewPos.ami 07/25/2008 01:36 PM 48,824,297 DensViewPosBgnd.ami 07/29/2008 03:04 PM 127 desktop.ini 05/19/2008 04:47 PM 909,020 extension.baf 04/10/2007 10:09 AM 1,394 HS_columns.xmc 05/19/2008 04:47 PM 3,675 LCParms.txt 05/19/2008 04:47 PM 222 NuGenesisTemplate.txt 05/19/2008 04:47 PM 1,303 SampleInfo.xml 07/28/2008 12:28 PM 24,576 Storage.mcf_idx 07/25/2008 01:36 PM 0 SyncHelper 33 File(s) 227,934,634 bytes 3 Dir(s) 20,683,268,096 bytes free Directory of \10 fm bsa_1-A,1_01_76.d 10/02/2008 03:08 PM <DIR> 071022_caplc_76.m 10/23/2007 09:25 AM 494 10 fm bsa_1-A,1_01_76.hdx 10/23/2007 10:41 AM 
203,544 10 fm bsa_1-A,1_01_76.unt 08/27/2008 03:03 PM 48,697,601 91f8a826-2331-44ae-b684-017143b1a8df_1.mcf 08/27/2008 03:03 PM 99,328 91f8a826-2331-44ae-b684-017143b1a8df_1.mcf_idx 06/10/2008 10:20 AM 4,387 Analysis.0.DataAnalysis.method 06/10/2008 10:20 AM 16,805,076 Analysis.0.result_c 06/17/2008 10:22 AM 4,387 Analysis.1.DataAnalysis.method 06/17/2008 10:22 AM 16,827,442 Analysis.1.result_c 06/20/2008 03:53 PM 2,750 Analysis.2.DataAnalysis.method 06/20/2008 03:53 PM 16,818,436 Analysis.2.result_c 08/27/2008 03:03 PM 3,137 Analysis.3.DataAnalysis.method 08/27/2008 03:03 PM 496,255 Analysis.3.result_c 08/27/2008 03:03 PM 370 Analysis.content 06/09/2008 05:15 PM 213,982 Analysis.ETD.mgf 06/09/2008 05:15 PM 1,094 Analysis.mgf 10/25/2007 10:25 AM 17,805,450 Analysis.mzData 06/09/2008 04:01 PM 65,489,623 ANALYSIS.MZXML 10/23/2007 10:41 AM 38,598,852 Analysis.yep 08/22/2008 01:09 PM 25,288 BackgroundLineNeg.ami 08/22/2008 01:08 PM 341,776 BackgroundLinePos.ami 08/22/2008 01:09 PM 84,088 BackgroundProfNeg.ami 08/22/2008 01:09 PM 279,280 BackgroundProfPos.ami 06/18/2008 12:38 PM 2,652 BSA76_cmpd168_mH527_3_.mgf 08/26/2008 02:15 PM 2,462 bsa_etd_76_mgf_mz_722.mgf 08/26/2008 02:45 PM 2,081 bsa_etd_76_mgf_mz_722_no_decon.mgf 08/27/2008 03:03 PM 486 Calibrator.ami 10/17/2007 04:06 PM 4,513 CapLCMSRC.hss 06/09/2008 04:01 PM 235,481 CompassXport.log 08/22/2008 01:08 PM 927,585 DensViewNeg.ami 08/22/2008 01:14 PM 220,777 DensViewNegBgnd.ami 08/22/2008 01:08 PM 29,456,201 DensViewPos.ami 08/22/2008 01:09 PM 26,476,073 DensViewPosBgnd.ami 08/27/2008 03:03 PM 125 desktop.ini 10/23/2007 10:41 AM 675,228 extension.baf 10/23/2007 10:26 AM 291,978 file 76.ETD.mgf 10/23/2007 10:26 AM 1,139 file 76.mgf 10/23/2007 10:34 AM 291,978 file 76mod.ETD.mgf 04/10/2007 10:09 AM 1,394 HS_columns.xmc 10/23/2007 10:41 AM 3,052 LCParms.txt 10/02/2008 03:08 PM <DIR> Mascot Results 10/23/2007 10:41 AM 219 NuGenesisTemplate.txt 10/02/2008 03:08 PM <DIR> Ommsa etd search results 06/09/2008 05:15 PM 0 
ProteinDataBaseQuery.mct 06/09/2008 05:15 PM 1,080 ProteinDataBaseQuery.MGF 10/23/2007 10:41 AM 1,176 SampleInfo.xml 08/27/2008 03:03 PM 24,576 Storage.mcf_idx 08/22/2008 01:08 PM 0 SyncHelper 45 File(s) 281,422,896 bytes 5 Dir(s) 20,682,661,888 bytes free Directory of \D7_band4_mlm9_44_1-F,7_01_119.d 12/06/2007 03:25 AM 529 D7_band4_mlm9_44_1-F,7_01_119.hdx 12/06/2007 04:41 AM 220,120 D7_band4_mlm9_44_1-F,7_01_119.unt 10/02/2008 02:52 PM <DIR> 071119_caplc_lo_thres_119.m 12/10/2007 06:58 PM 4,175 Analysis.0.DataAnalysis.method 12/10/2007 06:58 PM 480,225 Analysis.0.result_c 12/21/2007 05:10 PM 4,175 Analysis.1.DataAnalysis.method 12/21/2007 05:10 PM 480,213 Analysis.1.result_c 12/21/2007 05:10 PM 190 Analysis.content 12/10/2007 12:49 PM 50,823,424 Analysis.mzData 12/06/2007 04:41 AM 105,680,964 Analysis.yep 10/17/2007 04:06 PM 4,513 CapLCMSRC.hss 12/21/2007 05:10 PM 131 desktop.ini 12/06/2007 04:41 AM 902,936 extension.baf 04/10/2007 10:09 AM 1,394 HS_columns.xmc 12/06/2007 04:41 AM 3,072 LCParms.txt 12/06/2007 04:41 AM 227 NuGenesisTemplate.txt 12/06/2007 04:41 AM 1,265 SampleInfo.xml 16 File(s) 158,607,553 bytes 3 Dir(s) 20,684,148,736 bytes free So...any input about what files we should include in the sourceFileList? -Matt Matthew Chambers wrote: > Hi all, > > We need terms for Agilent MassHunter sources in the CV. In the > MassHunter API there are two ways to uniquely address a spectrum: by > "row number" or "scan id". Row number is essentially a 0-based index > that refers to the spectra after the acquisition software has done > something...perhaps internal merging? Scan id represents the ordinal > number of acquisitions as they come off the instrument. So, at least on > their (Q)TOF instruments, the rowNumber is very disparate from the > scanId, but both of them are unique identifiers that can technically be > used to refer to a native spectrum. 
The kink is that the MassHunter API > only refers to the parent scan by its scan id and doesn't provide a way > to directly translate a scan id to a row number - translation must be > done indirectly by enumerating all the row numbers and building a > mapping of scan id to row number. For this reason I would recommend that > the nativeID format be defined as "scanId=xsd:nonNegativeInteger" but > I'm open to comment on this! > > The source type brings another issue to a head. We actually have more > vendor formats that use directories to store their raw data than those > that use files. > Directories: Agilent MassHunter (read with MHDAC API), Bruker/Agilent > YEP, Bruker BAF, Bruker FID, Bruker U2 (previous 4 formats read with > CompassXtract API), Waters MassLynx (read with DACServer API) > Files: ABI WIFF (read with either Analyst or WiffFileDataReader APIs), > Thermo RAW (read with XRawfile API) > > But we don't clearly define how to deal with the directory-based > formats. I'm tempted to recommend that we include and checksum all files > within the directories, but it's entirely possible that some people > store alternative encodings of the data inside these directories, e.g. > mzXML and MGF (I've seen this). So it would be silly to include mzXML > and MGF as source files for the native data. There can also be analyses > of the data stored there, like Bruker and Agilent's *.m subdirectories, > or even pepXML files. Is it reasonable to determine which files in these > sources are used by the APIs and put that information in the CV > definition for the source types - possibly in a machine-readable way? > Also, if we're not going to (and I wouldn't want to) define a separate > source type for each subfile (see attached thread ending on 2009-19-03), > we would have to document somewhere that every file that should be > included in these directory-based formats should be given the > directory-level CV term as its source type. 
> > -Matt > > > >> Matthew Chambers wrote: >> >> >>>> Yes, I made that change, but I forgot that every sourceFile has to have >>>> a type. That does make it ugly. I was trying to make things consistent >>>> between Waters and Bruker formats because they both use directories, but >>>> perhaps I should have gone the other direction and made the source type >>>> for Bruker directories more applicable to the format as a whole - the >>>> problem is I'm not knowledgeable about those formats to know what each >>>> one corresponds to that is analogous to MassLynx. In any case, I don't >>>> think the meaning of the term changed. The important part is that it's >>>> the MassLynx format, not whether it's called DAT or RAW. >>>> >>>> -Matt >>>> >>>> >>>> Fredrik Levander wrote: >>>> >>>> >>> >>> >>>>>> Just noticed that the name and definition of MS:1000526 MassLynx raw >>>>>> format has changed to Waters DAT format. Is this really wanted? I guess >>>>>> that one would like to have all files in a MassLynx raw folder as source >>>>>> files, since they will all contain some information that is used in the >>>>>> mzML file, and then they are all part of the the same source file format >>>>>> (in my opinion). Or otherwise there will be need to add an _FUNCTNS.INF >>>>>> file format and a header.txt format, etc. >>>>>> If there is need for separate file formats for these sub-files, I think >>>>>> those terms (including the DAT one) should have new accession numbers, >>>>>> since the meaning of the term has changed, or am I interpreting this in >>>>>> the wrong way? >>>>>> >>>>>> Fredrik >>>>>> |
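One way to picture the "include and checksum, but skip alternative encodings and analyses" option discussed above is the sketch below. Everything policy-like in it is an assumption made for illustration: the excluded suffixes (.mzXML, .mzData, .mgf), the pruning of *.m subdirectories, the choice of SHA-1 as the digest, and the function name. It does not reflect what any vendor API actually reads.

```python
import hashlib
import os

# Assumed policy: alternative encodings found in the listings above are not
# treated as native source files. The suffix list is illustrative only.
DERIVED_SUFFIXES = (".mzxml", ".mzdata", ".mgf")


def list_source_files(source_dir):
    """Return (relative path, SHA-1 hex digest) pairs for candidate source files."""
    results = []
    for root, dirs, files in os.walk(source_dir):
        # Prune method/analysis subdirectories such as "071022_caplc_76.m"
        # (assumption: they hold analyses, not raw data).
        dirs[:] = sorted(d for d in dirs if not d.lower().endswith(".m"))
        for name in sorted(files):
            if name.lower().endswith(DERIVED_SUFFIXES):
                continue
            path = os.path.join(root, name)
            digest = hashlib.sha1()
            with open(path, "rb") as handle:
                # Read in 1 MiB chunks so large files are not loaded at once.
                for chunk in iter(lambda: handle.read(1 << 20), b""):
                    digest.update(chunk)
            results.append((os.path.relpath(path, source_dir), digest.hexdigest()))
    return results
```

A suffix filter like this is exactly the kind of rule Steffen warns may be version or context dependent, so a CV-level, machine-readable file list would still be the more robust answer.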
From: Matthew C. <mat...@va...> - 2009-07-01 18:00:56
|
Hi Jim, I've got a user asking us to support converting TSQ Vantage files and I notice that their example file has the model string set to "TSQ Vantage Standard." The Vantage is not in the CV yet at all. Before I add it to the CV, can you clarify which variants of it are available? Thanks, -Matt |
From: Paul R. <rud...@ni...> - 2009-07-01 15:04:52
|
Matt and other Fearless PSI volunteers -

If you think you may be able to squeeze spectral libraries into an evolving format, I'd like to help out. Here are a couple of items that might be useful:

1) Spectral libraries are typically one of two types: 'replicate' or 'consensus.' The latter indicates that multiple spectra identifying the same peptide ion have been used to generate the final spectrum. This also means that the stats in the annotation may have additional variance values.

2) The first section of metadata is evidence supporting that spectrum. This includes things like search engine scores, sample sources and quality metrics -- e.g., fraction of unexplained abundance, similarity of replicates, etc. Many of these are internal to us and not used by all library generators; however, some might well be required. Here's that section from the first spectrum in the human library (ion trap):

Name: AAAAAAAAAAAAAAAGAGAGAK/1
MW: 1596.846
Comment: Spec=Consensus Pep=Tryptic Fullname=R.AAAAAAAAAAAAAAAGAGAGAK.Q/1 Mods=0 Parent=1596.846 Inst=it Mz_diff=-0.066 Mz_exact=1596.8457 Mz_av=1597.771 Protein="IPI00220844.1|SWISS-PROT:P55011-3|ENSEMBL:ENSP00000340878 Tax_Id=9606 Splice Isoform 2 of Solute carrier family 12 member 2" Pseq=36 Organism="human" Se=3^X2:ex=2.25065e-009/2.249e-009,td=6.9122e+010/6.888e+010,sd=0/0,hs=59.9/4,bs=1.3e-012,b2=4.5e-009,bd=1.38e+011^O2:ex=0.00144675/0.001383,td=402850/3.852e+005,pr=4.235e-009/4.035e-009,bs=6.35e-005,b2=0.00283,bd=788000^P2:sc=27.5/0.8,dc=18/0.4,ps=2.015/0.125,bs=0 Sample=1/human_ncrr_hprob_cam,2,2 Nreps=2/2 Missing=0.2229/0.0094 Parent_med=1596.78/0.16 Max2med_orig=29.6/5.3 Dotfull=0.728/0.000 Dot_cons=0.870/0.005 Unassign_all=0.171 Unassigned=0.000 Dotbest=0.88 Flags=0,0,0 Naa=22 DUScorr=0.85/3.3/15 Dottheory=0.83 Pfin=1.5e+013 Probcorr=1 Tfratio=1.6e+008 Pfract=0
Num peaks: 109

3) The business end - the spectrum - is a basepeak normalized, annotated peaklist.

452.9 847 "? 2/2 14.3"
462.8 535 "b7-35/-0.46 2/2 5.5"
480.3 620 "b7-18/0.04 2/2 10.4"
498.2 2165 "b7/-0.06 2/2 9.8"
524.0 1397 "? 2/2 8.8"
527.0 381 "Int/AAAAGAGA/0.2 2/2 2.1"
531.1 1140 "y7/-0.19 2/2 7.9"
537.2 675 "? 2/2 8.9"
551.2 910 "b8-18/-0.10 2/2 4.7"
569.1 4712 "b8/-0.20 2/2 25.4"
581.2 289 "? 2/2 3.1"
593.7 249 "? 2/1 2.6"
595.1 1182 "? 2/2 7.3"
602.2 2209 "y8/-0.13 2/2 21.5"
608.6 335 "? 2/2 3.9"
622.2 1625 "b9-18/-0.13 2/2 6.4"
626.2 767 "Int/AAAAAAAAG/0.1 2/2 1.8"
[..]

This is MSP format (also used for small molecules), and it is simple ASCII. Additional "formats" are usually born for the purposes of searching (e.g., indexing). MSP is easy to work with but it is not a standard -- a common, cross-domain representation would be better :)

Paul

Matthew Chambers wrote:
> I brought this issue up in the call because I think that the "evidence"
> tag in traML is too weak to go in a standard - I was proposing that a
> better way to do it might be to refer to an mzIdentML file for a more
> complete context of the evidence.
>
> I know very little about metabolomics, but the spectra libraries we're
> talking about are indeed proteomics-oriented. But I don't think you need
> to shut up - indeed, I think this may be another case where we can and
> should standardize within and even across domains because the
> annotations of a peptide's spectrum could be even more rich and detailed
> than that of a smaller molecule. In the spectral library domain just for
> peptides, there are at least 5 common formats already
> (http://peptide.nist.gov): MSP/ASCII, a multi-file NIST binary format,
> SpectraST, BiblioSpec, and X!Hunter. I'm sure you have a few more in the
> metabolomics domain. A lack of standards makes life so much more
> interesting, don't you think? ;)
>
> -Matt
>
>
> Steffen Neumann wrote:
>
>> On Tue, 2009-06-30 at 08:32 -0700, Eric Deutsch wrote:
>>
>>> +Can mzIdentML encode a spectral library?
>>>
>> I am unsure whether this was discussed during the conference call,
>> or is left as an open point to the list.
>>
>> Anyway, I am inclined to say "no" about this,
>> for two reasons:
>>
>> 1) I don't know enough about analys^H^H^H^H^mzIdentML,
>> because my very brief looks made it look like proteomics-only.
>> (Or are you actually referring to a proteomics-spectral library?!
>> in that case I'll shut up and you can skip the rest of this mail.)
>>
>> 2) A spectral library will (at least in the future)
>> contain sets of spectra (different eV, MS1-MS^n, ...)
>> and associated annotations, which might be as complicated
>> as a molecule and its fragmentation breakdown products.
>> This requires a rich set of links between individual peaks
>> and their (molecular) annotation.
>>
>> So for small molecules (read: metabolomics stuff) we have started
>> to create mzAnnotate under the umbrella of the Metabolomics Standards
>> Initiative (MSI). http://msi-workgroups.sourceforge.net/exchange-format/
>>
>> We have drafted some use cases, shown on
>> http://sourceforge.net/apps/mediawiki/metware/index.php?title=MzAnnotate
>> and prepared a converter for both the spectral library www.massbank.jp
>> and our own MassFrontier clone MetFrag, and will present these
>> on the MSI mailing list soon.
>>
>> Yours,
>> Steffen
>>
>
> ------------------------------------------------------------------------------
> _______________________________________________
> Psidev-ms-dev mailing list
> Psi...@li...
> https://lists.sourceforge.net/lists/listinfo/psidev-ms-dev
> |
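To illustrate Paul's point that MSP peak lists are easy to work with, here is a minimal sketch that parses one peak line of the shape shown in his example (m/z, intensity, then a quoted annotation). The function name and the returned tuple layout are assumptions for illustration; the layout of the annotation string itself is left uninterpreted because the message does not define it.

```python
import re

# One peak line as shown in the example: m/z, intensity, quoted annotation.
PEAK_LINE = re.compile(r'^\s*(\S+)\s+(\S+)\s+"(.*)"\s*$')


def parse_msp_peak(line):
    """Split an MSP peak line into (mz, intensity, annotation).

    The annotation is returned verbatim; its internal fields are not
    interpreted here.
    """
    match = PEAK_LINE.match(line)
    if match is None:
        raise ValueError("not an annotated MSP peak line: %r" % line)
    mz, intensity, annotation = match.groups()
    return float(mz), float(intensity), annotation


if __name__ == "__main__":
    print(parse_msp_peak('498.2 2165 "b7/-0.06 2/2 9.8"'))
```

That the whole parser fits in a dozen lines is the convenience a cross-domain standard would have to compete with.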
From: Steffen N. <sne...@ip...> - 2009-07-01 07:42:20
|
On Mon, 2009-06-29 at 12:18 -0500, Coleman, Michael wrote:
...
> I would argue that the possibility of writing trivial programs that read
> peak data is also a reason, perhaps a more important one. Having the peaks
> encoded does make it a bit harder to jump in and start doing something with
> them.

I'd say that was true before the web and Open Source stuff was all over the place. Googling for "decrypt base64 <yourfavoritelanguage>" will almost always yield a 1-3 liner you can get inspiration from, including zipping the data.

Funny, one of the top results for PHP gives you an excerpt from a malware script: http://justin.madirish.net/node/321

Yours,
Steffen

--
IPB Halle                    AG Massenspektrometrie & Bioinformatik
Dr. Steffen Neumann          http://www.IPB-Halle.DE
Weinberg 3                   http://msbi.bic-gh.de
06120 Halle                  Tel. +49 (0) 345 5582 - 1470
                                  +49 (0) 345 5582 - 0
sneumann(at)IPB-Halle.DE     Fax. +49 (0) 345 5582 - 1409 |
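The "1-3 liner" Steffen has in mind looks roughly like the sketch below. It assumes the usual mzML conventions of little-endian 32- or 64-bit floats, optionally zlib-compressed before base64 encoding; the function and parameter names are hypothetical, and a real reader would take precision and compression from the cvParams of the binaryDataArray rather than from arguments.

```python
import base64
import struct
import zlib


def decode_binary_array(text, precision_bits=64, compressed=False):
    """Decode a base64 (optionally zlib-compressed) array of little-endian floats."""
    raw = base64.b64decode(text)
    if compressed:
        raw = zlib.decompress(raw)
    code = "d" if precision_bits == 64 else "f"
    count = len(raw) // struct.calcsize("<" + code)
    return list(struct.unpack("<%d%s" % (count, code), raw))


if __name__ == "__main__":
    # Round trip three m/z values through the same encoding.
    values = [452.9, 462.8, 480.3]
    payload = base64.b64encode(zlib.compress(struct.pack("<3d", *values)))
    print(decode_binary_array(payload, precision_bits=64, compressed=True))
```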
From: Mike C. <tu...@gm...> - 2009-07-01 02:18:00
|
On Tue, Jun 30, 2009 at 6:11 PM, Matthew Chambers<mat...@va...> wrote:
> Hi Mike,
>
> Are you using long doubles in greylag? The reasonable fix if more than
> 15 digits are truly needed is to use a bigger data type, although a
> standard and portable long double does not exist AFAIK.

No, long doubles seem like overkill, at least at present. I gave the example only for informational purposes. It does appear to me that single precision floats are too small for some of the calculations required for recent instruments. Maybe they're enough for the peak representation itself--I'm not sure.

> If one wanted to write trivial code to read XML...

For the languages I use, access to XML parsing, base64 coding, and libz aren't really serious issues, but it is a little more involved than what I do with our current format (old ms2, which is basically one peak per line, represented as two floats in ASCII format), which makes it simple to do several basic transformations using standard Unix command-line tools.

It's tempting to concentrate on the size of spectrum files as a metric, but the amount of programmer time it takes to do things probably matters more at my shop. |
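The precision question can be made concrete with a small sketch. It uses the Mz_exact value 1596.8457 from the library entry elsewhere in this thread purely as a convenient number, and the helper name is hypothetical. It measures how far a value moves when squeezed through a 32-bit float, and checks that a decimal string of 15 or fewer significant digits survives a trip through a 64-bit double.

```python
import struct


def float32_error(value):
    """Absolute error introduced by storing a Python float as a 32-bit float."""
    stored = struct.unpack("<f", struct.pack("<f", value))[0]
    return abs(stored - value)


def survives_double(text):
    """True if a decimal string with <= 15 significant digits round-trips via a double."""
    return "%.15g" % float(text) == text


if __name__ == "__main__":
    mass = 1596.8457
    # Near 1600 Da, adjacent 32-bit floats are 2**-13 (about 1.2e-4) apart.
    print("float32 error:", float32_error(mass))
    print("15-digit round trip:", survives_double("1596.8457"))
```

For masses between 1024 and 2048 Da the single-precision rounding error can reach 2**-14, roughly 6e-5 Da, which is the scale at which Mike's concern about newer instruments becomes plausible.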
From: Matthew C. <mat...@va...> - 2009-06-30 23:12:37
|
Hi Mike,

Are you using long doubles in greylag? The reasonable fix if more than 15 digits are truly needed is to use a bigger data type, although a standard and portable long double does not exist AFAIK.

If one wanted to write trivial code to read XML, it would probably be a simple token parsing approach, in which case reading the XML comments I proposed earlier is even easier than reading some cooked up ASCII notation. And remember that the ASCII notation, whether in standard form or in XML comments, would necessarily be an optional representation. I shudder with glee at the thought of how much fun the optional "standard form" would be to deal with! ;) A DOM approach is conceivable, but unlikely to be scalable, and in any case if you've got the facilities to read XML with a DOM then you almost certainly have access to base64 decoding or can get it easily.

-Matt

Coleman, Michael wrote:
> I've been on vacation, so this is a bit late. Comments below.
>
> On 6/12/09 10:05 AM, "Matthew Chambers" <mat...@va...>
> wrote:
>
>> This is what I was refuting below. Assuming 15 or fewer base10 digits
>> are needed, a double precision float is a better representation than
>> ASCII in every way except human readability. Do you have examples of
>> reference data that uses more than 15 digits in ASCII?
>>
>
> For what it's worth, in greylag, the mass used for O (Oxygen) has 12 decimal
> digits to the right of the decimal point. (This value comes from NIST, and
> is meant to be as precise as possible.) Since peptides/proteins have masses
> of at least 1000 Da, this means that at least 16-17 significant digits would
> be needed to fully represent these calculations.
>
> One might dispute whether or not this level of precision is useful, but
> since you asked, there's an example.
>
>
>> And unless you can demonstrate that you need more
>> than 15 digits of precision in your data, human readability is the only
>> reason for ASCII representation.
>>
>
> I would argue that the possibility of writing trivial programs that read
> peak data is also a reason, perhaps a more important one. Having the peaks
> encoded does make it a bit harder to jump in and start doing something with
> them.
>
> Mike
> |
From: Matthew C. <mat...@va...> - 2009-06-30 20:13:13
|
I brought this issue up in the call because I think that the "evidence" tag in traML is too weak to go in a standard - I was proposing that a better way to do it might be to refer to an mzIdentML file for a more complete context of the evidence. I know very little about metabolomics, but the spectra libraries we're talking about are indeed proteomics-oriented. But I don't think you need to shut up - indeed, I think this may be another case where we can and should standardize within and even across domains because the annotations of a peptide's spectrum could be even more rich and detailed than that of a smaller molecule. In the spectral library domain just for peptides, there are at least 5 common formats already (http://peptide.nist.gov): MSP/ASCII, a multi-file NIST binary format, SpectraST, BiblioSpec, and X!Hunter. I'm sure you have a few more in the metabolomics domain. A lack of standards makes life so much more interesting, don't you think? ;) -Matt Steffen Neumann wrote: > On Tue, 2009-06-30 at 08:32 -0700, Eric Deutsch wrote: > >> +Can mzIdentML encode a spectral library? >> > > I am unsure whether this was discussed during the conference call, > or is left as an open point to the list. > > Anyway, I am inclined to say "no" about this, > for two reasons: > > 1) I don't know enough about analys^H^H^H^H^mzIdentML, > because my very brief looks made it look like proteomics-only. > (Or are you actually referring to a proteomics-spectral library?! > in that case I'll shut up and you can skip the rest of this mail.) > > 2) A spectral library will (at least in the future) > contain sets of spectra (different eV, MS1-MS^n, ...) > and associated annotations, which might be as complicated > as a molecule and its fragmentation brake-down products. > This requires a rich set of links between individual peaks > and their (molecular) annotation. 
> > So for small molecules (read: metabolomics stuff) we have started > to create mzAnnotate under the umbrella of the Metabolomics Standards > Initiative (MSI). http://msi-workgroups.sourceforge.net/exchange-format/ > > We have drafted some use cases, shown on > http://sourceforge.net/apps/mediawiki/metware/index.php?title=MzAnnotate > and prepared a converter for both the spectral library www.massbank.jp > and our own MassFrontier clone MetFrag, and will present these > on the MSI mailing list soon. > > Yours, > Steffen > > |
From: Matthew C. <mat...@va...> - 2009-06-30 19:21:36
|
Hi all, We need terms for Agilent MassHunter sources in the CV. In the MassHunter API there are two ways to uniquely address a spectrum: by "row number" or "scan id". Row number is essentially a 0-based index that refers to the spectra after the acquisition software has done something...perhaps internal merging? Scan id represents the ordinal number of acquisitions as they come off the instrument. So, at least on their (Q)TOF instruments, the rowNumber is very disparate from the scanId, but both of them are unique identifiers that can technically be used to refer to a native spectrum. The kink is that the MassHunter API only refers to the parent scan by its scan id and doesn't provide a way to directly translate a scan id to a row number - translation must be done indirectly by enumerating all the row numbers and building a mapping of scan id to row number. For this reason I would recommend that the nativeID format be defined as "scanId=xsd:nonNegativeInteger" but I'm open to comment on this! The source type brings another issue to a head. We actually have more vendor formats that use directories to store their raw data than those that use files. Directories: Agilent MassHunter (read with MHDAC API), Bruker/Agilent YEP, Bruker BAF, Bruker FID, Bruker U2 (previous 4 formats read with CompassXtract API), Waters MassLynx (read with DACServer API) Files: ABI WIFF (read with either Analyst or WiffFileDataReader APIs), Thermo RAW (read with XRawfile API) But we don't clearly define how to deal with the directory-based formats. I'm tempted to recommend that we include and checksum all files within the directories, but it's entirely possible that some people store alternative encodings of the data inside these directories, e.g. mzXML and MGF (I've seen this). So it would be silly to include mzXML and MGF as source files for the native data. There can also be analyses of the data stored there, like Bruker and Agilent's *.m subdirectories, or even pepXML files. 
Is it reasonable to determine which files in these sources are used by the APIs and put that information in the CV definition for the source types - possibly in a machine-readable way? Also, if we're not going to (and I wouldn't want to) define a separate source type for each subfile (see attached thread ending on 2009-19-03), we would have to document somewhere that every file that should be included in these directory-based formats should be given the directory-level CV term as its source type. -Matt > Matthew Chambers wrote: > >> > Yes, I made that change, but I forgot that every sourceFile has to have >> > a type. That does make it ugly. I was trying to make things consistent >> > between Waters and Bruker formats because they both use directories, but >> > perhaps I should have gone the other direction and made the source type >> > for Bruker directories more applicable to the format as a whole - the >> > problem is I'm not knowledgeable about those formats to know what each >> > one corresponds to that is analogous to MassLynx. In any case, I don't >> > think the meaning of the term changed. The important part is that it's >> > the MassLynx format, not whether it's called DAT or RAW. >> > >> > -Matt >> > >> > >> > Fredrik Levander wrote: >> > >> >>> >> Just noticed that the name and definition of MS:1000526 MassLynx raw >>> >> format has changed to Waters DAT format. Is this really wanted? I guess >>> >> that one would like to have all files in a MassLynx raw folder as source >>> >> files, since they will all contain some information that is used in the >>> >> mzML file, and then they are all part of the the same source file format >>> >> (in my opinion). Or otherwise there will be need to add an _FUNCTNS.INF >>> >> file format and a header.txt format, etc. 
>>> >> If there is need for separate file formats for these sub-files, I think >>> >> those terms (including the DAT one) should have new accession numbers, >>> >> since the meaning of the term has changed, or am I interpreting this in >>> >> the wrong way? >>> >> >>> >> Fredrik >>> |
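The indirect scan id to row number translation described at the top of this message (enumerate every row, record its scan id) amounts to building a lookup table once. A minimal Python sketch follows, assuming a hypothetical reader object with `row_count()` and `scan_id_for_row(row)` methods; these names and the scan ids are stand-ins and are not the MassHunter (MHDAC) API.

```python
class FakeReader:
    """Stand-in for a vendor reader; NOT the real MassHunter API."""

    def __init__(self, scan_ids):
        self._scan_ids = list(scan_ids)

    def row_count(self):
        return len(self._scan_ids)

    def scan_id_for_row(self, row):
        return self._scan_ids[row]


def build_scan_id_index(reader):
    """Map each scan id to its 0-based row number by enumerating all rows."""
    index = {}
    for row in range(reader.row_count()):
        scan_id = reader.scan_id_for_row(row)
        if scan_id in index:
            raise ValueError(f"scan id {scan_id} appears in more than one row")
        index[scan_id] = row
    return index


def native_id(scan_id):
    """Format a nativeID in the proposed 'scanId=<nonNegativeInteger>' form."""
    if scan_id < 0:
        raise ValueError("scan id must be non-negative")
    return f"scanId={scan_id}"


reader = FakeReader([1017, 2034, 3051])  # made-up scan ids
index = build_scan_id_index(reader)
parent_row = index[2034]  # resolve a parent scan id to its row number
```

With the index in hand, a precursor reference given as a scan id can be resolved in constant time, which is the main argument for paying the one-off enumeration cost.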
From: Steffen N. <sne...@ip...> - 2009-06-30 19:21:28
|
On Tue, 2009-06-30 at 08:32 -0700, Eric Deutsch wrote: > +Can mzIdentML encode a spectral library? I am unsure whether this was discussed during the conference call, or is left as an open point to the list. Anyway, I am inclined to say "no" about this, for two reasons: 1) I don't know enough about analys^H^H^H^H^mzIdentML, because my very brief looks made it look like proteomics-only. (Or are you actually referring to a proteomics spectral library?! In that case I'll shut up and you can skip the rest of this mail.) 2) A spectral library will (at least in the future) contain sets of spectra (different eV, MS1-MS^n, ...) and associated annotations, which might be as complicated as a molecule and its fragmentation breakdown products. This requires a rich set of links between individual peaks and their (molecular) annotation. So for small molecules (read: metabolomics stuff) we have started to create mzAnnotate under the umbrella of the Metabolomics Standards Initiative (MSI). http://msi-workgroups.sourceforge.net/exchange-format/ We have drafted some use cases, shown on http://sourceforge.net/apps/mediawiki/metware/index.php?title=MzAnnotate and prepared a converter for both the spectral library www.massbank.jp and our own MassFrontier clone MetFrag, and will present these on the MSI mailing list soon. Yours, Steffen |
From: Eric D. <ede...@sy...> - 2009-06-30 15:32:39
|
Present: Jim, Fredrik, Matt, Darren, Eric 1) mzML 1.1.0 - Outstanding items + none - Manuscript + Get feedback back soon. ---- 2) mzML implementations catalog + No comments at present. Please send comments. ---- 3) MIAPE-MS revision - Have revised document to discuss at ASMS ---- 4) TraML development - Updates pending - Implementations? +Try splitting "transition predicted from consensus spectrum ion trap" into two terms +Do we want to add the mzML sourceFileList section? +<evidence> is not pretty +Can mzIdentML encode a spectral library? _____ From: Eric Deutsch [mailto:ede...@sy...] Sent: Monday, June 29, 2009 10:50 PM To: 'Mass spectrometry standard development' Cc: 'Eric Deutsch' Subject: RE: PSI-MSS WG call reminder Hi everyone, attached is a revised TraML toy example file. I have the CV mostly updated, and the xsd updated. Please have a look at the file if you have a chance before the call, and then based on the feedback, I'll check in the rest and post it later. Also, I updated the implementations table at: http://www.psidev.info/index.php?q=node/257 Please let me know if you have any suggested changes to the table. Thanks, Eric _____ From: Eric Deutsch [mailto:ede...@sy...] 
Sent: Monday, June 29, 2009 5:25 PM To: 'Mass spectrometry standard development' Cc: 'Eric Deutsch' Subject: PSI-MSS WG call reminder Hi everyone, the next PSI Mass Spectrometry Standards Working Group call will be Tuesday 8am PDT: http://www.timeanddate.com/worldclock/fixedtime.html?day=30&month=6&year=2009&hour=16&min=0&sec=0&p1=136 08:00 San Francisco 11:00 New York 16:00 London 17:00 Geneva + Germany: 08001012079 + Switzerland: 0800000860 + UK: 08081095644 + USA: 1-866-314-3683 Generic international: +44 2083222500 (UK number) access code: 297427 Agenda: 1) mzML 1.1.0 - Outstanding items - Manuscript ---- 2) mzML implementations catalog ---- 3) MIAPE-MS revision - Have revised document to discuss at ASMS ---- 4) TraML development - Updates pending - Implementations? |
From: Marc S. <stu...@gm...> - 2009-06-30 14:35:02
|
Hi all, I won't make it to the phone conference today. I read the manuscript and think it is already quite good. The image has to be adapted to the current schema though. Best, Marc Eric Deutsch wrote: > > Hi everyone, the next PSI Mass Spectrometry Standards Working Group > call will be Tuesday 8am PDT: > > > > http://www.timeanddate.com/worldclock/fixedtime.html?day=30&month=6&year=2009&hour=16&min=0&sec=0&p1=136 > <http://www.timeanddate.com/worldclock/fixedtime.html?day=30&month=6&year=2009&hour=16&min=0&sec=0&p1=136> > > > > 08:00 San Francisco > > 11:00 New York > > 16:00 London > > 17:00 Geneva > > > > + Germany: 08001012079 > > + Switzerland: 0800000860 > > + UK: 08081095644 > > + USA: 1-866-314-3683 > > Generic international: +44 2083222500 (UK number) > > > > access code: 297427 > > > > Agenda: > > 1) mzML 1.1.0 > > - Outstanding items > > - Manuscript > > > > ---- > > 2) mzML implementations catalog > > > > ---- > > 3) MIAPE-MS revision > > - Have revised document to discuss at ASMS > > > > ---- > > 4) TraML development > > - Updates pending > > - Implementations? > > > > > > > > ------------------------------------------------------------------------ > > ------------------------------------------------------------------------------ > > ------------------------------------------------------------------------ > > _______________________________________________ > Psidev-ms-dev mailing list > Psi...@li... > https://lists.sourceforge.net/lists/listinfo/psidev-ms-dev > |
From: Eric D. <ede...@sy...> - 2009-06-30 01:25:29
|
Hi everyone, the next PSI Mass Spectrometry Standards Working Group call will be Tuesday 8am PDT: http://www.timeanddate.com/worldclock/fixedtime.html?day=30&month=6&year=2009&hour=16&min=0&sec=0&p1=136 08:00 San Francisco 11:00 New York 16:00 London 17:00 Geneva + Germany: 08001012079 + Switzerland: 0800000860 + UK: 08081095644 + USA: 1-866-314-3683 Generic international: +44 2083222500 (UK number) access code: 297427 Agenda: 1) mzML 1.1.0 - Outstanding items - Manuscript ---- 2) mzML implementations catalog ---- 3) MIAPE-MS revision - Have revised document to discuss at ASMS ---- 4) TraML development - Updates pending - Implementations? |
From: Coleman, M. <MK...@st...> - 2009-06-29 17:18:41
|
I've been on vacation, so this is a bit late. Comments below. On 6/12/09 10:05 AM, "Matthew Chambers" <mat...@va...> wrote: > This is what I was refuting below. Assuming 15 or fewer base10 digits > are needed, a double precision float is a better representation than > ASCII in every way except human readability. Do you have examples of > reference data that uses more than 15 digits in ASCII? For what it's worth, in greylag, the mass used for O (Oxygen) has 12 decimal digits to the right of the decimal point. (This value comes from NIST, and is meant to be as precise as possible.) Since peptides/proteins have masses of at least 1000 Da, this means that at least 16-17 significant digits would be needed to fully represent these calculations. One might dispute whether or not this level of precision is useful, but since you asked, there's an example. > And unless you can demonstrate that you need more > than 15 digits of precision in your data, human readability is the only > reason for ASCII representation. I would argue that the possibility of writing trivial programs that read peak data is also a reason, perhaps a more important one. Having the peaks encoded does make it a bit harder to jump in and start doing something with them. Mike |
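One way to put numbers on the digit-count argument is to compare the spacing of adjacent doubles at a given mass with the decimal resolution in question. The Python sketch below does that with `math.ulp` (Python 3.9 or later); the helper names and the sample masses are illustrative assumptions, not values taken from greylag or NIST.

```python
import math
import struct
import sys

# Decimal digits guaranteed to survive a text -> double -> text round trip.
GUARANTEED_DIGITS = sys.float_info.dig  # 15 for IEEE 754 binary64


def finer_than(mass, decimals):
    """True if adjacent doubles near `mass` are closer than 10**-decimals."""
    return math.ulp(mass) < 10.0 ** -decimals


def float32_round_trip(value):
    """Store a value as a 32-bit float and read it back as a Python float."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


for mass in (1000.0, 10000.0):  # illustrative masses in Da
    print(f"{mass:>8.1f} Da: double spacing {math.ulp(mass):.3e}, "
          f"resolves 12 decimals: {finer_than(mass, 12)}")
```

Near 1000 Da the spacing between doubles is 2^-43, roughly 1.1e-13, so twelve decimal places are still resolved there; the margin disappears as the mass grows, and a 32-bit float is far coarser at any of these masses.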
From: Lennart M. <len...@eb...> - 2009-06-23 16:01:35
|
Dear PSI-MS Enthusiasts, Please find the meeting minutes of the 20090623 PSI-MS phone conference below. Telephone conference 20090623 Participants: - Jim Shofstahl - Pierre-Alain Binz - Marc Sturm - Darren Kessner - Eric Deutsch - Matt Chambers - Lennart Martens (notes) Agenda: - Numerical precision; topic highlighted by Steve Stein: binary floats did not come with (implicit) precision + Eric suggested an optional additional array containing precision, but this might be overkill + Matt suggests that precision is typically a derived variable (e.g. from instrument resolution) + Steve's precision is likely to come from meta-analysis (in the spectrum library) + Pierre-Alain points out that the libraries want to move from nominal values to more precise values, where this precision can vary depending on the datapoint + Matt asks whether this means that experimental peaks that are (confidently) assigned are corrected or replaced by theoretical values + Eric mentions that mzML was not intended for spectral libraries in design, and that we had this discussion before. Maybe we should therefore also consider the output from vendors, and see if they need to store such information. + Jim says that Thermo can provide m/z resolution information in addition to the actual m/z values. But this information is currently not used nor readily available in the Thermo export tool. + Darren asks whether a binary representation for a fixed floating point number (like 15.00) exists? Nobody knows. 
--> Propose to add an m/z precision array and ask on the list whether we need an intensity precision array type in mzML (similar to the signal-to-noise array type we already have) - Validation of example files + Steffen's file does not yet validate, due to a missing term in the CV (absorbance units) --> Eric will contact the Unit Ontology maintainer to add this term to the Unit Ontology - CV terms changes + PSI-PI people have been adding some terms, and Marc has committed some changes to MALDI laser units (together with Andreas) - Converters + Several implementations are coming along nicely, and various people are working with vendors on implementations - MIAPE MS compared to MCP Guidelines + A revised version will be discussed with the various journals, this version should be finalized this week by Pierre-Alain, and Sandra will create a SurveyMonkey survey as well. - TraML development + Eric hasn't gotten around to spending much time on this lately, but will get around to updating the schema in the next days. - Upcoming phone conference schedule --> Will be decided each Monday, based on the number of topics we have lined up - Manuscript about mzML + Nature Biotechnology back-to-back with mzIdentML seems like a real option right now, based on communication with Nat Biotech editors --> Lennart takes care of writing the draft. Cheers, lnnrt. |
From: Steffen N. <sne...@ip...> - 2009-06-23 10:18:29
|
On Tue, 2009-06-23 at 01:18 -0700, Eric Deutsch wrote: > Hi everyone, the next PSI Mass Spectrometry Standards Working Group > call will be Tuesday 8am PDT: I won't make it. My example file is one of those not yet validating :-( > 1) mzML 1.1.0 > - Validation of example files > - Controlled vocabulary I have the problem that there are no "absorbance units AU" yet, see my mail with some proposals: Subject: Re: [Psidev-ms-dev] Example files Date: Fri, 12 Jun 2009 15:38:55 +0200 Yours, Steffen -- IPB Halle AG Massenspektrometrie & Bioinformatik Dr. Steffen Neumann http://www.IPB-Halle.DE Weinberg 3 http://msbi.bic-gh.de 06120 Halle Tel. +49 (0) 345 5582 - 1470 +49 (0) 345 5582 - 0 sneumann(at)IPB-Halle.DE Fax. +49 (0) 345 5582 - 1409 |
From: Eric D. <ede...@sy...> - 2009-06-23 08:20:14
|
Hi everyone, the next PSI Mass Spectrometry Standards Working Group call will be Tuesday 8am PDT: http://www.timeanddate.com/worldclock/fixedtime.html?day=23&month=6&year=2009&hour=16&min=0&sec=0&p1=136 08:00 San Francisco 11:00 New York 16:00 London 17:00 Geneva + Germany: 08001012079 + Switzerland: 0800000860 + UK: 08081095644 + USA: 1-866-314-3683 Generic international: +44 2083222500 (UK number) access code: 297427 Agenda: 1) mzML 1.1.0 - Numerical precision - Validation of example files - Controlled vocabulary - Other items? ---- 2) Need to update the mzML implementations catalog ---- 3) MIAPE-MS revision - Have revised document to discuss at ASMS ---- 4) TraML development - Updates pending - Implementations? - Precision issue (1., 1.0, 1.00) - XML comment: ASCII 1.00 but what good is this? - just have a precision array type - Or significant digits? - 1001.5 has precision 1 and 5 significant digits? - 1.24e7 has precision -6 (?!) and 3 significant digits? |
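The precision versus significant digits question in the agenda can be made concrete with Python's `decimal` module, which keeps the digits exactly as written. This is only a sketch under one assumed convention: "precision" is the number of digits after the decimal point (negative when the last written digit sits left of the units place), and every written digit counts as significant. Under that convention 1.24e7 comes out with precision -5 rather than the -6 queried above.

```python
from decimal import Decimal


def precision_and_significant_digits(text):
    """Return (decimal places, significant digits) for a numeric string.

    Assumed convention, not a PSI definition: decimal places is the negated
    exponent of the last written digit; all written digits are significant.
    """
    _sign, digits, exponent = Decimal(text).as_tuple()
    return -exponent, len(digits)


for text in ("1.", "1.0", "1.00", "1001.5", "1.24e7"):
    places, significant = precision_and_significant_digits(text)
    print(f"{text:>7}: precision {places}, significant digits {significant}")
```

Either number could be stored per data point in a separate array; which of the two is the more natural unit for a precision array is exactly the open question on the agenda.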
From: Eric D. <ede...@sy...> - 2009-06-16 06:13:19
|
Hi everyone, I think our topic list is so thin that it's not worth having a call this week. Let's plan on next week, though! Agenda for next week's call: 1) mzML 1.1.0 - Precision issue (1., 1.0, 1.00) - XML comment: hmm.. - just have a precision array type - Or significant digits? - 1001.50 has precision 2 and 6 significant digits? - 1.24e7 has precision -6 (?!) and 3 significant digits? 2) TraML development - Revision still to be prepared and sent around |
From: Florian R. <fl...@eb...> - 2009-06-15 16:06:57
|
Hi Angel, yes, thanks! I did not take into account that JAXB is clever and already does the base64 decoding! That is why I got troubles trying to decode (the already decoded) byte[]... I am also using the Inflater/Deflater for the compression and that works fine. Thanks a lot for pointing that out! Florian Angel Pizarro wrote: > OK, so think I am reading this completely wrong, but isn't the default JAXB > base64binary to byte[] decoding preserve the zlib compression, resulting in > a byte[] that is still zlib compressed? If so the java.util.zip.Inflater > should handle this correctly to deflate the byte[] before translating it > into Float[], int[] whatever. > My AS3 parser (soon to be released) encounters the same issue and I also > have deflate the byte[] before reading into an Array of Number (same data > type for double or float in AS3). > > Am I missing something? -angel > > On Mon, Jun 15, 2009 at 10:45 AM, Florian Reisinger <fl...@eb...>wrote: > >> Hi Matt, >> >> maybe I did not express myself well enough. >> >> We do have methods that take care of converting the data from the base64 >> encoded XML data into a >> more convenient double[] in Java. Taking into account the CVParams for >> precision and compression, >> this will actually hold double or float values. The user can then check if >> the actual data contained >> in the array is in double or float values... >> (I still have to cater for the int and String representations, though) >> >> >> But what I am having trouble with is the data I read from an existing mzML >> file. We used the default >> JAXB generated mapping and the data stored by the Unmarshaller in the >> mapped byte[] variable does >> not seem to be base64 encoded. It seems the data in the XML is not >> correctly converted into the Java >> byte[], which renders my attempts to base64-decode it useless. >> >> With the changes I mentioned to the XML schema, the mapping changes and the >> data is now unmarshalled >> into a String variable. 
The byte[] obtained from this String is base64 >> encoded and if converted into >> double/float I get reasonable values... >> >> >> Cheers, >> Florian |
From: Angel P. <an...@ma...> - 2009-06-15 15:25:54
|
OK, so I think I am reading this completely wrong, but doesn't the default JAXB base64Binary to byte[] decoding preserve the zlib compression, resulting in a byte[] that is still zlib compressed? If so, java.util.zip.Inflater should handle this correctly and inflate the byte[] before it is translated into Float[], int[], whatever. My AS3 parser (soon to be released) encounters the same issue, and I also have to inflate the byte[] before reading it into an Array of Number (same data type for double or float in AS3). Am I missing something? -angel On Mon, Jun 15, 2009 at 10:45 AM, Florian Reisinger <fl...@eb...> wrote: > Hi Matt, > > maybe I did not express myself well enough. > > We do have methods that take care of converting the data from the base64 > encoded XML data into a > more convenient double[] in Java. Taking into account the CVParams for > precision and compression, > this will actually hold double or float values. The user can then check if > the actual data contained > in the array is in double or float values... > (I still have to cater for the int and String representations, though) > > > But what I am having trouble with is the data I read from an existing mzML > file. We used the default > JAXB generated mapping and the data stored by the Unmarshaller in the > mapped byte[] variable does > not seem to be base64 encoded. It seems the data in the XML is not > correctly converted into the Java > byte[], which renders my attempts to base64-decode it useless. > > With the changes I mentioned to the XML schema, the mapping changes and the > data is now unmarshalled > into a String variable. The byte[] obtained from this String is base64 > encoded and if converted into > double/float I get reasonable values... > > > Cheers, > Florian > > > > |
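The decoding order discussed in this thread (base64 first, then zlib inflation, then interpretation as floats) can be sketched in a few lines. The example is in Python rather than the Java/JAXB code the thread is about, the function names are made up, and little-endian byte order is assumed; the two flags stand in for the compression and precision CVParams. When a binding layer such as JAXB has already done the base64 step, only the last two steps remain.

```python
import base64
import struct
import zlib


def decode_binary_array(text, compressed, double_precision):
    """Decode the base64 text of a binary data array into a list of floats."""
    raw = base64.b64decode(text)
    if compressed:
        raw = zlib.decompress(raw)  # inflate only after base64 decoding
    code, size = ("d", 8) if double_precision else ("f", 4)
    if len(raw) % size:
        raise ValueError("byte count is not a multiple of the value size")
    count = len(raw) // size
    return list(struct.unpack(f"<{count}{code}", raw))  # "<" = little-endian


def encode_binary_array(values, compressed, double_precision):
    """Inverse of decode_binary_array: pack, optionally deflate, base64."""
    code = "d" if double_precision else "f"
    raw = struct.pack(f"<{len(values)}{code}", *values)
    if compressed:
        raw = zlib.compress(raw)
    return base64.b64encode(raw).decode("ascii")


mz = [445.12, 446.5, 1001.25]  # made-up m/z values
encoded = encode_binary_array(mz, compressed=True, double_precision=True)
decoded = decode_binary_array(encoded, compressed=True, double_precision=True)
```

Running the base64 step twice, as happened with the already-decoded byte[] from the unmarshaller, produces garbage or an error, which matches the symptom reported earlier in the thread.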