psidev-ms-dev Mailing List for Proteomics Standards Initiative (Page 107)

Hi,

My comments on mzML 0.99.0, after reading (most of) the posts on the 
mailing list and trying to convert a peak list into the format, are as 
follows:

The standard is composed of a schema with little control and a lot of 
cvParams that are controlled by a separate file. Updates to the CV do 
not require schema updates, and the CV rules file should also be stable. 
For the validation of files it would, as pointed out by several people, 
be straightforward to automatically generate an XSD which reflects the 
current CV. Otherwise the semantic Java validator also does the job (and 
also has other benefits when it comes to large files). For us it doesn't 
matter which method is used, but the real issue is how to handle 
versions of the CV. As long as nothing is deleted from the CV everything 
should be fine from an implementation point of view though.
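
As an illustration of the XSD generation mentioned above, here is a 
minimal sketch in Python. It only shows the idea: the function name 
enum_simple_type and the type name "CompressionType" are made up for 
this example, and the list of allowed values is hard-coded, whereas a 
real generator would read the terms from the current CV file.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"


def enum_simple_type(type_name, allowed_values):
    """Return an xs:simpleType that restricts xs:string to allowed_values.

    type_name and allowed_values are supplied by the caller; in a real
    generator they would come from the CV, which is not parsed here.
    """
    ET.register_namespace("xs", XS)
    simple = ET.Element("{%s}simpleType" % XS, {"name": type_name})
    restriction = ET.SubElement(
        simple, "{%s}restriction" % XS, {"base": "xs:string"}
    )
    for value in allowed_values:
        # One xs:enumeration facet per CV term name.
        ET.SubElement(restriction, "{%s}enumeration" % XS, {"value": value})
    return ET.tostring(simple, encoding="unicode")


if __name__ == "__main__":
    print(enum_simple_type("CompressionType",
                           ["no compression", "zlib compression"]))
```

Regenerating such a fragment for every CV release would keep the XSD in 
step with the CV, but it does not by itself solve the versioning 
question.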

A major problem would be if something is added to the CV which breaks 
current parsers. A new compression type could be added to the CV without 
notice, and anyone using that compression type would be producing 
standard-compliant files, but parsers that are supposed to be 
standard-compliant would not be able to parse those files correctly. So 
there are a few places where I think the allowed values should be set 
under enum constraints in the main standard schema, so that a new schema 
version is enforced if these fields are changed. I have the feeling that 
the CV version will not be as controlled as the schema version. Fields 
that I propose should be enums are (this is maybe one step back again...):

In binaryDataArray:

compressionType (no compression/zlib compression)
valueType (32-bit float, 64-bit float, 16-bit integer, 32-bit integer or 
64-bit integer)

In spectrum:

spectrumType (centroid, profile).

These parameters could be attributes or cvParams (but under schema 
control) if CV accession numbers are important.
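
To make the parser problem concrete, here is a minimal sketch in Python 
of a reader that only knows the compression and value types listed 
above. The function name decode_binary_array is made up for this 
example, and the sketch assumes the binary data is base64-encoded and 
little-endian. Any value outside the closed sets is rejected, which is 
exactly what would happen to a file using a term added to the CV after 
the parser was written.

```python
import base64
import struct
import zlib

# Closed sets, mirroring the enum values proposed above.
COMPRESSION_TYPES = {"no compression", "zlib compression"}
VALUE_FORMATS = {
    "32-bit float": "f",
    "64-bit float": "d",
    "16-bit integer": "h",
    "32-bit integer": "i",
    "64-bit integer": "q",
}


def decode_binary_array(text, compression_type, value_type):
    """Decode base64 text into a list of numbers.

    Assumes little-endian byte order. Raises ValueError for any
    compression or value type outside the closed sets above.
    """
    if compression_type not in COMPRESSION_TYPES:
        raise ValueError("unknown compression type: %r" % compression_type)
    if value_type not in VALUE_FORMATS:
        raise ValueError("unknown value type: %r" % value_type)

    raw = base64.b64decode(text)
    if compression_type == "zlib compression":
        raw = zlib.decompress(raw)

    fmt = VALUE_FORMATS[value_type]
    size = struct.calcsize("<" + fmt)
    if len(raw) % size != 0:
        raise ValueError("data length is not a multiple of the value size")
    count = len(raw) // size
    return list(struct.unpack("<%d%s" % (count, fmt), raw))
```

With the values under schema control, a file using a new compression 
type would fail schema validation against the old schema version, 
instead of failing somewhere inside each parser.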

Other comments:

There is also an acquisitionList spectrumType attribute which probably 
could be removed since we have spectrumDescription - 
spectrumRepresentation (spectrumType). The only use would be if the 
acquisitions were in profile mode but the peak picking algorithm that 
worked on the spectra turned them into a centroid peak list and one 
would like to specify this (?).

If the spectrum is a combination of multiple scans (as specified using 
acquisitionList) one would normally not use the 'scan' element. The 
question is then how to give the retention time. We did not succeed in 
doing this in a valid way, see 
http://trac.thep.lu.se/trac/fp6-prodac/browser/trunk/mzML/FF_070504_MSMS_5B.mzML 

for a simple (but invalid) way of doing it. More correct would be to put 
the cvParam with the retention time under the acquisition, but this is 
not allowed either.

Why not allow softwareParam to be either a userParam or a cvParam? Or 
must all software that works on mzML be in the CV?

How about having precursor m/z, intensity and charge state as 
non-required attributes to ionSelection? These fields are really used in 
every file.
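
A minimal sketch in Python of how a reader could handle such optional 
attributes. The attribute names mz, intensity and chargeState, as well 
as the IonSelection class and read_ion_selection function, are made up 
for this example and are not taken from the schema.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Optional


@dataclass
class IonSelection:
    # All three fields are optional, matching non-required attributes.
    mz: Optional[float] = None
    intensity: Optional[float] = None
    charge_state: Optional[int] = None


def read_ion_selection(xml_text):
    """Parse an ionSelection element with hypothetical optional attributes."""
    elem = ET.fromstring(xml_text)

    def optional(name, cast):
        raw = elem.get(name)
        return cast(raw) if raw is not None else None

    return IonSelection(
        mz=optional("mz", float),
        intensity=optional("intensity", float),
        charge_state=optional("chargeState", int),
    )
```

A missing attribute simply becomes None, so files that do not give 
these values would remain valid.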

A final comment, though, is that all these things are really minor, and 
that getting the standard released is what matters!

Regards

Fredrik

psidev-ms-dev — Mass spectroscopy standard development