From: Jones, A. <And...@li...> - 2012-03-16 23:08:34
Hi all,

I've written up my notes on mzQuantML and circulated them for comment and amendment. I've included the PSI-MS list as well, since I would like comments from this group too, especially with regard to SRM support.

cheers
Andy

Summary of mzQuantML outcomes of PSI meeting
**********************************************

1. No major problems with the general schema, but it is very challenging to implement consistently for different techniques.

2. Some minor changes to the schema seem sensible; some we discussed, some I am proposing now:

- Move all metadata-type elements up to the top of the hierarchy
- Enforce a reference from PeptideConsensus to one or more Features (a feature is minimally just an m/z and charge value); there is no reason not to provide this
- I would like to propose a simplification in the reference system from PeptideConsensus to Feature

We currently have parallel arrays within PeptideConsensus:

<PeptideConsensus id="PEPTIDER" charge="2">
  <PeptideSequence>PEPTIDER</PeptideSequence>
  <Fe_refs>ft_13768 ft_137629 ft_137630 ft_137631 ft_137632 ft_137633 ft_137634 ft_137635 ft_137636 ft_137637 ft_137638 ft_137639 ft_573540 ft_573541 ft_573542 ft_573543 ft_573544 ft_573545 ft_573546 ft_573547 ft_573548 ft_573549 ft_573550 ft_573551</Fe_refs>
  <Assay_refs>ass_0 ass_1 ass_2 ass_3 ass_4 ass_5 ass_6 ass_7 ass_8 ass_9 ass_10 ass_11</Assay_refs>
</PeptideConsensus>

In our example files, the parallel arrays contain lots of errors. These could be checked by the semantic validator, but I prefer the long-hand way of doing this, which explicitly ties every feature to an assay:

<PeptideConsensus id="PEPTIDER" charge="2">
  <PeptideSequence>PEPTIDER</PeptideSequence>
  <FeatureRef feature_ref="featureB_1" assay_ref="B1"/>
  <FeatureRef feature_ref="featureB_1" assay_ref="B1"/>
  <FeatureRef feature_ref="featureB_1" assay_ref="B1"/>
  <FeatureRef feature_ref="featureB_1" assay_ref="C1"/>
  <FeatureRef feature_ref="featureE_1" assay_ref="E1"/>
  ....
</PeptideConsensus>

- Some minor changes to the case of elements
- Added optional chromatogram_id and spectrum_id to feature. The former is required for SRM; the latter seems like it would be useful for approaches that are not based on rt maps.
- Made charge mandatory on Feature - is this ever not known?

3. The major new proposal at the meeting was that we need to build (very soon) semantic validation software to enforce rules beyond XML schema checking and mapping file validation. This is because mzQuantML is a general specification and different types of quant method need to be represented consistently. As such, we have created a set of rules which will be formalised in semantic validation software. The semantic validator will perform the following steps (to be coded in Java):

- XML Schema validation
- First check the type of method (in level 1 we support MS1 label free, MS1 label, MS2 tag and spectral count)
- Load the appropriate module encoding the rules defined in Rule Files in the schema directory for this technique
- Provide warnings or error messages for each rule broken by the file
- Perform CV validation by loading the general mzq Mapping file and a specific mapping file for the technique (can we re-use any of the EBI's validator for this?)

ACTION: Da to start work on the software, release for beta testing by the end of the month.

4. The plan is to get the core specifications for mzQuantML back into the PSI document process very soon, with the semantic validation software encoding the specific types of example soon after or at the same time - still to discuss. The core documentation needs to describe only the basic format. The mechanism for describing the technique-specific encodings may be:

- Appendices to the main spec doc
- PSI Informational documents
- Some other mechanism?

5. We will resume weekly conference calls when possible for the next 2-3 months to get everything finalised; the proposal is to resume at 4pm UK time on Thursdays?

6. SRM support - This was identified by a reviewer as an important need. Eric has started an example document and we should see this through to completion before submitting the core specifications, since there may be further minor schema changes. My current feeling is to leave SRM out of the level 1 technique support, since we still do not have sufficient experience of what is currently produced by SRM tools. If anyone from the MS community is willing to help with this, we can try to include it in level 1; otherwise it will have to wait for the level 2 release, due at the end of the year.

ACTION: Andy to add further schema changes for SRM support, including method files for referencing TraML etc.

7. We would like to get mzQuantML support into ProteoWizard soon to sanity check the schema, e.g. exporting from TPP tools.

ACTION: Andy to liaise with Parag to find a suitable person to code this up - someone could be employed on a short-term basis to do this, any volunteers?

Did I miss anything else...?
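
To make the reference checks under point 2 a little more concrete, here is a minimal sketch in Python (the validator itself is planned in Java, so this is illustration only). It assumes, hypothetically, that the parallel arrays are meant to pair up one-to-one; the notes do not spell out the pairing rule. The function names and the toy ids ft_1, ft_2, ft_3 are made up for the example and are not part of the schema or of any agreed rule set.

```python
import xml.etree.ElementTree as ET


def check_parallel_arrays(peptide):
    """Flag a length mismatch between Fe_refs and Assay_refs.

    ASSUMPTION: the two arrays pair up one-to-one; this rule is hypothetical.
    """
    features = (peptide.findtext("Fe_refs") or "").split()
    assays = (peptide.findtext("Assay_refs") or "").split()
    if len(features) != len(assays):
        return [
            f"{peptide.get('id')}: {len(features)} feature refs "
            f"but {len(assays)} assay refs"
        ]
    return []


def check_feature_refs(peptide, known_features, known_assays):
    """Check that every FeatureRef names a known feature and a known assay."""
    errors = []
    for ref in peptide.findall("FeatureRef"):
        feature = ref.get("feature_ref")
        assay = ref.get("assay_ref")
        if feature not in known_features:
            errors.append(f"{peptide.get('id')}: unknown feature {feature!r}")
        if assay not in known_assays:
            errors.append(f"{peptide.get('id')}: unknown assay {assay!r}")
    return errors


# Toy input in the parallel-array style (ids are invented for illustration).
parallel = ET.fromstring(
    '<PeptideConsensus id="PEPTIDER" charge="2">'
    "<Fe_refs>ft_1 ft_2 ft_3</Fe_refs>"
    "<Assay_refs>ass_0 ass_1</Assay_refs>"
    "</PeptideConsensus>"
)

# Toy input in the long-hand style, one FeatureRef per feature/assay pair.
longhand = ET.fromstring(
    '<PeptideConsensus id="PEPTIDER" charge="2">'
    '<FeatureRef feature_ref="featureB_1" assay_ref="B1"/>'
    '<FeatureRef feature_ref="featureB_1" assay_ref="C1"/>'
    "</PeptideConsensus>"
)

print(check_parallel_arrays(parallel))
print(check_feature_refs(longhand, {"featureB_1"}, {"B1"}))
```

With the long-hand form each reference can be checked on its own, whereas the parallel arrays can only be checked as a whole, which is part of why the explicit FeatureRef style seems easier to validate.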