From: Jones, A. <And...@li...> - 2019-07-02 12:53:35
Hi Ville,

Yes, I see that this is not a use case we covered when we designed this mechanism. All of the alternatives you propose cause some minor potential problems, and probably we would be best making a minor schema increment to fix the issue rather than trying to hack something that works.

Are you happy for me to post this as an issue on GitHub, so we can track it and come up with a solution there?

Best wishes
Andy

From: Ville Koskinen [mailto:vi...@ma...]
Sent: 02 July 2019 13:26
To: psi...@li...
Subject: [Psidev-ms-vocab] Encoding site localisation confidence for multiple modifications (mzIdentML 1.2)

Dear all,

we're looking at what changes are needed to export mzIdentML 1.2 from Mascot Server. One problem we've encountered so far is with site localisation scores. Here's an example query with two modifications and multiple permutations (scroll to the bottom of page):

http://www.matrixscience.com/cgi/peptide_view.pl?file=..%2Fdata%2F20190108%2FF001291.dat;_msresflags=3138;_msresflags2=266;ave_thresh=17;db_idx=2;hit=1;index=TRAP_PLAFA;px=1;query=8118;section=5

The confidence percentages are based on the score difference between adjacent ranks. There is no site-specific score; the percentage is for the joint assignment confidence of Oxidation *and* dHex(1)Hex(1).

1) One possibility is to have the same modification index for different <Modification> elements. For example:

    <Peptide Id="8118_1">
      <PeptideSequence>TASCGVWDEWSPCSVTCGK</PeptideSequence>
      <Modification monoisotopicMassDelta="15.994915" location="7" residues="M">
        <cvParam cvRef="PSI-MS" accession="MS:1002504" name="modification index" value="1"/>
      </Modification>
      <Modification monoisotopicMassDelta="308.110732" location="16" residues="T">
        <cvParam cvRef="PSI-MS" accession="MS:1002504" name="modification index" value="1"/>
      </Modification>
    </Peptide>

Then, under <SpectrumIdentificationResult>, the line referring to modification index 1 means a simultaneous assignment.
We could then encode the 10 different permutations like:

    <cvParam cvRef="PSI-MS" accession="MS:xxxxx" name="yyyy" value="1:19.39:7|16:true" />
    <cvParam cvRef="PSI-MS" accession="MS:xxxxx" name="yyyy" value="1:19.39:7|14:true" />
    ...
    <cvParam cvRef="PSI-MS" accession="MS:xxxxx" name="yyyy" value="1:0.61:10|3:true" />
    <cvParam cvRef="PSI-MS" accession="MS:xxxxx" name="yyyy" value="1:0.61:10|1:true" />

This is assuming the site alternation is in the same order as the <Modification> elements (oxidation|dhex). Is this allowed, and is this the intended encoding?

2) If there must be a one-to-one mapping between modification index and modification name/delta, another possibility is to extend the regular expression constraint. Maybe something like this would work?

    <cvParam cvRef="PSI-MS" accession="MS:xxxxx" name="yyyy" value="1:19.39:7:true,2:19.39:16:true" />

The least ambiguous syntax would be the following, but I realise this is a big departure from the existing syntax:

    <cvParam cvRef="PSI-MS" accession="MS:xxxxx" name="yyyy" value="(1:7,2:16):19.39:true" />

Regards,
Ville

--
Ville Koskinen
Matrix Science
64 Baker Street
London W1U 7GB, UK
Tel: +44 (0)20 7486 1050
Fax: +44 (0)20 7224 1344
vi...@ma...
http://www.matrixscience.com

Matrix Science Ltd. is registered in England and Wales
Company number 3533898
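
To compare the three value syntaxes proposed in the message, here is a minimal Python sketch of how a reader might parse each one. Nothing here comes from the mzIdentML specification or from Mascot: the `Localisation` tuple and the `parse_option_*` function names are hypothetical, and the field order (modification index, score, position(s), trailing true/false flag) is an assumption read off the example values.

```python
from typing import NamedTuple, Tuple


class Localisation(NamedTuple):
    """One joint site assignment (hypothetical structure, for illustration only)."""
    sites: Tuple[Tuple[int, int], ...]  # (modification index, position) pairs
    score: float                        # joint confidence, e.g. 19.39
    flag: bool                          # trailing true/false field


def parse_option_1(value: str) -> Localisation:
    """Shared-index form, e.g. "1:19.39:7|16:true".

    Positions separated by "|" are assumed to follow the order of the
    <Modification> elements, since they all carry the same index.
    """
    index, score, positions, flag = value.split(":")
    sites = tuple((int(index), int(p)) for p in positions.split("|"))
    return Localisation(sites, float(score), flag == "true")


def parse_option_2(value: str) -> Localisation:
    """Comma-joined form, e.g. "1:19.39:7:true,2:19.39:16:true".

    The score and flag are repeated per site; this sketch takes them
    from the first entry and does not check that the repeats agree.
    """
    parts = [entry.split(":") for entry in value.split(",")]
    sites = tuple((int(idx), int(pos)) for idx, _score, pos, _flag in parts)
    return Localisation(sites, float(parts[0][1]), parts[0][3] == "true")


def parse_option_3(value: str) -> Localisation:
    """Grouped form, e.g. "(1:7,2:16):19.39:true"."""
    # Split from the right so the colons inside the parentheses are untouched.
    group, score, flag = value.rsplit(":", 2)
    pairs = [pair.split(":") for pair in group.strip("()").split(",")]
    sites = tuple((int(idx), int(pos)) for idx, pos in pairs)
    return Localisation(sites, float(score), flag == "true")


if __name__ == "__main__":
    print(parse_option_1("1:19.39:7|16:true"))
    print(parse_option_2("1:19.39:7:true,2:19.39:16:true"))
    print(parse_option_3("(1:7,2:16):19.39:true"))
```

In this sketch, option 1 yields two sites that both carry index 1, so the positions can only be matched to a modification by element order, whereas options 2 and 3 keep a distinct index per site.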