From: Martin E. <mar...@ru...> - 2008-05-27 12:32:01
Hi!

We are currently discussing how to describe search engine parameters (schema attributes, elements, CV params). To describe the behavior of a cleavage enzyme (spectrum identification via database search), we should discuss the attached schema with examples. It is derived from the Excel spreadsheet http://code.google.com/p/psi-pi/source/browse/trunk/cv/search_engine_outputs_2008May19.xls and from phenyx / insilicospectro (Pierre-Alain's mail of 5/21/2008). See the schema annotations for further explanation.

As an alternative to the <site> element, the schema allows a <siteregexp> element, which permits a short description of complex cleavage rules.

Bye
Martin

Trypsin:

<cleavageEnzymes>
  <oneCleavageEnzyme identifier="Trypsin_(KR_noP)">
    <cleavageEnzymeNames>
      <cvParam accession="PSI-PI:000XXX" name="Trypsin" cvRef="PSI-PI"/>
    </cleavageEnzymeNames>
    <site cleavPattern="K" adjacentPatternBlocking="P" terminus="C"/>
    <site cleavPattern="R" adjacentPatternBlocking="P" terminus="C"/>
    <CTermGain>OH</CTermGain>
    <NTermGain>H</NTermGain>
  </oneCleavageEnzyme>
</cleavageEnzymes>

Complex, unrealistic example to show the different possibilities of cleavPattern, cleavSite, terminus, minSpacing, missedcleavages:

<cleavageEnzymes>
  <oneCleavageEnzyme identifier="Fantasy">
    <cleavageEnzymeNames>
      <cvParam accession="PSI-PI:000XX1" name="Fantasy" cvRef="PSI-PI"/>
      <cvParam accession="PSI-PI:000XX2" name="Fantasy_synonym" cvRef="PSI-PI"/>
    </cleavageEnzymeNames>
    <site cleavPattern="EISI" terminus="C"/>
    <!-- cleave after each EISI -->
    <site cleavPattern="SIKI" terminus="N" minSpacing="6"/>
    <!-- cleave before each SIKI; at least 6 amino acids between two cleavage sites -->
    <site cleavPattern="DEQQD" cleavSite="DEQQD" terminus="C"/>
    <!-- cleave after each DEQQD -->
    <site cleavPattern="LEQ" cleavSite="" terminus="C"/>
    <!-- cleave after each LEQ -->
    <site cleavPattern="DET" cleavSite="" terminus="N"/>
    <!-- cleave before each DET -->
    <site cleavPattern="DQTD" cleavSite="DQT" terminus="C"/>
    <!-- recognize DQTD, cleave after the DQT of it -->
    <site cleavPattern="ELPD" cleavSite="E" terminus="N"/>
    <!-- semantic error, because terminus must be C if cleavSite is given! -->
    <CTermGain>OH</CTermGain>
    <NTermGain>H</NTermGain>
  </oneCleavageEnzyme>
  <oneCleavageEnzyme identifier="Trypsin">
    <cleavageEnzymeNames>
      <cvParam accession="PSI-PI:000456" name="Trypsin" cvRef="PSI-PI"/>
    </cleavageEnzymeNames>
    <site cleavPattern="K" terminus="C" adjacentPatternBlocking="P"/>
    <site cleavPattern="R" terminus="C" adjacentPatternBlocking="P"/>
    <CTermGain>OH</CTermGain>
    <NTermGain>H</NTermGain>
  </oneCleavageEnzyme>
</cleavageEnzymes>
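
To make the intended reading of the <site> attributes concrete, here is a rough Python sketch of how a consumer might apply them to a protein sequence. It is only an illustration under assumptions, not part of the proposed schema: the function names cleavage_positions and digest are hypothetical, only cleavPattern, adjacentPatternBlocking and terminus are handled (cleavSite, minSpacing and missed cleavages are ignored), and adjacentPatternBlocking is assumed to mean "no cut if this pattern sits directly on the other side of the cut".

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the Trypsin example from the mail (names and gains omitted).
ENZYME_XML = """
<oneCleavageEnzyme identifier="Trypsin_(KR_noP)">
  <site cleavPattern="K" adjacentPatternBlocking="P" terminus="C"/>
  <site cleavPattern="R" adjacentPatternBlocking="P" terminus="C"/>
</oneCleavageEnzyme>
"""


def cleavage_positions(sequence, sites):
    """Return sorted indices i where the chain is cut between i-1 and i.

    Assumption: terminus="C" cuts after the pattern, terminus="N" cuts
    before it, and adjacentPatternBlocking suppresses the cut when it is
    found immediately on the far side of the cut.
    """
    cuts = set()
    for site in sites:
        pattern = site.get("cleavPattern")
        blocking = site.get("adjacentPatternBlocking", "")
        terminus = site.get("terminus")
        start = sequence.find(pattern)
        while start != -1:
            end = start + len(pattern)
            if terminus == "C":
                cut = end
                blocked = bool(blocking) and sequence.startswith(blocking, end)
            else:
                cut = start
                blocked = bool(blocking) and sequence.endswith(blocking, 0, start)
            # A cut at the very start or end of the protein yields no new peptide.
            if not blocked and 0 < cut < len(sequence):
                cuts.add(cut)
            start = sequence.find(pattern, start + 1)
    return sorted(cuts)


def digest(sequence, sites):
    """Split the sequence at every cleavage position (no missed cleavages)."""
    bounds = [0] + cleavage_positions(sequence, sites) + [len(sequence)]
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:])]


sites = ET.fromstring(ENZYME_XML).findall("site")
peptides = digest("AKRPGKTR", sites)
print(peptides)  # ['AK', 'RPGK', 'TR']
```

In the sample sequence the R followed by P is not cleaved, which is the behavior the adjacentPatternBlocking="P" attribute is meant to express.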