From: Martin E. <mar...@ru...> - 2008-07-03 09:12:36
Hi!

> > I am in the process of trying to put together an example instance
> > document for OMSSA and have a few questions. To make things more
> > complicated I have gone for an example where I run the search on a
> > concatenated forward/reverse database.

Great! We need real use cases / example docs, otherwise our discussions are quite academic.

> > At the moment I have all the
> > results in the analysisXML file i.e. in the ConceptualMoleculeCollection
> > I am listing all proteins and peptides identified including the reverse
> > sequences. I am unsure if (a) I am supposed to be listing all results
> > and (b) if all results are supposed to be listed how I mark the reverse
> > ones as decoy or does it not matter?
> In some ways it doesn't matter, because they are just lists of
> proteins/peptides.

I agree: what you list is your decision, but it would be helpful to report that decision. For example, a CVParam stating that it was a reverse search, and an FDR threshold if you list only the forward proteins below that threshold.

> However, you might like to look at Martin's example which contains

Originally I argued for a separate Analysis type, "Quality Assurance", but I was convinced that it is not necessary. In our (MPC) use case I decided to list ALL identified proteins, both forward and decoy; to mark the decoys, I reported a "decoy pattern" CVParam. I set no FDR threshold, though I could have set it to "1.0". The decoy pattern belongs to the ProteinDetermination, because it has no meaning for a SpectrumIdentification.

> > I am also listing all results (forward and reverse) in DataCollection.
> I'd recommend two sets of results:
> <SpectrumIdentificationList id="OMSSA_forward"> ...
> <SpectrumIdentificationList id="OMSSA_reverse">

But you cannot specify two result sets for ONE SpectrumIdentification. So with this suggestion you would need one SpectrumIdentification for the forward search and one for the reverse. I used a single one and reported a decoy pattern.
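
To make the "decoy pattern" idea concrete, here is a minimal sketch of how a consumer of such a document might use it. Everything specific in it is an assumption for illustration only: the `REV_` accession prefix, the accessions themselves, and the function names are hypothetical, and the decoys/targets ratio is just one common FDR estimate for a concatenated forward/reverse search, not something this thread or AnalysisXML prescribes.

```python
import re

# Hypothetical decoy pattern: reversed sequences carry a "REV_" prefix.
DECOY_PATTERN = r"^REV_"


def is_decoy(accession, decoy_pattern=DECOY_PATTERN):
    """Return True if the accession matches the reported decoy pattern."""
    return re.search(decoy_pattern, accession) is not None


def estimate_fdr(accessions, decoy_pattern=DECOY_PATTERN):
    """Estimate FDR as (#decoy hits) / (#forward hits), capped at 1.0.

    This is one common convention for concatenated target/decoy
    searches; other estimators exist.
    """
    decoys = sum(1 for a in accessions if is_decoy(a, decoy_pattern))
    targets = len(accessions) - decoys
    if targets == 0:
        return 0.0 if decoys == 0 else 1.0
    return min(1.0, decoys / targets)


# Made-up list of ALL reported proteins, forward and decoy together.
hits = ["P12345", "REV_Q67890", "P23456", "P34567", "P45678"]

forward = [a for a in hits if not is_decoy(a)]
fdr = estimate_fdr(hits)

print("forward hits:", forward)
print("estimated FDR:", fdr)
```

Listing everything and reporting only the pattern, as described above, leaves this split to the reader of the document; listing forward hits only would instead require reporting the threshold that was applied.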
> AnalysisXML is (currently) expected to report for just one 'cutoff' -
> i.e. a consumer of the analysisXML document couldn't recalculate the
> value.

Yes, we agreed that another cut-off requires another AnalysisXML document. I should put that on the wiki page ;-)

> > N-terminal peptide what would pre be? pre="" or pre="-"?
> We just need to decide and document - maybe at the conference call
> later today.

New issue 34; in SEQUEST it is "-". Oh, I see that David finished this issue just in time, because we decided on it in the TeleCon on 26th June. ;-) It is in the wiki now...

> > Finally the database searched was a custom database, is there anywhere
> > to report how a database was generated?
> Possibly outside the scope of AnalysisXML.

In Inputs we have SearchDatabase, followed by DatabaseName. We could add DatabaseProperties... That would be quite useful for describing the type of decoy DB.

Bye
Martin