From: David C. <dc...@ma...> - 2008-06-26 14:07:21
Hi Jenny,

Jennifer Siepen wrote:
> Hi,
>
> I am in the process of trying to put together an example instance
> document for OMSSA and have a few questions. To make things more
> complicated I have gone for an example where I run the search on a
> concatenated forward/reverse database.

Nothing like jumping in at the deep end!

> At the moment I have all the
> results in the analysisXML file i.e. in the ConceptualMoleculeCollection
> I am listing all proteins and peptides identified including the reverse
> sequences. I am unsure if (a) I am supposed to be listing all results
> and (b) if all results are supposed to be listed how I mark the reverse
> ones as decoy or does it not matter?

In some ways it doesn't matter, because they are just lists of
proteins/peptides. However, you might like to look at Martin's example
which contains results from Mascot and Sequest, and model forward/reverse
on this:
http://code.google.com/p/psi-pi/source/browse/trunk/examples/schema_usecase_examples/working12June/MPC_use_case_working12June.axml
(See also: http://code.google.com/p/psi-pi/issues/detail?id=32)

For the proteins, if the reverse entries don't have different accessions,
you could use a different Database_ref. For the peptides, to make it more
human readable, you could encode 'Reverse' into the id?

>
> I am also listing all results (forward and reverse) in DataCollection.

I'd recommend two sets of results:
<SpectrumIdentificationList id="OMSSA_forward">
...
<SpectrumIdentificationList id="OMSSA_reverse">
...

> The next step for me would be to calculate false discovery rates based
> upon the OMSSA results and select 'good' peptides, I am not sure where
> these results would be reported?

And nor am I yet. One issue is that this is a 'dynamic' sort of thing.
For a particular cutoff expect value (or some rule), you might get x hits
from the forward database, and y hits from the reverse database. For a
different cutoff expect value, you would get x' and y' results.
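
To make the 'dynamic' point concrete, here is a minimal Python sketch of how the forward count x and the reverse count y shift with the expect-value cutoff. Everything in it is an assumption for illustration: the `Hit` record, the `count_hits` and `estimated_fdr` helpers and the sample hits are hypothetical and not part of AnalysisXML or OMSSA, and the ratio y/x is only one commonly used convention for estimating a false discovery rate (others exist, particularly for concatenated databases).

```python
from typing import Iterable, NamedTuple, Tuple


class Hit(NamedTuple):
    """Hypothetical record for one peptide identification."""
    peptide_id: str    # illustrative id; 'Reverse' is encoded for decoys
    expect: float      # expect value reported by the search engine
    is_reverse: bool   # True if matched against a reversed sequence


def count_hits(hits: Iterable[Hit], cutoff: float) -> Tuple[int, int]:
    """Return (x, y): forward and reverse hits with expect <= cutoff."""
    x = y = 0
    for hit in hits:
        if hit.expect <= cutoff:
            if hit.is_reverse:
                y += 1
            else:
                x += 1
    return x, y


def estimated_fdr(x: int, y: int) -> float:
    """Assumed convention: reverse hits / forward hits (0.0 if no forward hits)."""
    return y / x if x else 0.0


# Made-up sample data, purely for illustration.
hits = [
    Hit("pep_1", 1e-6, False),
    Hit("pep_2", 1e-4, False),
    Hit("pep_3_Reverse", 5e-3, True),
    Hit("pep_4", 2e-2, False),
    Hit("pep_5_Reverse", 5e-2, True),
]

for cutoff in (1e-3, 1e-1):
    x, y = count_hits(hits, cutoff)
    print(f"cutoff={cutoff:g}: x={x}, y={y}, estimated FDR={estimated_fdr(x, y):.2f}")
```

With this sample data the stricter cutoff keeps two forward hits and no reverse hits, while the looser one keeps three forward and two reverse hits, so any figure derived from x and y is tied to the cutoff that produced them.
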
AnalysisXML is (currently) expected to report for just one 'cutoff' - i.e.
a consumer of the analysisXML document couldn't recalculate the value. So,
the proteins reported (from the forward and reverse databases) are the
list for the cutoff decided by the producer of the file. We will discuss
this in a conference call.

>
> A quick question relates to the 'PeptideEvidence'. One of the attributes
> is "pre" as in the previous flanking sequence. If my peptide is the
> N-terminal peptide what would pre be? pre="" or pre="-"? or does it not
> matter?

We just need to decide and document - maybe at the conference call later
today.

>
> Finally the database searched was a custom database, is there anywhere
> to report how a database was generated?

Possibly outside the scope of AnalysisXML.

> Sometimes we also search
> peptide databases i.e. the database would have the same number of
> 'protein' entries as the original but there would only be one peptide
> per protein would I be able to report how many peptides are in the
> underlying database searched - would it be a cvParam?

This was something we discussed briefly on 2008-05-15:
http://www.psidev.info/index.php?q=node/325
We need the number of residues and sequences, although we don't currently
have a record of the number of peptides in the database.

Discussion of how to specify databases at:
http://code.google.com/p/psi-pi/issues/detail?id=31
So, maybe you could add some notes there?

David

>
> Thanks,
>
> Jenny
-- 
David Creasy
Matrix Science
64 Baker Street
London W1U 7GB, UK
Tel: +44 (0)20 7486 1050
Fax: +44 (0)20 7224 1344
dc...@ma...
http://www.matrixscience.com
Matrix Science Ltd. is registered in England and Wales
Company number 3533898