From: <cod...@go...> - 2008-12-08 12:31:55
|
Status: Accepted
Owner: dcreasy
Labels: Type-Defect Priority-Medium

New issue 45 by dcreasy: Number of peptides / proteins to be reported
http://code.google.com/p/psi-pi/issues/detail?id=45

Writing documentation always brings a few problems to the surface...

Consider 2 separate use cases:
- Import into a repository such as Pride
- Import into an application such as Scaffold or Peptide Prophet

With the first, you probably only want to report and store proteins above a certain threshold that, for example, gives an FDR described in the instance document. With the second, we need to give scores for all spectra. (See: http://code.google.com/p/psi-pi/wiki/NotesForDocumentation#How_many_should__be_saved for details).

One option is to insist that these two cases require two different instance documents.

Another option is to 'encourage' people to save a larger number of results and add (say) a boolean attribute to <SpectrumIdentificationItem> and <ProteinDetectionHypothesis> that indicates that the result is above the specified thresholds. Without the boolean flag, if all results are saved (as in the current Mascot examples), a repository such as Pride would need to understand the relevant CV and only store results above a given score.

Any thoughts / preferences?

--
You received this message because you are listed in the owner or CC fields of this issue, or because you starred this issue. You may adjust your issue notification preferences at: http://code.google.com/hosting/settings
|
From: <cod...@go...> - 2008-12-08 15:58:45
|
Comment #1 on issue 45 by andrewrobertjones: Number of peptides / proteins to be reported
http://code.google.com/p/psi-pi/issues/detail?id=45

I think we can support this for SpectrumIdentifications with no schema change. We can encourage data producers to produce a large number of spectrum identifications. The identifications deemed "correct" are the ones referenced from ProteinDetectionHypotheses. This also supports the use case of peptide identification scores being boosted once a protein detection has been made. PRIDE can make a call on how many of the SpectrumIdentifications to load.

This does not solve the FDR problem for proteins, though. However, I think the solution of creating two different instance documents would be okay. If sending to PRIDE, just send the list of proteins found. If using analysisXML as an intermediate format, send a long list of proteins (and peptides) with no set criteria for what is "correct".
|
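The "referenced means accepted" idea in comment #1 can be sketched in a few lines. This is only an illustration under stated assumptions: the ids and the dictionary layout are hypothetical, and the function `referenced_identifications` is not part of any schema or tool; it just shows how a consumer such as PRIDE could pick out the spectrum identifications that at least one protein detection hypothesis points to.

```python
def referenced_identifications(hypotheses):
    """Collect every spectrum identification id referenced by any hypothesis.

    `hypotheses` maps a (hypothetical) ProteinDetectionHypothesis id to the
    list of spectrum identification ids it references.
    """
    referenced = set()
    for refs in hypotheses.values():
        referenced.update(refs)
    return referenced


# Made-up example data: four identifications, two protein hypotheses.
all_ids = ["SII_1", "SII_2", "SII_3", "SII_4"]
hypotheses = {"PDH_1": ["SII_1", "SII_3"], "PDH_2": ["SII_3"]}

accepted = referenced_identifications(hypotheses)
# Keep the document order of the identifications that are referenced.
to_load = [sii for sii in all_ids if sii in accepted]
print(to_load)  # ['SII_1', 'SII_3']
```

Note that this only covers peptide-level identifications; as the comment says, it does not address thresholds for the proteins themselves.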
From: <cod...@go...> - 2009-04-29 06:21:12
|
Updates: Status: Fixed Comment #2 on issue 45 by dcreasy: Number of peptides / proteins to be reported http://code.google.com/p/psi-pi/issues/detail?id=45 Added PassThreshold attribute to SpectrumIdentificationItem and ProteinDetectionHypothesis |
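For illustration, a consumer that only wants results above the threshold could filter on the new attribute roughly as follows. This is a minimal sketch, not a reference implementation: the sample fragment is made up and heavily simplified, the attribute is assumed to be serialised as `passThreshold` with `true`/`false` values (the thread spells it both "PassThreshold" and "passThreshold"), and the helper name `items_passing_threshold` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified fragment. Only the element names come from the
# thread; the ids, the wrapper element and the attribute spelling are assumed.
SAMPLE = """
<SpectrumIdentificationResult id="SIR_1">
  <SpectrumIdentificationItem id="SII_1" passThreshold="true"/>
  <SpectrumIdentificationItem id="SII_2" passThreshold="false"/>
  <SpectrumIdentificationItem id="SII_3" passThreshold="true"/>
</SpectrumIdentificationResult>
"""


def items_passing_threshold(xml_text, tag="SpectrumIdentificationItem",
                            attribute="passThreshold"):
    """Return the ids of all <tag> elements whose flag attribute is true."""
    root = ET.fromstring(xml_text)
    return [
        element.get("id")
        for element in root.iter(tag)
        # A missing attribute is treated as "did not pass" in this sketch.
        if element.get(attribute, "false").strip().lower() in ("true", "1")
    ]


print(items_passing_threshold(SAMPLE))  # ['SII_1', 'SII_3']
```

The same function could be called with `tag="ProteinDetectionHypothesis"` for the protein-level flag, assuming that element carries the attribute in the same form.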
From: <cod...@go...> - 2009-06-23 17:06:46
|
Updates: Status: Accepted

Comment #3 on issue 45 by dcreasy: Number of peptides / proteins to be reported
http://code.google.com/p/psi-pi/issues/detail?id=45

I've re-opened this issue rather than add the problem to the end of the CV issue because it seems more relevant here. We (or maybe just me) may have lost the plot here...

Comment 2 says what we added to the schema. The structure for specifying which items "PassThreshold" is described here: http://code.google.com/p/psi-pi/issues/detail?id=49#c3

Currently, the only CV items allowed in the 'Threshold' are:
MS:1001448: pep:FDR threshold
MS:1001447: prot:FDR threshold
MS:1001494: no threshold

However, using an FDR is not the only way to do this. For example, if you've only got a few spectra, then FDR is definitely not an option. I think we need other terms to be allowed here. For example, you might want to specify that the threshold is "ZYXScore > 23.4".

In the Mascot example, there is for example:
<ProteinDetectionProtocol id="PDP_MascotParser...
  <AnalysisParams>
    <cvParam accession="MS:1001316" name="mascot:SigThreshold" cvRef="PSI-MS" value="0.05"/>

And I guess in this case I would like to specify MS:1001316 in AnalysisProtocolCollection/SpectrumIdentificationProtocol/Threshold/
|
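To make the "ZYXScore > 23.4" case concrete, here is a small sketch of how a producer might derive the pass/fail flag from a score-based threshold instead of an FDR. Everything in it is an assumption for illustration: `Threshold`, `passes` and the dictionary of scores are hypothetical names, and "ZYXScore" is the placeholder score from the comment above, not a real CV term.

```python
import operator
from dataclasses import dataclass

# Comparison symbols a threshold description is allowed to use in this sketch.
_COMPARISONS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


@dataclass
class Threshold:
    """A score-based threshold, e.g. ZYXScore > 23.4 (hypothetical structure)."""
    score_name: str
    comparison: str
    cutoff: float


def passes(scores, threshold):
    """Return True if the named score satisfies the threshold.

    An identification that lacks the score is treated as not passing.
    """
    value = scores.get(threshold.score_name)
    if value is None:
        return False
    return _COMPARISONS[threshold.comparison](value, threshold.cutoff)


zyx = Threshold(score_name="ZYXScore", comparison=">", cutoff=23.4)
print(passes({"ZYXScore": 30.1}, zyx))  # True
print(passes({"ZYXScore": 23.4}, zyx))  # False, the comparison is strict
```

A p-value style threshold such as mascot:SigThreshold would use the opposite direction, for example `Threshold("pvalue", "<", 0.05)`, where the score name is again only a placeholder.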
From: <cod...@go...> - 2009-06-24 18:01:47
|
Comment #4 on issue 45 by andrewrobertjones: Number of peptides / proteins to be reported http://code.google.com/p/psi-pi/issues/detail?id=45 Agreed, we need a few different terms here. I'm not sure whether they are absent from the CV altogether or just absent from the mapping at present, e.g. p-value, mascot:SigThreshold, some terms for Sequest etc. |
From: <cod...@go...> - 2009-06-25 07:35:38
|
Comment #5 on issue 45 by pierrealainbinz: Number of peptides / proteins to be reported
http://code.google.com/p/psi-pi/issues/detail?id=45

For Phenyx we probably already have something:

a) a representation of peptide scores:

id: MS:1001395
name: Phenyx:Pepzscore
and
id: MS:1001396
name: Phenyx:PepPvalue
both are
is_a: MS:1001143 ! search engine specific score for peptides
is_a: MS:1001153 ! search engine specific score

id: MS:1001384
name: Phenyx:MinPepzscore
and
id: MS:1001385
name: Phenyx:MaxPepPvalue
both
is_a: MS:1001302 ! search engine specific input parameter

b) and we also have 2 binary values for a "valid" or "accepted" peptide status, corresponding to automatic selection (unedited search result) and user-defined selection (that has gone through manual selection), respectively:

id: MS:1001393
name: Phenyx:Auto
and
id: MS:1001394
name: Phenyx:User
both defined
is_a: MS:1001143 ! search engine specific score for peptides
is_a: MS:1001153 ! search engine specific score

As the thresholding we are talking about is a "post processing" event (something we apply to the "raw" result), which one would you consider? I'm just wondering which one would fit best as a passThreshold criterion.
|
From: <cod...@go...> - 2009-08-20 11:29:16
|
Updates: Status: Fixed Labels: Milestone-Release1.0 Comment #6 on issue 45 by eisenachM: Number of peptides / proteins to be reported http://code.google.com/p/psi-pi/issues/detail?id=45 (No comment was entered for this change.) |