Re: [Psidev-qc-dev] HUPO PSI Quality Control Working Group


Hi Dave,

Attached is the section of the Spectrum Mill v6.0 Manual that describes the
various Quality Metrics that I calculate, and the output table of the SM
quality metrics module for the CPTAC breast cancer proteome dataset from the
Mertins et al. 2016 Nature paper. The table is included in the supplementary
materials that can be downloaded from the CPTAC DCC:

https://cptac-data-portal.georgetown.edu/cptac/s/S029 

select
Supplementary_Data_Proteome_Peptide_Spectrum_Match_results_SpectrumMill

Some metrics described in the manual, particularly the spectrum
identifiability metrics, are more recently developed and not included in the
example.

For iTRAQ and TMT labeling I believe the most important practical metrics
are for the completeness of labeling. N-termini are typically not completely
labeled. TMT labeling tends to be more complete than iTRAQ labeling, in our
hands. That is why we routinely search TMT and iTRAQ data allowing for the
peptide N-termini to be either labeled or unlabeled. When doing our label
check experiments, we run more comprehensive searches that also allow lysines
to be either labeled or unlabeled.
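
To make the labeling-completeness idea concrete, here is a minimal sketch of how the two fractions could be tallied from search results. The Psm struct and its field names are hypothetical and are not taken from Spectrum Mill, and the definitions in the manual may differ (for example, in how blocked N-termini are handled).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical summary of one peptide-spectrum match (PSM); the field names
// are illustrative only and do not come from Spectrum Mill.
struct Psm {
    bool nTermLabeled;       // true if the peptide N-terminus carries the label
    int lysineCount;         // number of lysines in the peptide
    int labeledLysineCount;  // number of those lysines carrying the label
};

struct LabelingCompleteness {
    double nTermFraction;   // labeled N-termini / all PSMs
    double lysineFraction;  // labeled lysines / all lysines
};

// Returns fractions in [0, 1]; a fraction is 0 when its denominator is 0.
LabelingCompleteness labelingCompleteness(const std::vector<Psm>& psms) {
    std::size_t labeledNTerm = 0;
    long totalLys = 0;
    long labeledLys = 0;
    for (const Psm& p : psms) {
        assert(p.labeledLysineCount <= p.lysineCount);
        if (p.nTermLabeled) ++labeledNTerm;
        totalLys += p.lysineCount;
        labeledLys += p.labeledLysineCount;
    }
    LabelingCompleteness out{0.0, 0.0};
    if (!psms.empty()) {
        out.nTermFraction = static_cast<double>(labeledNTerm) /
                            static_cast<double>(psms.size());
    }
    if (totalLys > 0) {
        out.lysineFraction = static_cast<double>(labeledLys) /
                             static_cast<double>(totalLys);
    }
    return out;
}
```

The lysine fraction is only meaningful for the more comprehensive searches, since a search that treats lysine labeling as fixed will report every lysine as labeled.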

One new metric that we are beginning to track and use as a filter for
inclusion in protein-level quant calculations (but not yet described in the
manual or present in the example) is the median signal/noise of the reporter
ions in each MS/MS spectrum. I now extract the s/n value for each peak via the
Thermo MSFileReader API:

       FRAW::IXRawfile2Ptr pRawfile2 =  m_pRawfile;

       nRet = pRawfile2->GetLabelData(pvarLabels, pvarFlags, &nScanNumber);

I believe a similar metric was included in the most recent release of
Proteome Discoverer.
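
As a rough illustration of the median s/n calculation, here is a sketch that does not touch the MSFileReader COM interface at all. It assumes the per-peak m/z, intensity, and noise values have already been copied out of the label data into a plain struct. The LabelPeak struct, the function name, and the m/z tolerance parameter are all hypothetical, and this is not the code used in Spectrum Mill.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical per-peak record, assumed to have been filled from the label
// data of one MS/MS scan (the layout of the raw array is not shown here).
struct LabelPeak {
    double mz;
    double intensity;
    double noise;
};

// Median signal/noise over the peaks that lie within tolDa of any expected
// reporter m/z. Returns -1.0 when no reporter peak with positive noise is found.
double medianReporterSN(const std::vector<LabelPeak>& peaks,
                        const std::vector<double>& reporterMz,
                        double tolDa) {
    assert(tolDa >= 0.0);
    std::vector<double> sn;
    for (const LabelPeak& p : peaks) {
        if (p.noise <= 0.0) continue;  // avoid dividing by zero
        for (double r : reporterMz) {
            if (std::fabs(p.mz - r) <= tolDa) {
                sn.push_back(p.intensity / p.noise);
                break;  // count each peak at most once
            }
        }
    }
    if (sn.empty()) return -1.0;
    std::sort(sn.begin(), sn.end());
    const std::size_t n = sn.size();
    return (n % 2 == 1) ? sn[n / 2] : 0.5 * (sn[n / 2 - 1] + sn[n / 2]);
}
```

A spectrum would then be kept for protein-level quant only if the returned value is above some threshold. The threshold itself is a choice I have not specified here.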

I hope this helps. Let me know if you have questions.

--Karl

From: Tabb, David, Prof <dt...@su...> [mailto:dt...@su...] 
Sent: Wednesday, November 30, 2016 7:39 AM
To: cl...@br...
Cc: psi...@li...
Subject: HUPO PSI Quality Control Working Group

Hi, Karl.

I am part of a working group at HUPO-PSI that seeks to further the use of
quality control pipelines in conjunction with proteomics and metabolomics
experiments.  We are pretty familiar with tool sets for dealing with general
identification data, but we would like to give some concrete examples of
experiments that are less well-covered by quality control tools.  iTRAQ and
TMT came to mind, and I recalled some conversations we had back in the CPTAC
Data Analysis teleconferences that touched on this subject.

Have you gone ahead to create tools for producing quality control metrics in
iTRAQ or TMT experiments?  For example, one might ask what fraction of MS/MS
scans include reporter ions from all four channels (if one uses the 4-way
reagent, of course).  I believe you were on your way to computing such
metrics.
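
To be concrete about what such a metric might look like, here is a minimal sketch. It assumes the reporter-ion intensities have already been extracted per scan, with 0 standing for a channel that was not detected. The function name and the input layout are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Each inner vector holds the reporter-ion intensities for one MS/MS scan,
// one entry per channel; a value <= 0 means the channel was not detected.
// Returns the fraction of scans in which every channel was detected.
double fractionWithAllChannels(const std::vector<std::vector<double>>& scans,
                               std::size_t channelCount) {
    if (scans.empty()) return 0.0;
    std::size_t complete = 0;
    for (const std::vector<double>& scan : scans) {
        assert(scan.size() == channelCount);
        bool allPresent = true;
        for (double intensity : scan) {
            if (intensity <= 0.0) {
                allPresent = false;
                break;
            }
        }
        if (allPresent) ++complete;
    }
    return static_cast<double>(complete) / static_cast<double>(scans.size());
}
```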

I hope to include some concrete examples of how quality control applies in
specialized areas of proteomics and metabolomics as part of a
manuscript we are fielding to Analytical Chemistry that introduces our
working group and its goals to a broader audience.

Thanks!

Dave