From: Hugo L. F. <hlf...@uv...> - 2018-04-17 08:34:14
Hi David,

please find attached the draft qcML document that Mass-Up could create from its quality control analysis. We created it using data from a real QC analysis of the "Cancer dataset" available at the Mass-Up downloads page (https://www.sing-group.org/mass-up/downloads/datasets/Cancer-Dataset.zip). As in the example you sent me, we also added a few comments about things that were unclear to us.

Regarding the QC metrics, there are two types: replicate (or spectrum) metrics and sample metrics. The first 5 QC metrics in the document are related to replicates, so they have 60 values each (12 samples x 5 replicates each). The remaining 11 QC metrics are related to samples: the first 8 sample QC metrics have 12 values, one for each sample, and the remaining 3 sample QC metrics each have several sets of 12 values. Each set corresponds to a POP (Percentage of Presence) value. For instance, the "matching-pop" metric has five sets of 12 values, for POP values of 0.2, 0.4, 0.6, 0.8 and 1.0.

Regarding the sets of values in replicate metrics: they are provided in the same order as the replicates are presented in the metadata section. The same applies to sample metrics: they are provided in the same order as the samples are presented in the metadata section, because replicates with the same "sampleId" appear consecutively. To avoid possible misunderstandings, we discussed an alternative way of indicating the sample order: adding a "userParam" with the ordered "sampleId"s.

I hope this draft is useful for your meeting. Please let me know if you need anything else.

Best regards,

Hugo.

On 11/04/18 at 10:52, David Tabb wrote:
> I really appreciate it, sir!
>
> Our meeting will take place from April 18-20th. If we can discuss it
> during the 19th, that would be ideal!
>
> I have included your image in the update I will be providing at the
> start of the meeting (see attached).
>
> Merci,
> Dave
>
> On 4/11/2018 11:40 AM, Hugo López Fernández wrote:
>> Hi David,
>>
>> sure, I would be pleased to provide such a draft. Please let me know
>> the deadline for sending you the document. I will have a meeting
>> tomorrow or the day after tomorrow with my Mass-Up collaborators to
>> address this.
>>
>> Regards,
>>
>> Hugo.
>>
>> On 11/04/18 at 09:41, David Tabb wrote:
>>> Hi, Hugo.
>>>
>>> Would you be willing to assemble a draft qcML document that might be
>>> created by your Mass-Up software? We will be looking at such examples
>>> at our annual HUPO-PSI meeting in a week. I am providing a pointer to a
>>> draft qcML example of the quality metrics that are produced within the
>>> QuaMeter "IDFree" mode.
>>> https://github.com/HUPO-PSI/qcML-development/blob/master/20180403-1091_Pool_start_v0.8.qc.xml
>>>
>>> Thanks,
>>> Dave
>>>
>>> On 1/20/2017 1:31 PM, Hugo López Fernández wrote:
>>>> Hello David,
>>>>
>>>> I am Hugo, we met last week in Semmering. I hope this email finds you
>>>> well and that you had a good trip back home.
>>>>
>>>> As we discussed at the EuBIC, I am writing to let you know more about
>>>> the quality control analysis that we have included in Mass-Up
>>>> (http://sing-group.org/mass-up/). This quality control is intended to
>>>> work with peak lists. We would like to incorporate quality control for
>>>> raw data, especially to detect batch effects, as I also mentioned to you.
>>>>
>>>> Basically, the quality control (which is explained in more detail
>>>> in the paper http://doi.org/10.1186/s12859-015-0752-4) can be done at
>>>> two levels: at the replicates level and at the samples level, which
>>>> includes additional information from the intra-sample m/z matching
>>>> process and consensus spectrum creation (this is because our
>>>> collaborators usually want to reduce replicate spectra to a unique
>>>> sample "consensus" spectrum).
>>>> You can find attached the quality control image included in the paper.
>>>>
>>>> At the replicates level, the user can check basic information about
>>>> each individual spectrum (e.g. peak count, m/z range, intensity
>>>> ranges, etc.) and compare all spectra in the dataset. At the samples
>>>> level, the user can check the performance of the intra-sample peak
>>>> matching process, by comparing the percentages of presence (POP)
>>>> counts (i.e. the counts of peaks that are present in, for example,
>>>> 60%, 80% or 100% of replicates) and the POPs of each sample.
>>>>
>>>> In spite of being a very simple quality control, it allowed us to
>>>> detect some problems with datasets and we encourage our collaborators
>>>> to have a quick look at these quality control metrics before any other
>>>> analysis. Unfortunately they usually don't, but we must encourage good
>>>> practices, which is the reason why I am developing this other software
>>>> (http://www.sing-group.org/s2p/), also presented in another poster at
>>>> the EuBIC. Basically, it is software to manage, process and integrate
>>>> different data sources (Mascot identifications, MALDI plates, 2D-gel
>>>> spots).
>>>>
>>>> I will be happy to answer any questions you may have or to receive any
>>>> feedback from you. Looking forward to seeing you again, at another
>>>> conference or wherever.
>>>>
>>>> Best regards,
>>>>
>>>> Hugo.
>>>>
>>>
>>> [http://cdn.sun.ac.za/100/ProductionFooter.jpg]<http://www.sun.ac.za/english/Pages/Water-crisis.aspx>
>>>
>>> The integrity and confidentiality of this email is governed by these
>>> terms. Disclaimer<http://www.sun.ac.za/emaildisclaimer>
>>> Die integriteit en vertroulikheid van hierdie e-pos word deur die
>>> volgende bepalings gereël.
>>> Vrywaringsklousule<http://www.sun.ac.za/emaildisclaimer>
>>>
>>
>

--
---------------------------------------------------------------------------
Hugo López-Fernández, PhD
Email: hlf...@uv...
Web: http://www.sing-group.org/~hlfernandez/
---------------------------------------------------------------------------
SING Research Group
http://www.sing-group.org
ESEI: Escuela Superior de Ingeniería Informática
"Politécnico" Building, Room 306
"As Lagoas" Campus
32004 - Ourense - Spain
---------------------------------------------------------------------------
CINBIO: Centro de Investigaciones Biomédicas
http://cinbio.es/en/si4-next-generation-computer-systems-group/
---------------------------------------------------------------------------
IISGS: Instituto de Investigación Sanitaria Galicia Sur
http://www.iisgaliciasur.es/sistemas-informaticos-de-nueva-generacion-sing/
---------------------------------------------------------------------------
The information in this e-mail and in any attachments is confidential and
intended exclusively for the named addressee(s). Any use of this information
not in accordance with its purpose, any dissemination or disclosure, either
whole or partial, is prohibited except if formally approved.
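
Editorial note on the value ordering described in the first message of this thread: the following is a minimal Python sketch of how a reader of the draft document might pair metric values with replicates and samples, relying only on the order of the metadata section. It is not Mass-Up or qcML code. The identifiers ("sample01", "sample01_rep1"), the helper names sample_order and label_values, and the zero-filled metric values are hypothetical placeholders; only the counts (12 samples x 5 replicates, five POP values) come from the message.

```python
from itertools import groupby

# Hypothetical metadata: (sampleId, replicateId) pairs in document order,
# with the replicates of each sample listed consecutively.
replicates = [
    (f"sample{s:02d}", f"sample{s:02d}_rep{r}")
    for s in range(1, 13)
    for r in range(1, 6)
]


def sample_order(replicate_list):
    """Derive the sample order implied by the replicate order."""
    order = [sample_id for sample_id, _ in groupby(replicate_list, key=lambda rep: rep[0])]
    # If a sampleId shows up in two separate runs, the implicit order is ambiguous.
    if len(order) != len(set(order)):
        raise ValueError("replicates of the same sample are not consecutive")
    return order


def label_values(ids, values):
    """Pair positional metric values with their identifiers."""
    if len(ids) != len(values):
        raise ValueError(f"expected {len(ids)} values, got {len(values)}")
    return dict(zip(ids, values))


samples = sample_order(replicates)

# Placeholder values standing in for the numbers stored in the document.
replicate_metric = [0.0] * 60  # one value per replicate
sample_metric = [0.0] * 12     # one value per sample
pop_metric = {pop: [0.0] * 12 for pop in (0.2, 0.4, 0.6, 0.8, 1.0)}  # one set per POP

by_replicate = label_values([rep_id for _, rep_id in replicates], replicate_metric)
by_sample = label_values(samples, sample_metric)
by_pop = {pop: label_values(samples, values) for pop, values in pop_metric.items()}

print(len(by_replicate), len(by_sample), len(by_pop))  # 60 12 5
```

The check in sample_order shows why an explicit "userParam" with the ordered "sampleId"s would be more robust: the implicit order only works while replicates of the same sample stay consecutive in the metadata section.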