From: White, Joseph A. <JWhite@ti...>  2003-07-16 14:22:21

Hello Everyone,

The ontology working group has been collecting terms and definitions for use in generating MAGE-ML, as well as in submissions to ArrayExpress. One item for which we don't have enough information is 'DerivedBioAssayType'. The MAGE documentation lists the following definition for DBAT:

'The type association indicates the derivation type, for instance collapsed spot replicate, ratio, averaged intensity, bioassay replicates, etc.'

A DerivedBioAssay is the result of some form of Transformation of BioAssayData, including but not limited to averaging, normalization and clustering. The DerivedBioAssayType association to OntologyEntry is supposed to describe the DerivedBioAssay.

Below is a list of suggested terms for DerivedBioAssayType. The terms in it describe the data items, not the methods used to obtain them. Compiling the list was not straightforward because there is overlap with QuantitationType and NormalizationDescriptionType. At this time, we plan to use terms from this list, but leave the list open for new suggestions, since this is a rapidly changing area of research.

We would like to solicit from the members of the Ontologies, MAGE and DataProcessing working groups approval or disapproval of each term listed below; a simple yes or no will suffice. (If you believe the definitions listed are completely unacceptable, please suggest a brief alternative.) Since the first four terms are listed in the MAGE documentation, they must be used; however, the others are fair game, so please fire away!

We would like to have responses by Aug 1st in order to tally the lists for our next ontology working group meeting in August.

Thanks for any help you all can provide.

Cheers,
Joe

DerivedBioAssayType (term: proposed definition)

required terms

collapsed_spot_replicate: results of data reduction involving computation of a representative value, e.g. by averaging, for a group of replicated Features from an array.

ratio: results of data reduction involving computation of the ratio of intensities from a 2-channel spotted-cDNA hybridization.

averaged_intensity: results of data reduction involving computation of the average intensity from identical Features, Reporters or CompositeSequences in different hybridizations.

bioassay_replicate_reduction: results of data reduction involving computation of a representative value, e.g. by averaging, for a group of replicated Features from multiple BioAssays.

additional terms

log_ratio: results of data reduction involving computation of the ratio of intensities from a 2-channel spotted-cDNA hybridization, followed by log transformation.

clustered_data: results of an analysis method that groups data based on some measure of similarity, e.g. Pearson correlation coefficient, Euclidean distance. (Although clustered data can be reported in HigherLevelAnalysis, they are DerivedBioAssayData nonetheless.)

dye_swap_replicate_reduction: results of data reduction involving computation of a representative value, e.g. by averaging, for identical Features from 2 BioAssays employing dye-swapped LabeledExtracts. (Note: dye_swap_replicate_reduction is not merely a set of bioassay_replicates; bioassay_replicates and dye_swap_replicates are processed in significantly different ways.)

normalized_intensities: intensities from one or both channels resulting from normalization of MeasuredBioAssayData.

normalized_ratios: ratios resulting from normalization of MeasuredBioAssayData.

mean_and_standard_deviation: the mean and standard deviation values resulting from computationally combining 2 or more sets of BioAssayData.

mean_and_variance: the mean and variance values resulting from computationally combining 2 or more sets of BioAssayData.

mean_and_coeficient_of_variation: the mean and coefficient of variation values resulting from computationally combining 2 or more sets of BioAssayData.

mean_and_p_values: the mean and associated p-values resulting from computationally combining 2 or more sets of BioAssayData.

mean_and_confidence_indicators: the mean and associated calculated confidence resulting from computationally combining 2 or more sets of BioAssayData. The confidence indicators include, but are not limited to: confidence interval, standard deviation, coefficient of variation, and p-value.

filtered_data: results of a data reduction method that involves removal of Features from the data set based on some criteria, e.g. a low-intensity threshold.

average_difference: results of the method used by Affymetrix to obtain mean signal intensity from a group of related Features. This involves calculation of the overall sum of perfect match minus mismatch pairs divided by the number of pairs.
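To make a few of the proposed definitions concrete, here is a minimal Python sketch of the arithmetic behind ratio, log_ratio, collapsed_spot_replicate, mean_and_standard_deviation and average_difference. The function names, the plain-list inputs, the base-2 logarithm, the use of the mean as the "representative value", and the sample (n-1) standard deviation are all assumptions made for illustration; none of them is prescribed by the MAGE documentation or by the term list above.

```python
import math
from statistics import mean, stdev


def ratio(channel_1, channel_2):
    """ratio: per-Feature ratio of the two channel intensities."""
    return [a / b for a, b in zip(channel_1, channel_2)]


def log_ratio(channel_1, channel_2):
    """log_ratio: the ratio followed by a log transformation.
    Base 2 is an assumption; the definition does not name a base."""
    return [math.log2(r) for r in ratio(channel_1, channel_2)]


def collapse_spot_replicates(replicate_values):
    """collapsed_spot_replicate: one representative value (here the mean,
    an assumption) for a group of replicated Features from an array."""
    return mean(replicate_values)


def mean_and_standard_deviation(data_sets):
    """mean_and_standard_deviation: per-Feature mean and sample standard
    deviation across 2 or more equally ordered sets of values."""
    return [(mean(values), stdev(values)) for values in zip(*data_sets)]


def average_difference(probe_pairs):
    """average_difference: sum of (perfect match - mismatch) over all
    pairs, divided by the number of pairs."""
    return sum(pm - mm for pm, mm in probe_pairs) / len(probe_pairs)


if __name__ == "__main__":
    # Hypothetical intensities, invented purely for the example.
    cy5 = [200.0, 50.0]
    cy3 = [100.0, 100.0]
    print(ratio(cy5, cy3))       # [2.0, 0.5]
    print(log_ratio(cy5, cy3))   # [1.0, -1.0]
    print(average_difference([(120.0, 100.0), (90.0, 80.0), (60.0, 60.0)]))  # 10.0
```

Each function returns only the derived values, in line with the point above that the terms describe the data items rather than the methods used to obtain them.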
From: <jason@op...>  2003-07-16 15:37:59

"White, Joseph A." <JWhite@...> writes:

> Below is a list of suggested terms for DerivedBioAssayType. The
> terms in it describe the data items, not the methods used to obtain
> them.

Excellent. Thanks for this work.

> Compiling the list was not straightforward because there is
> overlap with QuantitationType and NormalizationDescriptionType.

Yes, one of MAGE's strong points: no shortage of ways to represent your data...

From my memory of designing NormalizationDescriptionType: this was put in to be MIAME compliant, and was intended as a higher-level description of what techniques were used in the Experiment. Any overlap with that should be perfectly acceptable.

Genex has primarily used DerivedBioAssay:Type to indicate what method was used to produce the data, and from looking at your list, it seems you've provided a really thorough place to start from.

Thanks!
jas.