Thread: [R-gregmisc-users] SF.net SVN: r-gregmisc: [970] trunk/ssize/inst/doc/ssize.Rnw
From: <wa...@us...> - 2006-07-24 21:23:39
Revision: 970 Author: warnes Date: 2006-07-24 14:23:26 -0700 (Mon, 24 Jul 2006) ViewCVS: http://svn.sourceforge.net/r-gregmisc/?rev=970&view=rev Log Message: ----------- Put into RNews format. Modified Paths: -------------- trunk/ssize/inst/doc/ssize.Rnw Modified: trunk/ssize/inst/doc/ssize.Rnw =================================================================== --- trunk/ssize/inst/doc/ssize.Rnw 2006-06-29 18:11:22 UTC (rev 969) +++ trunk/ssize/inst/doc/ssize.Rnw 2006-07-24 21:23:26 UTC (rev 970) @@ -4,29 +4,24 @@ %\VignettePackage{ssize} -\documentclass[12pt]{article} -\usepackage{url} -\usepackage{amsmath} -\usepackage{natbib} +%\documentclass[letter]{report} +\documentclass[a4paper]{report} +\usepackage{/Library/Frameworks/R.framework/Resources/share/texmf/Sweave} +\usepackage{Rnews} +\usepackage[round]{natbib} - -\newcommand{\Robject}[1]{{\texttt{#1}}} -\newcommand{\Rfunction}[1]{{\texttt{#1}}} -\newcommand{\Rpackage}[1]{{\textit{#1}}} -\newcommand{\Rclass}[1]{{\textit{#1}}} -\newcommand{\Rmethod}[1]{{\textit{#1}}} -\newcommand{\code}[1]{\texttt{#1}} - \begin{document} +\begin{article} \title{Sample Size Estimation for Microarray Experiments Using the - \code{ssize} package.} + \code{ssize} package.} \author{Gregory R. Warnes \\ - email:\code{gre...@pf...}} + email:\code{wa...@bs...}} +\subtitle{ ~ } \maketitle -\begin{abstract} +\section*{abstract} RNA Expression Microarray technology is widely applied in biomedical and pharmaceutical research. The huge number of RNA concentrations @@ -39,20 +34,19 @@ the Bioconductor project (\url{http://www.bioconductor.org}) web site. -\end{abstract} +\section*{Note} -\section{Note} - This document is a simplified version of the manuscript \begin{quote} - Warnes, G. R., Liu, P. (2005) - Sample Size Estimation for Microarray Experiments, \emph{submitted - to} {\it Biometrics}. + Warnes, G. R., Liu, P. 
(2006) Sample Size Estimation for Microarray + Experiments, Technical Report, Department of Biostatisticsa and + Computational Biology, University of Rochester. \end{quote} -Please refer to that document for a detailed discussion of the -sample size estimation method. +which has been available as a pre-publication manuscript since 2004. +Please refer to that document for a detailed discussion of the sample +size estimation method. -\section{Introduction} +\section*{Introduction} High-throughput microarray experiments allow the measurement of expression levels for tens of thousands of genes simultaneously. @@ -72,9 +66,9 @@ essential to take into account multiple testing and dependency among variables when calculating sample size. -\section{Method} +\section*{Method} -\subsection{Overview} +\subsection*{Overview} \citet{Warnes05} provides a simple method for computing sample size for micrarray experiments, and reports on a sereies of simulations @@ -89,7 +83,7 @@ exceptionally valuable in helping scientific clients to make the difficult trade offs between experiment cost and statistical power. -\subsection{Assumptions} +\subsection*{Assumptions} In the current implementation, we assume that a microarray experiment is set up to compare gene expressions between one @@ -112,7 +106,7 @@ where $\mu_{T}$ and $\mu_{C}$ are means of gene expressions for treatment and control group respectively. -\subsection{Computations} +\subsection*{Computations} The proposed procedure to estimate sample size is: @@ -178,7 +172,7 @@ \ref{fig:CumFoldChangePlot}. 
-\section{Example} +\section*{Example} First, we need to load the \code{ssize} library: @@ -186,6 +180,7 @@ library(ssize) library(xtable) library(gdata) # for nobs() +options(width=30) @ As part of the \code{ssize} library, I've provided an example data @@ -196,39 +191,41 @@ <<>>= data(exp.sd) -str(exp.sd) +#str(exp.sd) @ This data was calculated via something like \begin{verbatim} library(affy) -setwd("/data/rstat-data/Standard_Affymetrix_Analysis/GT_methods_050316/WORK") +setwd("~/GT_methods_050316/WORK") load("probeset_data.Rda") expression.values <- exprs(probeset.data) covariate.data <- pData(probeset.data) -controls <- expression.values[,covariate.data$GROUP=="Control"] #$ +controls <- expression.values[, + covariate.data$GROUP=="Control"] #$ exp.sd <- apply(controls, 1, sd) \end{verbatim} Lets see what the distribution looks like: -\begin{figure}[h!] - \centering - \caption{Distribution of exp.sd} - \label{exp.sd.hist} -<<fig=TRUE,width=12,height=6>>= - hist(exp.sd,n=20, col="cyan", border="blue", main="", - xlab="Standard Deviation (for data on the log scale)") +%\begin{figure}[h!] +% \centering +% \caption{Distribution of exp.sd} +% \label{exp.sd.hist} +%\end{figure} + +<<fig=TRUE,width=12,height=12>>= + xlab <- c("Standard Deviation", " (for data on the log scale)") + hist(exp.sd,n=20, col="cyan", border="blue", main="", xlab=xlab) dens <- density(exp.sd) - lines(dens$x, dens$y*par("usr")[4]/max(dens$y),col="red",lwd=2) #$ - title("Histogram of Standard Deviations (log2 scale)") + scaled.y <- dens$y*par("usr")[4]/max(dens$y) + lines(dens$x,scaled.y ,col="red",lwd=2) #$ + title("Histogram of Standard Deviations") @ -\end{figure} -\begin{center} -\end{center} + Note that this distribution is right skewed, even though it is on the $\log_2$ scale. @@ -241,24 +238,29 @@ @ There are 6 functions available in the \code{ssize} package. 
-<<eval=FALSE>>= -?pow -@ \begin{verbatim} - pow(sd, n, delta, sig.level, alpha.correct = "Bonferonni") - power.plot(x, xlab = "Power", ylab = "Proportion of Genes with Power >= x", - marks = c(0.7, 0.8, 0.9), ...) + pow(sd, n, delta, sig.level, + alpha.correct = "Bonferonni") + power.plot(x, xlab = "Power", + ylab = "Proportion of Genes with" + " Power >= x", + marks = c(0.7, 0.8, 0.9), ...) - ssize(sd, delta, sig.level, power, alpha.correct = "Bonferonni") - ssize.plot(x, xlab = "Sample Size (per group)", - ylab = "Proportion of Genes Needing Sample Size <= n", - marks = c(2, 3, 4, 5, 6, 8, 10, 20), ...) + ssize(sd, delta, sig.level, power, + alpha.correct = "Bonferonni") + ssize.plot(x, + xlab = "Sample Size (per group)", + ylab = "Proportion of Genes Needing Sample" + " Size <= n", + marks = c(2, 3, 4, 5, 6, 8, 10, 20), ...) - delta(sd, n, power, sig.level, alpha.correct = "Bonferonni") - delta.plot (x, xlab = "Fold Change", - ylab = "Proportion of Genes with Power >= 80\% at Fold Change=delta", - marks = c(1.5, 2, 2.5, 3, 4, 6, 10), ...) + delta(sd, n, power, sig.level, + alpha.correct = "Bonferonni") + delta.plot (x, xlab = "Fold Change", + ylab = "Proportion of Genes with " + "Power >= 80\% at Fold Change=delta", + marks = c(1.5, 2, 2.5, 3, 4, 6, 10), ...) \end{verbatim} You will note that there are three pairs. @@ -297,10 +299,7 @@ \item What is the power for 6 patients per group with $\delta=1.0$, $\alpha=0.05$? -\begin{figure}[h!] - \caption{Effect of Sample Size on Power} \label{fig:CumNPlot} - \centering -<<fig=TRUE,width=12,height=6>>= +<<fig=TRUE,width=12,height=12>>= all.power <- pow(sd=exp.sd, n=n, delta=log2(fold.change), sig.level=sig.level) @@ -314,18 +313,16 @@ xjust=1, yjust=1, cex=1.0) title("Power to Detect 2-Fold Change") @ -\end{figure} +%\begin{figure}[h!] 
+% \caption{Effect of Sample Size on Power} \label{fig:CumNPlot} +% \centering +%\end{figure} - \item What is the necessary per-group sample size for $80\%$ power when $\delta=1.0$, and $\alpha=0.05$? -\begin{figure}[h!] - \caption{Sample size required to detect a 2-fold treatment effect.} - \label{fig:CumPowerPlot} - \centering -<<fig=TRUE,width=12,height=6>>= +<<fig=TRUE,width=12,height=12>>= all.size <- ssize(sd=exp.sd, delta=log2(fold.change), sig.level=sig.level, power=power) ssize.plot(all.size, lwd=2, col="magenta", xlim=c(1,20)) @@ -339,21 +336,27 @@ xjust=1, yjust=0, cex=1.0) title("Sample Size to Detect 2-Fold Change") @ -\end{figure} +%\begin{figure}[h!] +% \caption{Sample size required to detect a 2-fold treatment effect.} +% \label{fig:CumPowerPlot} +% \centering +%\end{figure} -\clearpage +%\clearpage \item What is necessary fold change to achieve $80\%$ with $n=6$ patients per group, when $\delta=1.0$ and $\alpha=0.05$? -\begin{figure}[h!] - \caption[Given Sample Size, Fold Change (Effect Size) Necessary to - Achieving a Specified Power]{Given sample size, this plot allows - visualization of the fraction of genes achieving the specified - power for different fold changes.} - \label{fig:CumFoldChangePlot} - \centering -<<fig=TRUE,width=12,height=6>>= +%\begin{figure}[h!] 
+% \caption[Given Sample Size, Fold Change (Effect Size) Necessary to +% Achieving a Specified Power]{Given sample size, this plot allows +% visualization of the fraction of genes achieving the specified +% power for different fold changes.} +% \label{fig:CumFoldChangePlot} +% \centering +%\end{figure} + +<<fig=TRUE,width=12,height=12>>= all.delta <- delta(sd=exp.sd, power=power, n=n, sig.level=sig.level) delta.plot(all.delta, lwd=2, col="magenta", xlim=c(1,10)) @@ -366,11 +369,10 @@ xjust=1, yjust=0, cex=1.0) title("Fold Change to Achieve 80\% Power") @ -\end{figure} \end{enumerate} -\section{Modifications} +\section*{Modifications} While the \code{ssize} package has been implemented using the simple 2-sample pooled t-test, you can easily modify the code for other @@ -379,13 +381,13 @@ appropriate computation for the desired experimental design. -\section{Future Work} +\section*{Future Work} Peng Liu is currently developing methods and code for substituting False Discovery Rate for the Bonferonni multiple comparison adjustment. -\section{Contributions} +\section*{Contributions} Contributions and discussion are welcome. @@ -452,4 +454,6 @@ 653-667. \end{thebibliography} + +\end{article} \end{document} This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site. |
From: <wa...@us...> - 2006-11-20 23:51:32
Revision: 1014 http://svn.sourceforge.net/r-gregmisc/?rev=1014&view=rev Author: warnes Date: 2006-11-20 15:48:54 -0800 (Mon, 20 Nov 2006) Log Message: ----------- Updates after review for RNews submission Modified Paths: -------------- trunk/ssize/inst/doc/ssize.Rnw Modified: trunk/ssize/inst/doc/ssize.Rnw =================================================================== --- trunk/ssize/inst/doc/ssize.Rnw 2006-11-16 13:05:05 UTC (rev 1013) +++ trunk/ssize/inst/doc/ssize.Rnw 2006-11-20 23:48:54 UTC (rev 1014) @@ -3,83 +3,87 @@ %\VignetteKeywords{Expression Analysis, Sample Size} %\VignettePackage{ssize} - %\documentclass[letter]{report} \documentclass[a4paper]{report} +\usepackage{Rnews} \usepackage{/Library/Frameworks/R.framework/Resources/share/texmf/Sweave} -\usepackage{Rnews} \usepackage[round]{natbib} +\usepackage{here} \begin{document} + +\setkeys{Gin}{width=0.9\textwidth} + \begin{article} \title{Sample Size Estimation for Microarray Experiments Using the \code{ssize} package.} \author{Gregory R. Warnes \\ email:\code{wa...@bs...}} -\subtitle{ ~ } +%\subtitle{ ~ } \maketitle \section*{abstract} -RNA Expression Microarray technology is widely applied in biomedical -and pharmaceutical research. The huge number of RNA concentrations +mRNA Expression Microarray technology is widely applied in biomedical +and pharmaceutical research. The huge number of mRNA concentrations estimated for each sample make it difficult to apply traditional sample size calculation techniques and has left most practitioners to rely on rule-of-thumb techniques. In this paper, we briefly describe and then demonstrate a simple method for performing and visualizing sample size calculations for microarray experiments as -implemented in the \code{ssize} R package, which is available from -the Bioconductor project (\url{http://www.bioconductor.org}) web -site. +implemented in the \code{ssize} R package. 
\section*{Note} This document is a simplified version of the manuscript \begin{quote} Warnes, G. R., Liu, P. (2006) Sample Size Estimation for Microarray - Experiments, Technical Report, Department of Biostatisticsa and + Experiments, Technical Report, Department of Biostatistics and Computational Biology, University of Rochester. \end{quote} which has been available as a pre-publication manuscript since 2004. Please refer to that document for a detailed discussion of the sample -size estimation method. +size estimation method and an evaluation of its performance. \section*{Introduction} High-throughput microarray experiments allow the measurement of expression levels for tens of thousands of genes simultaneously. These experiments have been used in many disciplines of biological -research, including as neuroscience \citep{Mandel03}, -pharmacogenomic research, genetic disease and cancer diagnosis -\citep{Heller02}. As a tool for estimating gene expression and -single nucleotide polymorphism (SNP) genotyping, microarrays produce -huge amounts of data which are providing important new insights. +research, including neuroscience \citep{Mandel03}, pharmacogenomic +research, genetic disease and cancer diagnosis \citep{Heller02}. As a +tool for estimating gene expression and single nucleotide polymorphism +(SNP) genotyping, microarrays produce huge amounts of data which can +providing important new insights. Microarray experiments are rather costly in terms of materials (RNA sample, reagents, chip, etc), laboratory manpower, and data analysis effort. It is critical, therefore, to perform proper experimental -design, including sample size estimation, before carrying out -microarray experiments. Since tens of thousands of variables (gene -expressions) may be measured on each individual chip, it is -essential to take into account multiple testing and dependency among -variables when calculating sample size. 
+design, including sample size estimation, before carrying out these +experiments. Since tens of thousands of variables (gene expressions) +may be measured on each individual chip, it is essential appropriately +take into account multiple testing and dependency among variables when +calculating sample size. \section*{Method} \subsection*{Overview} -\citet{Warnes05} provides a simple method for computing sample size -for micrarray experiments, and reports on a sereies of simulations -demonstrating the performinace of the method. The key component of -this method is the generation of cumulative plot of the proportion -of genes achieving a desired power as a function of sample size, -based on simple gene-by-gene calculations. While this mechanism can -be used to select a sample size numerically based on pre-specified -conditions, its real utility is as a visual tool for helping clients -to understand the trade off between sample size and power. In our -consulting work, this latter use as a visual tool has been +\citet{Warnes05} provide a simple method for computing sample size for +microarray experiments, and reports on a series of simulations +demonstrating its performance. Surprisingly, despite its simplicity, +the method performs exceptionally well even for data with very high +correlation between measurements. + +The key component of this method is the generation of a cumulative +plot of the proportion of genes achieving a desired power as a +function of sample size, based on simple gene-by-gene calculations. +While this mechanism can be used to select a sample size numerically +based on pre-specified conditions, its real utility is as a visual +tool for understanding the trade off between sample size and power. +In our consulting work, this latter use as a visual tool has been exceptionally valuable in helping scientific clients to make the difficult trade offs between experiment cost and statistical power. 
@@ -112,8 +116,11 @@ \begin{enumerate} \item{Estimate standard deviation ($\sigma$) for each gene based on - \emph{control samples} from existing studies performed on the - same biological system.} + \emph{control samples} from existing studies performed on the same + biological system. (While samples from the study to be performed + are not, of course, generally available, control samples from + other studies using the same biological system are often readily + available.) } \item{Specify values for \begin{enumerate} @@ -162,7 +169,7 @@ the relationship between power for all genes and required sample size in a single display. A sample size can thus be selected for a proposed microarray experiment based on user-defined criterion. For -the plot in Figure \ref{fig:CumNPlot}, for example, requiring $70\%$ +the plot in Figure \ref{fig:CumNPlot}, for example, requiring $80\%$ of genes to achieve the $80\%$ power yields a sample size of 10. Similar plots can be generated by fixing the sample size and @@ -171,75 +178,11 @@ such plots are shown in Figures \ref{fig:CumPowerPlot} and \ref{fig:CumFoldChangePlot}. +\subsection{Functions} -\section*{Example} +There are three pairs of functions available in the \code{ssize} package. -First, we need to load the \code{ssize} library: - -<<results=hide>>= -library(ssize) -library(xtable) -library(gdata) # for nobs() -options(width=30) -@ - -As part of the \code{ssize} library, I've provided an example data -set containing gene expression values for smooth muscle cells from a -control group of untreated healthy volunteers processed using -Affymetrix U95 chips and normalized per the Robust Multi-array -Average (RMA) method of \citet{Irizarry03}. 
- -<<>>= -data(exp.sd) -#str(exp.sd) -@ - -This data was calculated via something like - \begin{verbatim} -library(affy) -setwd("~/GT_methods_050316/WORK") -load("probeset_data.Rda") -expression.values <- exprs(probeset.data) -covariate.data <- pData(probeset.data) -controls <- expression.values[, - covariate.data$GROUP=="Control"] #$ -exp.sd <- apply(controls, 1, sd) -\end{verbatim} - - -Lets see what the distribution looks like: - -%\begin{figure}[h!] -% \centering -% \caption{Distribution of exp.sd} -% \label{exp.sd.hist} -%\end{figure} - -<<fig=TRUE,width=12,height=12>>= - xlab <- c("Standard Deviation", " (for data on the log scale)") - hist(exp.sd,n=20, col="cyan", border="blue", main="", xlab=xlab) - dens <- density(exp.sd) - scaled.y <- dens$y*par("usr")[4]/max(dens$y) - lines(dens$x,scaled.y ,col="red",lwd=2) #$ - title("Histogram of Standard Deviations") -@ - - -Note that this distribution is right skewed, even though it is on -the $\log_2$ scale. - -To make the computations run faster, we'll only use the standard -deviations for the first 1000 genes on the chip. Everything will -work if you skip this, but it will take longer. - -<<>>= -exp.sd <- exp.sd[1:1000] -@ - -There are 6 functions available in the \code{ssize} package. - -\begin{verbatim} pow(sd, n, delta, sig.level, alpha.correct = "Bonferonni") power.plot(x, xlab = "Power", @@ -259,15 +202,14 @@ alpha.correct = "Bonferonni") delta.plot (x, xlab = "Fold Change", ylab = "Proportion of Genes with " - "Power >= 80\% at Fold Change=delta", + "Power >= 80\% at\\n" + "Fold Change=delta", marks = c(1.5, 2, 2.5, 3, 4, 6, 10), ...) \end{verbatim} -You will note that there are three pairs. - \begin{description} -\item[pow, power.sd] compute and display a cumulative plot of the +\item[pow, power.plot] compute and display a cumulative plot of the fraction of genes achieving a specified power for a fixed sample size (\code{n}), effect size (\code{delta}), and significance level (\code{sig.level}). 
@@ -284,47 +226,90 @@ \end{description} -So, now lets see the functions in action. -First, lets define the values for which we will be investigating: + + +\section*{Example} + +First, we need to load the \code{ssize} library: + +<<results=hide>>= +library(ssize) +library(xtable) +library(gdata) # for nobs() +options(width=30) +@ + +The \code{ssize} library provides an example data set containing gene +expression values for smooth muscle cells from a control group of +untreated healthy volunteers processed using Affymetrix U95 chips and +normalized per the Robust Multi-array Average (RMA) method of +\citet{Irizarry03}. + <<>>= +# Load the example data +data(exp.sd) + +# Use only the first 1000, +# so examples run faster +exp.sd <- exp.sd[1:1000] +@ + +This data was calculated via: +\begin{verbatim} +library(affy) +load("probeset_data.Rda") +expression.values <- exprs(probeset.data) +covariate.data <- pData(probeset.data) +controls <- expression.values[, + covariate.data$GROUP=="Control"] #$ +exp.sd <- apply(controls, 1, sd) +\end{verbatim} + +Lets see what the distribution looks like: +<<label=SDPlot,fig=TRUE,include=F,results=hide,width=12,height=12>>= +par(cex=2) +xlab <- c("Standard Deviation", "(for data on the log scale)") +hist(exp.sd,n=40, col="cyan", border="blue", main="", xlab=xlab, log="x") +dens <- density(exp.sd) +scaled.y <- dens$y*par("usr")[4]/max(dens$y) +lines(dens$x,scaled.y ,col="red",lwd=2) #$ +@ +\begin{figure}[H] + \caption{Standard deviations for of logged example data} + \label{fig:SDPlot} + \begin{center} + \includegraphics{ssize-SDPlot} + \end{center} +\end{figure} + + +As is often the case, this distribution is extremely right skewed, +even though the standard deviations were computed on the $\log_2$ +scale. + + +So, now lets see the functions in action. 
First, define the parameter +values we will be investigating: +<<>>= n<-6 fold.change<-2.0 power<-0.8 sig.level<-0.05 @ +Now, the functions provided by the \code{ssize} package can be used to +address several questions: + \begin{enumerate} -\item What is the power for 6 patients per group with $\delta=1.0$, - $\alpha=0.05$? - -<<fig=TRUE,width=12,height=12>>= -all.power <- pow(sd=exp.sd, n=n, delta=log2(fold.change), - sig.level=sig.level) - -power.plot(all.power, lwd=2, col="blue") -xmax <- par("usr")[2]-0.05; ymax <- par("usr")[4]-0.05 -legend(x=xmax, y=ymax, - legend= strsplit( paste("n=",n,",", - "fold change=",fold.change,",", - "alpha=", sig.level, ",", - "# genes=", nobs(exp.sd), sep=''), "," )[[1]], - xjust=1, yjust=1, cex=1.0) -title("Power to Detect 2-Fold Change") -@ -%\begin{figure}[h!] -% \caption{Effect of Sample Size on Power} \label{fig:CumNPlot} -% \centering -%\end{figure} - \item What is the necessary per-group sample size for $80\%$ power when $\delta=1.0$, and $\alpha=0.05$? - -<<fig=TRUE,width=12,height=12>>= +<<label=CumNPlot,fig=TRUE,include=F>>= all.size <- ssize(sd=exp.sd, delta=log2(fold.change), sig.level=sig.level, power=power) +par(cex=1.3) ssize.plot(all.size, lwd=2, col="magenta", xlim=c(1,20)) xmax <- par("usr")[2]-1; ymin <- par("usr")[3] + 0.05 @@ -333,43 +318,90 @@ "alpha=", sig.level, ",", "power=",power,",", "# genes=", nobs(exp.sd), sep=''), "," )[[1]], - xjust=1, yjust=0, cex=1.0) + xjust=1, yjust=0, cex=0.90) title("Sample Size to Detect 2-Fold Change") @ -%\begin{figure}[h!] 
-% \caption{Sample size required to detect a 2-fold treatment effect.} -% \label{fig:CumPowerPlot} -% \centering -%\end{figure} +\begin{figure}[H] + \caption{Sample size required to detect a 2-fold treatment effect.} + \label{fig:CumPowerPlot} + \begin{center} + \includegraphics{ssize-CumNPlot} + \end{center} +\end{figure} %\clearpage -\item What is necessary fold change to achieve $80\%$ with $n=6$ -patients per group, when $\delta=1.0$ and $\alpha=0.05$? +This plot illustrates that a sample size of 10 is required to ensure +that at least 80\% of genes have power greater than 80\%. It also +shows that a sample size of 6 is sufficient if only 60\% of the genes +need to achieve 80\% power. -%\begin{figure}[h!] -% \caption[Given Sample Size, Fold Change (Effect Size) Necessary to -% Achieving a Specified Power]{Given sample size, this plot allows -% visualization of the fraction of genes achieving the specified -% power for different fold changes.} -% \label{fig:CumFoldChangePlot} -% \centering -%\end{figure} +\item What is the power for 6 patients per group with $\delta=1.0$, + and $\alpha=0.05$? -<<fig=TRUE,width=12,height=12>>= +<<label=CumPowerPlot,fig=TRUE,include=F>>= +all.power <- pow(sd=exp.sd, n=n, delta=log2(fold.change), + sig.level=sig.level) + +par(cex=1.3) +power.plot(all.power, lwd=2, col="blue") +xmax <- par("usr")[2]-0.05; ymax <- par("usr")[4]-0.05 +legend(x=xmax, y=ymax, + legend= strsplit( paste("n=",n,",", + "fold change=",fold.change,",", + "alpha=", sig.level, ",", + "# genes=", nobs(exp.sd), sep=''), "," )[[1]], + xjust=1, yjust=1, cex=0.90) +title("Power to Detect 2-Fold Change") +@ +\begin{figure}[H] + \caption{Effect of Sample Size on Power} + \label{fig:CumNPlot} + \begin{center} + \includegraphics{ssize-CumPowerPlot} + \end{center} +\end{figure} + +This plot shows that only 52\% of genes achieve at 80\% power at this +sample size and significance level. 
+ +\item How large does a fold-change need to be for 80\% of genes to +achieve 80\% power for an experiment for $n=6$ patients per group and +$\alpha=0.05$? + +<<label=CumFoldChangePlot,fig=TRUE,include=F>>= all.delta <- delta(sd=exp.sd, power=power, n=n, sig.level=sig.level) -delta.plot(all.delta, lwd=2, col="magenta", xlim=c(1,10)) +par(cex=1.3, mar=c(5.1,5.1,4,2)) +delta.plot(all.delta, lwd=2, col="magenta", xlim=c(1,10), + ylab = paste("Proportion of Genes with ", + "Power >= 80\% \n", + "at Fold Change of \delta") +) xmax <- par("usr")[2]-1; ymin <- par("usr")[3] + 0.05 legend(x=xmax, y=ymin, legend= strsplit( paste("n=",n,",", "alpha=", sig.level, ",", "power=",power,",", "# genes=", nobs(exp.sd), sep=''), "," )[[1]], - xjust=1, yjust=0, cex=1.0) + xjust=1, yjust=0, cex=0.90) title("Fold Change to Achieve 80\% Power") @ +\begin{figure}[H] + \caption[Given Sample Size, Fold Change (Effect Size) Necessary to + Achieving a Specified Power]{Given sample size, this plot allows + visualization of the fraction of genes achieving the specified + power for different fold changes.} + \label{fig:CumFoldChangePlot} + \begin{center} + \includegraphics{ssize-CumFoldChangePlot} + \end{center} +\end{figure} +This plot shows that for a fold change of 2.0, only 60\% of genes +achieve 80\% power, while a fold change of 3.0 will be detected with +80\% power for 80\% of genes. + \end{enumerate} \section*{Modifications} @@ -439,9 +471,10 @@ A direct approach to false discovery rates, {\it Journal of Royal Statistical Society B}, {\bf 64:3}, 479-498. -\bibitem[Warnes and Liu, 2005]{Warnes05} Warnes, G. R., Liu, P. (2005) +\bibitem[Warnes and Liu, 2006]{Warnes05} Warnes, G. R., Liu, P. (2006) Sample Size Estimation for Microarray Experiments, \emph{submitted - to} {\it Biometrics}. + to} Technical Report, Department of Biostatistics and + Computational Biology, University of Rochester. \bibitem[Yang {\it et~al.}, 2003]{Yang03} Yang, M. C. K., Yang, J. J., McIndoe, R. A., She, J. X. 
(2003) Microarray experimental