[R-gregmisc-users] SF.net SVN: r-gregmisc: [942] trunk/PathwayModeling/thesispaper

SourceForge Headquarters 225 Broadway Suite 1600 San Diego, CA 92101 +1 (858) 422-6466

Revision: 942
Author:   r_burrows
Date:     2006-03-20 03:41:09 -0800 (Mon, 20 Mar 2006)
ViewCVS:  http://svn.sourceforge.net/r-gregmisc/?rev=942&view=rev

Log Message:
-----------
minor revisions

Modified Paths:
--------------
    trunk/PathwayModeling/thesispaper/introduction.Snw
    trunk/PathwayModeling/thesispaper/paper.Snw
    trunk/PathwayModeling/thesispaper/results.Snw
Modified: trunk/PathwayModeling/thesispaper/introduction.Snw
===================================================================

--- trunk/PathwayModeling/thesispaper/introduction.Snw	2006-03-20 01:13:55 UTC (rev 941)
+++ trunk/PathwayModeling/thesispaper/introduction.Snw	2006-03-20 11:41:09 UTC (rev 942)
@@ -12,7 +12,7 @@
 altering the system behavior in beneficial ways.
 
 In order to gain some insight into the properties of biological
-systems, we would like to prepare quantitative models of sequences of
+systems, we would like to use quantitative models of sequences of
 reactions.
  Specifically, we would like to develop methods that can be used to
 deduce the kinetic parameters of the enzymes in a pathway. With this
@@ -58,7 +58,7 @@
 
 We will explore the utility of the the Bayesian approach using a
 simulation technique: First we will generate data for a model with
-known parameters, and then we will fit this data using our Bayesian
+known parameters, and then we will fit this data using Bayesian
 methods.
 The biochemical data is generated using Gillespie's
 stochastic simulation algorithm \cite{Gillespie77}. This algorithm is

Modified: trunk/PathwayModeling/thesispaper/paper.Snw
===================================================================
--- trunk/PathwayModeling/thesispaper/paper.Snw	2006-03-20 01:13:55 UTC (rev 941)
+++ trunk/PathwayModeling/thesispaper/paper.Snw	2006-03-20 11:41:09 UTC (rev 942)
@@ -34,16 +34,16 @@
 \address{University of Rhode Island, Kingston, RI}
   
 \begin{abstract}
-The usefulness of Markov chain Monte Carlo methods for the modeling
-of biochemical reactions is examined. With simulated data, it is
-shown that mechanistic models can be fit to sequences of enzymatic
+\textit{We have examined} the usefulness of \textit{Bayesian statistical methods} for the modeling
+of biochemical reactions. With simulated data, it is
+shown that mechanistic models can \textit{effectively} be fit to sequences of enzymatic
 reactions. These methods have the advantages of being relatively easy
 to use and producing probability distributions for the model
-parameters rather than point estimates.
+parameters rather than point estimates, \textit{allowing more informative statistical inferences to be drawn}.
 
 Three Markov chain Monte Carlo algorithms are used to fit models to
 data from a
-sequence of 4 enzymatic reactions. The algorithms 
+sequence of 4 enzymatic reactions. These algorithms 
 are evaluated with respect to
 the goodness-of-fit of the fitted models and the time to completion.
 It is shown that the algorithms produce essentially the same

Modified: trunk/PathwayModeling/thesispaper/results.Snw
===================================================================
--- trunk/PathwayModeling/thesispaper/results.Snw	2006-03-20 01:13:55 UTC (rev 941)
+++ trunk/PathwayModeling/thesispaper/results.Snw	2006-03-20 11:41:09 UTC (rev 942)
@@ -9,16 +9,37 @@
 algorithms were used for the simulations and they converged to similar
 distributions. 
 
-The marginal distributions for the parameters were mound-shaped. They
+The marginal distributions for the parameters were mound-shaped 
+(Figure~\ref{densities}). They
 generally were skewed to the right and had thicker
-tails than normal distributions as illustrated in
-Figure~\ref{densities}.
+tails than normal distributions. /textit{The modes of the distributions 
+turned out to be poor estimates of the maximum likelihood values for 
+the parameters. The maximum likelihood values were all larger than 
+the modes, and the difference increased as the position of the reaction 
+in the sequence increased. Correlated pairs of parameters 
+(e.g., $d_1$ --$d_2$ and $d_3$ --$d_4$, Figure~\ref{scatter}) were similar relative distances from their 
+modes. The significance of all this is not clear at this point.}
 
-<<fig4,echo=F,eval=T>>=3
+<<fig4,echo=F,eval=F>>=3
+source("R/paperSSQ.R")
+source("R/getModes.R")
+source("R/do.optim.R")
 source("R/plotDensities.R")
-source("R/getModes.R")
 output16 <- read.table("data/output16.AllComp.thinned")
-modes <- getModes(output16)
+input16 <- read.table("data/input16.dat", header=T)
+attach(input16)
+temp<-output16[sample(1:20000,5000),]
+temp2<-numeric(5000)
+for (i in 1:5000) temp2[i]<-paperSSQ(temp[i,])
+temp<-temp[order(temp2),]
+write.table(temp[1:4750,],file="figures/tempDir/Dsamp.dat",row.names=F,col.names=F)
+mcmcML <- as.numeric(temp[1,-10])
+write.table(mcmcML,file="figures/tempDir/mcmcML.dat",row.names=F,col.names=F)
+modes<-getModes(output16)
+temp<-do.optim(start=modes[-10],scail=modes[-10])
+temp<-do.optim(start=temp$par,scail=temp$par)
+optimML<-temp$par
+write.table(optimML,file="figures/tempDir/optimML.dat",row.names=F,col.names=F)
 SD <- numeric(9)
 for (i in 1:9) SD[i] <- sd(output16[,i])
 pdf(file="figures/tempDir/densities.pdf",width=6,height=6)
@@ -26,6 +47,7 @@
 plotDensities(output16)
 dev.off()
 par(mfrow=c(1,1))
+detach(input16)
 @
 \begin{figure}
   \centering
@@ -34,13 +56,14 @@
   the 5-reaction model generated with the all-components Metropolis
   algorithm and the 16-point dataset. The curves are normal densities with means equal to the
   medians of the distributions and variances equal to the variances of
-  the distributions.}
+  the distributions. \textit{Red vertical lines indicate the parameters values that 
+  minimize the mean squared residuals.}}
   \label{densities}
 \end{figure}
 The effect of the number of data points on the parameter distributions
 can be seen in Figure~\ref{converged}.
 
-<<fig5,echo=F,eval=T>>=4
+<<fig5,echo=F,eval=F>>=4
 source("R/plotDensity.R")
 source("R/plotConverged.R")
 output12 <- read.table("data/output12.AllComp.thinned")
@@ -55,7 +78,7 @@
 \begin{figure}
   \centering
   \includegraphics[scale=0.9]{figures/tempDir/converged}
-  \caption{Parameter distributions as functions of the number of data
+  \caption{\textit{Posterior distributions from different numbers} of data
   points for the all-components algorithm. (---------) prior
   distribution; ({\color{red} - - - -}) 12 points; ({\color{green}
   $\cdots\cdots$}) 16 points; ({\color{blue} -- $\cdot$ -- $\cdot$ --}) 25
@@ -63,12 +86,17 @@
   \label{converged}
 \end{figure}
 There is some narrowing of the distributions as the number of points
-increases from 16 to 25 but it is not pronounced. 
+increases from 16 to 25 but it is not pronounced. \textit{Overall, the 
+reduction in width varied from 18-fold for $d_9$ and 9-fold for $d_1$ to 
+1.5-fold for $d_4$}.
+
 Figure~\ref{scatter} is an example of a bivariate scatterplot of the
 distributions. There is correlation between some pairs of parameters, e.g.,
 $d_1$ -- $d_2$, but no evidence of multi-modality.
 
-<<fig6, echo=F,eval=T>>=5
+<<fig6, echo=F,eval=F>>=5
+source("R/paperPairs.R")
+mcmcML <-scan("figures/tempDir/mcmcML.dat")
 output16 <- read.table("data/output16.AllComp.thinned")
 mcmc.cor <- function(x,y) {
     par(usr=c(0,1,0,1))
@@ -78,7 +106,7 @@
     }
 pdf(file="figures/tempDir/scatterPlot.pdf",width=8,height=8)
 par(pch='.')
-pairs(output16[sample(1:20000,2000),],labels=c('d1','d2','d3','d4','d5','d6','d7','d8','d9'),lower.panel=mcmc.cor)
+paperPairs(output16[sample(1:20000,2000),-10],labels=c('d1','d2','d3','d4','d5','d6','d7','d8','d9'),lower.panel=mcmc.cor)
 dev.off()
 par(pch=1)
 @
@@ -99,25 +127,19 @@
 is shown in Figure~\ref{fits}. Quantitative measures of the fits for
 all the algorithms are given in Table~\ref{MSq}.
 
-<<fig7, echo=F, eval=T>>=6
+<<fig7, echo=F, eval=F>>=6
 source("R/paperSSQ.R")
 source("R/getModes.R")
 source("R/do.optim.R")
 source("R/fitPaper.R")
 source("R/get95CI.R")
 source("R/plotVi.R")
+mcmcML <-scan("figures/tempDir/mcmcML.dat")
+optimML <-scan("figures/tempDir/optimML.dat")
 output16 <- read.table("data/output16.AllComp.thinned")
 input16 <- read.table("data/input16.dat",header=T)
 attach(input16)
-temp<-output16[sample(1:20000,5000),]
-temp2<-numeric(5000)
-for (i in 1:5000) temp2[i]<-paperSSQ(temp[i,])
-temp<-temp[order(temp2),]
-Dsamp<-as.matrix(temp[1:4750,])
-modes<-getModes(output16)
-temp<-do.optim(start=modes[-10],scail=modes[-10])
-temp<-do.optim(start=temp$par,scail=temp$par)
-paramsML<-temp
+Dsamp <- as.matrix(read.table("figures/tempDir/Dsamp.dat"))
 pdf(file="figures/tempDir/V%dfitted.pdf",onefile=FALSE,width=10,height=5.5)
 plotVi(Dsamp,input16)
 dev.off()
@@ -154,13 +176,17 @@
 
 Rates of convergence are illustrated in Figure~\ref{SSQvsIter}. The
 mean sums of squared residuals (SSQ) are plotted vs. the number of
-likelihood evaluations for the three algorithms.
-For the 1-component Metropolis algorithm, one parameter is
-updated for each evaluation of the likelihood method whereas the
-all-component Metropolis and NKC algorithms update all the parameters
-for each likelihood evaluation.
+likelihood evaluations for the three algorithms. \textit{The relatively 
+long convergence time for the 1-component Metropolis algorithm is a result 
+of the high correlations between some of the parameters (Figure~\ref{scatter}).
+With the 1-component Metropolis algorithm, one parameter is
+updated for each evaluation of the likelihood method, i.e., the algorithm 
+searches the parameter space by moving parallel to the axes and thus 
+it can sample a density with a diagonal axis of symmetry very slowly. 
+This is not a problem with the all-component Metropolis and NKC 
+algorithms since they update all the parameters at each iteration}.
 
-<<fig8, echo=F,eval=T>>=7
+<<fig8, echo=F,eval=F>>=7
 source("R/getMSDvsEvals.R")
 source("R/plotMSD16.R")
 source('R/paperSSQ.R')
@@ -172,7 +198,7 @@
 file <- sub('#',as.character(i),'NKC.MSD.dat#')
 assign(file, read.table(paste("data/",file,sep="")))
 }
-pdf(file='figures/tempDir/MSR16.pdf',width=9,height=6)
+pdf(file="figures/tempDir/MSR16.pdf",width=9,height=6)
 plotMSD16()
 dev.off()
 detach(input16)


This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site.