I'm interested in using JAGS to generate data from a stochastic model, and then
fit that generated data with MCMC.
The motivation for this comes from a particular data set with known
measurement error. To understand the important effects within the data, and
accurately estimate credible intervals, I'd like to use the actual
measurements and the known errors to stochastically generate a bunch of
"data", and then fit that "data" using MCMC in JAGS.
Back in 2009, there was some discussion of this issue over here:
http://jackman.stanford.edu/blog/?p=1199. I kind of followed that thought
process, and I have some ideas, but I don't
currently know how to do this. I would greatly appreciate any advice which
anyone might have.
Things which I have considered include:
1) The JAGS data block. I created a data block, and then created a data model
within this block which would generate "data" from the actual measurements and
associated measurement errors. However, as clearly described in the JAGS
manual, each node in the data block is forward-sampled exactly once. What I
needed was for each node to be forward-sampled once per iteration, such
that the ultimate posterior distributions would reflect the full range of
variation possible in the "data." As this is not how a data block functions,
it seems that a data block is not a solution to this problem.
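To make that concrete, here is roughly the kind of thing I tried — a hedged
sketch only, with placeholder names and an assumed normal error model (note
that dnorm in JAGS is parameterized by precision, not variance):

```
data {
   for (i in 1:N) {
      # Forward-sample one fake data set from the measurements and their
      # known errors. This happens exactly once, when the model is compiled,
      # not once per MCMC iteration.
      fakedata[i] ~ dnorm(measurement[i], pow(error[i], -2))
   }
}
model {
   for (i in 1:N) {
      fakedata[i] ~ dnorm(mu, tau)   # fit the generated "data"
   }
   mu ~ dnorm(0, 1.0E-4)
   tau ~ dgamma(0.01, 0.01)
}
```

So each run of this model analyzes a single realization of the fake data,
rather than integrating over the measurement error.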
2) cut() or dsum(): After some additional reading, it sounds like the BUGS
cut() function would do the trick. However, I much prefer to work within JAGS.
According to Martyn (http://jackman.stanford.edu/blog/?p=1199) this can be done within JAGS as an observable
function, although it may not be advisable. More specifically, dsum() may do
what I want. According to the manual, dsum() requires that the parameters
passed to the function be "unobserved stochastic nodes", but I could perhaps
work with that.
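For what it's worth, my reading of the manual is that the documented use of
dsum() runs the other way around: the left-hand node is observed, and the
arguments are unobserved stochastic nodes constrained to sum to it. Something
like this toy fragment (names invented, not from my model):

```
model {
   x1 ~ dpois(lambda1)
   x2 ~ dpois(lambda2)
   y ~ dsum(x1, x2)    # y is supplied as data; constrains x1 + x2 == y
   lambda1 ~ dgamma(1, 1)
   lambda2 ~ dgamma(1, 1)
}
```

So I'm not sure dsum() can be bent to my purpose, but it was the closest thing
I found.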
Thus, a solution may be the following:

    for (each data point i) {
        fakedata[i] ~ stochastic function of measurements and known errors
        fakedata_fixed[i] ~ dsum(fakedata[i])
    }
The first relation generates the "data" from the measurements and the known
uncertainties, and the second uses dsum() as the JAGS analog to the BUGS
cut(), ensuring that no information propagates upwards to the "data".
As I noted earlier, I would greatly appreciate any advice or feedback. Thanks,
If you want to generate and fit the data in the same model, then in theory you
have two choices:
1) Use a loop in which the data are generated first and then analyzed. Pool
the simulations from the analysis phase. As you know, you can generate the data
within JAGS using a data block. This would be easy to set up using the rjags
interface to R. The disadvantage is that you must go through the model
compilation and adaptation at each iteration.
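Roughly, the loop would look like this in rjags (a sketch only — the file
name, monitored variables, and iteration counts are placeholders):

```
library(rjags)

posterior <- list()
for (rep in 1:100) {
   # Recompiling the model forward-samples the data block afresh,
   # giving a new realization of the fake data each time
   m <- jags.model("model_with_data_block.bug",
                   data = list(measurement = measurement,
                               error = error,
                               N = N),
                   n.chains = 1)
   update(m, 1000)   # adaptation and burn-in
   posterior[[rep]] <- coda.samples(m, c("mu", "tau"), n.iter = 1000)
}
# Pool the samples across all replicates afterwards
```

The repeated compilation is the price you pay, as noted above.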
2) Try to do everything within one model. This will involve cuts. There is no
other way to do it. Cuts are not properly implemented in JAGS. Although there
is an experimental cut module in the source it is not part of the distribution
because the naive cut algorithm, as implemented in OpenBUGS, is wrong: it will
not converge to a well-defined distribution.
So in practice you will have to go for option 1 now.
OK. I can do #1, it just takes longer.
As a humble additional request, when you have time, could you expand the
observable functions section of the manual? I'm afraid I still don't
understand how dinterval() and dsum() work, nor do I understand the
constraints on their use.
Again, thanks for sharing your work, and for providing kind support on these
forums.