Aziz
2011-07-08
Hi,
I would like to implement a model where the response y~dnorm(mu,tau) is left-
censored but the censoring threshold (cutpoint) depends on both mu and tau.
Basically the censoring threshold is unknown, all I know is the probability of
censoring (pcens_) and I use that with mu and tau to compute the censoring
quantile (called upper_ in my model).
I'm unfamiliar with the syntax of dinterval in case of left censoring. I used
an indicator variable is.observed_ which takes value 0 if y_ is censored, and
value 1 if y_ is observed. This is the reverse situation to the right
censoring case mentionned in the JAGS manual.
My model is the following:
model{
for(i in 1:N){
is.observed_ ~ dinterval(y_, upper_)
y_ ~ dnorm(mu_,tau)
mu_ <- beta + beta*age_
upper_ <- qnorm(pcens_,mu_,tau)
}
tau <- pow(sigma,-2)
sigma ~ dunif(0,100)
beta ~ dnorm(0,0.001)
beta ~ dnorm(0,0.001)
}
When I run that model I get:
Error in node is.observed
Observed node inconsistent with unobserved parents at initialization
...so I guess the issue is that initial values for missing y's do not agree
with the censoring constraints. But then how may I solve that since the
censoring quantiles depends on mu and tau? As far as I know, this kind of
construct is prohibited in WinBUGS/OpenBUGS and the censoring thresholds must
not depend on mu or tau if the latter are unobserved as well. Is this actually
possible in JAGS? What am I missing here?
Thanks for any input
Aziz
2011-07-08
I forgot the indices when copying the model, sorry... Read:
y=y_
mu=mu_
is.observed=is.observed_
upper=upper___
Aziz
2011-07-08
well indices are not displayed...
Martyn Plummer
2011-07-18
It's the forum software which interprets some indices as formatting commands.
We shall be moving to phpBB soon.
To answer your question, when you use the dinterval distribution it imposes an
a posteriori constraint on the parameters. JAGS cannot automatically
generate starting values that satisfy the constraint so you need to supply
them yourself.
In R you can create suitable starting values for y, given starting values for
mu and tau as follows
L <- ifelse(is.observed, pcens, 0) U <- ifelse(is.observed, 1, pcens) y <- qnorm( runif(N, L, U), mu, tau) dump(c("mu","tau","y"), file="myinits.R")
Aziz
2011-07-20
Thanks a lot for your help Martyn! It is highly appreciated. I will try your
suggestion and let you know how it works.
So far I can say I'm impressed with JAGS. It runs my model MUCH faster than
OpenBUGS. Great work!
Aziz
2011-08-09
Hi again,
first I confirm that your initialization suggestion work both for censored
data (dinterval) and truncated data (using T). Now my question is what should
I actually use: censoring or truncation?
Reading the JAGS user's manual, I understand that truncation is an a priori
constraint while censoring is an a posteriori constraint. In my current
problem, I know a priori that if a data is observed (is.observed=1) than its
value (y) must be higher than some threshold (upper) and if a data is
unobserved (due to either truncation or censoring) than its value(y) must be
lower than that threshold. This sounds like truncation. However because the
threshold itself depends on the posterior distribution of parameters defining
y (which are mu and tau), I'm confused because then the prior constraints
depends on some posteriors...
Both censoring and truncation can be modeled using either dinterval or T(,)
and they seems to provide rather similar (yet not exactly the same) results
although I couldn't test this comprehensively. But the dinterval construct
runs twice as fast as T(,). Am I in a framework in which censoring and
truncation are actually the same??
Martyn Plummer
2011-08-09
Imagine what would happen if you ran your experiment/study again and create
replicated data y.rep and then recalculated the censoring indicator
is.observed.rep based on the value of y.rep.
Here is the key question. Is is.observed.rep the same as is.observed or, if
you re-ran your experiment, could it be different?
Aziz
2011-08-10
Well it's a tough question and I'm unsure I understand it correctly. In my
situation I'm measuring the ability of children to perform specific motor
tasks. Basically if the child is able to do the task, we measure how he/she
performs the task with a qualitative score so that score is observed. If the
child is not able to do the task, then no score can be observed and this is
where truncation or censoring occurs.
I guess that re-running the experiment would imply re-measuring these same
children. Probably that those that were unable to do a task would still be
unable to redo the task except some small fluctuations due to instant mood
etc. So is.observed.rep would probably be quite similar to is.observed besides
those children who were unable to do the task the first time and who would be
able to do it at the re-measure and conversely.
One thing to note however is that according to the model, a child's score is
censored or truncated if it is lower than the threshold "upper". But in
reality it could be possible that a child is able to do the task
(is.observed=1) yet would score lower than the threshold because the ability
to do the task only depends on the ability to measure its score. Does this
help to decide which of truncation or censoring is appropriate here?
Aziz
2011-08-10
My understanding is that I model all children but I have missing values for
those whose score is unobserved because they were not able to do the task. In
some sense this sounds like censoring because I have some information on these
children (I can calculate the proportion of children who were not able to
perform the task at each age with a logistic regression). If my dataset did
only include children who were able to do the task, then I'd probably use
truncation. So I'd say I should go with censoring. Does it make sense?
In WinBUGS, the I(,) construct (or C(,) in OpenBUGS) which refers to censoring
is ignored for observed data. In other words, I(,) or C(,) only apply to
missing data (NA). So in BUGS I'd probably use truncation (though not sure)
because I need to model the fact that if a data is observed it is because it
lies above some threshold. In JAGS, dinterval probably works differently I
guess and censoring constraints do apply for both observed and unobserved
data.
Aziz
2011-08-10
Another point is that we don't know a priori whether a child will be able to
do the task or not. We need to observe it and see whether he/she is able to do
it. In that sense, there is no prior information because we first need to
experiment before decising whether the child is able or unable to do the task.
That sounds again like censoring...
Martyn Plummer
2011-08-10
Yes, that is censoring.
Aziz
2011-08-10
Thanks Martyn! It's nice it is censoring because dinterval runs much faster
than T(,)!!