Left censoring with varying cutpoint

  • Aziz

    Aziz - 2011-07-08


    I would like to implement a model where the response y~dnorm(mu,tau) is left-
    censored but the censoring threshold (cutpoint) depends on both mu and tau.
    Basically the censoring threshold is unknown, all I know is the probability of
    censoring (pcens_) and I use that with mu and tau to compute the censoring
    quantile (called upper_ in my model).

    I'm unfamiliar with the syntax of dinterval in case of left censoring. I used
    an indicator variable is.observed_ which takes value 0 if y_ is censored, and
    value 1 if y_ is observed. This is the reverse situation to the right
    censoring case mentionned in the JAGS manual.

    My model is the following:


    for(i in 1:N){

    is.observed_ ~ dinterval(y_, upper_)

    y_ ~ dnorm(mu_,tau)

    mu_ <- beta + beta*age_

    upper_ <- qnorm(pcens_,mu_,tau)


    tau <- pow(sigma,-2)

    sigma ~ dunif(0,100)

    beta ~ dnorm(0,0.001)

    beta ~ dnorm(0,0.001)


    When I run that model I get:

    Error in node is.observed

    Observed node inconsistent with unobserved parents at initialization

    ...so I guess the issue is that initial values for missing y's do not agree
    with the censoring constraints. But then how may I solve that since the
    censoring quantiles depends on mu and tau? As far as I know, this kind of
    construct is prohibited in WinBUGS/OpenBUGS and the censoring thresholds must
    not depend on mu or tau if the latter are unobserved as well. Is this actually
    possible in JAGS? What am I missing here?

    Thanks for any input

  • Aziz

    Aziz - 2011-07-08

    I forgot the indices when copying the model, sorry... Read:





  • Aziz

    Aziz - 2011-07-08

    well indices are not displayed...

  • Martyn Plummer

    Martyn Plummer - 2011-07-18

    It's the forum software which interprets some indices as formatting commands.
    We shall be moving to phpBB soon.

    To answer your question, when you use the dinterval distribution it imposes an
    a posteriori constraint on the parameters. JAGS cannot automatically
    generate starting values that satisfy the constraint so you need to supply
    them yourself.

    In R you can create suitable starting values for y, given starting values for
    mu and tau as follows

    L <- ifelse(is.observed, pcens, 0)
    U <- ifelse(is.observed, 1, pcens)
    y <- qnorm( runif(N, L, U), mu, tau)
    dump(c("mu","tau","y"), file="myinits.R")
  • Aziz

    Aziz - 2011-07-20

    Thanks a lot for your help Martyn! It is highly appreciated. I will try your
    suggestion and let you know how it works.

    So far I can say I'm impressed with JAGS. It runs my model MUCH faster than
    OpenBUGS. Great work!

  • Aziz

    Aziz - 2011-08-09

    Hi again,

    first I confirm that your initialization suggestion work both for censored
    data (dinterval) and truncated data (using T). Now my question is what should
    I actually use: censoring or truncation?

    Reading the JAGS user's manual, I understand that truncation is an a priori
    constraint while censoring is an a posteriori constraint. In my current
    problem, I know a priori that if a data is observed (is.observed=1) than its
    value (y) must be higher than some threshold (upper) and if a data is
    unobserved (due to either truncation or censoring) than its value(y) must be
    lower than that threshold. This sounds like truncation. However because the
    threshold itself depends on the posterior distribution of parameters defining
    y (which are mu and tau), I'm confused because then the prior constraints
    depends on some posteriors...

    Both censoring and truncation can be modeled using either dinterval or T(,)
    and they seems to provide rather similar (yet not exactly the same) results
    although I couldn't test this comprehensively. But the dinterval construct
    runs twice as fast as T(,). Am I in a framework in which censoring and
    truncation are actually the same??

  • Martyn Plummer

    Martyn Plummer - 2011-08-09

    Imagine what would happen if you ran your experiment/study again and create
    replicated data y.rep and then recalculated the censoring indicator
    is.observed.rep based on the value of y.rep.

    Here is the key question. Is is.observed.rep the same as is.observed or, if
    you re-ran your experiment, could it be different?

  • Aziz

    Aziz - 2011-08-10

    Well it's a tough question and I'm unsure I understand it correctly. In my
    situation I'm measuring the ability of children to perform specific motor
    tasks. Basically if the child is able to do the task, we measure how he/she
    performs the task with a qualitative score so that score is observed. If the
    child is not able to do the task, then no score can be observed and this is
    where truncation or censoring occurs.

    I guess that re-running the experiment would imply re-measuring these same
    children. Probably that those that were unable to do a task would still be
    unable to redo the task except some small fluctuations due to instant mood
    etc. So is.observed.rep would probably be quite similar to is.observed besides
    those children who were unable to do the task the first time and who would be
    able to do it at the re-measure and conversely.

    One thing to note however is that according to the model, a child's score is
    censored or truncated if it is lower than the threshold "upper". But in
    reality it could be possible that a child is able to do the task
    (is.observed=1) yet would score lower than the threshold because the ability
    to do the task only depends on the ability to measure its score. Does this
    help to decide which of truncation or censoring is appropriate here?

  • Aziz

    Aziz - 2011-08-10

    My understanding is that I model all children but I have missing values for
    those whose score is unobserved because they were not able to do the task. In
    some sense this sounds like censoring because I have some information on these
    children (I can calculate the proportion of children who were not able to
    perform the task at each age with a logistic regression). If my dataset did
    only include children who were able to do the task, then I'd probably use
    truncation. So I'd say I should go with censoring. Does it make sense?

    In WinBUGS, the I(,) construct (or C(,) in OpenBUGS) which refers to censoring
    is ignored for observed data. In other words, I(,) or C(,) only apply to
    missing data (NA). So in BUGS I'd probably use truncation (though not sure)
    because I need to model the fact that if a data is observed it is because it
    lies above some threshold. In JAGS, dinterval probably works differently I
    guess and censoring constraints do apply for both observed and unobserved

  • Aziz

    Aziz - 2011-08-10

    Another point is that we don't know a priori whether a child will be able to
    do the task or not. We need to observe it and see whether he/she is able to do
    it. In that sense, there is no prior information because we first need to
    experiment before decising whether the child is able or unable to do the task.
    That sounds again like censoring...

  • Martyn Plummer

    Martyn Plummer - 2011-08-10

    Yes, that is censoring.

  • Aziz

    Aziz - 2011-08-10

    Thanks Martyn! It's nice it is censoring because dinterval runs much faster
    than T(,)!!


Log in to post a comment.

Get latest updates about Open Source Projects, Conferences and News.

Sign up for the SourceForge newsletter:

No, thanks