random effects model with left AND right censored data

Help
fe fnerd2
2014-05-09
2014-05-13
  • fe fnerd2
    fe fnerd2
    2014-05-09

    Hi,

    I am struggling with defining a JAGS model that deals with left and right censored data.

    I have a model that takes into account that the data is either left or right censored using dinterval (below the case for right-censoring);

    model {

    for(i in 1:n) {
    
        isCensored[i] ~ dinterval(y[i], censorLimit[i])
        y[i] ~ dnorm(y.hat[i], tau.y)
    
        y.hat[i] <- a[subject[i]] + b * predictor[i]
    
    }
    
    for(j in 1:groups) {
    
        a[j] ~ dnorm(mu.a, tau.a)
        a.hat[j] <- mu.a
    }
    
    mu.a ~ dnorm(0, .0001) 
    tau.a <- pow(sigma.a, -2) 
    sigma.a ~ dunif(0, 100)
    
    b ~ dnorm(0, .0001)
    
    tau.y <- pow(sigma.y, -2) 
    sigma.y ~ dunif(0, 100)
    

    }

    But how can I specify that y is actually censored not only on the right but also on the left?

    Thank you!

    }

     
    • Hi fefnerd2,

      x ~ dinterval(y,z) is generic---if z is a vector {0,1}, then
      y <= 0 --> x = 0
      0 < y <= 1 --> x = 1
      1 < y --> x = 2

      ... the <= and < might be backwards but that's in the docs.

       
    • p.s.- fyi: real names are appreciated in lots of these stats/math forums, and you might get more help with your real name.

       
  • fe fnerd2
    fe fnerd2
    2014-05-09

    Ok, thank you very much!

    However what I do not understand is, that y[i] that is censored has to be recorded as NA, right?
    So how can the dinterval() function determine whether a not recorded value is actually bound to be below or above a certain threshold?

    I prepared a toy example in R generating normal distributed data that is censored both left and right;

    How do I have to change dinterval so that it can identify a y-value is lying below or above the censoring thresholds?

    require(rjags)    
    require(coda)
    
    # model
    
    modelstring = "
    model {
      for ( i in 1:N ) {
    
        isCensored[i] ~ dinterval(y[i], censorLimit[i])
        y[i] ~ dnorm(mu, tau) 
    
      }
    
      tau <- 1/pow(sigma,2)
      sigma ~ dunif(0,100)
      mu ~ dnorm(0,1E-6)
    
    }
    "
    
    writeLines(modelstring,con="model.txt")
    
    # data generation
    
    y <- rnorm(n=1500, mean = 0, sd = 1)
    y_original <- y
    
    # censoring (left and right)!
    y[y < -0.5] <- -0.5
    y[y >  2] <- 2
    
    # set up variables
    
    N = length(y)
    
    # take care of right censoring
    censorLimit <- 2
    isCensored <- y > censorLimit
    y[y > censorLimit] <- NA
    
    dataList = list(y = y , N = N, isCensored = as.numeric(isCensored), censorLimit = rep(censorLimit, length(y)))
    
    yInit = rep(NA, length(y))
    yInit[isCensored] = censorLimit+1
    
    initsList = list(y=yInit)
    
    # fit model
    
    jagsModel = jags.model("model.txt", data=dataList, inits=initsList, n.chains=3, n.adapt=100)
    codaSamples <- coda.samples(jagsModel, variable.names='mu', n.iter=1000)    
    
    # (mu estimate should be 0)
    summary(codaSamples)
    

    because of not taking into account the left-censoring, the mean is wrongly estimated

     
    • From your code it looks like you actually KNOW if it's censored below or above so... why not give it that information through "isCensored" (0: censored below, 1: not censored, 2: censored above)? Then leave the y's as NA rather than replacing them as you do.

       
  • fe fnerd2
    fe fnerd2
    2014-05-11

    Again, thank you very much for your reply!

    From your code it looks like you actually KNOW if it's censored below or above so... why not give it that information through "isCensored" (0: censored below, 1: not censored, 2: censored above)? Then leave the y's as NA rather than replacing them as you do.

    So how exactly would you do this for the example above?
    With two dinterval statements?

    I am sorry, I am fairly new to JAGS and everything I tried so far only resulted in error messages or strange fits.

    Thanks!

     
    • Just one dinterval statement, but the second argument to dinterval can be a vector (which then maps the reals to more than two values)... then you can pass the info in through isCensored giving values for censored below/not censored/censoreda above, rather than just censored/not censored./

       
  • fe fnerd2
    fe fnerd2
    2014-05-12

    sorry to bother again.

    then you can pass the info in through isCensored giving values for censored below/not censored/censoreda above, rather than just censored/not censored./

    What values do you have to give for below/not/above?

    I tried the following (among others ;) but can not parse the model.

    require(rjags)    
    require(coda)
    
    # model
    
    modelstring = "
    model {
      for ( i in 1:N ) {
    
        isCensored[i] ~ dinterval(y[i], censorLimits)
        y[i] ~ dnorm(mu, tau) 
    
      }
    
      tau <- 1/pow(sigma,2)
      sigma ~ dunif(0,100)
      mu ~ dnorm(0,1E-6)
    
    }
    "
    
    writeLines(modelstring,con="model.txt")
    
    # data generation
    
    y <- rnorm(n=1500, mean = 0, sd = 1)
    y_original <- y
    
    # censoring (left and right)!
    y[y < -0.5] <- -0.5
    y[y >  2] <- 2
    
    # set up variables
    
    N = length(y)
    
    # take care of  censoring
    
    censorLimits <- c(-0.5, 0, 2)
    isCensored <- ifelse(y <= censorLimits[1], -0.5, ifelse(y >= censorLimits[3], 2, 0))
    
    y[y <= censorLimits[1] | y >= censorLimits[3]] <- NA
    
    dataList = list(y = y , N = N, isCensored = isCensored, censorLimits = censorLimits)
    
    yInit = rep(NA, length(y))
    yInit[y_original <= censorLimits[1]] <- censorLimits[1]-1
    yInit[y_original >= censorLimits[2]] <- censorLimits[3]+1
    
    initsList = list(y=yInit)
    
    # fit model
    
    jagsModel = jags.model("model.txt", data=dataList, inits=initsList, n.chains=3, n.adapt=100)
    codaSamples <- coda.samples(jagsModel, variable.names='mu', n.iter=1000)    
    
    # (mu estimate should be 0)
    summary(codaSamples)
    
     
    • isCensored <- ifelse(y <= censorLimits[1], -0.5, ifelse(y >=    censorLimits[3], 2, 0))
      

      So here you give isCensored the values:

      censored below -> -0.5
      not censored -> 0
      censored above -> 2

      ... but dinterval maps the real numbers to integers... so maybe using 0,1,2 would work better (that matches with the censorLimits you provide). The documentation on dinterval is complete if you stare at it for long enough.

      This:

      # censoring (left and right)!
      y[y < -0.5] <- -0.5
      y[y >  2] <- 2
      

      .... is unnecessary but I don't remember if it's harmful. I think it is harmful. While you might in reality record -0.5 or 2, what you really know is that the values were truncated.

       
  • fe fnerd2
    fe fnerd2
    2014-05-13

    Thanks again (also for your patience).

    The documentation on dinterval is complete if you stare at it for long enough.

    ok, I will probably have to stare a little bit longer ;)

    I did try to change the script as you suggested.
    Now I get the error message "Error in node isCensored[3]
    Observed node inconsistent with unobserved parents at initialization", when trying to initialize the model.
    (isCensored[3] is equal to 1, and y[3] equal to 1.6).

    Here is the revised sample code;

    require(rjags)    
    require(coda)
    
    # model
    
    modelstring = "
    model {
      for ( i in 1:N ) {
    
        isCensored[i] ~ dinterval(y[i], censorLimits)
        y[i] ~ dnorm(mu, tau) 
    
      }
    
      tau <- 1/pow(sigma,2)
      sigma ~ dunif(0,100)
      mu ~ dnorm(0,1E-6)
    
    }
    "
    
    writeLines(modelstring,con="model.txt")
    
    # data generation
    
    set.seed(123)
    y <- rnorm(n=1500, mean = 0, sd = 1)
    y_original <- y
    
    # censoring (left and right)!
    y[y < -0.5] <- NA
    y[y >  2] <- NA
    
    # set up variables
    
    N = length(y)
    
    # take care of  censoring
    
    censorLimits <- c(0,1,2)
    isCensored <- ifelse(y_original <= censorLimits[1], 0, ifelse(y_original >= censorLimits[3], 2, 1))
    
    dataList = list(y = y, N = N, isCensored = isCensored, censorLimits = censorLimits)
    
    yInit = rep(NA, length(y))
    yInit[y_original <= censorLimits[1]] <- censorLimits[1]-1
    yInit[y_original >= censorLimits[2]] <- censorLimits[3]+1
    
    initsList = list(y=yInit)
    
    # fit model
    
    jagsModel <- jags.model("model.txt", data=dataList, n.chains=3, n.adapt=100)
    codaSamples <- coda.samples(jagsModel, variable.names='mu', n.iter=1000)    
    
    # (mu estimate should be 0)
    summary(codaSamples)
    
     
    Last edit: fe fnerd2 2014-05-13
  • You are really close but 'censorLimits' is not quite right---the second argument to dinterval gives the breakspoints. The mapping you have now is:

    censorLimits <- c(0,1,2)
    

    Which means (well, there's a <= in here somewhere...):

    y < 0 -> 0
    0 < y < 1 -> 1
    1 < y < 2 -> 2
    2 < y   -> 3
    

    So y[3] = 1.6 -> 2, which is inconsistent with isCensored[3] = 1 since 1 != 2 which gives you the error. You want censorLimits to indicate where truncation happens (-0.5 and 2 instead of 0, 1, and 2).

     
  • fe fnerd2
    fe fnerd2
    2014-05-13

    Ok, now I changed censorLimits to a vector (c(-0.5, 2)) coding for the breaks.
    isCensored codes for censoring (0 = censored below, 1 = not censored, 2 = censored above).
    y is NA if censored below or above.

    The model statement is;
    isCensored[i] ~ dinterval(y[i], censorLimits)
    (again, where censorLimits is a vector with two elements)

    Still, the same error messages are thrown at me.
    What am I missing?

    here is the complete code;

    require(rjags)    
    require(coda)
    
    # model
    
    modelstring = "
    model {
      for ( i in 1:N ) {
    
        isCensored[i] ~ dinterval(y[i], censorLimits)
        y[i] ~ dnorm(mu, tau) 
    
      }
    
      tau <- 1/pow(sigma,2)
      sigma ~ dunif(0,100)
      mu ~ dnorm(0,1E-6)
    
    }
    "
    
    writeLines(modelstring,con="model.txt")
    
    # data generation
    
    set.seed(123)
    y <- rnorm(n=1500, mean = 0, sd = 1)
    y_original <- y
    
    # censoring (left and right)!
    y[y < -0.5] <- NA
    y[y >  2] <- NA
    
    # set up variables
    
    N = length(y)
    
    # take care of  censoring
    
    censorLimits <- c(-0.5, 2)
    isCensored <- ifelse(y_original <= censorLimits[1], 0, ifelse(y_original >= censorLimits[2], 2, 1))
    
    dataList = list(y = y, N = N, isCensored = isCensored, censorLimits = censorLimits)
    
    yInit = rep(NA, length(y))
    yInit[y_original <= censorLimits[1]] <- censorLimits[1]-1
    yInit[y_original >= censorLimits[2]] <- censorLimits[3]+1
    initsList = list(y=yInit)
    
    # fit model
    
    jagsModel <- jags.model("model.txt", data=dataList, n.chains=1, n.adapt=100)
    codaSamples <- coda.samples(jagsModel, variable.names='mu', n.iter=1000)    
    
    # (mu estimate should be 0)
    summary(codaSamples)
    
     
    • yInit[y_original >= censorLimits[2]] <- censorLimits[3]+1
      

      censorLimits has only 2 elements!

      yInit[y_original >= censorLimits[2]] <- censorLimits[2]+1
      

      Now it works!