fe fnerd2
2014-05-09
Hi,
I am struggling with defining a JAGS model that deals with left and right censored data.
I have a model that takes into account that the data is either left or right censored using dinterval (below the case for right-censoring);
model {
for(i in 1:n) { isCensored[i] ~ dinterval(y[i], censorLimit[i]) y[i] ~ dnorm(y.hat[i], tau.y) y.hat[i] <- a[subject[i]] + b * predictor[i] } for(j in 1:groups) { a[j] ~ dnorm(mu.a, tau.a) a.hat[j] <- mu.a } mu.a ~ dnorm(0, .0001) tau.a <- pow(sigma.a, -2) sigma.a ~ dunif(0, 100) b ~ dnorm(0, .0001) tau.y <- pow(sigma.y, -2) sigma.y ~ dunif(0, 100)
}
But how can I specify that y is actually censored not only on the right but also on the left?
Thank you!
}
Krzysztof Sakrejda
2014-05-09
Hi fefnerd2,
x ~ dinterval(y,z) is generic---if z is a vector {0,1}, then
y <= 0 --> x = 0
0 < y <= 1 --> x = 1
1 < y --> x = 2
... the <= and < might be backwards but that's in the docs.
Krzysztof Sakrejda
2014-05-09
p.s.- fyi: real names are appreciated in lots of these stats/math forums, and you might get more help with your real name.
fe fnerd2
2014-05-09
Ok, thank you very much!
However what I do not understand is, that y[i] that is censored has to be recorded as NA, right?
So how can the dinterval() function determine whether a not recorded value is actually bound to be below or above a certain threshold?
I prepared a toy example in R generating normal distributed data that is censored both left and right;
How do I have to change dinterval so that it can identify a y-value is lying below or above the censoring thresholds?
require(rjags) require(coda) # model modelstring = " model { for ( i in 1:N ) { isCensored[i] ~ dinterval(y[i], censorLimit[i]) y[i] ~ dnorm(mu, tau) } tau <- 1/pow(sigma,2) sigma ~ dunif(0,100) mu ~ dnorm(0,1E-6) } " writeLines(modelstring,con="model.txt") # data generation y <- rnorm(n=1500, mean = 0, sd = 1) y_original <- y # censoring (left and right)! y[y < -0.5] <- -0.5 y[y > 2] <- 2 # set up variables N = length(y) # take care of right censoring censorLimit <- 2 isCensored <- y > censorLimit y[y > censorLimit] <- NA dataList = list(y = y , N = N, isCensored = as.numeric(isCensored), censorLimit = rep(censorLimit, length(y))) yInit = rep(NA, length(y)) yInit[isCensored] = censorLimit+1 initsList = list(y=yInit) # fit model jagsModel = jags.model("model.txt", data=dataList, inits=initsList, n.chains=3, n.adapt=100) codaSamples <- coda.samples(jagsModel, variable.names='mu', n.iter=1000) # (mu estimate should be 0) summary(codaSamples)
because of not taking into account the left-censoring, the mean is wrongly estimated
Krzysztof Sakrejda
2014-05-09
From your code it looks like you actually KNOW if it's censored below or above so... why not give it that information through "isCensored" (0: censored below, 1: not censored, 2: censored above)? Then leave the y's as NA rather than replacing them as you do.
Krzysztof Sakrejda
2014-05-09
Some follow-up here:
https://sourceforge.net/p/mcmc-jags/discussion/610036/thread/e5633ad6/#47f8
fe fnerd2
2014-05-11
Again, thank you very much for your reply!
From your code it looks like you actually KNOW if it's censored below or above so... why not give it that information through "isCensored" (0: censored below, 1: not censored, 2: censored above)? Then leave the y's as NA rather than replacing them as you do.
So how exactly would you do this for the example above?
With two dinterval statements?
I am sorry, I am fairly new to JAGS and everything I tried so far only resulted in error messages or strange fits.
Thanks!
Krzysztof Sakrejda
2014-05-11
Just one dinterval statement, but the second argument to dinterval can be a vector (which then maps the reals to more than two values)... then you can pass the info in through isCensored giving values for censored below/not censored/censoreda above, rather than just censored/not censored./
fe fnerd2
2014-05-12
sorry to bother again.
then you can pass the info in through isCensored giving values for censored below/not censored/censoreda above, rather than just censored/not censored./
What values do you have to give for below/not/above?
I tried the following (among others ;) but can not parse the model.
require(rjags) require(coda) # model modelstring = " model { for ( i in 1:N ) { isCensored[i] ~ dinterval(y[i], censorLimits) y[i] ~ dnorm(mu, tau) } tau <- 1/pow(sigma,2) sigma ~ dunif(0,100) mu ~ dnorm(0,1E-6) } " writeLines(modelstring,con="model.txt") # data generation y <- rnorm(n=1500, mean = 0, sd = 1) y_original <- y # censoring (left and right)! y[y < -0.5] <- -0.5 y[y > 2] <- 2 # set up variables N = length(y) # take care of censoring censorLimits <- c(-0.5, 0, 2) isCensored <- ifelse(y <= censorLimits[1], -0.5, ifelse(y >= censorLimits[3], 2, 0)) y[y <= censorLimits[1] | y >= censorLimits[3]] <- NA dataList = list(y = y , N = N, isCensored = isCensored, censorLimits = censorLimits) yInit = rep(NA, length(y)) yInit[y_original <= censorLimits[1]] <- censorLimits[1]-1 yInit[y_original >= censorLimits[2]] <- censorLimits[3]+1 initsList = list(y=yInit) # fit model jagsModel = jags.model("model.txt", data=dataList, inits=initsList, n.chains=3, n.adapt=100) codaSamples <- coda.samples(jagsModel, variable.names='mu', n.iter=1000) # (mu estimate should be 0) summary(codaSamples)
Krzysztof Sakrejda
2014-05-12
isCensored <- ifelse(y <= censorLimits[1], -0.5, ifelse(y >= censorLimits[3], 2, 0))
So here you give isCensored the values:
censored below -> -0.5
not censored -> 0
censored above -> 2
... but dinterval maps the real numbers to integers... so maybe using 0,1,2 would work better (that matches with the censorLimits you provide). The documentation on dinterval is complete if you stare at it for long enough.
This:
# censoring (left and right)! y[y < -0.5] <- -0.5 y[y > 2] <- 2
.... is unnecessary but I don't remember if it's harmful. I think it is harmful. While you might in reality record -0.5 or 2, what you really know is that the values were truncated.
fe fnerd2
2014-05-13
Thanks again (also for your patience).
The documentation on dinterval is complete if you stare at it for long enough.
ok, I will probably have to stare a little bit longer ;)
I did try to change the script as you suggested.
Now I get the error message "Error in node isCensored[3]
Observed node inconsistent with unobserved parents at initialization", when trying to initialize the model.
(isCensored[3] is equal to 1, and y[3] equal to 1.6).
Here is the revised sample code;
require(rjags) require(coda) # model modelstring = " model { for ( i in 1:N ) { isCensored[i] ~ dinterval(y[i], censorLimits) y[i] ~ dnorm(mu, tau) } tau <- 1/pow(sigma,2) sigma ~ dunif(0,100) mu ~ dnorm(0,1E-6) } " writeLines(modelstring,con="model.txt") # data generation set.seed(123) y <- rnorm(n=1500, mean = 0, sd = 1) y_original <- y # censoring (left and right)! y[y < -0.5] <- NA y[y > 2] <- NA # set up variables N = length(y) # take care of censoring censorLimits <- c(0,1,2) isCensored <- ifelse(y_original <= censorLimits[1], 0, ifelse(y_original >= censorLimits[3], 2, 1)) dataList = list(y = y, N = N, isCensored = isCensored, censorLimits = censorLimits) yInit = rep(NA, length(y)) yInit[y_original <= censorLimits[1]] <- censorLimits[1]-1 yInit[y_original >= censorLimits[2]] <- censorLimits[3]+1 initsList = list(y=yInit) # fit model jagsModel <- jags.model("model.txt", data=dataList, n.chains=3, n.adapt=100) codaSamples <- coda.samples(jagsModel, variable.names='mu', n.iter=1000) # (mu estimate should be 0) summary(codaSamples)
Krzysztof Sakrejda
2014-05-13
You are really close but 'censorLimits' is not quite right---the second argument to dinterval gives the breakspoints. The mapping you have now is:
censorLimits <- c(0,1,2)
Which means (well, there's a <= in here somewhere...):
y < 0 -> 0 0 < y < 1 -> 1 1 < y < 2 -> 2 2 < y -> 3
So y[3] = 1.6 -> 2, which is inconsistent with isCensored[3] = 1 since 1 != 2 which gives you the error. You want censorLimits to indicate where truncation happens (-0.5 and 2 instead of 0, 1, and 2).
fe fnerd2
2014-05-13
Ok, now I changed censorLimits to a vector (c(-0.5, 2)) coding for the breaks.
isCensored codes for censoring (0 = censored below, 1 = not censored, 2 = censored above).
y is NA if censored below or above.
The model statement is;
isCensored[i] ~ dinterval(y[i], censorLimits)
(again, where censorLimits is a vector with two elements)
Still, the same error messages are thrown at me.
What am I missing?
here is the complete code;
require(rjags) require(coda) # model modelstring = " model { for ( i in 1:N ) { isCensored[i] ~ dinterval(y[i], censorLimits) y[i] ~ dnorm(mu, tau) } tau <- 1/pow(sigma,2) sigma ~ dunif(0,100) mu ~ dnorm(0,1E-6) } " writeLines(modelstring,con="model.txt") # data generation set.seed(123) y <- rnorm(n=1500, mean = 0, sd = 1) y_original <- y # censoring (left and right)! y[y < -0.5] <- NA y[y > 2] <- NA # set up variables N = length(y) # take care of censoring censorLimits <- c(-0.5, 2) isCensored <- ifelse(y_original <= censorLimits[1], 0, ifelse(y_original >= censorLimits[2], 2, 1)) dataList = list(y = y, N = N, isCensored = isCensored, censorLimits = censorLimits) yInit = rep(NA, length(y)) yInit[y_original <= censorLimits[1]] <- censorLimits[1]-1 yInit[y_original >= censorLimits[2]] <- censorLimits[3]+1 initsList = list(y=yInit) # fit model jagsModel <- jags.model("model.txt", data=dataList, n.chains=1, n.adapt=100) codaSamples <- coda.samples(jagsModel, variable.names='mu', n.iter=1000) # (mu estimate should be 0) summary(codaSamples)
Krzysztof Sakrejda
2014-05-13
yInit[y_original >= censorLimits[2]] <- censorLimits[3]+1
censorLimits has only 2 elements!
yInit[y_original >= censorLimits[2]] <- censorLimits[2]+1
Now it works!