I am having some trouble using a derived sum of an estimated latent state (z) to further estimate the population-level probability of (z=1).

The general model layout is a logistic regression with observation error. Observed data, y, is an imperfect detection of latent state, z. I have a single categorical covariate, x1.

Model Specification:

for (i in 1:n){

y[i] ~ Binom(z[i], 1, p.obs)

z[i] ~ Bern(ilogit(X[i, ] %*% beta))

}

beta ~ N(0, 1.5)

p.obs ~ Beta(1, 1)

prob ~ Binom(sum(z), n, prob)

# Code: # Example Model for SourceForge # setwd('') # Simulate Data n = 100 # Single categorical covariate x1 = rbinom(n, 1, 0.2) X = model.matrix(~x1) # Transformations logit <- function(p) log(p/(1-p)) expit <- function(x) exp(x)/(exp(x)+1) # Logistic Regression Coefficients beta = c(-1, 1) # P(z=1) p = expit(X %*% beta) # Latent State z = rbinom(n, 1, p) # Data, with observation error y = rep(0, n) p.obs = .7 y[z==1] = rbinom(sum(z==1), 1, p.obs) data = list(y = y, X = X, n = n) inits = list(beta=c(0,0), p.obs = 0.5, prob=sum(y)/n) sink('JAGS.r') cat(" model{ beta[1] ~ dnorm(0, 1.5) beta[2] ~ dnorm(0, 1.5) p.obs ~ dbeta(1, 1) prob ~ dbeta(1, 1) for (i in 1:n){ p[i] <- ilogit(X[i, ] %*% beta) z[i] ~ dbern(p[i]) y[i] ~ dbern(p.obs) } sum(z) ~ dbin(n, prob) } ", fill = TRUE) sink() library(rjags) library(runjags) library(coda) n.adapt= 1000 n.update= 1000 n.iter= 10000 jm = jags.model("JAGS.R", data = data, inits = inits, n.chains = length(inits), n.adapt) update(jm, n.iter=n.update) cm = coda.samples(jm, variable.names = c('beta', 'p.obs', 'prob'), n.iter = n.iter) # End Code

I have had several ideas about this:

1) Treat the estimate of z as data within each iteration and simply model the P(z=1) as binomial. This is what I attempted to do above but it doesn't like to use sum() as a link function. A colleague says that he has run into this sort of problem many times because JAGS doesn't allow assigning a node more than once, eg:

count <- sum(z)

count ~ dbin(n, prob)

2) Simply take the mean of p. Would this be equivalent? eg:

prob <- mean(p)

The background is that z is disease status and we want population-level prevalence, P(z=1).

Any feedback is appreciated.

Thanks,

Nathan

ps. I will be traveling for the holidays so please don't be too annoyed if I miss any posts next week.

Last edit: Nathan Galloway 2013-12-19