## Differences in estimated coefficients using JAGS, WinBUGS and the glm function

• Jérôme Guélat - 2014-04-17

Hi,

I was trying to give a really basic example during a lecture but I found some surprising differences between estimates provided by JAGS, WinBUGS and the glm function in R. WinBUGS and the glm function give the same results, but JAGS gives different estimates.

Here's the example with a simple Poisson regression:

```r
set.seed(24)
library(rjags)
load.module("glm")   # note: the discrepancy described below only occurs when this module is loaded

# Generate count data using 2 covariates
nsites <- 200
alpha0 <- 1
alpha1 <- 2
alpha2 <- 3

A <- runif(n = nsites, -1, 1)   # Site covariate 1
B <- runif(n = nsites, -1, 1)   # Site covariate 2

lam <- exp(alpha0 + alpha1*A + alpha2*B)
y <- rpois(n = nsites, lambda = lam)

# Analyze with the glm function (without covariate B)
m <- glm(y ~ A, family = poisson)

# Define the same model in the BUGS language
cat(file = "Poisson_GLM.txt", "
model {
  # Priors
  alpha0 ~ dnorm(0, 0.0001)
  alpha1 ~ dnorm(0, 0.0001)
  # Likelihood
  for (i in 1:R){
    y[i] ~ dpois(lambda[i])
    log(lambda[i]) <- alpha0 + alpha1 * A[i]
  }
}
")

# Bundle data
jags.data <- list(y = y, R = nsites, A = A)
# Initial values
inits <- function() list(alpha0 = rnorm(1, 0, 10), alpha1 = rnorm(1, 0, 10))
# Parameters monitored
params <- c("alpha0", "alpha1")
# MCMC settings
ni <- 10000   ;   nt <- 1   ;   nb <- 2000   ;  nc <- 3

# Create a JAGS model object
fit <- jags.model(file = "Poisson_GLM.txt", data = jags.data, inits = inits, n.chains = nc)
# Burnin
update(fit, nb)
# Generate posterior samples
samples <- coda.samples(model = fit, variable.names = params, n.iter = ni - nb, thin = nt)
# Convergence looks good
plot(samples)

# Compute posterior summaries
summfit <- summary(samples)

# Compare estimates of the glm function with estimates of JAGS (not similar)
m$coef
summfit$statistics[,"Mean"]

# Fit the model in WinBUGS
library(R2WinBUGS)
out <- bugs(data = jags.data, parameters.to.save = params, model.file = "Poisson_GLM.txt", inits = inits, n.iter = ni, n.chains = nc, n.burnin = nb, n.thin = nt, bugs.dir = "C:/Program Files (x86)/WinBUGS14")
# Check the estimates (similar to the glm function)
out$mean[1:2]
```

As you can see, the JAGS estimates seem to be wrong. If I use uniform priors for alpha0 and alpha1 (e.g. dunif(-1000, 1000)), then I get the correct estimates. It looks like the normal priors are influencing the results, but I can't believe that priors with such low precision provide that much information. I also tried different initial values, but that didn't help.
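For reference, the uniform-prior variant that gives me the correct estimates differs from the model above only in the two prior lines:

```
model {
  # Priors
  alpha0 ~ dunif(-1000, 1000)
  alpha1 ~ dunif(-1000, 1000)
  # Likelihood
  for (i in 1:R){
    y[i] ~ dpois(lambda[i])
    log(lambda[i]) <- alpha0 + alpha1 * A[i]
  }
}
```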

I'm now using R 3.0.3 with JAGS 3.4.0 on a Windows computer (but I also tried with JAGS 3.3.0 and 3.4.0 on a Mac).

Can someone explain what's happening here? Thanks in advance!

Jérôme

• Martyn Plummer - 2014-04-17

Oh dear. You have found a bug in the auxiliary mixture sampler from the glm module (It is not by chance that the glm module is not loaded by default). I'm surprised that it fails on such a simple example. I'll look into it and let you know.

• Daniel Hocking - 2014-04-17

Interesting that it works perfectly when the glm module is not loaded. When is the glm module needed?

Daniel Hocking
djhocking@gmail.com

http://danieljhocking.wordpress.com/


• Jérôme Guélat - 2014-04-23

Thanks a lot! I almost always load the glm module, I guess I should be more careful ;)

• Martyn Plummer - 2014-04-25

I've been scratching my head over this all week trying to find the bug. I believe that it isn't a bug as such. The code is an accurate transcription of the algorithm but there is a problem with the algorithm itself.

The key to your example is that it is an ill-fitting model. Because the second covariate is missing, the data are highly over-dispersed compared to the Poisson model we are trying to fit. The auxiliary mixture sampling algorithm being used here relies on an implicit "good model" assumption which does not hold in this case. Briefly, the algorithm introduces some continuous latent variables and then approximates them by a finite mixture of normals. In your example, we are way out in the tail of these latent variables where the normal mixture approximation is rather poor.
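The over-dispersion described here is easy to verify from the glm fit alone. A base-R sketch (reusing the simulation settings from the original post) compares the Pearson chi-square statistic to its residual degrees of freedom; a ratio near 1 would indicate an adequately fitting Poisson model:

```r
# Reproduce the simulated data from the original post and quantify
# how over-dispersed the model without covariate B really is.
set.seed(24)
nsites <- 200
A <- runif(nsites, -1, 1)
B <- runif(nsites, -1, 1)
lam <- exp(1 + 2 * A + 3 * B)
y <- rpois(nsites, lam)

m <- glm(y ~ A, family = poisson)   # covariate B deliberately omitted

# Dispersion statistic: Pearson chi-square / residual degrees of freedom.
disp <- sum(residuals(m, type = "pearson")^2) / m$df.residual
disp   # far above 1, i.e. strong over-dispersion
```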

I believe that the problem can be overcome by adding a Metropolis-Hastings acceptance step that accounts for the difference between the true distribution and its normal mixture approximation.

• Daniel Hocking - 2014-04-28

So what is happening differently when the glm module is not loaded because then the coefficient estimates in JAGS are correct?

> I believe that the problem can be overcome by adding a Metropolis-Hastings acceptance step that accounts for the difference between the true distribution and its normal mixture approximation.

Is this something that needs to be added to JAGS underlying code or something that can be adjusted when calling from rjags?

Thanks,
Dan



• Martyn Plummer - 2014-04-28

Loading the glm module makes sampling methods available that are specific to generalized linear models. They make sampling more efficient by sampling all variables in the linear predictor together in a single block. The sampling algorithm in question was published by Sylvia Frühwirth-Schnatter and colleagues in 2009. Clearly it does not work on Jérôme's model. The M-H step must be done within JAGS. I will let you know how I get on.
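On the rjags side of Daniel's question: the choice of sampler is made per session rather than per model, by loading or unloading the module before the model is compiled. A minimal sketch, guarded so it only runs where the rjags package (and a JAGS installation) is available:

```r
# When the glm module is loaded, JAGS may use the block (auxiliary mixture)
# samplers for GLM nodes; when it is not, JAGS falls back to its default
# single-site samplers, which behave correctly on this example.
if (requireNamespace("rjags", quietly = TRUE)) {
  library(rjags)
  list.modules()        # shows which modules are currently loaded
  load.module("glm")    # enable the GLM block samplers
  unload.module("glm")  # disable them again before compiling the model
}
```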

• Jérôme Guélat - 2014-04-29

Thanks a lot for the information and for the time you spent trying to find a bug!

I'm an ecologist trying to do statistics and the fact that some MCMC sampling algorithms have an implicit "good model" assumption is new to me. I'm glad to have learned that and I will be (even) more careful with my models now!

• Martyn Plummer - 2014-05-05

To be honest, I didn't realise there was a "good model" assumption either, but your example proves it.