'Matt trick' in JAGS

Adam Slez
2013-06-09
2013-06-11
  • Adam Slez

    Adam Slez - 2013-06-09

    Hi All,

    Over on the Stan list there has been some discussion of speeding up models through the use of what they refer to as the 'Matt trick.' I have been trying to implement this in JAGS, but I keep getting strange results. Just to be clear up front, I'm working with a logistic-binomial model, which is why the priors look a little different from what we might expect for a model with a continuous outcome.

    Let's start with a simple random intercept model with a random intercept term u defined as follows:

    for(g in 1:G){
        u[g] ~ dnorm(u0,sigma2inv)    # centered: u drawn directly around u0
    }
    u0 ~ dnorm(0,.01)                 # vague prior on the mean of the random intercepts
    sigma ~ dunif(0,10)               # uniform prior on the random-effect sd
    sigma2inv <- pow(sigma,-2)        # dnorm() takes a precision, not an sd
    

    This is a pretty standard way of writing out the model. Assuming that I am correctly translating from Stan to JAGS, the Matt trick refers to the idea of rewriting the model above like so:

    for(g in 1:G){
        u[g] <- u0 + e_u[g]*sigma     # non-centered: shift and scale a standard normal
        e_u[g] ~ dnorm(0,1)
    }
    u0 ~ dnorm(0,.01)
    sigma ~ dunif(0,10)
    

    Even if my code here is wrong, you should still be able to get a sense of what I'm trying to do. As I understand it, the idea behind this parameterization is to make sampling more efficient by breaking up the correlation between e_u on the one hand and u0 and sigma on the other.
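
    If I have the algebra right, the two versions imply exactly the same marginal prior on u; all that changes is the geometry the sampler sees:

        u[g] = u0 + e_u[g]*sigma,  e_u[g] ~ N(0,1)   =>   u[g] ~ N(u0, sigma^2)

    which is the same distribution as u[g] ~ dnorm(u0,sigma2inv) with sigma2inv <- pow(sigma,-2), keeping in mind that dnorm() takes a precision rather than a standard deviation.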

    When I run the model in JAGS using the above parameterization, I find that while I get good estimates of u0 and bx, I get totally off-the-wall estimates for sigma. I'm not sure whether it is a coding error on my part or whether this trick doesn't carry over to JAGS for some reason. I'm using the glm module. Is it possible that the glm-specific routine might not play well with the Matt trick?

    My full model is as follows:

    model{
        for(i in 1:N){
            xb[i] <- bx*x[i] + u[group[i]]    # linear predictor
            logit(p[i]) <- xb[i]
            y[i] ~ dbin(p[i],trials[i])       # binomial likelihood
        }
        for(g in 1:G){
            u[g] <- u0 + e_u[g]*sigma         # non-centered random intercepts
            e_u[g] ~ dnorm(0,1)
        }
        u0 ~ dnorm(0,.01)
        sigma ~ dunif(0,10)
        bx ~ dnorm(0,.01)
    }
    

    Any thoughts would be much appreciated.

    Adam

     
  • Martyn Plummer

    Martyn Plummer - 2013-06-11

    Your example is not reproducible. I need to see your data (N, group, x, y) to run the model, evaluate its performance and, if necessary, debug it. Fake data will do fine, as long as it reproduces your problem.
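
    For instance, a few lines of R along these lines would be enough to make the example self-contained (the group sizes and 'true' parameter values below are arbitrary, and "ri_logit.bug" just stands for wherever you have saved the model):

    set.seed(1)
    G <- 20                              # number of groups
    n.per <- 30                          # observations per group
    N <- G*n.per
    group <- rep(1:G, each=n.per)
    trials <- rep(10, N)                 # binomial denominators
    x <- rnorm(N)
    u0 <- -1; sigma <- 0.5; bx <- 1      # arbitrary "true" values
    u <- rnorm(G, u0, sigma)             # simulate the random intercepts
    p <- plogis(bx*x + u[group])         # inverse logit of the linear predictor
    y <- rbinom(N, trials, p)

    library(rjags)
    load.module("glm")
    m <- jags.model("ri_logit.bug",
                    data=list(y=y, x=x, group=group, trials=trials, N=N, G=G),
                    n.chains=2)
    update(m, 1000)                      # burn-in
    samp <- coda.samples(m, c("u0","sigma","bx"), n.iter=5000)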

    More generally, the two parameterizations you show are referred to by Yu and Meng (http://dx.doi.org/10.1198/jcgs.2011.203main) as the sufficient parameterization and the ancillary parameterization. In the first, u[1:G] is sufficient for sigma; in the second, e_u[1:G] is marginally independent of sigma (so ancillary, from the frequentist viewpoint). Neither parameterization is inherently superior, and the worst-case performance of either one can be bad. Yu and Meng propose an interweaving strategy that combines the best performance of both. Andrew Gelman's folded half-t prior for sigma has a redundant parameterization that factorizes along the lines of the ancillary/sufficient dichotomy.

    These are ideas that I want to look into for a future version of JAGS, but they are not currently implemented.

     
