Jonas Lindeløv - 2014-03-25

I'm building a hierarchical multivariate model for parameter estimation on a group of patients who have been given a battery of cognitive tests. We expect their performance on the various tests to correlate, and we model this using dmnorm for the residuals (each subject's deviation from the population mean). Everything works great as long as I model the observed data using dnorm around the underlying ability for that task. However, modeling the observed data as a probit-transformed binomial (observed ~ dbin(phi(ability), n)) severely impairs mixing of the full model, even though the ability on the probit scale is normal.

Here's a minimal version for a single binomial test on 25 subjects. Any insights and/or fixes would be much appreciated!

Prerequisites in R (note that n.sub must also be passed to JAGS along with the data below):

  set.seed(42)  # arbitrary seed, added so the simulated data are reproducible
  n.sub = 25
  rpm.obs = rbinom(n.sub, 18, 0.5)  # simulated scores: 25 subjects, 18 trials each
  zeros = c(0, 0)  # mean vector for the residuals
  eye = diag(2)  # scale matrix for the Wishart prior

The JAGS model:

  model {
    # Wishart prior on the residual precision matrix. With an identity scale
    # matrix and df = dim + 1, this corresponds to marginally uniform priors
    # on the Pearson correlations.
    covmat.inv ~ dwish(eye, 2 + 1)
    ability.group ~ dnorm(0, 0.001)  # vague prior on the group-level ability

    # Loop through subjects
    for(i in 1:n.sub) {
      # Two correlated normal residuals (only the first is used in this minimal version)
      resid[i, 1:2] ~ dmnorm(zeros, covmat.inv)

      # Subject's ability: group mean plus a (potentially) correlated residual
      ability[i, 1] <- ability.group + resid[i, 1]

      # Link to data
      rate[i] <- phi(ability[i, 1])  # probit transform: ability to success probability
      rpm.obs[i] ~ dbin(rate[i], 18)  # binomial likelihood, 18 trials per subject
    }
  }

Briefly, every score for each subject (rpm.obs) is modeled via an underlying ability for that task. This ability is the group-level mean (ability.group) plus a residual (resid[subject, task], capturing whether this subject is better or worse than average). The residuals are drawn from a multivariate normal distribution whose precision matrix has a Wishart prior (equivalently, an inverse-Wishart prior on the covariance matrix, corresponding to marginally uniform priors on the correlations).
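
For reference, here is roughly how I run it from R with rjags, assuming the model above is saved as model.txt (the chain and iteration counts are just illustrative):

  library(rjags)

  # Bundle the data; rpm.obs, zeros, eye, and n.sub were defined above
  data.list = list(rpm.obs = rpm.obs, zeros = zeros, eye = eye, n.sub = n.sub)

  # Compile and adapt the model, then draw samples
  jm = jags.model("model.txt", data = data.list, n.chains = 3)
  update(jm, 1000)  # burn-in
  samples = coda.samples(jm, variable.names = c("ability.group", "covmat.inv"),
                         n.iter = 10000)

  # Inspect mixing via traceplots and effective sample sizes
  plot(samples)
  effectiveSize(samples)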

Debugging:

  • I've tried the scaled inverse-Wishart distribution (multiplying the residuals by redundant scale parameters), as proposed in the book by Gelman & Hill. This should speed up convergence, but I find that it actually slows convergence down. Mixing is fine with pure-normal data, though, even with 8 correlated tests/residuals.
  • I've tried hierarchical centering, where ability.group is used as the mean in dmnorm instead of zeros, as proposed in the BUGS book. This improves mixing of the binomial part but severely worsens the already-bad mixing of the other residuals. (Sketches of both variants are below.)
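
For concreteness, here are minimal sketches of the two variants as I've implemented them (the names resid.raw, xi, and mu, and the dunif bound on the scale factors, are my own choices, not from the books):

  # Variant 1: scaled inverse-Wishart (Gelman & Hill).
  # Raw residuals keep the Wishart-precision prior; each dimension is then
  # multiplied by a redundant scale parameter xi[j].
  covmat.inv ~ dwish(eye, 2 + 1)
  for(j in 1:2) {
    xi[j] ~ dunif(0, 100)  # illustrative prior on the redundant scales
  }
  for(i in 1:n.sub) {
    resid.raw[i, 1:2] ~ dmnorm(zeros, covmat.inv)
    for(j in 1:2) {
      resid[i, j] <- xi[j] * resid.raw[i, j]
    }
  }

  # Variant 2: hierarchical centering.
  # The group mean moves inside dmnorm instead of being added afterwards.
  mu[1] <- ability.group
  mu[2] <- ability.group  # in the full model, each task has its own group mean
  for(i in 1:n.sub) {
    ability[i, 1:2] ~ dmnorm(mu[1:2], covmat.inv)
  }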

It is clearly the interaction between dmnorm and the binomial-probit likelihood that causes the problem. Using a univariate normal as the residual gives good mixing. Modeling the data using dnorm instead of dbin(phi()) gives good mixing (see the fragment below). Changing the Wishart prior to reflect low belief in any strong correlation (eye = diag(2) * 0.0001) also improves mixing.
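
For the dnorm control, I replace the likelihood lines in the subject loop with something like this (the precision value is arbitrary for the comparison):

  # Control: normal likelihood instead of binomial-probit (this variant mixes fine)
  rpm.obs[i] ~ dnorm(ability[i, 1], 1)  # precision of 1 chosen arbitrarily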

Overall, this is a big problem because the bad mixing "contaminates" everything that has the dmnorm as a parent. Thanks in advance for any help!

Jonas

Last edit: Jonas Lindeløv 2014-03-26