Testing Multiple Hypotheses Simultaneously

  • Jun Xu

    Dear JAGSers,

    This question is more related to Bayesian hypothesis testing than to JAGS per se. I have some basics of Bayesian data analysis (and of course I use JAGS :). I've been told that one great thing about Bayesian analysis is that I don't need to worry about making any adjustment for simultaneous hypothesis testing. With the traditional approach, we need corrections like Bonferroni (probably the most conservative) or Tukey. But in Bayes, we simply look at the posterior. Then I got a bit confused when people say that in Bayes we need to set up a hierarchical model to do that (for genomics data, which usually involve a large number of hypotheses). Here is an example:

    y = B_0 + B_1 * X_1 + B_2 * X_2 + B_3 * X_3 + e

    To test the following simultaneously:
    H1: B1 = B2
    H2: B1 = B3
    H3: B2 = B3

    I could simply program it in the model statement in JAGS:
    D1 <- B1-B2
    D2 <- B1-B3
    D3 <- B2-B3

    and look at whether the 95% credible intervals of the posterior distributions of D1, D2, and D3 contain zero. That should suffice, since the posterior is this one simulated multivariate distribution, not a sample. So no matter how we compare or look at it, there is only this one distribution?

    Or, could I just compute D1, D2, and D3 from the posterior draws of B1, B2, and B3? Any good references on this? Thanks a lot!

    Jun Xu, PhD
    Associate Professor
    Department of Sociology
    Ball State University
    Muncie, IN

  • Matt Denwood

    There is no difference between calculating B1-B2 in JAGS vs taking the posterior for B2 away from the posterior for B1 in the distributions returned by JAGS, except that the latter is more computationally efficient. So you are free to do it whichever way you like, and to make as many calculations from your posterior as you like, because a deterministic function of the posterior is still a valid part of the full posterior of the model.
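    To make that concrete, here is a sketch in Python (rather than JAGS/R), with `b1` and `b2` standing in as hypothetical arrays of the monitored posterior draws JAGS would return. The post-hoc version is just elementwise subtraction of matched draws:

    ```python
    import numpy as np

    # Hypothetical posterior draws for B1 and B2, as JAGS might return them
    # (one value per retained MCMC iteration, in matching order).
    rng = np.random.default_rng(0)
    b1 = rng.normal(1.0, 0.5, 5000)
    b2 = rng.normal(0.8, 0.5, 5000)

    # Post-hoc difference: equivalent to monitoring D1 <- B1 - B2 inside
    # the model, because it is a deterministic function of matched draws.
    d1 = b1 - b2

    # 95% credible interval; check whether it contains zero.
    lo, hi = np.percentile(d1, [2.5, 97.5])
    print(f"95% CrI for D1 = B1 - B2: ({lo:.2f}, {hi:.2f})")
    print("contains zero:", lo <= 0 <= hi)
    ```

    One caveat: the draws must be paired by iteration. Subtracting shuffled or independently resampled draws would destroy the posterior correlation between B1 and B2 and give the wrong interval for the difference.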

    However, remember that the 95% credible interval of a difference between two samples from the same population will not contain zero 5% of the time - i.e. if you make 20 independent comparisons you should expect one 'false positive' result, in exactly the same way as with any statistical test. So I'm not sure exactly what you have been told, but a Bayesian analysis is not a magic bullet I'm afraid.
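    A quick simulation illustrates this (a sketch in Python with a made-up normal posterior, not real MCMC output): under a true null, the 95% credible interval for a difference excludes zero about 5% of the time per comparison.

    ```python
    import numpy as np

    rng = np.random.default_rng(42)

    n_reps = 4000    # simulated experiments where B1 == B2 in truth
    n_draws = 2000   # posterior draws per experiment
    se = 1.0         # assumed posterior standard deviation of the difference

    false_pos = 0
    for _ in range(n_reps):
        # Under the null, the estimated difference scatters around zero...
        d_hat = rng.normal(0.0, se)
        # ...and the posterior for D = B1 - B2 is centred on that estimate.
        draws = rng.normal(d_hat, se, n_draws)
        lo, hi = np.percentile(draws, [2.5, 97.5])
        if lo > 0 or hi < 0:
            false_pos += 1

    rate = false_pos / n_reps
    print(f"per-comparison 'false positive' rate: {rate:.3f}")  # close to 0.05
    ```

    With three comparisons, as in the example above, the chance of at least one interval spuriously excluding zero is correspondingly higher than 5%.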

    Also, be aware that there is a difference (either subtle or fundamental, depending on your point of view) between the Bayesian 95% credible interval of a difference and a classical statistical test of a null vs alternative hypothesis. Although it is often done, it is incorrect to refer to the posterior of a difference as 'significant'. But that is a different topic of discussion :)