Posterior predictive distribution in a changepoint problem

  • Avi

    Avi - 2013-07-05


    Firstly, Dr. Plummer, I have to add my thanks to the chorus for your creation and support of JAGS. Thank you!

    I am working on a changepoint problem, and I want to create predictive values for the distributions pre- and post-changepoint. More specifically, if I have N data points, I'd like to have an estimate for N+1. A simple form of the model is below:

    var N, X[N], CPP[N];
    model {
        for (j in 1:N) {
            a[j] <- ifelse(year < m, alpha.prior[1], alpha.prior[2])
            l[j] <- ifelse(year < m, lambda.prior[1], lambda.prior[2])
            X[j] ~ dgamma(alpha[j], lambda[j])
        }
        for (i in 1:2) {
            alpha.prior[i] ~ dgamma(0.5, 0.025)
            lambda.prior[i] <- pow(theta.prior[i], -1)
            theta.prior[i] ~ dgamma(0.5, 5.0E-4)
        }
        PreCP ~ dgamma(alpha.prior[1], lambda.prior[1])
        PostCP ~ dgamma(alpha.prior[2], lambda.prior[2])
        m ~ dcat(CPP)
    }

    My intention is that PreCP represents the distribution of the process prior to the changepoint and that PostCP represents the process subsequent to the changepoint.

    Am I correct in having PreCP and PostCP outside of the loop? Simply creating X.rep[j] ~ dgamma(alpha[j], lambda[j]) inside the loop will give predictions for each j. When doing so, the X.rep for the last j does not have the same distribution as PostCP---close, but not the same. For that matter, creating an X.rep2 ~ dgamma(alpha[2], lambda[2]), which gives close but non-identical distributions for each j, is also not the same as PostCP.

    Are they all (at least for the last element of j) really estimates of the same random variable, with the difference being process risk, or is there a fundamental difference I am missing?

    Once again, thank you very much.

  • Martyn Plummer

    Martyn Plummer - 2013-07-05

    I think there are a couple of small errors in your code. Firstly, the condition year < m is the same for all values of j. I think you want j < m. Secondly, you have defined variables a and l but are generating X from alpha and lambda. In short, I think the first loop should look like this.

    for (j in 1:N) {
        alpha[j] <- ifelse(j < m, alpha.prior[1], alpha.prior[2])
        lambda[j] <- ifelse(j < m, lambda.prior[1], lambda.prior[2])
        X[j] ~ dgamma(alpha[j], lambda[j])
    }

    NB This implies the changepoint must take place by time N. If you want to allow the possibility that the change never occurs (m > N) then you need CPP to be of length N+1 with CPP[N+1] > 0.
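    For example, if the prior over changepoint times is uniform (an illustrative choice; dcat normalises its argument, so a vector of 1's works), the data file in R dump format might look like this, with CPP of length N+1 so that "no change by time N" has positive probability (note the var declaration would then need CPP[N+1] rather than CPP[N]):

    # sketch of a data file in R dump format; values are illustrative
    "N" <- 10
    "CPP" <- c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)   # length N+1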

    Replication in a hierarchical model can take place at different levels. Which level is appropriate depends on what you want to do. If you define X.rep[j] ~ dgamma(alpha[j], lambda[j]) then X.rep[1:N] represents another possible realisation of the process, but conditional on having the same changepoint m as the observed data.
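    In model code, that data-level replicate is just (a sketch, reusing the corrected alpha and lambda from the first loop):

    for (j in 1:N) {
        X.rep[j] ~ dgamma(alpha[j], lambda[j])   # same alpha/lambda, hence same m
    }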

    If you want a replicate that does not depend on the same changepoint, then you need to replicate m, i.e.

    for (j in 1:N) {
        alpha.rep[j] <- ifelse(j < m.rep, alpha.prior[1], alpha.prior[2])
        lambda.rep[j] <- ifelse(j < m.rep, lambda.prior[1], lambda.prior[2])
        X.rep[j] ~ dgamma(alpha.rep[j], lambda.rep[j])
    }
    m.rep ~ dcat(CPP)

    You can also produce replicates conditional on a change point of your choosing by fixing m.rep instead of drawing it from its prior distribution. So m.rep <- 1 represents a process that changes at time 1, and m.rep <- N represents a process that does not change until time N.
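    Concretely, that means keeping the replicate loop above but replacing the stochastic node with a fixed value, e.g. (illustrative sketch):

    m.rep <- N   # fixed: a process that does not change until time N
    for (j in 1:N) {
        alpha.rep[j] <- ifelse(j < m.rep, alpha.prior[1], alpha.prior[2])
        lambda.rep[j] <- ifelse(j < m.rep, lambda.prior[1], lambda.prior[2])
        X.rep[j] ~ dgamma(alpha.rep[j], lambda.rep[j])
    }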

    I hope this helps.

  • Avi

    Avi - 2013-07-05

    Thank you very much. Yes, I changed the variable names in my actual model but missed some, as you pointed out. So the code in my original model is correct with respect to your first code block; my apologies for the unintended confusion.

    As for the changepoint not occurring, my understanding is that this would be equivalent to m = 1, since in that case a[j] would always trigger the second clause of the ifelse.

    My intent was to replicate X conditional on the parameters of the post-changepoint distribution "uncovered" by the Bayesian analysis. This, in my understanding, would be the best estimate for the "next j's" process, given that a change occurred. If it hadn't, then m = 1, version [2] of the parameters is always active, and so that distribution is "most predictive" for j = N + 1.

    I am not exactly trying to replicate the input distribution; I am trying to use the result of the analysis to predict the future. I believe that means that fixing m.rep at the mode of m and then generating X.rep from that fixed value will not do what I want.

    Would creating an (N+1)th element in CPP (the changepoint prior---a vector of 1's for now) and capturing X[N+1] be a better estimate? If so, what exactly am I measuring by looking at a distribution based on the [1] and [2] versions of the prior outside the loop?

    My apologies for what may seem like trivial questions, and thank you again for your time and expertise.

    Last edit: Avi 2013-07-05
  • Avi

    Avi - 2013-07-07

    For what it is worth, I compared creating an (N+1)th node with the results I got from just using the post-changepoint parameters outside of the loop, and the results are extremely close across various quantiles, so I'm comfortable---perhaps in ignorance---that the distribution I am getting is the one I want.

    Thank you very much again.

    Last edit: Avi 2013-07-07
  • Martyn Plummer

    Martyn Plummer - 2013-07-08

    OK, I did not fully understand what you wanted to do. You don't want to do a posterior predictive simulation of the whole of X, but to see what happens next at time N+1. In this case, PostCP gives you what you want, as the change has certainly occurred by this time.

    Equivalently, you can extend the first for loop to for(j in 1:(N+1)), so that alpha, lambda, and X are of length N+1. Append NA to the value of X in your input data and JAGS will simulate X[N+1] for you.
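    A minimal sketch of that change, assuming the corrected first loop from earlier in the thread (CPP must then also cover times 1 to N+1):

    for (j in 1:(N+1)) {
        alpha[j] <- ifelse(j < m, alpha.prior[1], alpha.prior[2])
        lambda[j] <- ifelse(j < m, lambda.prior[1], lambda.prior[2])
        X[j] ~ dgamma(alpha[j], lambda[j])
    }
    # data: supply X of length N+1 with X[N+1] set to NA;
    # JAGS treats the NA as unobserved and samples X[N+1],
    # so monitoring X[N+1] gives the posterior predictive draw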

