outliers in jags?

2013-07-09
2013-07-10
  • Charlotte Boyd

    Charlotte Boyd - 2013-07-09

    Hi

    I’ve been running similar state-space models on various dataset in jags from R. In several cases, I’ve noticed apparent spikes in sampled parameters near the beginning of the saved chain. Changing the burn-in, thinning rate, random seed, computer does not change this pattern, and the specific value of the outlier is sometimes exactly the same from one run to the next. The problem is most apparent in variance parameters, although presumably other parameters are also affected. Has anyone come across this before, or understand what is going on? I’ve posted some code to demonstrate the problem below – in this case, traceplots for sigmaR always start with an exceptionally high value.

    Many thanks

    Charlotte

    Libraries

    library(R2jags)
    library(rjags)
    library(runjags)
    library(coda)

    Simulated data

    Y <- c(11.2352171, 10.8936651, 11.6131174, 12.4798089, 13.9641126, 13.3132187, 12.4915249, 12.6521071, 9.9450315, 10.3393222)
    N <- length(Y)

    Model

    modelFile = cat(
    "
    model {
    # PRIORS
    x[1] ~ dnorm(6,0.01); # initial state

    U ~ dnorm(0,1); # long-term rate of change
    
    tauQ ~ dgamma(0.01,0.01); # process error
    sigmaQ <- 1/sqrt(tauQ); # convert from precision to sd
    
    tauR ~ dgamma(0.01,0.01); # measurement error
    sigmaR <- 1/sqrt(tauR); # convert from precision to sd
    
    # PROCESS MODEL
        for(t in 2:N) {
            predx[t] <- x[t-1] + U;
            x[t] ~ dnorm(predx[t],tauQ);
        }
    
    # LIKELIHOOD
        for(t in 1:N) {Y[t] ~ dnorm(x[t], tauR);}
    

    }
    "
    , file = "example.txt")

    Specify the parameters / data list / model file name

    jags.params = c("U","sigmaQ","sigmaR")
    jags.data = list("Y","N")
    model.location = paste("example.txt",sep="")

    Set MCMC parameters

    set.seed(125)
    mcmcchains = 3
    mcmcthin = 10
    mcmcburn=1000000
    samples2Save=1000*mcmcthin

    testrun = jags(jags.data, inits = NULL, parameters.to.save= jags.params,
    model.file=model.location, n.chains = mcmcchains, n.thin = mcmcthin,
    n.burnin=mcmcburn, n.iter =(mcmcburn+samples2Save), DIC = FALSE)

    -------------------------------------------------------------------------------

    Trace plots

    Built-in function

    traceplot(testrun)

    Alternative trace plot

    outs <- testrun$BUGSoutput$sims.array
    labpar<-c("U", "sigmaQ", "sigmaR")

    par(mfrow=c(3,3))
    xx <- seq(1,1000,1)
    for(jj in 1:3) for(ii in 1:3) plot(xx, outs[,jj,ii], xlab="cycle number", ylab=labpar[ii], type='b',pch=16, xlim=c(1,50)) # despite burn-in of 1000000, all traceplots for sigmaQ show high values at the beginning of the chain

     
  • Martyn Plummer

    Martyn Plummer - 2013-07-10

    Thanks for sending a fully worked example. This really helps to find the source of the problem.

    In this case, it is a bug in the R2jags package. Although you are specifying a long burn-in, it is not being passed on to JAGS. So the outlier you are seeing at the beginning of the chain is the starting values.

    Note that I did not write the R2jags package, so I am not responsible for this, but I have sent a bug report to the maintainer.

     
    • Charlotte Boyd

      Charlotte Boyd - 2013-07-10

      Hi Martyn

      Many thanks for checking this. It's good to know that's all it is, and there's a very simple fix until the bug is addressed.

      Many thanks also for all your work on JAGS - I'm starting to use it for all my bayesian analyses.

      Charlotte

       

Log in to post a comment.

Get latest updates about Open Source Projects, Conferences and News.

Sign up for the SourceForge newsletter:

JavaScript is required for this form.





No, thanks