Viktor Jonsson
2013-10-22
Hello!
I have a running a hierarchical model in JAGS. The purpose is to do group-wise comparisons on overdispersed count data, doing row-wise comparisons for a table of counts. The model flawlessly on smaller data sets, say 2000 rows of data. On larger datasets (10 000 rows in this case) it runs for a while and then crashes with the error:
Error: Error in node phi[1973,1]
Failure to calculate log density
The failure occurs at exactly the same node for five runs on different data (of same size). The model becomes quite large with "Graph Size: 520025" and I suspect I might be running into a memory issue. What is the limit on model size/memory usage? I can run the larger model succesfully if I reduce the number samples taken.
Reading up on the cause for the error message it would seem that it is related to NaNs or Infs coming up in the model. I do not see why these would occur at the same time each simulation and why it would depend on model size. Could it be that the likelihood gets too small in a large model and thus numerically unstable?
Currently running JAGS version 3.3.0, through rjags.
Best Regards
Viktor
Krzysztof Sakrejda
2013-10-22
No complete answer but a few comments:
"Could it be that the likelihood gets too small in a large model and thus numerically unstable?"
Yes, and this is more likely to occur with a larger data set. I'm a little surprised that it's so consistent with different data sets, maybe Martyn will comment on that? If you take a subsample of the data set(s) causing the problem and use them to get parameter estimates, do the estimates suggest a problem? Are they at some boundary? Do any parameter wander off?
Also, why not check if you run out of memory: what's your % RAM usage? do you see swapping before the error?
Martyn Plummer
2013-10-23
Perhaps you could post your model.
Viktor Jonsson
2013-10-23
Hello again, thanks for the responses.
I did rerun the model on the large data and monitored the parameters for the first 1000 iterations(it generally crashes after 1680 iterations). They exhibited no strange behaviour as far as I could tell. Note also that the data used is currently simulated from the model, it is randomly drawn each time though so it should not be a specific thing related to that node[1973,1] in the data.
There should be plenty of RAM to spare on the machine I am using, I was wondering if there might be some internal memory issue in JAGS. The memory usage slowly increases over time but if it indeed started to swap to drive that shouldn't cause JAGS to crash or?
The model used follows below. The N[j]s are considered fixed and input as data as is the group variable. The current data used has 10000 rows and 12 columns divided into two groups of 6 each. The model itself is a Poisson log normal GLM with a hierarchical gamma prior on the variance of the overdispersion parameter phi.
Best Regards
Viktor
Model:
data { D <- dim(Y) } model { for (i in 1:D[1]) { for (j in 1:group[1]){ Y[i,j] ~ dpois(exp(phi[i,j]+beta0[i]+log(N[j]))) phi[i,j] ~ dnorm(0, binTau[i]) } for (j in (group[1]+1):(group[1]+group[2])){ Y[i,j] ~ dpois(exp(phi[i,j]+beta0[i] +beta1[i]+log(N[j]))) phi[i,j] ~ dnorm(0, binTau[i]) } binVar[i] ~ dgamma(r,lambda) #Gamma distribution on variance binTau[i] <- 1/binVar[i] beta0[i] ~ dnorm(0,0.01) beta1[i] ~ dnorm(0,0.01) } r ~ dunif(0,50) lambda ~ dunif(2,1000) }
Krzysztof Sakrejda
2013-10-23
It could be a problem that r is getting estimated as very small, if binVar is getting estimated as very small... but that would show up as decreasing r/binVar values during the course of sampling... but this is shooting in the dark without a reproducible example.
Martyn Plummer
2013-10-24
I can't see anything in the model that would cause the log density calculation for phi
to fail. At this point we need a reproducible example. Obviously the data are too big to post here, but you say your data are simulated from the model. If you post an R program that reproducibly creates a data set that fails (i.e. calls set.seed
before generating the data) then I can run it through a debugger.
Viktor Jonsson
2013-10-25
Thanks for the responses. It would seem that the JAGS crashes regardless of which seed I set but for consistency I generated data according to the script below with "set.seed(1)". I used 1000 burn in steps, 1000 adaptation steps and attempted to take 2000 samples. JAGS crashed after ca 1640 samples were taken (can't give the exact iteration as it only gives me 64%).
I also include the inits and data for the other parameters. The script simply creates the necessary variables in the R workspace but I guess you can transfer them to the desired format.
Hope we can shed some light on this.
GenerateGammaDataSimple <- function(nrRows=10000,groups=c(6,6) ,meanMean=1500,alpha1=1.5,alpha2=0.1){ #Simplified version of GenerateGammaData nrSamples = sum(groups) Y <- matrix(data=NA, nrow=nrRows, ncol=nrSamples) #Creates a range of beta0 to act as baseline for each row baseMean <- log(rexp(nrRows,rate = 1/meanMean)) rePickIndex <- rbinom(nrRows,1,0.5)==1 baseMean[rePickIndex] <- runif(sum(rePickIndex),-1,log(meanMean)) for(i in 1:nrRows){ sig <- rgamma(1,shape = alpha1, scale = alpha2) phi <- rnorm(nrSamples,0,sqrt(sig)) Y[i,] <- rpois(nrSamples,exp(phi+baseMean[i])) } return(Y) } nrAdapt<-1000 nrBurnIn<-1000 nrSamples<-2000 set.seed(1) Y <- GenerateGammaDataSimple() groups <- c(6,6) nrColumns <- sum(groups) nrRows <- 10000 normalizingConstants <- rep(1,sum(groups)) #DATA FOR MCMC OBJECT dataL <- list(Y=Y,N = normalizingConstants,group=groups) #INITS FOR MCMC OBJECT beta0 <- rep(0,nrRows) beta1 <- rep(0,nrRows) phi <- matrix(rep(0,nrRows*nrColumns),nrRows,nrColumns) binVar <- rep(0.01,nrRows) initsL1 <- list(beta0 = beta0,phi=phi,beta1=beta1,r=2,lambda = 10)
Martyn Plummer
2013-10-30
This works for me. I'm sorry but there is nothing I can do if I can't reproduce the bug.
Viktor Jonsson
2013-11-07
I ran the same setup as above on another computer running windows (the first used redHat linux). I got the exact same error. To be extra clear I am using rjags to call JAGS directly from R.
Perhaps you could share how you ran the model so that I can reproduce the nonreproducability of the error. :)
Martyn Plummer
2013-11-12
I ran the script you sent, then added
library(rjags) m <- jags.model("viktor.bug", data=dataL, inits=initsL1) update(m, 10000)
Where "viktor.bug" is the model you posted above
Viktor Jonsson
2013-11-15
So I might have found the cause behind the crash. The glm module in JAGS. Not knowing whether it would help my model or not I included it since i read somewhere it in general made things faster. However it seems it actually does the opposite and actually cause JAGS to crash (if indeed you can reproduce the error now).
I have a follow up question on the glm module. I am not sure what it does except creating block samplers. Since I have generated quite a few results using it can I expect my results to change a lot if I remove it? Does convergence speed, autocorrelation for parameters etc change much with or without it?
Viktor Jonsson
2013-10-31
Still thanks a lot for the help. Should indicate that there is not something wrong with the model itself. I will setup JAGS and run the model on another rig and see if I can reproduce the error there. I will then get back to you with more detailed info.