"Failure to calculate log density", lack of memory?

2013-10-22
2013-11-15
  • Viktor Jonsson
    2013-10-22

    Hello!

    I am running a hierarchical model in JAGS. The purpose is to do group-wise comparisons on overdispersed count data, doing row-wise comparisons for a table of counts. The model runs flawlessly on smaller data sets, say 2,000 rows of data. On larger data sets (10,000 rows in this case) it runs for a while and then crashes with the error:

    Error: Error in node phi[1973,1]
    Failure to calculate log density

    The failure occurs at exactly the same node for five runs on different data sets (of the same size). The model becomes quite large ("Graph Size: 520025"), and I suspect I might be running into a memory issue. What is the limit on model size/memory usage? I can run the larger model successfully if I reduce the number of samples taken.

    Reading up on the cause of the error message, it seems to be related to NaNs or Infs coming up in the model. I do not see why these would occur at the same point in each simulation, or why it would depend on model size. Could it be that the likelihood gets too small in a large model and thus becomes numerically unstable?

    Currently running JAGS version 3.3.0, through rjags.

    Best Regards
    Viktor

     
    • No complete answer but a few comments:

      "Could it be that the likelihood gets too small in a large model and thus numerically unstable?"

      Yes, and this is more likely to occur with a larger data set. I'm a little surprised that it's so consistent across different data sets; maybe Martyn will comment on that? If you take a subsample of the data set(s) causing the problem and use them to get parameter estimates, do the estimates suggest a problem? Are they at some boundary? Do any parameters wander off?
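
      For instance (a rough sketch, assuming the counts sit in a matrix Y and that N and group are defined as in your model):

      idx <- sample(nrow(Y), 2000)                 # random subsample of rows
      dataSub <- list(Y = Y[idx, ], N = N, group = group)
      # refit with dataSub and check whether any estimates sit at a prior
      # boundary, e.g. r near 0 or 50, lambda near 2 or 1000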

      Also, why not check whether you are running out of memory: what is your RAM usage (%)? Do you see swapping before the error?
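
      On Linux you could snapshot the R process's resident memory around an update, for instance (a rough sketch; m stands for the compiled jags.model object):

      rss_kb <- function()
        as.numeric(system(paste("ps -o rss= -p", Sys.getpid()), intern = TRUE))
      before <- rss_kb()
      update(m, 500)                               # m: compiled jags.model
      cat("RSS grew by", (rss_kb() - before)/1024, "MB\n")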

       
  • Martyn Plummer
    2013-10-23

    Perhaps you could post your model.

     
  • Viktor Jonsson
    2013-10-23

    Hello again, thanks for the responses.

    I reran the model on the large data set and monitored the parameters for the first 1000 iterations (it generally crashes after 1680 iterations). They exhibited no strange behaviour as far as I could tell. Note also that the data used are currently simulated from the model; they are randomly drawn each time, though, so the problem should not be specific to node phi[1973,1] in the data.

    There should be plenty of RAM to spare on the machine I am using, but I was wondering if there might be some internal memory issue in JAGS. The memory usage slowly increases over time, but even if it did start to swap to disk, that shouldn't cause JAGS to crash, should it?

    The model used follows below. The N[j]s are considered fixed and input as data, as is the group variable. The current data set has 10,000 rows and 12 columns, divided into two groups of 6 each. The model itself is a Poisson log-normal GLM with a hierarchical gamma prior on the variance of the overdispersion parameter phi.

    Best Regards
    Viktor

    Model:

    data {
      D <- dim(Y)
    }
    model {
      for (i in 1:D[1]) {
        # group 1 columns: baseline beta0 only
        for (j in 1:group[1]) {
          Y[i,j] ~ dpois(exp(phi[i,j] + beta0[i] + log(N[j])))
          phi[i,j] ~ dnorm(0, binTau[i])
        }
        # group 2 columns: baseline plus group effect beta1
        for (j in (group[1]+1):(group[1]+group[2])) {
          Y[i,j] ~ dpois(exp(phi[i,j] + beta0[i] + beta1[i] + log(N[j])))
          phi[i,j] ~ dnorm(0, binTau[i])
        }

        binVar[i] ~ dgamma(r, lambda)   # gamma distribution on the variance
        binTau[i] <- 1/binVar[i]
        beta0[i] ~ dnorm(0, 0.01)
        beta1[i] ~ dnorm(0, 0.01)
      }

      r ~ dunif(0, 50)
      lambda ~ dunif(2, 1000)
    }
    
     
  • It could be a problem that r is being estimated as very small, if binVar is being estimated as very small... but that would show up as decreasing r/binVar values during the course of sampling. Without a reproducible example, though, this is shooting in the dark.
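
    Something along those lines would show it, e.g. (a rough sketch, assuming m is the compiled jags.model object):

    library(rjags)
    # watch the hyperparameters; r or binVar drifting steadily towards zero
    # during sampling would point to this problem
    samp <- coda.samples(m, variable.names = c("r", "lambda"), n.iter = 1000)
    summary(samp)
    plot(samp)    # trace plots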

     
  • Martyn Plummer
    2013-10-24

    I can't see anything in the model that would cause the log density calculation for phi to fail. At this point we need a reproducible example. Obviously the data are too big to post here, but you say your data are simulated from the model. If you post an R program that reproducibly creates a data set that fails (i.e. calls set.seed before generating the data) then I can run it through a debugger.

     
  • Viktor Jonsson
    2013-10-25

    Thanks for the responses. It seems that JAGS crashes regardless of which seed I set, but for consistency I generated data according to the script below with "set.seed(1)". I used 1000 burn-in steps, 1000 adaptation steps, and attempted to take 2000 samples. JAGS crashed after about 1640 samples were taken (I can't give the exact iteration, as the progress bar only reports 64%).

    I also include the inits and data for the other parameters. The script simply creates the necessary variables in the R workspace, but I guess you can transfer them to the desired format.

    Hope we can shed some light on this.

    GenerateGammaDataSimple <- function(nrRows=10000,groups=c(6,6)
                                        ,meanMean=1500,alpha1=1.5,alpha2=0.1){
      #Simplified version of GenerateGammaData
    
      nrSamples = sum(groups)
    
      Y <- matrix(data=NA, nrow=nrRows, ncol=nrSamples)
    
      #Creates a range of beta0 to act as baseline for each row
      baseMean <- log(rexp(nrRows,rate = 1/meanMean))
      rePickIndex <- rbinom(nrRows,1,0.5)==1
      baseMean[rePickIndex] <- runif(sum(rePickIndex),-1,log(meanMean))
    
      for(i in 1:nrRows){
        sig <- rgamma(1,shape = alpha1, scale = alpha2)
        phi <- rnorm(nrSamples,0,sqrt(sig))
        Y[i,] <- rpois(nrSamples,exp(phi+baseMean[i]))
      }
    
      return(Y)
    } 
    
    nrAdapt<-1000
    nrBurnIn<-1000
    nrSamples<-2000
    
    set.seed(1)
    Y <- GenerateGammaDataSimple()
    
    groups <- c(6,6)
    
    nrColumns <- sum(groups)
    nrRows <- 10000
    
    normalizingConstants <- rep(1,sum(groups))
    
    #DATA FOR MCMC OBJECT
    dataL <- list(Y=Y,N = normalizingConstants,group=groups)
    
    #INITS FOR MCMC OBJECT
    beta0 <- rep(0,nrRows)
    beta1 <- rep(0,nrRows)
    phi <- matrix(rep(0,nrRows*nrColumns),nrRows,nrColumns)
    binVar <- rep(0.01,nrRows)
    initsL1 <- list(beta0 = beta0,phi=phi,beta1=beta1,r=2,lambda = 10)
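
    The rjags calls then look roughly like this (assuming the model above is saved in a file, say "model.bug"; the monitored variables here are just illustrative):

    library(rjags)
    m <- jags.model("model.bug", data = dataL, inits = initsL1, n.adapt = nrAdapt)
    update(m, nrBurnIn)                            # burn-in
    samp <- coda.samples(m, variable.names = c("r", "lambda"),
                         n.iter = nrSamples)       # crashes at ca 64% for me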
    
     
  • Martyn Plummer
    2013-10-30

    This works for me. I'm sorry but there is nothing I can do if I can't reproduce the bug.

     
    • Viktor Jonsson
      2013-11-07

      I ran the same setup as above on another computer running Windows (the first used Red Hat Linux) and got exactly the same error. To be extra clear, I am using rjags to call JAGS directly from R.

      Perhaps you could share how you ran the model, so that I can reproduce the non-reproducibility of the error. :)

       
      • Martyn Plummer
        2013-11-12

        I ran the script you sent, then added

        library(rjags)
        m <- jags.model("viktor.bug", data=dataL, inits=initsL1)
        update(m, 10000)
        

        Where "viktor.bug" is the model you posted above

         
        • Viktor Jonsson
          2013-11-15

          So I might have found the cause of the crash: the glm module in JAGS. Not knowing whether it would help my model or not, I included it, since I read somewhere that it generally makes things faster. However, it seems it actually does the opposite and causes JAGS to crash (if indeed you can reproduce the error now).

          I have a follow-up question on the glm module. I am not sure what it does except create block samplers. Since I have generated quite a few results using it, can I expect my results to change a lot if I remove it? Do convergence speed, autocorrelation of the parameters, etc. change much with or without it?
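
          For reference, the only difference between my runs is whether the module is loaded before the model is compiled; list.samplers shows which samplers were actually assigned (a minimal sketch, assuming the model is saved as "model.bug"):

          library(rjags)
          load.module("glm")    # with the glm module's block samplers
          m1 <- jags.model("model.bug", data = dataL, inits = initsL1)
          list.samplers(m1)     # which sampler each node was assigned

          unload.module("glm")  # again with JAGS's default samplers
          m2 <- jags.model("model.bug", data = dataL, inits = initsL1)
          list.samplers(m2)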

           
  • Viktor Jonsson
    2013-10-31

    Still, thanks a lot for the help. That should indicate that there is nothing wrong with the model itself. I will set up JAGS and run the model on another rig and see if I can reproduce the error there. I will then get back to you with more detailed information.