Suppressing JAGS warnings while using dsum in a ones-trick

  • Michael H

    Michael H - 2013-10-08

    Hello everyone,

    I'm enjoying the power of JAGS with rjags for a binomial time series problem with missing data. After some trial and error I found a way to specify (X.obs[t] + X.missing[t]) \sim Bin(n[t], p) using the "ones-trick". Simplified to the essentials, I have the following BUGS model:

    #Simple model with binomial data from a missing process
    model {
      #Constant large enough to ensure all exp(logLikX[t]) < 1
      K <- 1.1
    
      #Missing data time series as samples from multinomial. To avoid 
      #"Unable to find appropriate sampler"
      #use the Poisson trick constraining on the sum
      #X.missing ~ dmulti(pi,nMissing)
      for (t in 1:N) {
          X.missing[t] ~ dpois( pi[t] )
      }     
      #Constrain the sum. No idea how to write the dsum in a nicer way.
      nMissing ~ dsum(X.missing[1],X.missing[2],X.missing[3],X.missing[4],X.missing[5],X.missing[6],X.missing[7],X.missing[8],X.missing[9],X.missing[10])
    
      #Prior for parameter of interest
      p ~ dunif(0,1)
    
      #Time series is a superposition of X.obs (the data) and X.missing (imputed)
      for (t in 1:N) {
          #BUGS ones trick to implement that (X.obs[t] + X.missing[t]) \sim Bin(n[t],p)
          X[t] <- X.obs[t] + X.missing[t]
          logLikX[t] <- logfact(n[t]) - logfact(X[t]) - logfact(n[t]-X[t]) + X[t]*log(p) + (n[t]-X[t])*log(1-p)
          ones[t] ~ dbern( exp(logLikX[t]) / K)
      }
    }
    

    1) When running rjags::jags.samples on this model I get a huge number of "value out of range in 'lgamma'" warnings during the sampling procedure (see the example R code at the end). I don't think they are a problem (just the sampler trying to come up with new proposals), but all this output to the terminal/shell slows the computation down severely: R is more busy writing than computing. Hence my question: is there an easy way to turn off these warnings during jags.samples?
    (I ended up calling R batch-style with the output piped into /dev/null, but I am unhappy with this solution.)
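
    For reference, two things one could try on the R side; this is only a sketch, since whether they actually catch these messages depends on whether rjags raises them as R warnings or prints them straight to the console, and the file name "jags-messages.txt" is just a placeholder:

    ## (a) If the messages are ordinary R warnings, suppressWarnings() should silence them
    samples <- suppressWarnings(coda.samples(m, n.iter=10000, 'p'))
    ## (b) If they go to the message stream (stderr), sink() can divert them to a file
    msg <- file("jags-messages.txt", open="wt")
    sink(msg, type="message")
    samples <- coda.samples(m, n.iter=10000, 'p')
    sink(type="message")
    close(msg)
    ## Note: jags.model(..., quiet=TRUE) only suppresses messages printed while the
    ## model is compiled, not warnings produced later during sampling.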

    2) Furthermore, is it possible to apply dsum to an entire vector without having to write out each summed component by hand, i.e. is there something like dsum(X.missing[]) that works on whole vectors and would let me write a model that also works for N different from 10? (In my application N is about 200-300 and it varies from simulation to simulation.)
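
    In the meantime, one workaround I can think of (an untested sketch, reusing names from the test script below; the placeholder "@DSUM@" and the file name 'simple-template.bug' are made up) is to build the model code as a string in R and generate the dsum(...) call for whatever N is at hand:

    ## Sketch: build the dsum(...) line for arbitrary N instead of hard-coding it,
    ## then splice it into the model code before compiling.
    dsumLine <- paste0("nMissing ~ dsum(",
                       paste0("X.missing[", seq_len(N), "]", collapse=","),
                       ")")
    modelCode <- sub("@DSUM@", dsumLine,
                     paste(readLines("simple-template.bug"), collapse="\n"),
                     fixed=TRUE)
    ## jags.model() also accepts a connection, so the model need not be written to disk
    m <- jags.model(textConnection(modelCode),
                    data = list(X.obs=ts$X, n=ts$n, N=N, pi=pi, nMissing=nMissing, ones=ones),
                    inits = list(X.missing=ts$X.full - ts$X),
                    n.chains = 1, n.adapt = 500)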

    Best regards,

    Michael

    --

    Here is a small R testscript for calling the above BUGS program and getting the warnings.

    require("rjags")
    
    ######################################################################
    # Small sampler for simulating X[t] \sim Bin(n[t], p) and then
    # arbitrarily removing nMissing counts from X[t].
    ######################################################################
    simBinTS <- function(n, p=0.3, nMissing=10) {
        #Number of time points, derived from n instead of relying on the global N
        N <- length(n)
        #Sample
        X.full <- X <- rbinom(N, size=n, prob=p)
        #Remove nMissing values
        for (i in seq_len(nMissing)) {
            idx <- sample(1:N,size=1,prob=X)
            X[idx] <- X[idx]-1
        }
        #Done
        return(data.frame(n=n,X=X,X.full=X.full))
    }
    
    #Generate a binomial TS which is thinned by a specific number of observations
    N <- 10
    p <- 0.3
    set.seed(123)
    n <- rpois(N,lambda=30)
    nMissing <- 20 #Example: sum(n)*p*0.5
    pi <- rep(1/N,N)
    ts <- simBinTS(p=p,n=n,nMissing=nMissing)
    ones <- rep(1,N)
    
    m <- jags.model('simple.bug',
                    data = list(X.obs=ts$X, n=ts$n, N=N, pi=pi, nMissing=nMissing, ones=ones),
                    inits = list(X.missing=ts$X.full - ts$X),
                    n.chains = 1, n.adapt = 500)
    
    samples <- coda.samples(m, n.iter=10000, 'p')
    summary(samples)
    plot(samples)
    
     

    Last edit: Michael H 2013-10-08
  • Martyn Plummer

    Martyn Plummer - 2013-10-09

    The binomial likelihood is available directly using the dbin function:

      LikX[t] <- dbin(X[t], p, n[t])
      ones[t] ~ dbern( LikX[t] / K)
    

    This will calculate the likelihood without generating warning messages.

    Unfortunately, your model isn't correct. The way the data are generated, observations are removed without replacement, whereas in your BUGS code you are modelling the removal with replacement (using either multinomial or constrained Poisson). If JAGS had a multivariate hypergeometric distribution then you could model this directly without the ones trick.
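
    A quick R illustration of the difference, reusing ts$X.full, N and nMissing from the simulation script above (a sketch for clarity only, not part of the model):

    ## Removal WITHOUT replacement: draw nMissing individuals from the pooled counts
    ## (multivariate hypergeometric), so no X.missing[t] can exceed X.full[t]
    pool <- rep(seq_len(N), times = ts$X.full)
    miss.without <- tabulate(sample(pool, size = nMissing), nbins = N)
    ## Removal WITH replacement: a multinomial draw over the same counts, so nothing
    ## stops X.missing[t] from exceeding X.full[t]
    miss.with <- as.vector(rmultinom(1, size = nMissing, prob = ts$X.full))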

     
  • Michael H

    Michael H - 2013-10-14

    Dear Martyn,

    Thanks for answering the question, and also for pointing out the more subtle mismatch between the missing-data generation mechanism of the simulation model and that of the JAGS model.
    So I tried implementing the multivariate hypergeometric distribution using the ones trick. Unfortunately, this means that the original 'lgamma' warning messages return, but that price is small compared to having a useful model (at least I hope it now does what I think it should :-)).

    Thanks for providing free software!

    Michael

    --

    #Binomial time series with observations from a missing (without replacement) process.
    model {
      #Constant large enough to ensure all exp(logLikX[t]) < 1
      K <- 1.1
      #Normalizing constant for the multivariate hypergeometric: choose(sum(n[]), nMissing)
      K.mvhypgeom <- exp( logfact(sum(n[])) - logfact(nMissing) - logfact( sum(n[]) - nMissing))
    
      #Prior for parameter of interest
      p ~ dunif(0,1)
    
      #Prior for the time series of missing counts. X.missing ~ Multinomial( nMissing, someRate)
      for (t in 1:N) {
          X.missing[t] ~ dpois( missRate )
      }     
      #Ensure sum corresponds to nMissing (not redundant as MV hypergeom using ones trick does not ensure this)
      nMissing ~ dsum(X.missing[1],X.missing[2],X.missing[3],X.missing[4],X.missing[5],X.missing[6],X.missing[7],X.missing[8],X.missing[9],X.missing[10])
    
      #Numerator of PMF of multivariate hypergeometric  (using the ones trick to do so)
      for (t in 1:N) {
          pX.missing[t] <- exp(logfact(n[t]) - logfact( X.missing[t]) - logfact( n[t] - X.missing[t]))
          ones2[t] ~ dbern(pX.missing[t]/K.mvhypgeom)
      }
    
      #Time series is a superposition of X.obs (the data) and X.missing (imputed)
      for (t in 1:N) {
          X[t] <- X.obs[t] + X.missing[t]
          #BUGS ones trick to implement X[t] \sim Bin(n[t],p), now using dbin() as a density function
          likX[t] <- dbin( X[t], p, n[t])
          ones1[t] ~ dbern( likX[t] / K)    
      }
    }
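
    For completeness, an untested sketch of how the rjags call from my first post would change for this model; the file name 'mvhypgeom.bug' and the choice missRate = nMissing/N (the constant Poisson rate, which now has to be passed as data) are placeholders:

    ## Untested sketch of the call for the second model
    m2 <- jags.model('mvhypgeom.bug',
                     data = list(X.obs=ts$X, n=ts$n, N=N, nMissing=nMissing,
                                 missRate=nMissing/N,
                                 ones1=rep(1,N), ones2=rep(1,N)),
                     inits = list(X.missing=ts$X.full - ts$X),
                     n.chains = 1, n.adapt = 500)
    samples2 <- coda.samples(m2, n.iter=10000, 'p')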
    
     
