Setting all initial values to draw from model priors

runjags
M Sethi
2018-02-19
2018-02-21
  • M Sethi

    M Sethi - 2018-02-19

    Hi there—can anyone help me out with the specific argument I need to use to get autorun.jags to draw all initial values for my (rather complicated) model from the prior distributions specified in the model? I'd like to do this for all 3 chains I'm planning to run in parallel, and have already learned that failing to specify inits leads JAGS to use the same initial starting values for all chains—definitely not what I want.

    Thanks so much!

     

    Last edit: M Sethi 2018-02-19
  • Matt Denwood

    Matt Denwood - 2018-02-19

    If you leave the default argument autorun.jags(..., inits=NA), JAGS takes all initial values from the middle (either mean or mode, depending on the distribution) of the prior. This is generally NOT recommended because (1) the values are the same for all chains, so convergence diagnostics that compare chains are less useful for persuading yourself that the model has converged, and (2) occasionally initial values taken from the middle of the priors result in a model that won't compile (i.e. the middle of the priors is not always a sensible initial value, for example when using interval censoring).

    Having said all of that it is often useful to use inits=NA just so you can get a model to compile and run with minimal effort while you are building it in stages. But I would always manually provide over-dispersed initial values for at least the important parameters (e.g. intercept, coefficients, random effect precisions in a GLMM) before running my 'final' model. Convergence issues are always a potential danger with MCMC.
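As a minimal sketch of the advice above, the per-chain initial values can be built as a list of named lists, one per chain, each with its own deliberately spread starting values and its own RNG. The parameter names `intercept` and `tau_site` and the starting values are hypothetical placeholders, not taken from the model in this thread; the `.RNG.name` strings are the standard JAGS base generators.

```r
# Hypothetical parameter names: replace with the stochastic nodes of your model.
make_inits <- function(chain) {
  rng_names <- c("base::Wichmann-Hill", "base::Marsaglia-Multicarry",
                 "base::Super-Duper", "base::Mersenne-Twister")
  # Deliberately over-dispersed starting points, one per chain (illustrative values)
  intercepts <- c(-5, 0, 5)
  precisions <- c(0.01, 1, 100)
  list(
    intercept = intercepts[chain],
    tau_site  = precisions[chain],
    .RNG.name = rng_names[chain],  # a different generator for each chain
    .RNG.seed = chain              # and a different seed
  )
}

# One named list per chain, for 3 chains
inits <- lapply(1:3, make_inits)
```

The resulting `inits` can then be passed as `inits=inits` together with `n.chains=3`. Setting `.RNG.name` explicitly should also avoid the warning runjags gives when parallel chains share a PRNG.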

    Matt

     
  • M Sethi

    M Sethi - 2018-02-19

    Matt,

    thank you so much for your helpful response. I've set overdispersed initial values for each of my 3 chains, but am now running into a new issue. In addition to the system error below (I understand that the warning message doesn't actually signal an error), I don't think the parallelization itself is working, since I only see 1 core being used in the few minutes before the error message appears. (I'm running this on a 4-CPU Google Compute Engine instance.) Thanks for any advice you can offer!

    mod_lags <- autorun.jags(model="phenomodel_flowerfruitseed.jags",
                             inits=list(chain1, chain2, chain3),
                             monitor=params, data=jags.data_lags, n.chains=3,
                             startburnin=4000, startsample=4000,
                             method="parallel", psrf.target=1.10, thin=10)
    
    Auto-run JAGS
    
    Running a pilot chain...
    Error in system("/bin/ps", intern = TRUE, ignore.stderr = TRUE) : 
      error in running command
    In addition: Warning message:
    You attempted to start parallel chains without setting different PRNG for each chain, which is not recommended.  Different .RNG.name values have been added to each set of initial values. 
    
     

    Last edit: M Sethi 2018-02-19
  • M Sethi

    M Sethi - 2018-02-20

    P.S. One of the strangest things about this error is that I don't get it when I execute the exact same code on my laptop (which doesn't have multiple cores, which is why I want to be able to run it in the cloud).

     
  • Matt Denwood

    Matt Denwood - 2018-02-20

    That looks like an issue with the Google Compute Engine restricting access to certain utilities (in this case ps which is used by runjags to kill the relevant JAGS instances if the user aborts the simulation). It is common (and sensible) for remote access clusters to impose such restrictions, but this seems to differ a lot between clusters, and I don't use Google Compute Engine so can't test it specifically. In this case runjags's attempt to use ps is actually unnecessary - it is not possible for you to abort the model run without pulling the entire job which would presumably abort all processes within that container - so it would probably be a good idea for me to include a new "method" in runjags with the specific intent of being as cluster-friendly as possible. But in any case the error wouldn't happen unless being run on a similar cluster (which also makes it harder to test).

    To solve your immediate problem I can only suggest trying one of the other parallelised methods: snow or rjparallel (bgparallel also uses ps so presumably will not work for you). But I have had reports from other clusters that one or both of these can fail if the cluster is set up to block spawning of processes from within a job - I have no idea if this is true for Google Compute Engine but there is no harm in trying (and I would appreciate you letting me know just for my information). The only methods I can guarantee will (or should) work are rjags and simple but these are obviously single threaded.
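A quick way to check whether this restriction applies on a given machine is to repeat the call from the error message and see whether it succeeds. This is only a rough diagnostic sketch: the variable names are made up for illustration, and a successful `ps` call does not guarantee that any particular runjags method will work on a cluster.

```r
# Repeat the call shown in the error message and record whether it succeeds.
ps_available <- tryCatch({
  out <- system("/bin/ps", intern = TRUE, ignore.stderr = TRUE)
  length(out) > 0
}, error = function(e) FALSE, warning = function(w) FALSE)

# Methods named in this thread: "parallel" and "bgparallel" rely on ps,
# while "rjparallel" and "snow" do not.
candidate_methods <- if (ps_available) {
  c("parallel", "bgparallel", "rjparallel", "snow")
} else {
  c("rjparallel", "snow")
}
print(candidate_methods)
```

If none of the parallel methods work, the single-threaded `rjags` and `simple` methods remain as a fallback.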

    Matt

     
  • M Sethi

    M Sethi - 2018-02-20

    Hi Matt,

    that makes sense—I figured it was an issue with GCE, and eventually realized that I could try rjparallel instead. This doesn't throw up the error! But the parallelization still isn't happening (only 1 core is being used). Is there something else I need to do to get this to work? This may be a question that's already been asked and answered, so I'll do some searching as well. Thanks so much!

     
  • Matt Denwood

    Matt Denwood - 2018-02-20

    You could try setting the n.sims option in run.jags, e.g. (adapted from the examples in the help file for ?run.jags):

    results <- run.jags(model, n.chains=8, inits=initlist, method="rjparallel", n.sims=2)
    

    Or you could try setting up your own cluster (either fork or psock) manually and then giving that to runjags:

    library(parallel)
    cl <- makeCluster(4)  # 4 worker processes (PSOCK by default)
    results <- run.jags(model, n.chains=8, inits=initlist, method="rjparallel", cl=cl)
    stopCluster(cl)  # release the workers when finished
    

    [See the help file for ?makeCluster for more options]

    But my best guess is that this won't work because GCE is limiting the R instance to a single core. To test this (or to have something specific to search for), you could try a simpler example using ?clusterApply in the parallel package (the method that runjags uses to parallelise with the snow and rjparallel methods). I have never used GCE so can't speak for that platform, but it is quite common for clusters to enforce a 1-core-per-thread policy that is impossible to work around.
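A simpler test along those lines might look like the sketch below, which uses only the parallel package and does not involve JAGS at all. The worker count of 2 is an arbitrary assumption. Distinct process IDs show that separate worker processes were spawned; they do not prove that the workers run on separate cores, so it is still worth watching CPU usage (e.g. with top) while a longer task runs.

```r
library(parallel)

n_workers <- 2  # assumption: adjust to the number of cores you expect to have
cat("Cores reported by detectCores():", detectCores(), "\n")

# Start a local cluster and ask each worker for its process ID
cl <- makeCluster(n_workers)
worker_pids <- unlist(clusterApply(cl, seq_len(n_workers), function(i) Sys.getpid()))
stopCluster(cl)

# One distinct PID per worker means the worker processes were really spawned
n_distinct <- length(unique(worker_pids))
cat("Distinct worker processes:", n_distinct, "\n")
```

If `makeCluster()` itself fails or hangs, that suggests the platform blocks spawning processes from within the R session.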

     
  • M Sethi

    M Sethi - 2018-02-20

    Setting n.sims didn't change anything on GCE, but using makeCluster() DID! Hooray! Thanks again, Matt.

     
  • Matt Denwood

    Matt Denwood - 2018-02-21

    Great - glad it worked out! And thanks for reporting back - I will at some point create a cluster FAQ sheet (as it is probably the single most common source of strange problems with runjags), and it is extremely useful to have first-hand reports of success/failure on different systems.

     
