runjags / Forum / runjags: results.jags ignores keep.jags.files argument

Stephen C - 2016-01-08

I have a large model with lots of variables and I'm trying to speed up the convergence check by making it embarrassingly parallel.

I have a couple of scripts that run simultaneously that loop through sets of monitored variables to calculate the Gelman-Rubin psrf variable.

path <- 'path/to/working/directory'
for (i in seq(2,7)){
    thisvar <- paste('kappa[1,',i,']',sep="")
    print(thisvar)
    results <- results.jags(paste(path,"/runjagsfiles/",sep=""),
        read.monitor=thisvar, keep.jags.files=TRUE)
    thislist <- gelman.diag(results,multivariate=FALSE)$psrf[,1]
    print(thislist)
}

Interestingly, when I set this up the first time, I accidentally set the loop range to go outside of the bounds of the kappa array. I got an error message, fixed the problem, and re-ran the script.

The runjagsfiles directory was gone. (#1 important thing to know: results.jags is just dying to delete the runjagsfiles directory)

So I ran a (short) MCMC run again, recreating the runjagsfiles directory, and made a copy to runjagsfiles2. I tried and was unable to load this copy. (#2 important thing to know: The path must end in 'runjagsfiles')

I expect I'm not using this in an expected way, but am I seeing expected behavior?

Hi Stephen

Thanks for the bug report, but I'm afraid that I haven't been able to replicate the behaviour you describe using simulations created using run.jags. To answer your specific questions:

For (1), results.jags should always respect the keep.jags.files argument given to the (auto)run.jags function call used to start the simulation, unless this is overridden in the results.jags function call. The only other time that the simulation folder is deleted is if you have set runjags.options(full.cleanup=TRUE) and then unloaded runjags - is that the case? If not, can you confirm the version of runjags you are using, and the original (auto)run.jags function call, and preferably provide a repeatable example?

For (2), you should be able to copy the folder and give it whatever name you want, and then load the simulation using the path to the folder. Can you confirm the error message you received?

The following script tests both of these features, and runs with no problems using runjags version 2.0.3 (but there were not any changes in this code from 2.0.2 as far as I remember). Can you see if it works for you?

# Script to test deleting/keeping/copying runjagsfiles folders

library('runjags')

X <- 1:100
Y <- rnorm(length(X), 2*X + 10, 1)

# Model in the JAGS format
model <- "model {
for(i in 1 : N){
    Y[i] ~ dnorm(true.y[i], precision);
    true.y[i] <- (m * X[i]) + c
}
m ~ dunif(-1000,1000)
c ~ dunif(-1000,1000)
precision ~ dexp(1)
}"

# Data and initial values in a named list format,
# with explicit control over the random number
# generator used for each chain (optional):
data <- list(X=X, Y=Y, N=length(X))
inits1 <- list(m=1, c=1, precision=1,
.RNG.name="base::Super-Duper", .RNG.seed=1)
inits2 <- list(m=0.1, c=10, precision=1,
.RNG.name="base::Wichmann-Hill", .RNG.seed=2)

# Run the model and produce plots
results <- run.jags(model=model, monitor=c("m", "c", "precision"),
data=data, n.chains=2, method="background", inits=list(inits1,inits2), keep.jags.files=FALSE)

Sys.sleep(5)

# Folder should be preserved (keep.jags.files=TRUE overrides the run.jags setting):
results.jags(results, keep.jags.files=TRUE, read.monitor='m')
stopifnot(file.exists(results$jobname))

# Folder should be deleted:
results.jags(results, read.monitor='m')
stopifnot(!file.exists(results$jobname))

results <- run.jags(model=model, monitor=c("m", "c", "precision"),
data=data, n.chains=2, method="background", inits=list(inits1,inits2), keep.jags.files=TRUE)

Sys.sleep(5)

# Folder should be preserved:
results.jags(results, keep.jags.files=TRUE, read.monitor='m')
stopifnot(file.exists(results$jobname))

# Folder should be preserved:
results.jags(results, read.monitor='m')
stopifnot(file.exists(results$jobname))

# Folder should be deleted:
results.jags(results, keep.jags.files=FALSE, read.monitor='m')
stopifnot(!file.exists(results$jobname))

results <- run.jags(model=model, monitor=c("m", "c", "precision"),
data=data, n.chains=2, method="background", inits=list(inits1,inits2), keep.jags.files=TRUE)

Sys.sleep(5)

# Copy contents of folder to arbitrary new folder (could be done outside R):
dir.create('whatever')
stopifnot(file.copy(file.path(results$jobname, list.files(results$jobname)), 'whatever', recursive=TRUE))

stopifnot(file.exists('whatever'))
results.jags('whatever', keep.jags.files=FALSE, read.monitor='m')

# New folder should be deleted:
stopifnot(!file.exists('whatever'))
# Original folder should be preserved:
stopifnot(file.exists(results$jobname))

# All folders should be deleted:
cleanup.jags(all.folders=TRUE)
stopifnot(!file.exists(results$jobname))

Thanks again,

Matt

Thanks for having a look, and I understand I was less than helpful in not giving a working example (it was a Friday afternoon). I am running JAGS 4.0.1 and runjags 2.0.2. I should note that I am running with tempdir=FALSE because, when I run on a batch cluster, I do not have access to the /tmp directory, but rather need to keep all files in my /scratch space working directory.

Based on the above script, here's an example showing the behavior I'm seeing:

library('runjags')

path<-'path/to/working/dir'

X <- 1:10
Y <- rnorm(length(X), 2*X + 10, 1)
# Model in the JAGS format
model <- "model {
for(i in 1 : N){
    Y[i] ~ dnorm(true.y[i], precision);
    m[i] ~ dunif(-1000,1000)        
    true.y[i] <- (m[i] * X[i]) + c
}
c ~ dunif(-1000,1000)
precision ~ dexp(1)
}"

# Data and initial values in a named list format,
# with explicit control over the random number
# generator used for each chain (optional):
data <- list(X=X, Y=Y, N=length(X))
inits1 <- list(m=1:10, c=1, precision=1,
.RNG.name="base::Super-Duper", .RNG.seed=1)
inits2 <- list(m=1:10, c=10, precision=1,
.RNG.name="base::Wichmann-Hill", .RNG.seed=2)

# Run the model and produce plots
results <- run.jags(model=model, monitor=c("m", "c", "precision"),
    data=data, 
    n.chains=2, 
    method="parallel", 
    inits=list(inits1,inits2), 
    keep.jags.files=TRUE,
    tempdir=FALSE)

Sys.sleep(5)

# Folder should be preserved:
results.jags(paste(path,'runjagsfiles',sep=""),
    keep.jags.files=TRUE, read.monitor='m')

# Return the summary for a single variable, folder is preserved
results.jags(paste(path,'runjagsfiles',sep=""), 
    keep.jags.files=TRUE, read.monitor='m[5]')

# Return the summary out of bounds, poof, folder is gone
results.jags(paste(path,'runjagsfiles',sep=""),
    keep.jags.files=TRUE, read.monitor='m[12]')

Matt Denwood - 2016-01-12

No problem - I am grateful for reports of all potential bugs so always happy to take a look. But I'm afraid I'm still not seeing your error using runjags 2.0.2 (CRAN version) or 2.0.3 (development version). The last part of your code (modified slightly as detailed below) runs for me as follows:

# Create the working directory folder:
path <- 'path/to/working/dir'
dir.create(path, recursive=TRUE)

# Save results directly to a folder called 'whatever' in this folder
# Note that if whatever already exists, a counter is appended to the name as usual (and a warning is given)
results <- run.jags(model=model, monitor=c("m", "c", "precision"),
    data=data,
    n.chains=2,
    method="parallel",
    inits=list(inits1,inits2),
    keep.jags.files=file.path(path, 'whatever'),
    tempdir=FALSE)

# Succeeds - folder is preserved, but will be deleted on package unload if runjags.options(full.cleanup=TRUE):
results.jags(file.path(path,'whatever'), read.monitor='m')

# Succeeds - folder is preserved, but will be deleted on package unload if runjags.options(full.cleanup=TRUE):
results.jags(file.path(path,'whatever'), read.monitor='m[5]')

# Gives an error (which is not very helpful and will be improved!), but the folder is still preserved,
# but will (currently) always be deleted on package unload (I will change this behaviour for the next release):
results.jags(file.path(path,'whatever'), read.monitor='m[12]')

A couple of things that might help:

You were running:

results.jags(paste(path,'runjagsfiles',sep=""), ...)

But this should have failed ("path/to/working/dirrunjagsfiles" shouldn't exist; "path/to/working/dir/runjagsfiles" does, but maybe not as a relative path from "path/to/working/dir"). I assume it was just a typo, but it is possible this is causing some issues if the OS is behaving strangely?

The help file does document that keep.jags.files (for run.jags) can also be a character string representing a path to the folder to save, which might be relevant to your earlier point about the path needing to have 'runjagsfiles' in it?

The keep.jags.files=TRUE arguments were unnecessary for the results.jags calls here but I tried it both ways and got the same result. If you specified keep.jags.files=FALSE this would remove the folder EXCEPT where there was an error reading the files (i.e. the last call), in which case the folder is still preserved and must be deleted manually or using cleanup.jags(all.folders=TRUE).

Does this help track down the problem?

Thanks again,

Matt

Edit: I have just thought of one other potential explanation - are you quitting R (and/or unloading runjags) after the failed attempt to import the results?

Last edit: Matt Denwood 2016-01-12

Hi Matt,
A couple of things:

My path to my working directory is an absolute path, not a relative path (I accidentally left off the leading / when replacing my path with the dummy path).
I forgot to give you the error message I see when I run the script I supplied above. The output is:

$ Rscript RunjagsTest.R
RunjagsTest.R
Calling 2 simulations using the parallel method...
Following the progress of chain 1 (the program will wait for all chains
to finish before continuing):
Welcome to JAGS 4.0.1 on Tue Jan 12 09:28:04 2016
JAGS is free software and comes with ABSOLUTELY NO WARRANTY
Loading module: basemod: ok
Loading module: bugs: ok
. . Reading data file data.txt
. Compiling model graph
   Resolving undeclared variables
   Allocating nodes
Graph information:
   Observed stochastic nodes: 10
   Unobserved stochastic nodes: 12
   Total graph size: 77

WARNING: Unused variable(s) in data table:
X

. Reading parameter file inits1.txt
. Initializing model
. Adapting 1000
-------------------------------------------------| 1000
++++++++++++++++++++++++++++++++++++++++++++++++++ 100%
Adaptation successful
. Updating 4000
-------------------------------------------------| 4000
************************************************** 100%
. . . . Updating 10000
-------------------------------------------------| 10000
************************************************** 100%
. . . . Updating 0
. Deleting model
. 
All chains have finished
Simulation complete.  Reading coda files...
Coda files loaded successfully
Calculating summary statistics...
Calculating the Gelman-Rubin statistic for 12 variables....
Finished running the simulation
Error in results.jags(paste(path, "runjagsfiles", sep = ""), keep.jags.files = TRUE,  : 
  An object produced by a background runjags method (or a path to the JAGS folder to be imported) must be supplied (see the manual page for more details)
Execution halted

Note above that I am running this in Rscript. That's necessary when I run in the batch cluster, as I mentioned. I don't know if that could be relevant to your question in your postscript.

Thanks for all of your help!

Last edit: Stephen C 2016-01-12

Matt Denwood - 2016-01-13

Hmm, that error message isn't all that helpful, but what it is telling you is what I posted about in my previous reply - you need to change:

paste(path, "runjagsfiles", sep = "")

To:

paste(path, "runjagsfiles", sep = "/") # or, even better: file.path(path, "runjagsfiles")

Edit: Or at least, that is based on the path format you originally posted - if you formatted your real path with a / at the end then this isn't the problem, but either way it looks like whatever paste(path, "runjagsfiles", sep = "") points to doesn't exist.
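To make the difference between the two forms concrete, here is a minimal base-R sketch. It uses the dummy path from earlier in the thread rather than a real folder, and check_folder is a hypothetical helper (not part of runjags) that simply fails early if the folder is missing.

```r
# Dummy path from the thread (no trailing slash); substitute your own
path <- "path/to/working/dir"

# sep = "" glues the two pieces together with no separator
bad <- paste(path, "runjagsfiles", sep = "")
# value: "path/to/working/dirrunjagsfiles"

# file.path() inserts the separator for you
good <- file.path(path, "runjagsfiles")
# value: "path/to/working/dir/runjagsfiles"

# Hypothetical guard: stop with a clear message if the folder is missing,
# before the path is ever handed to results.jags
check_folder <- function(folder) {
    if (!dir.exists(folder)) {
        stop("Folder not found: ", folder)
    }
    invisible(folder)
}
```

Calling check_folder() on the path first should turn a confusing import error into an obvious "folder not found".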

I will change the error message to a more expected 'file not found' when results.jags is given a path to a folder. Also, your script assumes that the results are saved to runjagsfiles and not runjagsfiles_1 etc - you might be better giving a specific folder name to keep.jags.files like I suggested before. This could still be changed by runjags (currently it doesn't over-write folders) but at least you will get a warning.
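If you do rely on the default folder names, something like the following base-R sketch could locate the folder instead of hard-coding runjagsfiles. The helper name latest_runjags_folder is hypothetical, the name pattern is inferred from the runjagsfiles_1 example above, and picking the most recently modified folder is an assumption about which one you want. Giving keep.jags.files an explicit folder name is still the more robust option.

```r
# Hypothetical helper: find the most recently modified folder named
# runjagsfiles or runjagsfiles_<number> directly inside 'path'
latest_runjags_folder <- function(path) {
    candidates <- list.files(path, pattern = "^runjagsfiles(_[0-9]+)?$",
                             full.names = TRUE)
    # Keep directories only
    candidates <- candidates[dir.exists(candidates)]
    if (length(candidates) == 0) {
        stop("No runjagsfiles folder found in: ", path)
    }
    # Choose by last modification time (an assumption, see above)
    mtimes <- as.numeric(file.info(candidates)$mtime)
    candidates[which.max(mtimes)]
}
```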

The fact that you are running this in Rscript explains the missing folder - after you try (but fail) to import the folder, the folder is marked for cleanup-on-exit. Once R exits, runjags is unloaded and the folder is removed. This is definitely non-graceful behaviour, which I will change in a future version of runjags, so thanks for the report. But for now I'm afraid that you will just have to make sure that your own variable index remains in-range to avoid the error.
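One way to keep the index in range is to build the monitor names through a small checking function, so that a bad index stops the script before results.jags is called at all. This is only a sketch in base R: monitor_names is a hypothetical helper, and it covers a vector like m[1], ..., m[n] rather than a matrix like kappa.

```r
# Hypothetical helper: build monitor names such as "m[5]" for a vector
# of length n, refusing any index outside 1..n
monitor_names <- function(varname, indices, n) {
    out_of_range <- indices[indices < 1 | indices > n]
    if (length(out_of_range) > 0) {
        stop("Index out of range for ", varname, ": ",
             paste(out_of_range, collapse = ", "))
    }
    paste0(varname, "[", indices, "]")
}

# In range: returns c("m[1]", "m[2]", "m[3]")
ok_names <- monitor_names("m", 1:3, n = 10)

# Out of range: monitor_names("m", 12, n = 10) stops with an error
```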

Thanks again,

Matt

Last edit: Matt Denwood 2016-01-13

Stephen C - 2016-01-13

I definitely had the leading / and trailing / to make a complete absolute path, so that wasn't the issue. I think that the main issue here is that the folder is marked for cleanup-on-exit. I am trying to check convergence statistics for different classes of my variables in parallel with separate scripts, due to my specific circumstances, so I need the runjags directory to stick around. Thanks for your attention to my issues.

-S
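For anyone in the same situation, one possible workaround (following Matt's earlier point that the folder can be copied and loaded from any path) is to give each script its own copy of the simulation folder. The sketch below is base R only; copy_sim_folder and the runjagsfiles_kappa folder name are hypothetical, and it is an untested assumption that a failed import would then mark only the copy, not the original, for cleanup.

```r
# Hypothetical helper: copy the contents of a simulation folder to a new
# location so that each script reads from its own copy
copy_sim_folder <- function(from, to) {
    stopifnot(dir.exists(from))
    dir.create(to, recursive = TRUE, showWarnings = FALSE)
    files <- list.files(from, full.names = TRUE)
    ok <- file.copy(files, to, recursive = TRUE)
    if (!all(ok)) {
        stop("Failed to copy some files from ", from)
    }
    invisible(to)
}

# Possible usage (not run here; requires runjags and a real simulation folder):
# my_copy <- copy_sim_folder(file.path(path, "runjagsfiles"),
#                            file.path(path, "runjagsfiles_kappa"))
# results <- results.jags(my_copy, read.monitor = "kappa[1,2]",
#                         keep.jags.files = TRUE)
```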
