Clarifications regarding state-space 1st order autoregression time-series analysis

Help
2013-01-15
  • Farshid Ahrestani

    I'm new to both Bayesian and state-space modeling. My goal is to develop a state-space first-order autoregressive population model:

    The State-Space approach combines two models: 1) ecological process model and 2) observation model
    The ecological process model is: X[t] = B0 + (1+B1)X[t-1] + epsilon
    X[t] = loge(abundance) in year t
    X[t-1] = loge(abundance) in year t-1
    B0 = Intrinsic growth rate
    B1 = 1st order density dependent autoregressive term
    epsilon = process error, a time-dependent stochastic term with mean = 0 (its variance is the process variance)
    The Observation process model is: Y[t] = X[t] + observation error
    Y[t] ~ normal(X[t], error)

    The model I have developed so far is:

    model {
        ### Parameters and Priors ###
        alpha.mu ~ dnorm(0.0, 1.0E-6);        #Prior on intrinsic rate of increase
        tau.alpha ~ dgamma(1.0E-3, 1.0E-3);       
        sigma.alpha <- 1/sqrt(tau.alpha)      # SD of intrinsic rate of increase (derived from tau.alpha)
        beta ~ dnorm(0.0, 1.0E-6);            #Prior on 1st order density dependence
        epsilon ~ dnorm(0.0, 1.0E-6);         #Prior on process variation
        tau.pop ~ dgamma(1.0E-3, 1.0E-3);        
        sigma.pop <- 1/sqrt(tau.pop);         # SD of process error (derived from tau.pop)
        tau.obs ~ dgamma(1.0E-3, 1.0E-3);
        sigma.obs <- 1/sqrt(tau.obs)          # SD of observation error (derived from tau.obs)
    
        ### State-Space model ###
    
        # System process model
        for (t in 2:T){    
           alpha[t] ~ dnorm(alpha.mu, sigma.alpha);
           log.pop.mu[t] <- alpha[t] + beta*log(pop[t-1]) + epsilon[t];
           log.pop[t] ~ dnorm(log.pop.mu[t], sigma.pop);
        }
    
        # Observation process model
        for (t in 2:T){    
           Y[t] ~ dnorm(log.pop[t], sigma.obs);
        }
    }
    

    Sample input data:
    "T" <- 45
    "pop" <- c(8150, 5725, NA, NA, 4476, NA, 3842, 3172, 4305, 5543, 7281,
    8215, 9981, 10529, 12607, 12014, 12828, 12680, 10838, NA, NA,
    16019, NA, NA, NA, 16286, 17007, 18913, 16536, 14829, 12027,
    12859, 17585, 19045, 16791, NA, NA, 11692, 11742, 14538, 13400,
    11969, 9215, 8335, 9545)

    Sample initial values:
    "alpha.mu" <- 0.5
    "tau.alpha" <- 0.5
    "tau.obs" <- 1
    "tau.pop" <- 1
    "beta" <- 1
    "epsilon" <- 1

    #Using this following rjags code
    library(rjags)
    library(coda)
    data <- read.jagsdata("data.R")
    inits <- read.jagsdata("inits.R")
    parameters = c("pop") # The parameter(s) to be monitored.
    adaptSteps = 500 # Number of steps to "tune" the samplers.
    burnInSteps = 1000 # Number of steps to "burn-in" the samplers.
    nChains = 2 # Number of chains to run.
    numSavedSteps=50000 # Total number of steps in chains to save.
    thinSteps=1 # Number of steps to "thin" (1=keep every step).
    nIter = ceiling( ( numSavedSteps * thinSteps ) / nChains ) # Steps per chain.
    m <- jags.model("state-space.bug", data, inits, n.chains=nChains, n.adapt=adaptSteps)
    

    I'm getting the following error: Compilation error on line (log.pop.mu[t] <- alpha[t] + beta*log(pop[t-1])). Unable to resolve node pop[3]

    My clarifications are:
    1. What is wrong with my code? Is the error message because of missing data?
    2. Have I written the precisions for the distributions of the parameters correctly?
    3. Do I need the epsilon term in the code? (Other examples I have seen seem to ignore using it?)
    4. Don't I need to reconvert my logarithmic estimates to population counts to be able to compare with the estimates provided in the input?
    5. To compare the original time series with the Bayesian estimated time series, is the following code an appropriate approach:

    #coda code
    codaSamples <- coda.samples(m, variable.names=parameters , n.iter=nIter , thin=thinSteps )
    mcmcChain = as.matrix( codaSamples )
    mcmc.means <- colMeans(mcmcChain, na.rm=TRUE, dims=1) #Do I need to take the means?
    years <- seq(1961,2005, by=1)
    plot(years,mcmc.means, type='l')
    lines(years, data$pop, col='blue')
    

    Thank you.

    -Farshid

     
    • Josh

      Josh - 2013-01-15

      The first thing that caught my eye was this line: log(pop[t-1]). I assume you
      want log.pop[t-1], as in the population size at time t depends on the
      population size at t-1.

      Also, drop the epsilon; you are already doing that with the dnorm statement
      in the process model.

      The response is Y, not pop, so you need to pass your observations to Y.
      Don't forget to log them first.

      You will also want to give the model a starting point; that is, specifying
      the value of the population in the first year is necessary. The value in
      year 2 is calculated from year 1, and if your loop starts at 2 then you need
      to specify year 1 outside the loop.

      I prefer to put the prior for alpha outside the likelihood in its own loop.
      I also added log.pop to the equation to make it consistent with the
      equations you provide at the beginning of your post ((1 + B1)X[t-1]
      multiplied out looks like X[t-1] + B1 * X[t-1]).

      It looks like you are monitoring the value of your data; I would suggest
      monitoring the precisions, alpha, beta and log.pop instead. I would also
      suggest simulating some data for this model and fitting it, so you can get a
      handle on what the terms are and how they behave.

      Anyway, some mods are below....not too extensive or thoughtful, just what I
      noticed.

          log.pop[1] ~ dnorm(some.guess, 0.0001)

          # System process model
          for (t in 2:T){
              alpha[t] ~ dnorm(alpha.mu, sigma.alpha)
              log.pop.mu[t] <- log.pop[t-1] + alpha[t] + beta * log.pop[t-1]
              log.pop[t] ~ dnorm(log.pop.mu[t], sigma.pop)T(0,)    # Truncation can be helpful
          }

          # Observation process model
          for (t in 2:T){
              Y[t] ~ dnorm(log.pop[t], sigma.obs)T(0,)
          }
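      To expand on the simulating-data suggestion: something like the following in
      plain R generates data from this process/observation structure. The
      parameter values here are placeholders I made up, not estimates.

```r
# Simulate from the log-scale AR(1) state-space model
# (alpha, beta and the SDs below are illustrative guesses):
set.seed(1)
T <- 45
alpha   <- 0.5     # intrinsic growth term
beta    <- -0.05   # density-dependence term
sd.proc <- 0.1     # process error SD
sd.obs  <- 0.15    # observation error SD

log.pop <- numeric(T)
log.pop[1] <- log(8000)                  # starting state
for (t in 2:T) {
  mu <- log.pop[t - 1] + alpha + beta * log.pop[t - 1]
  log.pop[t] <- rnorm(1, mu, sd.proc)    # process error
}
Y <- rnorm(T, log.pop, sd.obs)           # observed log-abundance
```

      Fit the model to Y and check whether the posteriors recover the values you
      simulated with.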

      As for plotting, explore the object and figure out where values are stored and
      what information is available. Functions such as names and str should be
      helpful.
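      For instance, with a stand-in object (fake draws here, since I don't have
      your samples; coda.samples returns something similar):

```r
# Poke at a coda object to see where values are stored
# (fake draws standing in for real MCMC output):
library(coda)
fake <- mcmc(matrix(rnorm(200), ncol = 2,
                    dimnames = list(NULL, c("alpha.mu", "beta"))))
str(fake)                     # matrix of draws plus mcmc attributes
varnames(fake)                # names of the monitored nodes
summary(fake)$statistics      # posterior means, SDs, etc.
```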

      A similar model is covered by Kery and Schaub with code in Bayesian Population
      Analysis using WinBUGS: A hierarchical perspective, about Chapter 5 if I
      remember correctly.

      Best of luck,
      Josh


  • Farshid Ahrestani

    Thanks Josh. A few more clarifications:
    1. Could you further clarify what you meant by your comment:
    "It looks like you are monitoring the value of your data, I would suggest monitoring the precisions, alpha, beta and log.pop."?
    2. I referred to Kery & Schaub's chapter today and tried following their example of transforming the log(population) estimates back into real counts, i.e. applying an exponential transformation to the estimates (the data I supplied to the model were log values of the population), e.g.:

    # Population sizes on real scale
    for (t in 2:T) {
        real.pop[t] <- exp(log.pop[t])
    }
    

    However, coda.samples complained: "Failed to set trace monitor for real.pop.
    Node not found" (this is obviously tied to your comment about monitoring the right variables). The code in Kery & Schaub's chapter, however, seems to achieve this by defining the real.pop variable in a for loop, not as a variable with a prior distribution?
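    (To be concrete, my understanding of their pattern is that real.pop is a
    deterministic derived node defined inside the model block itself, so that it
    exists as a node JAGS can monitor by name; a rough sketch using the node
    names above, with the rest of the model elided:)

```
model {
    ...                                   # priors and state-space model as before

    # Derived quantities: population on the real scale
    for (t in 2:T) {
        real.pop[t] <- exp(log.pop[t])    # deterministic node; no prior needed
    }
}
```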

     
