state space model zero truncated negative binomial distribution

Help
2013-10-10
2013-10-11
  • Mathieu Chevalier

    Hi everybody,
    I am new in bayesian analysis and Jags. Here is my problem.
    I would like to perform state space modelling on several population time series. For each time series I would like to fit a density dependent model and then to estimate the parameter of this model (e.g. the coefficient of density dependence). This seems simmple. However, for some time series I have some transitions between Nt=0 to Nt1>=0 which correspond to a different process than the density dependent model (which modeled locol recruiotment processes). Indeed, such transitions could be explained either by colonisation from neighbouring populations or by detection error (i.e. the species is present but not detected). Therefore, the transitions from Nt=0 to Nt1>=0 have to be modeled with a different model than the transitions from Nt>0 to Nt1>=0. This correspond more or less to hurdle count model where the zeros are modeled using a binomial distribution while the positive counts are modeled using a truncated (poisson or negative binomial) distribution. I have already perform that work whithout taking into account observation error under the frequentist framework (i.e. not using state-space modelling; see the corresponding code).

    Y <- c(15,0,62,37,54,0,46,0,45,40,7,0,20,28,8,25,13,0,84,96)
    sampling_area <- c(1050,1050,1050,1050,1125,1125,1125,1125,1125,1125,1080,1036.5,1035,975,1095,1095,950,1123,1125,1050)
    Nt1 <- Y[-1]
    Nt <- Y[-length(Y)]
    surface_Nt1 <- sampling_area[-1]
    surface_Nt <- sampling_area[-length(Y)]
    df <- as.data.frame(cbind(Nt,Nt1,surface_Nt,surface_Nt1,dates))
    nt_0 <- df$Nt == 0
    require(MASS)
    mod1 <- try(glm.nb(Nt1 ~ offset(log(surface_Nt1)),data=df[nt_0,],control=glm.control(maxit=1000))) #modelling the transitions from Nt=0 to Nt1>0
    mod2 <- try(glm.nb(Nt1 ~ I(Nt/surface_Nt) + offset(log(surface_Nt1) + log(Nt/surface_Nt)),data=df[!nt_0,],control=glm.control(maxit=1000))) #modelling the positive counts 
    

    Now I would like to perform EXACTLY the same job using state space modelling under bayesian inference.
    Here is what I tried. however this code does not work and I do not know how to fix the problem...
    Any help would be highly appreciated!!
    Thanks in advance
    Mathieu

    require(dclone)
    require(snow)
    require(rjags)
    data.for.jags=list("Y"=Y,"samp"=sampling_area,"time"=length(Y)-1)
    zero_trunc_SSM=function(){
    transition equation
    for(i in 2:time)    {
    X[i]~dnorm(meanX[i],tau)
    meanX[i] <- test[i-1]*(sampling_area[i]*(X[i-1]/sampling_area[i-1]))*exp(rho-eta*(X[i-1]/sampling_area[i-1])) + (1-test[i-1])*exp(gam)*sampling_area[i] 
    }
    
        # observation equation
        for(i in 1:time)    {
        test[i]~dinterval(Y[i],threshold) 
            p[i] <- theta / (theta + (X[i] * mu[i]) )
            Y[i] ~ dnegbin(p[i], theta)
            }
    
    # Priors on model parameters
    rho~dunif(-10,10)           
    eta~dunif(-10,10)               
    gam~dunif(-10,10)               
    log.sigma2~dunif(-10,10) 
    sigma2<-exp(log.sigma2)         
    tau<-1/sigma2               
    X[1]<-log(24)               
    sampling_area[1]<-log(1000)
    threshold<-0.5
    theta~dgamma(10,2)
    }
    cl <- makeSOCKcluster(4)
    mcmc <- jags.parfit(cl, data.for.jags, c("rho","eta","log.sigma2","gam","theta"), zero_trunc_SSM, n.chains=4, n.adapt=5000, n.update=5000, n.iter=20000, thin=20)
    gelman.diag(mcmc) #Gelman and Rubin's convergence diagnostic 
    summary(mcmc)
    
     
    Last edit: Martyn Plummer 2013-10-11
  • Martyn Plummer

    Martyn Plummer - 2013-10-11

    When you say the code does not work, it would be helpful to tell us the error message.

    Looking at your code I can see two problems:
    1) You are referring to sampling_area in your BUGS code, but you have renamed this variable "samp" in your data list.
    2) Your code refers to nodes mu[i] for i = 1:time but it is never defined.

     
  • Mathieu Chevalier

    Hi Martin,

    First of all, thanks for your help.
    Indeed, you were right about the two points you have raised. I corrected these errors but I still get an error message:

    "Erreur dans checkForRemoteErrors(val) :
    4 nodes produced errors; first error: Error in node Y[3]
    Invalid parent values"

    Here is the code which produces this error message:

    require(dclone)
    require(snow)
    require(rjags)
    
    data.for.jags=list("Y"=Y,"sampling_area"=sampling_area,"time"=length(Y)-1)
    
    zero_trunc_SSM=function(){
    
    # transition equation
        for(i in 2:time)    {
        X[i]~dnorm(meanX[i],tau)
        meanX[i]<-test[i-1]*(sampling_area[i]*(X[i-1]/sampling_area[i-1]))*exp(rho-eta*(X[i-1]/sampling_area[i-1])) + (1-test[i-1])*exp(gam1)*sampling_area[i]  
        }
    
            # observation equation
            for(i in 1:time)    {
                log(mu[i])<-X[i]
                test[i]~dinterval(Y[i],threshold) 
                p[i]<-theta/(theta+(X[i]*mu[i]))
                Y[i]~dnegbin(p[i],theta)
                }
    
        # Priors on model parameters
        rho~dunif(-10,10)           
        eta~dunif(-10,10)               
        gam1~dunif(-10,10)              
        log.sigma2~dunif(-10,10) 
        sigma2<-exp(log.sigma2)         
        tau<-1/sigma2               
        X[1]<-log(24)               
        threshold<-0.5
        theta~dgamma(10,2)
    
    }
    
    cl <- makeSOCKcluster(4)
    mcmc <- jags.parfit(cl, data.for.jags, c("rho","eta","log.sigma2","gam1","theta"), zero_trunc_SSM, n.chains=4, n.adapt=5000, n.update=5000, n.iter=20000, thin=20)
    gelman.diag(mcmc)       #Gelman and Rubin's convergence diagnostic (values above 1 indicate lack of convergence)
    summary(mcmc)
    

    Moreover, besides being novice in bayesian analysis, I am also novice in state space modelling. Consequently, I am not even sure that the bayesian code corresponds to the frequentist non state-space model code provided in the previous post. Specifically, count data are usually modeled using poisson or negative binomial distribution (the last one being to account for overdispersion in the data, which is my case). However, I do not know to which equation (transition or observation) the negative binomial distribution should be applied... Any help would be really appreciated.
    Finally, I do not know if the likelihood function in the first loop is correctly defined. I would like to use the following model to the positive counts (Nt>0 -> Nt1>=0) which modeled a local recruitment process:
    Nt1 = St1(Nt/St)exp(rho-eta(Nt/St))
    whereas I would like to use the following model to the transitions from zeros to positive counts (Nt=0 -> Nt1>=0) which modeled a colonisation process:
    Nt1 = St1
    exp(gamma).
    Thus is the defintion of the variable "meanX[I] correct?

    Anyway, many thanks for giving consideration to this post.

    Mathieu

     
  • Martyn Plummer

    Martyn Plummer - 2013-10-11

    I can't help you with the validity of your model. Your immediate problem is, I think, caused by the fact that a large negative value of X[i] can cause p[i] to be negative, and hence an invalid parameter for the negative binomial distribution.

    More broadly, you can't just jump into complex modelling with BUGS. You need to work up to something this complex. Googling "state space model winbugs" generates lots of hits that look like they may be able to point you in the right direction.

     

Log in to post a comment.