
Zero trick for Inverse Gaussian

2014-10-10
2014-10-13

    Fadzli Fuzi - 2014-10-10

    Hi,

    I'm trying to fit my data to an Inverse-Gaussian distribution. The model runs with no error, but the results are not satisfactory: I get R.hat values greater than 1 and n.eff values in the single digits. What did I do wrong, and how can I improve the results?

    Below is the code:

    ~~~~~~~~~~

    modelText <- "
    data{
       C <- 10000
       for(i in 1:N){
          zeros[i] <- 0
       }
    }
    model{
       for(i in 1:N){
          zeros[i] ~ dpois(zeros.mean[i])
          zeros.mean[i] <- -l[i] + C
          # log-likelihood
          l[i] <- 0.5*(log(lambda) - log(2*3.14) - 3*log(P[i])) - 0.5*lambda*pow((P[i]-mu[i])/mu[i],2)/P[i]
          # log-link + linear predictor
          log(mu[i]) <- inprod(X[i,],beta[])
       }
       # priors
       for(j in 1:18){
          beta[j] ~ dnorm(0.0, 0.001)
       }
       lambda ~ dgamma(0.01,0.01)
       s2 <- 1/lambda
       Deviance <- -2*sum(l[1:N])
    }"

    ~~~~~~~~~~~~
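    (For reference, the l[i] line above is the Inverse-Gaussian log-density,

    \[
    \log f(P_i \mid \mu_i, \lambda)
      = \tfrac{1}{2}\bigl(\log\lambda - \log 2\pi - 3\log P_i\bigr)
        - \frac{\lambda\,(P_i - \mu_i)^2}{2\,\mu_i^2\,P_i},
    \]

    with 2π approximated by 2*3.14 in the code.)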

    Thanks in advance.

     

    Martyn Plummer - 2014-10-10

    It's hard to say what is wrong if you don't give a reproducible example, i.e. one with data that readers can try out for themselves.

    Having said that, your problem closely resembles the example from chapter 8 of Bayesian Modeling Using WinBUGS by Ioannis Ntzoufras. I have reproduced this example below and it runs fine. This leads me to conclude that the problem lies with your data (which you have not shared). Specifically, you have 18 covariates, which is a lot. If they are highly correlated, then you will get poor performance from Gibbs sampling, which updates each beta parameter individually.
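    A quick way to check is to look at the pairwise correlations among your covariates. A minimal sketch in R, assuming X is your N x 18 design matrix with the intercept in column 1:

    ~~~~~~
    # Sketch: flag strongly correlated covariate pairs.
    # Assumes X is the N x 18 design matrix, intercept in column 1.
    R <- cor(as.matrix(X[, 2:18]))
    which(abs(R) > 0.8 & abs(R) < 1, arr.ind = TRUE)
    ~~~~~~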

    Model

    data {
       for (i in 1:n) {
          zeros[i] <- 0
       }
       C <- 10000
    }
    model {
       for (i in 1:n) {
          zeros[i] ~ dpois(zeros.mean[i])
          zeros.mean[i] <- -l[i] + C
          l[i] <- 0.5*( log(lambda) - log(2*3.14) - 3*log(y[i]) ) - 0.5*lambda*pow( (y[i]-mu[i])/mu[i], 2 )/y[i]
          log(mu[i]) <- beta[1] + beta[2]*x1[i] + beta[3]*x2[i] + beta[4]*x3[i] + beta[5]*x4[i]
       }
       # priors
       for (j in 1:5) { beta[j] ~ dnorm(0.0, 0.001) }
       lambda ~ dgamma(0.01, 0.01)
       s2 <- 1/lambda
    }
    

    Data

    n <- 100
    y <-
    c(41.273, 72.628, 3.895, 56.98, 5.552, 5.744, 6.399, 10.324, 
    19.688, 14.923, 70.551, 36.876, 73.614, 0.63, 6.016, 14.83, 5.941, 
    8.226, 32.997, 5.993, 11.129, 6.265, 1.09, 34.514, 22.988, 13.241, 
    1.984, 1.587, 3.534, 125.5, 85.96, 15.998, 305.689, 68.492, 72.521, 
    7.445, 2.273, 29.444, 11.045, 3.131, 5.312, 33.617, 23.875, 41.723, 
    52.23, 5.472, 1.327, 3.533, 2.698, 34.683, 3.868, 20.195, 10.076, 
    8.767, 9.262, 28.838, 8.238, 8.719, 22.366, 15.731, 6.479, 5.372, 
    1150.242, 2.373, 5.696, 27.043, 10.092, 9.669, 21.733, 1.151, 
    11.928, 1.313, 3.558, 0.867, 8.658, 23.43, 0.361, 37.741, 6.074, 
    10.722, 5.246, 21.654, 4.344, 2.405, 12.9, 4.277, 4.003, 0.833, 
    3.549, 0.091, 12.15, 9.041, 2.137, 2.016, 8.747, 1.256, 25.758, 
    16.548, 9.812, 3.415)
    x1 <-
    c(0.475, 1.958, -0.957, 0.442, 1.234, -0.407, 0.978, 1.733, 0.055, 
    0.863, 0.087, 0.836, -0.366, -1.478, 1.185, 1.058, 0.293, 0.665, 
    1.278, -1.578, -0.861, -0.889, -0.954, -0.248, 1.257, 2.029, 
    -1.436, -0.484, 0.259, 1.477, 0.114, 0.137, 0.416, 0.672, 1.614, 
    -0.246, -0.53, 0.704, -0.155, -0.738, -0.638, 1.591, -0.069, 
    1.692, 0.424, 0.214, -1.031, -0.393, -0.579, 0.546, -1.344, 0.105, 
    0.195, 1.313, -0.684, -0.003, -0.615, 0.466, -0.579, -0.927, 
    -0.729, 0.136, 1.558, -0.136, -0.539, 0.637, -0.173, 1.673, -0.222, 
    0.14, 1.335, 0.884, 1.025, -1.228, 0.528, 0.678, -1.424, 1.997, 
    -0.412, 0.406, 0.412, 2.916, -0.212, -0.711, -0.406, -0.524, 
    1.536, -1.537, -0.738, -2.417, 1.259, -0.548, -0.014, 0.957, 
    0.289, -1.214, 1.076, 0.056, -0.883, -0.069)
    x2 <-
    c(0.404, -1.727, -0.556, 1.205, 0.077, 0.101, -0.334, -0.682, 
    -1.702, 1.198, 0.304, 0.4, -1.397, 0.342, -1.669, 1.213, -3.665, 
    -0.685, 1.116, -1.834, -1.285, -0.94, 1.201, -0.587, 0.348, 0.195, 
    -0.986, 1.742, -1.063, 0.252, -0.706, -0.223, -0.228, -2.281, 
    1.148, -0.613, 0.249, 0.737, -0.966, -1.585, 0.373, 0.136, 0.115, 
    1.027, 0.357, 0.35, 0.588, 0.824, 1.338, -0.58, -0.788, -0.073, 
    -2.042, -0.833, -1.021, 0.535, -0.171, 1.609, -0.446, -1.164, 
    -0.2, 1.849, -0.203, 0.93, 0.586, -0.713, 0.744, -0.231, -1.904, 
    -0.68, -1.232, 0.12, -0.265, 0.372, 0.233, 0.043, 0.961, -2.326, 
    0.506, 1.065, 0.838, -0.235, 0.28, 0.737, -0.327, -0.042, 0.512, 
    0.121, -0.035, 0.485, 1.534, 0.59, -0.067, -0.816, -1.2, 0.682, 
    -0.727, -0.525, -0.186, 0.916)
    x3 <-
    c(0.36, -0.975, 0.703, 1.386, 0.079, 0.389, 0.76, -2.64, -0.729, 
    -0.973, 0.576, 2.475, -0.74, -0.347, 0.563, 1.4, -0.042, 0.748, 
    -1.813, -2.182, 0.006, 0.149, 0.003, 0.135, -0.153, -1.331, -0.368, 
    0.046, 0.137, 1.868, -1.933, -0.284, 0.747, 0.678, -1.301, 0.117, 
    -0.318, -0.58, -1.251, -1.549, -1.672, -1.433, 0.082, -0.261, 
    -0.758, 0.039, -1.377, -1.238, -0.287, 0.171, 0.719, 0.184, 0.112, 
    -2.449, 0.633, 0.667, -0.084, 1.321, 0.536, -0.932, -0.89, 0.125, 
    2.671, 0.265, -1.656, 0.291, -1.259, -0.082, -1.113, 0.645, -0.393, 
    -0.166, -0.605, 0.228, 0.131, 0.55, 0.689, 1.1, 0.463, -0.712, 
    -0.898, -0.737, 0.344, 2.71, 0.344, 1.332, 1.908, -1.562, 0.406, 
    -0.505, -0.137, -0.8, -0.859, -0.629, 0.222, -0.718, 0.521, 0.197, 
    -0.082, 0.921)
    x4 <-
    c(-0.119, -1.501, -0.289, -0.002, -0.263, 1.287, -0.087, -0.743, 
    -1.295, -0.341, -0.682, 0.798, -0.104, 0.03, 0.992, 1.023, 0.727, 
    2.839, -0.129, -0.063, 0.116, -0.425, 0.02, -0.276, -1.363, -0.33, 
    -0.865, 0.679, -0.254, -0.837, 0.07, -0.13, 2.037, 0.099, 0.815, 
    -0.549, -0.233, -0.163, 2.063, 0.88, 0.851, -1.447, -1.876, -0.44, 
    0.428, 1.344, -0.248, -0.466, 0.728, -0.823, -0.278, -2.204, 
    -1.886, 0.027, 3.27, -0.868, 1.561, 0.364, 0.172, -0.185, -0.263, 
    -1.786, -0.602, -1.981, 0.799, -1.218, -0.812, -0.22, -0.648, 
    -1.874, -0.955, -1.107, -0.454, -0.615, 1.916, 1.688, 0.06, -0.006, 
    -1.691, -0.22, -0.063, 0.285, -1.911, 0.173, 1.798, 0.921, -0.508, 
    0.809, -0.563, -0.265, -0.522, -0.411, 0.797, 0.886, 0.043, -0.639, 
    -0.602, -0.389, 0.009, -0.109)
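
    To run this example end to end, a minimal R driver might look like the following (a sketch using the rjags package; the file name, chain count, and iteration numbers are placeholders):

    ~~~~~~
    library(rjags)

    # Assumes the model block above is saved as "ig-zeros.bug" and the
    # data values n, y, x1, ..., x4 above are loaded in the workspace.
    ig.data <- list(n = n, y = y, x1 = x1, x2 = x2, x3 = x3, x4 = x4)

    m <- jags.model("ig-zeros.bug", data = ig.data, n.chains = 3)
    update(m, 1000)  # burn-in
    s <- coda.samples(m, variable.names = c("beta", "s2"), n.iter = 10000)
    summary(s)
    gelman.diag(s)   # R-hat convergence diagnostic
    ~~~~~~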
    
     

    Fadzli Fuzi - 2014-10-10

    Thanks for the quick response. The data is attached. Below is the full code.

    ~~~~~~
    Data <- read.csv("Data.csv")

    # Define data matrix
    X <- cbind(1, Data[, c(1:17)])
    attach(X)
    P <- Data$P
    N <- length(Data$P)

    IGaussian.data <- list("P", "N", "X")

    # Inverse-Gaussian using the zero trick
    modelText <- "
    data{
       C <- 10000
       for(i in 1:N){
          zeros[i] <- 0
       }
    }
    model{
       for(i in 1:N){
          zeros[i] ~ dpois(zeros.mean[i])
          zeros.mean[i] <- -l[i] + C
          # log-likelihood
          l[i] <- 0.5*(log(lambda) - log(2*3.14) - 3*log(P[i])) - 0.5*lambda*pow((P[i]-mu[i])/mu[i],2)/P[i]
          # log-link + linear predictor
          log(mu[i]) <- inprod(X[i,],beta[])
       }
       # priors
       for(j in 1:18){
          beta[j] ~ dnorm(0.0, 0.001)
       }
       lambda ~ dgamma(0.01,0.01)
       s2 <- 1/lambda
       Deviance <- -2*sum(l[1:N])
    }"
    writeLines(modelText, "IGaussian.txt")

    # Define parameters to monitor
    IGaussian.params <- c(paste("beta[", 1:18, "]", sep = ""), "Deviance")

    # Define starting values
    IGaussian.inits <- function(){ list(beta = rep(-0.1, 18)) }
    ~~~~~~~
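
    (The jags() call itself is not shown above; with R2jags, which the R.hat/n.eff output suggests, it is along these lines - the chain and iteration settings here are indicative only:)

    ~~~~~~
    library(R2jags)

    # Hypothetical fitting call completing the setup above; the chain and
    # iteration settings are placeholders, not the ones actually used.
    IGaussian.fit <- jags(data = IGaussian.data,
                          inits = IGaussian.inits,
                          parameters.to.save = IGaussian.params,
                          model.file = "IGaussian.txt",
                          n.chains = 3, n.iter = 20000, n.burnin = 10000)
    print(IGaussian.fit)
    ~~~~~~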

    The response variable P[i] is the insurance premium and the covariates are risk classes. I'll consider reducing the covariates, but please let me know if you can get good results using all 18 of them. Thanks!

     

    Last edit: Fadzli Fuzi 2014-10-10

    Martyn Plummer - 2014-10-10

    You can centre the covariates by defining X like this:

    X <- Data[, 1:17]
    x.bar <- apply(X, 2, mean)
    X <- sweep(X, 2, x.bar, FUN = "-")
    X <- cbind(1, X)
    x.bar <- c(1, x.bar)
    

    You will still get correct estimates for beta[2] up to beta[18]. The intercept parameter beta[1] is different, but, if you want, you can recover the original intercept as

    alpha <- inprod(beta, x.bar)
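
    Note that for alpha to be computed inside the model, x.bar (including the leading 1) has to be passed to JAGS as data along with the centred X. With the list-of-names style from your earlier script, that would be, for example:

    ~~~~~~
    # Hypothetical extension of the earlier data list: x.bar is supplied
    # as data so the model can compute alpha on the original scale.
    IGaussian.data <- list("P", "N", "X", "x.bar")
    ~~~~~~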
    

    With this reparameterization, your model runs well with a thinning interval of 10 (i.e. 10000 iterations to get 1000 samples).

    Martyn

     

    Fadzli Fuzi - 2014-10-13

    The reparameterization has brought the Rhat values close to 1, but some n.eff values are still low compared to the number of simulations saved. Here's the updated model:

    ~~~~~~~~~~~~~~
    modelText <- "
    data{
       C <- 10000
       for(i in 1:N){
          zeros[i] <- 0
       }
    }
    model{
       for(i in 1:N){
          zeros[i] ~ dpois(zeros.mean[i])
          zeros.mean[i] <- -l[i] + C
          # log-likelihood
          l[i] <- 0.5*(log(lambda) - log(2*3.14) - 3*log(P[i])) - 0.5*lambda*pow((P[i]-mu[i])/mu[i],2)/P[i]
          # log-link + linear predictor (X is now the centred design matrix)
          log(mu[i]) <- inprod(X[i,],beta[])
       }
       # priors
       for(j in 1:18){
          beta[j] ~ dnorm(0.0, 0.001)
       }
       lambda ~ dgamma(0.01,0.01)
       s2 <- 1/lambda
       Deviance <- -2*sum(l[1:N])
    }"
    ~~~~~~~~~~~~~~
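
    (One way to see which parameters are still mixing poorly, assuming an R2jags fit object named IGaussian.fit as sketched earlier:)

    ~~~~~~
    # Per-parameter convergence diagnostics from the hypothetical
    # R2jags fit object IGaussian.fit.
    round(IGaussian.fit$BUGSoutput$summary[, c("Rhat", "n.eff")], 2)
    ~~~~~~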

    Fadzli

     
