I have data on fish age and length collected over several years. Length has been recorded for all fish, but age has not always been recorded; it can, however, be estimated from fish length.
I want to look at changes in the mean length of the young-of-year (age 0) fish over time. If all fish were age 0, then I would have a simple linear regression:
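The equation itself is not preserved in this thread. A minimal sketch of the regression, assuming a normal likelihood and borrowing the names a1, b1 and sigma that are used later in this post, would be:

```latex
\begin{aligned}
y_i &\sim \mathrm{Normal}(\mu_i, \sigma^2), \qquad i = 1, \dots, n \\
\mu_i &= a_1 + b_1 x_i
\end{aligned}
```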
where y are observed lengths, x is years and n is the number of fish. (I will investigate a year random effect and hierarchical priors, among other things – this is just a simple illustration.)
Fish, however, can be of any age in [0, 1, 2, 3]. Since fish age and length are related, I would like to include an age “submodel” in the mean length analysis to assign an age to unaged fish. By including an age submodel, my hope is to allow uncertainty in fish age to propagate to the mean length analysis.
My idea is to convert the age problem to a Bernoulli problem, whereby fish of age 0 are coded 1 and all other fish are coded 0. The Bernoulli submodel would be something like:
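The equation itself is not preserved in this thread. A minimal sketch of such a submodel, assuming a logit link and using the hypothetical coefficient names a2 and b2 (these do not appear in the original post), would be:

```latex
\begin{aligned}
w_i &\sim \mathrm{Bernoulli}(p_i), \qquad i = 1, \dots, n \\
\operatorname{logit}(p_i) &= a_2 + b_2 x_i
\end{aligned}
```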
where w are age codes, x is years and n is the number of fish. (Again, I will investigate random effects and hierarchical priors, etc.)
I want to combine these models but I’m not sure how to do it. These are the ideas I’ve had:
1. estimate the mean length likelihood for age 0 fish only using an approach similar to a hurdle model but for continuous data, or
2. estimate the mean length likelihood for age 0 and age >0 fish separately but disregard the age >0 estimates, or
3. estimate the mean length likelihood for age 0 fish only.
My initial thoughts are: I don’t think that idea 1 is appropriate; idea 2 would require an indexing trick; and I don’t know how to do idea 3 (hence ideas 1 and 2).
Below is my closest solution using the JAGS ifelse function, but I’m concerned that setting nu to 0 for all age >0 fish affects the estimation of sigma (and perhaps a1 and b1).
Here is some R code that will run the above model (paste the code into a file called "tmp.jags" in the working directory):

Can anyone suggest a way to combine a Bernoulli and simple linear regression for this problem?
Thanks in advance for any help.
Stephen
Perhaps the answer is to split sigma into a vector of two elements, as in:
Does this sound reasonable?
Thanks in advance for any help.
Stephen
You want something more like this. As well as giving a prior distribution to sigma[1] you need to do the same for a1[2] and b1[2] instead of setting them to zero.

Thanks, Martyn. Very kind of you to respond.
So that I'm completely clear: even though I am not interested in parameters a1[2], b1[2] and sigma[2], I should nevertheless give them normal priors. Is that right? If so, why? Perhaps it is to help the model run?
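For concreteness, here is a minimal sketch of the kind of two-group model under discussion, in which the older-fish parameters a1[2], b1[2] and sigma[2] get their own priors rather than being fixed. It is not the code posted in the thread (that code is not preserved). The names a2, b2, g, mu, p and tau, the logit link, and all of the vague priors are assumptions made for illustration.

```jags
model {
  for (i in 1:n) {
    # Age submodel: w[i] = 1 for age-0 fish, 0 for older fish,
    # and NA in the data for unaged fish (JAGS then samples it)
    w[i] ~ dbern(p[i])
    logit(p[i]) <- a2 + b2 * x[i]

    # Group index: 1 = age 0, 2 = older fish
    g[i] <- 2 - w[i]

    # Length submodel with group-specific intercept, slope and sd
    y[i] ~ dnorm(mu[i], tau[g[i]])
    mu[i] <- a1[g[i]] + b1[g[i]] * x[i]
  }

  # Priors for the age submodel (a2 and b2 are assumed names)
  a2 ~ dnorm(0, 1.0E-4)
  b2 ~ dnorm(0, 1.0E-4)

  # Priors for both groups; k = 2 holds the nuisance parameters
  for (k in 1:2) {
    a1[k] ~ dnorm(0, 1.0E-6)
    b1[k] ~ dnorm(0, 1.0E-6)
    sigma[k] ~ dunif(0, 100)   # upper bound is an assumption about the length scale
    tau[k] <- pow(sigma[k], -2)
  }
}
```

Only a1[1], b1[1] and sigma[1] would need to be monitored if the older-fish parameters are of no interest. Centring x (for example, subtracting the first year) would probably help mixing, since raw calendar years make the intercept and slope strongly correlated.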
Yes they are nuisance parameters. But you still need them. At each iteration, the model has to classify each fish into one of two groups (w=0 or w=1). The probability that a fish belongs to one group or another is based on the likelihood ratio between the two possibilities. To calculate this likelihood ratio, you do need to include a plausible sub-model for the lengths of the older fish as well as the younger fish.
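The classification step described above can be written out explicitly. This is a sketch under assumed notation: p_i is the probability from the Bernoulli submodel, and f_1 and f_0 are the normal densities of length under the w=1 and w=0 sub-models, with θ standing for the current parameter values. For an unaged fish, the conditional probability of belonging to group w=1 is:

```latex
\Pr(w_i = 1 \mid y_i, \theta)
  = \frac{p_i \, f_1(y_i)}{p_i \, f_1(y_i) + (1 - p_i) \, f_0(y_i)}
\quad\Longleftrightarrow\quad
\frac{\Pr(w_i = 1 \mid y_i, \theta)}{\Pr(w_i = 0 \mid y_i, \theta)}
  = \frac{p_i}{1 - p_i} \times \frac{f_1(y_i)}{f_0(y_i)}
```

Both densities appear in the likelihood ratio f_1/f_0, so if one sub-model is implausible the classification is driven by that misspecification rather than by the data.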
For example, in your previous model you set all older fish to have length zero by setting a1[2] and b1[2] to zero. This is an implausible model, so in practice the model will classify most fish of non-zero length in group w=0.

I see. That makes good sense. Thanks v much for explaining!
Stephen