I wish to fit a survival model using the data below (the first two columns are time points, NA represents censoring, and the third column is the number of samples). Is there a way to weight the likelihood by the sample size?
If not, I could repeat each time point once per sample, but that would probably make the data set too large.
matrix(c(0.01, 1, 463,
1, 2, 369,
9, 10, 116,
10, 11, 163,
11, 12, 149,
12, 13, 230,
12.5, NA, 150054,
11.5, NA, 146349,
10.5, NA, 118098,
9.5, NA, 41633,
8.5, NA, 30308,
7.5, NA, 25934,
6.5, NA, 29427,
2.5, NA, 37997,
1.5, NA, 40599,
0.5, NA, 43300),ncol=3,byrow=T)
I don't know what you are trying to do here. What do these data represent? Why are the observation times for the censored observations different from the failure times? Are these data interval censored?
Sorry, I was not clear. If both numbers are present, the observation is indeed interval censored. If only the first number is present (the second is NA), the observation is right censored. The third column is the number of observations recorded for that particular time segment.
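On the weighting question itself: rows that share the same times contribute identical likelihood terms, so multiplying each row's log-likelihood term by its count gives the same result as repeating the row that many times. Below is a minimal sketch in Python of that idea. It assumes a constant-hazard (exponential) model purely for illustration, and the names `rows`, `loglik` and `fit_rate` are invented for this example rather than taken from any package.

```python
import math

# (lower, upper, count); upper=None marks right censoring at `lower`.
rows = [
    (0.01, 1, 463), (1, 2, 369), (9, 10, 116),
    (10, 11, 163), (11, 12, 149), (12, 13, 230),
    (12.5, None, 150054), (11.5, None, 146349), (10.5, None, 118098),
    (9.5, None, 41633), (8.5, None, 30308), (7.5, None, 25934),
    (6.5, None, 29427), (2.5, None, 37997), (1.5, None, 40599),
    (0.5, None, 43300),
]

def loglik(lam, data):
    """Count-weighted log-likelihood under an assumed exponential model."""
    total = 0.0
    for lower, upper, count in data:
        contrib = -lam * lower  # log S(lower), with S(t) = exp(-lam * t)
        if upper is not None:
            # log(S(lower) - S(upper))
            #   = -lam*lower + log(1 - exp(-lam*(upper - lower)))
            contrib += math.log(-math.expm1(-lam * (upper - lower)))
        total += count * contrib  # the count acts as a likelihood weight
    return total

def fit_rate(data, lo=-20.0, hi=5.0, tol=1e-10):
    """Golden-section search for the maximum over log(lam) in [lo, hi]."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if loglik(math.exp(c), data) > loglik(math.exp(d), data):
            b = d  # maximum lies in [a, d]
        else:
            a = c  # maximum lies in [c, b]
    return math.exp((a + b) / 2.0)

lam_hat = fit_rate(rows)
print(lam_hat)
```

The search bounds and tolerance are arbitrary choices. A model with a time-varying hazard would need a different `loglik`, but the counts would enter it in the same way.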
Then recode your NA values to 0. If your columns are t, y, n then you have binomial data:
y[i] ~ dbin(p[i], n[i])
where p[i] = Prob(T <= t[i]) and T is the failure time.
If you assume a constant rate lambda = exp(alpha), i.e. alpha = log(lambda), then you can use a complementary log-log link with log time as an offset to define p[i].
cloglog(p[i]) <- alpha + log(t[i])
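For readers who want the missing step, here is a short derivation of why log time appears as an offset. It assumes a constant hazard, so that the survival function is exponential, and it writes the rate as \lambda with \alpha = \log\lambda.

```latex
\begin{align*}
p_i &= \Pr(T \le t_i) = 1 - e^{-\lambda t_i}, \\
\operatorname{cloglog}(p_i) &= \log\bigl(-\log(1 - p_i)\bigr)
  = \log(\lambda t_i)
  = \log\lambda + \log t_i
  = \alpha + \log t_i .
\end{align*}
```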
Since your outcome is rare, a Poisson regression model with a log link will be a good approximation:
y[i] ~ dpois(mu[i])
log(mu[i]) <- alpha + log(t[i]) + log(n[i])
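The Poisson approximation relies on the failure probability 1 - exp(-lambda * t) being close to lambda * t when the outcome is rare. Here is a rough numerical check in Python. The rate of 2.5e-4 is an illustrative assumption of roughly the right order for data like these, not a fitted value, and `poisson_rel_error` is a made-up helper name.

```python
import math

def poisson_rel_error(lam, t):
    """Relative error of the linear approximation lam*t to 1 - exp(-lam*t)."""
    p_exact = -math.expm1(-lam * t)  # exact failure probability by time t
    p_approx = lam * t               # rare-event (log link) approximation
    return (p_approx - p_exact) / p_exact

lam = 2.5e-4  # assumed illustrative rate
errors = {t: poisson_rel_error(lam, t) for t in (1, 5, 13)}
for t, err in errors.items():
    print(t, err)
```

The relative error grows roughly like lambda * t / 2, so the approximation degrades for higher rates or longer follow-up.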
If you want the rate to vary with time then obviously the calculations become more complex.
Note that a similar discussion recently occurred on the BUGS mailing list on jiscmail.ac.uk. Check the archives.