# The jackknife estimator

Jackknife estimators are used in ecology in two situations:
• mark-recapture estimation of number of animals in a closed population;
• species richness estimation for a defined assemblage.
In both cases, the raw number of animals or species observed ($$S_{obs}$$) is often too low, as some animals/species are missed. The raw number is thus a biased estimator. The jackknife aims to produce unbiased estimates.
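| Tiger | Detections on occasions 1 to 10 |
|---|---|
| 1 | 1001000110 |
| 2 | 1000000101 |
| 3 | 1101000011 |
| 4 | 0110000111 |
| 5 | 0101000001 |
| 6 | 0100000000 |
| 7 | 0010011000 |
| 8 | 0011000000 |
| 9 | 0001000011 |
| 10 | 0001100110 |
| 11 | 0001001001 |
| 12 | 0000100000 |
| 13 | 0001000000 |
| 14 | 1001000000 |
| 15 | 0001000000 |
| 16 | 0001100000 |
| 17 | 0001001000 |
| 18 | 0000110000 |
| 19 | 0000100000 |
| 20 | 0000010000 |
| 21 | 1000101000 |
| 22 | 0100000000 |
| 23 | 0001000001 |
| 24 | 0000010000 |
| 25 | 0000000001 |
| 26 | 0000100000 |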

The data consist of detected/not-detected observations on a series of occasions, usually recorded as a matrix of ones and zeros, with a column for each occasion and a row for each animal or species. The matrix above shows data for 26 tigers and 10 capture occasions from a study in Kanha Tiger Reserve (Karanth et al., 2004).

### Estimating bias

If we had data from an infinite number of occasions, we would know the true number of animals/species (S). The bias is due to an insufficient number of sampling occasions. We can estimate this bias by considering what would happen if we had fewer occasions.

If we have data for n occasions, we can generate n jackknife samples by leaving out one occasion at a time. For each jackknife sample we count how many animals/species were detected, which gives n ‘partial estimates’, denoted $$S_{-i}$$ when occasion i is omitted. For the tiger data, when the first occasion is omitted, we still have records for 26 tigers: tigers 1, 2, 3, 14 and 21 were all caught on other occasions, so $$S_{-1}=26$$. But when occasion 2 is omitted, tigers 6 and 22 drop out, so $$S_{-2}=24$$.

From these, we can calculate a set of n ‘pseudo-values’:

$S_i^*=nS_{obs}-(n-1)S_{-i}$

Partial estimates and pseudo-values for the tiger data are shown in the table below:

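| i = | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | mean |
|---|---|---|---|---|---|---|---|---|---|---|---|
| partial estimates | 26 | 24 | 26 | 24 | 23 | 24 | 26 | 26 | 26 | 25 | |
| pseudo-values | 26 | 44 | 26 | 44 | 53 | 44 | 26 | 26 | 26 | 35 | 35 |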
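The partial estimates and pseudo-values for the tiger data can be reproduced with a few lines of Python. This is only a minimal sketch: the capture histories are copied from the tiger matrix, and the names `histories`, `detected_without`, `partial` and `pseudo` are chosen here for illustration and do not come from CAPTURE or wiqid.

```python
# Capture histories for the 26 tigers: one string per tiger,
# one character per occasion ("1" = detected, "0" = not detected).
histories = [
    "1001000110", "1000000101", "1101000011", "0110000111", "0101000001",
    "0100000000", "0010011000", "0011000000", "0001000011", "0001100110",
    "0001001001", "0000100000", "0001000000", "1001000000", "0001000000",
    "0001100000", "0001001000", "0000110000", "0000100000", "0000010000",
    "1000101000", "0100000000", "0001000001", "0000010000", "0000000001",
    "0000100000",
]

n = len(histories[0])                      # number of occasions
s_obs = sum("1" in h for h in histories)   # animals detected at least once


def detected_without(histories, i):
    """Count animals still detected when occasion i (0-based) is left out."""
    return sum("1" in h[:i] + h[i + 1:] for h in histories)


# One partial estimate and one pseudo-value per omitted occasion.
partial = [detected_without(histories, i) for i in range(n)]
pseudo = [n * s_obs - (n - 1) * s for s in partial]
jackknife_mean = sum(pseudo) / n

print(partial)         # [26, 24, 26, 24, 23, 24, 26, 26, 26, 25]
print(pseudo)          # [26, 44, 26, 44, 53, 44, 26, 26, 26, 35]
print(jackknife_mean)  # 35.0
```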

If the bias is proportional to 1/n, the bias will cancel out, and the mean of the pseudo-values (35 for the tiger data) will be an unbiased estimator of S.
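The cancellation can be checked in two lines. Assume that the expected count after n occasions falls short of S by a term proportional to 1/n. The constant $$a$$ is a symbol introduced here for illustration only. Each partial estimate is based on n - 1 occasions, so

$E[S_{obs}] = S - \frac{a}{n}, \qquad E[S_{-i}] = S - \frac{a}{n-1}$

and substituting into the pseudo-value formula gives

$E[S_i^*] = n\left(S - \frac{a}{n}\right) - (n-1)\left(S - \frac{a}{n-1}\right) = nS - a - (n-1)S + a = S$

Any bias terms of higher order in 1/n would not cancel at this step.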

You may have noticed that the species or animals which drop out are those which were detected on only one occasion, called "singletons". Burnham and Overton (1979) showed that the jackknife estimator can be calculated from the number of singletons, $$f_1$$:

$S^*=S_{obs}+\frac{n-1}{n} f_1$

The tiger data have 10 animals caught only once, $$f_1 = 10$$, so $$S^* = 26 + \frac{9}{10} \times 10 = 35$$.
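The shortcut works because each singleton drops out of exactly one jackknife sample, so the partial estimates sum to $$nS_{obs} - f_1$$. A minimal sketch of the formula in Python follows; the function name `jackknife1` is hypothetical and is not the wiqid or CAPTURE interface.

```python
def jackknife1(s_obs, n, f1):
    """First-order jackknife estimate from the singleton count.

    s_obs: number of animals/species observed
    n:     number of occasions
    f1:    number of animals/species detected on exactly one occasion
    """
    return s_obs + (n - 1) * f1 / n


# Tiger data: 26 animals observed, 10 occasions, 10 singletons.
print(jackknife1(s_obs=26, n=10, f1=10))  # 35.0
```

With no singletons the function returns $$S_{obs}$$ unchanged, which matches the idea that nothing drops out when an occasion is omitted.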

The ‘drop-one-out’ method is a first-order jackknife; we can also drop out more than one occasion at a time.

### Higher-order jackknives

If we drop out k occasions at a time, we can calculate a kth-order jackknife estimate. The details can get quite complex, but Burnham and Overton (1979) provide equations making use of frequency of capture. For example, the second-order jackknife estimate is

$S_2^*=S_{obs}+\frac{2n-3}{n} f_1 - \frac{(n-2)^2}{n(n-1)} f_2$

where $$f_2$$ is the number of animals or species recorded twice; $$f_2$$ = 6 for the tiger data, so we have

$S_2^*=26+\frac{2 \times 10-3}{10} \times 10 - \frac{(10-2)^2}{10(10-1)} \times 6=26+17-4.2667=38.733$
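The same arithmetic can be written as a small Python function. As before, this is a sketch of the formula given here, with a hypothetical function name, rather than the code used in CAPTURE or wiqid.

```python
def jackknife2(s_obs, n, f1, f2):
    """Second-order jackknife estimate from the capture frequencies.

    f1 and f2 are the numbers of animals/species detected on exactly
    one and exactly two occasions; n is the number of occasions.
    """
    return (s_obs
            + (2 * n - 3) * f1 / n
            - (n - 2) ** 2 * f2 / (n * (n - 1)))


# Tiger data: 26 animals observed, 10 occasions, f1 = 10, f2 = 6.
print(round(jackknife2(s_obs=26, n=10, f1=10, f2=6), 3))  # 38.733
```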

Since a whole series of jackknife estimates is available, which should we use?

Higher-order jackknives have smaller bias, but they also have larger standard errors, so there is a trade-off between bias and precision. Burnham and Overton (1979) provide a stopping rule which compares the difference between successive jackknives with the standard error. This rule is implemented in Program CAPTURE and in the R code in the wiqid package.

CAPTURE and wiqid add a further refinement: they allow for interpolation between the two best jackknife estimates. So their estimate of the number of tigers in the Kanha Reserve is 33.32, between the value from the first-order jackknife (35) and the value from the "zero-order" jackknife, which is just $$S_{obs}$$ (26).

### References

Burnham, K. P. & Overton, W. S. (1979) Robust estimation of population size when capture probabilities vary among animals. Ecology, 60, 927-936.

Karanth, K. U., Nichols, J. D., Kumar, N. S., Link, W. A., & Hines, J. E. (2004) Tigers and their prey: Predicting carnivore densities from prey abundance. Proceedings of the National Academy of Sciences, 101, 4854-4858.

Updated 23 December 2013 by Mike Meredith