Abstract
Closed population capture-recapture estimation of population size is difficult under heterogeneous capture probabilities. We introduce the minimum chi-square method which can handle multi-occasion capture-recapture data. It complements likelihood methods with elements that can lead to confidence intervals and assessment of goodness-of-fit. We conduct a comprehensive study on the minimum chi-square method for estimating the size of a closed population using multiple-occasion capture-recapture data under heterogeneous capture probability. We also develop two different bootstrap techniques that can be combined with any underlying estimator, be it the minimum chi-square estimator or a likelihood estimator, to perform useful inference for estimating population size. We present a simulation study on the minimum chi-square method and apply it to analyze white stork multiple capture-recapture data. Under certain conditions, the chi-square method outperforms the likelihood based methods.
Introduction
Capture-recapture methods are commonly used in wildlife studies to estimate population size by using information from marked individuals. The Lincoln-Petersen method [1] is the simplest method for estimating a population size with capture-recapture data. Many authors have proposed extensions to model capture-recapture data from a closed population (where the population is closed to births/deaths and immigration/emigration) [2], including developments to incorporate capture probabilities that vary among individuals (heterogeneous capture probabilities). For example, [3] suggested using the full and conditional likelihood methods to estimate population size; [4] recommended using a jackknife estimator; [5] proposed a conditional likelihood estimator with covariates; [6] suggested likelihood estimation with finite mixture models; and [7] investigated a Conway-Maxwell-Poisson estimator as it adapts to different levels of heterogeneity.
Link [8] described the nonidentifiability of a population size N as the case where populations of very different sizes give rise to roughly the same observed capture histories. This implies that when only the capture histories are available, as is normally the case in a capture-recapture experiment, the underlying population size cannot be accurately determined. Link framed the nonidentifiability problem from the standpoint that N cannot be identified from the conditional distribution of the observed data, and considered maximizing both the full likelihood function and the conditional likelihood function to estimate population size. In essence, the nonidentifiability problem refers to over-dependence on the assumed model, in the sense that for a given set of data, the estimated value can be vastly different when the underlying capture probability distribution g(p; θ) changes. In recent wildlife studies, researchers have focused on the distribution of the capture probability p to model heterogeneity and estimate abundance. The authors of [9] utilized a logit-normal model for p and [10] recommended using a beta model for p. The problem of nonidentifiability under heterogeneity remains even with high detection probabilities, a restricted number of model parameters, or a large number of sampling occasions [11].
We explore the minimum chi-square method and compare it to two existing likelihood methods for estimating population size using multi-occasion capture-recapture data, and we apply these to estimate the white stork (Ciconia ciconia) population size in their main European wintering area, southwestern Spain. A white stork data set was obtained from [12], who focused on analyzing individual consistency in the use of food subsidies. Individuals within the study population were detected daily throughout the wintering season lasting 80 days with the additional state information referring to the foraging categories (dumps or rice fields) in which the individuals were detected. The authors of [13] commented that there were about 4000 white storks wintering in the study area when the capture history data were recorded. This number is based on census data collected and synthesized by scientists from the Spanish Ornithological Society in collaboration with Birdlife International. We give complementary estimates based on the three statistical methods of estimation. The existing likelihood methods that we use are the full and conditional likelihood methods. The minimum chi-square method is a recognized alternative to likelihood-based methods [14]. It has been applied to capture-recapture data on a small humpback whale study assuming constant capture probability [15] and was investigated by [16] in the context of time-varying capture probabilities. The authors of [17] mention the possibility of using the minimum chi-square method in a three-sample closed population study, but did not apply it in their study. We are interested in this method because it lends itself naturally to multi-occasion capture-recapture data, which are categorical data that can be handled directly by this method without discretization. Further, a major advantage of the minimum chi-square method is that it provides point estimation, interval estimation and goodness-of-fit assessment at the same time.
We adopt this method for multi-occasion capture-recapture data and show through the white stork data analysis that it is a competitive alternative to existing methods.
The rest of this paper is organized as follows. In Section 2, we first give a brief review of the full and conditional likelihood methods for multi-occasion capture-recapture data, and then discuss and adopt the minimum chi-square method for such data. We also propose two bootstrap techniques for multi-occasion capture-recapture data which can be used to construct confidence intervals for the population size. In Section 3, we present a simulation study to examine the finite sample properties of the minimum chi-square statistic which depend on the underlying population size and sample size. Due to our interest in estimating the white stork population size, we will focus on the case where the population size is not too large. We then apply the likelihood and the minimum chi-square methods to estimate the white stork population size in Section 4 and conclude with some remarks in Section 5. We also studied the accuracy and robustness of the minimum chi-square method when the underlying population is much larger than the white stork population. Results concerning such large population cases are given in the S1 File.
Materials and methods
The full and conditional likelihood methods
We review the full and conditional likelihood methods by following the discussion about these in [11]. Let N be the unknown population size, X1, X2, …, XN be independent random variables where Xi is the number of times individual i is observed, and T be the number of sampling occasions. Define a vector of frequencies f = (f0, f1, …, fT)′ where fx = #(Xi = x) for x = 0, 1, 2, …, T and let p1, p2, …, pN be the capture probabilities of the N individuals. The distribution of Xi given pi follows a binomial distribution with a total number of trials T and a probability of success pi, so we write [Xi|pi]∼ Binomial(T, pi). We assume p1, p2, …, pN are independent and identically distributed random variables with distribution function g(p; θ) with unknown parameter vector θ. Here, g(p; θ) is a probability density function that has support [0, 1] when p is continuous. We use the following capture probability models for heterogeneity: (a) beta distribution: g(p; θ) = g(p; α, β) where α and β are parameters and (b) logit-normal distribution: g(p; θ) = g(p; μ, σ) with parameters μ and σ.
Missing from capture-recapture data is f0, the number of individuals who are never captured during the T sampling occasions. Since fx is the number of individuals captured x times, we only know fx for x = 1, 2, …, T, and f0 is unknown. With this notation, the number of individuals captured at least once can be expressed as $n = \sum_{x=1}^{T} f_x$. If the probability density function g(p; θ) of p is known, then the probability that a randomly selected individual will be captured x times can be expressed as $\pi_g(x) = \int_0^1 \binom{T}{x} p^x (1-p)^{T-x} g(p; \theta)\,dp$ for x = 0, 1, …, T.
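As an illustration of the quantity πg(x), the integral of the binomial probability against g(p; θ), the following Python sketch evaluates it for a beta model in two ways: through the beta-binomial closed form and through a simple midpoint rule. It is not the authors' R code; the function names, the midpoint rule and the parameter values (T = 10, beta(1, 10)) are illustrative assumptions.

```python
from math import comb, exp, lgamma, log

def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def pi_closed_form(x, T, a, b):
    """Beta-binomial probability: closed form of pi_g(x) when g is beta(a, b)."""
    return comb(T, x) * exp(log_beta(x + a, T - x + b) - log_beta(a, b))

def pi_numerical(x, T, a, b, steps=2000):
    """Midpoint-rule approximation of the same integral over p in (0, 1)."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        p = (k + 0.5) * h
        # beta(a, b) density evaluated on the log scale for stability
        density = exp((a - 1) * log(p) + (b - 1) * log(1 - p) - log_beta(a, b))
        total += comb(T, x) * p**x * (1 - p) ** (T - x) * density
    return total * h

# Assumed illustrative setting: T = 10 occasions, beta(1, 10) capture probabilities.
T, a, b = 10, 1.0, 10.0
probs = [pi_closed_form(x, T, a, b) for x in range(T + 1)]
approx = [pi_numerical(x, T, a, b) for x in range(T + 1)]
```

For models without a closed form, such as the logit-normal, only the numerical route is available; the midpoint rule here is the simplest choice and would need refinement when the density is unbounded near 0 or 1.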
It is clear that f0 = N − n and that f = (f0, f1, f2, …, fT)′ follows a multinomial distribution with (T + 1) cells and corresponding cell probabilities πg = (πg(0), πg(1), …, πg(T))′, where πg(x) is the probability that an individual is sighted x times. More precisely, with a total population size of N and T sampling occasions, the distribution of the random vector of frequencies f is [f] ∼ $\text{Multinomial}_{T+1}(N, \pi_g)$. However, the multinomial distribution of f cannot be determined since f0 is unknown. Instead, we consider the distribution of the observable frequencies by conditioning on the total number of individuals captured, n. We denote the observable frequencies by fc = (f1, f2, …, fT)′, and write $\pi_g^c = (\pi_g^c(1), \pi_g^c(2), \ldots, \pi_g^c(T))'$, where $\pi_g^c(x)$ denotes the probability that an individual is sighted x times conditional on the individual being sighted at least once. The conditional cell probabilities are calculated from the unconditional probabilities πg as $\pi_g^c(x) = \pi_g(x) / (1 - \pi_g(0))$ for x = 1, 2, …, T. The conditional distribution of fc given n is a multinomial distribution with a total of n trials and T cell probabilities $\pi_g^c$, that is, [fc|n] ∼ $\text{Multinomial}_T(n, \pi_g^c)$. The number of individuals n that are captured at least once follows a binomial distribution with a total of N trials and a probability of success 1 − πg(0). Thus, [n] ∼ Binomial(N, 1 − πg(0)). Finally, it follows from the above that the distribution of f can be expressed as [f] = [fc|n][n].
The full likelihood of N and the unknown parameters can be written as L(θ, N; f) ∝ [f] = [fc|n][n], which can be used to estimate N and θ simultaneously. If we condition on n, then N can be removed from the full likelihood and the resulting conditional likelihood for θ can be expressed as Lc(θ; fc) ∝ [fc|n]. More specifically, the distribution of [fc|n] is multinomial with probability mass function $P(f_c \mid n) = \frac{n!}{\prod_{x=1}^{T} f_x!} \prod_{x=1}^{T} \pi_g^c(x)^{f_x}$, where $\pi_g^c(x) = \pi_g(x) / (1 - \pi_g(0))$. Here, the conditional cell probabilities are computed from πg(x) and πg(0), where πg(x) depends on the underlying capture probability distribution.
To use the conditional likelihood to estimate the population size N, we first find the maximizer $\hat\theta$ of the conditional likelihood. With this estimated value of θ, the distribution function $g(p; \hat\theta)$ is available and the probability πg(0) can then be estimated by $\hat\pi_g(0)$. It follows that the population size N can be estimated using a Horvitz-Thompson type estimator [11] as $\hat N = n / (1 - \hat\pi_g(0))$. This conditional likelihood method and the full likelihood method are asymptotically equivalent [3]. Therefore, they often give similar results when the number of individuals captured at least once (n) is large.
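The two-step conditional likelihood procedure (maximize the conditional likelihood over θ, then divide n by the estimated probability of being captured at least once) can be sketched in Python as follows. This is a minimal illustration under a beta model, not the implementation used in this paper: the grid-search maximizer, the function names and the artificial frequencies (expected counts for n = 2000 under beta(1, 10) with T = 10) are all assumptions made for the example.

```python
from math import comb, exp, lgamma, log

def log_beta(a, b):
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def pi_beta(x, T, a, b):
    """pi_g(x) under a beta(a, b) capture probability model (beta-binomial)."""
    return comb(T, x) * exp(log_beta(x + a, T - x + b) - log_beta(a, b))

def conditional_probs(T, a, b):
    """pi_g(x) / (1 - pi_g(0)) for x = 1, ..., T."""
    p0 = pi_beta(0, T, a, b)
    return [pi_beta(x, T, a, b) / (1.0 - p0) for x in range(1, T + 1)]

def conditional_loglik(f_c, a, b):
    """Log of the multinomial conditional likelihood, up to an additive constant."""
    probs = conditional_probs(len(f_c), a, b)
    return sum(f * log(q) for f, q in zip(f_c, probs) if f > 0)

def fit_conditional(f_c, alphas, betas):
    """Crude grid-search maximizer of the conditional likelihood (assumed approach)."""
    candidates = ((a, b) for a in alphas for b in betas)
    return max(candidates, key=lambda ab: conditional_loglik(f_c, ab[0], ab[1]))

def horvitz_thompson(f_c, a, b):
    """N_hat = n / (1 - pi_g(0)) evaluated at the fitted parameters."""
    return sum(f_c) / (1.0 - pi_beta(0, len(f_c), a, b))

# Artificial data: expected frequencies for n = 2000 under beta(1, 10), T = 10.
f_c = [2000 * q for q in conditional_probs(10, 1.0, 10.0)]
alphas = [k / 10 for k in range(5, 21)]   # 0.5, 0.6, ..., 2.0
betas = [k / 2 for k in range(10, 31)]    # 5.0, 5.5, ..., 15.0
a_hat, b_hat = fit_conditional(f_c, alphas, betas)
n_hat = horvitz_thompson(f_c, a_hat, b_hat)
```

Because the artificial frequencies equal their expected values, the grid search recovers the generating parameters; with real data a numerical optimizer would replace the grid.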
We note that all simulations in Section 3 and analyses of the white stork data in Section 4 were done using R software [18].
The minimum chi-square method
Minimum chi-square estimation has a long history in statistics; see, e.g., [19, 20]. We now discuss this method and extend it to handle population size estimation with multi-occasion closed-population capture-recapture data.
Suppose Y is a multinomial random variable supported on {y1, y2, …, yK} with probability mass function π(yi; θ) where the yi are fixed constants, K ≥ 2 is a known integer, π(yi; θ) > 0 and ∑π(yi; θ) = 1, and θ is a d-dimensional parameter vector. Let θt denote the true but unknown value of the parameter vector. For a given value θ, consider testing the null hypothesis H0: θt = θ with a random sample of n observations of Y. Let Oi denote the number of times yi appears in the sample. Then, Oi is the observed value and Ei(θ) = nπ(yi; θ) is the expected value under H0. The Pearson chi-square statistic for the null hypothesis is given by
$$X^2(\theta) = \sum_{i=1}^{K} \frac{(O_i - E_i(\theta))^2}{E_i(\theta)}. \tag{1}$$
Under H0, X2(θ) has an asymptotic chi-square distribution with (K − 1) degrees of freedom, so to test the null hypothesis we simply compare the observed value of the X2(θ) statistic with a chosen χ2 critical value. Those values of θ not rejected by the test form a region which [20] called the consonance region for θt. Denote by $C_{1-\alpha}$ the 100(1 − α)% consonance region for θt. Then,
$$C_{1-\alpha} = \{\theta : X^2(\theta) \le \chi^2_{K-1,\,1-\alpha}\}, \tag{2}$$
where $\chi^2_{K-1,\,1-\alpha}$ is the (1 − α)th quantile of the $\chi^2_{K-1}$ random variable. This concept was first introduced by [21] who were more interested in how consonant the data are with the probability model π(yi; θ) than in estimating the unknown parameter vector θ. We will also use the name consonance region to differentiate this region from other types of confidence regions. However, since our main interest is in parameter estimation, we will continue to call the quantity 100(1 − α)% associated with the consonance region its confidence level.
We now compare the consonance region (2) with the classical confidence region based on the asymptotic distribution of a maximum likelihood estimator for θt. The following are three key comparisons.
(i) The construction of the consonance region (2) does not require a point estimator. It is obtained by inverting the Pearson chi-square test in that it consists of θ values not rejected by the test at level-α. Apart from the Pearson chi-square test, other goodness-of-fit tests may also be used to derive consonance regions. For example, in [20], the Anderson-Darling test was also used to derive consonance regions. For discrete data such as the capture-recapture data, the chi-square test is a preferred natural choice.
(ii) The classical confidence region can be constructed for any confidence level (1 − α) ∈ (0, 1). For the consonance region, there is a lower bound on the confidence level; consonance regions with levels below this bound are empty. To see this, let $\hat\theta = \arg\min_\theta X^2(\theta)$ and let $\alpha^* = P(\chi^2_{K-1} > X^2(\hat\theta))$. Then, $X^2(\hat\theta)$ is the (1 − α*)th quantile of the $\chi^2_{K-1}$ random variable. A consonance region with a confidence level (1 − α) < (1 − α*) is empty because there are no θ values satisfying $X^2(\theta) \le \chi^2_{K-1,\,1-\alpha}$, as the smallest X2(θ) value is $X^2(\hat\theta)$, which is greater than $\chi^2_{K-1,\,1-\alpha}$ because α > α*.
This lower bound, as [20] pointed out, is not a cause for alarm and should be viewed as useful information. For example, suppose (1 − α*) = 0.91 so that a 90% consonance region is unavailable. This tells us that no model can pass the goodness-of-fit test at the 10% level and this information is worth knowing. We note that a consonance region with a confidence level that happens to be below the bound derived from a particular sample is still valid; if the assumptions are all correct it may not be empty for the next sample and it will capture the true value 100(1 − α)% of the time when it is constructed repeatedly using independent random samples. When it is empty, it informs us that it is one of the 100α% of times where the consonance region does not capture the true value. This is an advantage as none of the classical confidence regions that do not contain the true value would identify themselves as such.
(iii) Classical confidence regions based on the asymptotic distribution of a point estimator are nested in that if (1 − α1) < (1 − α2), then a region with confidence level (1 − α1) is fully contained in one with confidence level (1 − α2). The point estimate is the only point in the intersection of all confidence regions with confidence level (1 − α) > 0. Consonance regions are also nested.
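The construction of a consonance region by inverting the Pearson chi-square test, and the lower bound on its confidence level, can be sketched in Python as follows. The example is purely illustrative: the three-cell model with probabilities ((1 − θ)², 2θ(1 − θ), θ²), the observed counts (30, 50, 20) and the grid over θ are assumptions chosen so that K − 1 = 2, in which case the chi-square survival function is exp(−x/2) and no statistical library is needed.

```python
from math import exp, log

def pearson_x2(observed, theta):
    """Pearson chi-square statistic for a hypothetical three-cell model."""
    n = sum(observed)
    probs = [(1 - theta) ** 2, 2 * theta * (1 - theta), theta ** 2]
    return sum((o - n * p) ** 2 / (n * p) for o, p in zip(observed, probs))

def chi2_quantile_df2(level):
    """Quantile of the chi-square distribution with 2 degrees of freedom."""
    return -2.0 * log(1.0 - level)

def consonance_region(observed, level, grid):
    """Grid values of theta not rejected by the Pearson test at the given level."""
    crit = chi2_quantile_df2(level)
    return [t for t in grid if pearson_x2(observed, t) <= crit]

observed = [30, 50, 20]                   # hypothetical counts, n = 100
grid = [k / 100 for k in range(1, 100)]   # theta = 0.01, ..., 0.99

region_95 = consonance_region(observed, 0.95, grid)

# Smallest attainable statistic on the grid and the implied bound 1 - alpha*.
theta_min = min(grid, key=lambda t: pearson_x2(observed, t))
lower_bound = 1.0 - exp(-pearson_x2(observed, theta_min) / 2.0)

# Any confidence level below the bound yields an empty region.
region_below_bound = consonance_region(observed, lower_bound / 2, grid)
```

Here the minimizer on the grid is also the only point that survives in every non-empty region, which is the nesting property described in (iii).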
To extend the method of [20] to estimate Nt, the true value of N, we assume that g(p; θ) is correctly specified although the true value of its parameter θt is unknown. Let π(i; θ) be the marginal probability that a randomly selected individual from the population will be sighted exactly i times. Let Gi = {j: Xj = i} for i = 0, 1, …, T. Then, (a) Gi represents the group of individuals that have been observed on exactly i of the T occasions and (b) the Gi form a partition of the population. Let Oi denote the number of individuals in Gi. Then, (b) implies O0 + O1 + … + OT = Nt. In a capture-recapture experiment, we do not know the number of individuals in G0 and thus O0 is not available. But if Nt is given, we can compute O0 by using the above equation: $O_0 = N_t - (O_1 + O_2 + \cdots + O_T)$.
Now consider testing the hypothesis H0: Nt = N versus H1: Nt ≠ N using the observed O1, O2, …, OT. For the time being, assume all Oi ≥ 5 and T > d, where d is the number of parameters for the underlying capture probability distribution g(p; θ). Possible violations of these assumptions will be addressed in the simulation study section. Define
$$X^2(N, \theta) = \sum_{i=0}^{T} \frac{(O_i - E_i(\theta))^2}{E_i(\theta)},$$
where Ei(θ) = Nπ(i; θ) and $O_0 = N - \sum_{i=1}^{T} O_i$. For a given N, let $\hat\theta_N = \arg\min_\theta X^2(N, \theta)$. We define the partial minimum chi-square statistic,
$$X^2(N, \hat\theta_N) = \min_\theta X^2(N, \theta), \tag{3}$$
which has an asymptotic chi-square distribution with (T + 1) − 1 − d = T − d degrees of freedom under H0. That the chi-square statistic (3) defined by $\hat\theta_N$ has degrees of freedom T − d, instead of (T + 1) − 1, was first shown in [22]. See [23] for an alternative estimator to $\hat\theta_N$ that also leads to T − d degrees of freedom for the X2 statistic. It follows from [20] that a 100(1 − α)% consonance set for Nt is
$$C^{N}_{1-\alpha} = \{N : X^2(N, \hat\theta_N) \le \chi^2_{T-d,\,1-\alpha}\}, \tag{4}$$
where $\chi^2_{T-d,\,1-\alpha}$ is the (1 − α)th quantile of the $\chi^2_{T-d}$ random variable. For point estimation of Nt, let
$$\hat N = \arg\min_N X^2(N, \hat\theta_N).$$
Then $\hat N \in C^{N}_{1-\alpha}$ so long as $C^{N}_{1-\alpha}$ is not empty, regardless of the confidence level (1 − α). To see this, note that $X^2(\hat N, \hat\theta_{\hat N})$ is the smallest value of X2(N, θ). If a $C^{N}_{1-\alpha}$ is not empty, then there exists one pair $(N', \hat\theta_{N'})$ such that $X^2(N', \hat\theta_{N'}) \le \chi^2_{T-d,\,1-\alpha}$. Since $X^2(\hat N, \hat\theta_{\hat N}) \le X^2(N', \hat\theta_{N'})$, we have $X^2(\hat N, \hat\theta_{\hat N}) \le \chi^2_{T-d,\,1-\alpha}$ and thus $\hat N \in C^{N}_{1-\alpha}$. This also implies that $\hat N$ is the only point in the intersection of all non-empty consonance sets for Nt. By point (iii) in the comparison of consonance regions and classical confidence regions in the previous section, we see that $\hat N$ corresponds to the point estimator for Nt. Hence, we use $\hat N$ as a point estimator for Nt and call it the minimum chi-square estimator of the population size. The theoretical properties of $\hat N$ are difficult to obtain and its variance is presently not available. Fortunately, the associated consonance set is available, which reduces the need for its standard error.
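The minimum chi-square estimator of the population size and the associated consonance set can be sketched in Python as a nested grid search: for each candidate N, minimize the chi-square statistic over θ, then minimize over N. This is a hedged illustration rather than the authors' procedure: it assumes a beta model (d = 2), pools the counts into the five cells 0, 1, 2, 3 and 4 or more sightings so that the reference distribution has 5 − 1 − 2 = 2 degrees of freedom, and uses artificial observed counts equal to their expected values for N = 4000 and beta(1, 10).

```python
from math import comb, exp, lgamma, log

def log_beta(a, b):
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def pi_beta(x, T, a, b):
    """Probability of exactly x sightings in T occasions under beta(a, b)."""
    return comb(T, x) * exp(log_beta(x + a, T - x + b) - log_beta(a, b))

def pooled_probs(T, a, b, n_cells):
    """Probabilities of 0, 1, ..., n_cells - 2 sightings plus a pooled right tail."""
    probs = [pi_beta(x, T, a, b) for x in range(n_cells - 1)]
    probs.append(1.0 - sum(probs))
    return probs

def chi_square(N, a, b, observed, T):
    """X^2(N, theta); observed holds the counts for cells 1, 2, ..., pooled tail."""
    counts = [N - sum(observed)] + list(observed)   # O_0 = N - sum of observed
    probs = pooled_probs(T, a, b, len(counts))
    return sum((o - N * p) ** 2 / (N * p) for o, p in zip(counts, probs))

def partial_minimum(N, observed, T, alphas, betas):
    """Smallest X^2 over the theta grid for a fixed N, with its minimizer."""
    return min(((chi_square(N, a, b, observed, T), a, b)
                for a in alphas for b in betas), key=lambda r: r[0])

# Artificial data: expected counts for N = 4000, T = 10, beta(1, 10), 5 cells.
T = 10
observed = [4000 * p for p in pooled_probs(T, 1.0, 10.0, 5)[1:]]
alphas = [k / 10 for k in range(5, 21)]
betas = [k / 2 for k in range(10, 31)]
candidates = range(3000, 5001, 100)

profile = {N: partial_minimum(N, observed, T, alphas, betas) for N in candidates}
n_hat = min(profile, key=lambda N: profile[N][0])
a_hat, b_hat = profile[n_hat][1], profile[n_hat][2]

# 95% consonance set: chi-square with 2 df has 0.95 quantile -2 * log(0.05).
crit = -2.0 * log(0.05)
consonance_set = [N for N in candidates if profile[N][0] <= crit]
```

With data equal to their expectations the statistic is essentially zero at the generating values, so the consonance set mainly displays how flat the profile is in N.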
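To construct a consonance region for the unknown parameter vector θt, consider testing the hypothesis H0: θt = θ versus H1: θt ≠ θ using the chi-square statistic
$$X^2(N_t, \theta) = \sum_{i=0}^{T} \frac{(O_i - E_i(\theta))^2}{E_i(\theta)},$$
where Ei(θ) = Ntπ(i; θ), Nt is the true population size and $O_0 = N_t - \sum_{i=1}^{T} O_i$. Under H0, X2(Nt, θ) has an asymptotic chi-square distribution with T degrees of freedom. It follows from [20] that a 100(1 − α)% consonance region for θt is
$$C^{\theta}_{1-\alpha} = \{\theta : X^2(N_t, \theta) \le \chi^2_{T,\,1-\alpha}\}, \tag{5}$$
where $\chi^2_{T,\,1-\alpha}$ is the (1 − α)th quantile of the $\chi^2_T$ random variable. Since Nt is unknown, we replace it with $\hat N_\theta = \arg\min_N X^2(N, \theta)$ and replace (5) with
$$\tilde C^{\theta}_{1-\alpha} = \{\theta : X^2(\hat N_\theta, \theta) \le \chi^2_{T,\,1-\alpha}\}. \tag{6}$$
The region $\tilde C^{\theta}_{1-\alpha}$ in (6) contains the region $C^{\theta}_{1-\alpha}$ in (5) because $X^2(\hat N_\theta, \theta) \le X^2(N_t, \theta)$. Hence, $\tilde C^{\theta}_{1-\alpha}$ is a conservative 100(1 − α)% consonance region for θt as its coverage level is more than 100(1 − α)%. Simulation results show that its coverage level is close to 100(1 − α)% in many applications. Following the same argument used to show that the minimum chi-square estimator $\hat N$ is in all non-empty consonance sets for Nt, we can show that $\hat\theta$ in the overall minimizing pair $(\hat N, \hat\theta)$ is in all non-empty $\tilde C^{\theta}_{1-\alpha}$. This implies $\hat\theta$ is the “center” of each $\tilde C^{\theta}_{1-\alpha}$ and justifies its use as a point estimator for θt. We call $\hat\theta$ the minimum chi-square estimator for θt.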
Finally, we use the chi-square statistic as a measure of goodness-of-fit for the heterogeneity model g(p; θ). It measures how consistent the particular combination of and is with the observed data O1, O2, …, OT. Since represents the most favourable N value under which to evaluate the goodness-of-fit of g(p; θ) (in the sense that is the smallest possible value), the use of this statistic as a goodness-of-fit measure for g(p; θ) is favourable to the model. To test the null hypothesis that the true heterogeneity model is with , a reasonable calibration of its limiting distribution is since under the null hypothesis and estimates Nt. If we use this limiting distribution and thus the p-value p = P for testing, then the type-I error may be slightly lower than the significance level α. This is because , and thus the p-value given above is larger than it would be if Nt had been known which leads to fewer rejections under H0.
Bootstrap techniques
The empirical bootstrap is a commonly used statistical resampling technique, developed by [24], to estimate the variation of point estimates without making strong distributional assumptions. The key idea of bootstrapping is that inference about a population from sample data can be modelled by resampling the sample data and performing inference about the sample from the resampled data. It is now widely recognized as an adequate variance estimation method for capture-recapture studies [25]. In this section, we discuss two different bootstrap methods: one resamples from the estimated distribution, which we refer to as the parametric bootstrap, and the other resamples individuals from the whole estimated population, which we refer to as the bootstrap of individuals. The bootstrap of individuals is similar to one of the methods discussed in [25]; however, they were concerned with constant capture probability. These two methods are analogous to the third method and the first method, respectively, in [26]. Rather than providing an estimate of variance, we focus on constructing a confidence interval for the population size. A parametric bootstrap approach was adopted by [27] based on maximum likelihood estimation to construct an interval estimate of the population size. We study the performance of the two bootstrap methods for constructing confidence intervals for the population size N under minimum chi-square estimation through a simulation study. A comparison of confidence intervals based on the proposed bootstrap techniques for the white stork population size is presented in the application section. The accuracy of a consonance region such as (4) depends on the accuracy of the χ2 approximation to the finite sample distribution of the partial minimum chi-square statistic $X^2(N, \hat\theta_N)$, which may be poor when the sample size is small. The bootstrap methodology provides an alternative means of interval estimation in such situations.
Parametric bootstrap
To obtain a parametric bootstrapped confidence interval with multi-occasion capture-recapture data, we first make an assumption about the underlying capture probability model, say we assume it is a beta distribution, without specifying its parameter vector θ. We then estimate the population size N and θ using the data through either the minimum chi-square method or likelihood-based methods. Let $(\hat N, \hat\theta)$ be the estimated parameter values. We now have an (estimated) population of size $\hat N$ with a capture probability distribution fully specified by $\hat\theta$. This is our resampling distribution from which we generate a large number of bootstrap samples, say m samples. Here, each bootstrap sample is a multi-occasion capture-recapture data set with the same number of occasions T as the original data. Each sample is obtained by first generating a random sample of size $\hat N$ (rounded to an integer) from the capture probability distribution, say $p_1, p_2, \ldots, p_{\hat N}$, and then generating binomial random numbers Bi ∼ Binomial(T, pi) for $i = 1, 2, \ldots, \hat N$. The pi represents the capture probability of the ith individual in the population and Bi represents the number of times it is observed during the T occasions. From these m samples we obtain m bootstrap estimates of N, say $\hat N^*_1, \hat N^*_2, \ldots, \hat N^*_m$. Finally, we obtain a 95% parametric bootstrapped confidence interval for the population size N which is defined by the 2.5th and 97.5th percentiles of these estimates.
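The parametric bootstrap steps described above can be sketched in Python as follows. The sketch assumes a beta capture probability model and accepts any estimator as a function of the observed frequencies. The fitted values (population size 400, beta(1, 10), T = 10) are hypothetical, and the stand-in estimator holds θ fixed purely to keep the example short; a faithful run would re-estimate both N and θ for each bootstrap sample with the minimum chi-square or a likelihood method.

```python
import random
import statistics
from math import exp, lgamma

def prob_never_captured(a, b, T):
    """pi_g(0) under a beta(a, b) model: B(a, T + b) / B(a, b)."""
    def log_beta(u, v):
        return lgamma(u) + lgamma(v) - lgamma(u + v)
    return exp(log_beta(a, T + b) - log_beta(a, b))

def simulate_frequencies(n_pop, a, b, T, rng):
    """One bootstrap data set: (f_1, ..., f_T) for a population of n_pop individuals."""
    freq = [0] * (T + 1)
    for _ in range(n_pop):
        p = rng.betavariate(a, b)                           # individual capture probability
        captures = sum(rng.random() < p for _ in range(T))  # Binomial(T, p) draw
        freq[captures] += 1
    return freq[1:]   # individuals never captured are not observed

def percentile_interval(estimates):
    """2.5th and 97.5th percentiles of the bootstrap estimates."""
    cuts = statistics.quantiles(estimates, n=40)
    return cuts[0], cuts[-1]

def parametric_bootstrap_ci(estimator, n_hat, a_hat, b_hat, T, m, seed=0):
    """Generate m samples from the fitted model and re-estimate N on each."""
    rng = random.Random(seed)
    estimates = [estimator(simulate_frequencies(n_hat, a_hat, b_hat, T, rng))
                 for _ in range(m)]
    return percentile_interval(estimates)

# Hypothetical fitted values, kept small so the example runs quickly.
n_hat, a_hat, b_hat, T = 400, 1.0, 10.0, 10

def stand_in_estimator(f_c):
    """Simplified estimator with theta held fixed (an assumption of this sketch)."""
    return sum(f_c) / (1.0 - prob_never_captured(a_hat, b_hat, T))

lower, upper = parametric_bootstrap_ci(stand_in_estimator, n_hat, a_hat, b_hat, T, m=200)
```

Because the stand-in ignores the uncertainty in θ, the resulting interval is narrower than one from a full re-fit; it only demonstrates the resampling mechanics.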
Bootstrap of individuals
An alternative to the above parametric bootstrap is to bootstrap the individuals (see [7]) as follows. With a set of multi-occasion capture-recapture data fc = (f1, f2, …, fT)′, we again first estimate the population size under an assumption about the capture probability distribution. Let $\hat N$ be the estimated integer value. We now have an (estimated) population of size $\hat N$ consisting of individuals in the data set and those that were never observed. Let n denote the total number of individuals that have been observed at least once. Then, the number of individuals never captured may be estimated by $\hat f_0 = \hat N - n$. The estimated population can now be characterized by $(\hat f_0, f_1, f_2, \ldots, f_T)$, that is, the population contains $\hat f_0$ 0’s, f1 1’s, and so on. Next, we take a random sample of size $\hat N$ from this population with replacement. Since each individual is replaced before the next individual is drawn, some individuals may appear more than once, and others may never appear at all. Each sample is a bootstrap replicate of the original capture-recapture data. We repeat this process m times, obtaining m sets of such data with which we compute m bootstrap estimates of N, say $\hat N^*_1, \hat N^*_2, \ldots, \hat N^*_m$. We then use the 2.5th and 97.5th percentiles of these to obtain the 95% individual bootstrapped confidence interval for N. Compared with the parametric bootstrap, this bootstrap of individuals is less dependent on the capture probability model assumption as it does not use this assumption in the resampling step of the operation.
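The resampling step of the bootstrap of individuals can be sketched in Python as follows. Drawing individuals with replacement from a population containing a given number of 0's, 1's and so on is equivalent to drawing capture counts with probabilities proportional to those frequencies, which is what the sketch does. The frequency vector is the aggregated white stork vector reported in the application section; the population size of 4000 is only the census figure quoted in the Introduction, used here as a placeholder for an estimate, and the function name is an assumption.

```python
import random

def resample_individuals(f_c, n_hat, rng):
    """One bootstrap replicate of (f_1, ..., f_T) from the estimated population."""
    f0_hat = n_hat - sum(f_c)          # estimated number never captured
    counts = [f0_hat] + list(f_c)      # individuals seen 0, 1, ..., T times
    values = list(range(len(counts)))
    # Sampling n_hat individuals with replacement from the estimated population.
    draws = rng.choices(values, weights=counts, k=n_hat)
    freq = [0] * len(counts)
    for x in draws:
        freq[x] += 1
    return freq[1:]                    # zeros are unobserved in the replicate

f_c = [1021, 420, 166, 50, 20, 6, 1, 0, 0, 0]
n_hat = 4000                           # placeholder value, not an estimate
rng = random.Random(1)
replicates = [resample_individuals(f_c, n_hat, rng) for _ in range(5)]
```

Each replicate would then be passed to the chosen estimator, and the 2.5th and 97.5th percentiles of the resulting estimates give the interval.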
Numerical integration was used to evaluate the capture frequency probabilities πg(x). Further, to obtain consonance regions for the model parameters θ and N we performed a grid search. More details are provided in the S1 File and examples of code are provided on GitHub at https://github.com/ILR819/white_stork.
Simulation studies
Accuracy of the chi-square approximation
In real applications, the accuracy of the chi-square approximation to the finite sample distributions of chi-square statistics in (1) and (3) depends on the size of n and N, respectively. For (1), it is well known that when n is sufficiently large so that the observed cell counts Oi are all more than 5, the approximation is good. For (3), the accuracy of the approximation depends on not only N but also the number of parameters estimated d and the number of categories or cells T + 1 in the chi-square statistic, so it is important to examine the accuracy empirically for the combination of (N, d, T) that we are interested in before we use the asymptotic chi-square distribution to construct consonance intervals for the unknown N.
We now present a simulation study on the accuracy of the chi-square approximation for (3) for the case where N = 4000 and T = 10. These N and T values were chosen because of the prior information about N concerning the white stork population and the data available. For the capture probability distribution, we chose beta(1, 10) as simulation results show that it produces an observed frequency vector fc similar to that of the white stork data. With the chosen N, T and beta(1, 10), we first generate multi-occasion capture-recapture data as described in the parametric bootstrap, and then pretend the parameter values of the beta distribution are unknown and estimate them using beta(α, β) as the unknown capture probability distribution with the minimum chi-square estimator $\hat\theta_N$. The resulting $X^2(N, \hat\theta_N)$ value is a random observation of the partial minimum chi-square statistic (3). We repeated this process 200 times, obtaining 200 sets of simulated multi-occasion capture-recapture data and 200 random observations of the statistic. These 200 observations form a random sample of size 200 from the null distribution of the partial minimum chi-square statistic $X^2(N, \hat\theta_N)$. We then used this sample of size 200 to determine the accuracy of the chi-square approximation through QQ-plots, which plot the sample quantiles against the corresponding quantiles of the asymptotic chi-square distribution. Recall that the degrees of freedom of the partial minimum chi-square statistic in (3) equal the total number of cells C = T + 1 minus (d + 1), where d is the dimension of θ. Here, in order to have positive degrees of freedom and to ensure cell counts Oi are not too small, we pooled the data (aggregating counts for some sampling occasions) into a total of C = 4, 5, 6, 7 cells; see the white stork data analysis in the next section for examples of such aggregation. With d = 2 for the beta model, the degrees of freedom of the asymptotic chi-square distribution are df = 1, 2, 3, 4, respectively.
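The pooling of counts into C cells and the resulting degrees of freedom can be sketched in Python as follows. The sketch pools the right tail of the observed frequencies so that, together with the unobserved zero cell, there are C cells in total. The cell boundaries are an assumption made for illustration and need not match the aggregation used in the analysis; the input vector is the aggregated white stork frequency vector reported in the application section.

```python
def pool_right_tail(f_c, n_cells_total):
    """Pool (f_1, ..., f_T) into n_cells_total - 1 observed cells.

    The zero cell is unobserved, so the first n_cells_total - 2 frequencies
    are kept and the remaining right tail is summed into one cell.
    """
    keep = n_cells_total - 2
    return list(f_c[:keep]) + [sum(f_c[keep:])]

def degrees_of_freedom(n_cells_total, d):
    """C - (d + 1): one constraint for the total and d estimated parameters."""
    return n_cells_total - (d + 1)

f_c = [1021, 420, 166, 50, 20, 6, 1, 0, 0, 0]
d = 2   # beta model

# Pooled observed cells and degrees of freedom for C = 4, 5, 6, 7.
pooled = {C: pool_right_tail(f_c, C) for C in (4, 5, 6, 7)}
dfs = {C: degrees_of_freedom(C, d) for C in (4, 5, 6, 7)}
```

Pooling leaves the number of observed individuals unchanged, so only the cell structure of the chi-square statistic is affected.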
Fig 1 shows the QQ-plots of the sample of 200 for the four cases. We see from the plots that for the present combination of N, T and capture probability distribution, the asymptotic chi-square approximation to the finite sample distribution of the partial minimum chi-square statistic is not accurate. In particular, for C = 4, the statistic has very small values, indicating the chi-square approximation is very poor. With more cells the approximation becomes better but is still not accurate enough for constructing consonance intervals for N. Nevertheless, we note that the simulated quantiles of the partial minimum chi-square statistic tend to be smaller than the corresponding chi-square quantiles when the underlying capture probability distribution is correctly specified, regardless of the total number of cells C. Thus this statistic is still useful for testing whether or not the capture probability distribution used is correct; if an observed value of the statistic exceeds, say, the 0.95 quantile of the asymptotic chi-square distribution, then the capture probability model should be rejected at the 5% level based on this empirical finding. This test is conservative in that its type-I error is less than 5% but it may not be very powerful against misspecified capture probability models. Nevertheless, in the absence of other tests for the capture probability model, we will apply this test when analyzing the white stork data.
Fig 1. QQ-plots of a sample of n = 200 partial minimum chi-square statistic values versus quantiles of the asymptotic chi-square distribution for C = 4, C = 5, C = 6, and C = 7, respectively, where N = 4000, T = 10, and the capture probability distribution is beta.
The red line in each plot is the y = x line.
Finally, the poor accuracy of the chi-square approximation for the present combination of N, T and capture probability model does not invalidate the minimum chi-square method as a method of point estimation. It simply indicates the asymptotic chi-square distribution cannot be used to calibrate the consonance interval for this particular combination. Further, there are many situations where the chi-square approximation has been found to be accurate when the population size N is large. See the S1 File for examples where the chi-square approximation is accurate.
Bootstrap techniques for interval estimation
Since the chi-square approximation cannot be used to compute confidence intervals for the above combination of N, T and capture probability model, we use the bootstrap instead. To see that the bootstrap is effective, we first generated 100 sets of multi-occasion capture-recapture data and used these to obtain 100 pairs of estimated parameter values $(\hat N_1, \hat\theta_1), (\hat N_2, \hat\theta_2), \ldots, (\hat N_{100}, \hat\theta_{100})$. Each pair of estimated parameters defines an estimated population from which we resample, either through the parametric bootstrap or the bootstrap of individuals, to generate 1000 sets of capture history data with which we produce 1000 estimates of the population size $\hat N^*_{ij}$ for j = 1, 2, …, 1000. We then constructed a 95% confidence interval using percentiles of $\{\hat N^*_{ij}\}$ as described in Section 2.3, resulting in 100 different bootstrapped intervals $(L_1, U_1), (L_2, U_2), \ldots, (L_{100}, U_{100})$. To see that these are reasonable bootstrap intervals, we plotted the point estimate $\hat N_i$ against the “centre-point” of the interval, represented by the median of $\{\hat N^*_{ij}\}$, which we denote by $M_i$. Fig 2 shows such plots of $(\hat N_i, M_i)$ when the number of cells equals 4, 5, 6, or 7, respectively. The line in each plot is the y = x line. We see that regardless of the number of cells, the pairs $(\hat N_i, M_i)$ are located around the y = x line, indicating that the point estimate is in the centre of the bootstrap confidence interval. Also, among the 100 confidence intervals for each C value, the percentage of the intervals containing the true population size N = 4000 is around 95% when the capture probability distribution is correctly specified, very close to the confidence level of 95%. We did the above simulation and plots for other combinations of N, T and capture probability distribution and obtained similar results.
Fig 2. The geometry of the bootstrap confidence intervals: Plots of the 100 point estimates versus the “centre-point” of the confidence interval for C = 4, C = 5, C = 6, and C = 7 where N = 4000, T = 10 and the capture probability distribution is beta.
The red line is the y = x line.
Fig 3 shows the histograms of randomly selected sets of bootstrap estimates $\{\hat N^*_{ij}\}$ for the 4-cell, 5-cell, 6-cell, and 7-cell cases, respectively. It is clear that the shape of the histogram for each case is non-normal. The test statistic and the corresponding p-value of the Shapiro-Wilk normality test underneath each histogram further support this observation. This is consistent with the skewness of the distribution of population size estimators for all proposed methods in the capture-recapture literature; see [28]. Because of this, normal-based confidence intervals cannot be used. When the asymptotic chi-square approximation is accurate, we can use the consonance set (4) as our confidence interval. When it is not, such as in our present case, we need to use the above bootstrap-based confidence intervals instead.
Fig 3. Histograms of samples of size 1000 bootstrap minimum chi-square estimates for the population size for C = 4, C = 5, C = 6, and C = 7, respectively, where N = 4000, T = 10 and the capture probability model is beta.
Below the plots are the test statistics and the corresponding p-values of Shapiro-Wilk test for the null hypothesis that the bootstrap estimates are normally distributed.
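A normality check of this kind can be run with `scipy.stats.shapiro`. The sketch below uses a synthetic right-skewed sample in place of real bootstrap estimates (the lognormal parameters are arbitrary assumptions chosen only to mimic skewness), so it illustrates the mechanics of the test rather than the results in Fig 3.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Hypothetical right-skewed "bootstrap estimates"; real ones would come from the bootstrap.
boot_estimates = rng.lognormal(mean=np.log(4000.0), sigma=0.3, size=1000)

# Shapiro-Wilk test of the null hypothesis that the sample is normally distributed.
w_stat, p_value = stats.shapiro(boot_estimates)
sample_skewness = stats.skew(boot_estimates)
print(f"W = {w_stat:.3f}, p-value = {p_value:.3g}, skewness = {sample_skewness:.2f}")
```

A small p-value together with positive sample skewness is the pattern that argues against normal-based intervals.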
Bias in population size estimation
Through a thorough simulation study that varied the population size N, the expected capture probability E(p), and the generating capture probability distribution, we investigated the robustness of the population size estimator $\hat{N}$ in terms of bias and root mean-square error (see Section 2.3 of the S1 File). In most cases, the minimum chi-square method outperforms the likelihood methods. In particular, when the population size N is large and the expected capture probability E(p) is high, the minimum chi-square method outperforms the likelihood methods regardless of the number of cells used.
Application to the white stork data
Sanz-Aguilar et al. [12] conducted a study on the foraging strategies of white storks (Ciconia ciconia) as a closed population in southwestern Spain, investigating individual foraging specialization. The data were collected by two observers within the white stork’s main wintering area in southwestern Spain. At each sampling occasion, they recorded a “1” if a marked stork was observed at rice fields, a “2” if it was observed at dumps, and a “0” if it was not detected on that occasion. In total, 1684 different individuals were banded on 80 sampling occasions during the study period lasting 80 days. The authors of [12] found that the apparent survival rate was close to 1 during the winter study and concluded that the closure assumption was valid. Because of the large study area and the large number of sampling occasions, the data are sparse. We therefore aggregated the daily white stork capture history data to reduce the number of sampling occasions from T = 80 to T = 10. This was done by pooling, for example, the first 8 days’ observations together to form the observation for the first (aggregated) sampling occasion; a white stork is recorded as “captured” on this sampling occasion if it was captured at least once in the first 8 days, and as “not captured” otherwise. After the aggregation, the observed frequency vector is fc = {1021, 420, 166, 50, 20, 6, 1, 0, 0, 0}, where the x-th entry is the number of storks captured on exactly x of the 10 aggregated occasions. To avoid zero-count cells and small cell counts, we further aggregated the counts in the right tail of fc and considered the following four cases when implementing the minimum chi-square method:
4-cells case (C = 4): {1021, 420, 243}
5-cells case (C = 5): {1021, 420, 166, 77}
6-cells case (C = 6): {1021, 420, 166, 50, 27}
7-cells case (C = 7): {1021, 420, 166, 50, 20, 7}
In each case, the number of cells C includes one cell for the unobserved individuals, whose count is unknown. The 4-cells case, for example, therefore lists three observed counts.
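The right-tail pooling that produces these four cases is simple to reproduce. The helper below is a minimal sketch (the name `pool_right_tail` is ours, not from the study); it keeps the leading counts and sums the remainder into a single cell.

```python
def pool_right_tail(freq, n_observed_cells):
    """Keep the first n_observed_cells - 1 counts and pool the rest into one cell."""
    if not 1 <= n_observed_cells <= len(freq):
        raise ValueError("n_observed_cells must be between 1 and len(freq)")
    head = list(freq[: n_observed_cells - 1])
    return head + [sum(freq[n_observed_cells - 1:])]


fc = [1021, 420, 166, 50, 20, 6, 1, 0, 0, 0]  # aggregated stork frequencies, T = 10

# C counts one extra cell for unobserved individuals, so C cells means C - 1 observed cells.
cases = {C: pool_right_tail(fc, C - 1) for C in (4, 5, 6, 7)}
for C, cells in cases.items():
    print(f"C = {C}: {cells}")
```

Every pooled vector still sums to the 1684 observed individuals, which is a convenient sanity check after any re-aggregation.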
An assumption often made in the modelling of capture-recapture data is that of homogeneity in the capture probability. The homogeneity model postulates that the capture probability p does not vary from one individual to another or over sampling occasions. Under this model, g(p; θ) is a point mass of 1 at p, and πg(x) is the binomial probability $\pi_g(x) = \binom{T}{x} p^{x}(1-p)^{T-x}$, x = 0, 1, …, T. Under this assumption, we computed the estimated N by using the minimum chi-square method, the full likelihood method, and the conditional likelihood method (Table 1). The full likelihood method gives an estimated population size of 2434 white storks, with the capture probability p estimated to be 0.11. The conditional likelihood method gives an estimate of 2433, also with an estimated capture probability of 0.11. For the minimum chi-square method, the estimate of the population size gradually decreases from 2500 to 2344 as the number of cells increases; these values are largely in agreement with those given by the full and conditional likelihood methods. Moreover, the minimum chi-square statistic, as a measure of goodness-of-fit, is large relative to the corresponding chi-square distribution in all four cases. Consequently, the p-values are highly significant, which suggests that heterogeneity in capture probability may exist and that the point estimate obtained under the homogeneity assumption is not reliable. Note that this information is not available if we use the likelihood methods alone.
Table 1. Maximum full likelihood, maximum conditional likelihood and minimum chi-square estimates under the homogeneity assumption for capture probability.
C is the number of cells after pooling counts in the right tail of fc, $\hat{N}$ is the estimated value of N and $\hat{p}$ is the estimated capture probability p. The chi-square statistic $X^2$, its degrees of freedom df and p-value summarize the minimum chi-square goodness-of-fit test for the homogeneity model for capture probabilities.
| Method | Number of Cells | $\hat{N}$ | $\hat{p}$ | $X^2$ | df | p-value |
|---|---|---|---|---|---|---|
| Full likelihood | NA | 2434 | 0.11 | − | − | − |
| Conditional likelihood | NA | 2433 | 0.11 | − | − | − |
| Minimum chi-square | C = 4 | 2500 | 0.11 | 29.001 | 2 | <0.0001 |
| | C = 5 | 2439 | 0.11 | 52.727 | 3 | <0.0001 |
| | C = 6 | 2375 | 0.11 | 91.577 | 4 | <0.0001 |
| | C = 7 | 2344 | 0.12 | 111.297 | 5 | <0.0001 |
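To make the homogeneity analysis concrete, the sketch below fits the homogeneous binomial model to the aggregated stork frequencies by conditional maximum likelihood and then computes a Pearson goodness-of-fit statistic on the observed cells (1, 2, and 3 or more captures). This is an illustration under our own assumptions, not the authors' implementation: the Pearson statistic here conditions on being observed, so it is a related but different quantity from the minimum chi-square statistic in Table 1, and its degrees of freedom are counted differently.

```python
import numpy as np
from scipy import stats
from scipy.optimize import brentq

T = 10
fc = np.array([1021, 420, 166, 50, 20, 6, 1, 0, 0, 0])  # storks seen 1..T times
n = fc.sum()
mean_caps = (np.arange(1, T + 1) * fc).sum() / n

# Conditional MLE of p: match the zero-truncated binomial mean to the sample mean.
score = lambda p: T * p / (1.0 - (1.0 - p) ** T) - mean_caps
p_hat = brentq(score, 1e-8, 1.0 - 1e-8)
prob_seen = 1.0 - (1.0 - p_hat) ** T
n_hat = n / prob_seen  # observed count scaled up by the probability of being seen

# Pearson goodness-of-fit on the observed cells: 1, 2, and 3+ captures.
observed = np.array([fc[0], fc[1], fc[2:].sum()])
pi_12 = stats.binom.pmf([1, 2], T, p_hat)
cond_probs = np.append(pi_12, prob_seen - pi_12.sum()) / prob_seen
expected = n * cond_probs
x2 = ((observed - expected) ** 2 / expected).sum()
df = observed.size - 1 - 1  # cells minus one, minus one estimated parameter (p)
p_value = stats.chi2.sf(x2, df)

print(f"p_hat = {p_hat:.3f}, N_hat = {n_hat:.0f}, X2 = {x2:.1f}, df = {df}, p = {p_value:.2g}")
```

On these data the sketch should give a capture probability near 0.11 and a population size in the neighbourhood of the likelihood estimates in Table 1, with small numerical differences to be expected, and a large Pearson statistic that points to the same lack of fit.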
To take the heterogeneity in capture probabilities into consideration, we first model the heterogeneity using the beta distribution, as it can take on many different shapes. The probability density function of the beta distribution is $g(p; \theta) = g(p; \alpha, \beta) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\Gamma(\beta)}\, p^{\alpha - 1}(1 - p)^{\beta - 1}$, where α > 0 and β > 0 are two shape parameters. Table 2 shows the results under the beta model for capture probability. The minimum chi-square estimate of the population size does not change very much with the C value, indicating that it is robust against the number of cells used. The small minimum chi-square statistics and the corresponding large p-values (close to 1) indicate that the beta model is appropriate for the capture probability. The minimum chi-square confidence intervals are also robust against the number of cells used. Both the minimum chi-square point estimate of N and the 95% CI are consistent with those from the full and conditional likelihood methods. To summarize, the minimum chi-square statistic shows that heterogeneity in capture probability is present and that the beta distribution is an appropriate model for the heterogeneity. It also provides a credible estimate of the white stork population size that is similar to the likelihood estimates and agrees with the census findings of 4000 [2].
Table 2. Maximum full likelihood, maximum conditional likelihood and minimum chi-square estimates under the beta model for capture probability.
C is the number of cells after pooling counts in the right tail of fc and $\hat{N}$ is the point estimate of N. The chi-square statistic $X^2$, its degrees of freedom df and p-value summarize the minimum chi-square goodness-of-fit test for the beta model. Interval estimates and their widths using parametric bootstrap and bootstrap of individuals are also shown.
| Method | Number of Cells | $\hat{N}$ | $X^2$ | p-value | 95% CI¹ | Width¹ | 95% CI² | Width² |
|---|---|---|---|---|---|---|---|---|
| Full | NA | 4019 | − | − | [3261, 5356] | 2095 | [3259, 5279] | 2020 |
| Conditional | NA | 4000 | − | − | [3298, 5481] | 2183 | [3243, 5464] | 2221 |
| Chi-square | C = 4 | 4033 | <0.01 | 0.99 | [3107, 6186] | 3079 | [3147, 6070] | 2923 |
| | C = 5 | 3852 | 0.23 | 0.89 | [3109, 5358] | 2249 | [3171, 5219] | 2048 |
| | C = 6 | 4046 | 1.19 | 0.75 | [3289, 5587] | 2298 | [3280, 5538] | 2258 |
| | C = 7 | 4036 | 1.20 | 0.88 | [3309, 5617] | 2308 | [3290, 5682] | 2392 |
1: Parametric bootstrap;
2: Bootstrap of individuals
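Under the beta model, the probability that an individual is captured exactly x times in T occasions is the beta-binomial probability obtained by integrating the binomial probability against the beta density. The sketch below computes these cell probabilities with `scipy.stats.betabinom` and cross-checks one of them by numerical integration. The shape parameters are arbitrary illustrative values, not estimates from the stork data, and the function names are ours.

```python
import numpy as np
from scipy import integrate, stats


def beta_cell_probs(T, a, b):
    """P(captured x times), x = 0..T, when p ~ Beta(a, b)."""
    return stats.betabinom.pmf(np.arange(T + 1), T, a, b)


def beta_cell_prob_numeric(x, T, a, b):
    """Same probability, by integrating the binomial pmf against the beta density."""
    integrand = lambda p: stats.binom.pmf(x, T, p) * stats.beta.pdf(p, a, b)
    value, _ = integrate.quad(integrand, 0.0, 1.0)
    return value


T, a, b = 10, 2.0, 8.0  # hypothetical shape parameters (alpha = 2, beta = 8)
probs = beta_cell_probs(T, a, b)
check = beta_cell_prob_numeric(3, T, a, b)
print(f"P(never captured) = {probs[0]:.4f}; P(x = 3) = {probs[3]:.6f} vs numeric {check:.6f}")
```

Multiplying these probabilities by a candidate N gives the expected cell counts that any chi-square or likelihood criterion would compare with the pooled frequencies.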
We now use the logit-normal distribution to model the heterogeneity of the capture probability. Its density function is $g(p; \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}}\,\frac{1}{p(1 - p)}\exp\left\{-\frac{(\mathrm{logit}(p) - \mu)^2}{2\sigma^2}\right\}$, where logit(p) = log{p/(1 − p)}, μ is the mean and σ > 0 is the standard deviation of logit(p). We estimated the white stork population size using the minimum chi-square method and the full and conditional likelihood methods with this model (Table 3). The point estimates of the model parameters as well as the population size are quite stable in all cases. The relatively large p-values of the minimum chi-square method indicate that the logit-normal model is also acceptable for the white stork data. For the minimum chi-square method, the confidence intervals become stable as the number of cells increases for both bootstrap techniques. When the number of cells is C = 7, the two bootstrap intervals are only slightly wider than the ones given by the likelihood methods. The point estimates under the logit-normal model range from about 3350 to 3500, which is 500 to 600 individuals fewer than the estimates under the beta model or the census findings. Most of the confidence intervals do not contain 4000, but they are narrower than the confidence intervals under the beta model.
Table 3. Maximum full likelihood, maximum conditional likelihood and minimum chi-square estimates under the logit-normal model for the capture probability.
C is the number of cells after pooling the right tail of fc and $\hat{N}$ is the point estimate of N. The chi-square statistic $X^2$, its degrees of freedom df and p-value summarize the minimum chi-square goodness-of-fit test for the logit-normal model. Interval estimates and their widths using parametric bootstrap and bootstrap of individuals are also shown.
| Method | Number of Cells | $\hat{N}$ | $X^2$ | p-value | 95% CI¹ | Width¹ | 95% CI² | Width² |
|---|---|---|---|---|---|---|---|---|
| Full | NA | 3355 | − | − | [2995, 3834] | 839 | [2995, 3785] | 790 |
| Conditional | NA | 3361 | − | − | [2995, 3790] | 795 | [3011, 3777] | 766 |
| Chi-square | C = 4 | 3501 | <0.01 | 0.99 | [2987, 4259] | 1272 | [3002, 4257] | 1255 |
| | C = 5 | 3354 | 0.66 | 0.42 | [2943, 3842] | 899 | [2990, 3882] | 892 |
| | C = 6 | 3407 | 1.01 | 0.80 | [3055, 3901] | 846 | [3046, 3934] | 888 |
| | C = 7 | 3386 | 1.22 | 0.88 | [3037, 3849] | 812 | [3029, 3833] | 804 |
1: Parametric bootstrap;
2: Bootstrap of individuals
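Unlike the beta model, the logit-normal model has no closed form for the probability of being captured x times, so the integral over p must be evaluated numerically. One common option, shown here as a sketch rather than as the method used in the study, is Gauss-Hermite quadrature on the logit scale. The values of μ and σ are illustrative assumptions, and `logit_normal_cell_probs` is a hypothetical helper name.

```python
import numpy as np
from scipy.special import comb, expit


def logit_normal_cell_probs(T, mu, sigma, n_nodes=60):
    """P(captured x times), x = 0..T, when logit(p) ~ Normal(mu, sigma^2)."""
    # Gauss-Hermite nodes and weights for integrals against exp(-t^2).
    nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)
    # Change of variable z = sqrt(2) * t maps the standard normal onto the nodes.
    p = expit(mu + np.sqrt(2.0) * sigma * nodes)
    x = np.arange(T + 1)[:, None]
    binom_pmf = comb(T, x) * p ** x * (1.0 - p) ** (T - x)  # shape (T + 1, n_nodes)
    return binom_pmf @ weights / np.sqrt(np.pi)


probs = logit_normal_cell_probs(T=10, mu=-2.0, sigma=1.0)  # hypothetical mu and sigma
print(f"P(never captured) = {probs[0]:.4f}, total = {probs.sum():.6f}")
```

Because the binomial probabilities sum to one at every quadrature node, the resulting cell probabilities sum to one up to rounding, which makes a quick correctness check.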
Conclusion
We explored the minimum chi-square method for estimating the size of a closed population with multi-occasion capture-recapture data as an alternative to likelihood-based methods. The advantage of the minimum chi-square method is that it not only estimates the unknown population size but also performs a goodness-of-fit test on the capture probability model. Further, the partial minimum chi-square statistic has an asymptotic chi-square distribution, which can be easily used to construct a consonance/confidence interval when the chi-square approximation is accurate. It is important to examine the accuracy of the chi-square approximation through simulations. When the accuracy is unsatisfactory, which may happen when the population is small, we use the bootstrap methods to construct confidence intervals.
In the white stork application, the estimated population size dropped by more than 10% for all methods when we changed the capture probability model from beta to logit-normal, and yet both the beta and logit-normal models seem to fit the data well. This raises the question of which estimate and which model we should trust, and it is related to the nonidentifiability problem we noted in the introduction. While more work is needed to address this problem, [29] showed through extensive simulation that the minimum chi-square estimator is robust against misspecification of the capture probability model g(p; θ) for the case of a large population size of N = 10000 and an expected capture probability E(p) that is not too small; that is, we obtain good estimates of N even when the capture probability model is misspecified. In this case, the minimum chi-square method outperforms the existing likelihood-based estimators in terms of bias and mean square error. The asymptotic chi-square distribution is also accurate, and the associated consonance set performs well in terms of coverage accuracy. See the S1 File for some of the simulation results from [29]. Nevertheless, closed populations tend to be small (e.g. a confined population of 135 cottontail rabbits in [30]; an estimated closed population of 173 meadow voles in [2]). Thus the bootstrap approach to inference is necessary for these cases. But for population sizes at or above 10000, the consonance interval of the minimum chi-square method is a more natural interval estimate for the population size.
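As a quick arithmetic check of the "more than 10%" statement, the relative drops can be computed directly from the point estimates reported in Tables 2 and 3. The dictionary names below are ours.

```python
# Point estimates of N copied from Table 2 (beta model) and Table 3 (logit-normal model).
beta_estimates = {"Full": 4019, "Conditional": 4000,
                  "C = 4": 4033, "C = 5": 3852, "C = 6": 4046, "C = 7": 4036}
logit_normal_estimates = {"Full": 3355, "Conditional": 3361,
                          "C = 4": 3501, "C = 5": 3354, "C = 6": 3407, "C = 7": 3386}

# Relative drop (in percent) when moving from the beta to the logit-normal model.
drops = {
    method: 100.0 * (beta_estimates[method] - logit_normal_estimates[method])
    / beta_estimates[method]
    for method in beta_estimates
}
for method, drop in drops.items():
    print(f"{method}: {drop:.1f}% lower under the logit-normal model")
```

The drops fall between roughly 13% and 17%, so the statement holds for the likelihood methods and for every choice of C.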
Our beta-model-based analysis of the white stork data provided supporting evidence for the existing estimate of N = 4000 obtained by the scientists through other means. Because the logit-normal-model-based estimates differ substantially from this number, we conclude that the capture probability of the white storks likely follows a beta distribution.
Finally, we note that the model we considered assumes only individual heterogeneity among capture probabilities. We did not consider the additional complexity of time-dependent individual heterogeneity in capture probability which would require restructuring the model. We leave this as an interesting future research direction.
Supporting information
S1 File. Additional simulation studies of the accuracy and robustness of the minimum chi-square method.
(PDF)
Acknowledgments
Analyses and simulation studies were run on Westgrid/Compute Canada with assistance from Dr. Belaid Moa.
Data Availability
All relevant data are within the paper.
Funding Statement
This work was supported by Natural Sciences and Engineering Research Council of Canada Discovery grants RGPIN-2013-327025 to LLEC and RGPIN-2016-03804 to MT (https://www.nserc-crsng.gc.ca). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
References
- 1. Williams BK, Nichols JD, Conroy MJ. Analysis and Management of Wildlife Populations: Modeling, Estimation, and Decision Making. San Diego, CA: Academic Press; 2002.
- 2. Otis D, Burnham K, White G, Anderson D. Statistical inference from capture data on closed animal population. Wildlife Monographs. 1978;62:3–135.
- 3. Sanathanan L. Estimating the size of a multinomial population. The Annals of Mathematical Statistics. 1972;43:142–152. doi: 10.1214/aoms/1177692709
- 4. Burnham KP, Overton WS. Estimation of the size of a closed population when capture probabilities vary among animals. Biometrika. 1978;65:625–633. doi: 10.1093/biomet/65.3.625
- 5. Huggins RM. On the statistical analysis of capture experiments. Biometrika. 1989;76:133–140. doi: 10.1093/biomet/76.1.133
- 6. Pledger S. Unified maximum likelihood estimates for closed capture-recapture models using mixtures. Biometrics. 2000;56:434–442. doi: 10.1111/j.0006-341X.2000.00434.x
- 7. Anan O, Böhning D, Maruotti A. Uncertainty estimation in heterogeneous capture-recapture count data. Journal of Statistical Computation and Simulation. 2017;87:2094–2114. doi: 10.1080/00949655.2017.1315668
- 8. Huggins R. A note on the difficulties associated with the analysis of capture-recapture experiments with heterogeneous capture probabilities. Statistics and Probability Letters. 2001;54:147–152. doi: 10.1016/S0167-7152(00)00233-9
- 9. Coull BA, Agresti A. Random effects modeling of multiple binomial responses using the multivariate binomial logit-normal distribution. Biometrics. 2000;56:73–80. doi: 10.1111/j.0006-341X.2000.00073.x
- 10. Dorazio RM, Royle JA. Mixture models for estimating the size of a closed population when capture rates vary among individuals. Biometrics. 2003;59:315–364. doi: 10.1111/1541-0420.00042
- 11. Link WA. Nonidentifiability of population size from capture-recapture data with heterogeneous detection probabilities. Biometrics. 2003;59:1123–1130. doi: 10.1111/j.0006-341X.2003.00129.x
- 12. Sanz-Aguilar A, Jovani R, Melian CJ, Pradel R, Tella JL. Multi-event capture-recapture analysis reveals individual foraging specialization in a generalist species. Ecology. 2015;96(6):1650–1660. doi: 10.1890/14-0437.1
- 13. Aguirre JL. Ciconia ciconia. In: Atlas de las aves en invierno en España 2007-2010. Ministerio de Agricultura, Alimentación y Medio Ambiente-SEO/BirdLife, Madrid, Spain; 2013. p. 152–153.
- 14. Berkson J. Minimum chi-square, not maximum likelihood. Annals of Statistics. 1980;8:457–487. doi: 10.1214/aos/1176345003
- 15. Alvarez C, Aguayo A, Rueda R, Urban J. A note on the stock size of humpback whales along the Pacific coast of Mexico. Reports of the International Whaling Commission. 1990;Special Issue 12:191–193.
- 16. Yoshizaki J, Brownie C, Pollock KH, Link WA. Modeling misidentification errors that result from use of genetic tags in capture-recapture studies. Environmental and Ecological Statistics. 2011;18:27–55. doi: 10.1007/s10651-009-0116-1
- 17. Qi L, Hu M, Chi L, Liao J. Estimated total number of second children based on three sources: the case of the city of Chengdu, Sichuan, China, for the year 2018. Mathematical Population Studies. 2022;29:1–16. doi: 10.1080/08898480.2021.1915638
- 18. R Core Team. R: A Language and Environment for Statistical Computing; 2017. Available from: https://www.R-project.org/.
- 19. Kempthorne O. The classical problem of inference--goodness of fit. In: Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability. vol. 1. Berkeley, CA: University of California Press; 1967. p. 235–249.
- 20. Easterling RG. Goodness of fit and parameter estimation. Technometrics. 1976;18:1–9. doi: 10.2307/1267910
- 21. Kempthorne O, Folks JL. Probability, Statistics, and Data Analysis. Ames, IA: Iowa State University Press; 1971.
- 22. Fisher RA. Statistical Methods for Research Workers. Edinburgh, U.K.: Oliver and Boyd; 1925.
- 23. Neyman J. Contribution to the theory of χ2 test. In: Proceedings of the Berkeley Symposium on Mathematical Statistics and Probability. Berkeley, CA: University of California Press; 1949. p. 239–273.
- 24. Efron B. Bootstrap methods: Another look at the jackknife. The Annals of Statistics. 1979;7(1):1–26. doi: 10.1214/aos/1176344552
- 25. Buckland S, Garthwaite P. Quantifying precision of mark-recapture estimates using the bootstrap and related methods. Biometrics. 1991;47:255–268. doi: 10.2307/2532510
- 26. Norris J, Pollock K. Including model uncertainty in estimating variances in multiple capture studies. Environmental and Ecological Statistics. 1996;3:235–244. doi: 10.1007/BF00453012
- 27. Yang X, Pal N, Ackleh AS, Carter J. A case study of green tree frog population size estimation by repeated capture-mark-recapture method with individual tagging: A parametric bootstrap method vs Jolly-Seber method. Journal of Statistical Computation and Simulation. 2011;81:1879–1895. doi: 10.1080/00949655.2010.507764
- 28. International Working Group for Disease Monitoring and Forecasting. Capture-recapture and multiple record systems estimation 1: History and theoretical development. American Journal of Epidemiology. 1995;142:1047–1058. doi: 10.1093/oxfordjournals.aje.a117559
- 29. Mao Y. Exploring the minimum chi-square method for multiple-occasion capture-recapture data; 2016. Unpublished Master’s project, Department of Mathematics and Statistics, University of Victoria.
- 30. Edwards W, Eberhardt L. Estimating cottontail abundance from livetrapping data. The Journal of Wildlife Management. 1967;31(1):87–96. doi: 10.2307/3798362