Abstract
Given a pool of motorists, how do we estimate the total intensity of those who had a prespecified number of traffic accidents in the past year? We previously proposed the u,v method as a solution to estimation problems of this type. In this paper, we prove that the u,v method provides asymptotically efficient estimators in an important special case.
1. The u,v Method
Given a pool of motorists, how do we estimate the total intensity of those in the pool who had a prespecified number of traffic accidents in a given time period? We may also consider patients with a prespecified number of heart attacks, or salesmen with a prespecified number of disgruntled customers, etc. In general, let θi be the intensity and Xi the number of occurrences of a certain type of event for the ith individual in a pool of size n. Suppose that for 1 ≤ i ≤ n, conditionally on θi, Xi has the Poisson distribution with mean θi. We are interested in estimating the sum
1.1 | Sn ≡ ∑i=1n u(Xi)θi,
where u(x) is a known “utility function” dictated by practical considerations. In the examples above, Sn is the sum of the intensity θi for those individuals with Xi = a traffic accidents (heart attacks, disgruntled customers, etc.) for a prespecified integer a, if
1.2 | u(x) = ua(x) ≡ 1{x = a}.
Robbins (1) considered estimation of the sum in 1.1 and certain other related quantities for general, but known, conditional distributions F(x|y) of Xi given θi = y. The solution he proposed, called the u,v method, estimates Sn by
1.3 | Vn ≡ ∑i=1n v(Xi),
if there exists a function v(x) such that
1.4 | ∫ {v(x) − u(x)y}F(dx|y) = 0, ∀ y.
In the Poisson case, Eq. 1.4 has the unique solution
1.5 | v(x) = xu(x − 1) for x ≥ 1, with v(0) ≡ 0,
provided that Σx=0∞|u(x)|yx/x! < ∞ for all y > 0.
In this paper, we consider the asymptotic efficiency of the u,v method. We prove the asymptotic efficiency of 1.3 for the estimation of 1.1 in the special case of Eq. 1.2 in the Poisson setting in Section 2. In Section 3, we discuss related problems and extensions to the estimation of the sums of u(Xi, θi) for general utility functions u(x, y) and general conditional distributions F(x|y).
2. The Poisson Case
Let f(x|y) ≡ e−yyx/x!, x = 0, 1, 2, … , be the Poisson probability mass function with intensity y > 0, and let 𝒢 be a known family of probability distributions with support (0, ∞). Suppose (X, θ), (X1, θ1), (X2, θ2), … are independent identically distributed random vectors such that
2.1 | θ ∼ G and P(X = x | θ) = f(x|θ), x = 0, 1, 2, … ,
where G ∈ 𝒢 is an unknown distribution. We consider in this section estimation of
2.2 | Sn ≡ ∑i=1n ua(Xi)θi,
with the ua in Eq. 1.2 for a given a. By the u,v method, 2.2 should be estimated by
2.3 | Vn = ∑i=1n (a + 1)1{Xi = a + 1} = (a + 1)#{i ≤ n ∶ Xi = a + 1},
as in 1.3 and Eq. 1.5. For example, according to 2.3, the total intensity of those motorists with no traffic accidents in the past year is estimated by the total number of motorists with exactly one accident in the past year.
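The zero-accident example can be checked by simulation. The sketch below is our illustration (the Gamma mixing distribution is an arbitrary choice, not one made in the paper): it draws θi from a Gamma distribution, generates Poisson counts, and compares Vn with Sn on average:

```python
import math, random

random.seed(1)

def rpois(y):
    """Poisson(y) sampler by inversion; adequate for moderate y."""
    u, x, p = random.random(), 0, math.exp(-y)
    c = p
    while u > c:
        x += 1
        p *= y / x
        c += p
    return x

def one_trial(n=2000, a=0):
    """Draw theta_i ~ Gamma(2, 0.5) and X_i | theta_i ~ Poisson(theta_i); return
    (V_n, S_n) with V_n = (a+1) * #{X_i = a+1} and S_n = sum of theta_i over {X_i = a}."""
    thetas = [random.gammavariate(2.0, 0.5) for _ in range(n)]
    xs = [rpois(t) for t in thetas]
    S = sum(t for t, x in zip(thetas, xs) if x == a)
    V = (a + 1) * sum(1 for x in xs if x == a + 1)
    return V, S

diffs = [V - S for V, S in (one_trial() for _ in range(200))]
bias = sum(diffs) / len(diffs)   # near 0: V_n is unbiased for S_n
```

The average of Vn − Sn over repeated pools is close to zero, reflecting the unbiasedness built into Eq. 1.4.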
The estimator 2.3 can also be derived from an empirical Bayes point of view. If the distribution G in 2.1 is known, then the Bayes estimator of 2.2 under the squared error loss is the conditional expectation

Sn,G ≡ EG(Sn | X1, … , Xn) = ∑i=1n ua(Xi)EG(θ | Xi),

which can be written as
2.4 | Sn,G = τa(G)∑i=1n ua(Xi),
with
2.5 | τa(G) ≡ EG(θ | X = a) = (a + 1)fG(a + 1)/fG(a),
where fG(x) ≡ ∫ f(x|y)dG(y) is the marginal probability mass function of X. An empirical Bayes estimator of 2.2 can be obtained by substituting the conditional expectation τa(G) with a suitable estimator, say τ̃a,n, in the Bayes estimator Sn,G in 2.4; i.e.
2.6 | S̃n ≡ τ̃a,n∑i=1n ua(Xi).
If G is completely unknown, we may estimate fG(x) by its empirical version and consequently estimate τa(G) by
2.7 | τ̂a,n ≡ (a + 1)f̂n(a + 1)/f̂n(a), f̂n(x) ≡ n−1#{i ≤ n ∶ Xi = x}.
This leads to the estimator 2.3 via

τ̂a,n∑i=1n ua(Xi) = nf̂n(a)τ̂a,n = (a + 1)nf̂n(a + 1) = Vn.
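The agreement between the empirical Bayes plug-in of 2.6–2.7 and the direct u,v count is an exact algebraic identity, easy to verify in code (a sketch in our notation; the function names are invented):

```python
from collections import Counter

def uv_estimate(xs, a):
    """Direct u,v estimator V_n (2.3): (a+1) times the count of X_i = a+1."""
    return (a + 1) * sum(1 for x in xs if x == a + 1)

def empirical_bayes_estimate(xs, a):
    """Plug-in empirical Bayes estimator n * fhat_n(a) * tauhat_{a,n},
    with tauhat_{a,n} = (a+1) * fhat_n(a+1) / fhat_n(a) as in 2.7."""
    n = len(xs)
    counts = Counter(xs)
    f_a, f_a1 = counts[a] / n, counts[a + 1] / n
    tau_hat = (a + 1) * f_a1 / f_a      # requires fhat_n(a) > 0
    return n * f_a * tau_hat

xs = [0, 1, 1, 0, 2, 3, 1, 0, 0, 5, 1, 2]
same = all(
    abs(empirical_bayes_estimate(xs, a) - uv_estimate(xs, a)) < 1e-9
    for a in (0, 1, 2)                  # values of a with fhat_n(a) > 0
)
```

When f̂n(a) = 0 the plug-in ratio is undefined, but then no term of 2.2 is active and the natural convention is a zero estimate.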
The relationship 2.6 can be reversed to derive estimates of τa(G) from those of 2.2, say S̃n ≡ S̃n(X1, … , Xn); i.e.
2.8 | τ̃a,n ≡ S̃n/∑i=1n ua(Xi) = S̃n/{nf̂n(a)}.
This provides a vehicle for the investigation of the efficiency of S̃n via the efficiency of τ̃a,n. Let H* ≡ H∗,G be the tangent space of the family {fG∶G ∈ 𝒢} at G,
2.9 | H* ≡ the closure of the linear span of {ρη ∶ η ∈ 𝒞G} in L2(fG),
where 𝒞G is the collection of all “differentiable” paths η∶[0, 1] → 𝒢 satisfying
2.10 | limt→0+ EG[t−1{fη(t)(X)/fG(X) − 1} − ρη(X)]2 = 0,
with the fG in Eq. 2.5, and
2.11 | ρη(x) ≡ limt→0+ t−1{fη(t)(x) − fG(x)}/fG(x)
is the score function for the path η in the parameter space 𝒢. See Bickel et al. (2). Define
2.12 | ψ(x; G) ≡ {xua(x − 1) − τa(G)ua(x)}/fG(a),
with the ua in Eq. 1.2. It will be shown in the proof of Theorem 2.1 that at each G ∈ 𝒢 the efficient influence function for the estimation of τa(·) is
2.13 | ψ*(·; G) ≡ the projection of ψ(·; G) onto H* in L2(fG),
where H* is the tangent space given in 2.9.
Theorem 2.1 (i) A sequence {S̃n ≡ S̃n(X1, … , Xn)} is asymptotically efficient for the estimation of the {Sn} in 2.2 if and only if {τ̃a,n} in 2.8 is asymptotically efficient for the estimation of the functional τa(G) in 2.5. In this case,
2.14 | ℒ((S̃n − Sn)/√n; PG) → N(0, fG2(a)σ12(G) + σ22(G)),
where σ12(G) ≡ EGψ*2(X; G) with the ψ* in 2.13, σ22(G) ≡ EGua(X)VarG(θ|X) with the ua in Eq. 1.2, and fG is the marginal probability mass function of X. (ii) If G is completely unknown, i.e., 𝒢 = {all distributions on (0, ∞)}, then {Vn} in 2.3 is asymptotically efficient for the estimation of 2.2 and
2.15 | ℒ((Vn − Sn)/√n; PG) → N(0, fG2(a)σ12(G) + σ22(G)), with σ12(G) = EGψ2(X; G) = τa(G){a + 1 + τa(G)}/fG(a).
Proof: The proof has three parts.
Step 1. Decomposition of (S̃n − Sn)/√n: By 2.8 and 2.4,
2.16 | (S̃n − Sn)/√n = f̂n(a)ξn,1 + ξn,2,
where f̂n(a) is as in 2.7, ξn,1 ≡ √n{τ̃a,n − τa(G)}, and ξn,2 ≡ {Sn,G − Sn}/√n. Conditionally on {Xi, i ≥ 1}, Sn,G − Sn is a sum of independent (not identically distributed) random variables with mean zero, so that by the Lindeberg central limit theorem and the law of large numbers
2.17 | ℒ(ξn,2 | X1, X2, …) → N(0, σ22(G))
almost surely for all {Xi, i ≥ 1}. The Lindeberg condition can be verified by the law of large numbers, but we shall omit the details. Because the limiting distribution in Eq. 2.17 does not depend on {Xi, i ≥ 1} and f̂n(a) → fG(a), by Eq. 2.16
2.18 | limn ℒ((S̃n − Sn)/√n; PG) = {limn ℒ(fG(a)ξn,1; PG)} ★ N(0, σ22(G)),
provided that either (S̃n − Sn)/√n or ξn,1 ≡ √n{τ̃a,n − τa(G)} is stochastically bounded, where ℒ(Z; P) is the distribution of Z under probability P and ★ stands for convolution. Thus, {S̃n} is asymptotically efficient for the estimation of Sn if and only if {τ̃a,n} is asymptotically efficient for the estimation of τa(G).
Step 2. Efficient influence function for the estimation of τa(G): It follows from the information bound in standard semiparametric estimation theory that the limiting distribution of asymptotically efficient {τ̃a,n} is
2.19 | ℒ(√n{τ̃a,n − τa(G)}; PG) → N(0, EGψ*2(X; G)),
provided that ψ* is the efficient influence function for the estimation of τa(G). By 2.13, this is the case if for all η ∈ 𝒞G
2.20 | (∂/∂t)τa(η(t))|t=0+ = EGψ*(X; G)ρη(X),
where ρη is as in Eq. 2.11. See ref. 2. Thus, it suffices to verify Eq. 2.20 for the proof of Theorem 2.1 part i.
Because fG(x) > 0 for all x ≥ 0, by Eq. 2.11 t−1{fη(t)(x) − fG(x)} → fG(x)ρη(x), so that by Eq. 2.5 and 2.12

limt→0+ t−1{τa(η(t)) − τa(G)} = τa(G){ρη(a + 1) − ρη(a)} = EGψ(X; G)ρη(X) = EGψ*(X; G)ρη(X),

the last equality because ψ(·; G) − ψ*(·; G) is orthogonal to H* while ρη ∈ H*. Therefore, Eq. 2.20 holds.
Step 3. Asymptotic efficiency of the u,v method: Let ψ be as in 2.12 and τ̂a,n be as in 2.7. By the central limit theorem and the strong law of large numbers, √n(τ̂a,n − τa(G)) converges in distribution to N(0, EGψ2(X; G)). Because Vn is the estimator of Sn corresponding to τ̂a,n by 2.8, it suffices to show ψ = ψ* in view of Theorem 2.1 part i and its proof.
For y > 0 define η(t) ≡ (1 − t)G + tδy, where δy puts its whole mass at y. Set ρ(y)(x) ≡ {f(x|y) − fG(x)}/fG(x). Then, EGρ(y)2(X) < ∞ by the Poisson assumption, and since fη(t)(x) = fG(x) + t{f(x|y) − fG(x)} for this path, the left-hand side of Eq. 2.10 is

EG[t−1{fη(t)(X)/fG(X) − 1} − ρ(y)(X)]2 = EG[ρ(y)(X) − ρ(y)(X)]2 = 0.
Thus, ρ(y) is in the tangent space H* for all y > 0 by 2.9. If h ∈ L2(fG) is orthogonal to H*, then

0 = EGh(X)ρ(y)(X) = ∑x=0∞ h(x)f(x|y) − EGh(X)

for all y > 0, so that h(x) = EGh(X) for all x ≥ 0 by the completeness of the Poisson family. This implies H* = L2(fG) ∩ {h ∶ EGh(X) = 0}. Hence, since EGψ(X; G) = 0, ψ* = ψ by 2.13 and the proof is complete.
3. Discussion
3.1. Related Problems.
Let Yi be random variables such that E[Yi|θi, Xi] = λθi. Suppose Yi are unobservable and λ is known. Consider the prediction of
3.1 | ∑i=1n u(Xi)Yi
based on observations X1, … , Xn. For example, we may want to predict the total number of accidents in the coming year for the group of motorists with no accidents in the past year, with λ = 1.02 due to 2% growth of drivers in the region of concern. By the u,v method, 3.1 can be predicted by λVn if Eq. 1.4 holds, with the Vn in 1.3. The argument in Section 2 still applies here in the Poisson case with u(x) = ua(x) in Eq. 1.2: {λVn}, with the Vn in 2.3, is asymptotically efficient for the prediction of 3.1 with
3.2 | (λVn − ∑i=1n u(Xi)Yi)/√n →D N(0, λ2fG2(a)σ12(G) + EGua(X)VarG(Y1 | X1)),
where σ12(G) is as in Theorem 2.1.
In many applications, Yi are observable and the problem is to estimate λ. In this case, the u,v methodology provides the estimator
3.3 | λ̂n ≡ {∑i=1n u(Xi)Yi}/Vn.
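Reading 3.3 as the ratio of ∑ u(Xi)Yi to Vn, a quick simulation recovers λ. The sketch below is ours: the Gamma mixing distribution and Poisson-distributed Yi are illustrative assumptions, not requirements of the method, which only needs E[Yi|θi, Xi] = λθi:

```python
import math, random

random.seed(7)

def rpois(y):
    """Poisson(y) sampler by inversion; adequate for moderate y."""
    u, x, p = random.random(), 0, math.exp(-y)
    c = p
    while u > c:
        x += 1
        p *= y / x
        c += p
    return x

def estimate_lambda(xs, ys, a=0):
    """Ratio estimator for lambda: sum of u_a(X_i) Y_i divided by V_n."""
    num = sum(y for x, y in zip(xs, ys) if x == a)
    vn = (a + 1) * sum(1 for x in xs if x == a + 1)
    return num / vn

n, lam = 50000, 1.02
thetas = [random.gammavariate(2.0, 0.5) for _ in range(n)]
xs = [rpois(t) for t in thetas]
ys = [rpois(lam * t) for t in thetas]    # E[Y_i | theta_i] = lam * theta_i
lam_hat = estimate_lambda(xs, ys)
```

With n = 50,000 motorists the estimate lands close to the true λ = 1.02.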
The u,v method also produces estimates of variances. For example, if Eq. 1.4 holds, the variance EG(Vn − Sn)2 = nEG{v(X) − u(X)θ}2 can be estimated by
3.4 | ∑i=1n {v²(Xi) + v₂(Xi)},
with two applications of the u,v method, first to u₁ ≡ u² and then to u₂ ≡ v₁ − 2uv, where v₁, v₂ satisfy ∫ {vⱼ(x) − uⱼ(x)y}F(dx|y) = 0, ∀ y, j = 1, 2.
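In the Poisson case each of these u,v applications is the transform v(x) = xu(x − 1), so the two-stage recipe can be automated. The sketch below is our code (function names invented); for u = u0 the resulting variance estimate reduces to counting the sample values 1 and 2:

```python
def uv_transform(u):
    """Poisson u,v transform (Eq. 1.5): v(x) = x * u(x-1), v(0) = 0."""
    return lambda x: x * u(x - 1) if x >= 1 else 0.0

def variance_estimate(xs, u):
    """Estimate E(V_n - S_n)^2 = n E{v(X) - u(X)theta}^2 by applying the
    u,v transform first to u1 = u^2 and then to u2 = v1 - 2*u*v (Eq. 3.4)."""
    v = uv_transform(u)
    v1 = uv_transform(lambda x: u(x) ** 2)
    v2 = uv_transform(lambda x: v1(x) - 2.0 * u(x) * v(x))
    return sum(v(x) ** 2 + v2(x) for x in xs)

# For u = u_0 (Eq. 1.2 with a = 0) the estimate reduces to N_n(1) + 2 N_n(2),
# where N_n(x) is the number of X_i equal to x.
u0 = lambda x: 1.0 if x == 0 else 0.0
xs = [0, 1, 1, 2, 0, 3, 1, 2, 0]
est = variance_estimate(xs, u0)   # here N_n(1) + 2*N_n(2) = 3 + 4 = 7
```

For general a the same code yields (a + 1)²Nn(a + 1) + (a + 2)(a + 1)Nn(a + 2), whose expectation per individual matches E{v(X) − u(X)θ}².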
The u,v method can further be extended to obtain unbiased estimation of
3.5 | ∑i=1n ui(X1, … , Xn)θi.
If there exist functions vi(x1, … , xn) satisfying

∫ {vi(x1, … , xn) − ui(x1, … , xn)y}F(dxi|y) = 0

for all y, i ≤ n and {xj, j ≠ i}, then we may estimate 3.5 by
3.6 | Vn ≡ ∑i=1n vi(X1, … , Xn).
For example, in the exponential case f(x|y) ≡ y−1e−x/y1{x > 0}, the θi associated with the largest observation can be written as 3.5,

∑i=1n ui(X1, … , Xn)θi with ui(X1, … , Xn) ≡ 1{Xi = XRn},

and its unbiased estimate 3.6 is Vn = XRn − XRn−1 with

vi(X1, … , Xn) ≡ (Xi − maxj≠i Xj)1{Xi > maxj≠i Xj},
where Ri are the antiranks of the observations defined by XR1 < ⋯ < XRn.
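The exponential example can also be checked by simulation. In the sketch below (ours; the Uniform mixing distribution for the θi is an arbitrary choice), the average spacing XRn − XRn−1 tracks the average θ attached to the largest observation:

```python
import random

random.seed(3)

def trial(n=8):
    """theta_i ~ Uniform(0.5, 2), X_i | theta_i ~ Exponential with mean theta_i.
    Returns (V_n, theta attached to the largest X_i)."""
    thetas = [random.uniform(0.5, 2.0) for _ in range(n)]
    xs = [random.expovariate(1.0 / t) for t in thetas]
    order = sorted(range(n), key=lambda i: xs[i])   # indices R_1, ..., R_n
    vn = xs[order[-1]] - xs[order[-2]]              # X_{R_n} - X_{R_{n-1}}
    return vn, thetas[order[-1]]

pairs = [trial() for _ in range(40000)]
mean_v = sum(v for v, _ in pairs) / len(pairs)
mean_theta_max = sum(t for _, t in pairs) / len(pairs)
```

The two Monte Carlo averages agree to within sampling error, as the unbiasedness of 3.6 predicts.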
The related problems mentioned here and their applications were considered in refs. 1 and 3–5.
3.2. Extensions.
The applicability of our methodology is not limited to the sum of u(Xi)θi in 1.1. In general, 1.3 can be used to estimate
3.7 | Sn ≡ ∑i=1n u(Xi, θi)
for any integrable function u(x, y), as long as there exists a function v(x) satisfying

3.8 | ∫ {v(x) − u(x, y)}F(dx|y) = 0, ∀ y.
In fact, for the estimation of variance in 3.4, Eq. 3.8 holds for the pair ũ(x, y) and ṽ(x), with ũ(x, y) ≡ {v(x) − u(x)y}² and ṽ(x) ≡ v²(x) + v₂(x).
The asymptotic theory for the estimation of 3.7 is more complicated and will be studied elsewhere. Define
3.9 | Sn,G ≡ ∑i=1n EG{u(Xi, θi) | Xi}.
The asymptotic independence of (S̃n − Sn,G)/√n and (Sn,G − Sn)/√n can still be derived from the Lindeberg central limit theorem and the strong law of large numbers as in Section 2, but the rest of the argument there does not directly apply without the one-to-one linear mappings between estimates of Sn and τa(G) in 2.6 and 2.8.
Acknowledgments
This research was partially supported by the National Science Foundation and National Security Agency.
References
- 1. Robbins, H. (1988) in Statistical Decision Theory and Related Topics IV, eds. Gupta, S. S. & Berger, J. O. (Springer, New York), Vol. 1, pp. 265–270.
- 2. Bickel, P. J., Klaassen, C. A. J., Ritov, Y. & Wellner, J. A. (1992) Efficient and Adaptive Estimation for Semiparametric Models (Johns Hopkins Univ. Press, Baltimore).
- 3. Robbins, H. & Zhang, C.-H. (1988) Proc. Natl. Acad. Sci. USA 85, 3670–3672. doi: 10.1073/pnas.85.11.3670.
- 4. Robbins, H. & Zhang, C.-H. (1989) Proc. Natl. Acad. Sci. USA 86, 3003–3005. doi: 10.1073/pnas.86.9.3003.
- 5. Robbins, H. & Zhang, C.-H. (1991) Biometrika 78, 349–354.