Author manuscript; available in PMC: 2008 Jul 16.
Published in final edited form as: Ann Stat. 2003;31(2):512–535. doi: 10.1901/jaba.2003.31-512

CURRENT STATUS AND RIGHT-CENSORED DATA STRUCTURES WHEN OBSERVING A MARKER AT THE CENSORING TIME

MARK J. VAN DER LAAN, NICHOLAS P. JEWELL
PMCID: PMC2467478  NIHMSID: NIHMS46669  PMID: 18633452

Abstract

We study nonparametric estimation with two types of data structures. In the first data structure, n i.i.d. copies of (C, N(C)) are observed, where N is a finite state counting process jumping at time-variables of interest and C a random monitoring time. In the second data structure, n i.i.d. copies of $(C \wedge T,\ I(T \le C),\ N(C \wedge T))$ are observed, where N is a counting process with a final jump at time T (e.g., death). This data structure includes observing right-censored data on T and a marker variable at the censoring time.

In these data structures, easy-to-compute estimators are available, namely (weighted) pool-adjacent-violators estimators for the marginal distributions of the unobservable time variables, and the Kaplan–Meier estimator for the time T till the final observable event. These estimators ignore seemingly important information in the data. In this paper we prove that, at many continuous data generating distributions, the ad hoc estimators yield asymptotically efficient estimators of $\sqrt{n}$-estimable parameters.

Key words and phrases: Asymptotically linear estimator, asymptotically efficient estimator, current status data, right-censored data, isotonic regression

1. Introduction

In this paper we study nonparametric estimation with two types of data structures. First, we discuss these two data structures in detail. Subsequently, we provide an overview of the rest of the paper.

1.1. Current status data on a finite counting process

Consider a finite state counting process $N(t) = \sum_{j=1}^k I(T_j \le t)$, $T_1 < \cdots < T_k$, where Tj is the time-variable at which a specified event occurs and where N jumps from value j − 1 to j at time Tj. The number of jumps k is fixed and known. We allow that there is a positive probability that the counting process never reaches jump j0 for any particular j0 ∈ {1, …, k}; since T1 < ··· < Tk, this implies that there is also a positive probability that N never reaches jump j for j = j0, …, k: that is, we allow multivariate distributions of (T1, …, Tk) with P(Tj = ∞) > 0 for j = j0, …, k. In this manner we allow applications in which the number of jumps of N is random on {1, …, k}.

We consider the data structure (C, N (C)) for a single random monitoring time C. The only assumption is that C is independent of N: the cumulative distribution G of C, and the probability distribution F of N are unspecified. Note that the distribution of N, denoted by F, is not a cumulative distribution function, but a probability distribution that is identified by the multivariate cumulative distribution of (T1, …, Tk).

Such data structures occur in cross-sectional studies where each subject is monitored once. For example, in some carcinogenicity experiments, one can only determine a discretized occult tumor size at time t in a randomly sampled mouse, as measured by N (t), by sacrificing a mouse at time t. In this example, T1 might represent time till onset of the tumor and T2, …, Tk might correspond with times till increasing sizes of the tumor. Similarly, Tj might denote the age at which a child has mastered the j th skill among a set of k skills ordered in difficulty. We refer to Jewell and van der Laan (1995) for additional applications.

The distribution of (C, N (C)) depends on the distribution of T⃗ = (T1, …, Tk) only through the marginal distributions Fj of Tj, j = 1, …, k (see Section 2). In this problem, the NPMLE of the distribution of Tj requires an iterative algorithm. On the other hand, an ad hoc method for estimation of the distribution of Tj is directly available: reduce the observation (C, N (C)) to a standard current status observation (C, Δj = I (TjC)) on Tj. Then one can estimate the distribution of Tj with the NPMLE based on the reduced current status observations, which we will refer to as the reduced data NPMLE (RNPMLE). This estimator provides regular and asymptotically linear estimators of pathwise differentiable functionals of Fj such as μj = ∫(1 − Fj)(u)r(u) du, for a given r, in the nonparametric model under certain conditions [Groeneboom and Wellner (1992)]. Previous work and examples of traditional current status data on a time variable T can be found in Diamond, McDonald and Shah (1986), Jewell and Shiboski (1990), Diamond and McDonald (1992), Keiding (1991) and Sun and Kalbfleisch (1993). In its nonparametric setting, the current status data structure is also known as case I interval censored data [Groeneboom and Wellner (1992)]. Current status data commonly arise in epidemiological investigations of the natural history of disease and in animal tumorigenicity experiments. Jewell, Malani and Vittinghoff (1994) give two examples that arise from studies of Human Immunodeficiency Virus (HIV) disease.

Note that the RNPMLE of Fj ignores the value of N (C), beyond information on whether N (C) ≥ j or not. For example, if N (t) is tumor size in a carcinogenicity experiment, then the simple current status estimator of the distribution of time, T1, till onset of tumor would not distinguish between an observation (C, N (C)) with N (C) large and an observation (C, N (C)) with N (C) small but larger than 0, while the latter observation seems to suggest that onset occurred recently. Nonetheless, we establish that the RNPMLE yields efficient estimators of pathwise differentiable parameters at a large class of continuous data generating distributions of interest.
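The reduction described above can be sketched in code. This is an illustrative sketch, not code from the paper; the function name and toy data are ours. The only fact used is that $\Delta_j = I(T_j \le C)$ is recoverable from N(C) alone, since the jth event has occurred by time C exactly when the process has made at least j jumps.

```python
import numpy as np

def reduce_to_current_status(C, N_at_C, j):
    """Reduce (C, N(C)) to current status data (C, Delta_j = I(N(C) >= j))."""
    C = np.asarray(C, dtype=float)
    N_at_C = np.asarray(N_at_C, dtype=int)
    # T_j <= C  iff  the counting process has made at least j jumps by time C
    return C, (N_at_C >= j).astype(int)

# toy monitoring times and observed jump counts
C = np.array([1.0, 2.0, 3.0])
N_at_C = np.array([0, 2, 1])
_, delta2 = reduce_to_current_status(C, N_at_C, j=2)   # current status on T_2
```

The reduced pairs (C, Δj) are then fed into any standard current status procedure, e.g., the RNPMLE.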

1.2. Current status data on a finite counting process when the final event is right censored

We also consider the data structure $(\tilde{T}_k = C \wedge T_k,\ N(\tilde{T}_k))$ for a finite state counting process $N(t) = \sum_{j=1}^k I(T_j \le t)$, where Tk represents the final event (say death) which is right censored by the monitoring time C, and k is known. Note that this observation includes observing the failure indicator $I(\tilde{T}_k = T_k)$. For example, consider a carcinogenicity experiment with mice in which T1 is time till onset of colon tumor, T2 time to liver metastasis and T3 time to death from tumor, where we assume that colon tumors do not cause death except through liver failure secondary to metastasis. Here C is either a sacrificing time or time till death from any unrelated cause.

Consider another example concerning estimation of the survival function of the time T = J − I between time I at seroconversion and time J at death of a hemophiliac patient infected with HIV. For this purpose we observe n i.i.d. subjects in a fixed time-interval of 10 years. If we assume that the time I at seroconversion of the subject is observed (which is approximately true for hemophiliacs), then the subject’s survival time T is right censored by C ≡ 10 − I, where T will play the role of Tk. We define Tj as the time till a given monotone “surrogate” process Z(t) achieves a particular value among a set of k − 1 increasing values, j = 1, …, k − 1, where we assume that death T = Tk always and only occurs after the value $Z(T_{k-1})$ has been reached. Let $N(t) = \sum_{j=1}^k I(T_j \le t)$ be the counting process. Here Z(t) measures the progression of the disease of the subject t years after seroconversion; for example, Z(t) might be a measure of viral load of the subject t years after seroconversion, where it may be reasonable to assume that the viral load is a nondecreasing process in the absence of treatment.

Suppose that for every subject who did not die before the end of the study C one measures the “surrogate” Z(C) at time C only. In other words, we observe failure times only for subjects who fail before the end of follow-up, and for every subject who is alive at the end of follow-up we also have a marker indicating future prognosis. Note that the observed data on a subject are given by $(\tilde{T} = T \wedge C,\ I(\tilde{T} = T),\ Z(\tilde{T}))$. We only assume that C is independent of Z. A seemingly ad hoc estimator of S(t) = P(T > t) is the Kaplan–Meier estimator, which simply ignores the marker information. In this example, a natural question is whether one can improve on the Kaplan–Meier estimator using the information in the surrogate process Z. In this paper we prove that the Kaplan–Meier estimator is asymptotically efficient at many continuous data generating distributions for which the Fj have compact support.

A special case of this data structure has been treated in the literature. Consider a carcinogenicity experiment with $N(t) = \sum_{j=1}^2 I(T_j \le t)$, where T1 is time till onset of tumor and T2 is time till death from tumor. Thus one observes $(\tilde{T}_2 = C \wedge T_2,\ N(\tilde{T}_2))$. This data structure has been considered in Kodell, Shaw and Johnson (1982), Dinse and Lagakos (1982), Turnbull and Mitchell (1984), van der Laan, Jewell and Peterson (1997) and, recently, Groeneboom (1998). The NPMLE for this data structure requires an iterative algorithm: Turnbull and Mitchell (1984) implemented the NPMLE via the EM-algorithm (using an initial distribution with point masses at each data point so that the EM-algorithm indeed converges to the NPMLE), while Groeneboom (1998) implements the NPMLE by maximizing the actual likelihood with a modern optimization algorithm. In this problem, an ad hoc estimator of the distribution of T2 is the Kaplan–Meier estimator based on the reduced data $(\tilde{T}_2,\ \Delta_2 = I(\tilde{T}_2 = T_2))$. In Dinse and Lagakos (1982), the Kaplan–Meier estimator of F2 was proposed and it was suggested that the NPMLE might be more efficient than the Kaplan–Meier estimator. In van der Laan, Jewell and Peterson (1997) it is shown that the Kaplan–Meier estimator is efficient under a weak condition on (F1, F2). Moreover, an isotonic regression estimator of F1 was provided: note that estimation of F1 is complicated by the fact that for some subjects one only observes T2, and thus only that T1 < T2, where T2 cannot be viewed as an independent monitoring time for T1. We note here that, in van der Laan, Jewell and Peterson (1997), a simulation study was carried out which incorrectly implements the NPMLE, so that finite sample comparisons between the Kaplan–Meier estimator and the NPMLE remain open to study [specifically, the derivation of the score equations in van der Laan, Jewell and Peterson (1997) for the NPMLE was not valid, since the authors incorrectly assumed that the NPMLE of F1 is strictly smaller than the NPMLE of F2].
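The ad hoc product-limit estimator discussed above is easy to compute from the reduced data alone. The following is a hedged, self-contained sketch (our own toy implementation and data, not the paper's code) of the Kaplan–Meier estimator of S2(t) = P(T2 > t) computed from pairs of a follow-up time and a failure indicator, discarding all marker and intermediate-event information.

```python
import numpy as np

def kaplan_meier(t_tilde, delta):
    """Product-limit estimator from (T_tilde = min(T, C), delta = I(T <= C)).

    Returns the ordered failure times and S(t) evaluated just after each."""
    t_tilde = np.asarray(t_tilde, dtype=float)
    delta = np.asarray(delta, dtype=int)
    times = np.unique(t_tilde[delta == 1])              # observed failure times
    surv, s = [], 1.0
    for t in times:
        at_risk = np.sum(t_tilde >= t)                  # n_i: still under observation at t
        events = np.sum((t_tilde == t) & (delta == 1))  # d_i: failures exactly at t
        s *= 1.0 - events / at_risk                     # product-limit update
        surv.append(s)
    return times, np.array(surv)

# toy data: five subjects; delta = 0 marks a right-censored observation
t_tilde = [2.0, 3.0, 3.0, 5.0, 6.0]
delta   = [1,   1,   0,   1,   0]
times, surv = kaplan_meier(t_tilde, delta)
```

The efficiency claims of the paper concern exactly this estimator applied to the reduced data $(\tilde{T}_2, \Delta_2)$.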

1.3. Organization and overview of results

In Section 2 we prove, for the data structure of Section 1.1, that if the Fj’s are continuous with Lebesgue density bounded away from zero on [0, τj] and zero elsewhere, and G is also continuous, then any estimator of a parameter $\mu = \Phi(F)$ that is regular and asymptotically linear at PF,G is also asymptotically efficient. The complexity of the NPMLE is discussed, including the fact that it is more efficient at many data generating distributions with singular pairs Fj1, Fj2; for example, F1 discrete and F2 continuous.

In Section 3, we prove an analogous result for the nonparametric model with the data structure $(C \wedge T_k,\ N(C \wedge T_k))$. This shows that the Kaplan–Meier estimator of the distribution of Tk, based on the reduced data $(\tilde{T}_k,\ \Delta_k = I(T_k \le C))$, is asymptotically efficient at many continuous data generating distributions, extending the result in van der Laan, Jewell and Peterson (1997) for the case k = 2. Moreover, simple isotonic regression estimators for the distributions Fj, j = 1, …, k − 1, are proposed that also yield asymptotically efficient estimators of smooth functionals by our general result.

2. Current status data on a counting process

2.1. Traditional current status data

Traditional current status data can be viewed as current status data on a simple counting process as follows. Let T be a univariate failure time of interest and define the process Δ(t) = I(T ≤ t) as the counting process with one single jump at point T. Let Y = (C, Δ(C)) represent current status data on Δ at a monitoring time C. We assume that C is independent of T [i.e., of Δ(·)]. The parameter of interest is the distribution F of T.

The properties of the NPMLE Fn of the distribution of T were established in Groeneboom and Wellner (1992). Here the NPMLE is defined as the maximum likelihood estimator over all discrete distributions with jumps at the monitoring times. Beyond proving a limit distribution result for Fn, these authors also established efficiency of smooth functionals of Fn with a closed form expression of the limit variance so that Wald-type confidence intervals are directly available. Huang and Wellner (1995) provide an alternative proof of asymptotic linearity of the NPMLE of smooth functionals of F under weak conditions.

We refer to Bickel, Klaassen, Ritov and Wellner (1993) for definitions of a regular, asymptotically linear and efficient estimator, and of the influence curve of an estimator. The semiparametric information bound at PF,G is defined as the infimum of parametric information bounds over a specified class of parametric submodels. We choose as parametric one-dimensional submodels

$$\bigl\{\varepsilon \to P_{F_{\varepsilon,h_1},\,G_{\varepsilon,h_2}} : \|h_j\|_\infty < \infty,\ j = 1, 2,\ \textstyle\int h_1\,dF = \int h_2\,dG = 0\bigr\},$$

where $dF_{\varepsilon,h_1}(\cdot) = (1 + \varepsilon h_1(\cdot))\,dF(\cdot)$, $dG_{\varepsilon,h_2}(\cdot) = (1 + \varepsilon h_2(\cdot))\,dG(\cdot)$ and ε is the unknown parameter with parameter space [−δ, δ] for some small δ > 0. The tangent space at PF,G is now defined as the closure in $L_0^2(P_{F,G})$ of the linear span of all the scores of these one-dimensional submodels, where, for a given measure μ, we define $L_0^2(\mu) = \{h : \int h^2\,d\mu < \infty,\ \int h\,d\mu = 0\}$ as the Hilbert space endowed with inner product $\langle h_1, h_2\rangle_\mu = \int h_1(y)h_2(y)\,d\mu(y)$. Thus the tangent space at PF,G is a sub-Hilbert space of $L_0^2(P_{F,G})$.

In this paper it is particularly important to realize that efficiency of an estimator is a local property in the sense that a regular estimator can be efficient at a particular PF,G and inefficient at another element of the model.

Lemma 2.1

Consider the nonparametric model for Y = (C, Δ(C)), where Δ(·) ≡ I(T ≤ ·), T has unspecified distribution F and C is independent of T with unspecified distribution G. We observe n i.i.d. observations of Y = (C, Δ(C)). Consider the parameter μ = ∫(1 − F)(u)r(u) du for a given function r, and the estimator μn = ∫(1 − Fn)(u)r(u) du, where Fn is the NPMLE of F. We have that μn is regular and asymptotically linear at any (F, G) for which F is continuous with density fT > 0 on [0, M] and zero elsewhere (M < ∞), g(x) = dG/dx > 0 on [0, M], and r is bounded on [0, M].

The influence curve of μn is given by

$$IC(Y \mid F, g, r) = \frac{r(C)}{g(C)}\bigl(F(C)(1 - \Delta) - (1 - F(C))\Delta\bigr). \tag{1}$$

The variance of IC is given by

$$\mathrm{VAR}(IC) = \int \frac{r^2(c)}{g(c)}\,F(c)(1 - F(c))\,dc.$$

This lemma is proved in Huang and Wellner (1995).
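As a numeric sanity check (ours, not part of the paper), one can verify the mean-zero property and the variance formula of the influence curve (1) by deterministic quadrature in a simple hypothetical case: F = G = Uniform[0, 1] and r ≡ 1, for which $\mathrm{VAR}(IC) = \int_0^1 u(1-u)\,du = 1/6$.

```python
import numpy as np

def ic(c, delta, F, g, r):
    # influence curve (1) of mu_n = int (1 - F_n) r du
    return r(c) / g(c) * (F(c) * (1 - delta) - (1 - F(c)) * delta)

F = lambda c: c        # Uniform[0,1] c.d.f.
g = lambda c: 1.0      # Uniform[0,1] density
r = lambda c: 1.0

# midpoint quadrature over c; Delta | C = c is Bernoulli(F(c))
n = 200000
c = (np.arange(n) + 0.5) / n
w = 1.0 / n
mean = np.sum(((1 - F(c)) * ic(c, 0, F, g, r)
               + F(c) * ic(c, 1, F, g, r)) * g(c)) * w
var = np.sum(((1 - F(c)) * ic(c, 0, F, g, r) ** 2
              + F(c) * ic(c, 1, F, g, r) ** 2) * g(c)) * w
```

Here `mean` is 0 (the integrand vanishes pointwise) and `var` agrees with 1/6 up to quadrature error.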

We can also prove the following tangent space result.

Lemma 2.2

Consider the nonparametric model for Y = (C, Δ(C)), where Δ(·) ≡ I(T ≤ ·), T has unspecified distribution F and C is independent of T with unspecified distribution G. We observe n i.i.d. observations of Y = (C, Δ(C)). Suppose that:

  1. F has a Lebesgue density f with f > 0 on [0, τF) and, if τF < ∞ (τF = ∞ is allowed), then f = 0 on (τF, ∞), and

  2. G has a Lebesgue density g.

We allow F({∞}) > 0. Then the tangent space at PF,G equals $L_0^2(P_{F,G})$. This implies that an estimator of a parameter μ(F) which is regular and asymptotically linear at PF,G is also asymptotically efficient if F, G satisfy conditions 1 and 2.

In Gill, van der Laan and Robins (1997) it is proved that if one only assumes that the conditional distribution of the observed data Y, given the full data T, satisfies “coarsening at random” (CAR), then the tangent space at PF,G is saturated, that is, equals $L_0^2(P_{F,G})$. The tangent space generated by G(· | T) under the sole assumption CAR equals $T_{CAR} = \{v(Y) \in L_0^2(P_{F,G}) : E(v(Y) \mid T) = 0\}$. Therefore, the main idea of the proof below is to show that under the independent censoring model G(· | T) = G(·), the tangent space of the marginal distribution G equals $T_{CAR}$ at a PF,G satisfying conditions 1 and 2 of Lemma 2.2. The proof below will be an ingredient of the proofs of our two main theorems.

Proof of Lemma 2.2

Let $A : L_0^2(F) \to L_0^2(P_{F,G})$, $A(h)(Y) = E_F(h(T) \mid Y)$, be the score operator for F and let $A^\top : L_0^2(P_{F,G}) \to L_0^2(F)$, $A^\top(V)(T) = E_G(V(Y) \mid T)$, be its adjoint. The closure of the range of a Hilbert space operator equals the orthogonal complement of the null space of its adjoint; that is, $\overline{R(A)} = N(A^\top)^\perp$, where $\overline{R(A)}$ is the closure of the range of the score operator and $N(A^\top)$ is the null space of $A^\top$. Thus $L_0^2(P_{F,G}) = \overline{R(A)} \oplus N(A^\top)$.

The data generating distribution is indexed by two locally variation-independent parameters F and G, so that the tangent space at PF,G can be obtained as a sum of two tangent spaces, namely the tangent space for F, which is given by $\overline{R(A)}$, and the tangent space for G. For every $h \in L_0^2(G)$ with finite supremum norm, we have that $\varepsilon \to (1 + \varepsilon h)\,dG$ is a one-dimensional submodel through G at ε = 0. Thus the tangent space corresponding with submodels $\varepsilon \to P_{F, G_\varepsilon}$ equals $L_0^2(G)$. Thus we have that the tangent space is given by $\overline{R(A) + L_0^2(G)}$. We conclude that it suffices to show that $N(A^\top) = L_0^2(G)$.

We have

$$A^\top(V)(T) = \int_0^T V(c, 0)\,dG(c) + \int_T^\infty V(c, 1)\,dG(c).$$

Thus $\int V(c, \Delta(c))\,dG(c) = 0$ F-a.e. implies that

$$\int_0^T \{V(c, 0) - V(c, 1)\}\,g(c)\,dc = -\int_0^\infty V(c, 1)\,dG(c) \quad \text{for } T \in [0, \tau_F). \tag{2}$$

Differentiation w.r.t. T yields V(C, 0) = V(C, 1) on [0, τF) G-a.e. If τF < ∞ and c > τF, then c > T and thus V(c, Δ(c)) = V(c, 1). Thus V(C, 0) = V(C, 1) G-a.e., which proves $N(A^\top) = L_0^2(G)$. □

It is of interest to note that one can represent FT(t) as a monotonic regression of Δ on C, since FT(t) = E(Δ(C) | C = t). This suggests that one can estimate FT with the estimator Fn which minimizes $\sum_{i=1}^n (\Delta(C_i) - F_T(C_i))^2$ over all distribution functions FT. Fn can be computed using the pool-adjacent-violators algorithm [see Barlow, Bartholomew, Bremner and Brunk (1972)], which, in fact, yields the NPMLE.
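The isotonic least squares fit above can be sketched in a few lines. This is a minimal illustration of the pool-adjacent-violators algorithm under the stated least squares criterion (our own implementation and toy data; production code would use an existing isotonic regression routine):

```python
import numpy as np

def pava(y):
    """Isotonic (nondecreasing) least squares fit to y via pool-adjacent-violators."""
    y = np.asarray(y, dtype=float)
    blocks = []                      # each block holds [mean, size]
    for value in y:
        blocks.append([value, 1.0])
        # merge adjacent blocks while they violate monotonicity
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m1, w1 = blocks.pop()
            m0, w0 = blocks.pop()
            blocks.append([(m0 * w0 + m1 * w1) / (w0 + w1), w0 + w1])
    fit = []
    for mean, size in blocks:
        fit.extend([mean] * int(size))
    return np.array(fit)

def current_status_npmle(C, delta):
    """NPMLE F_n: isotonic regression of the indicators on the sorted C_i."""
    order = np.argsort(C)
    return np.sort(C), pava(np.asarray(delta)[order])

C     = [0.5, 1.5, 1.0, 2.0]
delta = [0,   1,   1,   1]
times, Fn = current_status_npmle(C, delta)
```

Sorting by C and pooling violating neighbors yields exactly the monotone minimizer, and hence the current status NPMLE at the monitoring times.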

2.2. Current status data on a counting process

Let the process of interest be a counting process $N(t) = \sum_{j=1}^k I(T_j \le t)$, $T_1 < \cdots < T_k$, where Tj is the time-variable at which an event occurs and where N jumps from value j − 1 to j. Let C be a monitoring time and consider the data structure Y = (C, N(C)). We observe n i.i.d. copies of Y. We only assume that C is independent of N.

The distribution of (C, N (C)) depends on the distribution of T⃗ only through the marginal distributions Fj of Tj, j = 1, …, k. To be precise, we have (denoting Si = 1 − Fi), for j ∈ {0, …, k},

$$P_{F,G}(dc, N(C) = j) = I(j = 0)\,S_1(c)\,dG(c) + I(j = k)\,F_k(c)\,dG(c) + I(j = 1)\{S_2(c) - S_1(c)\}\,dG(c) + \cdots + I(j = k-1)\{S_k(c) - S_{k-1}(c)\}\,dG(c).$$

Thus the distribution of Y = (C, N (C)) only identifies the marginal distributions of Tj, j = 1, …, k.
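The dependence on the marginals only can be made concrete numerically. The following hedged sketch (our own hypothetical example, with exponential marginals chosen so that S1 < S2 < S3) builds the conditional law of N(C) given C = c from the marginal survival functions alone and checks that it is a probability vector:

```python
import numpy as np

def pmf_N_given_c(c, S):
    """P(N(c) = j), j = 0, ..., k, from marginal survivals S = [S_1, ..., S_k]."""
    k = len(S)
    vals = [S[0](c)]                                       # j = 0: no event yet
    vals += [S[j](c) - S[j - 1](c) for j in range(1, k)]   # j = 1, ..., k-1
    vals.append(1.0 - S[k - 1](c))                         # j = k: final event passed
    return np.array(vals)

# hypothetical ordered marginals: T_j ~ Exp(rate_j) with decreasing rates
S = [lambda c, r=r: np.exp(-r * c) for r in (3.0, 2.0, 1.0)]
p = pmf_N_given_c(0.7, S)
```

The entries telescope, so the vector sums to one for any marginals with $S_1 \le \cdots \le S_k$; the joint dependence structure of (T1, …, Tk) never enters.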

The NPMLE does not exist in closed form and can only be computed with an iterative algorithm. For a given j, we can reduce the observation (C, N(C)) to simple current status data (C, Δj = I(Tj ≤ C)) on Tj, and estimate Fj with the RNPMLE. Under the conditions stated in Lemma 2.1, with F = Fj and G = G, this estimator provides regular and asymptotically linear estimators of smooth functionals of the type μj = ∫(1 − Fj)(u)r(u) du, for a given r, in the nonparametric model. The following theorem proves that, at a data generating distribution of Y satisfying a specified condition, any regular asymptotically linear estimator will provide asymptotically efficient estimators of smooth functionals of Fj. We decided to state a condition (3) which is easy to understand, but our proof shows that it can be weakened, for example, to allow the analogue of (3) for the case where all distributions G, F1, …, Fk are discrete with a finite number of support points; that is, the support points of Fj are contained in the support points of Fj+1, j = 1, …, k − 1, and G is discrete with support contained in the support of Fk.

Theorem 2.1

Let T1 < T2 < ···< Tk be time-variables corresponding to the chronological events of interest. Define the counting process with jumps of size 1 at these Tjs by

$$N(t) = \sum_{j=1}^k I(T_j \le t).$$

Let Y = (C, N (C)). Consider the following semiparametric model for Y: Let C ~ G be independent of T⃗~ F, but leave G and F unspecified. Then, the distribution of Y only depends on the multivariate distribution F of T⃗ = (T1, …, Tk) through the marginal distributions F1, …, Fk of T1, …, Tk.

Consider a data generating distribution PF,G in the model above, satisfying the following condition (3): for certain τ1 < ··· < τk < ∞, let Fj have Lebesgue density fj on [0, τj] with

$$f_j > 0 \text{ on } [0, \tau_j] \text{ and } f_j = 0 \text{ on } (\tau_j, \infty),\ j = 1, \ldots, k; \qquad F_j > F_{j+1} \text{ on } (0, \tau_j],\ j = 1, \ldots, k-1; \qquad G \text{ has Lebesgue density } g. \tag{3}$$

We allow that $p_j \equiv P(T_j = \infty) > 0$ for j = j0, …, k and some j0 ∈ {1, …, k}.

Then the tangent space at PF,G equals $L_0^2(P_{F,G})$ and is thus saturated.

This implies that any estimator of a real valued parameter of F that is a regular and asymptotically linear estimator at PF,G is also asymptotically efficient if PF,G satisfies (3). In particular, given j ∈ {1, …, k}, if PF,G satisfies (3), and Fj, G satisfy the conditions of Lemma 2.1 for the RNPMLE of $\mu_{F_j}$ based on (C, I(Tj ≤ C)) (thus with F = Fj and G = G), then the RNPMLE of $\mu_{F_j}$ is asymptotically efficient.

2.2.1. Heuristic understanding of the difference between NPMLE and RNPMLE

To understand the difference between the NPMLE and the RNPMLE, we consider the special case k = 2 in detail. In this case N can have three possible values:

$$N(C) = \begin{cases} 0, & \text{if } C < T_1, \\ 1, & \text{if } T_1 < C < T_2, \\ 2, & \text{if } C > T_2. \end{cases}$$

Let us assume that C has a Lebesgue density g. The likelihood of (C, N (C)) is given by

$$p_{F_1,F_2,G}(c, N(c) = j) = S_1(c)^{I(j=0)}\,(S_2 - S_1)(c)^{I(j=1)}\,F_2(c)^{I(j=2)}\,g(c).$$

We note that the density $p_{F_1,F_2,G}$ can be reparametrized as

$$p_{R,F_2,G}(c, N(c) = j) = R(c)^{I(j=0)}\,(1 - R(c))^{I(j=1)}\,S_2(c)^{I(j \in \{0,1\})}\,F_2(c)^{I(j=2)}\,g(c),$$

where R(t) ≡ S1(t)/S2(t). Thus, if we ignore the relation between F2 and R, then the NPMLE of F2 for the likelihood corresponding with $p_{R,F_2,G}$ would actually be equal to the reduced data NPMLE based on the reduced data (C, I(T2 ≤ C)). However, F2 and R are related, since S2R has to be a survival function. Therefore, it is not possible to determine the NPMLE by separate maximization w.r.t. F2 and R, which explains why the NPMLE and the RNPMLE of F2 differ.
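The algebra behind the reparametrization can be checked numerically. In this hedged sketch (hypothetical exponential marginals of our choosing, with S1 < S2), the original likelihood factor and the reparametrized factor with R = S1/S2 agree for every outcome j:

```python
import numpy as np

# hypothetical marginals: T1 ~ Exp(2), T2 ~ Exp(1), so S1(c) < S2(c) for c > 0
S1 = lambda c: np.exp(-2.0 * c)
S2 = lambda c: np.exp(-1.0 * c)
F2 = lambda c: 1.0 - S2(c)
R  = lambda c: S1(c) / S2(c)       # R = S1 / S2

def lik_original(c, j):
    # S1^{I(j=0)} (S2 - S1)^{I(j=1)} F2^{I(j=2)}
    return S1(c) ** (j == 0) * (S2(c) - S1(c)) ** (j == 1) * F2(c) ** (j == 2)

def lik_reparam(c, j):
    # R^{I(j=0)} (1 - R)^{I(j=1)} S2^{I(j in {0,1})} F2^{I(j=2)}
    return (R(c) ** (j == 0) * (1 - R(c)) ** (j == 1)
            * S2(c) ** (j in (0, 1)) * F2(c) ** (j == 2))
```

For j = 0 the identity is R·S2 = S1, and for j = 1 it is (1 − R)·S2 = S2 − S1; the factor g(c) is common to both parametrizations and is omitted.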

Theorem 2.1 shows that this relation between F2 and R is not informative for estimation of smooth functionals of F2 at a large class of data generating distributions, since the RNPMLE, which ignores this relation, is still asymptotically efficient for estimation of $\sqrt{n}$-estimable parameters. Our proof of Theorem 2.1 for k = 2 shows that the efficient score operator (for the definition of an efficient score operator, see the proof) of F2 equals the efficient score operator for F2 in the reduced data model based on (C, Δ2). This implies that, at (F1, F2) satisfying (3), the efficient influence curve for any smooth functional of F2 equals the influence curve of the RNPMLE as given in Lemma 2.1. Closer inspection of the proof for k = 2 also shows that if, for example, F2 is continuous while F1 is discrete on [0, τ1], or F2 is discrete with support not containing the support of a discrete F1, then the efficient score operator for F2 is not the same as the efficient score operator for F2 in the reduced data model, so that, in particular, the efficient influence curves (and information bounds) differ for the two models. Thus, at such (F1, F2), the RNPMLE of smooth functionals of F2 is inefficient.

Here, we provide a likelihood-based explanation of this fact. Let Rn be the NPMLE of R. The NPMLE of F2 maximizes the likelihood corresponding with $p_{R_n,F_2}$ over all F2 for which S2Rn is a survival function, while the RNPMLE maximizes the likelihood over all distributions F2. Suppose now that the model consists of discrete F1’s and continuous F2’s. This model, though smaller than the model with F1, F2 unspecified, has the same semiparametric efficiency bound at an (F1, F2) in this smaller model as the efficiency bound in the original model. This follows from the fact that the class of one-dimensional submodels needed to compute the tangent space can still be chosen the same. In this smaller model, an R = S1/S2 will be discrete at the support points of F1, and the shape of R between the support points equals the shape of 1/S2. As a consequence, since R determines the shape of F2 between the support points, knowing R in the smaller model helps enormously in estimating S2. In particular, for a given Rn, maximizing the likelihood corresponding with $p_{R_n,F_2}$ over F2 with S2Rn a survival function is very different from maximizing this likelihood over all possible distributions F2. This shows that the RNPMLE in the smaller model is inefficient at such (F1, F2). Since the efficiency bound in the smaller model is the same as in the original model, the RNPMLE is also inefficient at such (F1, F2) in the original model.

Proof of Theorem 2.1

We need to prove that assumption (3) implies that the tangent space at PF,G equals $L_0^2(P_{F,G})$, and is thus saturated. The data generating distribution PF,G is indexed by F and G, where the dependence on F is only through the marginals Fj, j = 1, …, k. Thus, the tangent space at PF,G can be obtained as a sum of two tangent spaces, namely the tangent space for F and the tangent space for G, where the latter equals $L_0^2(G)$. Let F, G be given and satisfy (3). We now claim that the tangent space for F is given by the closure of the sum of the k tangent spaces for Fj, calculated as if the Fj’s are variation-independent parameters, j = 1, …, k. We will show this now. Let $h_j \in L_0^2(F_j)$ have finite supremum norm, and let $F_{j,\varepsilon,h_j}$ be the one-dimensional perturbation $\varepsilon \to (1 + \varepsilon h_j)\,dF_j$ through Fj at ε = 0, j = 1, …, k. First, note that the support of $F_{j,\varepsilon,h_j}$ equals the support of Fj, j = 1, …, k. Since Fj > Fj+1 (strictly) on (0, τj], we have that, given an arbitrarily small δ1 > 0, there exists a neighborhood ε ∈ (−δ, δ) with $F_{j,\varepsilon,h_j} \ge F_{j+1,\varepsilon,h_{j+1}}$ on (δ1, τj] for all j = 1, …, k − 1. Thus, $P_{F_{j,\varepsilon,h_j},\,j=1,\ldots,k,\,G}$ satisfies the constraints Fj ≥ Fj+1, j = 1, …, k − 1, of our model except on an arbitrarily small neighborhood of 0. Thus, by modifying hj on an arbitrarily small neighborhood of 0, we can make $\varepsilon \to P_{F_{j,\varepsilon,h_j},\,j=1,\ldots,k,\,G}$ a true one-dimensional submodel. Since a tangent space for F is obtained as the closure in $L_0^2(F)$ of the linear span of scores of all possible one-dimensional submodels, it follows that the score of the unmodified $\varepsilon \to P_{F_{j,\varepsilon,h_j},\,j=1,\ldots,k,\,G}$ also belongs to the tangent space. This proves our claim.

Let j ∈ {1, …, k} be given. For a given $h_j \in L_0^2(F_j)$, we consider the one-dimensional submodel Fj,ε given by ε → (1 + εhj(t)) dFj(t), which goes through Fj at ε = 0. For notational convenience, define the random variable R = N(C) + 1 ∈ {1, …, k + 1}, and let F−j be the (k − 1)-dimensional vector of c.d.f.’s excluding Fj. This one-dimensional submodel Fj,ε implies a score for $P_{F_{j,\varepsilon}, F_{-j}, G}$ given by

$$A_1(h_1) = I(R = 1)\,\frac{\int_c^\infty h_1\,dF_1}{S_1(c)} - I(R = 2)\,\frac{\int_c^\infty h_1\,dF_1}{(S_2 - S_1)(c)} \quad \text{if } j = 1,$$
$$A_j(h_j) = I(R = j)\,\frac{\int_c^\infty h_j\,dF_j}{(S_j - S_{j-1})(c)} - I(R = j+1)\,\frac{\int_c^\infty h_j\,dF_j}{(S_{j+1} - S_j)(c)} \quad \text{if } j \in \{2, \ldots, k-1\},$$
$$A_k(h_k) = I(R = k)\,\frac{\int_c^\infty h_k\,dF_k}{(S_k - S_{k-1})(c)} - I(R = k+1)\,\frac{\int_c^\infty h_k\,dF_k}{F_k(c)} \quad \text{if } j = k.$$

If we define S0 ≡ 0 and Sk+1 ≡ 1, then, for j = 1, …, k,

$$A_j(h_j) = I(R = j)\,\frac{\int_c^\infty h_j\,dF_j}{(S_j - S_{j-1})(c)} - I(R = j+1)\,\frac{\int_c^\infty h_j\,dF_j}{(S_{j+1} - S_j)(c)},$$

where we use that S1 − S0 = S1 and Sk+1 − Sk = Fk. Here $A_j : L_0^2(F_j) \to L_0^2(P_{F,G})$ is called the score operator of Fj, j = 1, …, k. The tangent space for Fj is given by the closure of the range of Aj, denoted by $\overline{R(A_j)}$. Define $A_F : L_0^2(F_1) \times \cdots \times L_0^2(F_k) \to L_0^2(P_{F,G})$ by AF(h1, …, hk) = A1(h1) + ⋯ + Ak(hk). Then the tangent space for F equals $\overline{R(A_F)}$, so that the tangent space at PF,G is given by $\overline{R(A_F) + L_0^2(G)}$. Thus, to prove the theorem, it suffices to show that $\overline{R(A_F) + L_0^2(G)} = L_0^2(P_{F,G})$ at any F, G satisfying (3).

The remaining task is to understand the range of AF. We decompose AF as a sum of efficient score operators $A_j^*$, where $A_j^*$ is defined as Aj minus its projection onto the sum-space spanned by the ranges of the other score operators A1, …, Aj−1, Aj+1, …, Ak, j = 1, …, k. We will prove that the efficient score operator of Fj at a PF,G satisfying (3) equals $A_j^*(h_j) = E(h_j(T_j) \mid C, \Delta_j = I(T_j \le C))$, which is the score operator for the reduced current status data structure (C, Δj), j = 1, …, k. Since the information bounds for smooth functionals of Fj are, in both models, solely expressed in terms of the efficient score operator for Fj, the latter result proves that an efficient estimator of μj based on (C, Δj), j = 1, …, k, like the RNPMLE, is also efficient in the model for the more informative data structure (C, N(C)) [e.g., Bickel, Klaassen, Ritov and Wellner (1993)]. This proves that the RNPMLE actually yields efficient estimators. Subsequently, we show that this special structure of the efficient score operators implies that the tangent space at a PF,G satisfying (3) is saturated, proving the more general statement of Theorem 2.1.

Derivation of the efficient score operators of Fj

Since $E\bigl(A_l(h_l)(Y)\,A_m(h_m)(Y)\bigr) = 0$ if |l − m| ≥ 2, it will follow that the efficient score operators mainly involve projections of the type $\Pi\bigl(A_j \mid \overline{R(A_{j-1})}\bigr)$ and $\Pi\bigl(A_j \mid \overline{R(A_{j+1})}\bigr)$. Therefore we first obtain closed form expressions, in general, for these projection operators.

If the projection $\Pi\bigl(A_j(h_j) \mid \overline{R(A_{j-1})}\bigr)$ is actually an element of R(Aj−1), then this projection is given by (compare with the formula $X(X^\top X)^{-1}X^\top Y$ for the least squares estimator):

$$\Pi\bigl(A_j(h_j) \mid \overline{R(A_{j-1})}\bigr) = A_{j-1}\bigl(A_{j-1}^\top A_{j-1}\bigr)^{-} A_{j-1}^\top A_j(h_j), \tag{4}$$

where $A_{j-1}^\top : L_0^2(P_{F,G}) \to L_0^2(F_{j-1})$ is the adjoint of $A_{j-1} : L_0^2(F_{j-1}) \to L_0^2(P_{F,G})$, and $(A_{j-1}^\top A_{j-1})^{-}$ stands for the generalized inverse of $A_{j-1}^\top A_{j-1} : L_0^2(F_{j-1}) \to L_0^2(F_{j-1})$. Similarly,

$$\Pi\bigl(A_j(h_j) \mid \overline{R(A_{j+1})}\bigr) = A_{j+1}\bigl(A_{j+1}^\top A_{j+1}\bigr)^{-} A_{j+1}^\top A_j(h_j). \tag{5}$$

The adjoint $A_l^\top$ is defined by

$$\langle A_l(h_l), \eta\rangle_{P_{F,G}} = \langle h_l, A_l^\top(\eta)\rangle_{F_l} \quad \text{for all } h_l \in L_0^2(F_l) \text{ and } \eta \in L_0^2(P_{F,G}).$$

It is easily shown that for l ∈ {1, …, k},

$$A_l^\top(V)(T_l) = \int_0^{T_l} \{V(c, l) - V(c, l+1)\}\,dG(c),$$ where V(c, r) denotes the value of V at (C, R) = (c, r).

We have that

$$A_l^\top A_l(h_l)(T_l) = \int_0^{T_l} \varphi_l(c) \int_c^\infty h_l\,dF_l\,dG(c),$$

where

$$\varphi_1 = \frac{S_2}{S_1(S_2 - S_1)}, \qquad \varphi_l = \frac{S_{l+1} - S_{l-1}}{(S_{l+1} - S_l)(S_l - S_{l-1})},\ \ l = 2, \ldots, k-1, \qquad \varphi_k = \frac{F_{k-1}}{(S_k - S_{k-1})F_k},$$

or, in fact, with our convention of S0 = 0 and Sk+1 = 1,

$$\varphi_l = \frac{S_{l+1} - S_{l-1}}{(S_{l+1} - S_l)(S_l - S_{l-1})}, \quad l = 1, \ldots, k.$$

Here φl (t) ≡ 0 if Sl(t) = 0.

If pl = P (Tl = ∞) > 0, then we can write

$$A_l^\top A_l(h_l)(T_l) = \int_0^{\min(T_l, \tau_l)} \varphi_l(c) \int_c^\infty h_l\,dF_l\,dG(c) + I(T_l = \infty)\,h_l(\infty)\,p_l \int_{\tau_l}^\infty \varphi_l(c)\,dG(c).$$

Thus, given a K with K ≪ G, a solution (if it exists) of $A_l^\top A_l(h_l) = K$ has to satisfy, for G-a.e. c ∈ [0, τl],

$$\int_c^\infty h_l\,dF_l = \frac{dK}{dG}(c)\,\frac{1}{\varphi_l(c)}, \quad l = 1, \ldots, k, \tag{6}$$

and, if pl = P(Tl = ∞) > 0, then the equation $A_l^\top A_l(h_l)(\infty) = K(\infty)$ yields

$$h_l(\infty) = \frac{1}{p_l \int_0^\infty \varphi_l(c)\,dG(c)}\Bigl\{K(\infty) - \int_0^{\tau_l} \varphi_l(c) \int_c^{\tau_l} h_l\,dF_l\,dG(c)\Bigr\}. \tag{7}$$

Thus, even when pl > 0, (6) is the principal equation to solve (and will imply our conditions), since its solution hl on [0, τl] yields the complete solution $h_l(T_l) = h_l(T_l)I_{[0,\tau_l]}(T_l) + I(T_l = \infty)h_l(\infty)$. This two-step method for solving for hl in $A_l^\top A_l(h_l) = K$ first solves for $h_l I_{[0,\tau_l]}$ and then uses that, if pl > 0, hl(∞) is a function of $h_l I_{[0,\tau_l]}$.

We have, for l ∈ {1, …, k − 1},

$$A_l^\top A_{l+1}(h_{l+1})(T_l) = \int_0^{T_l} \bigl[A_{l+1}(h_{l+1})(c, l) - A_{l+1}(h_{l+1})(c, l+1)\bigr]\,dG(c) = -\int_0^{T_l} A_{l+1}(h_{l+1})(c, l+1)\,dG(c) = -\int_0^{T_l} \frac{1}{(S_{l+1} - S_l)(c)} \int_c^\infty h_{l+1}\,dF_{l+1}\,dG(c).$$

We note that this element is indeed absolutely continuous w.r.t. G. Similarly, it follows that, for l ∈ {1, …, k − 1},

$$A_{l+1}^\top A_l(h_l)(T_{l+1}) = -\int_0^{T_{l+1}} \frac{1}{(S_{l+1} - S_l)(c)} \int_c^\infty h_l\,dF_l\,dG(c).$$

Thus, $h_{j-1,j} \equiv (A_{j-1}^\top A_{j-1})^{-} A_{j-1}^\top A_j(h_j)$ is the h satisfying

$$\int_c^\infty h\,dF_{j-1} = -\frac{\int_c^\infty h_j\,dF_j}{(S_j - S_{j-1})(c)}\,\frac{1}{\varphi_{j-1}(c)} \quad \text{for } G\text{-a.e. } c \in [0, \tau_{j-1}] \tag{8}$$

and, if pj−1 > 0, then h(∞) is a simple function of $hI_{[0,\tau_{j-1}]}$ as given above. Similarly, $h_{j+1,j} \equiv (A_{j+1}^\top A_{j+1})^{-} A_{j+1}^\top A_j(h_j)$ is the h satisfying

$$\int_c^\infty h\,dF_{j+1} = -\frac{\int_c^\infty h_j\,dF_j}{(S_{j+1} - S_j)(c)}\,\frac{1}{\varphi_{j+1}(c)} \quad \text{for } G\text{-a.e. } c \in [0, \tau_{j+1}] \tag{9}$$

and, if pj+1 > 0, then h(∞) is a simple function of $hI_{[0,\tau_{j+1}]}$. If we can take a derivative of the right-hand sides in (8) and (9) w.r.t. Fj−1 and Fj+1, then equations (8) and (9) have a solution in h. This is possible if $F_j \ll F_l$ (i.e., Fj is absolutely continuous w.r.t. Fl) on [0, τl], l ∈ {j − 1, j + 1}, which holds under assumption (3), since we assumed that all Fj have positive Lebesgue density on [0, τj]. The efficient score operator $A_j^*$ also involves projections requiring existence of the solutions hl−1,l and hl+1,l for l different from j. Therefore, the assumed condition (3) includes (via an easy to understand condition) the necessary and sufficient conditions for the existence of hl−1,l and hl+1,l for all possible l, as needed below.

This gives the following closed form expressions for the projections (4) and (5), obtained by simply replacing $\int_c^\infty h\,dF_l$ in $A_l(h)$ by the expressions above. We have, for $j=1,\dots,k-1$,

$$\Pi\big(A_j(h_j)\mid\overline{R(A_{j+1})}\big)=A_{j+1}(h_{j+1,j})=-\frac{\int_c^\infty h_j\,dF_j}{(S_{j+1}-S_j)^2\varphi_{j+1}}I(R=j+1)+\frac{\int_c^\infty h_j\,dF_j}{(S_{j+2}-S_{j+1})(S_{j+1}-S_j)\varphi_{j+1}}I(R=j+2)\tag{10}$$

and, for $j=2,\dots,k$,

$$\Pi\big(A_j(h_j)\mid\overline{R(A_{j-1})}\big)=A_{j-1}(h_{j-1,j})=-\frac{\int_c^\infty h_j\,dF_j}{(S_j-S_{j-1})(S_{j-1}-S_{j-2})\varphi_{j-1}}I(R=j-1)+\frac{\int_c^\infty h_j\,dF_j}{(S_j-S_{j-1})^2\varphi_{j-1}}I(R=j).\tag{11}$$

For simplicity we derive the efficient score operators for the case k = 3. (The proof generalizes to the general case.) First, define

$$A_{jl}=A_j-\Pi\big(A_j\mid\overline{R(A_l)}\big).$$

The efficient score operators $A_j^*:L_0^2(F_j)\to L_0^2(P_F)$ are given by

$$\begin{aligned}
A_3^*&=A_3-\Pi\big(A_3\mid\overline{R(A_1)+R(A_2)}\big)=A_3-\Pi\big(A_3\mid\overline{R(A_{21})}\big),\\
A_2^*&=A_2-\Pi\big(A_2\mid\overline{R(A_1)+R(A_3)}\big)=A_2-\Pi\big(A_2\mid\overline{R(A_1)}\big)-\Pi\big(A_2\mid\overline{R(A_3)}\big),\\
A_1^*&=A_1-\Pi\big(A_1\mid\overline{R(A_2)+R(A_3)}\big)=A_1-\Pi\big(A_1\mid\overline{R(A_{23})}\big).
\end{aligned}$$

Calculation of A2*. Applying (10) and (11) with j = 2 gives us

$$\Pi\big(A_2(h_2)\mid\overline{R(A_1)}\big)=-\frac{\int_c^\infty h_2\,dF_2}{(S_2-S_1)(S_1-S_0)\varphi_1}I(R=1)+\frac{\int_c^\infty h_2\,dF_2}{(S_2-S_1)^2\varphi_1}I(R=2)$$

and

$$\Pi\big(A_2(h_2)\mid\overline{R(A_3)}\big)=-\frac{\int_c^\infty h_2\,dF_2}{(S_3-S_2)^2\varphi_3}I(R=3)+\frac{\int_c^\infty h_2\,dF_2}{(S_4-S_3)(S_3-S_2)\varphi_3}I(R=4).$$

Thus,

$$\begin{aligned}A_2^*(h_2)&=\frac{\int_c^\infty h_2\,dF_2}{(S_2-S_1)(S_1-S_0)\varphi_1}I(R=1)+\left\{\frac{1}{S_2-S_1}-\frac{1}{(S_2-S_1)^2\varphi_1}\right\}\int_c^\infty h_2\,dF_2\,I(R=2)\\&\quad+\left\{\frac{1}{(S_3-S_2)^2\varphi_3}-\frac{1}{S_3-S_2}\right\}\int_c^\infty h_2\,dF_2\,I(R=3)-\frac{\int_c^\infty h_2\,dF_2}{(S_4-S_3)(S_3-S_2)\varphi_3}I(R=4).\end{aligned}$$

Now, notice that

$$(S_2-S_1)S_1\varphi_1=S_2,\qquad (S_4-S_3)(S_3-S_2)\varphi_3=S_4-S_2=F_2,$$
$$\frac{1}{(S_3-S_2)^2\varphi_3}-\frac{1}{S_3-S_2}=-\frac{1}{F_2},\qquad \frac{1}{S_2-S_1}-\frac{1}{(S_2-S_1)^2\varphi_1}=\frac{1}{S_2}.$$

Thus (using $\int_0^\infty h_2\,dF_2=0$),

$$A_2^*(h_2)=\frac{\int_c^\infty h_2\,dF_2}{S_2(c)}I(R\in\{1,2\})+\frac{\int_0^c h_2\,dF_2}{F_2(c)}I(R\in\{3,4\}).$$
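The identities used above follow by direct substitution. Recalling our reading of the definition of $\varphi_l$ from Section 2 (an assumption here, since the definition appears earlier in the paper), namely $\varphi_l=(S_{l+1}-S_{l-1})/\{(S_{l+1}-S_l)(S_l-S_{l-1})\}$ with $S_0\equiv 0$ and $S_4\equiv 1$, the first two read:

```latex
(S_2-S_1)S_1\varphi_1
  =(S_2-S_1)S_1\,\frac{S_2-S_0}{(S_2-S_1)(S_1-S_0)}=S_2,
\qquad
(S_4-S_3)(S_3-S_2)\varphi_3
  =\frac{(S_4-S_3)(S_3-S_2)(S_4-S_2)}{(S_4-S_3)(S_3-S_2)}=S_4-S_2=1-S_2=F_2 .
```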

Calculation of $A_1^*$. Formula (10) with $j=2$ gives us

$$\Pi\big(A_2(h_2)\mid\overline{R(A_3)}\big)=-\frac{\int_c^\infty h_2\,dF_2}{(S_3-S_2)^2\varphi_3(c)}I(R=3)+\frac{\int_c^\infty h_2\,dF_2}{(S_4-S_3)(S_3-S_2)\varphi_3(c)}I(R=4).$$

Thus,

$$\begin{aligned}A_{23}(h_2)&=A_2(h_2)-\Pi\big(A_2(h_2)\mid\overline{R(A_3)}\big)\\&=\frac{\int_c^\infty h_2\,dF_2}{(S_2-S_1)(c)}I(R=2)+\left\{\frac{1}{(S_3-S_2)^2\varphi_3(c)}-\frac{1}{(S_3-S_2)(c)}\right\}\int_c^\infty h_2\,dF_2\,I(R=3)-\frac{\int_c^\infty h_2\,dF_2}{(S_4-S_3)(S_3-S_2)\varphi_3(c)}I(R=4).\end{aligned}$$

We now note that

$$(S_4-S_3)(S_3-S_2)\varphi_3=F_2,\qquad \frac{1}{(S_3-S_2)^2\varphi_3(c)}-\frac{1}{(S_3-S_2)(c)}=-\frac{1}{F_2(c)}.$$

Thus,

$$A_{23}(h_2)=\frac{\int_c^\infty h_2\,dF_2}{(S_2-S_1)(c)}I(R=2)-\frac{\int_c^\infty h_2\,dF_2}{F_2(c)}I(R\in\{3,4\}).$$

It is easily verified that the adjoint $A_{23}^\top:L_0^2(P_F)\to L_0^2(F_2)$ is given by

$$A_{23}^\top(V)=\int_0^{T_2}\left\{V(c,2)-\frac{(S_3-S_2)(c)}{F_2(c)}V(c,3)-\frac{F_3(c)}{F_2(c)}V(c,4)\right\}dG(c).$$

Subsequently, we can now verify that

$$A_{23}^\top A_{23}(h_2)=\int_0^{T_2}\varphi_{23}(c)\int_c^\infty h_2\,dF_2\,dG(c),$$

where

$$\varphi_{23}\equiv\frac{F_1}{F_2(S_2-S_1)}.$$

We need to find $h_{23,1}\equiv(A_{23}^\top A_{23})^{-1}(K)$ with

$$K=A_{23}^\top A_1(h_1)=-\int_0^{T_2}\frac{\int_c^\infty h_1\,dF_1}{(S_2-S_1)(c)}\,dG(c).$$

This solution has to satisfy, on $[0,\tau_2]$,

$$\int_0^c h_{23,1}\,dF_2=-\frac{dK}{dG}(c)\,\frac{1}{\varphi_{23}(c)}=\frac{F_2}{F_1}(c)\int_c^\infty h_1\,dF_1$$

and, as shown previously, $h_{23,1}(\infty)$ is a simple function of $h_{23,1}I_{[0,\tau_2]}$. We note that $h_{23,1}$ exists under the assumption $F_j\sim F_k$ (i.e., $F_j\ll F_k$ and $F_k\ll F_j$) on $[0,\tau_j]$, $j=1,\dots,k-1$, which follows from (3). We conclude that

$$\Pi\big(A_1(h_1)\mid\overline{R(A_{23})}\big)=A_{23}(h_{23,1})=-\frac{F_2}{F_1(S_2-S_1)}(c)\int_c^\infty h_1\,dF_1\,I(R=2)+\frac{\int_c^\infty h_1\,dF_1}{F_1(c)}I(R\in\{3,4\}).$$

Using $F_2/(F_1(S_2-S_1))-1/(S_2-S_1)=-1/F_1$ and $-\int_c^\infty h_1\,dF_1=\int_0^c h_1\,dF_1$ yields

$$A_1^*(h_1)=A_1(h_1)-\Pi\big(A_1(h_1)\mid\overline{R(A_{23})}\big)=\frac{\int_c^\infty h_1\,dF_1}{S_1(c)}I(R=1)+\frac{\int_0^c h_1\,dF_1}{F_1(c)}I(R\in\{2,3,4\}).$$

Calculation of A3*. This calculation is very similar to the one above for A1* and is omitted. We have

$$A_3^*(h_3)=\frac{\int_0^c h_3\,dF_3}{F_3(c)}I(R=4)+\frac{\int_c^\infty h_3\,dF_3}{S_3(c)}I(R\in\{1,2,3\}).$$

Proving that the tangent space is saturated

Given the expressions for the efficient score operators derived above, we now prove that the tangent space at a $P_{F,G}$ satisfying (3) is saturated. Under our assumption (3), the tangent space equals $L_0^2(G)$ (scores generated by $G$) plus the closure of the range of $A^*:L_0^2(F_1)\times\cdots\times L_0^2(F_k)\to L_0^2(P_F)$ defined by

$$(h_1,\dots,h_k)\mapsto A_1^*(h_1)+\cdots+A_k^*(h_k),$$

where the marginal efficient score operators are given by $A_j^*(h_j)=E\big(h_j(T_j)\mid C,\Delta_j\big)$ with $\Delta_j=I(T_j\leq C)$, $j=1,\dots,k$. The closure of the range of a Hilbert space operator equals the orthogonal complement of the null space of its adjoint; that is, $\overline{R(A^*)}=N(A^{*\top})^{\perp}$. Thus we need to show that $N(A^{*\top})=L_0^2(G)$. The adjoint $A^{*\top}:L_0^2(P_F)\to L_0^2(F_1)\times\cdots\times L_0^2(F_k)$ is given by

$$A^{*\top}(V)=\big(A_1^{*\top}(V),\dots,A_k^{*\top}(V)\big),$$

where it is easily verified that the adjoint $A_j^{*\top}:L_0^2(P_F)\to L_0^2(F_j)$ of $A_j^*$ is given by

$$A_j^{*\top}(V)=E\big(E(V(C,R)\mid C,\Delta_j)\mid T_j\big).$$

Consider the operator $B_j:L_0^2(C,\Delta_j)\to L_0^2(F_j)$ given by $B_j(\eta)=E(\eta(C,\Delta_j)\mid T_j)$, where $L_0^2(C,\Delta_j)$ is the space of functions of $(C,\Delta_j)$ with finite variance and zero mean (both taken w.r.t. $P_{F,G}$). Using precisely the same proof as that of Lemma 2.2, it follows that, if $F_j$ has a Lebesgue density $f_j>0$ on $[0,\tau_j]$, then the null space $N(B_j)=L_0^2(G)$; that is, it consists of functions that do not depend on $\Delta_j$. Thus, under (3), $A_j^{*\top}(V)=0$ implies that $E(V(C,R)\mid C,\Delta_j)=E(V(C,R)\mid C)\equiv\varphi(C)$, $j=1,\dots,k$.

Setting $\Delta_1=0$ yields $\varphi(C)=E(V(C,R)\mid C,\Delta_1=0)=V(C,1)$. Now, we note that

$$P(R=m\mid \Delta_j=1,C=c)=I(m\geq j+1)\,\frac{P(R=m\mid c)}{F_j(c)},\qquad j=1,\dots,k,$$

where $P(R=m\mid c)=(S_m-S_{m-1})(c)$. Thus, $E(V(C,R)\mid C=c,\Delta_j=1)$ is given by

$$\sum_{m\geq j+1}V(c,m)\,\frac{(S_m-S_{m-1})(c)}{F_j(c)}=\varphi(c),\qquad j=1,\dots,k.$$

For $j=k$, this equality gives $V(c,k+1)=\varphi(c)$. For $j=k-1$, this equality then gives

$$V(c,k)\,\frac{(S_k-S_{k-1})(c)}{F_{k-1}(c)}=\left(1-\frac{F_k(c)}{F_{k-1}(c)}\right)\varphi(c)=\frac{(S_k-S_{k-1})(c)}{F_{k-1}(c)}\,\varphi(c),$$

so that $V(c,k)=\varphi(c)$. Continuing in this manner, we subsequently find $\varphi(c)=V(c,k+1)=V(c,k)=\cdots=V(c,2)$. This shows that $V(C,R)$ does not depend on $R$. This completes the proof.

3. Current status data on a counting process when final event is right censored

The following theorem shows that any regular asymptotically linear estimator is asymptotically efficient at every data generating distribution in a specified rich submodel.

Theorem 3.1

Let $N(t)$ be a counting process $N(t)=\sum_{j=1}^k I(T_j\leq t)$ for random variables $T_1<\cdots<T_k$. Let $C$ be a random censoring time. For every subject we observe the following data structure:

$$Y=\big(\tilde T=T_k\wedge C,\;\Delta=I(T_k\leq C),\;N(\tilde T)\big).$$

We assume that C is independent of (T1, …, Tk). The distribution of Y only depends on the multivariate distribution F of (T1, …, Tk) through the marginal distributions F1, …, Fk of (T1, …, Tk).

Consider a data generating distribution $P_{F,G}$ in the model above satisfying the following condition (12): for certain $\tau_1<\cdots<\tau_k<\infty$, let $F_j$ have Lebesgue density $f_j$ on $[0,\tau_j]$ with

$$\begin{aligned}&f_j>0\ \text{on}\ [0,\tau_j]\ \text{and}\ f_j=0\ \text{on}\ (\tau_j,\infty),\quad j=1,\dots,k,\\&F_j>F_{j+1}\ \text{on}\ (0,\tau_j],\quad j=1,\dots,k-1,\\&G\ \text{has Lebesgue density}\ g.\end{aligned}\tag{12}$$

We allow that $p_j\equiv P(T_j=\infty)>0$ for $j=j_0,\dots,k$, for some $j_0\in\{1,\dots,k\}$.

Then the tangent space at $P_{F,G}$ equals $L_0^2(P_{F,G})$ and is thus saturated. This implies that an estimator of a real-valued parameter of the distribution $F$ which is regular and asymptotically linear at $P_{F,G}$ is also asymptotically efficient if $P_{F,G}$ satisfies (12). In particular, if $\bar G(t)>0$ and $F$, $G$ satisfy (12), then the Kaplan–Meier estimator $S_{k,KM}(t)$ of $S_k(t)=P(T_k>t)$, based on the i.i.d. data $(\tilde T,\Delta)$, is asymptotically efficient.
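To make the observed data structure concrete, the reduction from the latent times $(T_1,\dots,T_k)$ and the censoring time $C$ to $Y$ can be sketched as follows (a minimal illustration; the function name `observe` is ours, and ties are resolved by the convention $\Delta=I(T_k\leq C)$):

```python
def observe(t_events, c):
    """Reduce ordered latent event times t_1 < ... < t_k and a censoring
    time c to the observed data Y = (T~ = T_k ^ C, Delta = I(T_k <= C), N(T~))."""
    t_k = t_events[-1]
    t_obs = min(t_k, c)                # T~ = T_k ^ C
    delta = int(t_k <= c)              # 1 if the final event is observed
    n_at_t_obs = sum(1 for t in t_events if t <= t_obs)  # N(T~)
    return t_obs, delta, n_at_t_obs
```

For example, with latent times $(1,2,3)$ and censoring at $2.5$ one observes a censored record carrying the marker information that two of the three events have already occurred.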

3.1. Regular and asymptotically linear estimators

The important implication of Theorem 3.1 is that, if we can construct an estimator of a $\sqrt{n}$-estimable parameter of $F_j$ which is regular and asymptotically linear, then this estimator will be asymptotically efficient at any $F$ satisfying (12), $j=1,\dots,k$. In this subsection, we provide relatively simple regular and asymptotically linear estimators.

First, consider estimation of $S_k(t)=P(T_k>t)$. It is well known that $S_{k,KM}(t)$ is a regular asymptotically linear estimator of $S_k(t)$ whenever $\bar G(t)>0$. Second, consider estimation of $S_j(t)=P(T_j>t)$, $j=1,\dots,k-1$. Let $\Delta_j\equiv I(T_j\leq C)$. Under independent censoring (we can weaken this to noninformative censoring of $T_k$), we have

$$E\big(1-\Delta_j\mid C=c,\,T_k>c\big)=\frac{S_j(c)}{S_k(c)}\equiv R_j(c).\tag{13}$$

So

$$S_j(c)=S_k(c)\,E\big(1-\Delta_j\mid C=c,\,T_k>c\big)=E\big(S_k(c)(1-\Delta_j)\mid C=c,\,T_k>c\big).\tag{14}$$

In other words, estimating $S_j$ can be viewed as estimating a monotone regression of $S_k(C)(1-\Delta_j)$ on the observed $C$'s. This suggests replacing $S_k$ by the efficient Kaplan–Meier estimator $S_{k,KM}$ and minimizing

$$\frac{1}{n}\sum_{i=1}^n w_i\big\{S_{k,KM}(C_i)(1-\Delta_{j,i})-S_j(C_i)\big\}^2 I(C_i<T_{k,i})\tag{15}$$

over the vector $(S_j(C_i):i=1,\dots,n)$, under the constraint that $S_j$ is monotone, where $w_i$, $i=1,\dots,n$, is a given set of weights, possibly assigning more mass to observations with smaller variance. The solution $S_{j,n}$ of this problem can be obtained with the pool-adjacent-violators algorithm (PAVA) [see, e.g., Barlow, Bartholomew, Bremner and Brunk (1972)].
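The weighted least-squares problem (15) under the monotonicity constraint can be solved by a standard pool-adjacent-violators pass, sketched below (a minimal implementation of our own, assuming the responses are already sorted by $C_i$; since $S_j$ is nonincreasing in $c$, the fit is constrained to be nonincreasing):

```python
def pava_decreasing(y, w):
    """Weighted least-squares fit of a nonincreasing sequence to y
    by pool-adjacent-violators; w are positive weights."""
    blocks = []  # each block: [weighted mean, total weight, block length]
    for yi, wi in zip(y, w):
        blocks.append([yi, wi, 1])
        # pool while the nonincreasing constraint is violated
        while len(blocks) > 1 and blocks[-2][0] < blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            blocks.append([(w1 * m1 + w2 * m2) / (w1 + w2), w1 + w2, n1 + n2])
    fit = []
    for m, _, n in blocks:
        fit.extend([m] * n)
    return fit
```

Applied to the responses $S_{k,KM}(C_i)(1-\Delta_{j,i})$ with the weights $w_i$, the returned vector is the minimizer $(S_{j,n}(C_i))_i$ of (15).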

A simple calculation shows that

$$\mathrm{VAR}\big\{S_k(C)(1-\Delta_j)\mid C=c,\,T_k>c\big\}=S_k(c)^2\,\mathrm{VAR}\big\{1-\Delta_j\mid C=c,\,T_k>c\big\}=S_k^2(c)R_j(c)\{1-R_j(c)\}.\tag{16}$$

Since $R_j$ is not identified from the data at a better rate than $S_j$, a good set of weights is $w_i=1/S_{k,KM}^2(C_i)$, $i=1,\dots,n$ [see van der Laan, Jewell and Peterson (1997)].

It is beyond the scope of this paper to prove that smooth functionals of $S_{j,n}$ are regular and asymptotically linear. Since it is straightforward to prove such a theorem for a standard histogram regression estimator of the regression of $S_k(C)(1-\Delta_j)$ on the observed $C$'s, one expects that the more sophisticated isotonic regression estimator $S_{j,n}$ (which differs only in that it selects its bins adaptively) is regular and asymptotically linear under the same conditions. We note that the choice of weights $w_i$, $i=1,\dots,n$, has no effect on the limit distribution of smooth functionals of $S_{j,n}$.
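For completeness, the Kaplan–Meier input $S_{k,KM}$ that the isotonic step plugs in can be sketched as a textbook product-limit computation from the pairs $(\tilde T_i,\Delta_i)$ (a minimal illustration; the function name is ours):

```python
def kaplan_meier(t_obs, delta, t):
    """Product-limit estimate of S_k(t) = P(T_k > t) from right-censored
    pairs (t_obs_i, delta_i), where delta_i = 1 marks an observed final event."""
    # distinct observed event times up to and including t
    event_times = sorted({u for u, d in zip(t_obs, delta) if d == 1 and u <= t})
    s = 1.0
    for u in event_times:
        at_risk = sum(1 for v in t_obs if v >= u)                           # n(u)
        events = sum(1 for v, d in zip(t_obs, delta) if v == u and d == 1)  # d(u)
        s *= 1.0 - events / at_risk
    return s
```

Evaluating this at the censored $C_i$'s gives the responses and weights $w_i=1/S_{k,KM}^2(C_i)$ entering (15).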

3.2. Proof of Theorem 3.1

In the first part of the proof we establish that, if condition (12) holds, then the efficient score operator of $F_k$ equals the efficient score operator of $F_k$ in the reduced data model for $(\tilde T_k,\Delta_k)$, thereby establishing efficiency of the Kaplan–Meier estimator $S_{k,KM}(t)$. Subsequently, exploiting this special form of the efficient score operator of $F_k$, we prove saturation of the tangent space and thus Theorem 3.1.

Consider the data structure $(\tilde T_k=T_k\wedge C,\,N(\tilde T_k))$, where $N(t)=\sum_{j=1}^k I(T_j\leq t)$ and $T_1<T_2<\cdots<T_k$ are ordered random variables. Let $R=N(\tilde T_k)+1$. The density of the data is given by

$$P(d\tilde t,\,R=m)=\prod_{l=1}^{k}\big[(S_l-S_{l-1})(\tilde t)\big]^{I(m=l)}\,dF_k(\tilde t)^{I(m=k+1)}\,dG(\tilde t)^{I(m\leq k)}\,\bar G(\tilde t)^{I(m=k+1)},$$

where $S_0\equiv 0$ and $S_{k+1}\equiv 1$. We refer to the beginning of the proof of Theorem 2.1 for the argument that the tangent space at a $P_{F,G}$ satisfying condition (12) is the closure of the sum of the tangent spaces generated by $F_j$, $j=1,\dots,k$, and the tangent space of $G$, treating the $F_j$ as locally variation-independent. The score operators $A_j:L_0^2(F_j)\to L_0^2(P_{F,G})$ for $F_j$, $j=1,\dots,k-1$, are given by

$$A_j(h_j)=\frac{\int_c^\infty h_j\,dF_j}{(S_j-S_{j-1})(c)}I(R=j)-\frac{\int_c^\infty h_j\,dF_j}{(S_{j+1}-S_j)(c)}I(R=j+1)$$

and

$$A_k(h_k)=h_k(T_k)I(R=k+1)+\frac{\int_c^\infty h_k\,dF_k}{(S_k-S_{k-1})(c)}I(R=k).$$
Derivation of efficient score operator of Fk

We first determine the efficient score operator for Fk. For notational convenience, we consider the case k = 3. We have

$$A_3^*(h_3)=A_3(h_3)-\Pi\big(A_3(h_3)\mid\overline{R(A_{21})}\big),$$

where

$$A_{21}=A_2-\Pi\big(A_2\mid\overline{R(A_1)}\big).$$

Applying formula (11) gives

$$\Pi\big(A_2(h_2)\mid\overline{R(A_1)}\big)=-\frac{\int_c^\infty h_2\,dF_2}{S_2(c)}I(R=1)+\frac{\int_c^\infty h_2\,dF_2}{(S_2-S_1)^2\varphi_1(c)}I(R=2),$$

where we need to assume that $F_2\ll F_1$ on $[0,\tau_1]$. Thus, an easy calculation shows that

$$A_{21}(h_2)=\frac{\int_c^\infty h_2\,dF_2}{S_2(c)}I(R\in\{1,2\})-\frac{\int_c^\infty h_2\,dF_2}{(S_3-S_2)(c)}I(R=3).$$

Another straightforward calculation shows that the adjoint $A_{21}^\top:L_0^2(P_{F,G})\to L_0^2(F_2)$ of $A_{21}:L_0^2(F_2)\to L_0^2(P_{F,G})$ is given by

$$A_{21}^\top(V)=\int_0^{T_2}\left\{\frac{S_1}{S_2}(c)\,V(c,1)+\frac{S_2-S_1}{S_2}(c)\,V(c,2)-V(c,3)\right\}dG(c).$$

A straightforward calculation now shows that

$$A_{21}^\top A_{21}(h_2)=\int_0^{T_2}\int_c^\infty h_2\,dF_2\,\frac{S_3}{S_2(S_3-S_2)}(c)\,dG(c).$$

We also have

$$A_{21}^\top A_3(h_3)=-\int_0^{T_2}\frac{\int_c^\infty h_3\,dF_3}{(S_3-S_2)(c)}\,dG(c).$$

This shows that $h_{21,3}\equiv(A_{21}^\top A_{21})^{-1}A_{21}^\top A_3(h_3)$ satisfies, on $[0,\tau_2]$,

$$\int_0^c h_{21,3}\,dF_2=\frac{S_2}{S_3}(c)\int_c^\infty h_3\,dF_3,$$

and, if $p_2=P(T_2=\infty)>0$, then $h_{21,3}(\infty)$ is a simple function of $h_{21,3}I_{[0,\tau_2]}$, as shown above (7). Here we need to assume that this equation can be solved for $h_{21,3}$. This is true if $F_3\ll F_2$ on $[0,\tau_2]$. Then

$$\Pi\big(A_3(h_3)\mid\overline{R(A_{21})}\big)=A_{21}(h_{21,3})=-\frac{\int_c^\infty h_3\,dF_3}{S_3(c)}I(R\in\{1,2\})+\frac{S_2(c)}{S_3(S_3-S_2)(c)}\int_c^\infty h_3\,dF_3\,I(R=3).$$

This proves that

$$\begin{aligned}A_3^*(h_3)&=h_3(T_3)I(R=4)+\left\{\frac{1}{S_3-S_2}-\frac{S_2}{(S_3-S_2)S_3}\right\}(c)\int_c^\infty h_3\,dF_3\,I(R=3)+\frac{\int_c^\infty h_3\,dF_3}{S_3(c)}I(R\in\{1,2\})\\&=h_3(T_3)I(R=4)+\frac{\int_c^\infty h_3\,dF_3}{S_3(c)}I(R\in\{1,2,3\}).\end{aligned}$$

Thus, we have proved that, if $F_k\ll F_j$ on $[0,\tau_j]$, $j=1,\dots,k-1$, then the efficient score operator is $A_k^*(h_k)=E\big(h_k(T_k)\mid \tilde T_k,\Delta_k\big)$. The latter condition holds, in particular, if (12) holds. This proves the statement of Theorem 3.1 regarding efficiency of the Kaplan–Meier estimator $S_{k,KM}$.

Saturated tangent space result

Note that, for a random variable $Y$, we define $L_0^2(Y)=\{h(Y):Eh^2(Y)<\infty,\,Eh(Y)=0\}$. For simplicity, we prove saturation for $k=3$. Let $A:L_0^2(F_1)\times L_0^2(F_2)\to L_0^2(P_{F,G})$ be defined by $A(h_1,h_2)=A_1(h_1)+A_2(h_2)$. Then the tangent space of $F$ is given by $\overline{R(A_1)+R(A_2)+R(A_3)}=\overline{R(A_1)+R(A_2)}\oplus\overline{R(A_3^*)}$. Thus, the tangent space at $P_{F,G}$ is given by $\overline{R(A)}\oplus\overline{R(A_3^*)}\oplus\overline{R(B)}$, where $B:L_0^2(G)\to L_0^2(\tilde T_3,\Delta_3)$ is the score operator for the censoring mechanism $G$, given by $B(h)=E(h(C)\mid\tilde T_3,\Delta_3)$. By factorization of the likelihood into $F$ and $G$ parts, we have that $R(B)$ is orthogonal to the $F$-scores. It is well known that $\overline{R(A_3^*)+R(B)}=L_0^2(\tilde T_3,\Delta_3)$. The latter result simply states that the tangent space in the nonparametric right-censored data model for $(\tilde T_3,\Delta_3)$, assuming only that $C$ is independent of $T$, is saturated [e.g., Bickel, Klaassen, Ritov and Wellner (1993)]. Thus, we need to prove that $\overline{R(A)}\oplus L_0^2(\tilde T_3,\Delta_3)=L_0^2(P_{F,G})$, which is equivalent to proving $N(A^\top)=L_0^2(\tilde T_3,\Delta_3)$, where $A^\top:L_0^2(P_{F,G})\to L_0^2(F_1)\times L_0^2(F_2)$ is the adjoint of $A$ and $N(A^\top)$ denotes its null space.

First, we decompose $A_1+\cdots+A_{k-1}$ into a sum of orthogonal operators (efficient score operators in the model with $F_k$ known). Let $A_1'=A_1-\Pi\big(A_1\mid\overline{R(A_2)}\big)$ and $A_2'=A_2-\Pi\big(A_2\mid\overline{R(A_1)}\big)$. By (4), it follows that

$$\begin{aligned}A_1'(h_1)&=\frac{\int_c^\infty h_1\,dF_1}{S_1(c)}I(R=1)-\frac{\int_c^\infty h_1\,dF_1}{(S_3-S_1)(c)}I(R\in\{2,3\}),\\A_2'(h_2)&=\frac{\int_c^\infty h_2\,dF_2}{S_2(c)}I(R\in\{1,2\})+\frac{\int_0^c h_2\,dF_2}{(S_3-S_2)(c)}I(R=3),\end{aligned}$$

where we again need the equivalence assumptions $F_j\sim F_{j+1}$ on $[0,\tau_j]$, $j=1,\dots,k-1$. A more compact manner of representing these operators $A_j':L_0^2(F_j)\to H(C,R)\equiv\{V(C,R)I(R<4)\in L_0^2(P_{F,G})\}$ is

$$A_j'(h_j)=E\big(h_j(T_j)\mid C,\Delta_j,T_3>C\big)I(T_3>C),\qquad j=1,2.\tag{17}$$

Consider the operator $A':L_0^2(F_1)\times L_0^2(F_2)\to H(C,R)$ defined by $A'(h_1,h_2)=A_1'(h_1)+A_2'(h_2)$. Proving $N(A^\top)=L_0^2(\tilde T_3,\Delta_3)$ is equivalent to proving $N(A'^\top)=L_0^2(\tilde T_3,\Delta_3)$, where $A'^\top$ is the adjoint of $A'$.

From the representation (17), the adjoint $A_j'^\top:H(C,R)\to L_0^2(F_j)$ is given by

$$A_j'^\top(V)=E\big(E\big(V(C,R)I(T_3>C)\mid C,\Delta_j,T_3>C\big)\mid T_j\big),\qquad j=1,2,$$

and thus $N(A'^\top)=N(A_1'^\top)\cap N(A_2'^\top)$.

Consider now a solution $V\,I(T_3>C)\in H(C,R)$ satisfying $A_j'^\top\big(V\,I(T_3>C)\big)=0$, $j=1,2$. In order to prove $V\in L_0^2(\tilde T_3,\Delta_3)$, it suffices to show $I(T_3>C)V=I(T_3>C)\varphi(C)$ for some $\varphi$. Using precisely the same proof as that of Lemma 2.2, it follows that, if $F_j$ has a Lebesgue density $f_j>0$ on $[0,\tau_j]$ and $G$ has a Lebesgue density, then, for any function $I(T_3>C)\eta(C,\Delta_j)$, $E\big(I(T_3>C)\eta(C,\Delta_j)\mid T_j\big)=0$ implies $\eta(C,1)=\eta(C,0)$. This proves that $E\big(V(C,R)I(T_3>C)\mid C,\Delta_j,T_3>C\big)=E\big(V(C,R)I(T_3>C)\mid C,T_3>C\big)\equiv I(T_3>C)\varphi(C)$ does not depend on $\Delta_j$, $j=1,2$.

Setting $\Delta_1=0$ yields $I(T_3>C)\varphi(C)=E\big(V(C,R)I(T_3>C)\mid C,\Delta_1=0,T_3>C\big)=V(C,1)I(T_3>C)$. Now, we note that

$$P(R=m\mid\Delta_j=1,C=c,T_3>c)=I(m\geq j+1,\,m<4)\,\frac{(S_m-S_{m-1})(c)}{(S_3-S_j)(c)},\qquad j=1,2.$$

Thus, E(V (C, R)I (T3 > C) | C, Δj = 1, T3 > C) is given by

$$I(T_3>C)\sum_{m\geq j+1,\,m<4}V(C,m)\,\frac{(S_m-S_{m-1})(C)}{(S_3-S_j)(C)}=I(T_3>C)\varphi(C),\qquad j=1,2.$$

For j = 2, this equality gives I (T3 > C)V (C, 3) = I (T3 > C)φ(C). For j = 1, this equality gives

$$I(T_3>C)\left\{V(C,2)\,\frac{(S_2-S_1)(C)}{(S_3-S_1)(C)}+\varphi(C)\,\frac{(S_3-S_2)(C)}{(S_3-S_1)(C)}\right\}=I(T_3>C)\varphi(C),$$

so that $I(T_3>C)V(C,2)=I(T_3>C)\varphi(C)$. We have shown $I(T_3>C)V(C,1)=I(T_3>C)V(C,2)=I(T_3>C)V(C,3)$, which proves that $V=I(T_3\leq C)V_1(T_3)+I(T_3>C)\varphi(C)$ for some functions $V_1$ and $\varphi$, and thus that $V\in L_0^2(\tilde T_3,\Delta_3)$. This completes the proof. □

Acknowledgments

The authors thank the referees and Associate Editor for their helpful comments.

Footnotes

1. Supported by a FIRST award (GM53722) from the National Institute of General Medical Sciences and the National Institutes of Health.

References

  1. Barlow RE, Bartholomew DJ, Bremner JM, Brunk HD. Statistical Inference under Order Restrictions. Wiley; New York: 1972.
  2. Bickel PJ, Klaassen CAJ, Ritov Y, Wellner JA. Efficient and Adaptive Estimation in Semiparametric Models. Johns Hopkins Univ. Press; 1993.
  3. Diamond ID, McDonald JW. The analysis of current status data. In: Trussell J, Hankinson R, Tilton J, editors. Demographic Applications of Event History Analysis. Oxford Univ. Press; 1992. pp. 231–252.
  4. Diamond ID, McDonald JW, Shah IH. Proportional hazards models for current status data: Application to the study of differentials in age at weaning in Pakistan. Demography. 1986;23:607–620.
  5. Dinse GE, Lagakos SW. Nonparametric estimation of lifetime and disease onset distributions from incomplete observations. Biometrics. 1982;38:921–932.
  6. Gill RD, van der Laan MJ, Robins JM. Coarsening at random: Characterizations, conjectures and counterexamples. In: Proc. First Seattle Symposium in Biostatistics. Lecture Notes in Statist. Vol. 123. Springer; New York: 1997. pp. 255–294.
  7. Groeneboom P. Special topics course 593C: Nonparametric estimation for inverse problems: algorithms and asymptotics. Technical Report 344, Dept. Statistics, Univ. Washington; 1998. (For related software see www.stat.washington.edu/jaw/RESEARCH/SOFTWARE/software.list.html.)
  8. Groeneboom P, Wellner JA. Information Bounds and Nonparametric Maximum Likelihood Estimation. Birkhäuser; Basel: 1992.
  9. Huang J, Wellner JA. Asymptotic normality of the NPMLE of linear functionals for interval censored data, case I. Statist Neerlandica. 1995;49:153–163.
  10. Jewell NP, Malani HM, Vittinghoff E. Nonparametric estimation for a form of doubly censored data with application to two problems in AIDS. J Amer Statist Assoc. 1994;89:7–18.
  11. Jewell NP, Shiboski SC. Statistical analysis of HIV infectivity based on partner studies. Biometrics. 1990;46:1133–1150.
  12. Jewell NP, van der Laan MJ. Generalizations of current status data with applications. Lifetime Data Analysis. 1995;1:101–109. doi: 10.1007/BF00985261.
  13. Jongbloed G. Three statistical inverse problems. Ph.D. dissertation, Delft Univ. Technology; 1995.
  14. Keiding N. Age-specific incidence and prevalence: A statistical perspective (with discussion). J Roy Statist Soc Ser A. 1991;154:371–412.
  15. Kodell RL, Shaw GW, Johnson AM. Nonparametric joint estimators for disease resistance and survival functions in survival/sacrifice experiments. Biometrics. 1982;38:43–58.
  16. Sun J, Kalbfleisch JD. The analysis of current status data on point processes. J Amer Statist Assoc. 1993;88:1449–1454.
  17. Turnbull BW, Mitchell TJ. Nonparametric estimation of the distribution of time to onset for specific diseases in survival/sacrifice experiments. Biometrics. 1984;40:41–50.
  18. van der Laan MJ, Jewell NP, Peterson DR. Efficient estimation of the lifetime and disease onset distribution. Biometrika. 1997;84:539–554.
