Abstract
We study nonparametric estimation with two types of data structures. In the first data structure n i.i.d. copies of (C, N(C)) are observed, where N is a finite state counting process jumping at time-variables of interest and C a random monitoring time. In the second data structure n i.i.d. copies of (C ∧ T, I (T ≤ C), N(C ∧ T)) are observed, where N is a counting process with a final jump at time T (e.g., death). This data structure includes observing right-censored data on T and a marker variable at the censoring time.
In these data structures, easy-to-compute estimators are available, namely (weighted) pool-adjacent-violators estimators for the marginal distributions of the unobservable time variables, and the Kaplan–Meier estimator for the time T till the final observable event. These estimators ignore seemingly important information in the data. In this paper we prove that, at many continuous data generating distributions, the ad hoc estimators yield asymptotically efficient estimators of √n-estimable parameters.
Key words and phrases: Asymptotically linear estimator, asymptotically efficient estimator, current status data, right-censored data, isotonic regression.
1. Introduction
In this paper we study nonparametric estimation with two types of data structures. First, we discuss these two data structures in detail. Subsequently, we provide an overview of the rest of the paper.
1.1. Current status data on a finite counting process
Consider a finite state counting process N(t) = I(T1 ≤ t) + ··· + I(Tk ≤ t), where Tj is the time-variable at which a specified event occurs and where N jumps from value j − 1 to j at time Tj. The number of jumps k is fixed and known. We allow that there is a positive probability that the counting process never reaches jump j0 for any particular j0 ∈ {1, …, k}; since T1 < ··· < Tk, this implies that there is also a positive probability that N never reaches jump j for j = j0, …, k: that is, we allow multivariate distributions of (T1, …, Tk) with P (Tj = ∞) > 0 for j = j0, …, k. In this manner we allow applications in which the number of jumps of N is random on {1, …, k}.
We consider the data structure (C, N (C)) for a single random monitoring time C. The only assumption is that C is independent of N: the cumulative distribution G of C, and the probability distribution F of N are unspecified. Note that the distribution of N, denoted by F, is not a cumulative distribution function, but a probability distribution that is identified by the multivariate cumulative distribution of (T1, …, Tk).
Such data structures occur in cross-sectional studies where each subject is monitored once. For example, in some carcinogenicity experiments, one can only determine a discretized occult tumor size at time t in a randomly sampled mouse, as measured by N (t), by sacrificing a mouse at time t. In this example, T1 might represent time till onset of the tumor and T2, …, Tk might correspond with times till increasing sizes of the tumor. Similarly, Tj might denote the age at which a child has mastered the j th skill among a set of k skills ordered in difficulty. We refer to Jewell and van der Laan (1995) for additional applications.
The distribution of (C, N (C)) depends on the distribution of T⃗ = (T1, …, Tk) only through the marginal distributions Fj of Tj, j = 1, …, k (see Section 2). In this problem, the NPMLE of the distribution of Tj requires an iterative algorithm. On the other hand, an ad hoc method for estimation of the distribution of Tj is directly available: reduce the observation (C, N (C)) to a standard current status observation (C, Δj = I (Tj ≤ C)) on Tj. Then one can estimate the distribution of Tj with the NPMLE based on the reduced current status observations, which we will refer to as the reduced data NPMLE (RNPMLE). This estimator provides regular and asymptotically linear estimators of pathwise differentiable functionals of Fj such as μj = ∫(1 − Fj)(u)r(u) du, for a given r, in the nonparametric model under certain conditions [Groeneboom and Wellner (1992)]. Previous work and examples of traditional current status data on a time variable T can be found in Diamond, McDonald and Shah (1986), Jewell and Shiboski (1990), Diamond and McDonald (1992), Keiding (1991) and Sun and Kalbfleisch (1993). In its nonparametric setting, the current status data structure is also known as case I interval censored data [Groeneboom and Wellner (1992)]. Current status data commonly arise in epidemiological investigations of the natural history of disease and in animal tumorigenicity experiments. Jewell, Malani and Vittinghoff (1994) give two examples that arise from studies of Human Immunodeficiency Virus (HIV) disease.
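For concreteness, the data reduction underlying the RNPMLE is a one-line operation on the observed data, using only the identity {Tj ≤ C} = {N(C) ≥ j}. The following Python sketch is purely illustrative (the variable names are ours, not from the literature):

```python
import numpy as np

def reduce_to_current_status(C, NC, j):
    """Reduce (C, N(C)) observations to current status data (C, Delta_j) on T_j,
    using that {T_j <= C} = {N(C) >= j}.

    C  : monitoring times
    NC : observed counts N(C), values in 0, ..., k
    j  : index of the jump time of interest (1 <= j <= k)
    """
    C = np.asarray(C, dtype=float)
    delta_j = (np.asarray(NC) >= j).astype(int)
    return C, delta_j

# Example with k = 3 and five monitored subjects.
C = [1.2, 0.7, 2.5, 1.9, 3.1]
NC = [0, 1, 3, 2, 1]
print(reduce_to_current_status(C, NC, j=2))  # Delta_2 = (0, 0, 1, 1, 0)
```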
Note that the RNPMLE of Fj ignores the value of N (C), beyond information on whether N (C) ≥ j or not. For example, if N (t) is tumor size in a carcinogenicity experiment, then the simple current status estimator of the distribution of time, T1, till onset of tumor would not distinguish between an observation (C, N (C)) with N (C) large and an observation (C, N (C)) with N (C) small but larger than 0, while the latter observation seems to suggest that onset occurred recently. Nonetheless, we establish that the RNPMLE yields efficient estimators of pathwise differentiable parameters at a large class of continuous data generating distributions of interest.
1.2. Current status data on a finite counting process when the final event is right censored
We also consider the data structure (T̃k ≡ C ∧ Tk, N (T̃k)) for a finite state counting process N(t) = I(T1 ≤ t) + ··· + I(Tk ≤ t), where Tk represents the final event (say death) which is right censored by the monitoring time C, and k is known. Note that this observation includes observing the failure indicator I (T̃k = Tk). For example, consider a carcinogenicity experiment with mice in which T1 is time till onset of colon tumor, T2 time to liver metastasis and T3 time to death from tumor, where we assume that colon tumors do not cause death except through liver failure secondary to metastasis. Here C is either a sacrificing time or time till death from any unrelated cause.
Consider another example concerning estimation of the survival function of the time T = J − I between time I at seroconversion and time J at death of a hemophiliac patient infected with HIV. For this purpose we observe n i.i.d. subjects in a fixed time-interval of 10 years. If we assume that the time I at seroconversion of the subject is observed (which is approximately true for hemophiliacs), then the subject’s survival time T is right censored by C ≡ 10 − I, where T will play the role of Tk. We define Tj as the time till a given monotone “surrogate” process Z(t) achieves the jth value among a set of k − 1 increasing values, j = 1, …, k − 1, where we assume that death T = Tk always and only occurs after the value Z(Tk−1) has been reached. Let N(t) = I(T1 ≤ t) + ··· + I(Tk ≤ t) be the counting process. Here Z(t) measures the progression of the disease of the subject t years after seroconversion; for example, Z(t) might be a measure of viral load of the subject t years after seroconversion, where it may be reasonable to assume that the viral load is a nondecreasing process in the absence of treatment.
Suppose that for every subject who did not die before the end of the study C one measures the “surrogate” Z(C) at time C only. In other words, we observe failure times only for subjects who fail before end of follow up and for every subject who is alive at end of follow up we also have a marker indicating future prognosis. Note that the observed data on a subject is given by (T̃ = T ∧ C, Z(T̃)). We only assume that C is independent of Z. A seemingly ad hoc estimator of S(t) = P (T > t) is the Kaplan–Meier estimator which simply ignores the marker information. In this example, a natural question is whether one can improve on the Kaplan–Meier estimator using the information in the surrogate process Z. In this paper we prove that the Kaplan–Meier estimator is asymptotically efficient at many continuous data generating distributions for which Fj have compact support.
A special case of this data structure has been treated in the literature. Consider a carcinogenicity experiment with k = 2 and N(t) = I(T1 ≤ t) + I(T2 ≤ t), where T1 is time till onset of tumor and T2 is time till death from tumor. Thus one observes (T̃2 ≡ C ∧ T2, N (T̃2)). This data structure has been considered in Kodell, Shaw and Johnson (1982), Dinse and Lagakos (1982), Turnbull and Mitchell (1984), van der Laan, Jewell and Peterson (1997), and recently Groeneboom (1998). The NPMLE for this data structure requires an iterative algorithm: Turnbull and Mitchell (1984) implemented the NPMLE via the EM-algorithm (using an initial distribution with point masses at each data point so that the EM-algorithm indeed converges to the NPMLE), while Groeneboom (1998) implements the NPMLE by maximizing the actual likelihood with a modern optimization algorithm. In this problem, an ad hoc estimator of the distribution of T2 is the Kaplan–Meier estimator based on the reduced data (T̃2, Δ2 = I (T̃2 = T2)). In Dinse and Lagakos (1982), the Kaplan–Meier estimator of F2 was proposed and it was suggested that the NPMLE might be more efficient than the Kaplan–Meier estimator. In van der Laan, Jewell and Peterson (1997) it is shown that the Kaplan–Meier estimator is efficient under a weak condition on (F1, F2). Moreover, an isotonic regression estimator of F1 was provided: note that estimation of F1 is complicated by the fact that for some subjects one only observes T2, and thus that T1 < T2, where T2 cannot be viewed as an independent monitoring time for T1. We note here that, in van der Laan, Jewell and Peterson (1997), a simulation study was carried out which incorrectly implements the NPMLE, so that finite sample comparisons between the Kaplan–Meier estimator and the NPMLE remain open to study [specifically, the derivation of the score equations in van der Laan, Jewell and Peterson (1997) for the NPMLE was not valid since the authors incorrectly assumed that the NPMLE F̂1 is strictly smaller than the NPMLE F̂2].
1.3. Organization and overview of results
In Section 2 we prove, for the data structure of Section 1.1, that if the Fj’s are continuous with Lebesgue density bounded away from zero on [0, τj] and zero elsewhere, and G is also continuous, then any estimator of a parameter that is regular and asymptotically linear at PF,G is also asymptotically efficient. The complexity of the NPMLE is discussed, including the fact that it is more efficient at many data generating distributions with singular pairs Fj1, Fj2 (for example, F1 discrete and F2 continuous).
In Section 3, we prove an analogous result for the nonparametric model with the data structure (C ∧ Tk, N (C ∧ Tk)). This shows that the Kaplan–Meier estimator of the distribution of Tk, based on the reduced data (T̃k, Δk ≡ I (Tk ≤ C)), is asymptotically efficient at many continuous data generating distributions, extending the result in van der Laan, Jewell and Peterson (1997) for the case k = 2. Moreover, simple isotonic regression estimators for the distributions Fj, j = 1, …, k − 1, are proposed that also yield asymptotically efficient estimators of smooth functionals by our general result.
2. Current status data on a counting process
2.1. Traditional current status data
Traditional current status data can be viewed as current status data on a simple counting process as follows. Let T be a univariate failure time of interest and define the process Δ(t) = I (T ≤ t) as the counting process with one single jump at point T. Let Y = (C, Δ(C)) represent current status data on Δ at a monitoring time C. We assume that C is independent of T [i.e., of Δ(·)]. The parameter of interest is the distribution F of T.
The properties of the NPMLE Fn of the distribution of T were established in Groeneboom and Wellner (1992). Here the NPMLE is defined as the maximum likelihood estimator over all discrete distributions with jumps at the monitoring times. Beyond proving a limit distribution result for Fn, these authors also established efficiency of smooth functionals of Fn with a closed form expression of the limit variance so that Wald-type confidence intervals are directly available. Huang and Wellner (1995) provide an alternative proof of asymptotic linearity of the NPMLE of smooth functionals of F under weak conditions.
We refer to Bickel, Klaassen, Ritov and Wellner (1993) for definitions of a regular, asymptotically linear and efficient estimator and influence curve of an estimator. The semiparametric information bound at PF,G is defined as the infimum of parametric information bounds over a specified class of parametric submodels. We choose as parametric one-dimensional submodels ε → PFε,h1,Gε,h2,
where dFε,h1(·) = (1 + εh1(·)) dF(·), dGε,h2(·) = (1 + εh2(·)) dG(·) and ε is the unknown parameter with parameter space [−δ, δ] for some small δ > 0. The tangent space at PF,G is now defined as the closure in L₀²(PF,G) of the linear span of all the scores of these one-dimensional submodels, where, for a given measure μ, we define L₀²(μ) as the Hilbert space of square-integrable functions with mean zero, endowed with inner product 〈h1, h2〉μ = ∫h1(y)h2(y) dμ(y). Thus the tangent space at PF,G is a sub-Hilbert space of L₀²(PF,G).
In this paper it is particularly important to realize that efficiency of an estimator is a local property in the sense that a regular estimator can be efficient at a particular PF,G and inefficient at another element of the model.
Lemma 2.1
Consider the nonparametric model for Y = (C, Δ(C)), where Δ(·) ≡ I (T ≤ ·), T has unspecified distribution F and C is independent of T with unspecified distribution G. We observe n i.i.d. observations of Y = (C, Δ(C)). Consider the parameter μ = ∫(1 − F)(u)r(u) du for a given function r and the estimator μn = ∫(1 − Fn)(u)r(u) du, where Fn is the NPMLE of F. We have that μn is regular and asymptotically linear at any (F, G) for which F is continuous with density fT > 0 on [0, M] and zero elsewhere (M < ∞), g(x) = dG/dx > 0 on [0, M], and r is bounded on [0, M].
The influence curve of μn is given by
(1) IC(C, Δ) = (r(C)/g(C))(F(C) − Δ).
The variance of IC is given by ∫[0,M] r(u)²F(u)(1 − F(u))/g(u) du.
This lemma is proved in Huang and Wellner (1995).
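Because Lemma 2.1 gives the limit variance in closed form (of the form ∫ r(u)²F(u)(1 − F(u))/g(u) du), a Wald-type confidence interval for μ can be formed by plugging in estimates. The sketch below is only an illustration under that assumption; it presumes estimates of F and g on a grid covering [0, M], a point estimate μn and the function r are already available, and all names are ours.

```python
import numpy as np

def wald_ci_mu(grid, F_hat, g_hat, r, n, mu_hat, z=1.96):
    """Wald-type confidence interval for mu = int_0^M (1 - F)(u) r(u) du,
    based on the plug-in limit variance int_0^M r(u)^2 F(u)(1 - F(u)) / g(u) du
    (trapezoidal approximation of the integral; z = 1.96 gives a 95% interval)."""
    grid = np.asarray(grid, dtype=float)
    integrand = r(grid) ** 2 * np.asarray(F_hat) * (1.0 - np.asarray(F_hat)) / np.asarray(g_hat)
    sigma2_hat = np.trapz(integrand, grid)
    half = z * np.sqrt(sigma2_hat / n)
    return mu_hat - half, mu_hat + half
```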
We can also prove the following tangent space result.
Lemma 2.2
Consider the nonparametric model for Y = (C, Δ(C)), where Δ(·) ≡I (T ≤ ·), T has unspecified distribution F and C is independent of T with unspecified distribution G. We observe n i.i.d. observations of Y = (C, Δ(C)). Suppose that:
(1) F has a Lebesgue density f with f > 0 on [0, τF) and, if τF < ∞ (τF = ∞ is allowed), then f = 0 on (τF, ∞), and
(2) G has a Lebesgue density g.
We allow F ({∞}) > 0. Then the tangent space at PF,G equals L₀²(PF,G). This implies that an estimator of a parameter μ (F) which is regular and asymptotically linear at PF,G is also asymptotically efficient if F, G satisfy (1) and (2).
In Gill, van der Laan and Robins (1997) it is proved that if one only assumes that the conditional distribution of the observed data Y, given the full data T, satisfies “coarsening at random” (CAR), then the tangent space at PF,G is saturated, that is, equals L₀²(PF,G). The tangent space generated by G(· | T) under the sole assumption CAR equals TCAR ≡ {V(Y) ∈ L₀²(PF,G): E(V(Y) | T) = 0}. Therefore, the main idea of the proof below is to show that under the independent censoring model G(· | T) = G(·), the tangent space of the marginal distribution G equals TCAR at a PF,G satisfying (1) and (2) of Lemma 2.2. The proof below will be an ingredient of the proofs of our two main theorems.
Proof of Lemma 2.2
Let AF : L₀²(F) → L₀²(PF,G), AF(h)(Y) = E(h(T) | Y), be the score operator for F and let AF⊤, AF⊤(V)(T) = E(V(Y) | T), be its adjoint. The closure of the range of a Hilbert space operator A equals the orthogonal complement of the null-space of its adjoint; that is, R̄(A) = N(A⊤)⊥, where R̄(A) is the closure of the range of A and N (A⊤) is the null space of A⊤. Thus R̄(AF)⊥ = N(AF⊤).
The data generating distribution is indexed by two locally variation-independent parameters F and G, so that the tangent space at PF,G can be obtained as a sum of two tangent spaces, namely the tangent space for F, which is given by R̄(AF), and the tangent space for G. For every h2 ∈ L₀²(G) with finite supremum norm, we have that ε →(1 + εh2) dG is a one-dimensional submodel through G at ε = 0. Thus the tangent space corresponding with submodels ε → PF,Gε equals TG ≡ {h2(C): h2 ∈ L₀²(G)}. Thus we have that the tangent space is given by R̄(AF) + TG. We conclude that it suffices to show that N(AF⊤) ⊂ TG.
We have AF⊤(V)(T) = E(V(C, Δ(C)) | T) = ∫ V(c, I(T ≤ c)) dG(c).
Thus ∫V (c, Δ(c)) dG(c) = 0 F-a.e. implies that
(2) ∫[0,t) V(c, 0) dG(c) + ∫[t,∞) V(c, 1) dG(c) = 0 for F-a.e. t.
Differentiation w.r.t. t yields V (C, 0) = V (C, 1) on [0, τF) G-a.e. If τF < ∞ and c > τF, then c > T and thus V (c, Δ(c)) = V (c, 1). Thus V (C, 0) = V (C, 1) G-a.e., which proves N(AF⊤) ⊂ TG. □
It is of interest to note that one can represent FT (t) as a monotonic regression of Δ on C since F (t) = E(Δ (C) | C = t). This suggests that one can estimate FT with the estimator Fn(t) which minimizes the sum of squares ∑i=1,…,n (Δi − FT(Ci))² over all distribution functions FT. Fn(t) can be computed using the pool-adjacent-violators algorithm [see Barlow, Bartholomew, Bremner and Brunk (1972)] which, in fact, yields the NPMLE.
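A minimal implementation sketch of this computation (Python, illustrative names): sort the observations by C and apply the pool-adjacent-violators algorithm to the indicators. For simplicity the sketch assumes distinct monitoring times; it returns the NPMLE evaluated at the ordered monitoring times.

```python
import numpy as np

def pava(y, w=None):
    """Weighted pool-adjacent-violators algorithm: returns the nondecreasing
    vector m minimizing sum_i w_i (y_i - m_i)^2."""
    y = np.asarray(y, dtype=float)
    w = np.ones_like(y) if w is None else np.asarray(w, dtype=float)
    vals, wts, sizes = [], [], []
    for yi, wi in zip(y, w):
        vals.append(yi); wts.append(wi); sizes.append(1)
        # pool adjacent blocks while they violate monotonicity
        while len(vals) > 1 and vals[-2] > vals[-1]:
            v2, w2, s2 = vals.pop(), wts.pop(), sizes.pop()
            v1, w1, s1 = vals.pop(), wts.pop(), sizes.pop()
            wt = w1 + w2
            vals.append((w1 * v1 + w2 * v2) / wt)
            wts.append(wt)
            sizes.append(s1 + s2)
    return np.repeat(vals, sizes)

def current_status_npmle(C, delta):
    """NPMLE of F at the observed monitoring times from current status data
    (C_i, Delta_i): the monotone regression of Delta on C (distinct C's assumed)."""
    C = np.asarray(C, dtype=float)
    delta = np.asarray(delta, dtype=float)
    order = np.argsort(C)
    return C[order], pava(delta[order])
```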
2.2. Current status data on a counting process
Let the process of interest be a counting process N(t) = I(T1 ≤ t) + ··· + I(Tk ≤ t), where Tj is the time-variable at which an event occurs and where N jumps from value j − 1 to j. Let C be a monitoring time and consider the data structure Y = (C, N (C)). We observe n i.i.d. copies of Y. We only assume that C is independent of N.
The distribution of (C, N (C)) depends on the distribution of T⃗ only through the marginal distributions Fj of Tj, j = 1, …, k. To be precise, we have (denoting Si = 1 − Fi), for j ∈ {0, …, k},
P(N(C) = j | C = c) = (Sj+1 − Sj)(c), where S0 ≡ 0 and Sk+1 ≡ 1.
Thus the distribution of Y = (C, N (C)) only identifies the marginal distributions of Tj, j = 1, …, k.
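The following simulation illustrates this point under assumed distributions (uniform marginals, k = 2): two different joint laws of (T1, T2) with T1 < T2 a.s. and the same marginals produce the same distribution of (C, N(C)), while a functional of the joint law that is not determined by the marginals differs.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Two joint laws of (T1, T2) with T1 < T2 a.s. and identical marginals:
# T1 ~ U(0, 1) and T2 ~ U(1, 2) in both cases.
T1 = rng.uniform(0.0, 1.0, n)
couplings = {
    "comonotone": T1 + 1.0,                    # T2 fully determined by T1
    "independent": rng.uniform(1.0, 2.0, n),   # T2 independent of T1
}
C = rng.uniform(0.0, 2.0, n)                   # monitoring time, independent of (T1, T2)

for label, T2 in couplings.items():
    NC = (T1 <= C).astype(int) + (T2 <= C).astype(int)   # N(C) for k = 2
    # Observed-data probabilities P(N(C) = j, C <= c): determined by the marginals,
    # so the two rows agree up to Monte Carlo error.
    obs = [np.mean((NC == j) & (C <= c)) for j in (0, 1, 2) for c in (0.75, 1.5)]
    # A full-data functional depending on the joint law: differs across couplings.
    full = np.mean(T2 - T1 <= 1.1)
    print(label, np.round(obs, 3), "P(T2 - T1 <= 1.1) =", round(float(full), 3))
```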
The NPMLE does not exist in closed form and can only be computed with an iterative algorithm. For a given j, we can reduce the observation (C, N (C)) to simple current status data (C, Δj = I (Tj ≤ C)) on Tj, and estimate Fj with the RNPMLE. Under the conditions stated in Lemma 2.1, with F = Fj and G = G, this estimator provides regular and asymptotically linear estimators of smooth functionals of the type μj = ∫(1 − FTj)(u)r(u) du, for a given r, in the nonparametric model. The following theorem proves that, at a data generating distribution of Y satisfying a specified condition, any regular asymptotically linear estimator will provide asymptotically efficient estimators of smooth functionals of FTj. We decided to state a condition (3) which is easy to understand, but our proof shows that this can be weakened, for example, to allow the analogue of (3) for the case where all distributions G, F1, …, Fk are discrete with a finite number of support points; that is, the support points of Fj are contained in the support points of Fj+1, j = 1, …, k − 1, and G is discrete with support contained in the support of Fk.
Theorem 2.1
Let T1 < T2 < ··· < Tk be time-variables corresponding to the chronological events of interest. Define the counting process with jumps of size 1 at these Tj’s by N(t) = I(T1 ≤ t) + ··· + I(Tk ≤ t).
Let Y = (C, N (C)). Consider the following semiparametric model for Y: Let C ~ G be independent of T⃗~ F, but leave G and F unspecified. Then, the distribution of Y only depends on the multivariate distribution F of T⃗ = (T1, …, Tk) through the marginal distributions F1, …, Fk of T1, …, Tk.
Consider a data generating distribution PF,G in the model above, satisfying the following condition (3): For certain τ1 < ··· < τk < ∞, let Fj have Lebesgue density fj on [0, τj] with
(3)
We allow that pj ≡ P (Tj = ∞) > 0 for j = j0, …, k and j0 ∈ {1, …, k}.
Then the tangent space at PF,G equals L₀²(PF,G) and is thus saturated.
This implies that any estimator of a real valued parameter of F that is a regular and asymptotically linear estimator at PF,G is also asymptotically efficient if PF,G satisfies (3). In particular, given j ∈ {1, …, k}, if PF,G satisfies (3), and Fj, G satisfy the conditions of Lemma 2.1 for the RNPMLE of μFj based on (C, I (Tj ≤ C)) (thus with F = Fj and G = G), then the RNPMLE of μFj is asymptotically efficient.
2.2.1. Heuristic understanding of the difference between NPMLE and RNPMLE
To understand the difference between the NPMLE and the RNPMLE, we consider the special case k = 2 in detail. In this case N can have three possible values: N(C) = 0 if C < T1, N(C) = 1 if T1 ≤ C < T2, and N(C) = 2 if T2 ≤ C.
Let us assume that C has a Lebesgue density g. The likelihood of (C, N (C)) is given by pF1,F2,G(c, j) = g(c)S1(c) for j = 0, g(c)(S2 − S1)(c) for j = 1 and g(c)F2(c) for j = 2.
We note that the density pF1,F2,G can be reparametrized as pR,F2,G(c, j) = g(c)(RS2)(c) for j = 0, g(c)(S2(1 − R))(c) for j = 1 and g(c)(1 − S2)(c) for j = 2,
where R(t) ≡ S1(t)/S2(t). Thus, if we ignore the relation between F2 and R, then the NPMLE of F2 of the likelihood corresponding with pR,F2,G would actually be equal to the reduced data NPMLE based on the reduced data (C, I (T2 ≤ C)). However, F2 and R are related since S2R has to be a survival function. Therefore, it is not possible to determine the NPMLE by separate maximization w.r.t. F2 and R, which explains why the NPMLE and the RNPMLE of F2 differ.
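To make the reparametrization concrete, the following sketch (Python, illustrative names) evaluates the F-part of the log-likelihood at candidate values of S2 and R at the observed monitoring times. The terms involving S2 do not involve R, so unconstrained maximization over S2 reproduces the reduced data NPMLE of F2; only the requirement that S2R be a survival function couples the two maximizations.

```python
import numpy as np

def loglik_F_part_k2(NC, S2_at_C, R_at_C):
    """F-part of the log-likelihood of (C, N(C)) for k = 2 in the (R, F2)
    parametrization (the g(C)-factor does not involve F and is dropped):
        P(N(c)=0) = R(c)S2(c),  P(N(c)=1) = S2(c)(1 - R(c)),  P(N(c)=2) = 1 - S2(c),
    where R = S1/S2.  NC holds the observed values of N(C); S2_at_C and R_at_C
    are candidate values of S2 and R at the observed monitoring times."""
    NC = np.asarray(NC)
    S2 = np.asarray(S2_at_C, dtype=float)
    R = np.asarray(R_at_C, dtype=float)
    cell = np.where(NC == 0, R * S2, np.where(NC == 1, S2 * (1.0 - R), 1.0 - S2))
    return float(np.sum(np.log(cell)))
```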
Theorem 2.1 shows that this relation between F2 and R is not informative for estimation of smooth functionals of F2 at a large class of data generating distributions, since the RNPMLE, which ignores this relation, is still asymptotically efficient for estimation of √n-estimable parameters. Our proof of Theorem 2.1 for k = 2 shows that the efficient score operator (for the definition of an efficient score operator, see the proof) of F2 equals the efficient score operator for F2 in the reduced data model based on (C, Δ2). This implies that, at (F1, F2) satisfying (3), the efficient influence curve for any smooth functional of F2 equals the influence curve of the RNPMLE as given in Lemma 2.1. Closer inspection of the proof for k = 2 also shows that, if (e.g.) F2 is continuous while F1 is discrete on [0, τ1], or F2 is discrete with support not containing the support of a discrete F1, then the efficient score operator for F2 is not the same as the efficient score operator for F2 in the reduced data model, so that, in particular, the efficient influence curves (and information bounds) differ for the two models. Thus, at such (F1, F2), the RNPMLE of smooth functionals of F2 is inefficient.
Here, we provide a likelihood-based explanation of this fact. Let Rn be the NPMLE of R. The NPMLE of F2 maximizes the likelihood corresponding with pRn,F2 over all F2 for which S2Rn is a survival function, while the RNPMLE maximizes the likelihood over all distributions F2. Suppose now that the model consists of discrete F1’s and continuous F2’s. This model, though smaller than the model with F1, F2 being unspecified, has the same semiparametric efficiency bound at a (F1, F2) in this smaller model as the efficiency bound in the original model. This follows from the fact that the class of one-dimensional submodels as needed to compute the tangent space can still be chosen the same. In this smaller model, an R = S1/S2 will be discrete at the support points of F1, and the shape of R between the support points equals the shape of 1/S2. As a consequence, since R determines the shape of F2 between the support points, knowing R in the smaller model helps enormously in estimating S2. In particular, for a given Rn, maximizing the likelihood corresponding with pRn,F2 over F2 with S2Rn being a survival function is very different from maximizing this likelihood over all possible distributions F2. This shows that the RNPMLE in the smaller model is inefficient at such (F1, F2). Since the efficiency bound in the smaller model is the same as the efficiency bound in the original model, the RNPMLE is also inefficient in the original model at such (F1, F2).
Proof of Theorem 2.1
We need to prove that assumption (3) implies that the tangent space at PF,G equals L₀²(PF,G), and is thus saturated. The data generating distribution PF,G is indexed by F and G, where the dependence on F is only through the marginals Fj, j = 1, …, k. Thus, the tangent space at PF,G can be obtained as a sum of two tangent spaces, namely the tangent space for F and the tangent space for G, where the latter equals TG = {h(C): h ∈ L₀²(G)}. Let F, G be given and satisfy (3). We now claim that the tangent space for F is given by the closure of the sum of the k tangent spaces for Fj calculated as if the Fj’s are variation-independent parameters, j = 1, …, k. We will show this now. Let hj ∈ L₀²(Fj) have finite supremum norm, and let Fj,ε,hj, dFj,ε,hj = (1 + εhj) dFj, be the one-dimensional perturbation through Fj at ε = 0, j = 1, …, k. First, note that the support of Fj,ε,hj equals the support of Fj, j = 1, …, k. Since Fj > Fj+1 (strictly) on (0, τj] we have that, given an arbitrarily small δ1 > 0, there exists a δ > 0 such that, for all ε ∈ (−δ, δ), Fj,ε,hj ≥ Fj+1,ε,hj+1 on (δ1, τj] for all j = 1, …, k − 1. Thus, PFj,ε,hj,j = 1,…,k,G satisfies the constraints Fj ≥ Fj+1, j = 1, …, k − 1, of our model except on an arbitrarily small neighborhood of 0. Thus, by modifying hj on an arbitrarily small neighborhood of 0, we can make ε → PFj,ε,hj, j=1,…,k,G a true one-dimensional submodel. Since a tangent space for F is obtained as the closure in L₀²(PF,G) of the linear span of scores of all possible one-dimensional submodels, it follows that the score of the unmodified ε → PFj,ε,hj, j=1,…,k,G also belongs to the tangent space. This proves our claim.
Let j ∈ {1, …, k} be given. For a given hj ∈ L₀²(Fj), we consider the one-dimensional submodel Fj,ε given by ε → (1 + εhj (t)) dFj (t) which goes through Fj at ε = 0. For notational convenience, define the random variable R = N (C) + 1 ∈ {1, …, k + 1}, and let F−j be the (k − 1)-dimensional vector of c.d.f.’s excluding Fj. This one-dimensional submodel Fj,ε implies a score for PFj,ε,F−j,G given by
If we define S0 ≡ 0 and Sk+1 ≡ 1, then, for j = 1, …, k,
where we use that S1 − S0 = S1, and Sk+1 − Sk = Fk. Here Aj is called the score operator of Fj, j = 1, …, k. The tangent space for Fj is given by the closure of the range of Aj, denoted by R̄(Aj). Define AF by AF (h1, …, hk) = A1(h1) + ··· + Ak(hk). Then, the tangent space for F equals R̄(AF), so that the tangent space at PF,G is given by R̄(AF) + TG. Thus, to prove the theorem, it suffices to show that R̄(AF) + TG = L₀²(PF,G) at any F, G satisfying (3).
The remaining task is to understand the range of AF. We decompose AF as a sum of efficient score operators Aj*, j = 1, …, k, where Aj* is defined as Aj minus its projection on the sum-space spanned by the ranges of the other score operators A1, …, Aj−1, Aj+1, …, Ak. We will prove that the efficient score operator of Fj at a PF,G satisfying (3) equals the score operator for the reduced current status data structure (C, Δj), j = 1, …, k. Since the information bounds for smooth functionals of Fj are, in both models, solely expressed in terms of the efficient score operator for Fj, the latter result proves that an efficient estimator of μj based on (C, Δj), j = 1, …, k, like the RNPMLE, is also efficient in the model for the more informative data structure (C, N (C)) [e.g., Bickel, Klaassen, Ritov and Wellner (1993)]. This proves that the RNPMLE actually yields efficient estimators. Subsequently, we show that this special structure of the efficient score operators implies that the tangent space at a PF,G satisfying (3) is saturated, proving the more general statement of Theorem 2.1.
Derivation of the efficient score operators of Fj
Since E(Al(hl)(Y)Am(hm)(Y)) is equal to 0 if | l − m | ≥ 2, it will follow that the efficient score operators mainly involve projections of the type Π(Aj(hj) | R(Aj−1)) and Π(Aj(hj) | R(Aj+1)). Therefore we first obtain closed form expressions, in general, for these projection operators.
If the projection is actually an element of R(Aj −1), then this projection is given by (compare with the formula X(X′X)−X′Y for the least squares estimator):
(4) Π(Aj(hj) | R(Aj−1)) = Aj−1(Aj−1⊤Aj−1)−Aj−1⊤Aj(hj),
where Aj−1⊤ is the adjoint of Aj−1, and (Aj−1⊤Aj−1)− stands for the generalized inverse of Aj−1⊤Aj−1. Similarly,
(5) Π(Aj(hj) | R(Aj+1)) = Aj+1(Aj+1⊤Aj+1)−Aj+1⊤Aj(hj).
The adjoint Al⊤ : L₀²(PF,G) → L₀²(Fl) of Al is defined by Al⊤(V)(Tl) = E(V(C, R) | Tl).
It is easily shown that for l ∈ {1, …, k},
We have that
where
or, in fact, with our convention of S0 = 0 and Sk+1 = 1,
Here φl (t) ≡ 0 if Sl(t) = 0.
If pl = P (Tl = ∞) > 0, then we can write
Thus, given a K with K ≪ G, a solution (if it exists) of has to satisfy: for G-a.e., c ∈ [0, τl],
(6)
and, if pl = P (Tl = ∞) > 0, then the equation yields
(7)
Thus, even when pl > 0, (6) is the principal equation to solve (and will imply our conditions) since its solution hl on [0, τl] yields the complete solution hl(Tl) = hl(Tl)I[0,τl](Tl) + I(Tl = ∞)hl(∞). This two-step method for solving for hl first solves for hl I[0,τl] and then uses that, if pl > 0, hl (∞) is a function of hl I[0, τl].
We have, for l ∈ {1, …, k − 1},
We note that this element is indeed absolutely continuous w.r.t. G. Similarly, it follows that, for l ∈ {1, …, k − 1},
Thus, hj−1,j ≡ (Aj−1⊤Aj−1)−Aj−1⊤Aj(hj) is the h satisfying
(8)
for G-a.e. c ∈ [0, τj−1] and, if pj−1 > 0, then h(∞) is a simple function of hI[0,τj−1] as given above. Similarly, hj+1,j ≡ (Aj+1⊤Aj+1)−Aj+1⊤Aj(hj) is the h satisfying
(9)
for G-a.e. c ∈ [0, τj+1] and, if pj+1 > 0, then h(∞) is a simple function of hI[0,τj+1]. If we can take a derivative of the right-hand sides in (8) and (9) w.r.t. Fj−1 and Fj+1, then, in terms of h, equations (8) and (9) have a solution. This is possible if Fj ≪ Fl (i.e., Fj is absolutely continuous w.r.t. Fl) on [0, τl], l ∈ {j − 1, j + 1}, which holds under assumption (3) since we assumed that all Fj have positive Lebesgue density on [0, τj]. The efficient score operator also involves projections requiring existence of solutions hl−1,l, hl+1,l for l different from j. Therefore, the assumed condition (3) includes (via an easy-to-understand condition) the necessary and sufficient conditions for the existence of hl−1,l, hl+1,l for all possible l, as needed below.
This gives the following closed form expressions for the projections (4) and (5) by simply replacing the argument h in Al(h) by the expressions above. We have, for j = 1, …, k − 1,
(10)
and, for j = 2, …, k,
(11)
For simplicity we derive the efficient score operators for the case k = 3. (The proof generalizes to the general case.) First, define
The efficient score operators are given by
Calculation of the efficient score operator of F2. Applying (10) and (11) with j = 2 gives us
and
Thus,
Now, notice that
Thus (using ),
Calculation of the efficient score operator of F1. Formula (10) with j = 1 gives us
Thus,
We now note that
Thus,
It is easily verified that the adjoint is given by
Subsequently, we can now verify that
where
We need to find with
This solution has to satisfy on [0, τ2]:
and, as shown previously, h23,1(∞) is a simple function of h23,1I[0,τ2]. We note that h23,1 exists under the assumption Fj ≡ Fk (i.e., Fj ≪ Fk and Fk ≪ Fj) on [0, τj], j = 1, …, k − 1, which follows from (3). We conclude that
Using F2/(F1(S2 − S1)) − 1/(S2 − S1) = −1/F1 and yields
Calculation of the efficient score operator of F3. This calculation is very similar to the one above for F1 and is omitted. We have
Proving that the tangent space is saturated
Given the expressions for the efficient score operators derived above, we now prove that the tangent space at a PF,G satisfying (3) is saturated. Under our assumption (3), the tangent space equals TG (the scores generated by G) plus the closure of the range of the operator A* defined by A*(h1, …, hk) = A1*(h1) + ··· + Ak*(hk),
where the marginal efficient score operators Aj* are as derived above. The closure of the range of a Hilbert space operator equals the orthogonal complement of the null-space of its adjoint, that is, R̄(A*) = N(A*⊤)⊥. Thus we need to show that N(A*⊤) ⊂ TG. The adjoint A*⊤ is given by A*⊤(V) = (A1*⊤(V), …, Ak*⊤(V)),
where it is easily verified that the adjoint Aj*⊤ of Aj* is given by
Consider the operator mapping a function of (C, Δj) to its conditional expectation given Tj, defined on H(C, Δj), the space of functions of (C, Δj) with finite variance and zero mean (both taken w.r.t. PF,G). Using precisely the same proof as the proof of Lemma 2.2, it follows that, if Fj has a Lebesgue density fj > 0 on [0, τj], then the null-space of this operator consists of functions of C alone, that is, it consists of functions independent of Δj. Thus, under (3), A*⊤(V) = 0 implies that E(V (C, R) | C, Δj) = E (V (C, R) | C) ≡ φ(C), j = 1, …, k.
Setting Δ1 = 0 yields φ(C) = E(V (C, R) | C, Δ1 = 0) = V (C, 1). Now, we note that
where P (R = m | c) = (Sm − Sm−1)(c). Thus, E(V (C, R) | C, Δj = 1) is given by
For j = k, this equality gives V (c, k + 1) = φ(c). For j = k − 1, this equality gives then
so that V (c, k) = φ(c). In this manner, we subsequently find φ(c) = V (c, k + 1) = V (c, k) = … = V (c, 2). This shows that V (C, R) does not depend on R. This completes the proof.
3. Current status data on a counting process when final event is right censored
The following theorem proves efficiency of any regular asymptotically linear estimator at a specified rich sub-model.
Theorem 3.1
Let N (t) be a counting process for random variables T1 < ··· < Tk. Let C be a random censoring time. For every subject we observe the following data structure: Y = (T̃, N(T̃)), where T̃ ≡ C ∧ Tk (so that Δ ≡ I(Tk ≤ C) = I(N(T̃) = k) is observed).
We assume that C is independent of (T1, …, Tk). The distribution of Y only depends on the multivariate distribution F of (T1, …, Tk) through the marginal distributions F1, …, Fk of (T1, …, Tk).
Consider a data generating distribution PF,G in the model above satisfying the following condition (12): For certain τ1 < …< τk < ∞, let Fj have Lebesgue density fj on [0, τj] with
(12)
We allow that pj ≡ P(Tj = ∞) > 0 for j = j0, …, k and j0 ∈{1, …, k}.
Then, the tangent space at PF,G equals L₀²(PF,G) and is thus saturated. This implies that an estimator of a real valued parameter of the distribution F which is regular and asymptotically linear at PF,G is also asymptotically efficient if PF,G satisfies (12). In particular, if Ḡ(t) > 0 and F, G satisfy (12), then the Kaplan–Meier estimator Sk,KM (t) of Sk(t) = P (Tk > t), based on the i.i.d. data (T̃, Δ), is asymptotically efficient.
3.1. Regular and asymptotically linear estimators
The important implication of Theorem 3.1 is that, if we can construct an estimator of √n-estimable parameters of Fj which is regular and asymptotically linear, then this estimator will be asymptotically efficient at any F satisfying (12), j = 1, …, k. In this subsection, we provide relatively simple regular and asymptotically linear estimators.
First, consider estimation of Sk(t) = P (Tk > t). It is well known that Sk,KM (t) is a regular asymptotically linear estimator of Sk(t) whenever Ḡ(t) > 0. Second, consider estimation of Sj (t) = P (Tj > t), j = 1, …, k − 1. Let Δj ≡ I (Tj ≤ C). Under independent censoring (we can weaken this to noninformative censoring of Tk), we have
(13) E(1 − Δj | C, Tk > C) = Sj(C)/Sk(C).
So
(14) Sj(C) = E(Sk(C)(1 − Δj) | C, Tk > C).
In other words, estimating Sj can be viewed as estimating a monotonic regression of Sk(C)(1 − Δj) on the observed C’s. This suggests replacing Sk by the efficient Kaplan–Meier estimator Sk,KM and minimizing
(15) ∑i=1,…,n wi I(Ci < Tk,i){Sk,KM(Ci)(1 − Δj,i) − Sj(Ci)}²
over the vector (Sj (Ci): i = 1, …, n), under the constraint that Sj is monotone, where wi, i = 1, …, n, is a given set of weights possibly assigning more mass to observations with smaller variance. The solution Sj,n of this problem can be obtained with the pool-adjacent-violator-algorithm (PAVA) [see, e.g., Barlow, Bartholomew, Bremner and Brunk (1972)].
A simple calculation shows that
(16)
Since Rj is not identified from the data at a better rate than Sj, a good set of weights is the one proposed in van der Laan, Jewell and Peterson (1997).
It is beyond the scope of this paper to prove that smooth functionals of Sj,n are regular and asymptotically linear. Since it is straightforward to prove such a theorem for a standard histogram regression estimator of the regression of Sk(C)(1 − Δj) on the observed C’s, one expects that the more sophisticated isotonic regression estimate Sj,n (which only differs because it selects its bins adaptively) is regular and asymptotically linear under the same conditions. We note that the choice of weights wi, i = 1, …, n, has no effect on the limit distribution of smooth functionals of Sj,n.
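A minimal implementation sketch of the estimator of Section 3.1 (Python; all names are illustrative): a product-limit estimator of Sk is combined with a weighted antitonic regression, computed by PAVA, of Sk,KM(Ci)(1 − Δj,i) on the censoring times Ci of the censored subjects. In the sketch, Δj for censored subjects is supplied by the user (read off from N(T̃)), and the weights are an optional argument, in line with the remark above that their choice does not affect the limit distribution of smooth functionals.

```python
import numpy as np

def kaplan_meier(t_tilde, delta):
    """Product-limit estimator of S_k(t) = P(T_k > t) from right-censored data
    (t_tilde = T_k ^ C, delta = I(T_k <= C)).  Returns a step function."""
    t_tilde = np.asarray(t_tilde, dtype=float)
    delta = np.asarray(delta, dtype=int)
    times = np.unique(t_tilde[delta == 1])
    surv, km_t, km_s = 1.0, [], []
    for t in times:
        at_risk = np.sum(t_tilde >= t)
        deaths = np.sum((t_tilde == t) & (delta == 1))
        surv *= 1.0 - deaths / at_risk
        km_t.append(t); km_s.append(surv)
    def S(t):
        t = np.atleast_1d(np.asarray(t, dtype=float))
        idx = np.searchsorted(km_t, t, side="right")
        return np.where(idx == 0, 1.0, np.r_[1.0, km_s][idx])
    return S

def pava_decreasing(y, w):
    """Weighted pool-adjacent-violators algorithm for a nonincreasing fit."""
    vals, wts, sizes = [], [], []
    for yi, wi in zip(y, w):
        vals.append(yi); wts.append(wi); sizes.append(1)
        while len(vals) > 1 and vals[-2] < vals[-1]:   # violation of decreasing order
            v2, w2, s2 = vals.pop(), wts.pop(), sizes.pop()
            v1, w1, s1 = vals.pop(), wts.pop(), sizes.pop()
            wt = w1 + w2
            vals.append((w1 * v1 + w2 * v2) / wt); wts.append(wt); sizes.append(s1 + s2)
    return np.repeat(vals, sizes)

def isotonic_Sj(t_tilde, delta, delta_j, weights=None):
    """Estimate S_j at the observed censoring times by weighted antitonic regression
    of S_{k,KM}(C_i)(1 - Delta_{j,i}) on C_i, using only censored subjects (delta == 0)."""
    S_KM = kaplan_meier(t_tilde, delta)
    cens = np.asarray(delta) == 0
    C = np.asarray(t_tilde, dtype=float)[cens]
    y = S_KM(C) * (1.0 - np.asarray(delta_j, dtype=float)[cens])
    w = np.ones_like(y) if weights is None else np.asarray(weights, dtype=float)[cens]
    order = np.argsort(C)
    return C[order], pava_decreasing(y[order], w[order])
```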
3.2. Proof of Theorem 3.1
In the first part of the proof we establish that, if condition (12) holds, then the efficient score operator of Fk equals the efficient score operator of Fk in the reduced data model for (T̃k, Δk), thereby establishing a proof of the efficiency of the Kaplan–Meier estimator SKM (t). Subsequently, exploiting this special form of the efficient score operator of Fk, we prove saturation of the tangent space and thus Theorem 3.1.
Consider the data structure (T̃k = Tk ∧ C, N (T̃k)), where N(t) = I(T1 ≤ t) + ··· + I(Tk ≤ t) and T1 < T2 < ··· < Tk are ordered random variables. Let R = N (T̃k) + 1. The density of the data is given by pF,G(t, r) = g(t)(Sr − Sr−1)(t) for r = 1, …, k (a censored observation at t = C with N(C) = r − 1) and pF,G(t, k + 1) = Ḡ(t)fk(t) (an uncensored observation T̃k = Tk = t), where S0 ≡ 0 and Sk+1 ≡ 1.
where S0 ≡ 0 and Sk+1 ≡ 1. We refer to the beginning of the proof of Theorem 2.1 to show that the tangent space at a PF,G satisfying condition (12) is the closure of the sum of the tangent spaces generated by Fj, j = 1, …, k and the tangent space of G, treating Fj as locally variation-independent. We have that the score operators: for Fj, j= 1, …, k − 1, are given by
and
Derivation of efficient score operator of Fk
We first determine the efficient score operator for Fk. For notational convenience, we consider the case k = 3. We have
where
Applying formula (11) gives
where we need to assume that F2 ≪ F1 on [0, τ1]. Thus, an easy calculation shows that
Another straightforward calculation shows that the adjoint of is given by
A straightforward calculation now shows that
We also have
This shows that h21,3 satisfies, on [0, τ2],
and, if p2 = P (T2 = ∞) > 0, then h21,3(∞) is a simple function of h21,3I[0,τ2] as shown above (see (7)). Here we need to assume that this equation can be solved for h21,3. This is true if F3 ≪ F2 on [0, τ2]. Then
This proves that
Thus, we have proved that, if Fk ≡ Fj on [0, τj], j = 1, …, k − 1, then the efficient score operator of Fk equals the efficient score operator of Fk in the reduced data model for (T̃k, Δk). The latter condition holds, in particular, if (12) holds. This proves the statement of Theorem 3.1 regarding efficiency of the Kaplan–Meier estimator SKM.
Saturated tangent space result
Note that, for a random variable Y, we define H(Y) as the space of functions of Y with finite variance and zero mean (both taken w.r.t. PF,G). For simplicity, we prove saturation for k = 3. Let A be defined by A(h1, h2) = A1(h1) + A2(h2). Then, the tangent space of F is given by R̄(A) + R̄(A3). Thus, the tangent space at PF,G is given by R̄(A) + R̄(A3) + R̄(B), where B is the score operator for the censoring mechanism G, given by B(h) = E(h(C) | T̃3, Δ3). By factorization of the likelihood into F and G parts, we have that R(B) is orthogonal to the F-scores. It is well known that the tangent space for the nonparametric right-censored data model for (T̃3, Δ3), only assuming that C is independent of T, is saturated, that is, equals H(T̃3, Δ3) [e.g., Bickel, Klaassen, Ritov and Wellner (1993)]; combined with the efficient score calculation above, this gives that the tangent space at PF,G equals R̄(A) + H(T̃3, Δ3). Thus, we need to prove that R̄(A) + H(T̃3, Δ3) = L₀²(PF,G), which is equivalent to proving N(A⊤) ⊂ H(T̃3, Δ3), where A⊤ is the adjoint of A and N(A⊤) denotes its null space.
First, we decompose A1 + …+ Ak− 1 into a sum of orthogonal operators (efficient score operators in the model with Fk known). Let and . By (4), it follows that
where we need the equivalence assumptions Fj ≡ Fj +1 on [0, τj] for j = 1, …, k, again. A more compact manner of representing these operators is
(17)
Consider the operator defined by . Proving is equivalent to proving , where A′⊤ is the adjoint of A′.
From the representation (17), the adjoint is given by
and thus, .
Consider now a solution V I (T3 > C) ∈ H (C, R) satisfying . In order to prove , it suffices to show I (T3 > C)V = I (T3 > C)φ(C) for some φ. Using precisely the same proof as the proof of Lemma 2.2, it follows that, if Fj has a Lebesgue density fj > 0 on [0, τj] and G has a Lebesgue density, then, for any function I (T3 > C)η(C, Δj), E(I (T3 > C) η (C, Δj) | Tj) = 0 implies η (C, 1) = η (C, 0). This proves that E(V (C, R)I (T3 > C) | C, Δj, T3 > C) = E(V (C, R)I (T3 > C) | C, T3 > C) ≡ I (T3 > C)φ(C) does not depend on Δj, j = 1, 2.
Setting Δ1 = 0 yields I (T3 > C)φ(C) = E(V (C, R)I (T3 > C) | C, Δ1 = 0, T3 > C) = V (C, 1)I (T3 > C). Now, we note that
Thus, E(V (C, R)I (T3 > C) | C, Δj = 1, T3 > C) is given by
For j = 2, this equality gives I (T3 > C)V (C, 3) = I (T3 > C)φ(C). For j = 1, this equality gives
so that I (T3 > C)V (C, 2) = I (T3 > C)φ(C). We have shown I (T3 > C)V (C, 1) = I (T3 > C)V (C, 2) = I (T3 > C)V (C, 3), which proves that V = I (T3 < C)V1(T3) + I (T3 > C)φ(C) for some functions V1 and φ, and thus that V is a function of (T̃3, Δ3), that is, V ∈ H(T̃3, Δ3). This completes the proof. □
Acknowledgments
The authors thank the referees and Associate Editor for their helpful comments.
Footnotes
Supported by a FIRST award (GM53722) from the National Institute of General Medical Sciences and the National Institutes of Health.
References
Barlow RE, Bartholomew DJ, Bremner JM, Brunk HD. Statistical Inference under Order Restrictions. Wiley; New York: 1972.
Bickel PJ, Klaassen CAJ, Ritov Y, Wellner JA. Efficient and Adaptive Estimation in Semi-Parametric Models. Johns Hopkins Univ. Press; 1993.
Diamond ID, McDonald JW. The analysis of current status data. In: Trussell J, Hankinson R, Tilton J, editors. Demographic Applications of Event History Analysis. Oxford Univ. Press; 1992. pp. 231–252.
Diamond ID, McDonald JW, Shah IH. Proportional hazards models for current status data: Application to the study of differentials in age at weaning in Pakistan. Demography. 1986;23:607–620.
Dinse GE, Lagakos SW. Nonparametric estimation of lifetime and disease onset distributions from incomplete observations. Biometrics. 1982;38:921–932.
Gill RD, van der Laan MJ, Robins JM. Coarsening at random: Characterizations, conjectures and counterexamples. In: Proc. First Seattle Symposium in Biostatistics. Lecture Notes in Statist. Vol. 123. Springer; New York: 1997. pp. 255–294.
Groeneboom PJ. Special topics course 593C: Nonparametric estimation for inverse problems: algorithms and asymptotics. Technical Report 344, Dept. Statistics, Univ. Washington; 1998. (For related software see www.stat.washington.edu/jaw/RESEARCH/SOFTWARE/software.list.html.)
Groeneboom P, Wellner JA. Information Bounds and Nonparametric Maximum Likelihood Estimation. Birkhäuser; Basel: 1992.
Huang J, Wellner JA. Asymptotic normality of the NPMLE of linear functionals for interval censored data, case I. Statist. Neerlandica. 1995;49:153–163.
Jewell NP, Malani HM, Vittinghoff E. Nonparametric estimation for a form of doubly censored data with application to two problems in AIDS. J. Amer. Statist. Assoc. 1994;89:7–18.
Jewell NP, Shiboski SC. Statistical analysis of HIV infectivity based on partner studies. Biometrics. 1990;46:1133–1150.
Jewell NP, van der Laan MJ. Generalizations of current status data with applications. Lifetime Data Analysis. 1995;1:101–109. doi: 10.1007/BF00985261.
Jongbloed G. Three statistical inverse problems. Ph.D. dissertation, Delft Univ. Technology; 1995.
Keiding N. Age-specific incidence and prevalence: A statistical perspective (with discussion). J. Roy. Statist. Soc. Ser. A. 1991;154:371–412.
Kodell RL, Shaw GW, Johnson AM. Nonparametric joint estimators for disease resistance and survival functions in survival/sacrifice experiments. Biometrics. 1982;38:43–58.
Sun J, Kalbfleisch JD. The analysis of current status data on point processes. J. Amer. Statist. Assoc. 1993;88:1449–1454.
Turnbull BW, Mitchell TJ. Nonparametric estimation of the distribution of time to onset for specific diseases in survival/sacrifice experiments. Biometrics. 1984;40:41–50.
van der Laan MJ, Jewell NP, Peterson DR. Efficient estimation of the lifetime and disease onset distribution. Biometrika. 1997;84:539–554.