Simple and Fast Overidentified Rank Estimation for Right-Censored Length-Biased Data and Backward Recurrence Time

Yifei Sun; Kwun Chuen Gary Chan; Jing Qin

doi:10.1111/biom.12727

Author manuscript; available in PMC: 2019 Mar 1.

Published in final edited form as: Biometrics. 2017 May 15;74(1):77–85. doi: 10.1111/biom.12727


PMCID: PMC5976459 NIHMSID: NIHMS969395 PMID: 28504836

Summary

Length-biased survival data subject to right-censoring are often collected from a prevalent cohort. However, informative right censoring induced by the sampling design creates challenges in methodological development. While certain conditioning arguments could circumvent the problem of informative censoring, related rank estimation methods are typically inefficient because the marginal likelihood of the backward recurrence time is not ancillary. Under a semiparametric accelerated failure time model, an overidentified set of log-rank estimating equations is constructed based on the left-truncated right-censored data and backward recurrence time. Efficient combination of the estimating equations is simplified by exploiting an asymptotic independence property between two sets of estimating equations. A fast algorithm is studied for solving non-smooth, non-monotone estimating equations. Simulation studies confirm that the overidentified rank estimator can have a substantially improved estimation efficiency compared to just-identified rank estimators. The proposed method is applied to a dementia study for illustration.

Keywords: Backward and forward recurrence time, Generalized method of moments, Weighted log-rank estimating equation

1. Introduction

The accelerated failure time (AFT) model is an important alternative to Cox’s proportional hazards model and is particularly appealing to medical investigators due to its straightforward interpretation. In an ideal situation, prospective follow-up studies are conducted by sampling incident cases over a possibly long period, and the subsequent survival time of interest is usually subject to right censoring. Methods for the AFT model with traditional right-censored survival data have been extensively studied by many authors; see Buckley and James (1979), Tsiatis (1990), and Ying (1993), among others. In practice, due to constraints on cost and time, studies on incident cohorts are often unavailable, and data on a prevalent cohort of diseased individuals, who have experienced the disease incidence before recruitment but not the failure event, are collected and analyzed. For example, in the Canadian Study of Health and Aging (CSHA), survival data were collected from a prevalent cohort of dementia patients who were alive at the time of recruitment. In many applications, including the CSHA, it is reasonable to assume that the incidence of disease onset is stable over time, and the survival time in the prevalent cohort is length-biased (Wang, 1991; Asgharian et al., 2002).

Semiparametric estimation of the AFT model for length-biased and right-censored data has been studied by Shen et al. (2009); Ning et al. (2011, 2014a,b). Specifically, Shen et al. (2009) proposed an inverse weighted estimating equation approach with a closed-form expression. Ning et al. (2011) generalized a Buckley–James type of estimator to length-biased and right-censored data. Given the feature that observed failure time data can be transformed to identically and independently distributed random variables without covariate effects, Ning et al. (2014a) proposed a class of estimating equations based on the score functions for the transformed data. Ning et al. (2014b) proposed two rank-based estimators, one based on modified risk-sets, and another based on inverse weighting and ranking. As shown in Ning et al. (2014b), there is no uniformly best estimation method regarding statistical efficiency in the current literature, and the authors provide decision guidelines on how to choose an estimation method only for scenarios with a few symmetric error distributions. Moreover, although well-established statistically, some of the existing approaches may suffer from unstable computational properties. Hence, it is desirable to develop efficient, computationally fast and stable estimation procedures under the AFT model for right-censored length-biased data.

In this article, we introduce a simple and efficient rank-based method for the estimation and inference of the AFT model under length-biased sampling. In addition to the rank-based estimating equations for left-truncated and right-censored data (Lai and Ying, 1991), we construct an additional set of estimating equations based on an induced model of the backward recurrence time. To improve efficiency, the overidentified sets of estimating equations are combined, in the spirit of the generalized method of moments (Hansen, 1982). The estimation and inference are greatly simplified by the fact that the two sets of estimating functions are asymptotically independent, even though they are constructed from correlated survival times. A further advantage of the proposed estimator is that the AFT model can be estimated using only the backward recurrence time data, which means that one can obtain a consistent estimator after recruitment even without follow-up; most of the existing works dealing with semiparametric AFT model under length-biased sampling (Shen et al., 2009; Ning et al., 2011, 2014a,b) require some failure events to be observed and cannot handle this case. Furthermore, a computationally efficient algorithm is given to provide a solution of the estimating equations which are neither continuous nor monotone.

We note that Li and Yin (2009) proposed an overidentified rank estimator for clustered survival data. Our estimator is sufficiently different in a few key aspects. The construction of overidentified rank estimator of Li and Yin (2009) was motivated by efficiency improvement from multiple working correlation structures, extending the work of Qu et al. (2000) for uncensored data. The survival times, as well as the estimating functions, are correlated in that setting. We consider univariate length-biased survival data but decompose the survival time into two correlated portions to construct overidentified estimating equations, while the two sets of estimating functions are asymptotically independent and can be easily combined by exploiting the independence structure.

The content of the article is organized as follows. In Section 2.1, we introduce the overidentified weighted log-rank estimating equations and propose an efficient combination. To further improve efficiency, we derive and incorporate the optimal weight functions in the estimating equations in Section 2.2. Moreover, in the absence of censoring, we show that the proposed estimator with a correctly estimated weight function achieves the semiparametric efficiency bound. In Section 3, a fast algorithm for parameter and variance estimation is developed. Simulation studies and an application to a dementia study are presented in Sections 4 and 5 for illustration. We conclude with a discussion in Section 6.

2. Estimation

2.1. Over-Identified Estimating Equations

For individuals in the target population, let $\tilde{T}$ denote the time from the disease onset to the failure event of interest, and let $\tilde{X}$ denote a p × 1 vector of covariates. We assume that the survival time in the target population follows the AFT model

\log \tilde{T} = β^{⊤} \tilde{X} + \tilde{ε},

(1)

where β is a p × 1 vector of parameters, and $\tilde{ε}$ follows an unspecified distribution. We denote by $\tilde{A}$ the time between disease onset and study enrollment, and assume that $\tilde{A}$ is independent of $\tilde{T}$ . In a prevalent cohort study, a diseased subject would be qualified to be sampled if the failure event does not occur before the sampling time, that is, $\tilde{T} \geq \tilde{A}$ . In other words, $\tilde{T}$ is left truncated by $\tilde{A}$ . Denote by T, A, and X the survival time, truncation time, and the covariates for individuals in the prevalent cohort. Then (T, A, X) has the same joint distribution as ( $\tilde{T}$ , $\tilde{A}$ , $\tilde{X}$ ) conditional on $\tilde{T} \geq \tilde{A}$ . When prospective follow-up is present, the observation of the survival time in the prevalent cohort is usually subject to right censoring. Instead of the actual value of T, we observe possibly censored survival time Y = min(T, A + C) and censoring indicator Δ = I(T ≤ A + C). In many applications, it is reasonable to assume that the censoring time after enrollment, C, is independent of (T, A) given X. Note, however, that the survival time T and the total censoring time A + C are typically correlated given X, as they share the same A. Thus the survival time T is subject to informative censoring. We assume that the observed data {(Y_i, A_i, X_i, Δ_i), i = 1, …, n} are independent and identically distributed replicates of (Y, A, X, Δ).

Let f (t) and S(t) denote the density and survival function of the random variable $\exp (\tilde{ε})$ , and $μ (x, β) = e^{β^{⊤} x} \int_{0}^{\infty} S (u) d u$ be the mean of $\tilde{T}$ given $\tilde{X} = x$ . Under length-biased sampling, the observed data likelihood, conditioning on X, is L = L_C × L_M (Wang, 1991), where we have

L_{C} \propto \prod_{i = 1}^{n} \left\{ \frac{f (Y_{i} e^{- β^{⊤} X_{i}}) e^{- β^{⊤} X_{i}}}{S (A_{i} e^{- β^{⊤} X_{i}})} \right\}^{Δ_{i}} \left\{ \frac{S (Y_{i} e^{- β^{⊤} X_{i}})}{S (A_{i} e^{- β^{⊤} X_{i}})} \right\}^{1 - Δ_{i}} \quad \text{and} \quad L_{M} = \prod_{i = 1}^{n} \frac{S (A_{i} e^{- β^{⊤} X_{i}})}{μ (X_{i}, β)} .

Based on the conditional likelihood function L_C (i.e., the likelihood function of the observed failure time conditioning on the truncation time and X), rank estimation for model (1) was proposed by Lai and Ying (1991), treating the data as left-truncated and right-censored. Note that inference based on the conditional likelihood L_C is not fully efficient for length-biased sampling, as evidenced by Vardi (1989), Wang (1991), Asgharian et al. (2002), and Shen et al. (2009), among others. The reason is that the marginal likelihood L_M (i.e., the likelihood function of the truncation time A given X) contains β and is not ancillary. Therefore, full likelihood inference will be more efficient than conditional likelihood inference. However, even in the simplest case of one-sample estimation, the maximum likelihood estimator based on the full likelihood does not have a closed-form expression, as discussed in Vardi (1989). Moreover, informative censoring poses a thorny issue: because T and A + C are correlated given covariates X, risk-set methods cannot be directly extended to the full likelihood. In what follows, we propose an estimator that combines information from L_C and L_M to improve efficiency.

To estimate β, a weighted log-rank estimating equation was proposed in Lai and Ying (1991) based on inverting a class of linear rank test statistics constructed from L_C. We define $N_{i}^{Y} (t, β) = Δ_{i} I (\log Y_{i} - β^{⊤} X_{i} \leq t)$ and $R_{i}^{Y} (t, β) = I (\log A_{i} - β^{⊤} X_{i} \leq t \leq \log Y_{i} - β^{⊤} X_{i})$. Let ϕ₁(t, β) denote a weight function that possibly depends on data. A system of weighted log-rank estimating functions can be constructed as

Ψ_{1} (β) = n^{- 1} \sum_{i = 1}^{n} \int_{- \infty}^{\infty} ϕ_{1} (u, β) \left\{ X_{i} - \frac{\sum_{j = 1}^{n} X_{j} R_{j}^{Y} (u, β)}{\sum_{j = 1}^{n} R_{j}^{Y} (u, β)} \right\} d N_{i}^{Y} (u, β) .

(2)

We denote ${\hat{β}}_{WLR, 1}$ to be the solution of Ψ₁(β) = o_p(n^−1/2). The right-hand side of the equation may not be identical to 0 because Ψ₁ is discontinuous and the solution is typically defined as the zero-crossing of Ψ₁(β).
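As a minimal sketch of how the estimating function in (2) can be evaluated, the Python function below computes Ψ₁(β) with the log-rank weight ϕ₁ = 1. The function name `psi1_logrank` and its argument layout are hypothetical, and the code assumes that only subjects with Δ_i = 1 contribute jumps of the counting process, the usual convention for censored data.

```python
import numpy as np

def psi1_logrank(beta, Y, A, delta, X):
    """Log-rank (phi_1 = 1) version of the estimating function in (2).

    Y, A  : observed times and truncation times, length n
    delta : event indicators (1 = failure observed), length n
    X     : (n, p) covariate matrix
    Assumes A_i <= Y_i, so every subject is in its own risk set.
    """
    beta = np.asarray(beta, dtype=float)
    X = np.asarray(X, dtype=float)
    e_Y = np.log(Y) - X @ beta   # observed times on the residual scale
    e_A = np.log(A) - X @ beta   # truncation times on the residual scale
    n, p = X.shape
    total = np.zeros(p)
    for i in np.flatnonzero(delta):
        # Risk set at u = e_Y[i]: subjects with e_A[j] <= u <= e_Y[j]
        at_risk = (e_A <= e_Y[i]) & (e_Y[i] <= e_Y)
        total += X[i] - X[at_risk].mean(axis=0)
    return total / n
```

The double pass over subjects costs O(n²) operations per evaluation, which is usually acceptable for the sample sizes considered here; a sorted implementation would be faster but harder to read.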

Since Ψ₁(β) is based on L_C, we can improve estimation efficiency by considering L_M, the marginal likelihood of A given X. Under length-biased sampling, we have

L_{M} = \prod_{i = 1}^{n} \frac{S (A_{i} e^{- β^{⊤} X_{i}})}{μ (X_{i}, β)} = \prod_{i = 1}^{n} \frac{S (A_{i} e^{- β^{⊤} X_{i}})}{e^{β^{⊤} X_{i}} E (e^{\tilde{ε}})} \propto \prod_{i = 1}^{n} \frac{S (A_{i} e^{- β^{⊤} X_{i}}) A_{i} e^{- β^{⊤} X_{i}}}{E (e^{\tilde{ε}})} \overset{def}{=} \prod_{i = 1}^{n} f_{η} (\log A_{i} - β^{⊤} X_{i}),

where $f_{η} (u) = S (e^{u}) e^{u} / E (e^{\tilde{ε}})$ is a density function. Thus L_M is equivalent to the likelihood based on the following induced model on the truncation time A:

\log A_{i} = β^{⊤} X_{i} + η_{i}, i = 1, \dots, n

(3)

where η is a random variable with density function f_η(·). Model (3) was first discussed by Yamaguchi (2003), who considered parametric AFT models when follow-up is not present. Define $N_{i}^{A} (t, β) = I (\log A_{i} - β^{⊤} X_{i} \leq t)$ and $R_{i}^{A} (t, β) = I (\log A_{i} - β^{⊤} X_{i} \geq t)$. Based on the induced model (3), a weighted log-rank estimating function is given by

Ψ_{2} (β) = n^{- 1} \sum_{i = 1}^{n} \int_{- \infty}^{\infty} ϕ_{2} (u, β) \left\{ X_{i} - \frac{\sum_{j = 1}^{n} X_{j} R_{j}^{A} (u, β)}{\sum_{j = 1}^{n} R_{j}^{A} (u, β)} \right\} d N_{i}^{A} (u, β),

(4)

where ϕ₂(t, β) is a weight function that possibly depends on data. We denote ${\hat{β}}_{WLR, 2}$ to be a solution of Ψ₂(β) = o_p (n^−1/2).
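The estimating function in (4) uses only the backward recurrence times, so it can be evaluated without any follow-up data. The sketch below computes Ψ₂(β) with the log-rank weight ϕ₂ = 1; the function name `psi2_logrank` is hypothetical.

```python
import numpy as np

def psi2_logrank(beta, A, X):
    """Log-rank (phi_2 = 1) version of the estimating function in (4).

    A : backward recurrence (truncation) times, length n
    X : (n, p) covariate matrix
    """
    beta = np.asarray(beta, dtype=float)
    X = np.asarray(X, dtype=float)
    e_A = np.log(A) - X @ beta   # residuals of the induced model (3)
    n, p = X.shape
    total = np.zeros(p)
    for i in range(n):
        # Risk set at u = e_A[i]: subjects whose residual is at least u
        at_risk = e_A >= e_A[i]
        total += X[i] - X[at_risk].mean(axis=0)
    return total / n
```

Because the induced model (3) involves no censoring, every subject contributes one jump, and each risk set always contains the subject itself.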

To estimate the parameter β, we have two sets of estimating equations. Combining Ψ₁(β) and Ψ₂(β) yields an overidentified set of estimating equations for β, and a question arises as to how to combine the estimating equations to attain optimal efficiency. One possible way is the generalized method of moments (GMM) (Hansen, 1982). Define Ψ(β) = (Ψ₁(β)^⊤, Ψ₂(β)^⊤)^⊤, and let W be a 2p × 2p positive-definite weight matrix. A consistent estimator of β can be obtained by ${\hat{β}}_{GMM} = \arg \min_{β} Ψ {(β)}^{⊤} W Ψ (β)$. Moreover, the optimal matrix W that yields an efficient estimator is the inverse of the asymptotic covariance matrix of $\sqrt{n} Ψ (β_{0})$, where β₀ is the true value of β. The following lemma implies that the optimal weight matrix is block diagonal.

Lemma 1

Under Assumptions (A1)–(A4) in the Appendix, $\sqrt{n} Ψ_{1} (β_{0})$ and $\sqrt{n} Ψ_{2} (β_{0})$ are asymptotically independent.

Lemma 1 is a non-trivial result because T and A, the outcomes used to construct Ψ₁ and Ψ₂, are positively correlated. The proof of Lemma 1 is given in the Supplementary Materials. The independence of the estimating functions can also be rationalized from a likelihood perspective. It is easy to see that the β-score functions from the conditional likelihood L_C and the marginal likelihood L_M are orthogonal. Moreover, after projecting the score functions onto the space orthogonal to the nuisance tangent space, the efficient score functions are still orthogonal. Since the weighted log-rank estimating functions are constructed based on the efficient score functions (Ritov and Wellner, 1988), the asymptotic independence of $\sqrt{n} Ψ_{1} (β_{0})$ and $\sqrt{n} Ψ_{2} (β_{0})$ can be proved.

It can be verified that $\sqrt{n} ({\hat{β}}_{WLR, 1} - β_{0})$ and $\sqrt{n} ({\hat{β}}_{WLR, 2} - β_{0})$ are asymptotically normal with variance-covariance matrices V₁ and V₂, respectively (V₁ and V₂ are given in the Supplementary Materials). By applying Lemma 1, the optimal GMM type estimator has asymptotic variance ${(V_{1}^{- 1} + V_{2}^{- 1})}^{- 1}$. However, computing ${\hat{β}}_{GMM}$ requires minimizing a quadratic form, which can be computationally intensive, particularly because Ψ(β) is neither continuous nor monotone.

Based on Lemma 1, we can construct a simpler estimator that is asymptotically equivalent to the optimal GMM estimator. It is shown in the Supplementary Materials that $\sqrt{n} ({\hat{β}}_{WLR, 1} - β_{0})$ and $\sqrt{n} ({\hat{β}}_{WLR, 2} - β_{0})$ are asymptotically orthogonal. This suggests considering a linearly weighted estimator, ${(V_{1}^{- 1} + V_{2}^{- 1})}^{- 1} (V_{1}^{- 1} {\hat{β}}_{WLR, 1} + V_{2}^{- 1} {\hat{β}}_{WLR, 2})$, whose asymptotic variance equals that of the optimal GMM estimator. In practice, V₁ and V₂ are usually unknown and need to be estimated. Given consistent estimators (${\hat{V}}_{1}$, ${\hat{V}}_{2}$) of (V₁, V₂), we propose to use the following weighted estimator,

{\hat{β}}_{W} = {({\hat{V}}_{1}^{- 1} + {\hat{V}}_{2}^{- 1})}^{- 1} ({\hat{V}}_{1}^{- 1} {\hat{β}}_{WLR, 1} + {\hat{V}}_{2}^{- 1} {\hat{β}}_{WLR, 2}) .

A detailed computation procedure to obtain (${\hat{β}}_{WLR, 1}$, ${\hat{β}}_{WLR, 2}$) and (${\hat{V}}_{1}$, ${\hat{V}}_{2}$) is given in Section 3. Let β₀ be the true regression coefficient. Theorem 1 summarizes the asymptotic properties of ${\hat{β}}_{W}$, with a proof given in the Supplementary Materials.

Theorem 1

Under assumptions (A1)–(A5) in the Appendix, $\sqrt{n} ({\hat{β}}_{W} - β_{0})$ converges weakly to a zero mean normal random vector with covariance matrix ${(V_{1}^{- 1} + V_{2}^{- 1})}^{- 1}$ .

From Theorem 1, the proposed estimator ${\hat{β}}_{W}$ is at least as efficient as the estimators using just-identified estimating equations, because $V_{1} \geq {(V_{1}^{- 1} + V_{2}^{- 1})}^{- 1}$ and $V_{2} \geq {(V_{1}^{- 1} + V_{2}^{- 1})}^{- 1}$, where V ≥ U means that V − U is positive semi-definite for matrices V and U.
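The inverse-variance combination and the ordering of the covariance matrices can be illustrated with a short sketch. The function name `combine_estimates` and the numbers in the usage example are hypothetical; V1 and V2 stand for the estimated covariance matrices on a common scale.

```python
import numpy as np

def combine_estimates(b1, V1, b2, V2):
    """Inverse-variance weighted combination of two estimators.

    Returns the combined estimate and (V1^{-1} + V2^{-1})^{-1}.
    """
    P1 = np.linalg.inv(V1)          # precision of the first estimator
    P2 = np.linalg.inv(V2)          # precision of the second estimator
    V = np.linalg.inv(P1 + P2)      # covariance of the combination
    beta_w = V @ (P1 @ b1 + P2 @ b2)
    return beta_w, V

# Illustrative (made-up) inputs: the second estimator is twice as precise.
b1, V1 = np.array([0.4, 1.2]), 2.0 * np.eye(2)
b2, V2 = np.array([0.55, 0.9]), 1.0 * np.eye(2)
beta_w, V = combine_estimates(b1, V1, b2, V2)

# V1 - V and V2 - V should be positive semi-definite.
gain1 = np.linalg.eigvalsh(V1 - V).min()
gain2 = np.linalg.eigvalsh(V2 - V).min()
```

In this toy example the combined estimate is pulled toward the more precise component, and both eigenvalue checks are non-negative, in line with the matrix inequalities above.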

The above discussion and theoretical results are based on unspecified weight functions ϕ₁(·, β) and ϕ₂(·, β). For instance, setting ϕ₁(·, β) = ϕ₂(·, β) = 1 yields the log-rank estimating equations. Moreover, because Model (3) is the standard semiparametric linear regression model, a natural choice to estimate β is the least squares estimator ${\hat{β}}_{L S}$, defined as the solution of the following estimating equation,

Ψ_{L S} (β) = \frac{1}{n} \sum_{i = 1}^{n} (X_{i} - \bar{X}) (\log A_{i} - β^{⊤} X_{i}) = 0,

where $\bar{X} = \sum_{i = 1}^{n} X_{i} / n$. By setting $ϕ_{2} (t, β) = t - \frac{\int_{t}^{\infty} u d F_{η} (u)}{1 - F_{η} (t)}$, we have $\sqrt{n} Ψ_{L S} (β_{0}) = \sqrt{n} Ψ_{2} (β_{0}) + o_{p} (1)$, where F_η is the cumulative distribution function of η (Ritov, 1990). Therefore the asymptotic independence of $\sqrt{n} Ψ_{1} (β_{0})$ and $\sqrt{n} Ψ_{L S} (β_{0})$ also holds, and one can linearly combine ${\hat{β}}_{WLR, 1}$ and ${\hat{β}}_{L S}$ to improve efficiency. Without additional assumptions, it is not clear whether ${\hat{β}}_{WLR, 2}$ is more efficient than ${\hat{β}}_{L S}$. Although rank estimation in (4) is not the standard way to handle uncensored data, it is used because of the independence property that leads to a simple combined estimator. In Section 2.2, we explore the weight functions ϕ₁(t, β) and ϕ₂(t, β), so that ${\hat{β}}_{WLR, 2}$ can be more efficient than ${\hat{β}}_{L S}$ with properly chosen weight functions.

2.2. Efficient Adaptive Rank Estimators

To further improve the efficiency, we derive the optimal weight functions ϕ₁(·, β) and ϕ₂(·, β) for the two sets of estimating equations. Define $ϕ_{1}^{0} (u, β)$ to be the limit of ϕ₁(u, β) as n → ∞, and let $λ_{\tilde{ε}} (\cdot)$ denote the hazard function of $\tilde{ε}$. For the first set of estimating functions Ψ₁(β), it can be shown that the random vector $\sqrt{n} ({\hat{β}}_{WLR, 1} - β_{0})$ is asymptotically normal with covariance matrix $V_{1} = Γ_{1} {(β_{0})}^{- 1} Σ_{1} (β_{0}) Γ_{1} {(β_{0})}^{- 1}$, where

Γ_{1} (β_{0}) = E \int_{- \infty}^{\infty} ϕ_{1}^{0} (u, β_{0}) \frac{{\dot{λ}}_{\tilde{ε}} (u)}{λ_{\tilde{ε}} (u)} \left[ X - \frac{E \{ R^{Y} (u, β_{0}) X \}}{E \{ R^{Y} (u, β_{0}) \}} \right]^{\otimes 2} d N^{Y} (u, β_{0}),

and

Σ_{1} (β_{0}) = E \int_{- \infty}^{\infty} ϕ_{1}^{0} {(u, β_{0})}^{2} \left[ X - \frac{E \{ R^{Y} (u, β_{0}) X \}}{E \{ R^{Y} (u, β_{0}) \}} \right]^{\otimes 2} d N^{Y} (u, β_{0}) .

By the Cauchy–Schwarz inequality, the optimal weight is

ϕ_{1}^{opt} (u) = {\dot{λ}}_{\tilde{ε}} (u) / λ_{\tilde{ε}} (u) = e^{u} \dot{λ} (e^{u}) / λ (e^{u}) + 1,

(5)

where $\dot{λ} (u) = d λ (u) / d u$ and ${\dot{λ}}_{\tilde{ε}} (u) = d λ_{\tilde{ε}} (u) / d u$ . Similarly, for Ψ₂(β), let λ_η be the hazard function of η with ${\dot{λ}}_{η} (u) = d λ_{η} (u) / d u$ , then the optimal weight function is

ϕ_{2}^{opt} (u) = {\dot{λ}}_{η} (u) / λ_{η} (u) = - λ (e^{u}) e^{u} + 1 + \frac{S (e^{u}) e^{u}}{\int_{u}^{\infty} S (e^{x}) e^{x} d x} .

(6)
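The second equality in (6) is not spelled out in the text. One way to check it, using only the density $f_{η} (u) = S (e^{u}) e^{u} / E (e^{\tilde{ε}})$ defined in Section 2.1 and writing $λ = f / S$ for the hazard of $e^{\tilde{ε}}$, is the following short derivation.

```latex
\begin{align*}
λ_{η}(u) &= \frac{f_{η}(u)}{\int_{u}^{\infty} f_{η}(x)\,dx}
          = \frac{S(e^{u})\,e^{u}}{\int_{u}^{\infty} S(e^{x})\,e^{x}\,dx}, \\
\log λ_{η}(u) &= \log S(e^{u}) + u - \log \int_{u}^{\infty} S(e^{x})\,e^{x}\,dx, \\
\frac{\dot{λ}_{η}(u)}{λ_{η}(u)} = \frac{d}{du}\log λ_{η}(u)
  &= -\frac{f(e^{u})\,e^{u}}{S(e^{u})} + 1
     + \frac{S(e^{u})\,e^{u}}{\int_{u}^{\infty} S(e^{x})\,e^{x}\,dx} \\
  &= -λ(e^{u})\,e^{u} + 1
     + \frac{S(e^{u})\,e^{u}}{\int_{u}^{\infty} S(e^{x})\,e^{x}\,dx}.
\end{align*}
```

The same log-derivative argument applied to $λ_{\tilde{ε}} (u) = λ (e^{u}) e^{u}$ gives (5). As a quick sanity check, for a Weibull hazard $λ (t) \propto t^{k - 1}$ the weight in (5) reduces to the constant k, so the log-rank weight is optimal for Ψ₁ in that case.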

There are a few options to estimate the weight functions $ϕ_{1}^{opt} (\cdot)$ and $ϕ_{2}^{opt} (\cdot)$: for example, kernel smoothing techniques have been applied in Lai and Ying (1991) and Lin and Chen (2013). However, substituting such nonparametric smoothing estimators into equations (2) and (4) could lead to estimators for β that perform poorly with moderate sample sizes, due to the instability of the kernel estimators. As an alternative, we can assume a flexible working parametric model for $\tilde{ε}$. For instance, $e^{\tilde{ε}}$ can be assumed to follow the generalized gamma distribution (Cox et al., 2007), which is an extensive family that contains nearly all of the most commonly used survival distributions. Then the unknown parameters in the distribution of $\tilde{ε}$ can be estimated through the score equation of the conditional likelihood using rescaled survival times. Even when the working model is misspecified, the proposed estimator is consistent and asymptotically normal.

In the absence of censoring, if the error term $\tilde{ε}$ follows the working model distribution, the combined estimator with consistently estimated optimal weights achieves the semiparametric efficiency bound. Define $M_{1} (t, β) = N^{Y} (t, β) - \int_{- \infty}^{t} R^{Y} (u, β) λ_{\tilde{ε}} (u) d u$ and $M_{2} (t, β) = N^{A} (t, β) - \int_{- \infty}^{t} R^{A} (u, β) λ_{η} (u) d u$. Theorem 2 states the efficient score of the AFT model with length-biased survival data, and the proof is given in the Supplementary Materials.

Theorem 2

In the absence of censoring, the efficient score of model (1) with length-biased data {(A_i, T_i, X_i), i = 1, …, n} is

S_{eff} (A, T, X) = \int_{- \infty}^{\infty} \frac{{\dot{λ}}_{\tilde{ε}} (u)}{λ_{\tilde{ε}} (u)} \{ X - E (X) \} d M_{1} (u, β_{0}) + \int_{- \infty}^{\infty} \frac{{\dot{λ}}_{η} (u)}{λ_{η} (u)} \{ X - E (X) \} d M_{2} (u, β_{0}) .

Remark 1

When the optimal weight function is correctly estimated, ${\hat{β}}_{W}$ is asymptotically equivalent to ${\hat{β}}_{S}$, which is the solution of Ψ₁(β) + Ψ₂(β) = o_p (n^−1/2). However, when the user-specified weight function differs from the optimal choice, ${\hat{β}}_{W}$ is in general asymptotically more efficient than ${\hat{β}}_{S}$.

Remark 2

The following induced models hold in the absence of censoring,

\log T_{i} = β^{⊤} X_{i} + ε_{i},

(7)

\log A_{i} = β^{⊤} X_{i} + η_{i},

(8)

where the joint density function of (ε, η) is $f_{(ε, η)} (u, v) = f (e^{u}) e^{u + v} / E (e^{\tilde{ε}})$ for v < u. Model (7) has been studied in Chen (2010) and Mandel and Ritov (2010). In this case, the T_i’s are sufficient for estimating β, and only (7) is needed for estimation. Moreover, it can be shown that our proposed estimator, with consistently estimated optimal weights, is asymptotically equivalent to the efficient estimator based on the marginal likelihood of model (7). However, the rank estimator of Chen (2010) cannot handle length-biased right-censored data because of induced informative censoring. To improve efficiency in the presence of right censoring, we need to consider (7) and (8) jointly.

Remark 3

It has been shown in Ritov and Wellner (1988) that the efficient score function for model (3) is

\int_{- \infty}^{\infty} \frac{{\dot{λ}}_{η} (t)}{λ_{η} (t)} \{ X - E (X) \} d M_{2} (t),

where $M_{2} (t) = M_{2} (t, β_{0}) = N^{A} (t, β_{0}) - \int_{- \infty}^{t} R^{A} (u, β_{0}) λ_{η} (u) d u$, and the efficiency bound is

I_{2} = \int_{- \infty}^{\infty} \left\{ \frac{{\dot{f}}_{η} (t)}{f_{η} (t)} \right\}^{2} f_{η} (t) d t \cdot Cov (X) .

When the weight function $ϕ_{2}^{opt}$ is consistently estimated, the estimator ${\hat{β}}_{WLR, 2}$ achieves the semiparametric efficiency bound I₂, and thus is asymptotically at least as efficient as the least squares estimate ${\hat{β}}_{L S}$.

3. Fast Computation

The computation of rank estimators is typically challenging, because the weighted log-rank estimating equation is usually neither continuous nor monotone, and it may have inconsistent roots in addition to a consistent root (Fygenson and Ritov, 1994). In such cases, the estimator needs to be defined in a shrinking neighborhood of the true value β₀, and iterative methods require a consistent initial value. However, finding a consistent initial estimate is usually as computationally challenging as directly finding the root of the estimating equation. This computational challenge is a major obstacle for applying the rank estimation techniques in practice even for the standard right-censored data. In what follows, a computationally simple approach is given for computing ${\hat{β}}_{WLR, 1}$ by borrowing strength from two algorithms proposed by Huang (2002) and Huang (2013). A parallel argument applies to ${\hat{β}}_{WLR, 2}$ and is thus omitted.

Although methodology for length-biased and right-censored data is usually thought of as more complicated than that for right-censored data, a rather surprising fact is that a simple consistent initial estimator of β can be obtained from the induced model (3). Specifically, based on model (3) and Yamaguchi (2003), the least squares estimate ${\hat{β}}_{L S}$ obtained by regressing the log backward recurrence time log A on X is a $\sqrt{n}$-consistent estimate of β and thus can serve as an initial value for an iterative algorithm.
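The initial value can be computed in a few lines. The sketch below regresses log A on centered covariates, which solves Ψ_LS(β) = 0; the function name `ls_initial` is hypothetical, and the returned covariance uses the classical homoskedastic least squares formula, which is an assumption made here for illustration rather than a formula given in the text.

```python
import numpy as np

def ls_initial(A, X):
    """Least squares fit of log A on X (slope only) and its covariance.

    A : backward recurrence times, length n
    X : (n, p) covariate matrix
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    Xc = X - X.mean(axis=0)          # centering absorbs the intercept E(eta)
    log_a = np.log(A)
    yc = log_a - log_a.mean()
    beta, *_ = np.linalg.lstsq(Xc, yc, rcond=None)
    resid = yc - Xc @ beta
    sigma2 = resid @ resid / (n - p - 1)
    # Classical OLS covariance (assumed: error variance does not depend on X)
    cov = sigma2 * np.linalg.inv(Xc.T @ Xc)
    return beta, cov
```

Centering matters because η does not have mean zero in general; the intercept is a nuisance quantity and only the slope is used as the starting value.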

To compute ${\hat{β}}_{WLR, 1}$, we consider a modified Newton’s method, following the arguments of Huang (2013). Under regularity conditions (A1)–(A5) in the Appendix, an asymptotic local linearity condition holds. Specifically, let ‖·‖ denote the Euclidean norm; for every positive sequence d_n that converges to 0 in probability,

\sup_{β : ‖ β - β_{0} ‖ \leq d_{n}} \frac{‖ Ψ_{1} (β) - Ψ_{1} (β_{0}) - {\hat{Γ}}_{1} (β - β_{0}) ‖}{n^{- 1 / 2} + ‖ β - β_{0} ‖} = o_{p} (1),

(9)

where ${\hat{Γ}}_{1}$ is a consistent estimate of the matrix Γ₁(β₀), the derivative at β₀ of the limit of Ψ₁(β) as n → ∞. Based on (9), a Newton-type algorithm can be applied iteratively,

{\hat{β}}^{(k)} = {\hat{β}}^{(k - 1)} - {\hat{Γ}}_{1}^{- 1} Ψ_{1} ({\hat{β}}^{(k - 1)}), k \geq 1

(10)

where ${\hat{β}}^{(0)} = {\hat{β}}_{L S}$. Since ${\hat{β}}^{(0)}$ is a $\sqrt{n}$-consistent estimate of β₀, it can be shown that the one-step estimator ${\hat{β}}^{(1)}$ satisfies $\sqrt{n} Ψ_{1} ({\hat{β}}^{(1)}) = o_{p} (1)$. Moreover, to avoid the problem of over-shooting, we halve the step size repeatedly until the new estimate leads to a decrease in the quadratic score $Ψ_{1} {(β)}^{⊤} {\hat{Σ}}_{1} {(β)}^{- 1} Ψ_{1} (β)$, where ${\hat{Σ}}_{1} (β)$ is defined as

{\hat{Σ}}_{1} (β) = n^{- 1} \sum_{i = 1}^{n} \int_{- \infty}^{\infty} ϕ_{1}^{2} (u, β) \left\{ X_{i} - \frac{\sum_{j = 1}^{n} X_{j} R_{j}^{Y} (u, β)}{\sum_{j = 1}^{n} R_{j}^{Y} (u, β)} \right\}^{\otimes 2} d N_{i}^{Y} (u, β) .
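The Newton-type iteration with step halving can be sketched generically. In the code below, `psi`, `sigma_hat` and `gamma_hat` are placeholders supplied by the user for the estimating function, the matrix in the quadratic score, and the slope matrix; the function name, the stopping rule and the iteration limits are hypothetical choices. The update is read as a Newton step, that is, the step solves a linear system in the slope matrix.

```python
import numpy as np

def newton_step_halving(psi, sigma_hat, gamma_hat, beta0,
                        max_iter=50, max_halvings=20, tol=1e-8):
    """Newton-type iteration with step halving on the quadratic score.

    psi       : callable, beta -> estimating function value (length p)
    sigma_hat : callable, beta -> (p, p) positive-definite matrix
    gamma_hat : (p, p) slope matrix, held fixed across iterations
    """
    def quad_score(b):
        v = psi(b)
        return float(v @ np.linalg.solve(sigma_hat(b), v))

    beta = np.asarray(beta0, dtype=float)
    q_cur = quad_score(beta)
    for _ in range(max_iter):
        step = np.linalg.solve(gamma_hat, psi(beta))  # Gamma^{-1} Psi(beta)
        scale, improved = 1.0, False
        for _ in range(max_halvings):
            cand = beta - scale * step
            q_new = quad_score(cand)
            if q_new < q_cur:        # accept only if the score decreases
                improved = True
                break
            scale /= 2.0             # otherwise halve the step
        if not improved:
            break                    # no decrease found: stop
        beta, q_cur = cand, q_new
        if q_cur < tol:
            break
    return beta
```

Because the rank estimating functions are step functions, the quadratic score typically cannot be driven exactly to zero; stopping when no halved step decreases it mirrors the zero-crossing definition of the solution.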

In order to apply the algorithm in (10), a consistent estimate of Γ₁(β₀) is needed. Note that for a p × 1 vector h, we have

Ψ_{1} ({\hat{β}}^{(0)} + n^{- 1 / 2} h) - Ψ_{1} ({\hat{β}}^{(0)}) = n^{- 1 / 2} Γ_{1} (β_{0}) h + o_{p} (n^{- 1 / 2}) .

(11)

Let H₁ be a p × p non-singular matrix with ‖H₁‖_max = O_p(1) and $‖ H_{1}^{- 1} ‖_{max} = O_{p} (1)$, where ‖·‖_max denotes the maximum absolute value of the matrix elements. Let $h_{11}, \dots, h_{1 p}$ be the column vectors of H₁, that is, $H_{1} = (h_{11}, \dots, h_{1 p})$. Define the matrix $A_{1} = \sqrt{n} \{ Ψ_{1} ({\hat{β}}^{(0)} + n^{- 1 / 2} h_{11}) - Ψ_{1} ({\hat{β}}^{(0)}), \dots, Ψ_{1} ({\hat{β}}^{(0)} + n^{- 1 / 2} h_{1 p}) - Ψ_{1} ({\hat{β}}^{(0)}) \}$. It follows from (11) that $A_{1} H_{1}^{- 1}$ is a consistent estimate of Γ₁(β₀); thus we estimate Γ₁(β₀) by

{\hat{Γ}}_{1} = A_{1} H_{1}^{- 1} .

One possible choice of n^−1/2H₁ is the Cholesky factorization of the estimated covariance matrix of ${\hat{β}}^{(0)}$ . Given ${\hat{Γ}}_{1}$ , ${\hat{β}}_{WLR, 1}$ can be obtained by the Newton type algorithm in (10). Moreover, the asymptotic variance estimate of $\sqrt{n} ({\hat{β}}_{WLR, 1} - β_{0})$ is readily available as

{\hat{V}}_{1} = {\hat{Γ}}_{1}^{- 1} {\hat{Σ}}_{1} ({\hat{β}}_{WLR, 1}) {({\hat{Γ}}_{1}^{⊤})}^{- 1},

(12)

which converges in probability to V₁. The variance estimation is simpler than many other existing methods that either require kernel smoothing or resampling (Tsiatis, 1990; Parzen et al., 1994; Jin et al., 2003).
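The slope estimate based on (11) and the sandwich variance in (12) amount to a finite-difference calculation followed by two matrix products. The following sketch takes a generic estimating function `psi`; the function names `slope_estimate` and `sandwich_variance` are hypothetical.

```python
import numpy as np

def slope_estimate(psi, beta0, H, n):
    """Estimate the slope matrix as A H^{-1}, following (11).

    psi   : callable, beta -> estimating function value (length p)
    beta0 : initial root-n consistent estimate
    H     : (p, p) non-singular matrix of perturbation directions
    n     : sample size
    """
    beta0 = np.asarray(beta0, dtype=float)
    base = psi(beta0)
    root_n = np.sqrt(n)
    # Column k of A: sqrt(n) * {psi(beta0 + h_k / sqrt(n)) - psi(beta0)}
    cols = [root_n * (psi(beta0 + H[:, k] / root_n) - base)
            for k in range(H.shape[1])]
    A = np.column_stack(cols)
    return A @ np.linalg.inv(H)

def sandwich_variance(gamma_hat, sigma_hat):
    """Sandwich form Gamma^{-1} Sigma (Gamma^T)^{-1}, as in (12)."""
    g_inv = np.linalg.inv(gamma_hat)
    return g_inv @ sigma_hat @ g_inv.T
```

Perturbations of order n^−1/2 are what make the difference quotient informative despite the jumps of the rank estimating function; a much smaller step would mostly pick up the discontinuities.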

The above algorithm is similar in flavor to the algorithm in Huang (2002), but with certain important differences. The algorithm of Huang (2002) approximates the inverse of the estimating function, which requires solution-finding and may be computationally intensive. Moreover, due to the lack of a consistent initial estimate, Huang (2002) uses a recursive bisection algorithm. Our algorithm is also similar to the algorithm in Huang (2013), which requires an initial value obtained from a censored quantile regression model (Huang, 2010). Our problem structure permits us to use a least squares estimate as the initial value, which is much simpler. Also, the method of Huang (2013) may not be readily used for finding the solution of Ψ₁(β) = o_p(n^−1/2), since it is unclear how a computationally simple and consistent initial value can be obtained from censored quantile regression for left-truncated and right-censored data.

4. Simulations

Simulation studies are conducted to examine the finite-sample performance of the proposed inference procedures. We generate failure times from the following model

\log \tilde{T} = β_{1} {\tilde{X}}_{1} + β_{2} {\tilde{X}}_{2} + \tilde{ε}

where ${\tilde{X}}_{1}$ is generated from a Bernoulli distribution with success probability 0.5, and ${\tilde{X}}_{2}$ is a continuous variable from the uniform distribution on [0,1]. We set β₁ = 0.5 and β₂ = 1. Four error distributions were considered: (i) $e^{\tilde{ε}}$ follows a Weibull distribution with shape parameter 2 and scale parameter 0.5; (ii) $\tilde{ε}$ follows an extreme value distribution with scale parameter 0.2; (iii) $e^{\tilde{ε}}$ follows a gamma distribution with mean one and variance 0.25; and (iv) $\tilde{ε}$ follows a normal distribution with mean zero and variance 1/12. The truncation times and residual censoring times were generated in the original time scale (not log-scale). Specifically, the truncation times were generated from a uniform distribution with a large enough upper bound to ensure the stationarity assumption, and we kept only the pairs satisfying $\tilde{A} < \tilde{T}$. The residual censoring times, C, were independently generated from a uniform distribution over [0,c], where c was chosen to yield censoring percentages of 0, 25, and 50%. For each specified set of parameters, sample sizes of 200 and 800 were considered, and each scenario was repeated 1000 times. The results are summarized in Tables 1 and 2. We denote the proposed estimator with log-rank weight by ${\hat{β}}_{W}^{lr}$ and the proposed estimator with estimated optimal weight using the generalized gamma family as the working model by ${\hat{β}}_{W}^{opt}$. We compare our estimators with the estimator ${\hat{β}}_{L T}$ obtained by solving the log-rank estimating equation for left-truncated and right-censored data, and the weighted log-rank estimator ${\hat{β}}_{M}$ based on the marginal likelihood with estimated ϕ₂ using the working model. We also present the results of the parametric maximum likelihood estimators that assume $\tilde{ε}$ follows a generalized gamma distribution $({\hat{β}}_{MLE})$ and a normal distribution $({\hat{β}}_{MLE}^{normal})$.
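The data-generating mechanism for scenario (i) can be sketched as follows. The function name, the upper bound `tau` of the truncation distribution, the censoring bound `c`, and the batch size are hypothetical choices made for illustration; the text does not report the values actually used.

```python
import numpy as np

def simulate_prevalent_cohort(n, beta=(0.5, 1.0), c=2.0, tau=20.0, seed=0):
    """Draw n length-biased, right-censored observations (scenario i).

    Returns (Y, A, delta, X): observed time, truncation time,
    event indicator, and the (n, 2) covariate matrix.
    """
    rng = np.random.default_rng(seed)
    beta = np.asarray(beta, dtype=float)
    T_parts, A_parts, X_parts = [], [], []
    n_kept = 0
    while n_kept < n:
        m = 50 * n                                   # batch size (arbitrary)
        x = np.column_stack([rng.binomial(1, 0.5, size=m),
                             rng.uniform(0.0, 1.0, size=m)])
        err = 0.5 * rng.weibull(2.0, size=m)         # Weibull: shape 2, scale 0.5
        t = np.exp(x @ beta) * err                   # AFT model on the time scale
        a = rng.uniform(0.0, tau, size=m)            # onset-to-enrollment time
        keep = a < t                                 # left truncation: keep A < T
        T_parts.append(t[keep])
        A_parts.append(a[keep])
        X_parts.append(x[keep])
        n_kept += int(keep.sum())
    T = np.concatenate(T_parts)[:n]
    A = np.concatenate(A_parts)[:n]
    X = np.vstack(X_parts)[:n]
    C = rng.uniform(0.0, c, size=n)                  # residual censoring time
    Y = np.minimum(T, A + C)
    delta = (T <= A + C).astype(int)
    return Y, A, delta, X
```

The value of `tau` only needs to exceed the practical upper range of the failure times; a larger value leaves the sampled distribution unchanged and only lowers the acceptance rate.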

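Table 1. Simulation summary statistics (n = 200)

| Cen (%) | ${\hat{β}}_{W}^{opt}$ Bias | ${\hat{β}}_{W}^{opt}$ SE | ${\hat{β}}_{W}^{opt}$ SEE | ${\hat{β}}_{W}^{opt}$ RE | ${\hat{β}}_{W}^{lr}$ Bias | ${\hat{β}}_{W}^{lr}$ SE | ${\hat{β}}_{W}^{lr}$ SEE | ${\hat{β}}_{W}^{lr}$ RE | ${\hat{β}}_{L T}$ Bias | ${\hat{β}}_{L T}$ SE |
|---|---|---|---|---|---|---|---|---|---|---|
| Scenario I | | | | | | | | | | |
| 0 | (0,−1) | (63,107) | (59,104) | (69,72) | (−1,−2) | (63,107) | (60,104) | (69,72) | (−1,−4) | (76,126) |
| 25 | (0,−2) | (68,118) | (65,114) | (64,65) | (0,−3) | (68,118) | (66,115) | (64,65) | (1,2) | (85,146) |
| 50 | (−2,−11) | (75,133) | (73,127) | (51,50) | (−2,−11) | (74,131) | (74,128) | (50,48) | (2,−4) | (105,189) |
| Scenario II | | | | | | | | | | |
| 0 | (1,−3) | (28,49) | (27,47) | (86,89) | (1,−1) | (30,50) | (28,49) | (99,92) | (−3,−1) | (30,52) |
| 25 | (2,−1) | (30,52) | (29,51) | (94,86) | (1,1) | (30,51) | (30,52) | (94,83) | (1,3) | (31,56) |
| 50 | (0,0) | (34,58) | (33,57) | (84,79) | (−1,0) | (34,59) | | (84,82) | (−2,3) | (37,65) |
| Scenario III | | | | | | | | | | |
| 0 | (1,2) | (68,119) | (63,108) | (66,68) | (2,2) | (69,122) | (66,114) | (67,72) | (0,2) | (84,144) |
| 25 | (−1,−2) | (73,124) | (69,117) | (63,63) | (0,2) | (73,126) | (71,123) | (63,65) | (−2,−7) | (92,156) |
| 50 | (2,3) | (80,146) | (75,129) | (53,56) | (2,3) | (81,147) | (77,134) | (55,57) | (6,8) | (110,195) |
| Scenario IV | | | | | | | | | | |
| 0 | (0,2) | (42,73) | (39,63) | (63,66) | (0,1) | (46,79) | (45,78) | (75,77) | (0,3) | (53,90) |
| 25 | (−1,0) | (47,82) | (42,69) | (62,66) | (0,0) | (51,88) | (49,85) | (72,76) | (1,2) | (60,101) |
| 50 | (−1,−8) | (53,89) | (47,77) | (64,56) | (−2,−8) | (58,95) | (54,95) | (77,64) | (3,1) | (66,119) |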

Cen

{\hat{β}}_{M}

{\hat{β}}_{MLE}

{\hat{β}}_{MLE}^{norm}

Bias

Scenario I

(3,6)

(106,185)

(195,216)

(0,−3)

(60,105)

(62,69)

(−2,−9)

(72,130)

(90,107)

(2,−5)

(107,186)

(158,162)

(−1,−2)

(65,114)

(58,61)

(−1,−7)

(78,136)

(84,87)

(0,−3)

(109,185)

(108,96)

(0,−8)

(71,125)

(46,44)

(−5,−23)

(83,145)

(63,60)

Scenario II

(0,8)

(71,122)

(555,553)

(1,1)

(28,47)

(86,82)

(1,−4)

(33,61)

(120,138)

(6,0)

(71,123)

(528,481)

(1,0)

(30,52)

(94,86)

(0,−3)

(39,74)

(158,174)

(2,3)

(70,122)

(357,352)

(1,2)

(35,62)

(89,91)

(−1,−3)

(42,73)

(129,126)

Scenario III

(0,2)

(112,194)

(178,181)

(−5,−15)

(66,110)

(62,59)

(0,−2)

(69,121)

(67,71)

(−2,−7)

(113,195)

(150,156)

(2,5)

(70,127)

(58,66)

(−3,−5)

(74,125)

(65,64)

(6,8)

(110,201)

(100,106)

(3,4)

(77,140)

(49,52)

(−1,−4)

(79,142)

(51,53)

Scenario IV

(0,−11)

(85,162)

(257,325)

(2,−5)

(42,72)

(63,64)

(1,−6)

(43,76)

(66,72)

(1,7)

(90,164)

(225,264)

(−2,−1)

(45,81)

(56,64)

(−2,−1)

(45,83)

(56,68)

(7,3)

(95,158)

(207,177)

(3,1)

(50,85)

(57,51)

(3,−2)

(51,89)

(60,56)

Open in a new tab

Note: Cen is the censoring rate (%); Bias is the empirical bias (×1000); SE is the empirical standard error (×1000); SEE is the empirical mean of the standard error estimates (×1000); RE is the relative efficiency (×100) compared to ${\hat{β}}_{L T}$ . ${\hat{β}}_{W}^{opt}$ is the combined estimator with estimated weight function as in Section 2.2; ${\hat{β}}_{W}^{lr}$ is the combined estimator with ϕ₁ = ϕ₂ = 1; ${\hat{β}}_{L T}$ is the estimator from log-rank estimating equations based on L_C; ${\hat{β}}_{M}$ is the rank-based estimator based on L_M with ϕ₂ estimated by assuming that $\tilde{ε}$ follows a generalized gamma distribution; ${\hat{β}}_{MLE}$ and ${\hat{β}}_{MLE}^{normal}$ are the parametric maximum likelihood estimators assuming generalized gamma and normal distributions for $\tilde{ε}$ , respectively. The RE of ${\hat{β}}_{L T}$ is 100 and is omitted from the table.
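The summary measures defined in the note can be computed from simulation replicates roughly as below. This is a sketch under an assumption: the note does not spell out the RE formula, and the convention used here (variance of the estimator divided by the variance of ${\hat{β}}_{L T}$, times 100, so that values below 100 favor the estimator) is inferred from the reported numbers, for example SE values of 63 and 76 giving an RE of 69. The function names are hypothetical.

```python
import statistics


def relative_efficiency(se, ref_se):
    """Assumed convention: 100 * Var(estimator) / Var(reference estimator)."""
    return round(100 * (se / ref_se) ** 2)


def summarize(estimates, se_estimates, truth, ref_se=None):
    """Bias, SE, SEE (each x1000) and optionally RE (x100) for one coefficient."""
    bias = statistics.mean(estimates) - truth      # empirical bias
    se = statistics.stdev(estimates)               # empirical standard error
    see = statistics.mean(se_estimates)            # mean of estimated standard errors
    out = {
        "Bias": round(1000 * bias),
        "SE": round(1000 * se),
        "SEE": round(1000 * see),
    }
    if ref_se is not None:
        out["RE"] = relative_efficiency(se, ref_se)
    return out
```

Under this convention, an RE above 100 (as reported for ${\hat{β}}_{M}$ in most scenarios) indicates a larger variance than ${\hat{β}}_{L T}$.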

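Table 2. Simulation summary statistics (n = 800)

| Scenario | Cen | ${\hat{β}}_{W}^{opt}$: Bias | ${\hat{β}}_{W}^{opt}$: SE | ${\hat{β}}_{W}^{opt}$: SEE | ${\hat{β}}_{W}^{opt}$: RE | ${\hat{β}}_{W}^{lr}$: Bias | ${\hat{β}}_{W}^{lr}$: SE | ${\hat{β}}_{W}^{lr}$: SEE | ${\hat{β}}_{W}^{lr}$: RE | ${\hat{β}}_{L T}$: Bias | ${\hat{β}}_{L T}$: SE |
|---|---|---|---|---|---|---|---|---|---|---|---|
| I | 0 | (1,1) | (30,52) | (30,51) | (66,64) | (0,0) | (30,52) | (30,51) | (66,64) | (1,1) | (37,65) |
| I | 25 | (1,0) | (31,54) | (32,56) | (60,56) | (1,−1) | (31,54) | (32,56) | (60,56) | (2,−2) | (40,72) |
| I | 50 | (0,−2) | (35,63) | (36,63) | (51,50) | (−1,−2) | (35,63) | (36,63) | (51,50) | (−2,−2) | (49,89) |
| II | 0 | (1,1) | (14,24) | (13,23) | (87,85) | (1,1) | (14,24) | (13,23) | (87,85) | (1,1) | (15,26) |
| II | 25 | (1,0) | (14,26) | (14,24) | (77,86) | (1,0) | (14,26) | (14,25) | (77,86) | (1,1) | (16,28) |
| II | 50 | (0,−1) | (16,26) | (16,27) | (88,70) | (0,−1) | (16,27) | (16,28) | (89,76) | (0,0) | (17,31) |
| III | 0 | (3,0) | (34,57) | (32,55) | (53,64) | (3,0) | (34,57) | (33,57) | (53,64) | (−2,−1) | (47,71) |
| III | 25 | (−1,2) | (36,60) | (35,59) | (59,59) | (−1,1) | (36,61) | (35,61) | (59,61) | (1,−1) | (47,78) |
| III | 50 | (0,3) | (38,66) |  | (48,51) | (0,2) | (38,67) | (39,67) | (48,53) | (−2,5) | (55,92) |
| IV | 0 | (1,0) | (21,37) | (20,34) | (65,65) | (1,1) | (22,39) | (23,39) | (72,72) | (1,1) | (26,46) |
| IV | 25 | (0,2) | (23,40) | (22,37) | (63,64) | (0,1) | (25,42) | (24,42) | (74,71) | (0,1) | (29,50) |
| IV | 50 | (−2,0) | (25,43) | (24,41) | (51,55) | (−1,0) | (27,45) | (27,47) | (60,60) | (−1,1) | (35,58) |

| Scenario | Cen | ${\hat{β}}_{M}$: Bias | ${\hat{β}}_{M}$: SE | ${\hat{β}}_{M}$: RE | ${\hat{β}}_{MLE}$: Bias | ${\hat{β}}_{MLE}$: SE | ${\hat{β}}_{MLE}$: RE | ${\hat{β}}_{MLE}^{normal}$: Bias | ${\hat{β}}_{MLE}^{normal}$: SE | ${\hat{β}}_{MLE}^{normal}$: RE |
|---|---|---|---|---|---|---|---|---|---|---|
| I | 0 | (−1,−1) | (51,90) | (190,192) | (−1,−3) | (30,50) | (66,59) | (−3,−7) | (37,67) | (100,107) |
| I | 25 | (0,−2) | (50,90) | (156,156) | (1,1) | (31,54) | (60,56) | (−3,−9) | (41,72) | (105,101) |
| I | 50 | (−1,3) | (51,86) | (108,93) | (0,−2) | (35,62) | (51,49) | (−7,−19) | (43,79) | (79,83) |
| II | 0 | (1,3) | (34,56) | (512,465) | (0,−1) | (13,22) | (75,72) | (0,−1) | (17,32) | (128,151) |
| II | 25 | (1,−2) | (33,60) | (424,459) | (1,1) | (15,26) | (88,86) | (−2,−5) | (21,37) | (173,178) |
| II | 50 | (0,−2) | (35,58) | (424,350) | (0,−1) | (17,29) | (100,88) | (0,−1) | (22,43) | (167,192) |
| III | 0 | (−1,2) | (55,96) | (137,183) | (−1,−2) | (33,56) | (49,63) | (−3,−4) | (35,61) | (56,74) |
| III | 25 | (−1,6) | (54,98) | (132,157) | (−2,0) | (35,60) | (50,59) | (−1,−1) | (38,66) | (65,71) |
| III | 50 | (1,−3) | (55,95) | (100,106) | (−1,−1) | (39,66) | (56,52) | (−5,−6) | (39,68) | (51,55) |
| IV | 0 | (−1,2) | (45,77) | (299,280) | (−1,1) | (21,37) | (65,65) | (−2,−2) | (22,38) | (72,68) |
| IV | 25 | (0,0) | (44,79) | (230,250) | (0,−1) | (23,39) | (63,61) | (−2,−4) | (23,40) | (63,65) |
| IV | 50 | (2,2) | (44,81) | (158,195) | (2,−3) | (25,43) | (51,55) | (1,−3) | (25,45) | (51,75) |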

It can be seen from Tables 1 and 2 that all the estimators perform well in finite samples, with small empirical bias, and that the proposed estimators substantially outperform ${\hat{β}}_{L T}$ and ${\hat{β}}_{M}$ in all scenarios. In Scenarios (i)–(iii), the distribution of $e^{\tilde{ε}}$ belongs to the generalized gamma family, and ${\hat{β}}_{W}^{opt}$ has a standard error similar to that of ${\hat{β}}_{W}^{lr}$ . Note that $ϕ_{1}^{opt} \equiv 1$ in Scenarios (i) and (ii). In Scenario (iv), the normal error distribution is a limiting case of the generalized gamma family (Cox et al., 2007), and ${\hat{β}}_{W}^{opt}$ has a smaller standard error than ${\hat{β}}_{W}^{lr}$ . The improvement of our estimator is mainly due to the combination of the two sets of estimating equations; the improvement from estimating the optimal weight function is less notable. When the parametric model is correctly specified, the MLE is slightly more efficient than the proposed estimators; however, the MLE can be less efficient when the parametric model is misspecified. For example, ${\hat{β}}_{MLE}^{normal}$ has a relatively large variance in Scenarios (i)–(iii).

5. Data Analysis

We illustrate the proposed estimation procedure by analyzing the CSHA data. As discussed in Wolfson et al. (2001), the CSHA was a prevalent cohort study in which survival data were collected from a cohort of dementia patients at recruitment. Thus, patients who died before the recruitment period were not eligible to enter the cohort. The CSHA recruited a prevalent cohort of individuals aged 65 and older with dementia between February 1991 and May 1992. The survival time of interest is the time from onset to death, and the truncation time in the prevalent cohort is the duration from the onset of dementia to study enrollment. The goal of our analysis is to estimate the relative survival following the onset of dementia among subcategories of dementia, an important scientific question studied by Mölsä et al. (1986) and Roberson et al. (2005). We considered a subset of the study data by excluding those with a missing date of onset or a missing classification of dementia subtype. Moreover, as in Wolfson et al. (2001), patients with an observed survival time of 20 or more years were excluded because these subjects are considered unlikely to have Alzheimer's disease or vascular dementia. A total of 807 subjects were analyzed; among them, 249 were diagnosed with possible Alzheimer's disease, 388 had probable Alzheimer's disease, and 170 had vascular dementia. The observation of the residual survival time after recruitment is censored by the end of the follow-up period. The constant disease incidence assumption was checked in Huang and Qin (2012) with the Kolmogorov–Smirnov test, based on the fact that, under mild conditions, the truncation time A and the residual lifetime after enrollment T − A have identical distributions if and only if the incidence of disease is constant over time (Asgharian et al., 2006). The applicability of the AFT model to these data was checked using Q-Q plots (Ning et al., 2011).

We consider the following AFT model,

\log (\tilde{T}) = β_{1} {\tilde{X}}_{1} + β_{2} {\tilde{X}}_{2} + \tilde{ε},

where ${\tilde{X}}_{1}$ and ${\tilde{X}}_{2}$ are binary variables that indicate whether the patient has probable Alzheimer's disease and vascular dementia, respectively. The proposed estimate of β₁ is −0.107, with a 95% confidence interval of (−0.216, −0.001), and the estimate of β₂ is −0.166, with a 95% confidence interval of (−0.289, −0.044). Our analysis suggests that the survival times of patients with probable Alzheimer's disease and vascular dementia are significantly shorter than those of patients with possible Alzheimer's disease. For comparison, we also applied the two rank-based methods in Ning et al. (2014b). Using their first method, based on modified risk sets, the estimated β₁ is −0.138 (CI: −0.361, 0.085) and the estimated β₂ is −0.152 (CI: −0.375, 0.071). Using their second method, based on inverse weighting and ranking, the estimated β₁ is −0.102 (CI: −0.214, 0.010) and the estimated β₂ is −0.156 (CI: −0.319, 0.007). All estimators have similar point estimates, but our proposed estimator has the smallest standard error estimates and detects significant effects of probable Alzheimer's disease and vascular dementia on survival time.
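To make the reported coefficients easier to read, the short sketch below converts each estimate to a multiplicative effect on survival time, exp(β), and backs out the standard error implied by each interval. The second step assumes the intervals are symmetric 95% Wald intervals, which the text does not state; the helper names are hypothetical, and the numbers are those reported above.

```python
import math

Z975 = 1.959964  # 97.5% quantile of the standard normal distribution


def time_ratio(beta):
    """Multiplicative effect on survival time under the AFT model."""
    return math.exp(beta)


def implied_se(lower, upper):
    """Standard error implied by a symmetric 95% Wald interval (assumption)."""
    return (upper - lower) / (2.0 * Z975)


# (estimate, lower, upper) as reported in the text
reported = {
    "proposed":        {"beta1": (-0.107, -0.216, -0.001), "beta2": (-0.166, -0.289, -0.044)},
    "Ning et al. (1)": {"beta1": (-0.138, -0.361, 0.085),  "beta2": (-0.152, -0.375, 0.071)},
    "Ning et al. (2)": {"beta1": (-0.102, -0.214, 0.010),  "beta2": (-0.156, -0.319, 0.007)},
}

for method, coefs in reported.items():
    for name, (est, lo, hi) in coefs.items():
        print(f"{method:16s} {name}: time ratio {time_ratio(est):.3f}, "
              f"implied SE {implied_se(lo, hi):.3f}")
```

For the proposed estimator, the time ratios are roughly 0.90 and 0.85, that is, survival times about 10% and 15% shorter than in the possible Alzheimer's group, and its implied standard errors are the smallest of the three methods for both coefficients.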

6. Discussion

In this article, we propose an estimator to efficiently combine overidentified sets of estimating equations resulting from the follow-up data as well as the backward recurrence time data for a length-biased prevalent cohort. The proposed estimator is simple to implement, but is asymptotically equivalent to the optimal GMM estimator. A computationally fast and stable procedure is also presented for estimation and inference.

Rank-based estimating equations can be regarded as inversions of weighted log-rank statistics. In our case, the estimating equations can be regarded as inversions of the log-rank test of Ying (1990) for left-truncated and right-censored data and the log-rank test of Chan and Qin (2015) for backward recurrence data. However, in terms of estimation, the proposed method for estimating the regression parameters is much simpler than directly inverting the combined log-rank test of Chan and Qin (2015).

Acknowledgments

The authors thank the editor, an associate editor, and a reviewer for their helpful comments, which greatly improved the article. The first and second authors are partially supported by US National Institutes of Health grant R01-HL122212.

Appendix A

We adopt the following regularity conditions:

(A1)
The random variable $\tilde{ε}$ has a bounded density function with bounded derivative.
(A2)
The censoring time C is independent of T conditional on the truncation time A and the covariates X. The density function of C is bounded.
(A3)
The vector of covariates X is bounded.
(A4)
Denote the compact parameter space by ℬ, with β₀ ∈ ℬ. The nonnegative weight functions ϕ₁(t, β) and ϕ₂(t, β) have bounded variation and converge almost surely to $ϕ_{1}^{0} (t, β)$ and $ϕ_{2}^{0} (t, β)$ , respectively, uniformly for β ∈ ℬ. Let ‖·‖₀ denote the supremum norm in a neighborhood ℬ₀ ⊂ ℬ of β₀. We assume $‖ ϕ_{1} (t, β) - ϕ_{1}^{0} (t, β) ‖_{0} = O_{p} (n^{- 1 / 2})$ and $‖ ϕ_{2} (t, β) - ϕ_{2}^{0} (t, β) ‖_{0} = O_{p} (n^{- 1 / 2})$ . Furthermore, $ϕ_{1}^{0} (t, β)$ and $ϕ_{2}^{0} (t, β)$ are differentiable in β, and the derivatives are continuous and uniformly bounded for t ∈ (−∞, ∞) and β ∈ ℬ.
(A5)
The matrices Γ₁(β₀) and Γ₂(β₀) are nonsingular, where
$Γ_{1} (β) = E \int_{- \infty}^{\infty} ϕ_{1}^{0} (u, β) \frac{{\dot{λ}}_{\tilde{ε}} (u)}{λ_{\tilde{ε}} (u)} {\left[ X - \frac{E \{ R^{Y} (u, β) X \}}{E \{ R^{Y} (u, β) \}} \right]}^{\otimes 2} d N^{Y} (u, β),$
and
$Γ_{2} (β) = E \int_{- \infty}^{\infty} ϕ_{2}^{0} (u, β) \frac{{\dot{λ}}_{η} (u)}{λ_{η} (u)} {\left[ X - \frac{E \{ R^{A} (u, β) X \}}{E \{ R^{A} (u, β) \}} \right]}^{\otimes 2} d N^{A} (u, β).$

Footnotes

Supplementary Materials

The proof of Lemma 1, Theorem 1, and Theorem 2 referenced in Section 2, and the R program for data analysis are available with this article at the Biometrics website on Wiley Online Library.

References

Asgharian M, M’Lan CE, Wolfson DB. Length-biased sampling with right censoring: An unconditional approach. Journal of the American Statistical Association. 2002;97:201–209.
Asgharian M, Wolfson DB, Zhang X. Checking stationarity of the incidence rate using prevalent cohort survival data. Statistics in Medicine. 2006;25:1751–1767. doi: 10.1002/sim.2326.
Buckley J, James I. Linear regression with censored data. Biometrika. 1979;66:429–436.
Chan KCG, Qin J. Rank-based testing of equal survivorship based on cross-sectional survival data with or without prospective follow-up. Biostatistics. 2015;16:772–784. doi: 10.1093/biostatistics/kxv011.
Chen YQ. Semiparametric regression in size-biased sampling. Biometrics. 2010;66:149–158. doi: 10.1111/j.1541-0420.2009.01260.x.
Cox C, Chu H, Schneider MF, Muñoz A. Parametric survival analysis and taxonomy of hazard functions for the generalized gamma distribution. Statistics in Medicine. 2007;26:4352–4374. doi: 10.1002/sim.2836.
Fygenson M, Ritov Y. Monotone estimating equations for censored data. The Annals of Statistics. 1994;22:732–746.
Hansen LP. Large sample properties of generalized method of moments estimators. Econometrica: Journal of the Econometric Society. 1982;50:1029–1054.
Huang CY, Qin J. Composite partial likelihood estimation under length-biased sampling, with application to a prevalent cohort study of dementia. Journal of the American Statistical Association. 2012;107:946–957. doi: 10.1080/01621459.2012.682544.
Huang Y. Calibration regression of censored lifetime medical cost. Journal of the American Statistical Association. 2002;97:318–327.
Huang Y. Quantile calculus and censored regression. The Annals of Statistics. 2010;38:1607–1637. doi: 10.1214/09-aos771.
Huang Y. Fast censored linear regression. Scandinavian Journal of Statistics. 2013;40:789–806. doi: 10.1111/sjos.12031.
Jin Z, Lin D, Wei L, Ying Z. Rank-based inference for the accelerated failure time model. Biometrika. 2003;90:341–353.
Lai TL, Ying Z. Rank regression methods for left-truncated and right-censored data. The Annals of Statistics. 1991:531–556.
Li H, Yin G. Generalized method of moments estimation for linear regression with clustered failure time data. Biometrika. 2009;96:293–306.
Lin Y, Chen K. Efficient estimation of the censored linear regression model. Biometrika. 2013;100:525–530.
Mandel M, Ritov Y. The accelerated failure time model under biased sampling. Biometrics. 2010;66:1306–1308. doi: 10.1111/j.1541-0420.2009.01366_1.x.
Mölsä PK, Marttila R, Rinne U. Survival and cause of death in Alzheimer’s disease and multi-infarct dementia. Acta Neurologica Scandinavica. 1986;74:103–107. doi: 10.1111/j.1600-0404.1986.tb04634.x.
Ning J, Qin J, Shen Y. Buckley–James-type estimator with right-censored and length-biased data. Biometrics. 2011;67:1369–1378. doi: 10.1111/j.1541-0420.2011.01568.x.
Ning J, Qin J, Shen Y. Score estimating equations from embedded likelihood functions under accelerated failure time model. Journal of the American Statistical Association. 2014a;109:1625–1635. doi: 10.1080/01621459.2014.946034.
Ning J, Qin J, Shen Y. Semiparametric accelerated failure time model for length-biased data with application to dementia study. Statistica Sinica. 2014b;24:313–333. doi: 10.5705/ss.2011.197.
Parzen M, Wei L, Ying Z. A resampling method based on pivotal estimating functions. Biometrika. 1994;81:341–350.
Qu A, Lindsay BG, Li B. Improving generalised estimating equations using quadratic inference functions. Biometrika. 2000;87:823–836.
Ritov Y. Estimation in a linear regression model with censored data. The Annals of Statistics. 1990:303–328.
Ritov Y, Wellner JA. Censoring, martingales, and the Cox model. Contemporary Mathematics. 1988;80:191–219.
Roberson E, Hesse J, Rose K, Slama H, Johnson J, Yaffe K, et al. Frontotemporal dementia progresses to death faster than Alzheimer disease. Neurology. 2005;65:719–725. doi: 10.1212/01.wnl.0000173837.82820.9f.
Shen Y, Ning J, Qin J. Analyzing length-biased data with semiparametric transformation and accelerated failure time models. Journal of the American Statistical Association. 2009;104:1192–1202. doi: 10.1198/jasa.2009.tm08614.
Tsiatis AA. Estimating regression parameters using linear rank tests for censored data. The Annals of Statistics. 1990;18:354–372.
Vardi Y. Multiplicative censoring, renewal processes, deconvolution and decreasing density: Nonparametric estimation. Biometrika. 1989;76:751–761.
Wang MC. Nonparametric estimation from cross-sectional survival data. Journal of the American Statistical Association. 1991;86:130–143.
Wolfson C, Wolfson DB, Asgharian M, M’Lan CE, Østbye T, Rockwood K, Hogan DF. A reevaluation of the duration of survival after the onset of dementia. New England Journal of Medicine. 2001;344:1111–1116. doi: 10.1056/NEJM200104123441501.
Yamaguchi K. Accelerated failure–time mover–stayer regression models for the analysis of last-episode data. Sociological Methodology. 2003;33:81–110.
Ying Z. Linear rank statistics for truncated data. Biometrika. 1990;77:909–914.
Ying Z. A large sample study of rank estimation for censored regression data. The Annals of Statistics. 1993;21:76–99.
