Quantile Regression Adjusting for Dependent Censoring from Semi-Competing Risks

Ruosha Li; Limin Peng

doi:10.1111/rssb.12063

. Author manuscript; available in PMC: 2016 Jan 1.

Published in final edited form as: J R Stat Soc Series B Stat Methodol. 2014 Mar 21;77(1):107–130. doi: 10.1111/rssb.12063

Quantile Regression Adjusting for Dependent Censoring from Semi-Competing Risks

Ruosha Li ¹, Limin Peng ^2,^*

PMCID: PMC4283952 NIHMSID: NIHMS549451 PMID: 25574152

Summary

In this work, we study quantile regression when the response is an event time subject to potentially dependent censoring. We consider the semi-competing risks setting, where time to censoring remains observable after the occurrence of the event of interest. While such a scenario frequently arises in biomedical studies, most of current quantile regression methods for censored data are not applicable because they generally require the censoring time and the event time be independent. By imposing rather mild assumptions on the association structure between the time-to-event response and the censoring time variable, we propose quantile regression procedures, which allow us to garner a comprehensive view of the covariate effects on the event time outcome as well as to examine the informativeness of censoring. An efficient and stable algorithm is provided for implementing the new method. We establish the asymptotic properties of the resulting estimators including uniform consistency and weak convergence. The theoretical development may serve as a useful template for addressing estimating settings that involve stochastic integrals. Extensive simulation studies suggest that the proposed method performs well with moderate sample sizes. We illustrate the practical utility of our proposals through an application to a bone marrow transplant trial.

Keywords: Copula, Dependent censoring, Quantile regression, Semi-competing risks, Stochastic integral equation

1. Introduction

Quantile regression (Koenker and Bassett, 1978) is gaining increasing attention in survival analysis as a valuable alternative to traditional regression models. It targets a spectrum of conditional quantiles to achieve a comprehensive and robust analysis. Specifically, define the τth conditional quantile of an event time T₁ as Q_T₁(τ|Z) = inf{t : Pr(T₁ ≤ t|Z) ≥ τ}, where τ ∈ (0, 1), Z = (1, Z̃)^T, and Z̃ is a p × 1 covariate vector. A quantile regression model may assume

Q_{T_{1}} (τ ∣ Z) = \exp {Z^{T} β_{0} (τ)}, 0 < τ < 1,

(1)

where β₀(τ) is a vector of unknown regression coefficients representing covariate effects at the τth quantile of the event time. By allowing β₀(τ) to vary with τ, model (1) may accommodate inhomogeneous covariate effects that, for example, differ between subjects of high susceptibility to the event versus those with low risk of the event. Also note that model (1) is a strict extension of the accelerated failure time (AFT) model, which corresponds to model (1) with each component of β₀(τ) being constant except for the intercept. Like with AFT model, quantile regression postulates a direct relationship between the event time and covariates, thereby rendering straightforward physical interpretations. Many research efforts have been devoted to developing methods for quantile regression with survival data subject to independent censoring (Powell 1986; Ying et al. 1995; Portnoy 2003; Peng and Huang 2008; Wang and Wang 2009; Huang 2010; among others). However, the assumption of independent censoring is problematic in many practical situations.

For example, in a multi-center trial of allogenic bone marrow transplants (BMT) in patients with acute leukemia (Copelan et al., 1991; Klein and Moeschberger, 2005), one intermediate endpoint of interest is the development of chronic graft-versus-host disease (GVHD). However, some patients died without experiencing GVHD. The time to GVHD for these subjects was then censored by death, which may correlate with GVHD. On the other hand, the occurrence of GVHD did not prevent the subsequent patient follow-up in this study. As a result, time to death remained observable after the GVHD endpoint. Such a dependent censoring scenario falls into the paradigm of semi-competing risks (Fine et al., 2001), which permits the observation of censoring after the occurrence of the endpoint of interest, and is the focus of this work. Naively conducting competing risks analysis in the BMT example would ignore the observed data on time to death for patients who experienced GVHD during the study and hence incur unnecessary information loss.

In the presence of dependent censoring to T₁, it would be problematic to estimate model (1) by using existing quantile regression methods that assume the time to censoring is independent of T₁. For example, we show via simulation studies that applying Peng and Huang (2008)'s method by treating dependent censoring as independent can incur large biases in the estimation of β₀(τ) (Figure 1). To date, very limited work has been done for quantile regression with dependently censored data. The competing risks quantile regression method by Peng and Fine (2009), like its precursor in one sample setting (Peng and Fine, 2007a), targets the crude conditional quantiles, which are conditional quantiles defined based on the cumulative incidence function, not the marginal distribution function. In practice, quantities based on the marginal distribution of T₁ can also be of interest (Jiang et al., 2003). This consideration motivates the study of quantile regression based on model (1), which formulates covariate effects on Q_T₁(τ|Z), and cannot be addressed by the method of Peng and Fine (2009).

Fig. 1 — Assessment of the robustness of $\hat{β} (τ)$ with data generated from set-up S2.C (the first row) and from set-up S2.F (the second row). Solid lines represent true regression quantiles, dashed lines represent empirical averages of naive estimates, dotted lines represent empirical averages of the proposed estimates based on Clayton's copula, and dashed dotted lines represent empirical averages of the proposed estimates based on Frank's copula.

In this paper, we develop a quantile regression method, which accommodates dependent censoring and renders inference on net conditional quantiles that refer to conditional quantiles defined upon the marginal distribution of the event time of interest. We focus on the semi-competing risks setting, where the censoring event remains observable after the occurrence of the endpoint of interest, as in the BMT example. The new method provides a useful alternative to existing regression approaches for dependently censored data of semi-competing risks structure. For example, it allows for nonconstant covariate effects, which are not permitted by Lin et al. (1996), Peng and Fine (2006), Hsieh et al. (2008), Ding et al. (2009), and Chen (2011). While other varying coefficient models, such as multiplicative hazards model and additive risks model, have been studied for survival data (we refer to Martinussen and Scheike (2006) for a comprehensive coverage), approaches tailored to the semi-competing risks setting are quite limited. One available method is the functional regression model studied by Peng and Fine (2007b), which generalizes the Cox proportional hazards model with varying coefficient incorporated. In contrast, model (1) formulates covariate effects on the quantiles of T₁, offering direct physical interpretations, and thus may be preferred in many practical situations.

In our proposals, we first tackle the issue related to the lack of nonparametric identifiability of the marginal distribution of T₁ (Tsiatis, 1975) by assuming the association between T₁ and its censoring time follows a copula model with unspecified copula parameters. Note that, this is a weaker assumption compared to that usually adopted in the competing risks setting, where the copula model is fully specified (Zheng and Klein 1995; Chen 2010; among others). We construct unbiased estimating equations based on the assumed models, effectively utilizing the semi-competing risks feature of the censoring scheme. Estimation and inference procedures are developed to offer not only analyses of net conditional quantiles but also insights about how censoring is associated with the event of interest. A stable and efficient algorithm is developed for the implementation of the proposed procedures.

Moreover, we establish the asymptotic properties of the proposed estimators despite considerable technical challenges. Our theoretical development provides a useful template for addressing estimating equations that involve the use of stochastic integrals. Via extensive simulations, we show that the proposed method performs well with moderate sample sizes, and are robust to misspecification of the assumed association model. An application to the BMT example illustrates the utility of our proposals, uncovering findings unattainable through traditional survival regression models.

2. Quantile Regression Procedure

2.1. Data and Model

We begin with a formal introduction of data and notation. Let T₁ be time to the endpoint of interest and T₂ be time to dependent censoring. Define Z̃ = (1, Z̃^T)^T as a (p + 1) 1 vector extended from covariates recorded in Z̃. There often involves an administrative censoring time C which is conditionally independent of (T₁, T₂) given Z. Define X = T₁ ∧ T₂ ∧ C, Y = T₂ ∧ C, δ = I(T₁ ≤ T₂ ∧ C), η = I(T₂ ≤ C), where ∧ is the minimum operator and I(·) is the indicator function. Observed data include n identically and independently distributed (i.i.d.) replicates of {X, Y, δ, η, Z}, denoted by ${X_{i}, Y_{i}, δ_{i}, η_{i}, Z_{i}}_{i = 1}^{n}$ .

To address the lack of nonparametric identifiability of the net conditional quantiles Q_T₁(τ|Z) in the presence of dependent censoring by T₂ (Tsiatis, 1975), we impose additional assumptions| on the association structure between T₁ and T₂. Such a strategy has been widely adopted in previous work on dependent censoring. Specifically, we adopt a copula model that assumes

\Pr (T_{1} > s, T_{2} > t ∣ Z) = Ψ {1 - F_{1} (s ∣ Z), 1 - F_{2} (t ∣ Z), g ({\overset{‒}{Z}}^{T} r_{0})},

(2)

where F_i(t | Z) = Pr(T_i ≤ t | Z) (i = 1, 2), $\overset{‒}{Z} = {(1, {\tilde{Z}}_{0}^{T})}^{T}$ with Z̃₀ being a sub-vector of Z̃, or Z̃ itself, and Ψ is a known function. For a given parameter θ, Ψ(·, ·, θ) is a copula function, for example, Clayton's copula (Clayton, 1978) or Frank's copula (Genest, 1987). In model (2), r₀ is an unknown parameter, characterizing the dependent censoring mechanism that may vary according to Z̄. When Z̄ = 1, the association between T₁ and T₂ is assumed to be homogeneous for all subjects. The copula parameter specified as g(Z̄^Tr₀) is often closely connected to a measure that characterizes the strength of the association between T₁ and T₂, where g(·) is a known function. For example, under the Clayton's copula model, where Ψ(u, v, a) = (u^–a + v^–a – 1)^–1/a and a ≥ 0, a/(a + 2) equals the Kendall's tau coefficient (Kendall and Gibbons, 1962). In this case, one may select g(x) = exp(x), which accounts for the fact that a needs to be non-negative.

While model (2) allows identifying the net conditional quantiles of T₁, the estimation of model (1) is further facilitated by the fact that T₂ is only subject to independent censoring by C, and therefore standard censored regression techniques can be applied to estimate F₂(t|Z). Here, we assume that the regression model for T₂ takes the same form as that for T₁:

Q_{T_{2}} (τ ∣ Z) = \exp {Z^{T} α_{0} (τ)}, τ \in (0, 1),

(3)

where α₀(τ) is a (p + 1) × 1 vector of regression coefficients, which may be estimated by using Peng and Huang (2008)'s approach. Denote the resulting estimator as $\hat{α} (\cdot)$ . Note that the modeling in (1)–(3) accommodates situations where T₁ and T₂ are influenced by different sets of covariates. This can be achieved by letting Z include all covariates that impact T₁ or T₂ and then appropriately setting zero coefficients in β₀(τ) and α₀(τ). In practice, one may consider a different model for T₂ when there exists strong evidence for the inadequacy of model (3). In principle, model (3) may be replaced by any model that can generate “good” estimates for Q_T₂(τ|Z) with randomly censored data. The resulting change in the estimating equations would be substituting $\exp {Z_{i}^{T} α (τ)}$ with the estimate for Q_T₂(τ|Z). In the sequel, models (1)–(3) are assumed without further mentioning.

It is interesting to note that the model studied here reduces to a bivariate location-shift model, when Z̄ = 1 and both α₀(τ) and β₀(τ) are constant except in the intercept terms. The method proposed below will provide a valuable complement with enhanced flexibility and interpretability to the regression methods by Lin et al. (1996) and Peng and Fine (2006).

2.2. Estimating Equations

Our proposal for estimating β₀(τ) and r₀ is based on the following two equalities:

\Pr (X > t ∣ Y > t, Z) = \frac{\Pr (T_{1} > t, T_{2} > t ∣ Z)}{\Pr (T_{2} > t ∣ Z)} = K_{A} {F_{1} (t ∣ Z), F_{2} (t ∣ Z), g ({\overset{‒}{Z}}^{T} r_{0})},

(4)

and for s ≤ t,

\Pr (X \leq s ∣ Y > t, Z) = \frac{\Pr (T_{1} \leq s, T_{2} > t ∣ Z)}{\Pr (T_{2} > t ∣ Z)} = K_{B} {F_{1} (s ∣ Z), F_{2} (t ∣ Z), g ({\overset{‒}{Z}}^{T} r_{0})},

(5)

where K_A(u, v, θ) = Ψ(1 – u, 1 – v, θ)/(1 – v) and K_B(u, v, θ) = {1 – v, – Ψ(1 – u, 1 – v, θ}/(1 – v).

It is suggested by (4) and (5) that the independent censoring by C may be handled through conditioning X > t and X ≤ s on Y > t with s ≤ t. Note that (4) is derived based on the assumed joint distribution of (T₁, T₂) on the diagonal line, while (5) utilizes the information on the upper wedge of (T₁, T₂). Therefore, these two equalities well capture the semi-competing risks feature of the observed data.

To construct estimating equations for β₀(τ) and r₀ based on (4) and (5), we need to further bridge α₀(·) and β₀(·) with the distribution functions of T₁ and T₂. This may be done based on the facts that $F_{1} [\exp {Z^{T} β_{0} (τ)} ∣ Z] = τ$ for any τ ∈ (0, 1), and

F_{2} (t ∣ Z) = \int_{0}^{1} I {F_{2} (t ∣ Z) \geq u} d u = \int_{0}^{1} I [t \geq \exp {Z^{T} α_{0} (u)}] d u .

(6)

Motivated by (4), (5) and (6), we consider the following two quantities:

P_{i}^{0} (β, α, r, τ) = I {\log X_{i} > Z_{i}^{T} β (τ)} - I {\log Y_{i} > Z_{i}^{T} β (τ)} \times K_{A} {τ, \int_{0}^{1} I {Z_{i}^{T} β (τ) \geq Z_{i}^{T} α (u)} d u, g ({\overset{‒}{Z}}_{i}^{T} r)},

(7)

Q_{i}^{0} (β, α, r, τ) = \int_{0}^{\infty} I {Z_{i}^{T} β (τ) \leq \log t < \log Y_{i}} \times (I {\log X_{i} \leq Z_{i}^{T} β (τ)} - K_{B} {τ, \int_{0}^{1} I {\log t \geq Z_{i}^{T} α (u)} d u, g ({\overset{‒}{Z}}_{i}^{T} r)}) d t .

(8)

In the above definitions, the β and α in $P_{i}^{0} (β, α, r, τ)$ and $Q_{i}^{0} (β, α, r, τ)$ stand for functions defined from [0, 1] to R^p+1. The same rule applies to other similar notation without further mentioning. Under the assumed models (1)–(3), we have $E {P_{i}^{0} (β_{0}, α_{0}, r_{0}, τ) ∣ Z} = 0$ . This can be seen from the equality E{I(X > t) – I(Y > t) Pr(X > t|Y > t, Z)|Z} = 0 with (4) plugged in.

Moreover, we can show $E {\int_{τ} Q_{i}^{0} (β_{0}, α_{0}, r_{0}, τ) d τ ∣ Z_{i}} = 0$ by noting that

E [\int_{s} \int_{t} I (s \leq t, t < Y) \times {I (X \leq s) - \Pr (X \leq s ∣ Y > t, Z)} d t d s ∣ Z] = 0,

and applying the monotone variable transformation s = exp Z^Tβ₀(τ) . The double integration with respect to s and t reflects a good use of the data in the whole upper wedge of (T₁, T₂).

Based on the above results, one may consider the following estimating equations:

n^{- 1 ∕ 2} \sum_{i = 1}^{n} Z_{i} P_{i}^{0} (β, \hat{α}, r, τ) = 0, n^{- 1 ∕ 2} \sum_{i = 1}^{n} \int_{τ_{a}}^{τ_{b}} {\overset{‒}{Z}}_{i} Q_{i}^{0} (β, \hat{α}, r, τ) d τ = 0,

(9)

where τ_a and τ_b are some prespecified constants in (0, 1). Please note, in the absence of censoring (i.e. Y_i = ∞ for all i), under an independent copula model (i.e. Ψ(x, y, θ) = xy), the first estimating equation in (9) would reduce to $n^{- 1 ∕ 2} \sum_{i = 1}^{n} Z_{i} [I {\log X_{i} > Z_{i}^{T} β (τ)} - τ] = 0$ , which is the same as that presented in Ying et al. (1995) for the case without censoring. The estimating function in this special case equals ( $- 1 ∕ \sqrt{n}$ ) times the “pseudo” derivative of the classical objective function for defining sample regression quantiles (Koenker, 2005), $\sum_{i = 1}^{n} ρ_{τ} (\log X_{i} - Z_{i}^{T} b)$ , where ρ_τ(x) = x{τ – I(x < 0)}. Such a connection may help understand the inclusion of Z_i in (9).

There are some potential issues with the equations in (9). More specifically, with T₂ subject to the independent censoring from C, α₀(τ) may not be identifiable for some τ close to 1 (Peng and Huang, 2008). In this case, problems would arise because $P_{i}^{0} (β, \hat{α}, r, τ)$ and $Q_{i}^{0} (β, \hat{α}, r, τ)$ requires estimating α₀(τ) for all τ ∈ (0, 1), which may not be feasible due to the lack of identifiability of α₀(τ) in the upper tail of τ. To circumvent this issue, we modify the equations in (9) based on the idea of “truncating” $P_{i}^{0} (β_{0}, α_{0}, r_{0}, τ)$ and $Q_{i}^{0} (β_{0}, α_{0}, r_{0}, τ)$ conditioning on $Z_{i}^{T} β_{0} (τ) \leq Z_{i}^{T} α_{0} (τ_{U, 2})$ and $t \leq \exp {Z_{i}^{T} α_{0} (τ_{U, 2})}$ respectively, where τ_U,2 is an upper bound of a τ-range, in which α₀(τ) is identifiable. Some empirical rule for selecting τ_U,2 is provided in Peng and Huang (2008). Applying this truncation idea leads to the consideration of

P_{i} (β, α, r, τ) = I {\log X_{i} > Z_{i}^{T} β (τ)} - I {\log Y_{i} > Z_{i}^{T} β (τ)} \times K_{A} {τ, \int_{0}^{τ_{U, 2}} I {Z_{i}^{T} β (τ) \geq Z_{i}^{T} α (u)} d u, g ({\overset{‒}{Z}}_{i}^{T} r)},

which involves α(u) for u up to τ_U,2 instead of 1. We have

Q_{i} (β, α, r, τ) = \int_{t \in (0, \infty)} I {Z_{i}^{T} β (τ) \leq \log t \leq Z_{i}^{T} α (τ_{U 2}) \land \log Y_{i}} \times (I {\log X_{i} \leq Z_{i}^{T} β (τ)} - K_{B} [τ, \int_{0}^{τ_{U 2}} I {\log t \geq Z_{i}^{T} α (u)} d u, g ({\overset{‒}{Z}}_{i}^{T} r)]) d t,

for τ ∈ (0, 1). The first equality holds because $\int_{0}^{τ_{U, 2}} I {Z_{i}^{T} β_{0} (τ) \geq Z_{i}^{T} α_{0} (u)} d u = \int_{0}^{1} I {Z_{i}^{T} β_{0} (τ) \geq Z_{i}^{T} α_{0} (u)} d u$ when $Z_{i}^{T} β_{0} (τ) \leq Z_{i}^{T} α_{0} (τ_{U, 2})$ , and the second and the third equalities follow directly from (4) and (6). With similar arguments, we can also show that the integrand in Q_i(β₀, α₀, r₀, τ) has expectation 0, and hence

E (I {Z_{i}^{T} β_{0} (τ) \leq Z_{i}^{T} α_{0} (τ_{U, 2})} P_{i} (β_{0}, α_{0}, r_{0}, τ)) = E (I {Z_{i}^{T} β_{0} (τ) \leq Z_{i}^{T} α_{0} (τ_{U, 2})} [I {\log X_{i} > Z_{i}^{T} β_{0} (τ)} - I {\log Y_{i} > Z_{i}^{T} β_{0} (τ)} \times K_{A} {τ, \int_{0}^{1} I {Z_{i}^{T} β_{0} (τ) \geq Z_{i}^{T} α_{0} (u)} d u, g ({\overset{‒}{Z}}_{i}^{T} r_{0})}]) = E (I {Z_{i}^{T} β_{0} (τ) \leq Z_{i}^{T} α_{0} (τ_{U, 2})} [I {\log X_{i} > Z_{i}^{T} β_{0} (τ)} - I {\log Y_{i} > Z_{i}^{T} β_{0} (τ)} \times K_{A} (F_{1} [\exp {Z_{i}^{T} β_{0} (τ)} ∣ Z], F_{2} [\exp {Z_{i}^{T} β_{0} (τ)} ∣ Z], g ({\overset{‒}{Z}}_{i}^{T} r_{0}))]) = 0,

(10)

By (10) and (11), we propose the following estimating equations:

E Q_{i} (β_{0}, α_{0} . r_{0}, τ) = 0 .

(11)

where

n^{1 ∕ 2} S_{n} (β, \hat{α}, r, τ) = 0, n^{1 ∕ 2} \int_{τ_{a}}^{τ_{b}} W_{n} (β, \hat{α}, r, τ) d τ = 0,

(12)

Taking into consideration the identifiability of β₀(τ) due to censoring, we restrict our attention to β₀(τ) with τ ∈ (0, τ_U,1], where τ_U,1 is a constant less than 1 subject to certain regularity conditions given in Appendix A. In practice, τ_U,1 and τ_U,2 may be selected in an adaptive manner as suggested by Peng and Huang (2008) for randomly censored data. We suggest choosing [τ_a, τ_b) such that the interval covers most of (0, τ_U,1] but stays away from the boundaries. The proposed estimators of β₀(τ) and r₀ can be obtained as the solutions to the equations in (12), denoted by $\hat{β} (τ)$ and r̂ respectively.

Remark: To computre $Q_{i} (β, \hat{α}, r, τ)$ , one only needs to evaluate the integration over $t \in (0, \max_{i = 1}^{n} Y_{i} \land \exp {Z_{i}^{T} \hat{α} (τ_{U, 2})})$ . When β(·) is a cadlag function, the integrand in $Q_{i} (β, \hat{α}, r, τ)$ is a piecewise constant function t, and hence $Q_{i} (β, \hat{α}, r, τ)$ can be calculated exactly.

2.3 Computational Algorithm

We developed an efficient iterative algorithm to jointly estimate β₀(·) and r₀ based on (12). In the sequel, we may use β as a shorthand for β(τ) and α for α(τ). Similarly, we may write $\int_{τ_{a}}^{τ_{b}} W_{n} (β, \hat{α}, r, τ) d τ$ as $W_{n} (β, \hat{α}, r)$ for brevity.

First, let $G_{L (n)} = {0 < τ_{0} = τ_{1} = \dots < τ_{L (n)} = τ_{U, 1} < 1}$ be a prespecified grid on [0, 1], and $‖ G_{L (n)} ‖ = \max {τ_{j} - τ_{j - 1}; j = 1, 2 \dots, L (n)}$ be the size of the grid. For any given r, let $\hat{β} (τ; r)$ , with a shorthand $\hat{β} (r)$ , denote a right-continuous step function of τ which jumps only on $G_{L (n)}$ and satisfies $n^{1 ∕ 2} S_{n} {\hat{β} (r), \hat{α}, r, τ} = o (1)$ for $τ \in G_{L (n)}$ . The detailed algorithm follows.

Step A Estimate ${\hat{α} (τ), τ \in (0, τ_{U, 2}]}$ using Peng and Huang (2008)'s method.

Step B Set k = 0 and choose an initial value for $\hat{β}$ and r̂, denoted by ${\hat{β}}^{[k]}$ and r̂^[k].

Step C Update ${\hat{β}}^{[k]}$ with ${\hat{β}}^{[k + 1]} = \hat{β} ({\hat{r}}^{[k]})$ .

Step D Solve r̂^[k] from $n^{1 ∕ 2} W_{n} ({\hat{β}}^{[k]}, \hat{α}, r) = 0$ . Then increase k by 1 and go to Step C until certain convergence criteria are satisfied.

In practice, ${\hat{β}}^{[0]}$ may be chosen as the naive estimate obtained by treating (X, δ) as independently censored survival data and employing Peng and Huang (2008)'s method, and r̂⁽⁰⁾ can be chosen as the r that correspond to the Kendall's tau coefficient between X_i and Y_i, i = 1, 2, ..., n. Since $n^{1 ∕ 2} W_{n} ({\hat{β}}^{[k]}, \hat{α}, r)$ is a smooth function of r, the root-finding in Step D can be easily implemented with existing statistical functionalities, like the optim function in R. To solve $n^{1 ∕ 2} S_{n} (b, \hat{α}, {\hat{r}}^{[k]}, τ) = 0$ in Step C, we propose to obtain ${\hat{β}}^{[k + 1]} (τ) \equiv \hat{β} (τ; r^{[k]})$ through the following iterative procedure:

C.0 Set m = 0 and let ${\hat{β}}^{[k + 1, m]} (τ) = {\hat{β}}^{[k]} (τ)$ .

C.1 Find ${\hat{β}}^{[k + 1, m + 1]} (τ)$ by solving

\begin{matrix} S_{n} (β, α, r, τ) = & n^{- 1} \sum_{i = 1}^{n} Z_{i} I {Z_{i}^{T} β (τ) \leq Z_{i}^{T} α (τ_{U, 2})} P_{i} (β, α, r, τ), \\ W_{n} (β, α, r, τ) = & n^{- 1} \sum_{i = 1}^{n} {\overset{‒}{Z}}_{i} Q_{i} (β, α, r, τ) . \end{matrix}

for b, where ${\hat{β}}_{i}^{[m]} (τ) = I {Z_{i}^{T} {\hat{β}}^{[k + 1, m]} (τ) \leq Z_{i}^{T} \hat{α} (τ_{U, 2})}$ and

n^{- 1 ∕ 2} \sum_{i = 1}^{n} Z_{i} {\hat{B}}_{i}^{[m]} (τ) {I (\log X_{i} > Z_{i}^{T} b) - {\hat{A}}_{i}^{[m]} (τ)} = 0

(13)

C.2 Increase m by 1 and go to C.1 until certain convergence criteria are satisfied.

It can be shown that the estimating function in (13) is a monotone random field in b (Fygenson and Ritov, 1994), and equals the derivative of the following L₁-type function

{\hat{A}}_{i}^{[m]} (τ) = I {\log Y_{i} > Z_{i}^{T} {\hat{β}}^{[k + 1, m]} (τ)} K_{A} [τ, \int_{0}^{τ_{U, 2}} I {Z_{i}^{T} {\hat{β}}^{[k + 1, m]} (τ) \geq Z_{i}^{T} \hat{α} (u)} d u, g ({\overset{‒}{Z}}_{i}^{T} {\hat{r}}^{[k]})] .

where M is an extremely large number that can bound $∣ \sum_{i = 1}^{n} {\hat{β}}_{i}^{[m]} (τ) Z_{i}^{T} b ∣$ and $∣ 2 \sum_{i = 1}^{n} {\hat{β}}_{i}^{[m]} (τ) {\hat{A}}_{i}^{[m]} (τ) Z_{i}^{T} b ∣$ from the above. Such a L₁–type minimization problem can be readily implemented in the rq function in R or the l1fit function in Splus. In Section 3, we show that the proposed algorithm can produce fast and stable implementation of the proposed estimation method. More details of the algorithm, including convergence criteria adopted for iterations, are provided in Appendix D.

2.4. Asymptotic Results

In this subsection we outline the asymptotic properties of the proposed estimators. In the below, Theorem 1 states the consistency of r̂ and the uniform consistency of $\hat{β} (τ)$ , and Theorem 2 gives the results on the limiting distribution of n^1/2(r̂ – r₀) and $n^{1 ∕ 2} {\hat{β} (τ) - β_{0} (τ)}$ . In order to to establish two theorems, we need regularity conditions C1-C5, the details of which are relegated to Appendix A.

THEOREM 1. Under conditions C1-C5, if $\lim_{n \to \infty} ‖ G_{L (n)} ‖ = 0$ , then there exists ${\hat{β} (τ), \hat{r}}$ in a neighborhood of {β₀(τ), r₀}, such that $\sup_{τ \in [ν_{1}, τ_{U, 1}]} ‖ \hat{β} (τ) - β_{0} (τ) ‖ \overset{p}{\to} 0$ , for any 0 < ν₁ < τ_U,1 and $\hat{r} \overset{p}{\to} r_{0}$ .

THEOREM 2. Under conditions C1-C5, if $\lim_{n \to \infty} n^{1 ∕ 2} ‖ G_{L (n)} ‖ = 0$ , then n^1/2(r̂ – r₀) converges in distribution to a Normal distribution with mean 0, and $n^{1 ∕ 2} {\hat{β} (τ) - β_{0} (τ)}$ converges weakly to a mean 0 Gaussian process for τ ∈ [ν₁, τ_U,1], where 0 < ν₁ < τ_U,1.

The proofs for Theorems 1–2 involve extensive use of empirical process theory and stochastic integrals. In the following we sketch the major steps taken to prove the theorems, which can provide a useful template for asymptotic studies in similar settings. To this end, we need to introduce additional notation. Let s(β, α, r, τ) = E{S_n(β, α, r, τ)}, w(β, α, r, τ) = E{W_n(β, α, r, τ)}, and $w (β, α, r) = \int_{τ_{a}}^{τ_{b}} w (β, α, r, τ) d τ$ .

To prove the consistency in Theorem 1, we first show that $\int_{0}^{τ_{U, 2}} I [t \geq \exp {z^{T} \hat{α} (u)}] d u$ provides a uniformly consistent estimate for $F_{2} (t ∣ z) \land τ_{U, 2}$ . By this result, applications of empirical process theory lead to $\sup_{τ \in (0, τ_{U, 1}]} s {\hat{β} (r), α_{0}, r, τ} = o_{p} (1)$ for any fixed $r \in R (d_{R})$ , which implies the uniform consistency of $\hat{β} (τ; r)$ to $\tilde{β} (τ; r)$ , the solution of s(β, α₀, r, τ) = 0. Similarly, we can show $‖ \tilde{w} (\hat{α}, r) - \tilde{w} (α_{0}, r) ‖ \to_{p} 0$ , where $\tilde{w} (α, r) = w (\tilde{β} (r), α, r}$ , and then w̃(α₀, r̂) = o_p(1), which implies the consistency of r̂. Next, we show that $\tilde{β} (τ; r)$ has a bounded derivative with respect to r at r = r₀. This, coupled with the consistency of r̂, gives $\sup_{τ \in (ν, τ_{U, 1}]} ‖ \tilde{β} (τ; \hat{r}) - \tilde{β} (τ; r_{0}) ‖ = o_{p} (1)$ , and hence the uniform consistency of $\hat{β} (τ) \equiv \hat{β} (τ; \hat{r})$ .

Establishing the asymptotic normality of r̂ and the weak convergence of β(τ) requires considerable efforts to address difficulties due to the non-smoothness of the estimating functions and also the complicated entanglement between $\hat{β} (τ) \equiv \hat{β} (τ; \hat{r})$ and r̂. A brief outline of our proof follows. First, we establish the weak convergence of $n^{1 ∕ 2} [s {\hat{β} (r_{0}), α_{0}, r_{0}, τ} - s {β_{0}, α_{0}, r_{0}, τ}]$ and that of $n^{1 ∕ 2} W_{n} {\hat{β} (r_{0}), \hat{α}, r_{0}}$ . The former result entails a uniform i.i.d. representation of $n^{1 ∕ 2} {\hat{β} (τ; r_{0}) - β_{0} (τ)}$ . Secondly, we show the asymptotic linearity of $n^{1 ∕ 2} {\hat{β} (τ; \hat{r}) - \hat{β} (τ; r_{0})}$ with respect to n^1/2(r̂ – r₀). This result facilitates deriving an i.i.d. representation of n^1/2(r̂ – r₀) from the arguments for the weak convergence of $n^{1 ∕ 2} W_{n} {\hat{β} (r_{0}), \hat{α}, r_{0}}$ . Lastly, nothing that $n^{1 ∕ 2} {\hat{β} (τ) - β_{0} (τ)} = n^{1 ∕ 2} {\hat{β} (τ; \hat{r}) - \hat{β} (τ; r_{0}) + \hat{β} (τ; r_{0}) - β_{0} (τ)}$ , the asymptotic linearization of $n^{1 ∕ 2} {\hat{β} (τ; \hat{r}) - \hat{β} (τ; r_{0})}$ , coupled with the i.i.d. representations of $n^{1 ∕ 2} {\hat{β} (τ; r_{0}) - β_{0} (τ)}$ and n^1/2(r̂ – r₀), renders the weak convergence of $n^{1 ∕ 2} {\hat{β} (τ) - β_{0} (τ)}$ . The detailed proofs of Theorems 1–2 are provided in Appendices B–C.

2.5. Other Inferences

Due to the complexity involved in the limit distribution of $\hat{β} (τ)$ and r̂, we propose to use a bootstrap method for the inference on β₀(τ) and r₀. Specifically, we employ the paired-bootstrap scheme (Efron, 1979) and apply the algorithm presented in Section 2.3 to each of the B bootstrapped datasets to acquire ${{\hat{α}}_{b}^{*}, {\hat{β}}_{b}^{*}, {\hat{r}}_{b}^{*}}_{b = 1}^{B}$ . For a fixed τ ∈ (0, τ_U,1], one may use the sample variance of ${{\hat{β}}_{b}^{*} (τ)}_{b = 1}^{B}$ to estimate the asymptotic variance of $\hat{β} (τ)$ . Similarly, the variance of r̂ can be approximated by the sample variance of ${{\hat{r}}_{b}^{*}}_{b = 1}^{B}$ . The confidence intervals for β₀(τ) and r₀ can be constructed using normal approximation, or with the empirical percentiles of the bootstrap estimates.

We can also perform second stage inferences for exploring the varying pattern of covariate effects over τ ∈ [l, u], where [l, u] ⊂ (0, τ_U,1] denotes a prespecified interval of interest. For example, one may let [l, u] = [τ_a, τ_b]. First, we define a trimmed mean effect, $Φ_{1, q} = {\int_{l}^{u} β_{0}^{(q)} (τ) d τ} ∕ (u - l)$ , to summarize the effect of Z^(q). Here and hereafter, we use u^(q) to denote the qth component of a vector u. As a natural estimate for Φ_1,q, ${\hat{Φ}}_{1, q} = {\int_{l}^{u} {\hat{β}}^{(q)} (τ) d τ} ∕ (u - l)$ can be shown to be consistent and asymptotically normal under mild assumption on the functional form of β₀(·). The limit distribution of ${\hat{Φ}}_{1, q}$ can be estimated by the empirical distribution of the bootstrap realizations, ${{\int_{l}^{u} {\hat{β}}_{b}^{* (q)} (τ) d τ} ∕ (u - l)}_{b = 1}^{B}$ . This result can be easily extended to a Wald-type or percentile-based test based on ${\hat{Φ}}_{1, q}$ for evaluating whether a covariate Z^(q) (2 ≤ q ≤ p + 1) has a significant effect on a range of quantiles with τ ∈ [l, u], namely, H₀₁ : β^(q)(τ) = 0, τ ∈ [l, u]. In practice, one may be interested in accessing whether the effect of a covariate Z^(q) is constant for τ ∈ [l, u]. The null hypothesis may be formulated as $H_{02} : β_{0}^{(q)} (τ) = c_{0}, τ \in [l, u]$ , where c₀ is an unspecified constant. With a predetermined nonconstant weight function Ξ(τ) satisfying $\int_{l}^{u} Ξ (τ) d τ = 1$ , the test statistic can be constructed as $T_{2, q} = \int_{l}^{u} Ξ (τ) {\hat{β}}^{(q)} (τ) d τ - {\hat{Φ}}_{1, q}$ , the distribution of which can be approximated by the empirical distribution of ${\int_{l}^{u} Ξ (τ) {\hat{β}}_{b}^{* (q)} (τ) d τ - {\int_{l}^{u} {\hat{β}}_{b}^{* (q)} (τ) d τ} ∕ (u - l)}_{b = 1}^{B}$ . Wald-type or percentile-based hypothesis testing can be performed accordingly. Rejection of H₀₂ indicates that the effect of Z^(q) may not be constant across the quantiles. Justifications of the presented second-stage inferences would follow similar lines of Peng and Huang (2008) and thus are omitted in the paper

3. Simulation Studies

We evaluated the finite-sample performance of the proposed estimators via extensive Monte Carlo simulations. Let Z = (1, Z₁, Z₂)^T, where Z₁ ~ Unif(0, 1) and Z₂ ~ Bernoulli(0.5). The model used for generating T₁ is log T₁ = b₁Z₁ + b₂Z₂ + ε₁, where the error term ε₁ is Normal(0, 0.25²) when Z₂ = 0, and Normal(0, 0.5²) when Z₂ = 1. With this heteroscedastic error structure, the underlying regression quantile $β_{0} (τ) = {β_{0}^{(1)} (τ), β_{0}^{(2)} (τ), β_{0}^{(3)} (τ)}$ , where $β_{0}^{(1)} (τ) = Q n o r m (τ, 0, {0.25}^{2})$ , $β_{0}^{(2)} (τ) = b_{1}$ and $β_{0}^{(3)} (τ) = Q n o r m (τ, 0, {0.5}^{2}) - Q n o r m (τ, 0, {0.25}^{2}) + b_{2}$ . Here Qnorm(τ, μ, σ²) denotes the τ-th quantile of a N(μ, σ²) variable. Notice that both $β_{0}^{(1)} (τ)$ and $β_{0}^{(2)} (τ)$ vary with τ while $β_{0}^{(2)} (τ)$ is constant. We generated the dependent censoring time T₂ from a log-linear model with i.i.d. errors, log T₂ = a₁Z₁ + a₂Z₂ + ε₂, where ε₂ ~ Normal(μ₂, 0.5²). For the association structure between T₁ and T₂, we considered two types of copulas: the Clayton's copula and the Frank's copula. The bivariate survival function Pr(T₁ > s, T₂ > t | Z) is given by

- \frac{n^{- ∕ 1 ∕ 2}}{2} [\sum_{i = 1}^{n} {\hat{B}}_{i}^{[m]} (τ) ∣ \log (X_{i}) - Z_{i}^{T} b ∣ + ∣ M - \sum_{i = 1}^{n} {\hat{B}}_{i}^{[m]} (τ) Z_{i}^{T} b ∣ + ∣ M + 2 \sum_{i = 1}^{n} {\hat{B}}_{i}^{[m]} (τ) {\hat{A}}_{i}^{[m]} (τ) Z_{i}^{T} b ∣],

(14)

when the Clayton's copula is adopted, and by

{\Pr {(T_{1} > s ∣ Z)}^{- \exp (r_{c})} + \Pr (T_{2} > t ∣ Z)^{- \exp (r_{c})} - 1}^{- 1 ∕ \exp (r_{c})}

when the Frank's copula is adopted. For a fixed Z, the Kendall's tau coefficient between T₁ and T₂ equals exp(r_c)/{exp(r_c) + 2} under the Clayton's model, and equals 1 + 4 {D₁(r_f) – 1}/r_f under the Frank's model, where $D_{1} (r) = [\int_{0}^{r} t ∕ {\exp (t) - 1} d t] r$ . We set r_c = log(2) and r_f = 5.75 so that the corresponding Kendall's tau coefficients equal 0.5 under both copula models. The independent censoring time C was set to follow Unif(0, U_C). We considered different combinations of μ₂, (a₁, a₂), (b₁, b₂) and U_C, which led to the 4 set-ups presented in Table 1.

Table 1.

Summary of simulation setups, with P₀₀ = P(δ = 0, η = 0), P₀₁ = P(δ = 0, η = 1), P₁₀ = P(δ = 1, η = 0) and P₁₁ = P(δ = 1, η = 1).

Setup	Copula	μ ₂	(a₁, a₂)	(b₁, b₂)	U_C	P ₀₀	P ₀₁	P ₁₀	P ₁₁
S1.C	Clayton	0.1	(0.4, 0.2)	(0, 0)	18	0.06	0.15	0.04	0.76
S1.F	Frank	0.1	(0.4, 0.2)	(0, 0)	18	0.06	0.15	0.04	0.76
S2.C	Clayton	0.0	(0.3, –0.25)	(–0.4, 0)	3.5	0.24	0.24	0.09	0.43
S2.F	Frank	0.0	(0.3, –0.25)	(–0.4, 0)	3.5	0.23	0.24	0.10	0.43

Open in a new tab

Under each set-up, we generated 1000 simulated datasets and implemented the proposed numerical algorithm to obtain $\hat{β} (τ)$ and r̂. We chose sample size n = 200, grid size $‖ G_{L (n)} ‖ = 0.01$ , and bootstrap sample size B = 150. We set (τ_a, τ_b) = (0.1, 0.75) under S1.C and S2.F, and let (τ_a, τ_b) = (0.1, 0.65) under S2.C and S2.F, which involve heavier censoring. The simulation results on $\hat{β} (τ)$ at different τ's are summarized in Table 2. We report the empirical biases (EBias), the empirical standard errors (ESD), and the average resampling-based standard errors (ASD) for $\hat{β} (τ)$ , as well as the empirical coverage probabilities of the 95% Wald-type (ECP_W) and percentile-based (ECP_P) confidence intervals of β₀(τ). It is shown that $\hat{β} (τ)$ has small biases under all set-ups, and the bootstrap standard errors closely match the empirical standard errors. Both the Wald-type and the percentile-based confidence intervals achieve empirical coverage probabilities that are close to the nominal level. It is observed that the percentile-based confidence intervals may perform better than the Wald-type intervals when τ is small.

Table 2.

Summary of simulation results on $\hat{β} (τ)$ , which includes the empirical biases (×10³), empirical standard errors (×10³), averages of resampling based standard error estimates (×10³), and empirical coverages (%) of 95% wald-type and percentile-based confidence intervals.

τ	EBias(×10³)			ESD(×10³)			ASD(×10³)			ECP_W			ECP_P
	${\hat{β}}^{(0)}$	${\hat{β}}^{(1)}$	${\hat{β}}^{(2)}$	${\hat{β}}^{(0)}$	${\hat{β}}^{(1)}$	${\hat{β}}^{(2)}$	${\hat{β}}^{(0)}$	${\hat{β}}^{(1)}$	${\hat{β}}^{(2)}$	${\hat{β}}^{(0)}$	${\hat{β}}^{(1)}$	${\hat{β}}^{(2)}$	${\hat{β}}^{(0)}$	${\hat{β}}^{(1)}$	${\hat{β}}^{(2)}$
Set-up S1.C
0.1	4	2	2	100	162	99	102	167	105	93.9	94.6	94.5	93.8	95.9	95.5
0.2	5	–4	1	89	143	86	94	151	92	93.9	94.5	95.0	94.5	95.9	95.7
0.3	4	–2	2	86	138	84	90	144	88	94.3	94.5	94.9	95.9	96.2	96.2
0.4	4	–3	2	83	133	82	87	140	87	95.1	95.7	95.5	95.5	95.9	96.2
0.5	4	–2	3	81	131	83	86	138	87	94.7	94.7	95.9	95.2	96.2	95.4
0.6	4	–1	3	81	131	86	86	139	88	94.3	94.9	95.4	95.0	95.6	95.0
0.7	5	–4	4	82	134	87	87	142	92	94.6	95.3	94.5	95.1	94.8	95.5
Set-up S1.F
0.1	13	–10	4	101	159	99	102	164	101	93.4	95.0	93.9	94.1	96.5	95.3
0.2	11	–8	3	86	136	86	89	142	87	94.5	96.3	94.5	94.3	96.6	96.0
0.3	11	–9	4	75	121	78	81	131	81	95.1	96.5	94.2	95.2	96.2	95.2
0.4	10	–7	1	71	115	75	77	124	78	96.0	96.2	95.2	96.5	97.2	95.0
0.5	11	–9	2	69	111	74	74	121	78	95.3	96.4	95.7	95.7	97.1	95.3
0.6	11	–9	2	66	111	76	74	120	78	94.8	96.5	95.5	95.1	96.9	95.6
0.7	12	–11	2	70	117	79	75	124	81	94.3	94.6	95.4	94.7	95.1	95.7
Set–up S2.C
0.1	9	–8	9	106	171	113	112	180	113	94.4	95.6	92.7	94.6	95.7	93.3
0.2	6	–1	5	97	157	106	110	171	110	96.3	95.9	94.9	96.3	97.3	96.2
0.3	10	–8	5	97	152	108	109	168	112	95.7	95.8	95.3	96.8	97.2	95.9
0.4	10	–9	6	100	156	110	107	167	117	96.3	95.7	95.8	95.9	97.3	96.0
0.5	13	–10	6	95	151	116	107	168	124	95.4	95.8	95.6	95.8	96.5	95.4
0.6	15	–14	13	97	154	126	107	170	131	95.7	95.5	94.7	95.4	96.2	95.6
Set-up S2.F
0.1	26	–30	13	125	191	118	114	179	115	88.6	91.8	91.0	88.8	93.4	93.2
0.2	18	–21	18	108	164	111	108	167	111	92.6	94.8	93.4	92.3	94.8	94.0
0.3	21	–26	20	98	148	106	101	157	108	93.4	95.3	94.0	93.7	96.2	94.0
0.4	22	–27	20	91	137	102	95	149	106	93.8	95.2	94.2	92.3	95.5	93.3
0.5	16	–15	14	84	129	101	93	147	108	95.2	95.5	95.3	92.7	95.3	93.6
0.6	19	–21	20	84	133	107	93	149	112	95.5	96.1	93.7	94.8	95.6	93.9

Open in a new tab

Table 3 summarizes the simulation results on r̂. We present the same set of summary statistics including EBias, ESD, ASD, ECP_W and ECP_P that are presented in Table 2, along with the true values of r₀ and Kendall's tau coefficient, which is expressed as a known function of r₀, $K (r_{0})$ ; see the column labeled as TRUE. We can see that r̂ is virtually unbiased, and the estimated standard errors are close to their empirical counterparts. For the estimation of r₀ and $K (r_{0})$ , we note that the coverage rates of the Wald-type confidence intervals tend to be higher than those of the percentile-based intervals. In Table 3, we also present the empirical biases and empirical standard errors of the corresponding estimates for Kendall's tau, denoted by $K (\hat{r})$ . It is shown that, as an established measure of association, Kendall's tau coefficient can be accurately estimated with small standard errors.

Table 3.

Summary of simulation results on r̂, which include empirical biases (×10³), empirical standard errors (×10³), averages of resampling based standard error estimates (×10³), empirical coverages (%) of 95% wald-type and percentile-based confidence intervals. The same sets of summary statistics are also presented for the corresponding Kendall's tau coefficient estimates, $K (\hat{r})$ .

		TRUE	EBias(×10³)	ESD(×10³)	ASD(×10³)	ECP_W	ECP_P
S1.C	r	0.693	–14	206	226	96.7	94.8
	$K (r)$	0.500	–4	51	56	96.1	94.8
S1.F	r	5.750	–215	852	885	94.5	92.5
	$K (r)$	0.500	–15	48	51	95.2	92.5
S2.C	r	0.693	–30	305	370	98.6	96.0
	$K (r)$	0.500	–7	74	90	97.7	96.0
S2.F	r	5.750	–356	1317	1371	94.8	92.3
	$K (r)$	0.500	–28	77	81	96.1	92.3

Open in a new tab

The results of second stage inferences are summarized in Table 4. We present the EBias, ESD and ASD of the trimmed mean effect estimates ${\hat{Φ}}_{1}^{(q)} = {\int_{l}^{u} {\hat{β}}^{(q)} (τ) d τ} ∕ (u - l)$ , where q = 2,3, and (l, u) = (τ_a, τ_b). We also summarize the performance of the proposed hypothesis tests for $H_{01} : β_{0}^{(q)} (τ) = 0$ , and $H_{02} : β_{0}^{(q)} (τ) \equiv c_{0}$ , following the procedures in Section 2.5. To construct the test statistics for H₀₂, we chose the weight function Ξ(τ) as $2 I (τ \leq \frac{τ_{L} + τ_{U}}{2}) ∕ (τ_{U} - τ_{L})$ . For each test, we report the empirical rejection rates based on normal approximation (ERR_W) and those based on percentiles (ERR_P). We find that the trimmed mean effect estimates ${\hat{Φ}}_{1}^{(q)}, q = 2, 3$ , are well-performed in terms of biases and standard error estimates. For H₀₁, we observe that both Wald-type test and percentile-based test achieve empirical sizes close to the nominal level 0.05 and have satisfactory power. Regarding the constancy test, the percentile-based test, as compared to the Wald-based test, may be more conservative in terms of size but have higher power in the presence of non-constant effects. In Supplementary Material, we present additional simulation results with n = 400, which indicate better performance of the proposed method with a larger sample size while delivering similar observations to those from Tables 2–4.

Table 4.

Summary of simulation results on second stage inferences, which include empirical bias (×10³), empirical standard error (×10³) and average of resampling based standard error estimates (×10³) of ${\hat{Φ}}_{1}^{(q)}$ , q = 2, 3, as well as the empirical rejection rates of H₀₁ and H₀₂ with Wald-type (ERR_W) and percentile-based (ERR_P) methods.

		${\hat{Φ}}_{1}^{(q)}$			H ₀₁		H ₀₂
	q	EBias	ESD	ASD	ERR_W	ERR_P	ERR_W	ERR_P
S1.C	2	–2	108	109	0.056	0.062	0.049	0.038
	3	–1	68	68	0.129	0.129	0.936	0.946
S1.F	2	–9	93	96	0.046	0.056	0.043	0.033
	3	–1	61	61	0.162	0.178	0.950	0.967
S2.C	2	–9	123	128	0.886	0.930	0.026	0.017
	3	4	88	87	0.190	0.145	0.561	0.736
S2.F	2	–22	115	116	0.950	0.973	0.040	0.020
	3	15	83	80	0.167	0.136	0.614	0.750

Open in a new tab

We also perform simulation studies to evaluate the robustness of the proposed estimators. Specifically, we examine the proposed estimators of β₀(τ) and r₀ when the association model (2) is misspecified. We generated data from the set-ups S2.C and S2.F listed in Table 1, where the Clayton's copula and the Frank's copula hold respectively. For each set-up, we implemented the proposed method separately by assuming the Clayton's copula and the Frank's copula. Figure 1 presents the empirical averages of $\hat{β} (τ)$ under a correctly specified copula model and an incorrectly specified copula model. For comparison, we also plot the empirical averages of Peng and Huang (2008)'s estimates obtained by naively treating T₂ as independent censoring. Not surprisingly, the naive estimator yields large biases for τ ∈ [0.1, 0.65] in all scenarios. When the association model (2) is correctly specified, the empirical averages of $\hat{β} (τ)$ are close to the true regression quantiles. This agrees with the results in Table 2. With a misspecified association structure, the proposed estimator of β₀(τ) exhibits quite small deviations from the true coefficients. For example, the biases of $\hat{β} (τ)$ , when we incorrectly adopt Frank's copula in set-up S2.C, are nearly negligible. We have very similar observations for the estimation of the association parameter. The empirical average of the estimated Kendall's tau coefficient, when Frank's copula is incorrectly assumed in set-up S2.C, equals 0.475, and that when we assume Clayton's copula in set-up S2.F equals 0.472. Both empirical averages are not far from the true Kendall's coefficient 0.5. Overall, our simulations suggest that the proposed estimation of β₀(τ) and r₀ is quite robust to the misspecification of the parametric form of the copula model. The empirical results from our robustness study agree with previous reports in competing risks literature. For example, the marginal regression analysis proposed by Chen (2010) was found to be robust to the specification of the functional form of the copula when the level of association is correctly assumed. Compared to competing risk methods such as those by Zheng and Klein (1995) and Chen (2010), the proposed method only requires the assumption on the parametric form of the copula while leaving the copula parameter unspecified. Consequently, our robustness result entails a better utility of the proposed method than the sensitive analysis permitted by competing risk data.

4. A Real Example

We illustrate the proposed method via an application to the bone marrow transplant (BMT) example, where data were collected on patients with acute leukemia following allogeneic bone marrow transplantation (Copelan et al., 1991). One intermediate endpoint of interest is the development of chronic graft-versus-host disease (GVHD), a common complication following transplantation. The GVHD endpoint was subject to potentially dependent censoring due to death. With the origin at transplantation, we let T₁ denote time to the GVHD endpoint and T₂ denote time to death, while C represents time to end of follow-up, which is not fixed but random. For the 137 subjects in this dataset, 61 (44.5%) were observed to experience chronic GVHD, 81 (59%) died during the follow-up period, among which 52 patients (38%) died before the onset of chronic GVHD. The values of P₀₀, P₁₀, P₀₁ and P₁₁ defined in Table 1 are 17.5%, 38.0%, 23.4%, 21.2% respectively. Based on disease type, patients were classified into three groups, referred to as acute lymphoblastic leukemia (ALL), acute myeloctic leukemia (AML) low risk and AML high risk, respectively. The recorded data also include patients’ age at the time of study enrollment. In this analysis, we are interested in assessing the effects of disease type and age on the development of GVHD. The covariate vector Z = (1, Z₁, Z₂, Z₃), where Z₁ and Z₂ are indicator variables of the AML low risk group and AML high risk group, respectively, and Z₃ represents patients’ age.

We fitted models (1)–(3) to the BMT data, adopting Frank's copula and setting (τ_a, τ_b) = (0.05, 0.55). Standard error estimates and confidence intervals were obtained based on 400 bootstrap resamples. The estimated association parameter r̂ equals 4.65. The corresponding Kendall's tau equals 0.43, with a percentile-based 95% interval of (–0.04, 0.60) and a 95% Wald-type interval of (0.05, 0.63). This result provides some evidence for a moderate positive association between GVHD and death, conditionally upon disease type and age. In Figure 2, we present the fitted regression coefficients, coupled with percentile-based 95% confidence intervals. Figure 2 suggests that patients in AML low risk group may have overall significantly slower progression to GVHD compared to patients in ALL group. The benefit in the timing of GVHD associated with being in AML high risk group versus ALL group is much less evident, with the 95% confidence intervals excluding 0 only for τ < 0.15. In addition, the estimation of the coefficients for Z₃ suggests little age effect on the timing of GVHD development. For comparison purpose, we also present in Figure 2 Peng and Huang (2008)'s estimator ${\hat{β}}_{N} (τ)$ which treats death as independent censoring. We observe that the intercept term of ${\hat{β}}_{N} (τ)$ tends to be higher than that of $\hat{β} (τ)$ , which agrees with our observation in Figure 1 from the simulation studies that ${\hat{β}}_{N} (τ)$ may tend to overestimate the intercept term when T₁ and T₂ are positively associated. There are rather large discrepancies in the estimated coefficients for Z₂ and Z₃. For example, naively using ${\hat{β}}_{N} (τ)$ may lead to an impression that younger patients have more rapid progression to GVHD compared to older patients, which seems counterintuitive. Compared to the naive analysis based on ${\hat{β}}_{N} (τ)$ , the proposed method may provide us higher confidence on the results by appropriately accounting for to the dependence between time GVHD and time to death.

Fig. 2 — Estimated regression coefficients for the BMT dataset, where the x-axis represents τ and the y-axis represents the regression coefficients. Bold solid line represents the proposed regression coefficients as a function of τ, dashed line represents the estimator based on Peng and Huang's method, and dotted lines represent the percentile-based 95% intervals.

We next conducted second-stage inferences on covariate effects, setting (l, u) = (0.05, 0.55). The estimated trimmed mean effect of Z₂ (AML low risk versus ALL) equals 0.65, with an estimated standard error of 0.32. The percentile-based 95% confidence interval is (0.26, 1.44). These results confirm the beneficial effect of being in AML low risk group (versus ALL group) observed in Figure 2. For Z₂ (AML high risk versus ALL), the trimmed mean effect estimate is 0.17, with an estimated standard error of 0.14 and a percentile-based 95% confidence interval of (–0.03, 0.50). This indicates little evidence to support the difference in time to GVHD between AML high risk group and ALL group. The trimmed mean estimate for Z₃ (Age) equals –0.003, with an estimated standard error of 0.009 and a percentile-based 95% confidence interval of (–0.020, 0.016). This suggests a non-significant age effect on GVHD endpoint. Regarding the constancy of covariate effects, we obtained $T_{2, 2} = - 0.22$ for Z₁, coupled with a percentile-based 95% confidence interval of (–0.75, –0.04). This confirms our visual impression from Figure 2 and implies that the difference between AML low risk group and ALL group may be more pronounced in patients who are less prone to GVHD. For Z₂ and Z₃, the test statistics $T_{2, 3}$ and $T_{2, 4}$ equal 0.03 and 0.002, respectively, both with percentile-based confidence intervals covering 0. Therefore, constant effects may be adequate for AML high risk (versus ALL) and Age.

Finally, we present the estimated quantiles of time to GVHD endpoint for patients with different disease types in Figure 3, with age set at its mean of 28.4 years. These curves provide useful alternative survival summaries by using quantiles of time to the endpoint of interest, which have direct physical interpretations. We observe that the differences in estimated quantiles based on $\hat{β} (τ)$ and ${\hat{β}}_{N} (τ)$ seem to increase with τ. For example, for ALL, AML low and AML high risk groups, the differences between the estimated 40th quantiles based on these two estimators are 0.9, 2.3 and 1.7 months respectively. Further unreported analyses suggest that the discrepancies in estimated quantiles increase with patients’ age. It is also observed from Figure 3 that estimated quantiles of time to GVHD for the AML low risk group are much larger than those for the ALL group and AML high risk group, and this beneficial effect of being in AML low risk group appears more apparent when τ increases. By comparison, the estimated quantiles for AML high risks patients are quite similar to those for ALL patients. These observations are consistent with the findings from our second-stage inferences

Fig. 3 — Estimated quantiles of time to the GVHD endpoint, where x-axis represents quantiles, and y-axis represents predicted quantiles in months. The solid line and dashed line correspond to the proposed estimator and Peng and Huang (2008)'s estimator, respectively.

5. Remarks

In this paper, we propose a quantile regression method that can properly accommodate dependent censoring situations that fall into the paradigm of semi-competing risks. Our method offers a useful tool for investigating nonterminating endpoints that often arise in clinical follow-up studies and their relationship with subsequent competing endpoints. The net quantile inference pursued in this paper is useful when covariate effects with the removal of dependent censoring are of substantive relevance.

It is worth noting that the inequalities (4) and (5) adopted for estimating model (1) entail a rather comprehensive use of semi-competing risks information, which is not limited to the diagonal line of the joint distribution of (T₁, T₂) as in Peng and Fine (2007b). While formal efficiency comparisons between these two approaches are tricky due to their distinct modeling strategies, some simulation studies (unreported) show 25-35% more efficient coefficient estimation from the new method compared to that from Peng and Fine (2007b).

We impose assumptions on the dependent censoring scheme through a general class of copula models. As noted in the paper, this is necessary for addressing the identifiability issue inherited with the dependent censoring problem. Simulations have shown that the proposed estimators are quite robust to misspecification of the parametric form of the copula model. This robustness feature is expected to enhance the practical utility of the proposed method.

Supplementary Material

Supp Supplement S1

NIHMS549451-supplement-Supp_Supplement_S1.pdf^{(27.3KB, pdf)}

Acknowledgements

The authors are grateful to the editor, the associate Editor, and the two referees for many helpful comments. This research has been supported by National Science Foundation under Grant Number DMS-1007660 and National Institute of Health under Award Number R01HL 113548.

APPENDIX A: REGULARITY CONDITIONS

For a vector x, let x^⊗2 denote xx^T, and ∥x∥ denote the Euclidean norm of x. For a random variable T, let f_T (·|z) denote its conditional density function given Z = z. We define s(β, α, r, τ) = ES_n(β, α, r, τ), w(β, α, r, τ) = EW_n(α, β, r, τ), and $w (β, α, r) = \int_{τ_{a}}^{τ_{b}} w (β, α, r, τ) d τ$ . Let B_b(β, α, r, τ) = ∂s(β, α, r, τ)/∂β(τ), B_r(β, α, r, τ) = ∂s(β, α, r, τ)/∂r, L_b(β, α, r, τ) = ∂w(β, α, r, τ)/∂β(τ), and L_r(β, α, r) = ∂w(β, α, r)/∂r. Next, define μ(a) = E[ZI{Y ≤ exp(Z^Ta), η = 1}] as in Peng and Huang (2008). For d > 0, let $A (d) = {a \in R^{p + 1} : \inf_{τ \in (0, τ_{U, 2}]} ‖ μ (a) - μ {α_{0} (τ)} ‖ \leq d}$ and $R (d) = {r \in R^{q + 1} : ‖ r - r_{0} ‖ \leq d}$ . Define $\overset{‒}{B} (d) = {β \in R^{p + 1} : \max_{τ \in (ν_{1}, τ_{U, 1}]} ‖ β (τ) - β_{0} (τ) ‖ \leq d}$ . Let d_A, d_B and d_R be positive constants that determine the spans of the neighborhoods of α₀, β₀, r₀ respectively. For presentation simplicity, we exclude the indicator functions $I {Z_{i}^{T} β (τ) \leq Z_{i}^{T} α (τ_{U, 2})}$ in S_n and $I {\log t < Z_{i}^{T} α (τ_{U, 2})}$ in W_n respectively, corresponding to the situation with τ_U,2 close to 1. This incurs little essential difference in asymptotic arguments.

We require the following regularity conditions:

C1 The covariate space $Z$ is bounded, i.e., sup_i ∥Z_i∥ ≤ ∞.

C2 (i) The regularity conditions in Peng and Huang (2008) hold for (Y, η, Z) and α₀(τ), τ ∈ (0, τ_U,2]. (ii) For B_α(a) = ∂μ(a)/∂a, there exists a constant C_F, such that each component of ∥f_T₂{exp(z^Ta)|z} exp(z^Ta) × B_α(a)^–1∥ is bounded by C_F uniformly in $z \in Z$ and $a \in A (d_{A})$ , where d_A positive constant. (iii) Define

- \frac{1}{r_{f}} \log (1 + \frac{[\exp {- r_{f} \Pr (T_{1} > s ∣ Z)} - 1] \times [\exp {- r_{f} \Pr (T_{2} > t ∣ Z)} - 1]}{\exp (- r_{f}) - 1})

where d_A(u, v, θ) = ∂K_A(u, v, θ)/∂v and d_B(u, v, θ) = ∂K_B(u, v, θ)/∂v. For Z containing continuous covariates, V_A(a, τ) and V_B(a) are differentiable with respect to a, and every component of [∂E{V_A(a, τ)}/∂a]B_α (a)^–1 and [∂E{V_B(a)}/∂a]B_α(a)^–1 are bounded uniformly for $a \in A (d_{A})$ and τ ∈ (0, τ_U,1].

C3 (i) Each component of s{β₀,α₀, r₀, τ} and w{β₀,α₀, r₀, τ} is a Lipschitz function of τ when τ ∈ (0, τ_U,1], (ii) let d_θ(u, v, θ) = ∂Ψ(u, v, θ)/∂θ, ∥d_θ{u, v, g(z^Tr₀)}g^′(z^Tr₀)∥ is uniformly bounded for $z \in Z, u \leq τ_{U, 1}$ and v ≤ τ_U,2. (iii) the conditional density functions f_X(t|z), f_Y(t|z), f_T₁(t|z) and f_T₂(t|z) are bounded above uniformly in t and $z \in Z$ .

C4 (i) k_b ≡ infτ∈(ν₁, τ_U,1] eigmin{B_b(β₀, α₀, r₀, τ)} > 0, where eigmin(·) denotes the minimum eigenvalue of a matrix. (ii) For any fixed $r \in R (d_{R})$ , $\tilde{β} (τ; r)$ is the unique solution to s{β,₀, α₀r, τ} = 0 for $β \in \overset{‒}{B} (d_{B})$ and $k_{r} \equiv \inf_{r \in R (d_{R})} eigmin [\partial w {\tilde{β} (r), α_{0}, r} ∕ \partial r] > 0$ . (iii) Every component of L_b(β, α₀, r₀, τ) × B_b(β, α₀, r₀, τ)^–1 is uniformly bounded for $β \in \overset{‒}{B} (d_{B})$ .

C5 (i) For any $z \in Z$ , we have z^Tβ₀(τ_U,2). (ii) Both $\inf_{z \in Z} \Pr [C > \exp {Z^{T} β_{0} (τ_{U, 1})} ∣ Z]$ and $\inf_{z \in Z} \Pr [C > \exp {Z^{T} α_{0} (τ_{U, 2})} ∣ Z]$ are bounded away from 0.

Condition C1 requires bounded covariates and is often met in practice. Condition C2 imposes mild assumptions on the dependent censoring time T₂. Specifically, it ensures that $\int_{0}^{τ_{U, 2}} I [t \geq \exp {Z^{T} \hat{α} (u)}] d u$ serves an appropriate estimator of F₂(t|Z) ∧ τ_U,2, and that $s (β, \hat{α}, r, τ)$ preserves the nice asymptotic properties of $\hat{α} (\cdot)$ shown in Peng and Huang (2008). By condition C3, s{β₀(τ), α₀, r, τ} and w{β₀(τ), α₀, r, τ} are smooth in both τ and r, and the conditional density functions of X, Y, T₁ and T₂ are uniformly bounded. Condition C4, coupled with the boundary constraints in C5, entails the identifiability of the proposed estimator in a neighborhood of β₀ and r₀. Specifically, C4(i) and C4(ii) requires that the derivative of s(β, α, r, τ) over β(τ) and that of $w (\tilde{β}, α, r, τ)$ over r are bounded from below at the true parameters, and C4(iii) ensures that the derivative of $\tilde{β} (τ; r)$ over r is bounded at r = r₀. Note that C5(i) can be removed when equation (8) is adopted.

Our following proofs utilizes the asymptotic properties of Peng and Huang (2008)'s estimator $\hat{α} (\cdot)$ . It is worth pointing out that the asymptotic results on $\tilde{β} (\cdot)$ remain valid when other modeling strategies are adopted for T₂, provided that the strategy entail reasonable estimates of F₂(t|Z). The proof can be carried out following the lines of Appendix B and C in the following, and the detailed arguments are omitted here.

APPENDIX B: PROOF OF THEOREM 1

Lemma 1. For μ(a = EZI{Y ≤ exp(Z^Ta), η = 1}, we have

\begin{matrix} V_{A} (a, τ) = & - Z [I {\log Y > Z^{T} β_{0} (τ)} d_{A} {τ, F_{2} [\exp {Z^{T} β_{0} (τ)} ∣ Z], g (Z^{T} r_{0})} \times I {Z^{T} β_{0} (τ) \geq Z^{T} a}] \\ V_{B} (a) = & - \overset{‒}{Z} {\int_{τ_{a}}^{τ_{b}} \int_{t \in (0, \infty)} I {Z^{T} β_{0} (τ) \leq \log t < \log Y} \times d_{B} {τ, F_{2} (t ∣ Z), g (Z^{T} r_{0})} I {\log t \geq \exp (Z^{T} a)} d τ d t}, \end{matrix}

Proof of Lemma 1. By regularity condition C2(ii) and Taylor expansion, we have

\sup_{z \in Z, t} ∣ \int_{0}^{τ_{U, 2}} I [t \geq \exp {z^{T} \hat{α} (u)}] d u - \int_{0}^{τ_{U, 2}} I [t \geq \exp {z^{T} α_{0} (u)}] d u ∣ \leq 2 C_{F} \sup_{τ \in (0, τ_{U, 2}]} ‖ μ (\hat{α} (τ)} - μ {α_{0} (τ)} ‖ .

where F₂[exp{z^Tα₀(τ)}|z] = τ. Let $ε_{F} = \sup_{z \in Z, τ \in (0, τ_{U, 2}]} ∣ F_{2} [\exp {z^{T} \hat{α} (τ)} ∣ z] - τ ∣$ , we then have $ε_{F} \leq C_{F} \sup_{τ \in (0, τ_{U, 2}]} ‖ μ {\hat{α} (τ)} - μ {α_{0} (τ)} ‖$ .

Note that

\sup_{z \in Z, t} ∣ \int_{0}^{τ_{U, 2}} I [t \geq \exp {z^{T} \hat{α} (u)}] d u - \int_{0}^{τ_{U, 2}} I [t \geq \exp {z^{T} α_{0} (u)}] d u ∣ \leq \sup_{z \in Z, t} \int_{0}^{τ_{U, 2}} I {F_{2} (t ∣ z) \in [F_{2} [\exp {z^{T} \hat{α} (u)} ∣ z] \land u, F_{2} [\exp {z^{T} \hat{α} (u)} ∣ z] \lor u]} d u \leq \sup_{z \in Z, t} \int_{0}^{τ_{U, 2}} I {F_{2} (t ∣ z) \in [u - ε_{F}, u + ε_{F}]} d u = \sup_{z \in Z, t} \int_{0}^{τ_{U, 2}} I {u \in [F_{2} (t ∣ z) - ε_{F}, F_{2} (t ∣ B z) + ε_{F}]} d u \leq 2 ε_{F} .

where ∨ is the maximum operator. This immediately completes the proof of Lemma 1.

Proof of Theorem 1. For any fixed θ, the copula function Ψ(u, v, θ) satisfies the Lipschitz condition in u and v (Nelsen, 2006). Hence K_A(u, v, θ) and K_B(u, v, θ) are both Lipschitz continuous in v when v ≤ τ_U,2. By Lemma 1 and the fact that $\sup_{τ \in (0, τ_{U, 2}]} ‖ μ {\hat{α} (τ)} - μ {α_{0} (τ)} ‖ \overset{p}{\to} 0$ (Peng and Huang, 2008), we can show with Taylor expansion that

\sup_{z \in Z, r, b, τ} ∣ K_{A} {τ, \int_{0}^{τ_{U, 2}} I {z^{T} b \geq z^{T} \hat{α} (u)} d u, g (z^{T} r)} - K_{A} {τ, \int_{0}^{τ_{U, 2}} I {z^{T} b \geq z^{T} α_{0} (u)} d u, g (z^{T} r)} ∣ \overset{p}{\to} 0 .

It follows that

\sup_{β, r, τ} ‖ s (β, \hat{α}, r, τ) - s (β, α_{0}, r, τ) ‖ \overset{p}{\to} 0 .

(A1)

Next, we claim that $G_{1} = {Z_{i} P_{i} (β, α, r, τ) : Z_{i} \in Z, β, α \in R^{p + 1}, r \in R^{q + 1}, τ \in (0, 1)}$ is Donsker thus Glivenko Cantelli. This follows by noting that the class of indicator functions is a VC-class, and by using the permanence properties of the Donsker class (Van der Vaart and Wellner, 1996). Therefore, Glivenko-Cantelli theorem gives

\sup_{β, α, r, τ} ‖ S_{n} (β, α, r, τ) - s (β, α, r, τ) ‖ = o_{p} (1) .

(A2)

Since $\tilde{β} (τ; r)$ is the solution to s(β, α₀, r, τ) = 0, we can use simple manipulations and get

‖ s {\hat{β} (r), α_{0}, r, τ} - s \tilde{β} (r), α_{0}, r, τ} ‖ = ‖ s {\hat{β} (r), α_{0}, r, τ} ‖ \leq ‖ s {\hat{β} (r), \hat{α}, r, τ} ‖ + ‖ s {\hat{β} (r), α_{0}, r, τ} - s {\hat{β} (r), \hat{α}, r, τ} ‖ .

(A3)

Combining (A2) and the fact that $\sup_{r, τ \in [ν_{1}, τ_{U, 1}]} ‖ S_{n} {\hat{β} (r), \hat{α}, r, τ} ‖ = o (1)$ , we get $\sup_{r, τ \in [ν_{1} τ_{U, 1}]} ‖ s {\hat{β} (r), \hat{α}, r, τ} ‖ \overset{p}{\to} 0$ . This, coupled with (A1) and (A3), implies

\sup_{r, τ \in [ν_{1}, τ_{U, 1}]} ‖ s {\hat{β} (r), α_{0}, r, τ} - s {\tilde{β} (r), α_{0}, r, τ} ‖ \overset{p}{\to} 0 .

(A4)

By condition C4(i), the inverse function theorem implies that there exists $\hat{β} (τ, r) \in \overset{‒}{B} (d_{B})$ . More-over, Taylor expansion of $s {\tilde{β} (r), α_{0}, r, τ}$ around $\tilde{β} (τ; r)$ gives

\sup_{r \in R (d_{R}), τ \in [ν_{1}, τ_{U, 1}]} ‖ \hat{β} (τ; r) - \tilde{β} (τ; r) ‖ \leq \underset{r \in R (d_{R}), τ \in [ν_{1}, τ_{U, 1}]}{s u p} \frac{1}{k_{b}} ‖ s {\hat{β} (r), α_{0}, r, τ} - s {\tilde{β} (r), α_{0}, r, τ} ‖ \overset{p}{\to} 0 .

(A5)

This fact, combined with the Glivenko-Cantelli theorem on W_n(β, α, r), implies that $\sup_{r \in R (d_{R})} ‖ W_{n} (\hat{α}, r) - \tilde{w} (\hat{α}, r) ‖ \overset{p}{\to} 0$ , where $W_{n} (α, r) = W_{n} {\hat{β} (r), α, r}$ and $\tilde{w} (α, r) = w {\tilde{β} (r), α, r}$ . Similarly, we can use Lemma 1 and the uniform consistency of $μ {\hat{α} (τ)}, τ \in (0, τ_{U, 2}]$ to show $\sup_{r \in R (d_{R})} ‖ \tilde{w} (\hat{α}, r) - \tilde{w} (α_{0}, r) ‖ \overset{p}{\to} 0$ . It follows that

\sup_{r \in R (d_{R})} ‖ W_{n} (\hat{α}, r) - \tilde{w} (α_{0}, r) ‖ \leq \sup_{r \in R (d_{R})} [‖ W_{n} (\hat{α}, r) - \tilde{w} (\hat{α}, r) ‖ + ‖ \tilde{w} (\hat{α}, r) - \tilde{w} (α_{0}, r) ‖] \overset{p}{\to} 0 .

(A6)

Therefore, we can see w̃(α₀, r̂) = o_p(1) from $W_{n} (\hat{α}, \hat{r}) - o_{p} (1)$ . By regularity conditions C4(ii), w̃(α₀, r₀) = 0 and $‖ \hat{r} - r_{0} ‖ \leq k_{r}^{- 1} ‖ \tilde{w} α_{0}, \hat{r}) - \tilde{w} α_{0}, r_{0}) ‖$ . The consistency of r̂ follows.

To show the uniform consistency of $\hat{β} (\cdot)$ , we first need to show the partial derivative of $\tilde{β} (τ; r)$ with respect to r is bounded at r = r₀. Taking partial derivative ∂/∂r on both sides of $s {\tilde{β} (r), α_{0}, r, τ} = 0$ , we get

\frac{\partial \tilde{β} (τ; r)}{\partial r} ∣_{r = r_{0}} = - B_{b} {(β_{0}, α_{0}, r_{0}, τ)}^{- 1} B_{r} (β_{0}, α_{0}, r_{0}, τ) .

(A7)

By condition C3(ii) and C4(i), $\frac{\partial \tilde{β} (τ; r)}{\partial r}$ is uniformly bounded for τ ∈ [ν₁, τ_U,1], and thus

\sup_{τ \in [ν_{1}, τ_{U, 1}]} ‖ \tilde{β} (τ; \hat{r}) - \tilde{β} (τ; r_{0}) ‖ = o_{p} (1)

(A8)

follows from the consistency of r̂. Based on (A5) and (A8), the following inequality,

‖ \hat{β} (τ) - β_{0} (τ) ‖ = ‖ \hat{β} (τ; \hat{r}) - \tilde{β} (τ; r_{0}) ‖ \leq ‖ \hat{β} (τ; \hat{r}) - \tilde{β} (τ; \hat{r}) ‖ + ‖ \tilde{β} (τ; \hat{r}) - \tilde{β} (τ; r_{0}) ‖,

implies $\sup_{τ \in [ν_{1}, τ_{U, 1}]} ‖ \hat{β} (τ) - β_{0} (τ) ‖ \overset{p}{\to} 0$ . This completes the proof of Theorem 1.

APPENDIX C: PROOF OF THEOREM 2

Lemma 2. For any sequence ${{\bar{β}}_{n} (τ), τ \in [ν_{1}, τ_{U, 1}]}_{n = 1}^{\infty} \in \overset{‒}{B} (d_{B})$ satisfying $\sup_{τ \in [ν_{1}, τ_{U, 1}]} ‖ s ({\bar{β}}_{n}, α_{0}, r_{0}, τ) - s (β_{0}, α_{0}, r_{0}, τ) ‖ \overset{p}{\to} 0$ , we have

\sup_{τ \in [ν_{1}, τ_{U, 1}]} n^{1 ∕ 2} ‖ {S_{n} ({\bar{β}}_{n}, α_{0}, r_{0}, τ) - S_{n} (β_{0}, α_{0}, r_{0}, τ)} - {s ({\bar{β}}_{n}, α_{0}, r_{0}, τ) - s (β_{0}, α_{0}, r_{0}, τ)} ‖ \overset{p}{\to} 0 .

Similarly, for any sequence ${{\bar{α}}_{n} (τ), τ \in (0, τ_{U, 2}]}_{n = 1}^{\infty}$ satisfying $\sup_{τ \in (0, τ_{U, 2}]} ‖ μ {{\bar{α}}_{n} (τ)} - μ {α_{0} (τ)} ‖ \overset{p}{\to} 0$ ,

\sup_{τ \in [ν_{1}, τ_{U, 1}]} n^{1 ∕ 2} ‖ {S_{n} (β_{0}, {\bar{α}}_{n}, r_{0}, τ) - S_{n} (β_{0}, α_{0}, r_{0}, τ)} - {s (β_{0}, {\bar{α}}_{n}, r_{0}, τ) - s (β_{0}, α_{0}, r_{0}, τ)} ‖ \overset{p}{\to} 0 .

Proof of Lemma 2. Define $σ_{B}^{2} (β) = \sup_{τ \in [ν_{1}, τ_{U, 1}]} Var {P_{i} (β, α_{0}, r_{0}, τ) - P_{i} (β_{0}, α_{0}, r_{0}, τ)}$ , Following the arguments of Lai and Ying (1988) and Peng and Huang (2008), it is sufficient for the first part to hold if $σ_{B}^{2} ({\bar{β}}_{n}) \overset{p}{\to} 0$ . By condition C4(i)–(iii), one can use Taylor expansion on $s ({\bar{β}}_{n}, α_{0}, r_{0}, τ)$ around β₀(τ) to show $\sup_{τ \in [ν_{1}, τ_{U, 1}]} ‖ {\bar{β}}_{n} (τ) - β_{0} (τ) ‖ \overset{p}{\to} 0$ . Furthermore, the conditional density functions f_X(t|z), f_Y(t|z) and f_T₂(t|z) are bounded above uniformly in t and z, and K_A(u, v, θ) is Lipschitz continous in v. These facts allow us to mimic lemma B.1. in Peng and Huang (2008) and get $σ_{B}^{2} ({\bar{β}}_{n}) \overset{p}{\to} 0$ .

Similarly, define $σ_{A}^{2} (α) = \sup_{τ \in [ν_{1} τ_{U, 1}]} Var {P_{i} (β_{0}, α, r_{0}, τ) - P_{i} (β_{0}, α_{0}, r_{0}, τ)}$ , a sufficient condition for the second part is $σ_{A}^{2} ({\bar{α}}_{n}) \overset{p}{\to} 0$ , which follows directly from Lemma 1 and the Lipschitz continuity of K_A(u, v, θ) in v. This completes the proof of Lemma 2.

Weak Convergence $n^{1 ∕ 2} [s {\hat{β} (r_{0}), α_{0}, r_{0}, τ} - s (β_{0}, α_{0}, r_{0}, τ)]$

By simple manipulations, we can show $n^{1 ∕ 2} [S_{n} {\hat{β} (r_{0}), \hat{α}, r_{0}, τ} - S_{n} (β_{0}, α_{0}, r_{0}, τ)]$ equals

n^{1 ∕ 2} [S_{n} (β_{0}, \hat{α}, r_{0}, τ) - S_{n} (β_{0}, α_{0}, r_{0}, τ)] + n^{1 ∕ 2} [S_{n} {\hat{β} (r_{0}) α_{0}, r_{0}, τ} - S_{n} (β, α_{0}, r_{0}, τ)] + ε_{S}

(A9)

where $ε_{S} = n^{1 ∕ 2} [S_{n} {\hat{β} (r_{0}), \hat{α}, r_{0}, τ} - S_{n} (β_{0}, \hat{α}, r_{0}, τ)] - n^{1 ∕ 2} [S_{n} {\hat{β} (r_{0}), α_{0}, r_{0}, τ} - S_{n} (β_{0}, α_{0}, r_{0}, τ)]$ can be shown to be $o_{p}^{τ \in [ν_{1}, τ_{U, 1}]}$ (1) and asymptotically negligible. Based on the uniform convergence of $s {\hat{β} (r_{0}), α_{0}, r_{0}, τ}$ to s{β(r₀), α₀, r₀, τ} for τ ∈ [ν₁, τ_U,1] in (A4), and the uniform convergence of $μ {\hat{α} (τ)}$ to μ{α₀(τ)} for τ ∈ (0, τ_U,2] in Peng and Huang (2008), we can use Lemma 2 to show

\begin{matrix} n^{1 ∕ 2} [S_{n} (β_{0}, \hat{α}, r_{0}, τ) - S_{n} (β_{0}, α_{0}, r_{0}, τ)] = n^{1 ∕ 2} [s (β_{0}, \hat{α}, r_{0}, τ) - s (β_{0}, α_{0}, r_{0}, τ)] + o_{p}^{τ \in [ν_{1}, τ_{U, 1}]} (1), \\ n^{1 ∕ 2} [S_{n} {\hat{β} (r_{0}), α_{0}, r_{0}, τ} - S_{n} (β_{0}, α_{0}, r_{0}, τ)] = n^{1 ∕ 2} [s {\hat{β} (r_{0}), α_{0}, r_{0}, τ} - s (β_{0}, α_{0}, r_{0}, τ)] + o_{p}^{τ \in [ν_{1}, τ_{U, 1}]} (1) . \end{matrix}

(A10)

Here and in the sequel, $o_{p}^{s} (1)$ means convergence to 0 in probability uniformly on set S. Similarly, we use $o_{p}^{s} (n^{- 1 ∕ 2})$ to denote uniform root n convergence to 0 in probability on set S.

According to Peng and Huang (2008),

n^{1 ∕ 2} [μ {\hat{α} (u)} - μ {α_{0} (u)}] = - n^{1 ∕ 2} \sum_{i = 1}^{n} ϕ {Z_{i} R_{i}^{Y} (α_{0}, u)} + o_{p}^{u \in (0, τ_{U, 2}]} (1),

(A11)

where $R_{i}^{Y} (α, u) = I [Y_{i} \leq \exp {Z_{i}^{T} α (u)}, η = 1] - \int_{0}^{u} I [Y_{i} \geq \exp {Z_{i}^{T} α (t)}] d H (t), H (t) = - \log (1 - t)$ for 0 ≤ t < 1, and ϕ is a continuous linear map that involves product integration. The detailed form of ϕ can be found in Appendix B of Peng and Huang (2008).

By Lemma 1 and the Lipschitz continuity of K_A(u, v, θ) in v, we can use Taylor expansion to see

n^{1 ∕ 2} {s (β_{0}, \hat{α}, r_{0}, τ) - s (β_{0}, α_{0}, r_{0}, τ)} = - n^{1 ∕ 2} E {ZI {\log Y > Z^{T} β_{0} (τ)} \times d_{A} {τ, F_{2} [\exp {Z^{T} β_{0} (τ)} ∣ Z], g (Z^{T} r_{0})} \times [\int_{0}^{τ_{U, 2}} I {Z^{T} β_{0} (τ) \geq Z^{T} \hat{α} (u)} d u - \int_{0}^{τ_{U, 2}} I {Z^{T} β_{0} (τ) \geq Z^{T} α_{0} (u)} d u]} + o_{p}^{τ \in [ν_{1}, τ_{U, 1}]} (1),

(A12)

where d_A(u, v, θ) = ∂K_A(u, v, θ)/∂v. In the following we will show that the left hand side (LHS) of (A12) has a uniform i.i.d. representation.

We first consider the one-sample case, where ${epx {\hat{α} (u)}, u \in (0, τ_{U, 2}]}$ is asymptotically equivalent to the quantile function of the Nelson-Aalen estimator (Peng and Huang, 2008) of T₂'s distribution, denoted ${\hat{F}}_{2}^{N A} (\cdot)$ . Specifically, it can be shown that ${\hat{F}}_{2}^{N A} [\exp {\hat{α} (u)}] = u + o_{p}^{u \in (0, τ_{U, 2}]} (n^{- 1 ∕ 2})$ . By this result and C5(i), we get

\int_{0}^{τ_{U, 2}} I {β_{0} (τ) \geq \hat{α} (u)} d u = \int_{0}^{τ_{U, 2}} I ({\hat{F}}_{2}^{N A} [\exp {β_{0} (τ)}] \geq u) d u + o_{p} (n^{- 1 ∕ 2}) = {\hat{F}}_{2}^{N A} [\exp {β_{0} (τ)}] + o_{p} (n^{- 1 ∕ 2}) .

Since $n^{1 ∕ 2} {{\hat{F}}_{2}^{N A} (t) - F_{2} (t)}$ is asymptotically Gaussian and can be written as $n^{- 1 ∕ 2} \sum_{i = 1}^{n} π_{i} (t) + o (1)$ , where ${π_{i} (t)}_{i = 1}^{n}$ are i.i.d. and form a Donsker's class with mean 0 (Kosorok, 2008), we can see the right hand side (RHS) of (A12) equals

- n^{- 1 ∕ 2} \sum_{i = 1}^{n} d_{A} {τ, F_{2} [\exp {β_{0} (τ)}], g (r_{0})} π_{i} [\exp {β_{0} (τ)}] \Pr {\log Y > β_{0} (τ)} + o_{p}^{τ \in [ν_{1}, τ_{U, 1}]} (1) .

(A13)

The above arguments for the one sample case can be easily extended to the K-sample case, which covers situations where all covariates are discrete. Specifically, suppose $Z = {z_{1}, z_{2}, \dots, 2_{K}}$ and Z_i = z_k if and only if observation i belongs to the k_th group. We can show that RHS of (A12) equals

- n^{- 1 ∕ 2} \sum_{i = 1}^{n} \sum_{k = 1}^{K} [I (Z_{i} = z_{k}) z_{k} d_{A} {τ, F_{2} [\exp {z_{k}^{T} β_{0} (τ)} ∣ z_{k}], g (z_{k}^{T} r_{0})} π_{i} [\exp {z_{k}^{T} β_{0} (τ)}] \times \Pr {\log Y > z_{k}^{T} β_{0} (τ) ∣ z_{k}}] + o_{p}^{τ \in [ν_{1}, τ_{U, 1}]} (1) .

(A14)

By the boundedness of $d_{A} (\cdot), Z$ and $\Pr {\log Y > z_{k}^{T} β_{0} (τ) ∣ z_{k}}$ , we can show that $- \sum_{k = 1}^{K} z_{k} I (Z_{i} = z_{k}) d_{A} {τ, F_{2} [\exp {z_{k}^{T} β_{0} (τ)} z_{k}], g (z_{k}^{T} r_{0})} \times \Pr {\log Y > z_{k}^{T} β_{0} (τ) ∣ z_{k}} π_{k, i} [epx {z_{k}^{T} β_{0} (τ)}]$ is mean 0 and form a Donsker's class, as the Donsker property is preserved under Lipschitz transformations. Therefore, (A14) converges weakly to a mean 0 Gaussian process.

Next we consider the case when Z involves some continuous components. Define v_A(a, τ) = EV_A(a, τ). By C2(iii), J_A(a, τ) ≡ ∂v_A(a, τ)/∂a exists, and each component of $J_{A} {α_{0} (u), τ} B_{a}^{- 1} {α_{0} (u)}$ is uniformly bounded for any u ∈ (0, τ_U,2] and τ ∈ [ν₁, τ_U,1]. Therefore, Taylor expansion, coupled with (A11) and (A12), shows that

\begin{matrix} n^{1 ∕ 2} {s (β_{0}, \hat{α}, r_{0}, τ - s (β_{0}, α_{0}, r_{0}, τ)} = & \int_{0}^{τ_{U, 2}} n^{1 ∕ 2} [v_{A} {\hat{α} (u), τ} - v_{A} {α_{0} (u), τ}] d u + o_{p}^{τ \in [ν_{1}, τ_{U, 1}]} (1) \\ = & - n^{- 1 ∕ 2} \sum_{i = 1}^{n} \int_{0}^{τ_{U, 2}} J_{A} {α_{0} (u), τ} B_{a}^{- 1} {α_{0} (u)} ϕ {Z_{i} R_{i}^{Y} (α_{0}, u)} d u + o_{p}^{τ \in [ν_{1}, τ_{U, 1}]} (1) . \end{matrix}

(A15)

It is seen from (A14) and (A15) that $n^{1 ∕ 2} [s (β_{0}, \hat{α}, r_{0}, τ) - s (β_{0}, α_{0}, r_{0}, τ)]$ is asymptotically equivalent to a sum of i.i.d. influence functions, no matter whether Z includes continuous covariates or not. In the sequel we unify (A14) and (A15) by writing

n^{1 ∕ 2} {s (β_{0}, \hat{α}, r_{0}, τ) - s (β_{0}, α_{0}, r_{0}, τ)} = n^{- 1 ∕ 2} \sum_{i = 1}^{n} ρ_{i} (β_{0}, α_{0}, r_{0}, τ) + o_{p}^{τ \in [ν_{1}, τ_{U, 1}]} (1) .

(A16)

Note that $S_{n} {\hat{β} (r_{0}), \hat{α}, r_{0}, τ} = o_{p}^{τ \in [ν_{1}, τ_{U, 1}]} (n^{- 1 ∕ 2})$ , (A9) and (A10) imply that

n^{1 ∕ 2} [s {\hat{β} (r_{0}), α_{0}, r_{0}, τ} - s (β_{0}, α_{0}, r_{0}, τ)] = - n^{1 ∕ 2} S_{n} (β_{0}, α_{0}, r_{0}, τ) - n^{1 ∕ 2} {s (β_{0}, \hat{α}, r_{0}, τ) - s (β_{0}, α_{0}, r_{0}, τ)} + o_{p}^{τ \in [ν_{1}, τ_{U, 1}]} (1) .

By (A16), we then get

n^{1 ∕ 2} [s {\hat{β} (r_{0}), α_{0}, r_{0}, τ} - s (β_{0}, α_{0}, r_{0}, τ)] = n^{- 1 ∕ 2} \sum_{i = 1}^{n} χ_{i} (β_{0}, α_{0}, r_{0}, τ) + o_{p}^{[ν_{1}, τ_{U, 1}]} (1),

(A17)

where χ_i(β₀, α₀, r₀τ) = –Z_iP_i(β₀, α₀, r₀τ) – ρ_i(β₀, α₀, r₀τ). This uniform i.i.d. representation implies the weak convergence of $n^{1 ∕ 2} [s {\hat{β} (r_{0}), α_{0}, r_{0}, τ} - s (β_{0}, α_{0}, r_{0}, τ)]$ to a mean zero Gaussian process by an application of Donker theorem.

Weak Convergence of $n^{1 ∕ 2} W_{n} {\hat{β} (r_{0}), \hat{α}, r_{0}}$

Using similar arguments in (A9) and Lemma 2, we

\begin{matrix} n^{1 ∕ 2} [W_{n} {\hat{β} (r_{0}), \hat{α}, r_{0}} - W_{n} (β_{0}, α_{0}, r_{0})] = & n^{1 ∕ 2} [W_{n} (β_{0}, \hat{α}, r_{0}) - W_{n} (β_{0}, α_{0} . r_{0})] + n^{1 ∕ 2} [W_{n} {\hat{β} (r_{0}), α_{0}, r_{0}} - W_{n} (β_{0}, α_{0}, r_{0})] + ε_{W} \\ = & n^{1 ∕ 2} [w (β_{0}, \hat{α}, r_{0}) - w (β_{0}, α_{0}, r_{0})] + n^{1 ∕ 2} [w {\hat{β} (r_{0}), α_{0}, r_{0}} - w (β_{0}, α_{0}, r_{0})] + o_{p} (1) + ε_{W}, \end{matrix}

(A18)

where $ε_{W} = n^{1 ∕ 2} [W_{n} {\hat{β} (r_{0}, \hat{α}, r_{0}} - W_{n} {β_{0}, \hat{α}, r_{0}}] - n^{1 ∕ 2} [W_{n} {\hat{β} (r_{0}), α_{0}, r_{0}} - W_{n} (β_{0}, α_{0} . r_{0})]$ can also be shown to be o_p(1). It is easy to see n^1/2W_n(β₀, α₀, r₀) converges weakly to a Gaussian process by the Donsker property. Moreover, we can follow the arguments for (A14) and (A15) to show

n^{1 ∕ 2} {w (β_{0}, \hat{α}, r_{0}) - w (β_{0}, α_{0}, r_{0})} = n^{- 1 ∕ 2} \sum_{i = 1}^{n} κ_{i} (β_{0}, α_{0}, r_{0}) + o_{p} (1) .

(A19)

The i.i.d. influence functions κ_i(β₀, α₀, r₀) take the form

- \sum_{k = 1}^{K} \int_{τ_{a}}^{τ_{b}} \int_{t \in (0 . \infty)} [{\bar{z}}_{k} I (Z_{i} = z_{k}) I {z_{k}^{T} β_{0} (τ) < \log t} \Pr (Y > t ∣ z_{k}) d_{B} {τ, F_{2} (t ∣ z_{k}), g (z_{k}^{T} r_{0})} \times π_{k, i} (t) d τ d t],

(A20)

in the K-sample case, and equals

- \int_{0}^{τ_{U, 2}} J_{B} {α_{0} (u)} B_{a}^{- 1} {α_{0} (u)} ϕ {Z_{i} R_{i}^{Y} (α_{0}, u)} d u

(A21)

when Z̄ contains continuous components, where J_B(a) ≡ dv_B(a)/da, and v_B(a) = EV_B(a). It then follows from the central limit theory and condition C2 that $n^{1 ∕ 2} {w (β_{0}, \hat{α}, r_{0}) - w (β_{0}, α_{0}, r_{0})}$ converges to a mean 0 normal distribution.

We also show by C4(iv) and a Taylor expansion that

n^{1 ∕ 2} [w (\hat{β} (r_{0}), α_{0}, r_{0}} - w (β_{0}, α_{0}, r_{0})] = \int_{τ_{a}}^{τ_{b}} [L_{b} {β_{0} (τ), α_{0}, r_{0}, τ} B_{b} {β_{0} (τ), α_{0}, r_{0}, τ}^{- 1} \times n^{1 ∕ 2} [s {\hat{β} (r_{0}), α_{0}, r_{0}, τ} - s (β_{0}, α_{0}, r_{0}, τ)]] d τ + o_{p} (1) .

(A22)

Combining Equation (A17), (A18), (A19) and (A22), we then have

n^{1 ∕ 2} W_{n} {\hat{β} (r_{0}), \hat{α}, r_{0}} = n^{- 1 ∕ 2} \sum_{i = 1}^{n} ι_{i} (β_{0}, α_{0}, r_{0}) + o_{p} (1),

(A23)

where $ι_{i} (β_{0}, α_{0}, r_{0}) = {\overset{‒}{Z}}_{i} Q_{i} (β_{0}, α_{0} . r_{0}) + κ_{i} (β_{0}, α_{0}, r_{0}) + \int_{τ_{a}}^{τ_{b}} L_{b} {β_{0}, α_{0}, r_{0}, τ} B_{b} {β_{0} (τ), α_{0}, r_{0}, τ}^{- 1} χ_{i} (β_{0}, α_{0}, r_{0}, τ) d τ$ . The weak convergence of $n^{1 ∕ 2} W_{n} {\hat{β} (r_{0}), \hat{α}, r_{0}}$ follows from the Donsker Theorem.

Asymptotic Normality of n^1/2(r̂ – r₀) and Weak Convergence of $n^{1 ∕ 2} {\hat{β} (τ) - β_{0} (τ)}$ Using $s {\tilde{β} (\hat{r}), α_{0}, τ} = s (β_{0}, α_{0}, r_{0}, τ) = 0$ , we can obtain the following inequality

‖ s {\hat{β} (\hat{r}), α_{0}, r_{0}, τ} - s {\hat{β} (r_{0}), α_{0}, r_{0}, τ} ‖ \leq [‖ s {\hat{β} (\hat{r}), α_{0}, r_{0}, τ} - s {\hat{β} (\hat{r}), α_{0}, \hat{r}, τ} ‖ + ‖ s {\hat{β} (\hat{r}), α_{0}, \hat{r}, τ} - s {\tilde{β} (\hat{r}), α_{0}, \hat{r}, τ} ‖ + ‖ s {\hat{β} (r_{0}), α_{0}, r_{0}, τ} - s (β_{0}, α_{0}, r_{0}, τ) ‖] .

By C3(ii), $\hat{r} \overset{p}{\to} r_{0}$ implies $\sup_{τ \in [ν_{1}, τ_{U, 1}]} ‖ s {\hat{β} (\hat{r}), α_{0}, r_{0}, τ} - s {\hat{β} (\hat{r}), α_{0}, \hat{r}, τ} ‖ \overset{p}{\to} 0$ . This, coupled with the inequality above and (A4), implies $\sup_{τ \in [ν_{1}, τ_{U, 1}]} ‖ s {\hat{β} (\hat{r}), α_{0}, r_{0}, τ} - s {\tilde{β} (r_{0}), α_{0}, r_{0}, τ} ‖ \overset{p}{\to} 0$ . Hence, we have $\sup_{z \in Z, τ \in [ν_{1}, τ_{U, 1}]} ‖ \hat{β} (τ; \hat{r}) - \hat{β} (τ; r_{0}) ‖ \overset{p}{\to} 0$ after using C4 (i) and Taylor expansions. Mimicking the arguments in Lemma 2, we can show with C3(ii) that

\sup_{τ \in [ν_{1}, τ_{U, 1}]} ‖ S_{n} {\hat{β} (\hat{r}), α_{0}, \hat{r}, τ} - S_{n} {\hat{β} (r_{0}), α_{0}, r_{0}, τ} - s {\hat{β} (\hat{r}), α_{0}, \hat{r}, τ} + s {\hat{β} (r_{0}), α_{0}, r_{0}, τ} ‖ = o_{p} (n^{- 1 ∕ 2}),

(A24)

and similarly,

\sup_{τ \in [ν_{1}, τ_{U, 1}]} ‖ S_{n} {\hat{β} (\hat{r}), \hat{α}, \hat{r}, τ} - S_{n} {\hat{β} (r_{0}), \hat{α}, r_{0}, τ} - S_{n} {\hat{β} (\hat{r}), α_{0}, \hat{r}, τ} + S_{n} {\hat{β} (r_{0}), α_{0}, r_{0}, τ} ‖ = o_{p} (n^{- 1 ∕ 2}) .

Note that $\sup_{τ \in [ν_{1}, τ_{U, 1}]} S_{n} {\hat{β} (r), \hat{α}, r, τ} = o (n^{- 1 ∕ 2})$ , and thus $\sup_{τ \in [ν_{1}, τ_{U, 1}]} ‖ S_{n} {\hat{β} (\hat{r}), α_{0}, \hat{r}, τ} - S_{n} {\hat{β} (r_{0}), α_{0}, r_{0}, τ} ‖ = o_{p} (n^{- 1 ∕ 2})$ . This, combined with (A24), futher implies

\sup_{τ \in [ν_{1}, τ_{U, 1}]} ‖ s {\hat{β} (\hat{r}), α_{0}, \hat{r}, τ} - s {\hat{β} (r_{0}), α_{0}, r_{0}, τ} ‖ = o_{p} (n^{- 1 ∕ 2}) .

By the uniform convergence of $\hat{β} (τ; r_{0})$ to β₀(τ) in τ ∈ [ν₁, τ_U,1], we can use Taylor expansion to get

\sup_{τ \in [ν_{1}, τ_{U, 1}]} ‖ \hat{β} (τ; \hat{r}) - \hat{β} (τ; r_{0}) + B_{b} {(β_{0}, α_{0}, r_{0}, τ)}^{- 1} B_{r} (β_{0}, α_{0}, r_{0}, τ) (\hat{r} - r_{0}) ‖ = o_{p} (n^{- 1 ∕ 2} \lor ‖ \hat{r} - r_{0} ‖) .

(A25)

By the consistency of r̂, uniform convergence of $\hat{β} (τ; r_{0})$ , and (A25), we see after an Taylor expansion that

w {\hat{β} (\hat{r}), α_{0}, \hat{r}} - w {\hat{β} (r_{0}), α_{0}, r_{0}} = L_{r} (β_{0}, α_{0}, r_{0}) (\hat{r} - r_{0}) + \int_{τ_{a}}^{τ_{b}} L_{b} (β_{0}, α_{0}, r_{0}, τ) {\hat{β} (τ; \hat{r}) - \hat{β} (τ; r_{0})} d τ + o_{p} (n^{- 1 ∕ 2} \lor ‖ \hat{r} - r_{0} ‖) = D_{r} (β_{0}, α_{0}, r_{0}) \times (\hat{r} - r_{0}) + o_{p} (n^{- 1 ∕ 2} \lor ‖ \hat{r} - r_{0} ‖),

(A26)

where $D_{r} (β β_{0}, α_{0}, r_{0}) = L_{r} (β_{0}, α_{0}, r_{0}) - \int_{τ_{a}}^{τ_{b}} L_{b} (β_{0}, α_{0} . r_{0}, τ) B_{b} {(β_{0}, α_{0}, r_{0}, τ)}^{- 1} \times B_{r} (β_{0}, α_{0}, r_{0}, τ) d τ$ Based on (A7), D_r(β₀, α₀, r₀) equals $\partial_{w} {\tilde{β} (r), α_{0}, r} ∕ \partial r ∣_{r = r_{0}}$ .

Following the arguments for Lemma 2, we can also show that $n^{1 ∕ 2} [W_{n} {\hat{β} (\hat{r}), \hat{α}, \hat{r}} - W_{n} {\hat{β} (r_{0}), \hat{α}, r_{0}}]$ can be approximated by $n^{1 ∕ 2} [w {\hat{β}, (\hat{r}), \hat{α}, \hat{r}} - w {\hat{β} (r_{0}), \hat{α}, r_{0}}]$ , and furthermore by $n^{1 ∕ 2} [w {\hat{β} (\hat{r}), α_{0}, \hat{e}} - w {\hat{β} (r_{0}), α_{0}, r_{0}}]$ . This fact, coupled with $W_{n} {\hat{β} (\hat{r}), \hat{α}, \hat{r}} = o (n^{- 1 ∕ 2})$ and (A26), gives

- n^{1 ∕ 2} W_{n} {\hat{β} (r_{0}), \hat{α}, r_{0}} = D_{r} \times n^{1 ∕ 2} (\hat{r} - r_{0}) + o_{p} (1 \lor n^{1 ∕ 2} ‖ \hat{r} - r_{0} ‖),

which, combined with (A23), further implies

n^{1 ∕ 2} (\hat{r} - r_{0}) = - n^{- 1 ∕ 2} \sum_{i = 1}^{n} D_{r}^{- 1} ι_{i} (β_{0}, α_{0}, r_{0}) + o_{p} (1) .

(A27)

Here, $D_{r}^{- 1}$ is a shorthand notation for $D_{r}^{- 1} (β_{0}, α_{0}, r_{0})$ . It then follows that n^1/2(r̂ – r₀) converges to a normal distribution with mean 0.

Finally, by combining (A17) and (A25), we can see that

n^{1 ∕ 2} {\hat{β} (τ) - β_{0} (τ)} = n^{1 ∕ 2} {\hat{β} (τ; \hat{r}) - \hat{β} (τ; r_{0})} + n^{1 ∕ 2} {\hat{β} (τ; r_{0}) - β_{0} (τ)} = n^{- 1 ∕ 2} \sum_{i = 1}^{n} [B_{b} {(β_{0}, α_{0}, r_{0}, τ)}^{- 1} B_{r} (β_{0}, α_{0}, r_{0}, τ) D_{r}^{- 1} ι_{i} (β_{0}, α_{0}, r_{0}) + B_{b} {(β_{0}, α_{0}, r_{0}, τ)}^{- 1} χ_{i} (β_{0}, α_{0}, r_{0}, τ)] + o_{p} (1) .

This implies that $n^{1 ∕ 2} {\hat{β} (τ) - β_{0} (τ)}$ converges weakly to a Gaussian process with mean 0 for τ ∈ [ν₁, τ_U,1]. This completes the proof of Theorem 2.

APPENDIX D: Convergence criteria for the proposed algorithm

The algorithm proposed in Section 2.3 involves iterative procedures which are stopped by certain pre-specified convergence criteria. In the below, we elaborate the convergence criteria that were adopted in our numerical studies and are also expected to function well in other typical settings.

First, to assess the convergence status, the distance between two estimates for β₀(·) is defined as $D_{b} (β_{1} β_{2}) = \max_{i = 1, \dots, p + 1 \int_{ν_{1}}^{τ_{U, 1}}} ‖ β_{2}^{(i)} (τ) - β_{1}^{(i)} (τ) ‖ d τ$ . The distance between two estimates for r₀ is defined as $D_{r} (r_{1}, r_{2}) = ∣ K e n (r_{1}) - K e n (r_{2}) ∣$ , where Ken(r) denotes the Kendall's tau coefficient under the assumed model (2) with r₀ = r. We define the distance measure for r on the Kendall's tau's scale in order to have a unified criterion for different choices of copula functions.

For the iterations within Steps D.0–D.2, the convergence of the sequence, ${{\hat{β}}^{[k, m]} (τ) : m = 0, 1, \dots,}$ , is determined via the following procedure.

Set a maximum number of iterations as max.iter₀, and choose convergence tolerance levels as tol_b,1 and tol_b,2.
In the qth iteration (q < max.iter₀), we regard the sequence as converged if $D_{b} ({\hat{β}}^{[k, q]}, {\hat{β}}^{[k, q - 1]}) \leq t o l_{b, 1}$ or $D_{b} ({\hat{β}}^{[k, q]}, {\hat{β}}^{[k, q - 2]}) \leq t o l_{b, 1}$ . The final estimate, $\hat{β} (τ; r^{[k]})$ , is set as ${\hat{β}}^{[k, q]} (τ)$ in the former case and as ${{\hat{β}}^{[k, q]} (τ) + {\hat{β}}^{[k, q - 1]} (τ)} ∕ 2$ , in the latter case.
When q = max.iter₀, the sequence is concluded as convergent if $D_{b} ({\hat{β}}^{[k, q]}, {\hat{β}}^{[k, q - 1]}) \leq t o l_{b, 2}$ or $D_{b} {\hat{β}}^{[k, q]}, {\hat{β}}^{[k, q - 2]}) \leq t o l_{b, 2}$ , and as non-convergent otherwise. In the convergent case, $\hat{β} (τ; r^{[k]})$ is set in the same way as in 2.

The convergence criteria for the iterations within Steps A-D need to concern both ${{\hat{β}}^{[k]} (τ) : k = 0, 2, \dots}$ and {r̂^[k](τ) : k = 1, 2, . . . }. We assess the convergence of ${{\hat{r}}^{[k]} (τ) : k = 1, 2, \dots}$ in the same way as that for ${{\hat{β}}^{[k]} (τ) : k = 1, 2, \dots}$ , except that we may adopt a different maximum number of iterations, max.iter₁. We stop the algorithm at the qth iteration (q ≤ max.iter₁) if the sequence, ${{\hat{β}}^{[k]} (τ) : k = 1, 2, \dots}$ , reaches convergence and $D_{r} (r^{[q]}, r^{[q - 1]}) \leq t o l_{k}$ , where tol_k is a prespecified tolerance level.

In the numerical studies reported in Section 3 and 4, we choose tol_b,1 = 5 × 10^–4 and tol_b.2 = tol_k = 0.005, and set max.iter₀ = 10 and max.iter₁ = 20.

Contributor Information

Ruosha Li, Ruosha Li, University of Pittsburgh, Pittsburgh, USA..

Limin Peng, Emory University, Atlanta, USA..

References

Chen Y. Semiparametric marginal regression analysis for dependent competing risks under an assumed copula. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 2010;72(2):235–251. [Google Scholar]
Chen Y. Maximum likelihood analysis of semicompeting risks data with semiparametric regression models. Lifetime Data Analysis. 2011:1–22. doi: 10.1007/s10985-011-9202-4. [DOI] [PubMed] [Google Scholar]
Clayton D. A Model for Association in Bivariate Life Tables and its Application in Epidemiological Studies of Familial Tendency in Chronic Disease Incidence. Biometrika. 1978;65(1):141–151. [Google Scholar]
Copelan E, Biggs J, Thompson J, Crilley P, Szer J, Klein J, Kapoor N, Avalos B, Cunningham I, Atkinson K. Treatment for acute myelocytic leukemia with allogeneic bone marrow transplantation following preparation with bucy2. Blood. 1991;78(3):838–843. [PubMed] [Google Scholar]
Ding A, Shi G, Wang W, Hsieh J. Marginal regression analysis for semi-competing risks data under dependent censoring. Scandinavian Journal of Statistics. 2009;36(3):481–500. [Google Scholar]
Efron B. Bootstrap Methods: Another Look at the Jackknife. The Annals of Statistics. 1979;7(1):1–26. [Google Scholar]
Fine J, Jiang H, Chappell R. On Semi-competing Risks Data. Biometrika. 2001;88(4):907–919. [Google Scholar]
Fygenson M, Ritov Y. Monotone Estimating Equations for Censored Data. The Annals of Statistics. 1994;22(2):732–746. [Google Scholar]
Genest C. Frank's Family of Bivariate Distributions. Biometrika. 1987;74(3):549–555. [Google Scholar]
Hsieh J, Wang W, Adam Ding A. Regression Analysis based on Semicompeting Risks Data. Journal of the Royal Statistical Society: Series B(Statistical Methodology) 2008;70(1):3–20. [Google Scholar]
Huang Y. Quantile Calculus and Censored Regression. The Annals of Statistics. 2010;38(3):1607–1637. doi: 10.1214/09-aos771. [DOI] [PMC free article] [PubMed] [Google Scholar]
Jiang H, Chappell R, Fine J. Estimating the distribution of nonterminal event time in the presence of mortality or informative dropout. Controlled Clinical Trials. 2003;24:135–146. doi: 10.1016/s0197-2456(02)00307-0. [DOI] [PubMed] [Google Scholar]
Kendall M, Gibbons J. Rank Correlation Methods. Charles Griffin; London: 1962. [Google Scholar]
Klein J, Moeschberger M. Survival analsysis: techniques for censored and truncated data. Springer; New York: 2005. [Google Scholar]
Koenker R. Quantile Regression. Cambridge Univ Pr; 2005. [Google Scholar]
Koenker R, Bassett G. Regression Quantiles. Econometrica. 1978;46(1):33–50. [Google Scholar]
Kosorok M. Introduction to Empirical Processes and Semiparametric Inference. Springer Verlag; 2008. [Google Scholar]
Lai T, Ying Z. Stochastic Integrals of Empirical-type Processes with Applications to Censored Regression. Journal of Multivariate Analysis. 1988;27(2):334–358. [Google Scholar]
Lin D, Robins J, Wei L. Comparing two Failure Time Distributions in the presence of Dependent Censoring. Biometrika. 1996;83(2):381–393. [Google Scholar]
Martinussen T, Scheike TH. Dynamic Regression Models for Survival Data. Springer; New York: 2006. [Google Scholar]
Nelsen R. An Introduction to Copulas. Springer Us; 2006. [Google Scholar]
Peng L, Fine J. Rank Estimation of Accelerated Lifetime Models with Dependent Censoring. Journal of the American Statistical Association. 2006;101(475):1085–1093. [Google Scholar]
Peng L, Fine J. Nonparametric Quantile Inference with Competing Risks Data. Biometrika. 2007a;94:735–744. [Google Scholar]
Peng L, Fine J. Competing Risks Quantile Regression. Journal of the American Statistical Association. 2009;104(488):1440–1453. [Google Scholar]
Peng L, Fine JP. Regression modeling of semi-competing risks data. Biometrics. 2007b;63:96–108. doi: 10.1111/j.1541-0420.2006.00621.x. [DOI] [PubMed] [Google Scholar]
Peng L, Huang Y. Survival Analysis with Quantile Regression Models. Journal of the American Statistical Association. 2008;103(482):637–649. doi: 10.1198/016214508000000184. [DOI] [PMC free article] [PubMed] [Google Scholar]
Portnoy S. Censored Regression Quantiles. Journal of the American Statistical Association. 2003;98(464):1001–1013. doi: 10.1198/01622145030000001007. [DOI] [PMC free article] [PubMed] [Google Scholar]
Powell J. Censored Regression Quantiles. Journal of Econometrics. 1986;32(1):143–155. [Google Scholar]
Tsiatis A. A Nonidentifiability Aspect of the Problem of Competing Risks. Proceedings of the National Academy of Sciences. 1975;72(1):20–22. doi: 10.1073/pnas.72.1.20. [DOI] [PMC free article] [PubMed] [Google Scholar]
Van der Vaart A, Wellner J. Weak Convergence and Empirical Processes: with Applications to Statistics. Springer Verlag; 1996. [Google Scholar]
Wang H, Wang L. Locally weighted censored quantile regression. Journal of the American Statistical Association. 2009;104(487):1117–1128. doi: 10.1198/jasa.2009.tm08420. [DOI] [PMC free article] [PubMed] [Google Scholar]
Ying Z, Jung S, Wei L. Survival Analysis with Median Regression Models. Journal of the American Statistical Association. 1995;90:178–184. [Google Scholar]
Zheng M, Klein J. Estimates of Marginal Survival for Dependent Competing Risks based on an Assumed Copula. Biometrika. 1995;82(1):127–138. [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Supp Supplement S1

NIHMS549451-supplement-Supp_Supplement_S1.pdf^{(27.3KB, pdf)}

[R1] Chen Y. Semiparametric marginal regression analysis for dependent competing risks under an assumed copula. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 2010;72(2):235–251. [Google Scholar]

[R2] Chen Y. Maximum likelihood analysis of semicompeting risks data with semiparametric regression models. Lifetime Data Analysis. 2011:1–22. doi: 10.1007/s10985-011-9202-4. [DOI] [PubMed] [Google Scholar]

[R3] Clayton D. A Model for Association in Bivariate Life Tables and its Application in Epidemiological Studies of Familial Tendency in Chronic Disease Incidence. Biometrika. 1978;65(1):141–151. [Google Scholar]

[R4] Copelan E, Biggs J, Thompson J, Crilley P, Szer J, Klein J, Kapoor N, Avalos B, Cunningham I, Atkinson K. Treatment for acute myelocytic leukemia with allogeneic bone marrow transplantation following preparation with bucy2. Blood. 1991;78(3):838–843. [PubMed] [Google Scholar]

[R5] Ding A, Shi G, Wang W, Hsieh J. Marginal regression analysis for semi-competing risks data under dependent censoring. Scandinavian Journal of Statistics. 2009;36(3):481–500. [Google Scholar]

[R6] Efron B. Bootstrap Methods: Another Look at the Jackknife. The Annals of Statistics. 1979;7(1):1–26. [Google Scholar]

[R7] Fine J, Jiang H, Chappell R. On Semi-competing Risks Data. Biometrika. 2001;88(4):907–919. [Google Scholar]

[R8] Fygenson M, Ritov Y. Monotone Estimating Equations for Censored Data. The Annals of Statistics. 1994;22(2):732–746. [Google Scholar]

[R9] Genest C. Frank's Family of Bivariate Distributions. Biometrika. 1987;74(3):549–555. [Google Scholar]

[R10] Hsieh J, Wang W, Adam Ding A. Regression Analysis based on Semicompeting Risks Data. Journal of the Royal Statistical Society: Series B(Statistical Methodology) 2008;70(1):3–20. [Google Scholar]

[R11] Huang Y. Quantile Calculus and Censored Regression. The Annals of Statistics. 2010;38(3):1607–1637. doi: 10.1214/09-aos771. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R12] Jiang H, Chappell R, Fine J. Estimating the distribution of nonterminal event time in the presence of mortality or informative dropout. Controlled Clinical Trials. 2003;24:135–146. doi: 10.1016/s0197-2456(02)00307-0. [DOI] [PubMed] [Google Scholar]

[R13] Kendall M, Gibbons J. Rank Correlation Methods. Charles Griffin; London: 1962. [Google Scholar]

[R14] Klein J, Moeschberger M. Survival analsysis: techniques for censored and truncated data. Springer; New York: 2005. [Google Scholar]

[R15] Koenker R. Quantile Regression. Cambridge Univ Pr; 2005. [Google Scholar]

[R16] Koenker R, Bassett G. Regression Quantiles. Econometrica. 1978;46(1):33–50. [Google Scholar]

[R17] Kosorok M. Introduction to Empirical Processes and Semiparametric Inference. Springer Verlag; 2008. [Google Scholar]

[R18] Lai T, Ying Z. Stochastic Integrals of Empirical-type Processes with Applications to Censored Regression. Journal of Multivariate Analysis. 1988;27(2):334–358. [Google Scholar]

[R19] Lin D, Robins J, Wei L. Comparing two Failure Time Distributions in the presence of Dependent Censoring. Biometrika. 1996;83(2):381–393. [Google Scholar]

[R20] Martinussen T, Scheike TH. Dynamic Regression Models for Survival Data. Springer; New York: 2006. [Google Scholar]

[R21] Nelsen R. An Introduction to Copulas. Springer Us; 2006. [Google Scholar]

[R22] Peng L, Fine J. Rank Estimation of Accelerated Lifetime Models with Dependent Censoring. Journal of the American Statistical Association. 2006;101(475):1085–1093. [Google Scholar]

[R23] Peng L, Fine J. Nonparametric Quantile Inference with Competing Risks Data. Biometrika. 2007a;94:735–744. [Google Scholar]

[R24] Peng L, Fine J. Competing Risks Quantile Regression. Journal of the American Statistical Association. 2009;104(488):1440–1453. [Google Scholar]

[R25] Peng L, Fine JP. Regression modeling of semi-competing risks data. Biometrics. 2007b;63:96–108. doi: 10.1111/j.1541-0420.2006.00621.x. [DOI] [PubMed] [Google Scholar]

[R26] Peng L, Huang Y. Survival Analysis with Quantile Regression Models. Journal of the American Statistical Association. 2008;103(482):637–649. doi: 10.1198/016214508000000184. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R27] Portnoy S. Censored Regression Quantiles. Journal of the American Statistical Association. 2003;98(464):1001–1013. doi: 10.1198/01622145030000001007. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R28] Powell J. Censored Regression Quantiles. Journal of Econometrics. 1986;32(1):143–155. [Google Scholar]

[R29] Tsiatis A. A Nonidentifiability Aspect of the Problem of Competing Risks. Proceedings of the National Academy of Sciences. 1975;72(1):20–22. doi: 10.1073/pnas.72.1.20. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R30] Van der Vaart A, Wellner J. Weak Convergence and Empirical Processes: with Applications to Statistics. Springer Verlag; 1996. [Google Scholar]

[R31] Wang H, Wang L. Locally weighted censored quantile regression. Journal of the American Statistical Association. 2009;104(487):1117–1128. doi: 10.1198/jasa.2009.tm08420. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R32] Ying Z, Jung S, Wei L. Survival Analysis with Median Regression Models. Journal of the American Statistical Association. 1995;90:178–184. [Google Scholar]

[R33] Zheng M, Klein J. Estimates of Marginal Survival for Dependent Competing Risks based on an Assumed Copula. Biometrika. 1995;82(1):127–138. [Google Scholar]

PERMALINK

Quantile Regression Adjusting for Dependent Censoring from Semi-Competing Risks

Ruosha Li

Limin Peng

Summary

1. Introduction

Fig. 1.

2. Quantile Regression Procedure

2.1. Data and Model

2.2. Estimating Equations

2.3 Computational Algorithm

2.4. Asymptotic Results

2.5. Other Inferences

3. Simulation Studies

Table 1.

Table 2.

Table 3.

Table 4.

4. A Real Example

Fig. 2.

Fig. 3.

5. Remarks

Supplementary Material

Acknowledgements

APPENDIX A: REGULARITY CONDITIONS

APPENDIX B: PROOF OF THEOREM 1

APPENDIX C: PROOF OF THEOREM 2

APPENDIX D: Convergence criteria for the proposed algorithm

Contributor Information

References

Associated Data

Supplementary Materials

ACTIONS

PERMALINK

RESOURCES

Cite

Add to Collections

PERMALINK

Quantile Regression Adjusting for Dependent Censoring from Semi-Competing Risks

Ruosha Li

Limin Peng

Summary

1. Introduction

Fig. 1.

2. Quantile Regression Procedure

2.1. Data and Model

2.2. Estimating Equations

2.3 Computational Algorithm

2.4. Asymptotic Results

2.5. Other Inferences

3. Simulation Studies

Table 1.

Table 2.

Table 3.

Table 4.

4. A Real Example

Fig. 2.

Fig. 3.

5. Remarks

Supplementary Material

Acknowledgements

APPENDIX A: REGULARITY CONDITIONS

APPENDIX B: PROOF OF THEOREM 1

APPENDIX C: PROOF OF THEOREM 2

APPENDIX D: Convergence criteria for the proposed algorithm

Contributor Information

References

Associated Data

Supplementary Materials

ACTIONS

PERMALINK

RESOURCES

Similar articles

Cited by other articles

Links to NCBI Databases