Author manuscript; available in PMC: 2018 Jun 1.
Published in final edited form as: Stat Biosci. 2016 Oct 28;9(1):298–315. doi: 10.1007/s12561-016-9180-x

Estimating a Treatment Effect in Residual Time Quantiles under the Additive Hazards Model

Luis Alexander Crouch 1, Cheng Zheng 2, Ying Qing Chen 3
PMCID: PMC5501423  NIHMSID: NIHMS826342  PMID: 28694879

Abstract

For randomized clinical trials where the endpoint of interest is a time-to-event subject to censoring, estimating the treatment effect has mostly focused on the hazard ratio from the Cox proportional hazards model. Since the model’s proportional hazards assumption is not always satisfied, a useful alternative, the so-called additive hazards model, may instead be used to estimate a treatment effect on the difference of hazard functions. Still, the hazards difference may be difficult to grasp intuitively, particularly in clinical settings such as patient counseling or resource planning. In this paper, we study the quantiles of a covariate’s conditional survival function in the additive hazards model. Specifically, we estimate the residual time quantiles, i.e., the quantiles of survival times remaining at a given time t, conditional on the survival times being greater than t, for a specific covariate in the additive hazards model. We use the estimates to translate the hazards difference into the difference in residual time quantiles, which allows a more direct clinical interpretation. We determine the asymptotic properties, assess the performance via Monte-Carlo simulations, and demonstrate the use of residual time quantiles in two real randomized clinical trials.

Keywords: Clinical trial, Hazard function, Covariate-specific estimate, Remaining time, Survival analysis

1 Introduction

In most randomized clinical trials, a treatment effect is typically estimated by fitting a Cox proportional hazards model [3] to censored time-to-event endpoints. The Cox model offers many advantages: the ability to handle censored data, no need for parametric assumptions about the baseline hazard function, regression coefficients interpretable as hazard ratios, and readily available software for implementation. However, not all collected data satisfy the assumption of proportional hazards. For this reason, many alternative models have been developed; see Kalbfleisch and Prentice [11].

One useful alternative is the so-called additive hazards model, as in Lin and Ying [16]. The additive hazards model can also handle censored data, and makes no parametric assumptions about the baseline hazard function; but its regression coefficient is interpreted in hazards difference. For example, Kalbfleisch and Prentice [11] contains a well-known oropharynx carcinoma trial data set. In the trial, a total of 195 patients with squamous cell carcinoma were randomly assigned treatment with radiation therapy alone or together with a chemotherapeutic agent. When the additive hazards model is applied to the data set, the hazards differences are estimated to be 0.08 and 0.35 for tumor stages III and IV, respectively, compared to stage I/II.

Although a hazards difference can be used to measure a treatment effect, similar to the hazard ratio of the Cox proportional hazards model, it is usually less intuitive to care providers and patients in real clinical settings. In contrast, a difference in event times may be more easily understood. In particular, when patients are still event-free at some time t, comparing their remaining event times after t can be more meaningful. In the literature, similar comparisons have been made for event time quantiles based on empirical survival functions [1]. Dabrowska and Doksum [5] developed methods to estimate quantiles based on the Cox model. For median event times, Ying, Jung and Wei [10] and Yang [19] used semiparametric regression models for more general estimation.

However, event time quantiles in clinical trials may not always be clinically meaningful, because time zero is usually set artificially at enrollment or randomization. For example, an estimate of median survival time for a patient diagnosed with a disease whose hazard is high initially and drops over time may not be relevant when considering the patient's outlook after the initial spike in hazard. In this paper, we instead consider residual time quantiles. A residual time is the amount of survival time remaining at a given time t, assuming survival to t. Estimating residual time quantiles generalizes estimating survival time quantiles, which corresponds to the special case t ≡ 0.

Methods for residual time quantiles began to accumulate recently. Jeong, Jung and Costantino [9] proposed nonparametric methods for estimating median residual time (MRT) by inverting Kaplan-Meier estimators [12]. In a follow-up paper, Jung, Jeong and Bandos [10] proposed a regression model that allows for modeling of covariate effects on general quantile residual time. A different regression model is proposed by Ma and Yin [17], allowing for estimation of quantiles of residual times in addition to covariate effects on them. In a recent paper, Crouch, May and Chen [4] developed covariate-specific estimators for residual time quantiles based on the Cox model.

In this article, we aim to develop an estimator for residual time quantiles under the additive hazards model. We begin in Section 2 to show that the newly developed estimator based on the additive hazards model allows for estimation of covariate-specific residual time quantiles. We demonstrate our estimator’s consistency, determine its limiting distribution, and provide a consistent estimator for its variance. Also included are discussions of methods for obtaining confidence intervals and bands that do not rely on direct estimation of the variance. We further develop our method in Section 3, determining the limiting distribution for a difference between two estimators of covariate-specific residual time quantiles and thereby allowing formal testing. In Section 4 we demonstrate our estimator’s performance on simulated data, including figures showing confidence intervals and bands. Additionally, we apply our method to two real data sets: the VA lung cancer data set in Kalbfleisch and Prentice [11] and the Human Immunodeficiency Virus (HIV) Mother-to-Child transmission prevention trial data set in Jackson et al. [8]. Finally, in Section 5 we discuss the mean residual times, and extension of our method to allow for time-varying covariates, and low event rates.

2 Model-based Estimation of Residual Time Quantiles

Assume that T is a positive random variable representing a subject’s time-to-event. At a given time t, we first define the (1−q)th (0 < q < 1) percentile residual time of a random variable T as the amount of additional time necessary for (1 − q) × 100% of the individuals still under observation at time t to fail. We denote this quantity by θ(t, q|Z), where Z is the associated p-dimensional vector of covariates. Let

$$S\{t+\theta(t,q\mid Z)\mid Z\} = q\,S(t\mid Z), \tag{1}$$

where S(·|Z) is the survival function. If expressed in terms of the cumulative hazard function, Λ(·|Z) = − log S (·|Z), we then have Λ {t + θ(t, q|Z)|Z} = Λ(t|Z) − log q.
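As a quick check of definition (1), consider the special case of a constant hazard λ > 0. This is an illustrative assumption, not a model used in this paper. Then

$$S(t)=e^{-\lambda t}\;\Longrightarrow\; e^{-\lambda\{t+\theta(t,q)\}} = q\,e^{-\lambda t}\;\Longrightarrow\;\theta(t,q) = -\frac{\log q}{\lambda},$$

so the residual time quantile does not depend on t, as expected from the memoryless property. For instance, with q = 0.5 and λ = 0.5 the median residual time is 2 log 2 ≈ 1.39 at every t. Under a non-constant hazard, θ(t, q) generally changes with t, which is what makes it informative.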

Now consider the additive hazards model, which assumes

$$\Lambda(t\mid Z) = \Lambda_0(t) + t\,\beta^{T}Z, \tag{2}$$

where Λ0(·) is the unspecified baseline cumulative hazard function, β is the p-dimensional regression parameter, and the superscript T denotes vector (matrix) transpose. Under model (2) we have

$$\Lambda_0\{t+\theta(t,q\mid Z)\} + \theta(t,q\mid Z)\,\beta^{T}Z = \Lambda_0(t) - \log q.$$

Unfortunately we cannot obtain a closed-form solution for θ(t, q|Z), though it can be solved for numerically.
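The numerical solution can be sketched as follows. This is a minimal illustration, assuming a known baseline cumulative hazard; the choice Λ0(t) = t²/2, the value of βᵀZ, and the function names `solve_theta` and `lambda0` are hypothetical and not taken from the paper. Bisection is used because the left-hand side is increasing in θ whenever the hazard is positive. For this particular baseline a closed form happens to exist, which provides a check.

```python
import math

def solve_theta(cum_base_haz, t, q, lin_pred, upper=1e3, tol=1e-10):
    """Solve Lambda0(t + theta) + theta * lin_pred = Lambda0(t) - log(q) for theta >= 0.

    cum_base_haz: callable baseline cumulative hazard Lambda0 (assumed known here).
    lin_pred: the scalar beta^T Z.
    Assumes the left-hand side is increasing in theta (positive hazard).
    """
    target = cum_base_haz(t) - math.log(q)

    def g(theta):
        return cum_base_haz(t + theta) + theta * lin_pred - target

    lo, hi = 0.0, upper  # g(0) = log(q) < 0, so the root lies in (0, upper] if g(upper) >= 0
    if g(hi) < 0:
        raise ValueError("no root below `upper`")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical baseline: Lambda0(t) = t^2 / 2, i.e. baseline hazard lambda0(t) = t.
def lambda0(t):
    return 0.5 * t * t

theta_num = solve_theta(lambda0, t=0.25, q=0.5, lin_pred=1.0)
# Closed form for this baseline: theta = -(t + b) + sqrt((t + b)^2 - 2 log q), with b = beta^T Z.
theta_closed = -(0.25 + 1.0) + math.sqrt((0.25 + 1.0) ** 2 - 2.0 * math.log(0.5))
print(theta_num, theta_closed)
```

In practice Λ0 is unknown and is replaced by an estimate, in which case the left-hand side becomes a step-plus-linear function and the root has to be defined as a crossing point.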

In order to estimate θ(t, q|Z) we need estimates of both β and Λ0(·). Suppose that data are collected in the form (Xi, Δi, Zi) for i = 1, …, n, with n being the sample size. Here Xi = min(Ti, Ci), where Ti is a failure time and Ci is a censoring time; Δi = I(Ti ≤ Ci); and Zi is a vector of covariates. Given Zi, Ti and Ci are assumed to be independent. Define the counting process Ni(t) = I(Xi ≤ t, Δi = 1) and the at-risk process Yi(t) = I(Xi ≥ t). As in Lin and Ying [16], we can estimate β with

$$\hat\beta = \left[\sum_{i=1}^{n}\int_0^{\infty} Y_i(t)\{Z_i-\bar Z(t)\}^{\otimes 2}\,dt\right]^{-1}\sum_{i=1}^{n}\int_0^{\infty}\{Z_i-\bar Z(t)\}\,dN_i(t),$$

where $\bar Z(t)=\sum_{i=1}^{n}Y_i(t)Z_i\big/\sum_{i=1}^{n}Y_i(t)$ and $a^{\otimes 2}=aa^{T}$. We can also estimate the baseline cumulative hazard function Λ0(·) with

$$\hat\Lambda_0(t)=\int_0^{t}\frac{\sum_{i=1}^{n}\{dN_i(s)-Y_i(s)\hat\beta^{T}Z_i\,ds\}}{\sum_{i=1}^{n}Y_i(s)}.$$

With the estimators $\hat\beta$ and $\hat\Lambda_0(\cdot)$, we can estimate θ(t, q|Z) by $\hat\theta(t,q\mid Z)$, which is the solution of

$$\hat\Lambda_0\{t+\hat\theta(t,q\mid Z)\}+\hat\theta(t,q\mid Z)\,\hat\beta^{T}Z=\hat\Lambda_0(t)-\log q. \tag{3}$$
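The estimation steps above can be sketched in plain Python. This is an illustrative sketch under several simplifying assumptions that are not part of the paper: a single scalar covariate, no tied observation times, a bisection search that returns one crossing point of the step-plus-linear estimating function, and simulated data generated from an assumed hazard λ(t|Z) = t + βZ with β = 1. All function names are hypothetical.

```python
import math
import random

def fit_additive_hazards(x, delta, z):
    """Lin-Ying-type estimate of a scalar beta, plus the pieces needed for Lambda0-hat.

    x: observed times; delta: 1 for a failure, 0 for censoring; z: scalar covariates.
    Assumes no tied times. On (times[k-1], times[k]] the at-risk mean is zbar[k].
    """
    order = sorted(range(len(x)), key=lambda i: x[i])
    times = [x[i] for i in order]
    zs = [z[i] for i in order]
    ds = [delta[i] for i in order]
    n = len(times)
    s1 = [0.0] * (n + 1)  # suffix sums of Z over the risk set
    s2 = [0.0] * (n + 1)  # suffix sums of Z^2 over the risk set
    for k in range(n - 1, -1, -1):
        s1[k] = s1[k + 1] + zs[k]
        s2[k] = s2[k + 1] + zs[k] ** 2
    a_sum, score, prev = 0.0, 0.0, 0.0
    zbar, na_jumps = [], []
    for k in range(n):
        at_risk = n - k
        mean_k = s1[k] / at_risk
        zbar.append(mean_k)
        # sum over the risk set of (Z_i - Zbar)^2, times the interval length
        a_sum += (s2[k] - at_risk * mean_k ** 2) * (times[k] - prev)
        score += ds[k] * (zs[k] - mean_k)   # sum over failures of Z_i - Zbar(X_i)
        na_jumps.append(ds[k] / at_risk)    # Nelson-Aalen increment
        prev = times[k]
    return score / a_sum, times, na_jumps, zbar

def cum_base_haz(s, beta, times, na_jumps, zbar):
    """Lambda0-hat(s): Nelson-Aalen part minus beta * integral_0^s Zbar(u) du."""
    total, prev = 0.0, 0.0
    for t_k, jump, mean_k in zip(times, na_jumps, zbar):
        if t_k > s:
            return total - beta * mean_k * (s - prev)
        total += jump - beta * mean_k * (t_k - prev)
        prev = t_k
    return total

def residual_quantile(t, q, z0, beta, times, na_jumps, zbar, tol=1e-8):
    """Bisection for equation (3); returns None when no crossing is observed."""
    base = cum_base_haz(t, beta, times, na_jumps, zbar) - math.log(q)

    def g(theta):
        return cum_base_haz(t + theta, beta, times, na_jumps, zbar) + theta * beta * z0 - base

    lo, hi = 0.0, times[-1] - t
    if hi <= 0 or g(hi) < 0:
        return None
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if g(mid) < 0 else (lo, mid)
    return hi

# Simulated data under an ASSUMED hazard lambda(t|Z) = t + beta * Z with beta = 1.
random.seed(2016)
n, beta_true = 2000, 1.0
x, delta, z = [], [], []
for _ in range(n):
    zi = float(random.random() < 0.5)
    b = beta_true * zi
    ti = -b + math.sqrt(b * b + 2.0 * random.expovariate(1.0))  # inverts t^2/2 + b*t = E
    ci = random.expovariate(0.2)                                # independent censoring
    x.append(min(ti, ci))
    delta.append(int(ti <= ci))
    z.append(zi)

beta_hat, times, na_jumps, zbar = fit_additive_hazards(x, delta, z)
theta_hat = residual_quantile(0.25, 0.5, 1.0, beta_hat, times, na_jumps, zbar)
theta_true = -1.25 + math.sqrt(1.25 ** 2 - 2.0 * math.log(0.5))
print(beta_hat, theta_hat, theta_true)
```

The `None` return mirrors the fact that the estimate cannot be computed when the required cumulative hazard level is never reached within the observed follow-up.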

Note that $\hat\theta(t,q\mid Z)$ is only defined when $\hat\Lambda_0\{t+\hat\theta(t,q\mid Z)\}+\hat\theta(t,q\mid Z)\,\hat\beta^{T}Z+\log q\le\hat\Lambda_0(\tau)$, where τ is the largest observed failure time. For $\hat\theta(t,q\mid Z)$, we have the following asymptotic properties, summarized in Theorem 1:

Theorem 1

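Assume that conditions A–C [1] hold. Denote by $\hat\theta(t,q\mid Z)$ the solution to $\hat\Lambda_0\{t+\hat\theta(t,q\mid Z)\}+\hat\theta(t,q\mid Z)\,\hat\beta^{T}Z=\hat\Lambda_0(t)-\log q$. As n → ∞, for a given Z, $\sqrt{n}\{\hat\theta(t,q\mid Z)-\theta(t,q\mid Z)\}$ converges weakly to a zero-mean Gaussian process whose variance function at t can be estimated consistently by $n\hat V(t)$, where

$$\hat V(t)=\left[\hat\lambda_0\{\hat\theta(t,q\mid Z)+t\}+\hat\beta^{T}Z\right]^{-2}\times\left\{C^{T}A^{-1}BA^{-1}C-2C^{T}A^{-1}D+\int_0^{\hat\theta(t,q\mid Z)}\frac{\sum_{i=1}^{n}dN_i(t+v)}{\{\sum_{i=1}^{n}Y_i(t+v)\}^{2}}\,dv\right\},$$

where

$$A=\sum_{i=1}^{n}\int_0^{\infty}Y_i(t)\{Z_i-\bar Z(t)\}^{\otimes 2}\,dt,\qquad B=\sum_{i=1}^{n}\int_0^{\infty}\{Z_i-\bar Z(t)\}^{\otimes 2}\,dN_i(t),$$

$$C=\int_0^{\hat\theta(t,q\mid Z)}\bar Z(t+v)\,dv-\hat\theta(t,q\mid Z)\,Z,\qquad D=\int_0^{\hat\theta(t,q\mid Z)}\frac{\sum_{i=1}^{n}\{Z_i-\bar Z(t+v)\}\,dN_i(t+v)}{\sum_{i=1}^{n}Y_i(t+v)}.$$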

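Details of Theorem 1’s proof are in the Appendix. With Theorem 1, pointwise confidence intervals can be calculated: we may simply take $\hat\theta(t,q\mid Z)\pm z_{1-\alpha/2}\sqrt{\hat V(t)}$ for a 100(1 − α)% confidence interval. Moreover, we can use asymptotic results to obtain confidence bands as well. Specifically, we know that

$$\sup_{0\le t\le\tau}\frac{\sqrt{n}\,|\hat\theta(t,q\mid Z)-\theta(t,q\mid Z)|}{\sqrt{n\hat V(\tau)}}\;\xrightarrow{\;p\;}\;\sup_{0\le t\le 1}|W(t)|,$$

where W(t) is standard Brownian motion. We can therefore solve

$$\Pr\Big(\sup_{0\le t\le 1}|W(t)|\le c\Big)=\frac{4}{\pi}\sum_{k=0}^{\infty}\frac{(-1)^{k}}{2k+1}\exp\left\{-\frac{\pi^{2}(2k+1)^{2}}{8c^{2}}\right\}=1-\alpha$$

for c and compute 100(1 − α)% confidence bands using $\hat\theta(t,q\mid Z)\pm c\sqrt{\hat V(\tau)}$.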
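Solving for c is straightforward numerically. The sketch below truncates the alternating series and applies bisection; the function names, the truncation at 50 terms, and the search bracket are illustrative choices rather than part of the paper.

```python
import math

def sup_abs_bm_cdf(c, terms=50):
    """Pr(sup_{0<=t<=1} |W(t)| <= c), evaluated from the truncated alternating series."""
    total = 0.0
    for k in range(terms):
        odd = 2 * k + 1
        total += (-1) ** k / odd * math.exp(-(math.pi ** 2) * odd ** 2 / (8.0 * c ** 2))
    return 4.0 / math.pi * total

def band_critical_value(alpha, lo=0.1, hi=10.0, tol=1e-10):
    """Bisection for c such that sup_abs_bm_cdf(c) = 1 - alpha (the CDF increases in c)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if sup_abs_bm_cdf(mid) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

c95 = band_critical_value(0.05)
print(round(c95, 3))
```

For α = 0.05 this gives c of about 2.24, noticeably larger than the pointwise normal quantile 1.96, which is why bands are wider than pointwise intervals.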

In addition, we can compute both confidence bands and intervals using either bootstrap or simulation methods. Using the bootstrap method, we simply find the 2.5th and 97.5th percentiles of the bootstrap sample estimates of $\hat\theta(t,q\mid Z)$ at each time point in order to get pointwise intervals (Efron and Tibshirani [6]). To obtain bands, we first calculate the maximum deviation within each bootstrap sample across time, $\sup_{0\le t\le\tau}|\hat\theta^{*}(t,q\mid Z)-\hat\theta(t,q\mid Z)|$, where $\hat\theta^{*}$ denotes the estimate from a bootstrap sample, and then find the 95th percentile across samples, c. We can then construct bands using $\hat\theta(t,q\mid Z)\pm c$.
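The bootstrap recipe can be sketched as follows, given the original estimate on a time grid and one re-estimated curve per bootstrap sample. The function names and the linear-interpolation percentile rule are assumptions made for illustration; the paper does not specify a percentile convention.

```python
def percentile(sorted_vals, p):
    """Linear-interpolation percentile (p in [0, 100]) of an ascending list."""
    pos = (len(sorted_vals) - 1) * p / 100.0
    low = int(pos)
    high = min(low + 1, len(sorted_vals) - 1)
    return sorted_vals[low] + (pos - low) * (sorted_vals[high] - sorted_vals[low])

def bootstrap_intervals_and_band(estimate, boot_curves, level=95.0):
    """estimate: theta-hat on a time grid; boot_curves: one re-estimated curve per resample."""
    tail = (100.0 - level) / 2.0
    pointwise = []
    for j in range(len(estimate)):
        col = sorted(curve[j] for curve in boot_curves)
        pointwise.append((percentile(col, tail), percentile(col, 100.0 - tail)))
    # Maximum absolute deviation from the original estimate within each resample.
    sup_dev = sorted(
        max(abs(curve[j] - estimate[j]) for j in range(len(estimate)))
        for curve in boot_curves
    )
    c_band = percentile(sup_dev, level)
    band = [(e - c_band, e + c_band) for e in estimate]
    return pointwise, band, c_band

# Toy input (hypothetical numbers): two grid points, five bootstrap curves.
estimate = [1.0, 2.0]
boot_curves = [[1.0 + d, 2.0 - d] for d in (-0.2, -0.1, 0.0, 0.1, 0.2)]
pointwise, band, c_band = bootstrap_intervals_and_band(estimate, boot_curves)
print(pointwise, band, c_band)
```

The band uses a single constant c for all time points, whereas the pointwise limits are computed separately at each grid point.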

The simulation method is in fact another way of resampling (Parzen, Wei and Ying [18]). In the formulation of $\hat\theta(t,q\mid Z)-\theta(t,q\mid Z)$, we replace dMi(u) with Gi dNi(u), where Gi ∼ N(0, 1). For each simulated sample, all Gi are randomly generated and we compute the estimated residual time quantile for that sample, $\tilde\theta(t,q\mid Z)$. At each time point, the 2.5th and 97.5th percentiles of the deviations $\tilde\theta(t,q\mid Z)-\theta(t,q\mid Z)$ are calculated and added to the estimate $\hat\theta(t,q\mid Z)$ to get lower and upper pointwise confidence limits, respectively. Calculating bands is the same as with the bootstrap: we find the maximum deviation within each simulated sample across time, $\sup_{0\le t\le\tau}|\tilde\theta(t,q\mid Z)-\hat\theta(t,q\mid Z)|$, and then find the 95th percentile across samples, c. We can then construct bands using $\hat\theta(t,q\mid Z)\pm c$. One limitation of this type of confidence band is that it is often too conservative in practice (Hall et al. [7] and Li et al. [14]).

3 Comparing residual time quantiles

While being able to estimate covariate-specific residual time quantiles and their variance is useful, in most practical applications it is also important to be able to carry out comparisons between different covariate values and perform formal tests to determine if any observed difference is statistically significant. We may also be interested in formally comparing residual times at different fixed time points or for different quantiles. All of these tasks require being able to estimate the covariance between two different residual time quantiles.

We can extend results from Theorem 1 to establish asymptotic properties for the differences between quantiles of residual time for different sets of covariate values, evaluation times, and quantiles. These properties are summarized in Theorem 2.

Theorem 2

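Assume that conditions A–C [1] hold. Denote by $\hat\theta(t,q\mid Z)$ the solution to $\hat\Lambda_0\{t+\hat\theta(t,q\mid Z)\}+\hat\theta(t,q\mid Z)\,\hat\beta^{T}Z=\hat\Lambda_0(t)-\log q$. Then as n → ∞, for given Z1 and Z2 and fixed t1, t2, q1, and q2, $\sqrt{n}\{(\hat\theta(t_1,q_1\mid Z_1)-\hat\theta(t_2,q_2\mid Z_2))-(\theta(t_1,q_1\mid Z_1)-\theta(t_2,q_2\mid Z_2))\}$ converges weakly to a zero-mean Gaussian process whose variance function at t can be estimated consistently by $n\hat W(t)$, where

$$
\begin{aligned}
\hat W(t)={}&\left[\hat\lambda_0\{\hat\theta(t_1,q_1\mid Z_1)+t_1\}+\hat\beta^{T}Z_1\right]^{-2}\times\left\{C_1^{T}A^{-1}BA^{-1}C_1-2C_1^{T}A^{-1}D_1+\int_0^{\hat\theta(t_1,q_1\mid Z_1)}\frac{\sum_{i=1}^{n}dN_i(t_1+v)}{\{\sum_{i=1}^{n}Y_i(t_1+v)\}^{2}}\,dv\right\}\\
&+\left[\hat\lambda_0\{\hat\theta(t_2,q_2\mid Z_2)+t_2\}+\hat\beta^{T}Z_2\right]^{-2}\times\left\{C_2^{T}A^{-1}BA^{-1}C_2-2C_2^{T}A^{-1}D_2+\int_0^{\hat\theta(t_2,q_2\mid Z_2)}\frac{\sum_{i=1}^{n}dN_i(t_2+v)}{\{\sum_{i=1}^{n}Y_i(t_2+v)\}^{2}}\,dv\right\}\\
&-\left[\hat\lambda_0\{\hat\theta(t_1,q_1\mid Z_1)+t_1\}+\hat\beta^{T}Z_1\right]^{-1}\left[\hat\lambda_0\{\hat\theta(t_2,q_2\mid Z_2)+t_2\}+\hat\beta^{T}Z_2\right]^{-1}\\
&\quad\times\left\{2\left[C_1^{T}A^{-1}BA^{-1}C_2-C_1^{T}A^{-1}D_2-C_2^{T}A^{-1}D_1+\int_0^{\hat\eta_{\min}-t_{\max}}\frac{\sum_{i=1}^{n}dN_i(t_{\max}+v)}{\{\sum_{i=1}^{n}Y_i(t_{\max}+v)\}^{2}}\,dv\right]\right\},
\end{aligned}
$$

where

$$A=\sum_{i=1}^{n}\int_0^{\infty}Y_i(t)\{Z_i-\bar Z(t)\}^{\otimes 2}\,dt,\qquad B=\sum_{i=1}^{n}\int_0^{\infty}\{Z_i-\bar Z(t)\}^{\otimes 2}\,dN_i(t),$$

$$C_j=\int_0^{\hat\theta(t_j,q_j\mid Z_j)}\bar Z(t_j+v)\,dv-\hat\theta(t_j,q_j\mid Z_j)\,Z_j,\qquad D_j=\int_0^{\hat\theta(t_j,q_j\mid Z_j)}\frac{\sum_{i=1}^{n}\{Z_i-\bar Z(t_j+v)\}\,dN_i(t_j+v)}{\sum_{i=1}^{n}Y_i(t_j+v)},$$

for j = 1, 2, $\hat\eta_{\min}=\min\{\hat\theta(t_1,q_1\mid Z_1)+t_1,\ \hat\theta(t_2,q_2\mid Z_2)+t_2\}$, and $t_{\max}=\max\{t_1,t_2\}$.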

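Details of Theorem 2’s proof are in the Appendix. With Theorem 2, pointwise confidence intervals can be calculated: we may simply take $\{\hat\theta(t_1,q_1\mid Z_1)-\hat\theta(t_2,q_2\mid Z_2)\}\pm z_{1-\alpha/2}\sqrt{\hat W(t)}$ for a 100(1 − α)% confidence interval. Methods for calculating confidence bands and other methods for calculating confidence intervals are similar to those explained above.

The Wald-type statistic $\hat W(t)^{-1}[\hat\theta(t_1,q_1\mid Z_1)-\hat\theta(t_2,q_2\mid Z_2)]^{2}$ can be used to test the null hypothesis H0 : θ(t1, q1|Z1) = θ(t2, q2|Z2). Alternatively, we can obtain the asymptotic variance of the ratio $\hat\theta(t_1,q_1\mid Z_1)/\hat\theta(t_2,q_2\mid Z_2)$ by the delta method, similar to Lin, Zhang and Zhou [15]:

$$
\begin{aligned}
\hat W_r(t)={}&\hat\theta(t_2,q_2\mid Z_2)^{-2}\left[\hat\lambda_0\{\hat\theta(t_1,q_1\mid Z_1)+t_1\}+\hat\beta^{T}Z_1\right]^{-2}\times\left\{C_1^{T}A^{-1}BA^{-1}C_1-2C_1^{T}A^{-1}D_1+\int_0^{\hat\theta(t_1,q_1\mid Z_1)}\frac{\sum_{i=1}^{n}dN_i(t_1+v)}{\{\sum_{i=1}^{n}Y_i(t_1+v)\}^{2}}\,dv\right\}\\
&+\hat\theta(t_1,q_1\mid Z_1)^{2}\,\hat\theta(t_2,q_2\mid Z_2)^{-4}\left[\hat\lambda_0\{\hat\theta(t_2,q_2\mid Z_2)+t_2\}+\hat\beta^{T}Z_2\right]^{-2}\times\left\{C_2^{T}A^{-1}BA^{-1}C_2-2C_2^{T}A^{-1}D_2+\int_0^{\hat\theta(t_2,q_2\mid Z_2)}\frac{\sum_{i=1}^{n}dN_i(t_2+v)}{\{\sum_{i=1}^{n}Y_i(t_2+v)\}^{2}}\,dv\right\}\\
&-\hat\theta(t_1,q_1\mid Z_1)\,\hat\theta(t_2,q_2\mid Z_2)^{-3}\left[\hat\lambda_0\{\hat\theta(t_1,q_1\mid Z_1)+t_1\}+\hat\beta^{T}Z_1\right]^{-1}\left[\hat\lambda_0\{\hat\theta(t_2,q_2\mid Z_2)+t_2\}+\hat\beta^{T}Z_2\right]^{-1}\\
&\quad\times\left\{2\left[C_1^{T}A^{-1}BA^{-1}C_2-C_1^{T}A^{-1}D_2-C_2^{T}A^{-1}D_1+\int_0^{\hat\eta_{\min}-t_{\max}}\frac{\sum_{i=1}^{n}dN_i(t_{\max}+v)}{\{\sum_{i=1}^{n}Y_i(t_{\max}+v)\}^{2}}\,dv\right]\right\}
\end{aligned}
$$

and then perform the test with $\hat W_r(t)^{-1}[\hat\theta(t_1,q_1\mid Z_1)/\hat\theta(t_2,q_2\mid Z_2)-1]^{2}$.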
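Once the two variance terms and the covariance term are available, both tests reduce to a few lines of arithmetic. In the sketch below, `var1`, `var2`, and `cov12` stand for the estimated variances of the two quantile estimates and their covariance (so the difference variance is `var1 + var2 - 2 * cov12`); these argument names, the example numbers, and the use of a chi-square reference distribution with one degree of freedom are assumptions made for illustration.

```python
import math

def wald_difference(theta1, theta2, var1, var2, cov12):
    """Wald statistic and chi-square(1) p-value for H0: theta1 = theta2."""
    w = var1 + var2 - 2.0 * cov12
    stat = (theta1 - theta2) ** 2 / w
    p_value = math.erfc(math.sqrt(stat / 2.0))  # P(chi2_1 > stat)
    return stat, p_value

def wald_ratio(theta1, theta2, var1, var2, cov12):
    """Delta-method variance of theta1 / theta2 and the Wald test of ratio = 1."""
    w_r = (var1 / theta2 ** 2
           + theta1 ** 2 * var2 / theta2 ** 4
           - 2.0 * theta1 * cov12 / theta2 ** 3)
    stat = (theta1 / theta2 - 1.0) ** 2 / w_r
    return stat, math.erfc(math.sqrt(stat / 2.0))

# Hypothetical inputs, not taken from the paper's data.
stat_d, p_d = wald_difference(2.0, 1.0, 0.25, 0.25, 0.0)
stat_r, p_r = wald_ratio(2.0, 1.0, 0.25, 0.25, 0.0)
print(stat_d, p_d, stat_r, p_r)
```

The two tests need not agree in finite samples, because the ratio version rescales the variance by powers of the second estimate.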

4 Numerical Studies

4.1 Simulations

In order to demonstrate the asymptotic performance of our estimators $\hat\theta(t,q\mid Z)$ (that it is consistent and asymptotically unbiased) and $\hat V(t)$ (that it is consistent, asymptotically unbiased, and yields confidence intervals with the correct coverage probabilities), we conducted a simulation study. Survival times were generated under the assumption of additive hazards with a baseline hazard of λ(t|Z) = t exp (βTZ) with β = (1, 2)T and Z = (Z1, Z2)T, where Z1 ∼ Bernoulli(0.5) and Z2 ∼ Uniform(0, 1). When generating censored data, censoring times followed an exponential distribution with rate parameter 0.195 (∼10% censoring) or 0.72 (∼30% censoring).

For the purposes of simulation, we considered median (q = 0.5) residual time at t = 0.25 and t = 0.75. Covariate values of z = (0, 0.5)T were chosen. We used 10,000 replications in our simulation. Calculated quantities included the bias, sample standard error (i.e. the standard error of the MRT estimates), mean standard error (i.e. the mean of the standard error estimates), and the coverage probability for nominal 95% confidence intervals.

Simulation results can be seen in Table 1. Bias appears to be negligible. Sample standard errors (SSEs) and mean standard errors (MSEs) were very close to each other, regardless of sample size. Coverage probabilities closely matched nominal confidence levels.
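The four summary quantities reported in the simulation tables can be computed from the per-replication estimates and standard errors as sketched below. The function name, the dictionary keys, and the toy inputs are hypothetical, and coverage is assessed with Wald-type intervals.

```python
import math

def summarize_simulation(estimates, std_errors, truth, z_crit=1.959964):
    """Bias, sample SE, mean estimated SE, and Wald-interval coverage across replications."""
    n_rep = len(estimates)
    mean_est = sum(estimates) / n_rep
    bias = mean_est - truth
    # Sample standard error: standard deviation of the estimates themselves.
    sse = math.sqrt(sum((e - mean_est) ** 2 for e in estimates) / (n_rep - 1))
    # Mean standard error: average of the estimated standard errors.
    mse = sum(std_errors) / n_rep
    covered = sum(1 for e, s in zip(estimates, std_errors)
                  if e - z_crit * s <= truth <= e + z_crit * s)
    return {"bias": bias, "sse": sse, "mse": mse, "cp": covered / n_rep}

# Toy replications (hypothetical numbers) with true value 1.0.
summary = summarize_simulation([0.9, 1.0, 1.1, 1.4], [0.1, 0.1, 0.1, 0.1], truth=1.0)
print(summary)
```

Agreement between the sample and mean standard errors indicates that the variance estimator tracks the actual sampling variability.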

Table 1.

Simulation results of bias, sample standard errors (SSEs), mean standard errors (MSEs) and coverage probabilities of nominal 95% confidence interval (95% CP) of our proposed median (q = 0.5) residual time estimator at t = 0.25 and t = 0.75 under additive hazards model.

| n | Cens. % | Bias (t = 0.25) | SSE (t = 0.25) | MSE (t = 0.25) | 95% CP (t = 0.25) | Bias (t = 0.75) | SSE (t = 0.75) | MSE (t = 0.75) | 95% CP (t = 0.75) |
|---|---|---|---|---|---|---|---|---|---|
| 200 | 0 | 0.000 | 0.060 | 0.060 | 0.94 | 0.000 | 0.065 | 0.063 | 0.94 |
| 200 | 10 | −0.000 | 0.063 | 0.062 | 0.94 | 0.001 | 0.071 | 0.069 | 0.94 |
| 200 | 30 | 0.002 | 0.073 | 0.071 | 0.93 | 0.003 | 0.089 | 0.090 | 0.95 |
| 500 | 0 | 0.000 | 0.039 | 0.038 | 0.95 | −0.000 | 0.041 | 0.041 | 0.95 |
| 500 | 10 | −0.000 | 0.040 | 0.040 | 0.95 | −0.000 | 0.044 | 0.044 | 0.95 |
| 500 | 30 | 0.001 | 0.046 | 0.045 | 0.94 | 0.001 | 0.057 | 0.056 | 0.94 |
| 1000 | 0 | −0.000 | 0.027 | 0.027 | 0.95 | 0.000 | 0.029 | 0.029 | 0.94 |
| 1000 | 10 | 0.000 | 0.028 | 0.028 | 0.95 | 0.000 | 0.032 | 0.031 | 0.95 |
| 1000 | 30 | −0.000 | 0.032 | 0.032 | 0.95 | 0.001 | 0.040 | 0.040 | 0.95 |

Cens. %: approximate percent of censoring in each simulation

To evaluate the performance of the estimator proposed in Section 3, we considered the difference $\{\hat\theta(t_1,q_1\mid z_1)-\hat\theta(t_2,q_2\mid z_2)\}$ where t1 = 0.25, q1 = 0.4, z1 = (0, 0.75)T, t2 = 0.75, q2 = 0.6, and z2 = (1, 0.25)T. We used 10,000 replications in our simulation. We calculated bias, sample standard error, mean standard error, and the coverage probability for 95% confidence intervals associated with this difference.

Results for the performance of $\hat W(t)$ can be seen in Table 2. While bias was negligible, mean standard errors (MSEs) were larger than sample standard errors (SSEs) for all combinations of simulation parameters. This resulted in slightly conservative confidence intervals, with coverage probabilities a bit larger than nominal levels.

Table 2.

Simulation results of bias, sample standard errors (SSEs), mean standard errors (MSEs) and coverage probabilities of nominal 95% confidence interval (95% CP) of our proposed residual time quantile estimator at t1 = 0.25, q1 = 0.4, z1 = (0, 0.75)T and t2 = 0.75, q2 = 0.6, z2 = (1, 0.75)T under additive hazards model.

| n | Cens. % | Bias | SSE | MSE | 95% CP |
|---|---|---|---|---|---|
| 200 | 0 | −0.002 | 0.077 | 0.080 | 0.960 |
| 200 | 10 | −0.002 | 0.082 | 0.085 | 0.958 |
| 200 | 30 | −0.002 | 0.094 | 0.098 | 0.959 |
| 500 | 0 | −0.000 | 0.049 | 0.051 | 0.959 |
| 500 | 10 | −0.001 | 0.052 | 0.054 | 0.958 |
| 500 | 30 | −0.001 | 0.060 | 0.062 | 0.956 |
| 1000 | 0 | −0.000 | 0.035 | 0.036 | 0.956 |
| 1000 | 10 | −0.001 | 0.037 | 0.038 | 0.954 |
| 1000 | 30 | −0.000 | 0.042 | 0.044 | 0.956 |

Cens. %: approximate percent of censoring in each simulation

4.2 Real data examples

We present two examples of analysis on existing data sets. For the first, we use data from a clinical trial in the treatment of carcinoma of the oropharynx, also known as the VA Lung Cancer Trial, as presented in Kalbfleisch and Prentice [11]. This data set includes 195 patients with squamous cell carcinoma of 3 sites in the oropharynx from the 6 largest of 16 total participating institutions. Patients were randomly assigned treatment with radiation therapy alone or radiation therapy with a chemotherapeutic agent. While many other characteristics were collected, the data set we examined included only: sex; treatment; tumor grade, site, T staging, and N staging; overall patient condition, date of entry, living status, and survival time. Of the 195 patients included, 142 were observed to fail.

We examined differences in MRTs to death across different T staging categories: I/II, III, or IV. Regression results can be seen in Table 3 and reflect what we see when plotting MRTs (Figure 1): higher T stage is associated with increased hazard of death and lower MRT. This association is not statistically significant, however, as Figure 2 shows. The confidence bands (and usually the intervals) contain the null value 0 for the differences between any pair of T stages. Nevertheless, our results highlight a benefit of examining MRTs: for patients with T staging of I or II, MRT changes markedly over time, increasing sharply after initially decreasing slightly. For example, patients with T stage I or II have an MRT of 1.73 years (95% CI 0.30–3.15 years) after having survived 0.75 years and an MRT of 3.29 years (95% CI 0–9.29 years) after having survived 1 year. This would be important and welcome news to surviving patients and their caregivers alike. Note that the confidence band is very wide because of the large variability at the boundary.

Table 3.

Estimated differences in residual time quantiles under additive hazards models: 1) differences in median residual times (MRTs) to death across different T staging categories in the VA lung cancer data; and 2) differences in the 10th percentile residual time quantiles of time to death or serious adverse event for Zidovudine-versus-Nevirapine, per-unit change in maternal CD4 counts, and per-unit change in maternal HIV-1 RNA viral loads in the HIVNET 012 data.

| Data Set | Variable | Estimate | 95% Confidence Interval | P value |
|---|---|---|---|---|
| Carcinoma | T stage I/II | 0.00 | | |
| | T stage III | 0.08 | −0.09 to 0.25 | 0.011* |
| | T stage IV | 0.35 | 0.12 to 0.59 | |
| HIVNET 012 | Nevirapine | 0.00 | | |
| | Zidovudine | −0.03 | −0.11 to 0.06 | 0.511 |
| | Maternal CD4† | −0.00 | −0.02 to 0.01 | 0.663 |
| | Maternal HIV-1 RNA (log10)‡ | 0.10 | 0.04 to 0.17 | 0.001 |

* From Wald test of complete versus null model.

† Per 100 cells/μL.

‡ Per unit increase of log10 HIV-1 RNA copies/mL.

Fig. 1.

Plots of stage-specific median residual times (MRTs) for the VA Lung Cancer Trial data with pointwise confidence intervals and confidence bands. The solid curves represent estimated MRT, the dashed curves represent the pointwise 95% confidence intervals, and the dotted curves represent the 95% confidence bands.

Fig. 2.

Plots of differences in median residual times (MRTs) between stages for the VA Lung Cancer Trial data with pointwise confidence intervals and confidence bands. The solid curves represent estimated MRT, the dashed curves represent the pointwise 95% confidence intervals, and the dotted curves represent the 95% confidence bands.

Our second example uses data collected as part of the HIVNET 012 randomized trial (Jackson et al. [8]). This trial randomly assigned Human Immunodeficiency Virus type-1 (HIV-1) infected pregnant women in Kampala, Uganda to either a nevirapine- or zidovudine-based treatment. Their infants were followed and tested at pre-determined intervals for HIV-1. Data were also collected on adverse events through 6–8 weeks postpartum for mothers and 18 months for babies. The study enrolled 645 mothers: 313 assigned to nevirapine, 313 to zidovudine, and 19 to placebo. Within 18 months, 109 serious adverse events and 34 deaths were observed in the nevirapine group while 97 serious adverse events and 42 deaths were observed in the zidovudine group.

For our example we examined the relationship between the 10th percentile of residual time to death or serious adverse event and treatment group (nevirapine versus zidovudine), maternal CD4 at pre-entry, and maternal HIV-1 RNA at baseline. We present results from the additive hazard model in Table 3. In the additive model, only maternal HIV-1 RNA was significantly associated with time to death or serious adverse event, with a coefficient of 0.10 (95% CI 0.04–0.17) for a unit increase of log10 HIV-1 RNA copies/mL. A plot of 10th percentile residual times for two different combinations of covariate values (an assumed infant whose mother was treated with AZT, had a maternal CD4 of 600 cells/μL, and a log10 maternal viral load of 3.5 copies/mL and another assumed infant whose mother was treated with NVP, had a maternal CD4 of 300 cells/μL, and a log10 maternal viral load of 5 copies/mL) can be seen in Figure 4. These covariate combinations were chosen to represent infants with relatively protective or relatively non-protective characteristics. For both groups, the 10th percentile residual times are fairly stable at early follow-up times (minus an immediate jump at the beginning of evaluation times). As expected, the 10th percentile residual times for the infants with protective characteristics were consistently higher than those for the infants with non-protective characteristics, though they were also somewhat more variable.

Fig. 4.

Plots of differences in the 10th percentile residual times between the two assumed infants for the HIVNET 012 data with pointwise confidence intervals. The solid curves represent estimated 10th percentile residual time, the dashed curves represent the pointwise 95% confidence intervals, and the dotted curves represent the 95% confidence bands.

When examining the difference in 10th percentile residual times between the two different combinations of covariate values we conclude that it is not significant across all time as 0 is well within the limits of the confidence bands (see Figure 3). This is in spite of the fact that, for the majority of fixed time points, the difference is statistically significant. The lack of overall significance is driven to a large degree by the large increase in variance towards later times.

Fig. 3.

Plots of the 10th percentile residual times for two assumed infants for the HIVNET 012 data. One infant is assumed to have the mother treated with Zidovudine (AZT), having a maternal CD4 of 600 cells/μL, and a log10 maternal viral load (VL) of 3.5 copies/mL. The other infant is assumed to have the mother treated with Nevirapine (NVP), having a maternal CD4 of 300 cells/μL, and a log10 maternal VL of 5 copies/mL. The solid curve represents estimated 10th percentile residual time, the dashed curves represent the pointwise 95% confidence intervals, and the dotted curves represent the 95% confidence bands.

5 Discussion

In this article, we presented a model-based estimation method of residual time quantiles for censored time-to-event outcomes. Besides residual time quantiles, the mean residual time might also be of interest. It is easy to obtain an explicit form for such an estimator based on $\hat\beta$ and $\hat\Lambda_0(t)$:

$$\hat E(T-t\mid T\ge t,Z)=\int_0^{\infty}\frac{\hat S(t+s\mid Z)}{\hat S(t\mid Z)}\,ds=\int_0^{\infty}\exp\left\{-\left[\hat\Lambda_0(t+s)-\hat\Lambda_0(t)+s\,\hat\beta^{T}Z\right]\right\}ds.$$
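To illustrate the integral, the sketch below evaluates it by the trapezoidal rule for a known cumulative hazard Λ(·|Z) = Λ0(·) + (·)βᵀZ, truncating the upper limit at a finite value. The function name, the truncation point, and the constant-hazard example are assumptions for illustration; with a constant total hazard of 0.5 the mean residual time should be 1/0.5 = 2 at any t.

```python
import math

def mean_residual_time(cum_haz, t, upper, steps=20000):
    """Trapezoidal approximation of int_0^upper exp{-[Lambda(t+s) - Lambda(t)]} ds.

    cum_haz: callable cumulative hazard Lambda(.|Z), assumed known in this sketch.
    upper: finite truncation point standing in for the infinite upper limit.
    """
    h = upper / steps
    base = cum_haz(t)
    vals = [math.exp(-(cum_haz(t + i * h) - base)) for i in range(steps + 1)]
    return h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))

# Hypothetical constant total hazard 0.5, so Lambda(s|Z) = 0.5 * s.
mrt = mean_residual_time(lambda s: 0.5 * s, t=1.0, upper=60.0)
print(mrt)
```

The need to choose `upper` reflects the practical difficulty with an estimated cumulative hazard: beyond the observed follow-up the integrand is not determined by the data.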

However, if the last observation is not an event, the estimator goes to infinity, which indicates that we need to specify a parametric distribution for the tail. Similarly, for our residual time quantile estimator, we are limited by the number (or proportion) of events that are actually observed. The estimator is only available when

$$\hat\Lambda_0\{t+\hat\theta(t,q\mid Z)\}+\hat\theta(t,q\mid Z)\,\hat\beta^{T}Z+\log q\le\hat\Lambda_0(\tau),$$

where τ is the largest observed failure time. Thus our estimator may not be calculable at later times, for smaller quantiles, for particularly low-risk patient characteristics, or some combination of the three. Earlier censoring can offset this somewhat, though at the cost of increased variability.

As discussed earlier, in clinical practice, knowing individuals’ covariate-specific residual time quantiles can help both clinicians and patients better understand how individual risk factors affect an event time such as cancer survival. The model-based estimation of residual time quantiles that we propose harnesses the structure of the additive hazards model to provide a simple approach to estimating residual time quantiles with easy-to-compute variances.

As with the additive hazard model-based estimator, our estimator would be greatly improved by being able to handle time-varying covariates and time-varying effects. While the additive hazards model is semiparametric and therefore allows a great deal of flexibility, possible model mis-specification can still lead to biased estimates of residual time quantiles. Therefore, sensitivity analyses and goodness-of-fit tests may be used to help detect such bias and to guide the choice of alternative models for more accurate and reliable model-based estimation.

As a limitation of the additive model, it is not recommended to calculate residual time quantiles for (combinations of) covariate values that are not observed in the data, since the estimated hazard function might not be positive for those covariate values. Furthermore, our method requires specification of a specific quantile and evaluation time as well as specific covariate values. While we have addressed the issue of comparisons at multiple time points by detailing how to compute confidence bands, the issue of comparisons for multiple covariate values or quantiles remains.

Acknowledgments

This work was partly supported by NIH grants R01 AI089341, R01 CA172415, R01 MH105857, R01 AI121259 and UM1 AI068617. The authors thank the Editor for the helpful comments that have improved the quality of this paper.

APPENDIX

We state conditions A–C used by Theorems 1 and 2 as follows. They are adapted from Chen et al. [1]:

  A. There exists a time t0 > 0 such that $\lim_{n\to\infty}n^{-1}\sum_{i=1}^{n}Y_i(t_0)>0$.

  B. There exists an integrable function v(t) such that, for any t ∈ [0, t0],
    $n^{-1}\sum_{i=1}^{n}Y_i(t)(Z_i-\bar Z)^{\otimes 2}\lambda_i(t\mid Z_i)-v(t)\to 0$,
    in probability.

  C. For any ε > 0,
    $n^{-1}\sum_{i=1}^{n}\int_0^{t_0}Y_i(s)\,\lambda(t\mid Z_i)\,\|Z_i-\bar Z\|^{2}\,I\{n^{-1}\|Z_i-\bar Z\|>\varepsilon\}\,ds\to 0$,
    in probability.

Proof of Theorem 1

That $\sqrt{n}\{\hat\theta(t,q\mid Z)-\theta(t,q\mid Z)\}$ converges weakly to a zero-mean Gaussian process follows directly from the established convergence of the individual estimators and the application of the continuous mapping theorem. Therefore we need only consider the asymptotic variance function in detail.

We begin with two equations: one based on the true values,

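$$\Lambda_0\{t+\theta(t,q\mid Z)\}+\theta(t,q\mid Z)\,\beta^{T}Z=\Lambda_0(t)-\log q,$$

and one based on the estimated values,

$$\hat\Lambda_0\{t+\hat\theta(t,q\mid Z)\}+\hat\theta(t,q\mid Z)\,\hat\beta^{T}Z=\hat\Lambda_0(t)-\log q.$$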

Taking the differences between the left- and right-hand sides of both equations yields

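$$\hat\Lambda_0\{\hat\theta(t,q\mid Z)+t\}-\Lambda_0\{\theta(t,q\mid Z)+t\}+\hat\theta(t,q\mid Z)\,\hat\beta^{T}Z-\theta(t,q\mid Z)\,\beta^{T}Z=\hat\Lambda_0(t)-\Lambda_0(t).$$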

Examining the left-hand side first, given the consistency results of θ^(t,q|Z), using Taylor’s expansion, we have

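$$
\begin{aligned}
\hat\Lambda_0\{\hat\theta(t,q\mid Z)+t\}-\Lambda_0\{\theta(t,q\mid Z)+t\}
={}&\hat\Lambda_0\{\hat\theta(t,q\mid Z)+t\}-\hat\Lambda_0\{\theta(t,q\mid Z)+t\}+\hat\Lambda_0\{\theta(t,q\mid Z)+t\}-\Lambda_0\{\theta(t,q\mid Z)+t\}\\
={}&\hat\lambda_0\{\theta(t,q\mid Z)+t\}\{\hat\theta(t,q\mid Z)-\theta(t,q\mid Z)\}+\hat\Lambda_0\{\theta(t,q\mid Z)+t\}-\Lambda_0\{\theta(t,q\mid Z)+t\}\\
&+o_p\big(\hat\theta(t,q\mid Z)-\theta(t,q\mid Z)\big),
\end{aligned}
$$

$$
\begin{aligned}
\hat\theta(t,q\mid Z)\,\hat\beta^{T}Z-\theta(t,q\mid Z)\,\beta^{T}Z
={}&\hat\theta(t,q\mid Z)\,\hat\beta^{T}Z-\hat\theta(t,q\mid Z)\,\beta^{T}Z+\hat\theta(t,q\mid Z)\,\beta^{T}Z-\theta(t,q\mid Z)\,\beta^{T}Z\\
={}&\hat\theta(t,q\mid Z)\,Z^{T}(\hat\beta-\beta)+\beta^{T}Z\{\hat\theta(t,q\mid Z)-\theta(t,q\mid Z)\}.
\end{aligned}
$$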

We therefore have the expression

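$$
\begin{aligned}
\hat\Lambda_0(t)-\Lambda_0(t)={}&\hat\lambda_0\{\theta(t,q\mid Z)+t\}\{\hat\theta(t,q\mid Z)-\theta(t,q\mid Z)\}+\hat\Lambda_0\{\theta(t,q\mid Z)+t\}-\Lambda_0\{\theta(t,q\mid Z)+t\}\\
&+\hat\theta(t,q\mid Z)\,Z^{T}(\hat\beta-\beta)+\beta^{T}Z\{\hat\theta(t,q\mid Z)-\theta(t,q\mid Z)\}+o_p\big(\hat\theta(t,q\mid Z)-\theta(t,q\mid Z)\big).
\end{aligned}
$$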

Rearranging yields the approximation

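$$
\begin{aligned}
\hat\theta(t,q\mid Z)-\theta(t,q\mid Z)={}&\frac{1}{\hat\lambda_0\{\theta(t,q\mid Z)+t\}+\beta^{T}Z}\Big(\hat\Lambda_0(t)-\Lambda_0(t)-\big[\hat\Lambda_0\{\theta(t,q\mid Z)+t\}-\Lambda_0\{\theta(t,q\mid Z)+t\}\big]-\hat\theta(t,q\mid Z)\,Z^{T}(\hat\beta-\beta)\Big)\\
&+o_p\big(\hat\theta(t,q\mid Z)-\theta(t,q\mid Z)\big).
\end{aligned}
$$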

From Lee and Hyun [13], we know that

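$$\hat\Lambda_0(t)-\Lambda_0(t)=-\left(\int_0^{t}\bar Z(u)\,du\right)^{T}(\hat\beta-\beta)+\int_0^{t}\frac{\sum_{i=1}^{n}dM_i(u)}{\sum_{i=1}^{n}Y_i(u)},$$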

so we can rewrite the overall approximation as

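$$
\begin{aligned}
\hat\theta(t,q\mid Z)-\theta(t,q\mid Z)={}&\frac{1}{\hat\lambda_0\{\theta(t,q\mid Z)+t\}+\beta^{T}Z}\left\{-\int_t^{\theta(t,q\mid Z)+t}\frac{\sum_{i=1}^{n}dM_i(u)}{\sum_{i=1}^{n}Y_i(u)}+\left[-\hat\theta(t,q\mid Z)\,Z+\int_t^{\theta(t,q\mid Z)+t}\bar Z(u)\,du\right]^{T}(\hat\beta-\beta)\right\}\\
&+o_p\big(\hat\theta(t,q\mid Z)-\theta(t,q\mid Z)\big).
\end{aligned}
$$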

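We can also perform a substitution, setting u = t + v and integrating with respect to v, so that the limits of integration become 0 and θ(t, q|Z). This yields the approximation

$$
\begin{aligned}
\hat\theta(t,q\mid Z)-\theta(t,q\mid Z)={}&\frac{1}{\hat\lambda_0\{\theta(t,q\mid Z)+t\}+\beta^{T}Z}\left\{-\int_0^{\theta(t,q\mid Z)}\frac{\sum_{i=1}^{n}dM_i(t+v)}{\sum_{i=1}^{n}Y_i(t+v)}+\left[-\hat\theta(t,q\mid Z)\,Z+\int_0^{\theta(t,q\mid Z)}\bar Z(t+v)\,dv\right]^{T}(\hat\beta-\beta)\right\}\\
&+o_p\big(\hat\theta(t,q\mid Z)-\theta(t,q\mid Z)\big).
\end{aligned}
$$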

Applying the martingale central limit theorem as in Lin and Ying [16] and Lee and Hyun [13], it follows that the variance function can be consistently estimated by

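$$\left[\hat\lambda_0\{\hat\theta(t,q\mid Z)+t\}+\hat\beta^{T}Z\right]^{-2}\times\left[C^{T}A^{-1}BA^{-1}C-2C^{T}A^{-1}D+\int_0^{\hat\theta(t,q\mid Z)}\frac{\sum_{i=1}^{n}dN_i(t+v)}{\{\sum_{i=1}^{n}Y_i(t+v)\}^{2}}\,dv\right].$$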

Proof of Theorem 2

That $\sqrt{n}\{(\hat\theta(t_1,q_1\mid Z_1)-\hat\theta(t_2,q_2\mid Z_2))-(\theta(t_1,q_1\mid Z_1)-\theta(t_2,q_2\mid Z_2))\}$ converges weakly to a zero-mean Gaussian process follows directly from the established convergence of the individual estimators and the application of the continuous mapping theorem. Therefore we need only consider the asymptotic variance function in detail.

From the proof of Theorem 1, we have

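$$\hat\theta(t_j,q_j\mid Z_j)-\theta(t_j,q_j\mid Z_j)=\frac{1}{\hat\lambda_0\{\theta(t_j,q_j\mid Z_j)+t_j\}+\beta^{T}Z_j}\times\left\{-\int_0^{\theta(t_j,q_j\mid Z_j)}\frac{\sum_{i=1}^{n}dM_i(t_j+v)}{\sum_{i=1}^{n}Y_i(t_j+v)}+C_j^{T}(\hat\beta-\beta)\right\}+o_p(1).$$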

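In order to calculate $\mathrm{Var}(\hat\theta(t_1,q_1\mid Z_1)-\hat\theta(t_2,q_2\mid Z_2))$, we note that

$$
\begin{aligned}
\mathrm{Var}\big(\hat\theta(t_1,q_1\mid Z_1)-\hat\theta(t_2,q_2\mid Z_2)\big)
={}&\mathrm{Var}\big[\{\hat\theta(t_1,q_1\mid Z_1)-\theta(t_1,q_1\mid Z_1)\}-\{\hat\theta(t_2,q_2\mid Z_2)-\theta(t_2,q_2\mid Z_2)\}\big]\\
={}&\mathrm{Var}\{\hat\theta(t_1,q_1\mid Z_1)-\theta(t_1,q_1\mid Z_1)\}+\mathrm{Var}\{\hat\theta(t_2,q_2\mid Z_2)-\theta(t_2,q_2\mid Z_2)\}\\
&-2\times\mathrm{Cov}\{\hat\theta(t_1,q_1\mid Z_1)-\theta(t_1,q_1\mid Z_1),\ \hat\theta(t_2,q_2\mid Z_2)-\theta(t_2,q_2\mid Z_2)\}.
\end{aligned}
$$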

Now

Cov{θ^(t1,q1|Z1)θ(t1,q1|Z1),θ^(t2,q2|Z2)θ(t2,q2|Z2)}=1[λ^0{θ(t1,q1|Z1)+t1}+βTZ1][λ^0{θ(t2,q2|Z2)+t2}+βTZ2][Cov{0θ(t1,q1|Z1)i=1ndMi(t1+v)i=1nYi(t1+v),0θ(t2,q2|Z2)i=1ndMi(t2+v)i=1nYi(t2+v)}Cov{0θ(t1,q1|Z1)i=1ndMi(t1+v)i=1nYi(t1+v),C2T(β^β)}Cov{0θ(t2,q2|Z2)i=1ndMi(t2+v)i=1nYi(t2+v),C1T(β^β)}+Cov{C1T(β^β),C2T(β^β)}]=1[λ^0{θ(t1,q1|Z1)+t1}+βTZ1][λ^0{θ(t2,q2|Z2)+t2}+βTZ2][Cov{i=1n0ηmintmaxdMi(tmax+v)i=1nYi(tmax+v),i=1n0ηmintmaxdMi(tmax+v)i=1nYi(tmax+v)}C2TA1D1C1TA1D2+C1TA1BA1C2]=1[λ^0{θ(t1,q1|Z1)+t1}+βTZ1][λ^0{θ(t2,q2|Z2)+t2}+βTZ2][i=1nCov{0ηmintmaxdMi(tmax+v)i=1nYi(tmax+v),0ηmintmaxdMi(tmax+v)i=1nYi(tmax+v)}+C2TA1D1C1TA1D2+C1TA1BA1C2]=1[λ^0{θ(t1,q1|Z1)+t1}+βTZ1][λ^0{θ(t2,q2|Z2)+t2}+βTZ2][Var{0ηmintmaxi=1ndMi(tmax+v)i=1nYi(tmax+v)}+C2TA1D1C1TA1D2+C1TA1BA1C2],

where $\eta_{\min}=\min\{\theta(t_1,q_1|Z_1)+t_1,\ \theta(t_2,q_2|Z_2)+t_2\}$ and $t_{\max}=\max\{t_1,t_2\}$.

So, combining the above with results from Theorem 1 and taking into account the consistency of the estimators used in this formulation, we can estimate the asymptotic variance with

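$$
\begin{aligned}
\hat W(t)={}&\frac{1}{\left[\hat\lambda_0\{\hat\theta(t_1,q_1|Z_1)+t_1\}+\hat\beta^{T}Z_1\right]^{2}}\left[C_1^{T}A^{-1}BA^{-1}C_1-2C_1^{T}A^{-1}D_1+\int_0^{\hat\theta(t_1,q_1|Z_1)}\frac{\sum_{i=1}^{n}dN_i(t_1+v)}{\left\{\sum_{i=1}^{n}Y_i(t_1+v)\right\}^{2}}\,dv\right]\\
&+\frac{1}{\left[\hat\lambda_0\{\hat\theta(t_2,q_2|Z_2)+t_2\}+\hat\beta^{T}Z_2\right]^{2}}\left[C_2^{T}A^{-1}BA^{-1}C_2-2C_2^{T}A^{-1}D_2+\int_0^{\hat\theta(t_2,q_2|Z_2)}\frac{\sum_{i=1}^{n}dN_i(t_2+v)}{\left\{\sum_{i=1}^{n}Y_i(t_2+v)\right\}^{2}}\,dv\right]\\
&-\frac{2}{\left[\hat\lambda_0\{\hat\theta(t_1,q_1|Z_1)+t_1\}+\hat\beta^{T}Z_1\right]\left[\hat\lambda_0\{\hat\theta(t_2,q_2|Z_2)+t_2\}+\hat\beta^{T}Z_2\right]}\\
&\quad\times\left[C_1^{T}A^{-1}BA^{-1}C_2-C_1^{T}A^{-1}D_2-C_2^{T}A^{-1}D_1+\int_0^{\hat\eta_{\min}-t_{\max}}\frac{\sum_{i=1}^{n}dN_i(t_{\max}+v)}{\left\{\sum_{i=1}^{n}Y_i(t_{\max}+v)\right\}^{2}}\,dv\right].
\end{aligned}
$$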
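The estimate above has the form Var₁ + Var₂ − 2 Cov₁₂. As a rough sketch of how the three pieces might be combined once they have been computed, the snippet below forms the variance of the difference and a normal-approximation interval. The interval construction, the function name `difference_ci`, and the assumption that all inputs are already on the scale of the estimators' variances (not multiplied by n) are illustrative choices, not taken from the paper.

```python
import math

def difference_ci(theta1, theta2, var1, var2, cov12, z=1.959963984540054):
    """Difference of two residual time quantile estimates with a Wald-type interval.

    var1, var2 : estimated variances of the two estimators (assumed scale: Var(theta_hat))
    cov12      : estimated covariance between the two estimators
    z          : standard normal critical value (default corresponds to 95%)
    """
    # Var(X - Y) = Var(X) + Var(Y) - 2 Cov(X, Y)
    var_diff = var1 + var2 - 2.0 * cov12
    if var_diff < 0:
        raise ValueError("estimated variance of the difference is negative")

    se = math.sqrt(var_diff)
    diff = theta1 - theta2
    return diff, (diff - z * se, diff + z * se)
```

When the two estimates come from independent arms, the covariance term may be zero; it matters when the two quantiles share the same fitted model, as in the derivation above.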

Contributor Information

Luis Alexander Crouch, Department of Biostatistics, University of Washington, Seattle, Washington 98105, U.S.A.

Cheng Zheng, Joseph J. Zilber School of Public Health, University of Wisconsin-Milwaukee, Milwaukee, Wisconsin, 53205, U.S.A.

Ying Qing Chen, Vaccine and Infectious Disease Division and Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109, U.S.A.

References

  1. Chen YQ, Jewell NP, Lei X, Cheng SC. Semiparametric estimation of proportional mean residual life model in presence of censoring. Biometrics. 2005;61:170–178. doi: 10.1111/j.0006-341X.2005.030224.x.
  2. Cheng KF. On almost sure representation for quantiles of the product limit estimator with applications. Sankhyā: The Indian Journal of Statistics, Series A. 1984;46:426–443.
  3. Cox DR. Regression models and life-tables. Journal of the Royal Statistical Society, Series B. 1972;34:187–220.
  4. Crouch LA, May S, Chen YQ. On estimation of covariate-specific residual time quantiles under the proportional hazards model. Lifetime Data Analysis. 2016;22:299–319. doi: 10.1007/s10985-015-9332-1.
  5. Dabrowska DM, Doksum KA. Estimates and confidence intervals for median and mean life in the proportional hazard model. Biometrika. 1987;74:799–807.
  6. Efron B, Tibshirani RJ. An introduction to the bootstrap. CRC Press LLC; 1998.
  7. Hall P, Lee SMS, Young GA. Importance of interpolation when constructing double bootstrap confidence intervals. Journal of the Royal Statistical Society: Series B. 2000;62:479–491.
  8. Jackson JB, Musoke P, Fleming TR, Guay LA, Bagenda D, Allen M, Nakabiito C, Sherman J, Bakaki P, Owor M, Ducar C, Deseyve M, Mwatha A, Emel L, Duefield C, Mirochnick M, Fowler MG, Mofenson L, Miotti P, Gigliotti M, Bray D, Mmiro F. Intrapartum and neonatal single-dose nevirapine compared with zidovudine for prevention of mother-to-child transmission of HIV-1 in Kampala, Uganda: 18-month follow-up of the HIVNET 012 randomised trial. The Lancet. 2003;362:859–868. doi: 10.1016/S0140-6736(03)14341-3.
  9. Jeong JH, Jung SH, Costantino JP. Nonparametric inference on median residual life function. Biometrics. 2008;64:157–163. doi: 10.1111/j.1541-0420.2007.00826.x.
  10. Jung SH, Jeong JH, Bandos H. Regression on quantile residual life. Biometrics. 2009;65:1203–1212. doi: 10.1111/j.1541-0420.2009.01196.x.
  11. Kalbfleisch JD, Prentice RL. The Statistical Analysis of Failure Time Data. John Wiley & Sons, Inc; Hoboken, New Jersey: 2002.
  12. Kaplan EL, Meier P. Nonparametric estimation from incomplete observations. Journal of the American Statistical Association. 1958;53:457–481.
  13. Lee J, Hyun S. Confidence bands for the difference of two survival functions under the additive risk model. Journal of Applied Statistics. 2011;38:785–797.
  14. Li J, Zhang C, Doksum KA, Nordheim EV. Simultaneous confidence intervals for semiparametric logistic regression and confidence regions for the multi-dimensional effective dose. Statistica Sinica. 2010;20:637–659.
  15. Lin C, Zhang L, Zhou Y. Conditional quantile residual lifetime models for right censored data. Lifetime Data Analysis. 2015;21:75–96. doi: 10.1007/s10985-013-9289-x.
  16. Lin DY, Ying Z. Semiparametric analysis of the additive risk model. Biometrika. 1994;81:61–71.
  17. Ma Y, Yin G. Semiparametric median residual life model and inference. Canadian Journal of Statistics. 2010;38:665–679.
  18. Parzen MI, Wei LJ, Ying Z. A resampling method based on pivotal estimating functions. Biometrika. 1994;81:341–350.
  19. Yang S. Censored median regression using weighted empirical survival and hazard functions. Journal of the American Statistical Association. 1999;94:137–145.
  20. Ying Z, Jung SH, Wei LJ. Survival analysis with median regression models. Journal of the American Statistical Association. 1995;90:178–184.
