Biometrika. 2012 Mar 20;99(2):393–404. doi: 10.1093/biomet/ass004

Nonparametric inference for assessing treatment efficacy in randomized clinical trials with a time-to-event outcome and all-or-none compliance

Robert M Elashoff 1, Gang Li 2, Ying Zhou 3
PMCID: PMC3635705  PMID: 23843664

Abstract

To evaluate the biological efficacy of a treatment in a randomized clinical trial, one needs to compare patients in the treatment arm who actually received treatment with the subgroup of patients in the control arm who would have received treatment had they been randomized into the treatment arm. In practice, subgroup membership in the control arm is usually unobservable. This paper develops a nonparametric inference procedure to compare subgroup probabilities with right-censored time-to-event data and unobservable subgroup membership in the control arm. We also present a procedure to estimate the onset and duration of treatment effect. The performance of our method is evaluated by simulation. An illustration is given using a randomized clinical trial for melanoma.

Keywords: Biological efficacy, Censoring, Counting process, Martingale, Noncompliance, Survival probability

1. Introduction

In randomized clinical trials, subjects often fail to comply with their assigned treatment regimen. The intent-to-treat analysis is a standard method for primary analysis of randomized trials that compares the treatment and control based on the initial randomization regardless of whether the subjects actually received their assigned treatment or not. However, in the presence of non-compliance, the intent-to-treat analysis estimates the programmatic effectiveness, not the biological efficacy of the treatment (Sommer & Zeger, 1991; Hollis & Campbell, 1999; Lachin, 2000).

Consider a randomized clinical trial with all-or-none compliance in the treatment arm, but no noncompliance in the control arm, where all-or-none compliance means that subjects either fully comply or fully do not comply with their assigned treatment regimen. To evaluate the treatment efficacy, one needs to compare the compliers in the treatment arm with the latent subgroup of patients in the control arm who would have received the treatment if they were offered the treatment. However, the membership of the latent compliance subgroup in the control arm is unobservable. For example, in the Eastern Cooperative Oncology Group trial E9288 (Kemeny et al., 2002), patients with liver metastases of colorectal cancer were randomized to receive surgical resection alone or surgical resection followed by chemotherapy. All patients received surgical resection, but only a portion of those randomized to the chemotherapy arm actually received the postoperative chemotherapy for reasons related to survival. To assess the biological efficacy of chemotherapy, one should compare patients in the chemotherapy arm who have actually received chemotherapy with those in the no-chemotherapy arm who would have received chemotherapy had they been assigned to receive postoperative chemotherapy. However, the latent chemotherapy-compliance status is not observable in the no-chemotherapy arm. Another example is Zelen’s (1979, 1990) single-consent design, in which subjects randomized to the control arm must receive the standard care, and those to the treatment arm can choose either the experimental treatment or the standard care.

This problem also arises from subgroup analysis when the subgroup status is identified by a diagnostic test that is performed in the treatment arm but not in the control arm. One example is the Multicenter Selective Lymphadenectomy Trial I (Morton et al., 2006). In this trial, newly diagnosed melanoma patients were randomized to a sentinel-node biopsy arm or a nodal observation arm. All patients underwent wide excision of the primary melanoma. In the biopsy arm, sentinel-node biopsy was performed on patients to identify the presence of sentinel-node metastases. An immediate complete lymphadenectomy was then performed for patients whose sentinel-node biopsies were positive for metastases. For all other patients in the trial, delayed complete lymphadenectomy was performed only when nodal recurrences became clinically detectable. An interesting question is whether the trial provides sufficient evidence that immediate complete lymphadenectomy improves survival relative to delayed complete lymphadenectomy in patients with sentinel-node metastases. The design of this trial does not facilitate a direct comparison since the sentinel-node status is unknown in the nodal observation arm.

There is a large literature on treatment noncompliance. For instance, Sommer & Zeger (1991) studied the problem with a dichotomous outcome and all-or-none compliance. Their approach has been extended to handle ordered categorical compliance (Goetghebeur & Molenberghs, 1996), all-or-none compliance with contamination (Cuzick et al., 1997) and clustered binary outcomes (Albert, 2002). Several methods are available for a time-to-event outcome with right-censored data. Robins & Tsiatis (1991) proposed a structural failure time model based on potential outcomes. Loeys & Goetghebeur (2003), Loeys et al. (2005) and Cuzick et al. (2007) proposed methods based on proportional hazards models. Frangakis & Rubin (1999) studied a nonparametric procedure to evaluate treatment efficacy in a randomized trial with missing outcomes and all-or-none noncompliance and discussed an extension to right-censored time-to-event data. Loeys & Goetghebeur (2003) proposed a nonparametric plug-in estimator of the survival function for compliers in the control arm, but its large-sample properties have not been studied.

The purpose of this paper is to develop a nonparametric inference method to assess treatment efficacy in randomized clinical trials with a right-censored time-to-event outcome and all-or-none compliance. We derive large sample properties of the estimated subgroup survival probabilities, which lead to an asymptotic inference procedure to compare a survival probability for compliers in the treatment arm with that for the latent compliers in the control arm. We also provide a procedure to determine the onset and duration of the treatment effect.

2. Methods

2.1. Notation and assumptions

Suppose that n2 subjects are randomized to the treatment arm and n1 subjects to the control arm. Let n = n1 + n2. In the treatment arm, one observes n2 independent and identically distributed triplets (X2i, δ2i, g2i) (i = 1, …, n2), where for the ith subject the observation time X2i = min(T2i, C2i) is the minimum of a nonnegative continuous survival time T2i and a censoring time C2i, δ2i = I(T2i ⩽ C2i) is a failure indicator, and g2i is a binary compliance indicator, 1 for compliance. In the control arm, one observes n1 independent and identically distributed pairs (X1i, δ1i) (i = 1, …, n1), where for subject i, X1i = min(T1i, C1i), δ1i = I(T1i ⩽ C1i) and the binary compliance indicator g1i is not observed. Table 1 defines the subgroup survival functions by study arms and compliance status.

Table 1.

Notation of survival functions by study arms and compliance status

                       Treatment                          Control
Compliers (gi = 1)     S2(t) = pr(T2i > t | g2i = 1)      S1(t) = pr(T1i > t | g1i = 1)
Noncompliers (gi = 0)  G2(t) = pr(T2i > t | g2i = 0)      G1(t) = pr(T1i > t | g1i = 0)
Overall                                                   H1(t) = pr(T1i > t)

Let Λ1(t), Λ2S(t) and Λ2G(t) denote the cumulative hazard functions of H1(t), S2(t) and G2(t), respectively. In addition, let D1(t) = pr(C1i > t), D2S(t) = pr(C2i > t | g2i = 1) and D2G(t) = pr(C2i > t | g2i = 0).

The following assumptions are made throughout the paper.

Assumption 1. The compliance proportion is independent of randomization, i.e., pr(g1i =1) = pr(g2i = 1) ≡ p.

Assumption 2. The survival function of noncompliers is independent of randomization, i.e., G1(t) = G2(t), for all t.

Assumption 3. In the control arm, the censoring time is independent of the survival time. In the treatment arm, the censoring time is independent of the survival time conditional on the compliance status.

2.2. The estimator

The overall survival function for the control arm is H1(t) = pr(g1i = 1)S1(t) + pr(g1i = 0)G1(t). This, together with Assumptions 1 and 2, implies that

S1(t) = {H1(t) − (1 − p)G2(t)}/p,

which leads to the plug-in estimator (Loeys & Goetghebeur, 2003)

Ŝ1(t) = {Ĥ1(t) − (1 − p̂)Ĝ2(t)}/p̂. (1)

Here Ĥ1(t) = ∏_{X1i:n1 ⩽ t} {1 − 1/(n1 − i + 1)}^{δ1i:n1} is the Kaplan & Meier (1958) estimator of H1(t), where 0 ⩽ X11:n1 ⩽ ⋯ ⩽ X1n1:n1 are the ordered observation times in the control arm and δ1i:n1 is the failure indicator for X1i:n1; Ĝ2(t) is the Kaplan–Meier estimator of G2(t) based on the noncompliers in the treatment arm; and p̂ = Σ_{i=1}^{n2} g2i / n2.

Let Ŝ2(t) be the Kaplan–Meier estimator of S2(t) based on the compliers (g2i = 1) in the treatment arm. We estimate S2(t) – S1(t), a measure of treatment efficacy, by Ŝ2(t) – Ŝ1(t).
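For concreteness, the plug-in estimator (1) can be computed directly from the observed data: a Kaplan–Meier estimate Ĥ1(t) from the control arm, a Kaplan–Meier estimate Ĝ2(t) from the treatment-arm noncompliers, and the observed compliance proportion p̂. The following Python sketch is our own illustration, not code from the paper; the function names and array conventions are assumptions.

```python
import numpy as np

def kaplan_meier(times, events, t):
    """Kaplan-Meier estimate of pr(T > t) from right-censored data."""
    order = np.argsort(times, kind="stable")
    times, events = times[order], events[order]
    n = len(times)
    surv = 1.0
    for i in range(n):
        if times[i] > t:
            break
        if events[i] == 1:
            # risk set at the i-th ordered observation has n - i subjects
            surv *= 1.0 - 1.0 / (n - i)
    return surv

def plug_in_S1(t, x1, d1, x2, d2, g2):
    """Plug-in estimator (1): S1_hat(t) = {H1_hat(t) - (1 - p_hat) G2_hat(t)} / p_hat."""
    p_hat = g2.mean()                                   # observed compliance proportion
    H1_hat = kaplan_meier(x1, d1, t)                    # overall control-arm survival
    G2_hat = kaplan_meier(x2[g2 == 0], d2[g2 == 0], t)  # noncompliers, treatment arm
    return (H1_hat - (1.0 - p_hat) * G2_hat) / p_hat
```

Ŝ2(t) is then obtained by applying `kaplan_meier` to the treatment-arm compliers, `x2[g2 == 1]` and `d2[g2 == 1]`.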

Remark 1. Frangakis & Rubin (1999, § 5) derived a different estimator of S1(t) that is consistent for S1(t) under independent censoring conditional on compliance status in both arms. Because compliance status is unobserved in the control arm, the conditional independent censoring assumption in the control arm is untestable. Furthermore, the plug-in estimator defined in (1) may not be consistent under the assumptions of Frangakis & Rubin (1999) since the conditional independent censoring assumption in the control arm renders the Kaplan–Meier estimator Ĥ1(t) generally inconsistent for H1(t). We make different assumptions in Assumption 3 that the censoring time is independent of the survival time conditional on the compliance status in the treatment arm, but unconditionally independent in the control arm. Thus, under Assumption 3, Ĥ1(t) is consistent for H1(t) and Ĝ2(t) is consistent for G2(t). Consequently, the plug-in estimator Ŝ1(t) is consistent for S1(t).

2.3. Asymptotic properties

The counting process and martingale theory are used to study the asymptotic properties of Ŝ1(t) and Ŝ2(t) – Ŝ1(t). For the control arm, define the at-risk processes

Y1i(t) = I(X1i ⩾ t), Y1(t) = Σ_{i=1}^{n1} Y1i(t),

and counting processes

N1i(t) = I(X1i ⩽ t, δ1i = 1), N1(t) = Σ_{i=1}^{n1} N1i(t).

Then M1i(t) = N1i(t) − ∫_0^t Y1i(u) dΛ1(u) (i = 1, …, n1) are orthogonal locally integrable martingales with respect to the filtration 𝒡1(t) = σ[{N1i(u), Y1i(u)}, i = 1, …, n1 : 0 ⩽ u ⩽ t], the failure and censoring histories up to time t.

Similarly, let Y2i(t) and N2i(t) denote the corresponding at-risk process and counting process for subject i in the treatment arm. For the compliance subgroup in the treatment arm, define Y2S(t) = Σ_{i=1}^{n2} g2i Y2i(t), N2S(t) = Σ_{i=1}^{n2} g2i N2i(t) and M2S(t) = Σ_{i=1}^{n2} g2i M2iS(t), where M2iS(t) = N2i(t) − ∫_0^t Y2i(u) dΛ2S(u). For the noncompliance subgroup in the treatment arm, define Y2G(t) = Σ_{i=1}^{n2} (1 − g2i)Y2i(t), N2G(t) = Σ_{i=1}^{n2} (1 − g2i)N2i(t) and M2G(t) = Σ_{i=1}^{n2} (1 − g2i)M2iG(t), where M2iG(t) = N2i(t) − ∫_0^t Y2i(u) dΛ2G(u).

Let

Λ̂1(t) = ∫_0^t dN1(u)/Y1(u), Λ̂2S(t) = ∫_0^t dN2S(u)/Y2S(u), Λ̂2G(t) = ∫_0^t dN2G(u)/Y2G(u)

be the Nelson–Aalen estimators of Λ1(t), Λ2S(t) and Λ2G(t).

The following lemma is needed to derive the joint distribution of Ĥ1(t), Ŝ2(t), Ĝ2(t) and p̂.

Lemma 1. Assume that Λ1(τ) < ∞, Λ2S(τ) < ∞, Λ2G(τ) < ∞, D1(τ) > 0, D2S(τ) > 0 and D2G(τ) > 0, where τ is usually the time when data collection ends. Assume n1/n → ρ1 and n2/n → ρ2 for 0 < ρ1, ρ2 < 1 as n → ∞. Then, for any t in (0, τ],

\[
n^{1/2}\begin{pmatrix}\hat H_1(t)-H_1(t)\\ \hat S_2(t)-S_2(t)\\ \hat G_2(t)-G_2(t)\\ \hat p-p\end{pmatrix}\to N\{0,\Sigma(t)\}
\]

in distribution as n → ∞, where the variance–covariance matrix is

\[
\Sigma(t)=\begin{pmatrix}
H_1^2(t)\rho_1^{-1}\sigma_{11}(t) & 0 & 0 & 0\\
0 & S_2^2(t)\rho_2^{-1}\sigma_{22}(t) & 0 & -S_2(t)\rho_2^{-1}\sigma_{24}(t)\\
0 & 0 & G_2^2(t)\rho_2^{-1}\sigma_{33}(t) & -G_2(t)\rho_2^{-1}\sigma_{34}(t)\\
0 & -S_2(t)\rho_2^{-1}\sigma_{24}(t) & -G_2(t)\rho_2^{-1}\sigma_{34}(t) & \rho_2^{-1}p(1-p)
\end{pmatrix},
\]

with

\[
\begin{aligned}
\sigma_{11}(t)&=E\biggl[\biggl\{\int_0^t\frac{dM_{1i}(u)}{H_1(u)D_1(u)}\biggr\}^2\biggr],\qquad
\sigma_{22}(t)=E\biggl[\biggl\{\int_0^t\frac{g_{2i}\,dM_{2i}^S(u)}{pS_2(u)D_2^S(u)}\biggr\}^2\biggr],\\
\sigma_{33}(t)&=E\biggl[\biggl\{\int_0^t\frac{(1-g_{2i})\,dM_{2i}^G(u)}{(1-p)G_2(u)D_2^G(u)}\biggr\}^2\biggr],\\
\sigma_{24}(t)&=E\biggl\{(g_{2i}-p)\int_0^t\frac{g_{2i}\,dM_{2i}^S(u)}{pS_2(u)D_2^S(u)}\biggr\},\qquad
\sigma_{34}(t)=E\biggl\{(g_{2i}-p)\int_0^t\frac{(1-g_{2i})\,dM_{2i}^G(u)}{(1-p)G_2(u)D_2^G(u)}\biggr\}.
\end{aligned}
\]

Furthermore, σ11(t), σ22(t), σ33(t), σ24(t) and σ34(t) can be consistently estimated by

\[
\begin{aligned}
\hat\sigma_{11}(t)&=n_1\sum_{i=1}^{n_1}\biggl\{\int_0^t\frac{d\hat M_{1i}(u)}{Y_1(u)}\biggr\}^2,\qquad
\hat\sigma_{22}(t)=n_2\sum_{i=1}^{n_2}\biggl\{\int_0^t\frac{g_{2i}\,d\hat M_{2i}^S(u)}{Y_2^S(u)}\biggr\}^2,\\
\hat\sigma_{33}(t)&=n_2\sum_{i=1}^{n_2}\biggl\{\int_0^t\frac{(1-g_{2i})\,d\hat M_{2i}^G(u)}{Y_2^G(u)}\biggr\}^2,\\
\hat\sigma_{24}(t)&=\sum_{i=1}^{n_2}(g_{2i}-\hat p)\int_0^t\frac{g_{2i}\,d\hat M_{2i}^S(u)}{Y_2^S(u)},\qquad
\hat\sigma_{34}(t)=\sum_{i=1}^{n_2}(g_{2i}-\hat p)\int_0^t\frac{(1-g_{2i})\,d\hat M_{2i}^G(u)}{Y_2^G(u)},
\end{aligned}
\]

where M̂1i(t) = N1i(t) − ∫_0^t Y1i(u) dΛ̂1(u), M̂2iS(t) = N2i(t) − ∫_0^t Y2i(u) dΛ̂2S(u) and M̂2iG(t) = N2i(t) − ∫_0^t Y2i(u) dΛ̂2G(u).
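To make the estimator σ̂11(t) concrete: for subject i, the integral ∫_0^t dM̂1i(u)/Y1(u) has a jump δ1i/Y1(X1i) if the subject fails by time t, minus the accumulated compensator increments dN1(u)/Y1(u)² over event times u ⩽ min(t, X1i). The following Python sketch is our own unoptimized illustration of this computation, not code from the paper.

```python
import numpy as np

def sigma11_hat(x1, d1, t):
    """Evaluate sigma11_hat(t) = n1 * sum_i { int_0^t dM_hat_1i(u) / Y_1(u) }^2
    from control-arm observation times x1 and failure indicators d1."""
    n1 = len(x1)
    Y1 = lambda u: np.sum(x1 >= u)                      # at-risk count at time u
    event_times = np.unique(x1[(d1 == 1) & (x1 <= t)])  # distinct event times up to t
    integrals = np.zeros(n1)
    for i in range(n1):
        # jump of N_1i: contributes 1 / Y_1(X_1i) if subject i fails by t
        if d1[i] == 1 and x1[i] <= t:
            integrals[i] += 1.0 / Y1(x1[i])
        # compensator: subtract dN_1(u) / Y_1(u)^2 over event times u <= min(t, X_1i)
        for u in event_times[event_times <= x1[i]]:
            dN = np.sum((x1 == u) & (d1 == 1))
            integrals[i] -= dN / Y1(u) ** 2
    return n1 * np.sum(integrals ** 2)
```

The O(n²) double loop is kept for transparency; a vectorized version would precompute the cumulative sums of dN1(u)/Y1(u)² once.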

The joint limiting distribution of Ŝ1(t) and Ŝ2(t) is stated below.

Theorem 1. Assume that the assumptions of Lemma 1 hold. At a given time-point t in (0, τ],

\[
n^{1/2}\begin{pmatrix}\hat S_1(t)-S_1(t)\\ \hat S_2(t)-S_2(t)\end{pmatrix}\to
N\biggl\{\begin{pmatrix}0\\0\end{pmatrix},\begin{pmatrix}\nu_{11}(t)&\nu_{12}(t)\\ \nu_{12}(t)&\nu_{22}(t)\end{pmatrix}\biggr\}
\]

in distribution, as n → ∞, n1/n → ρ1 and n2/n → ρ2, where

\[
\begin{aligned}
\nu_{11}(t)&=\frac{H_1^2(t)}{p^2\rho_1}\sigma_{11}(t)+\frac{(1-p)^2G_2^2(t)}{p^2\rho_2}\sigma_{33}(t)+\frac{\{H_1(t)-G_2(t)\}^2}{p^4\rho_2}\,p(1-p)\\
&\quad-\frac{2(1-p)G_2(t)\{H_1(t)-G_2(t)\}}{p^3\rho_2}\sigma_{34}(t), \qquad (2)\\
\nu_{22}(t)&=\frac{S_2^2(t)}{\rho_2}\sigma_{22}(t), \qquad (3)\\
\nu_{12}(t)&=\frac{S_2(t)\{H_1(t)-G_2(t)\}}{p^2\rho_2}\sigma_{24}(t). \qquad (4)
\end{aligned}
\]

The proofs of Lemma 1 and Theorem 1 are provided in the Appendix.

2.4. Pointwise confidence intervals for survival probabilities

Let ν̂11(t), ν̂22(t) and ν̂12(t) be the consistent estimates of ν11(t), ν22(t) and ν12(t) obtained by replacing the theoretical quantities in (2)–(4) with their consistent estimators. It follows from Theorem 1 that the 100(1 – α)% confidence intervals for S1(t) and S2(t) at a given t ∈ [0, τ] are given by

Ŝ1(t) ± z_{1−α/2} {ν̂11(t)/n}^{1/2},    Ŝ2(t) ± z_{1−α/2} {ν̂22(t)/n}^{1/2},

respectively, where z1–α/2 is the upper α/2 quantile of the standard normal distribution.

Furthermore, a 100(1 – α)% confidence interval for S2(t) – S1(t) is given by

{Ŝ2(t) − Ŝ1(t)} ± z_{1−α/2} n^{−1/2} σ̂(t),

where σ̂²(t) = ν̂11(t) + ν̂22(t) − 2ν̂12(t).
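Given the estimated variance components, the interval for the difference is a routine normal-approximation computation. A small Python sketch, assuming ν̂11(t), ν̂22(t) and ν̂12(t) have already been computed (the function name and signature are ours):

```python
from statistics import NormalDist

def diff_ci(S2_hat, S1_hat, v11, v22, v12, n, alpha=0.05):
    """100(1 - alpha)% confidence interval for S2(t) - S1(t), given the
    estimated variance components nu11_hat, nu22_hat, nu12_hat of Theorem 1
    (assumed precomputed) and the total sample size n."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)   # z_{1 - alpha/2}
    sigma2 = v11 + v22 - 2.0 * v12                # sigma_hat^2(t)
    half = z * (sigma2 / n) ** 0.5
    diff = S2_hat - S1_hat
    return diff - half, diff + half
```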

2.5. Onset and duration of treatment effect

In practice, one is often interested in the estimation of the time interval in which S2(t) exceeds S1(t). Below we provide a procedure to determine the onset and duration of the time interval with a confidence level 1 – α. The procedure is based on ideas from Berger & Boos (1999).

Step 1. Construct a one-sided, α/2-level test of H0t : S2(t) − S1(t) = 0 versus Hat : S2(t) − S1(t) > 0 for a given t ∈ [0, τ]. Define Zt = {Ŝ2(t) − Ŝ1(t)}/{n^{−1/2} σ̂(t)}. We reject the null hypothesis H0t if Zt > z_{1−α/2}.

Step 2. Choose a starting value ts ∈ [0, τ]. If Zts fails to reject H0ts, no confidence statement is made. If Zts rejects H0ts, test sequentially downward and upward from ts. Let L be the last t ⩽ ts for which H0t is rejected when testing downward, and U the last t ⩾ ts for which H0t is rejected when testing upward. Then [L, U] is the largest interval containing ts such that Zt rejects H0t for all t ∈ [L, U].

Theorem 2. Let [L, U] be defined by the above algorithm. Then,

pr{S2(t) − S1(t) > 0 for all t ∈ [L, U]} ⩾ 1 − α

as n → ∞, n1/n → ρ1 and n2/n → ρ2 for some constants 0 < ρ1, ρ2 < 1.

As remarked by Berger & Boos (1999), the starting value ts should be chosen before the study. Any starting point satisfying LtsU will lead to the same interval [L, U]. The choice of ts requires some prior information about the time frame during which S2(t) is likely to be higher than S1(t). If prior knowledge is unavailable, one can repeat the procedure at k different starting points and use level α/(2k) for each one-sided test to achieve the overall confidence level 1 – α.
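On a finite grid of time-points, Steps 1 and 2 amount to a threshold scan outward from the pre-chosen start ts. The following Python sketch is our own illustration of the algorithm; `tgrid`, `diff` = Ŝ2 − Ŝ1 and `se` = n^{−1/2} σ̂ are assumed precomputed inputs.

```python
import numpy as np
from statistics import NormalDist

def onset_duration(tgrid, diff, se, t_s, alpha=0.05):
    """Largest grid interval [L, U] containing the pre-chosen start t_s on
    which Z_t = diff/se exceeds z_{1-alpha/2} at every grid point.
    Returns None (no confidence statement) if H0 is not rejected at t_s."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    reject = np.asarray(diff) / np.asarray(se) > z
    s = int(np.searchsorted(tgrid, t_s))   # grid index of the starting point
    if not reject[s]:
        return None
    lo = s
    while lo > 0 and reject[lo - 1]:       # test sequentially downward
        lo -= 1
    hi = s
    while hi < len(tgrid) - 1 and reject[hi + 1]:   # and upward
        hi += 1
    return float(tgrid[lo]), float(tgrid[hi])
```

Repeating the scan at k prespecified starting points with level α/(2k) per test gives the Bonferroni adjustment described above.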

3. Simulation studies

We carried out some simulations to investigate the finite sample performance of our method for different combinations of sample size, censoring rate and compliance proportion.

In the first simulation, the survival times were generated from Weibull distributions S1(t) = exp(−t0.8), S2(t) = exp(−t0.75/2) and G1(t) = G2(t) = exp(−t0.7/4). The censoring times were generated from an exponential distribution with hazard rate λ chosen to give a prespecified overall censoring rate. For each simulation, 1000 Monte Carlo samples were generated. Table 2 reports the empirical mean of the plug-in estimate Ŝ1(t), the empirical standard error of Ŝ1(t), the empirical mean of estimated standard error of Ŝ1(t) and the achieved coverage probability of the 95% confidence interval for S1(t). For the moderate sample size n1 = n2 = 100, the plug-in estimator Ŝ1(t) and its standard error have very small biases and the confidence interval has very small coverage probability error in almost all cases. The only exception is that when the censoring rate is 70%, the coverage rate, 90.9%, is notably lower than the nominal level 95% in the right tail, S(t) = 0.25. Hence, one should interpret the analysis results for the right tail with caution in the presence of heavy censoring. Furthermore, as the sample sizes increase to n1 = n2 = 500, the performance of our method is satisfactory in all cases considered.
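The first simulation design can be reproduced by inverse-transform sampling from the stated Weibull survival functions. A Python sketch generating one Monte Carlo sample follows; the censoring rate λ = 0.3 below is an assumed illustrative value, whereas the paper tunes λ to hit a prespecified overall censoring rate.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_trial(n1, n2, p=0.7, lam=0.3):
    """One Monte Carlo sample under the first simulation design:
    S1(t) = exp(-t^0.8), S2(t) = exp(-t^0.75 / 2), G(t) = exp(-t^0.7 / 4),
    with exponential censoring at an assumed rate lam."""
    def rweib(a, b, size):
        # inverse transform: if S(t) = exp(-t^a / b), then T = (-b log U)^(1/a)
        return (-b * np.log(rng.uniform(size=size))) ** (1.0 / a)
    g1 = rng.uniform(size=n1) < p   # latent compliance, unobserved in practice
    g2 = rng.uniform(size=n2) < p
    t1 = np.where(g1, rweib(0.8, 1.0, n1), rweib(0.7, 4.0, n1))
    t2 = np.where(g2, rweib(0.75, 2.0, n2), rweib(0.7, 4.0, n2))
    c1 = rng.exponential(1.0 / lam, size=n1)
    c2 = rng.exponential(1.0 / lam, size=n2)
    x1, d1 = np.minimum(t1, c1), (t1 <= c1).astype(int)
    x2, d2 = np.minimum(t2, c2), (t2 <= c2).astype(int)
    return x1, d1, x2, d2, g2.astype(int)
```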

Table 2.

Performance of Ŝ1(t), estimated standard error and confidence interval

                     n1 = n2 = 100              n1 = n2 = 500
p    CR   S1(t)    Ŝ1(t)  SE    SE^   ACP     Ŝ1(t)  SE    SE^   ACP
0.7  0.3  0.50     0.50   0.08  0.08  94.2    0.50   0.04  0.04  94.8
          0.25     0.25   0.09  0.09  94.7    0.25   0.04  0.04  94.9
     0.5  0.50     0.50   0.08  0.08  94.8    0.50   0.04  0.04  95.2
          0.25     0.25   0.10  0.10  94.6    0.25   0.04  0.04  95.6
     0.7  0.50     0.50   0.10  0.09  94.5    0.50   0.04  0.04  94.9
          0.25     0.24   0.15  0.13  90.9    0.25   0.06  0.06  95.1
0.5  0.3  0.50     0.50   0.12  0.11  94.5    0.50   0.05  0.05  96.0
          0.25     0.24   0.13  0.13  94.6    0.25   0.06  0.06  95.4
     0.5  0.50     0.50   0.12  0.12  94.1    0.50   0.05  0.05  95.9
          0.25     0.24   0.15  0.14  94.6    0.25   0.06  0.06  95.3
     0.7  0.50     0.50   0.14  0.13  94.3    0.50   0.06  0.06  95.4
          0.25     0.24   0.19  0.18  94.6    0.25   0.08  0.08  94.9
0.3  0.3  0.50     0.49   0.20  0.19  95.7    0.50   0.08  0.08  95.9
          0.25     0.24   0.23  0.23  96.1    0.25   0.10  0.10  95.8
     0.5  0.50     0.49   0.21  0.20  95.6    0.50   0.08  0.09  95.4
          0.25     0.23   0.25  0.24  95.3    0.25   0.10  0.11  95.4
     0.7  0.50     0.49   0.22  0.21  95.9    0.50   0.09  0.09  96.0
          0.25     0.23   0.30  0.29  95.7    0.25   0.13  0.13  96.2

Ŝ1(t), the empirical mean of estimated survival probability; SE, the empirical standard error of Ŝ1(t); SE^, the empirical mean of estimated standard error of Ŝ1(t); ACP, the achieved coverage probability of the 95% confidence interval for S1(t); p, the compliance proportion; CR, the censoring rate.

A similar simulation was conducted under conditions that emulate the Multicenter Selective Lymphadenectomy Trial I, in which the overall censoring rate is 87%, the observed proportion of node-positive patients is 15% and the sample sizes are n1 = 500 and n2 = 764. In this simulation, the survival times were generated from Weibull distributions S1(t) = exp(−t0.87/48.3), S2(t) = exp(−t0.87/89.8) and G1(t) = G2(t) = exp(−t0.79/117.2). The results are summarized in Table 3. It is seen that similar to the previous simulation, the plug-in estimate Ŝ1(t) and its standard error estimate show small bias and the confidence interval has small coverage probability error except in the right tail.

Table 3.

Simulation designed to emulate the Multicenter Selective Lymphadenectomy Trial I

n1    n2    p     CR    S1(t)   Ŝ1(t)  SE    SE^   ACP
500   764   0.15  0.87  0.50    0.51   0.25  0.25  95.30
                        0.25    0.22   0.63  0.55  91.70

p, the compliance proportion; CR, the censoring rate; Ŝ1(t), the empirical mean of estimated survival probability; SE, the empirical standard error of Ŝ1(t); SE^, the empirical mean of estimated standard error of Ŝ1(t); ACP, the achieved coverage probability of the 95% confidence interval for S1(t).

We conducted more simulations under different scenarios that include different survival distributions and a null case where S1 = S2. The performance of Ŝ2(t) – Ŝ1(t) was also investigated. The results were similar to those for Ŝ1(t) and thus omitted. We also carried out a small sensitivity analysis of Assumption 2 and did not observe serious estimation bias and coverage error under moderate censoring when Assumption 2 is slightly violated.

4. An example

We applied our method to analyse the Multicenter Selective Lymphadenectomy Trial I described in § 1. Between January 1994 and March 2002, 769 patients were randomized to the sentinel-node biopsy arm and 500 patients to the nodal observation arm. Our analysis excludes five patients in the biopsy arm whose sentinel-node status was unavailable. Immediate complete lymphadenectomy was performed on 122 of 764 patients in the biopsy arm whose biopsies were positive for sentinel-node metastases. For other patients, delayed complete lymphadenectomy was performed upon clinically observable nodal relapse.

To apply our method, it is important that Assumption 2 in § 2.1 is reasonable for this trial. Because removing the sentinel node among the node-negative patients is not expected to change their survival experience, it is reasonable to assume that G1(t) = G2(t), where G1(t) and G2(t) represent the survival probabilities of node-negative patients in the observation arm and the biopsy arm, respectively.

We applied our method to estimate the benefit of immediate versus delayed complete lymphadenectomy on patients with sentinel-node metastases with respect to melanoma-specific survival, i.e., survival until death due to melanoma. Let S1(t) and S2(t) be the survival probabilities for patients with sentinel-node metastases in the observation arm and the biopsy arm, respectively. Table 4 reports the estimated survival probabilities Ŝ1(t) and Ŝ2(t), the estimated survival difference Ŝ2(t) – Ŝ1(t) and their estimated standard errors at some given time-points. It is seen from Table 4 that the 95% confidence interval for S2(t) – S1(t) is (0.02, 0.42) at 2.5 years, which implies that the 2.5-year melanoma-specific survival probability for patients with sentinel-node metastases in the biopsy arm is significantly higher than that of the observation arm at 0.05 significance level. Furthermore, using the procedure in § 2.5, we found that immediate complete lymphadenectomy significantly improves melanoma-specific survival relative to delayed complete lymphadenectomy for patients with sentinel-node metastases over the time interval [2.05, 2.97] years at 95% confidence level.

Table 4.

Melanoma-specific survival probabilities, and standard errors in parentheses, for patients with sentinel-node metastases

                                                         t (years)
                                               1            2            2.5          3
Delayed complete lymphadenectomy, Ŝ1(t)        0.90 (0.04)  0.75 (0.08)  0.65 (0.09)  0.64 (0.10)
Immediate complete lymphadenectomy, Ŝ2(t)      0.97 (0.02)  0.88 (0.03)  0.86 (0.03)  0.83 (0.04)
Difference, Ŝ2(t) – Ŝ1(t)                      0.06 (0.04)  0.13 (0.08)  0.22 (0.10)  0.20 (0.11)

5. Discussion

Most existing methods for assessing treatment efficacy compare either mean survival times (Robins & Tsiatis, 1991) or hazard rates (Loeys & Goetghebeur, 2003; Loeys et al., 2005; Cuzick et al., 2007) between study arms using parametric or semiparametric models. The difference in subgroup survival probabilities provides a useful alternative measure of treatment efficacy. Our method is fully nonparametric. If the proportional hazards assumption between S1(t) and S2(t) holds, our nonparametric method may not be as efficient as a proportional hazards model-based method. On the other hand, our method is more robust when the hazards of S1(t) and S2(t) are not proportional.

As illustrated in our simulations, caution is needed when drawing inference in the right tail of a survival distribution for moderate sample size, especially under heavy censoring and extremely unbalanced compliance proportions.

The focus of this paper is on comparison of subgroup survival probabilities at a fixed time-point based on the plug-in estimator Ŝ1(t). As pointed out by Loeys & Goetghebeur (2003), Ŝ1(t) is not a proper estimate of the entire survival curve because it is not monotonically non-increasing. One possible solution is to apply isotonic regression to the plug-in estimator to obtain a proper survival function. Properties of the resulting estimator, however, are difficult to study. An alternative approach is to consider nonparametric maximum likelihood estimation of S1.

Acknowledgments

The authors are grateful to the editor, an associate editor and two referees for their insightful and constructive comments. We also thank Dr Donald L. Morton for providing the melanoma data used in the example. Gang Li’s research was supported by a National Institutes of Health grant.

Appendix. Proofs

To prove the main results, we need the following lemma.

Lemma A1. Under the assumptions of Lemma 1, we have sup_{t∈[0,τ]} |Y2S(t)/n2 − y2S(t)| → 0, sup_{t∈[0,τ]} |Y2G(t)/n2 − y2G(t)| → 0 and sup_{t∈[0,τ]} |Y1(t)/n1 − y1(t)| → 0, in probability, where y2S(t) = pS2(t)D2S(t), y2G(t) = (1 − p)G2(t)D2G(t) and y1(t) = H1(t−)D1(t−).

Proof. For each i = 1, …, n2, Y2i(t) = I(X2i ⩾ t) is a left-continuous nonincreasing random function. We assume that S2(t), D2S(t) and D2G(t) are continuous functions. At a given time-point t ∈ [0, τ],

E{g2iY2i(t)}=pr(g2i=1)pr{Y2i(t)=1|g2i=1}=pS2(t)D2S(t).

By the weak law of large numbers, at each given time-point t,

\[
\biggl|\frac{1}{n_2}\sum_{i=1}^{n_2}g_{2i}Y_{2i}(t)-pS_2(t)D_2^S(t)\biggr|\to 0
\]

in probability, as n2 → ∞. By the dominated convergence theorem,

\[
E\{g_{2i}Y_{2i}(t+)\}=E\Bigl\{\lim_{k\to\infty}g_{2i}Y_{2i}(t+1/k)\Bigr\}=\lim_{k\to\infty}E\{g_{2i}Y_{2i}(t+1/k)\}=pS_2(t)D_2^S(t).
\]

Thus, as n2 → ∞, |{Σ_{i=1}^{n2} g2i Y2i(t+)}/n2 − pS2(t)D2S(t)| → 0 in probability. Using arguments similar to those in the proof of Chung (2001, Theorem 5.5.1), it can be shown that sup_{t∈[0,τ]} |Y2S(t)/n2 − y2S(t)| → 0 in probability, as n2 → ∞, where y2S(t) = pS2(t)D2S(t).

The other results can be proved along the same lines.

Proof of Lemma 1. It is well known that Λ̂1(t), Λ̂2S(t) and Λ̂2G(t) are consistent estimators of Λ1(t), Λ2S(t) and Λ2G(t), respectively (Andersen et al., 1993), and that p̂ is a consistent estimator of p. It is also clear that Λ̂1(t) is independent of Λ̂2S(t), Λ̂2G(t) and p̂. Furthermore, Λ̂2S(t) and Λ̂2G(t) are independent. We have

\[
n_1^{1/2}\{\hat\Lambda_1(t)-\Lambda_1(t)\}=n_1^{-1/2}\int_0^t\frac{dM_1(u)}{y_1(u)}+n_1^{-1/2}\int_0^t\biggl\{\frac{1}{Y_1(u)/n_1}-\frac{1}{y_1(u)}\biggr\}dM_1(u). \qquad (A1)
\]

Assuming Λ1(τ) < ∞, n1^{−1/2}M1 converges weakly to a zero-mean Gaussian process by the martingale central limit theorem (Andersen et al., 1993). It follows from Lin & Ying (2001, Lemma A.1) and our Lemma A1 that n1^{−1/2} ∫_0^t [1/{Y1(u)/n1} − 1/y1(u)] dM1(u) → 0 in probability uniformly in t ∈ [0, τ]. This, together with (A1), implies that

\[
n^{1/2}\{\hat\Lambda_1(t)-\Lambda_1(t)\}=n_1^{-1/2}\rho_1^{-1/2}\sum_{i=1}^{n_1}\int_0^t\frac{dM_{1i}(u)}{y_1(u)}+o_p(1) \qquad (A2)
\]

uniformly in t ∈ [0, τ]. Similarly, we can show that as n → ∞ and n2/nρ2,

\[
n^{1/2}\{\hat\Lambda_2^S(t)-\Lambda_2^S(t)\}=n_2^{-1/2}\rho_2^{-1/2}\sum_{i=1}^{n_2}\int_0^t\frac{g_{2i}\,dM_{2i}^S(u)}{y_2^S(u)}+o_p(1) \qquad (A3)
\]

and

\[
n^{1/2}\{\hat\Lambda_2^G(t)-\Lambda_2^G(t)\}=n_2^{-1/2}\rho_2^{-1/2}\sum_{i=1}^{n_2}\int_0^t\frac{(1-g_{2i})\,dM_{2i}^G(u)}{y_2^G(u)}+o_p(1) \qquad (A4)
\]

uniformly in t ∈ [0, τ].

Let 𝒟[0, τ]³ be the metric space consisting of {f1(t), f2(t), f3(t)}, where fk : [0, τ] → R for k = 1, 2, 3 are right-continuous functions with left limits. The metric of 𝒟[0, τ]³ is defined as d(f, g) = max_{1⩽k⩽3} sup_{t∈[0,τ]} |fk(t) − gk(t)| for f, g ∈ 𝒟[0, τ]³. It is easy to see from equations (A2)–(A4) that the stochastic process n^{1/2}[{Λ̂1(t) − Λ1(t)}, {Λ̂2S(t) − Λ2S(t)}, {Λ̂2G(t) − Λ2G(t)}, (p̂ − p)] in 𝒟[0, τ]³ × R is asymptotically equivalent to a sum of independent and identically distributed random vectors. By the multivariate central limit theorem, its finite-dimensional distributions converge to zero-mean multivariate normal distributions. Moreover, because the elements n^{1/2}{Λ̂1(t) − Λ1(t)}, n^{1/2}{Λ̂2S(t) − Λ2S(t)} and n^{1/2}{Λ̂2G(t) − Λ2G(t)} are square-integrable martingales with respect to their marginal filtrations, their tightness follows from the proof of Pollard (1984, Theorem VIII.13). Hence, n^{1/2}[{Λ̂1(t) − Λ1(t)}, {Λ̂2S(t) − Λ2S(t)}, {Λ̂2G(t) − Λ2G(t)}, (p̂ − p)] converges weakly to a zero-mean Gaussian process {𝒲1(t), 𝒲2(t), 𝒲3(t), 𝒲4} in 𝒟[0, τ]³ × R with the variance–covariance functions between 𝒲1(t1), 𝒲2(t2), 𝒲3(t3) and 𝒲4 given by

\[
\begin{pmatrix}
\rho_1^{-1}\sigma_{11}(t_1) & 0 & 0 & 0\\
0 & \rho_2^{-1}\sigma_{22}(t_2) & 0 & \rho_2^{-1}\sigma_{24}(t_2)\\
0 & 0 & \rho_2^{-1}\sigma_{33}(t_3) & \rho_2^{-1}\sigma_{34}(t_3)\\
0 & \rho_2^{-1}\sigma_{24}(t_2) & \rho_2^{-1}\sigma_{34}(t_3) & \rho_2^{-1}p(1-p)
\end{pmatrix}.
\]

Recall that Ĥ1(t) = ∏_{u⩽t}{1 − dΛ̂1(u)}, Ŝ2(t) = ∏_{u⩽t}{1 − dΛ̂2S(u)} and Ĝ2(t) = ∏_{u⩽t}{1 − dΛ̂2G(u)}. It follows from the functional delta method (Andersen et al., 1993) that the joint stochastic process n^{1/2}[{Ĥ1(t) − H1(t)}, {Ŝ2(t) − S2(t)}, {Ĝ2(t) − G2(t)}, (p̂ − p)] converges weakly to a zero-mean Gaussian process {𝒲1*(t), 𝒲2*(t), 𝒲3*(t), 𝒲4*} with the variance–covariance function between 𝒲1*(t1), 𝒲2*(t2), 𝒲3*(t3) and 𝒲4* given by

\[
\begin{pmatrix}
H_1^2(t_1)\rho_1^{-1}\sigma_{11}(t_1) & 0 & 0 & 0\\
0 & S_2^2(t_2)\rho_2^{-1}\sigma_{22}(t_2) & 0 & -S_2(t_2)\rho_2^{-1}\sigma_{24}(t_2)\\
0 & 0 & G_2^2(t_3)\rho_2^{-1}\sigma_{33}(t_3) & -G_2(t_3)\rho_2^{-1}\sigma_{34}(t_3)\\
0 & -S_2(t_2)\rho_2^{-1}\sigma_{24}(t_2) & -G_2(t_3)\rho_2^{-1}\sigma_{34}(t_3) & \rho_2^{-1}p(1-p)
\end{pmatrix}.
\]

Therefore, for any t ∈ [0, τ], n^{1/2}[{Ĥ1(t) − H1(t)}, {Ŝ2(t) − S2(t)}, {Ĝ2(t) − G2(t)}, (p̂ − p)] converges to a multivariate normal distribution with mean zero and variance–covariance matrix Σ(t).

To prove the consistency of σ̂11(t), we have

\[
\begin{aligned}
\hat\sigma_{11}(t)&=\frac{1}{n_1}\sum_{i=1}^{n_1}\biggl[\int_0^t\biggl\{\frac{1}{Y_1(u)/n_1}-\frac{1}{y_1(u)}\biggr\}d\hat M_{1i}(u)\biggr]^2\\
&\quad+\frac{2}{n_1}\sum_{i=1}^{n_1}\biggl[\int_0^t\biggl\{\frac{1}{Y_1(u)/n_1}-\frac{1}{y_1(u)}\biggr\}d\hat M_{1i}(u)\int_0^t\frac{d\hat M_{1i}(u)}{y_1(u)}\biggr]
+\frac{1}{n_1}\sum_{i=1}^{n_1}\biggl\{\int_0^t\frac{d\hat M_{1i}(u)}{y_1(u)}\biggr\}^2.
\end{aligned}
\]

By the uniform consistency of Λ̂1(t) and Y1(t)/n1 and the fact that |Y1i (t)| ⩽ 1, we have

\[
\begin{aligned}
&\sup_{t\in[0,\tau]}\biggl|\frac{1}{n_1}\sum_{i=1}^{n_1}\biggl[\int_0^t\biggl\{\frac{1}{Y_1(u)/n_1}-\frac{1}{y_1(u)}\biggr\}d\hat M_{1i}(u)\biggr]^2\biggr|\to 0,\\
&\sup_{t\in[0,\tau]}\biggl|\frac{1}{n_1}\sum_{i=1}^{n_1}\biggl[\int_0^t\biggl\{\frac{1}{Y_1(u)/n_1}-\frac{1}{y_1(u)}\biggr\}d\hat M_{1i}(u)\int_0^t\frac{d\hat M_{1i}(u)}{y_1(u)}\biggr]\biggr|\to 0,\\
&\sup_{t\in[0,\tau]}\biggl|\frac{1}{n_1}\sum_{i=1}^{n_1}\biggl\{\int_0^t\frac{d\hat M_{1i}(u)}{y_1(u)}\biggr\}^2-\frac{1}{n_1}\sum_{i=1}^{n_1}\biggl\{\int_0^t\frac{dM_{1i}(u)}{y_1(u)}\biggr\}^2\biggr|\to 0,
\end{aligned}
\]

in probability. Thus, σ̂11(t) = n1^{−1} Σ_{i=1}^{n1} {∫_0^t dM1i(u)/y1(u)}² + op(1) uniformly in t ∈ [0, τ]. Furthermore, it follows from the uniform law of large numbers (Newey & McFadden, 1994, Lemma 2.4) that

\[
\sup_{t\in[0,\tau]}\biggl|\frac{1}{n_1}\sum_{i=1}^{n_1}\biggl\{\int_0^t\frac{dM_{1i}(u)}{y_1(u)}\biggr\}^2-E\biggl[\biggl\{\int_0^t\frac{dM_{1i}(u)}{y_1(u)}\biggr\}^2\biggr]\biggr|\to 0
\]

in probability. Therefore, σ̂11(t) → E[{∫_0^t dM1i(u)/y1(u)}²] = σ11(t) in probability uniformly in t ∈ [0, τ].

The uniform consistency of σ̂22(t), σ̂33(t), σ̂24(t) and σ̂34(t) can be proved similarly. The consistency of p̂(1 − p̂) follows directly from the weak law of large numbers.

Proof of Theorem 1. It follows from Lemma 1 and the delta method that at a given time-point t ∈ [0, τ ], n1/2[{Ŝ1(t) – S1(t)}, {Ŝ2(t) – S2(t)}] is asymptotically normally distributed with mean zero and variance-covariance matrix f Σ(t) f′, where

\[
f=\begin{pmatrix}1/p & 0 & -(1-p)/p & -\{H_1(t)-G_2(t)\}/p^2\\ 0 & 1 & 0 & 0\end{pmatrix}.
\]

Applying Slutsky’s theorem, we have ν̂11(t) → ν11(t), ν̂22(t) → ν22(t) and ν̂12(t) → ν12(t) in probability as n → ∞, n1/nρ1 and n2/nρ2.

Proof of Theorem 2. The theorem can be proved using essentially the same arguments given in the appendix of Berger & Boos (1999). Thus, we omit the details.

References

  1. Albert JM. Estimating efficacy in clinical trials with clustered binary responses. Statist Med. 2002;21:649–61. doi: 10.1002/sim.1059. [DOI] [PubMed] [Google Scholar]
  2. Andersen PK, Borgan Ø, Gill RD, Keiding N. Statistical Models Based on Counting Processes. New York: Springer; 1993. [Google Scholar]
  3. Berger RL, Boos DD. Confidence limits for the onset and duration of treatment effect. Biomet J. 1999;41:517–31. [Google Scholar]
  4. Chung KL. A Course in Probability Theory. 3rd ed. New York: Academic Press; 2001. [Google Scholar]
  5. Cuzick J, Edwards R, Segnan N. Adjusting for non-compliance and contamination in randomized clinical trials. Statist Med. 1997;16:1017–29. doi: 10.1002/(sici)1097-0258(19970515)16:9<1017::aid-sim508>3.0.co;2-v. [DOI] [PubMed] [Google Scholar]
  6. Cuzick J, Sasieni P, Myles J, Tyrer J. Estimating the effect of treatment in a proportional hazards model in the presence of non-compliance and contamination. J. R. Statist. Soc. B. 2007;69:565–88. [Google Scholar]
  7. Frangakis CE, Rubin DB. Addressing complications of intention-to-treat analysis in the combined presence of all-or-none treatment-non-compliance and subsequent missing outcomes. Biometrika. 1999;86:365–79. [Google Scholar]
  8. Goetghebeur E, Molenberghs G. Causal inference in a placebo-controlled clinical trial with binary outcome and ordered compliance. J Am Statist Assoc. 1996;91:928–34. [Google Scholar]
  9. Hollis S, Campbell F. What is meant by intention to treat analysis? Survey of published randomised controlled trials. Br Med J. 1999;319:670–4. doi: 10.1136/bmj.319.7211.670. [DOI] [PMC free article] [PubMed] [Google Scholar]
  10. Kaplan EL, Meier P. Nonparametric estimation from incomplete observations. J Am Statist Assoc. 1958;53:457–81. [Google Scholar]
  11. Kemeny MM, Adak S, Gray B, Macdonald JS, Smith T, Lipsitz S, Sigurdson ER, O’Dwyer PJ, Benson AB. Combined-modality treatment for resectable metastatic colorectal carcinoma to the liver: surgical resection of hepatic metastases in combination with continuous infusion of chemotherapy—an intergroup study. J Clin Oncol. 2002;20:1499–505. doi: 10.1200/JCO.2002.20.6.1499. [DOI] [PubMed] [Google Scholar]
  12. Lachin JM. Statistical considerations in the intent-to-treat principle. Contr. Clin. Trials. 2000;21:167–89. doi: 10.1016/s0197-2456(00)00046-5. [DOI] [PubMed] [Google Scholar]
  13. Lin DY, Ying Z. Semiparametric and nonparametric regression analysis of longitudinal data. J Am Statist Assoc. 2001;96:103–13. [Google Scholar]
  14. Loeys T, Goetghebeur E. A causal proportional hazards estimator for the effect of treatment actually received in a randomized trial with all-or-nothing compliance. Biometrics. 2003;59:100–5. doi: 10.1111/1541-0420.00012. [DOI] [PubMed] [Google Scholar]
  15. Loeys T, Goetghebeur E, Vandebosch A. Causal proportional hazards models and time-constant exposure in randomized clinical trials. Lifetime Data Anal. 2005;11:435–49. doi: 10.1007/s10985-005-5233-z. [DOI] [PubMed] [Google Scholar]
  16. Morton DL, Thompson JF, Cochran AJ, Mozzillo N, Elashoff R, Essner R, Nieweg OE, Roses DF, Hoekstra HJ, Karakousis CP, et al. Sentinel-node biopsy or nodal observation in melanoma. New Engl J Med. 2006;355:1307–17. doi: 10.1056/NEJMoa060992. [DOI] [PubMed] [Google Scholar]
  17. Newey WK, McFadden D. Handbook of Econometrics. Vol. 4. Amsterdam: Elsevier Science; 1994. Large sample estimation and hypothesis testing; pp. 2111–245. Ch. 36. [Google Scholar]
  18. Pollard D. Weak Convergence of Stochastic Processes. New York: Springer; 1984. [Google Scholar]
  19. Robins JM, Tsiatis AA. Correcting for non-compliance in randomized trials using rank preserving structural failure time models. Commun. Statist. A. 1991;20:2609–31. [Google Scholar]
  20. Sommer A, Zeger SL. On estimating efficacy from clinical trials. Statist Med. 1991;10:45–53. doi: 10.1002/sim.4780100110. [DOI] [PubMed] [Google Scholar]
  21. Zelen M. A new design for randomized clinical trials. New Engl J Med. 1979;300:1242–5. doi: 10.1056/NEJM197905313002203. [DOI] [PubMed] [Google Scholar]
  22. Zelen M. Randomized consent designs for clinical trials: an update. Statist Med. 1990;9:645–56. doi: 10.1002/sim.4780090611. [DOI] [PubMed] [Google Scholar]
