Abstract
Instrumental variables (IV) are a useful tool for estimating causal effects in the presence of unmeasured confounding. IV methods are well developed for uncensored outcomes, particularly for structural linear equation models, where simple two-stage estimation schemes are available. The extension of these methods to survival settings is challenging, partly because of the nonlinearity of the popular survival regression models and partly because of the complications associated with right censoring or other survival features. Motivated by the Prostate, Lung, Colorectal and Ovarian (PLCO) Cancer screening trial, we develop a simple causal hazard ratio estimator in a proportional hazards model with right censored data. The method exploits a special characterization of IV which enables the use of an intuitive inverse weighting scheme that is generally applicable to more complex survival settings with left truncation, competing risks, or recurrent events. We rigorously establish the asymptotic properties of the estimators, and provide plug-in variance estimators. The proposed method can be implemented in standard software, and is evaluated through extensive simulation studies. We apply the proposed IV method to a data set from the Prostate, Lung, Colorectal and Ovarian cancer screening trial to delineate the causal effect of flexible sigmoidoscopy screening on colorectal cancer survival which may be confounded by informative noncompliance with the assigned screening regimen.
Keywords: Causal treatment effect, Cox proportional hazards model, Instrumental variable, Noncompliance
1. Introduction
Research studies are often fundamentally interested in understanding the causal effect of a treatment or exposure on an outcome of interest (Holland, 1986). In observational studies, unmeasured confounding is a major obstacle to estimating the causal effect of a nonrandomized exposure on disease etiology. Such a challenge also arises in well-designed randomized clinical trials. When there are issues of non-compliance in the treatment arms, the treatment decision may be based on latent (unobserved) factors that strongly correlate with clinical outcomes. This would result in bias from unmeasured confounding and hence complicate the task of estimating the “efficacy” of the treatment.
Instrumental variables (IVs) offer a useful tool for estimating causal treatment or exposure effects in these settings (Angrist and Imbens, 1995; Angrist et al., 1996; Loeys and Goetghebeur, 2003; Li and Lu, 2015; Li and Gray, 2016; MacKenzie et al., 2016). Informally, IVs have the characteristics of being independent of unmeasured confounders, being related to the treatment, and only being related to the outcome through the treatment (Baiocchi et al., 2014). In observational studies, there are a variety of potential sources for instruments that can aid in the estimation of causal effects, either of treatment or exposure (Baiocchi et al., 2014). In randomized clinical trials with non-compliance, the treatment assignment mechanism can serve as an instrumental variable.
The motivating example for this work is the Prostate, Lung, Colorectal and Ovarian (PLCO) Cancer screening trial, a multi-center randomized trial designed to evaluate the effectiveness of screening with flexible sigmoidoscopy compared with usual care. In this study, 77,449 subjects were randomly assigned to the intervention group, but only 85% complied with the assigned sigmoidoscopy protocol. Such non-compliance may be outcome-related; for example, relatively healthy individuals may be more likely to skip the screening. In the presence of unmeasured confounding, neither an intent-to-treat (ITT) analysis nor an “as-treated” analysis would be adequate for assessing the causal benefit of the treatment (i.e. flexible sigmoidoscopy screening). A possible remedy is an IV analysis that properly adjusts for the selection bias induced by subjects’ post-randomization care selection. The assigned treatment in a randomized trial serves as a natural instrumental variable which may be utilized in this analysis.
IV methodology has primarily focused on linear models and continuous outcomes in contexts without censoring. Recently, research on IV methodology for time-to-event data with right censoring has grown rapidly. For example, Baker (1998) developed an IV method for randomized trials with all-or-none compliance and discrete time survival data, which extended the method of latent class instrumental variables (also known as the approach of Local Average Treatment Effect or Complier Average Causal Effect) (Baker and Lindeman, 1994; Imbens and Angrist, 1994). Building on Baker (1998)’s work, Nie et al. (2011) developed an estimation method with improved efficiency. Robins and Tsiatis (1991) considered a structural accelerated failure time model and developed estimators for the causal treatment effect in the context with non-compliance and only administrative censoring. Joffe (2001) provided a detailed discussion of this general approach. Imposing parametric distributional assumptions, Li and Lu (2015) developed a Bayesian approach for IV analysis with censored time-to-event outcome under a two-stage linear model. Li et al. (2015) and Tchetgen et al. (2015) developed IV based methods under additive hazards modeling of time-to-event data. Martinussen et al. (2017) studied a structural cumulative survival model that allows for assessing time-varying exposure effect directly on the scale of survival function.
In time-to-event analysis, the proportional hazards model is the most popular formulation for the effects of treatment and covariates. There have been several IV approaches developed under the proportional hazards modeling. For example, for the special case of all-or-none noncompliance without covariates, Loeys and Goetghebeur (2003) proposed an estimate for the complier proportional hazards effect of treatment by deriving a properly imputed partial likelihood that recovered the unobserved information on the treatable subgroup in the control arm. Also working in the noncompliance setting, Cuzick et al. (2007) constructed a Mantel-Haenszel-type estimator for the case without covariates and a partial-likelihood based estimator when covariates were present and independent of compliance types. A full likelihood based approach was explored for situations where covariates were correlated with compliance type. Li and Gray (2016) further proposed an EM algorithm for the full likelihood based estimation. Yu et al. (2015) tackled the problem of estimating causal estimands including the complier average causal effect, complier survival probability, and complier quantile causal effect under the semiparametric transformation model. They adapted the nonparametric likelihood estimation technique of Zeng and Lin (2007), and provided an EM algorithm for implementing the proposed estimation as well as theoretical justifications. While the likelihood-based strategies accommodate both censoring and covariates in the estimation of the causal treatment effect with censored time-to-event data, the resulting estimation and inference procedures are generally very complicated. The computational burden can become prohibitive, and numerical stability can deteriorate, when the sample size is large, such as in the PLCO Cancer screening trial.
Furthermore, they require specifying causal models for all latent compliance classes, not just that of interest, which may impair the robustness of these methods to potential model misspecification.
In this work, we develop a new IV approach to estimating a causal treatment effect under the proportional hazards modeling of time-to-event outcome subject to independent right censoring. The causal estimand of our interest is defined within the latent subgroup of compliers, and is different from the population causal hazards ratio considered in other recent developments of IV methods (Martinussen et al., 2017; Wang et al., 2018, for example). Notably, our method does not need to impose regression models for the latent compliance classes other than the complier subgroup. Our key strategy is to adapt the seminal work of Abadie (2003), which provides a simple link between the unconditional moment of the observed data and the conditional moment given the latent complier group. Abadie (2003) developed a simple weighting strategy which is easily applied to estimating equations that are sums of independent terms. However, an analogous application to the proportional hazards regression is not straightforward. This is because the partial likelihood does not yield an estimating function in the simple form of a sum of independent terms, as the least squares criterion for linear regression does. To circumvent this difficulty, we take carefully designed steps to incorporate the weighting idea of Abadie (2003) through the asymptotic influence functions of the partial likelihood score equation. We further devise the weighting scheme so that the calculation of the parameter estimate can be easily and stably implemented via existing software for weighted proportional hazards regression.
Compared to currently available weighting methods, such as the time-dependent weighted estimator proposed by Li and Gray (2016) and the estimator from principal stratification weighting (MacKenzie et al., 2016), our weighting strategy is simple and yet readily applicable to more complex survival settings, for example, in the presence of left truncation, competing risks, or recurrent events; see Section S3 of Supplementary Materials for more details. Such a broad applicability appears lacking in existing IV approaches for proportional hazards models. Finally, we establish the large sample properties of the proposed parameter estimators, including consistency and asymptotic normality.
In Section 2, we first introduce the potential outcomes framework including the latent compliance groups, the IV assumptions, and the set-up of causal proportional hazards regression. We next describe the proposed estimation procedure with randomly censored data, discuss computational considerations, and present a modification of the proposed method which has improved computational features. We rigorously present the consistency and asymptotic normality of the estimators. The results include a closed form for the asymptotic variance of the estimator and a consistent plug-in variance estimator. Bootstrap variance estimates are also provided. The results from extensive simulations are reported in Section 3 and demonstrate that the methods perform well with realistic sample sizes. In Section 4, we apply our methods to the data from the PLCO Cancer screening trial. Section 5 concludes with some remarks.
2. Weighted Partial Likelihood Estimation for Causal Proportional Hazards Models
2.1. Potential Outcomes Framework
We introduce the potential outcomes framework and notation commonly employed in the causal inference literature. Consider potential survival times T1 and T0 based on receiving (D = 1) and not receiving the treatment (D = 0), respectively. Define V as a binary IV, and define the potential treatment Dv such that D1 denotes the treatment received when V = 1 and D0 denotes the treatment received when V = 0. Following the terminology of Abadie (2003), subjects can be classified into 4 latent compliance groups based on the potential treatment indicators: compliers (D1 > D0), always-takers (D1 = D0 = 1), never-takers (D1 = D0 = 0), and defiers (D1 < D0). In the PLCO Cancer screening trial, compliers would be the individuals who would take the flexible sigmoidoscopy screening if and only if assigned to the intervention group. Always-takers (or never-takers) are defined as always (or never) taking the flexible sigmoidoscopy screening, regardless of assignment. Defiers are individuals who would take the flexible sigmoidoscopy screening if assigned to the usual care group but not if assigned to the intervention group. Since D1 and D0 cannot be observed at the same time, we are not able to determine the latent compliance group membership of any individual based on the observed data alone.
Define the potential outcome for each subject as Tvd, which represents the survival time T if V = v and D = d. Let X represent the covariate vector. We re-state several key assumptions from Abadie (2003) about the IV, V: Let Tvd, X, V, D, Dv be defined as above.
(A1) Independence of the instrument: conditional on X, the vector (T00, T01, T10, T11, D1, D0) is independent of V.
(A2) Exclusion of the instrument: P(T1d = T0d∣X) = 1 for d = 0, 1.
(A3) First stage: 0 < P(V = 1∣X) < 1 and P(D1 = 1∣X) > P(D0 = 1∣X)
(A4) Monotonicity: P(D1 ≥ D0∣X) = 1
Assumption (A1) says that the instrument V is as good as random conditional on the covariates X, or equivalently, that V is independent of unmeasured confounders conditional on X. Assumption (A2) says that the instrument V only influences the outcome T through its effect on the treatment D. Assumption (A3) states that every subject has a positive probability of receiving either value of the instrument V, conditional on the covariates X, and that, conditional on X, V has an effect on the treatment received. Finally, assumption (A4) says that, with probability one, defiers do not exist.
2.2. Model Formulation
Our focus is to estimate and make inferences about the treatment effect for the latent group of compliers. Specifically, we adopt Cox’s proportional hazards regression model to formulate the effects of treatment and covariates for compliers:
h(t; D, X) = h0(t) exp(βdD + βxTX), | (2.1) |
where h(t; D, X) is the hazard function for compliers defined as h(t; D, X) = limΔt→0 Pr(t ≤ T(D) < t + Δt∣T(D) ≥ t, D1 > D0, D, X)/Δt,
and h0(t) is an unspecified baseline hazard at time t. In model (2.1), βd is the causal estimand of the primary interest. This is because, by assumption (A1), it is easy to show that h(t; D = d, X) = limΔt→0 Pr(t ≤ T(d) < t + Δt∣T(d) ≥ t, V = d, D1 > D0, X)/Δt = limΔt→0 Pr(t ≤ T(d) < t + Δt∣T(d) ≥ t, D1 > D0, X)/Δt. Thus, βd can be interpreted as the causal treatment effect for compliers after adjusting for the covariate effects captured by βx (Abadie, 2003). Such a quantity has frequently been of interest in the literature (Loeys and Goetghebeur, 2003; Cuzick et al., 2007; Yu et al., 2015, for example). It is worth emphasizing that the proportional hazards model (2.1) is only assumed for compliers. In contrast, likelihood-based approaches (Cuzick et al., 2007; Yu et al., 2015; Li and Gray, 2016, for example) typically require distributional modeling for the other compliance subgroups (e.g. always-takers, never-takers) and may be biased under misspecification of those models.
2.3. Estimation
In practice, T is often subject to right censoring by C; thus we observe W = min(T, C) and δ = I(T ≤ C) instead of T. We adopt the standard random censoring assumption that C is independent of T conditional on (V, D, X). We further assume that C is independent of V given X. Define O = (W, δ, D, X, V). The observed data consist of n independent and identically distributed (i.i.d.) replicates of O, denoted by Oi = (Wi, δi, Di, Xi, Vi), i = 1, …, n. Define Yi(t) = I(Wi ≥ t) and Ni(t) = I(Wi ≤ t, δi = 1), which represent the at-risk process and the observed event counting process for subject i, respectively. We also assume that there are no ties (i.e. Σi dNi(t) ≤ 1). In the sequel, we use the subscript i to index the subject-level analogues of population quantities throughout the paper.
Let β0 = (βd, βx) and Z = (D, X). When all subjects are known to be compliers, the estimation of β0 can proceed through standard Cox regression analysis (Andersen and Gill, 1982). This is because, in this case, the hazard function for the whole study population, λ(t∣Z) ≡ limΔt→0 Pr(t ≤ T ≤ t + Δt∣T ≥ t, D, X)/Δt, equals that for the latent complier subgroup, exp(βTZ) h0(t). Then M(t) = N(t) − ∫0t Y(s) exp(β0TZ) h0(s) ds is a martingale, and thus a consistent estimator of β0 can be obtained as the solution of the partial likelihood score equation,
Un(β) ≡ Σi=1n ∫0∞ {Zi − S(1)(β, t)/S(0)(β, t)} dNi(t) = 0, | (2.2) |
where S(j)(β, t) = (1/n) Σi=1n Yi(t) exp(βTZi) Zi⊗j for j = 0, 1, 2. Here and in the sequel, for a vector v, v⊗0 = 1, v⊗1 = v, and v⊗2 = vvT.
Next we consider the more realistic case where the study population consists of both compliers and non-compliers. In this case, λ(t∣Z) generally deviates from the hazard function assumed for the complier group, exp(βTZ) h0(t). As a result, M(t) is no longer a martingale for the whole study population, and equation (2.2) would fail to provide a valid estimate for β0.
To construct an appropriate estimating equation for β0, we utilize the fact that M(t) remains a martingale for the complier group. Using this fact, we can show that μc(β0) = 0 under model (2.1), where sc(j)(β, t) = E{Y(t) exp(βTZ) Z⊗j∣D1 > D0} (j = 0, 1, 2) and
μc(β) = E[∫0∞ {Z − sc(1)(β, t)/sc(0)(β, t)} dN(t)∣D1 > D0].
However, μc(β) cannot be directly used to estimate β0 because the latent complier group, {D1 > D0}, is not observed. To tackle this difficulty, we adopt the strategy of Abadie (2003), which established a simple link between the unconditional moment of the observed data and the conditional moment of the data within the complier group. A simple weighting approach may then be employed to identify the regression parameters associated with the complier group. More specifically, let g(·) be a measurable real function of (T, D, X, C) such that E∣g(T, D, X, C)∣ < ∞. Under assumptions (A1)–(A4), and given that C is independent of V conditional on X, an application of Theorem 3.1 of Abadie (2003) immediately implies that
E{g(T, D, X, C)∣D1 > D0} = E{κ g(T, D, X, C)}/P(D1 > D0), | (2.3) |
where
κ = 1 − D(1 − V)/P(V = 0∣X) − (1 − D)V/P(V = 1∣X). | (2.4) |
This result suggests that a weighting scheme involving κ can lead to the identification of moment-type statistics for compliers. One should recognize that κ can take both positive and negative values. This differs from standard weighting procedures based on probability weighting, where the weights are always positive as a result of probabilities being non-negative. This creates nonstandard computational challenges, which are discussed further below.
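To make the sign behavior of κ concrete, the short sketch below evaluates Abadie (2003)'s weight κ = 1 − D(1 − V)/P(V = 0∣X) − (1 − D)V/P(V = 1∣X) in plain Python (the paper's own computations use R); the propensity value p is an assumed illustrative number. The weight equals 1 for the (D, V) patterns compatible with compliance and is negative for the patterns that can only arise from always-takers or never-takers.

```python
def abadie_kappa(d, v, p):
    """Abadie (2003) complier weight.

    d: treatment received (0/1); v: binary instrument (0/1);
    p: P(V = 1 | X), assumed known or estimated beforehand.
    """
    return 1.0 - d * (1 - v) / (1 - p) - (1 - d) * v / p

p = 0.5  # illustrative instrument propensity
print(abadie_kappa(1, 1, p))  # 1.0: (D, V) pattern compatible with compliance
print(abadie_kappa(0, 0, p))  # 1.0: also compatible with compliance
print(abadie_kappa(0, 1, p))  # -1.0: a never-taker pattern -> negative weight
print(abadie_kappa(1, 0, p))  # -1.0: an always-taker pattern -> negative weight
```

The negative values for the last two patterns are exactly what distinguishes κ from an ordinary probability weight, and they are the source of the computational issues discussed below.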
Using (2.3), we obtain the following key result for deriving an estimating equation for β0:
E[κ ∫0∞ {Z − sc(1)(β, t)/sc(0)(β, t)} dN(t)] = P(D1 > D0) μc(β),
where, by another application of (2.3), sc(j)(β, t) = E{κ Y(t) exp(βTZ) Z⊗j}/P(D1 > D0) for j = 0, 1, 2.
Suppose κi is known for each subject i. One may construct a weighted estimating equation for β0, Un,κ(β) = 0, where
Un,κ(β) = (1/n) Σi=1n κi ∫0∞ {Zi − Sκ(1)(β, t)/Sκ(0)(β, t)} dNi(t),
with Sκ(j)(β, t) = (1/n) Σi=1n κi Yi(t) exp(βTZi) Zi⊗j for j = 0, 1, 2. Note that Un,κ(β) remains the same if dNi(s) is replaced by dMi(s), and hence Un,κ(β) is proportional to an empirical counterpart of μc(β). This justifies the use of Un,κ(β) for constructing the estimating equation for β0.
In general, the κi’s are not known a priori, unless, for example, external information is available. In practice, we propose to estimate κi by imposing additional modeling assumptions on Pr(V = 1∣X). Specifically, we may assume a logistic regression model for V:
P(V = 1∣X) = p(X; α0) ≡ exp(α0TX̃)/{1 + exp(α0TX̃)}, | (2.5) |
where X̃ = (1, XT)T. Let α̂ be the maximum likelihood estimator of α0 (Gourieroux and Monfort, 1981; Agresti, 2013) and define
κ̂i = 1 − Di(1 − Vi)/{1 − p(Xi; α̂)} − (1 − Di)Vi/p(Xi; α̂). | (2.6) |
Replacing the κi in Un,κ(β) by κ̂i leads to the proposed estimating equation:
Un,κ̂(β) = (1/n) Σi=1n κ̂i ∫0∞ {Zi − Sκ̂(1)(β, t)/Sκ̂(0)(β, t)} dNi(t) = 0, | (2.7) |
where
Sκ̂(j)(β, t) = (1/n) Σi=1n κ̂i Yi(t) exp(βTZi) Zi⊗j, j = 0, 1, 2. | (2.8) |
Denote the solution to equation (2.7) by β̂. The detailed computational algorithm for obtaining β̂, and the related algorithmic issues and remedies, are discussed in the next subsection.
2.4. The Computational Algorithm
The form of the proposed estimating equation (2.7) closely resembles the estimating equation for a weighted Cox proportional hazards regression. However, an important distinction is that the weights κ̂i in (2.7) can take negative values. As a result, Un,κ̂(β) can have a highly irregular surface with multiple zero-crossings. To address this complication, we propose to locate β̂ through finding the maximizer of a properly designed objective function. Specifically, instead of directly solving Un,κ̂(β) = 0, we propose to obtain β̂ as the maximizer of the following objective function
ln,κ̂(β) = (1/n) Σi=1n κ̂i ∫0∞ [βTZi − log{max(Sκ̂(0)(β, t), ν)}] dNi(t), | (2.9) |
where ν is a pre-specified small positive value. The justification for doing so is that the maximizer of (2.9) would be nearly the same as the solution of (2.7) because ν can be arbitrarily small. Truncating Sκ̂(0)(β, t) below by ν ensures the positiveness of the quantity inside the logarithm. In theory, the asymptotic limit of Sκ̂(0)(β, t) is strictly positive under mild regularity conditions. Therefore, such a truncation should have negligible impact on the finite-sample performance of β̂ when n is reasonably large. In our numerical studies, we choose ν = 10−4.
The procedure for obtaining β̂ is as follows.
Step 1: Fit the logistic regression model (2.5) to {(Vi, Xi), i = 1, …, n} and obtain α̂.
Step 2: Calculate κ̂i using formula (2.6).
Step 3: Find the maximizer β̂ of the objective function in (2.9) by an optimization routine, such as the optim() function in R (R Core Team, 2017).
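The ν-truncated objective can be sketched compactly. The Python fragment below (the paper's implementation uses R's optim()) evaluates a weighted log partial likelihood with the floor max(S(0), ν) and maximizes it by a crude grid search over a scalar coefficient; the data tuples (W, δ, Z) and the κ weights are hypothetical illustrative values, and grid search merely stands in for a proper BFGS routine.

```python
import math

def s0(beta, t, data, kappa):
    # weighted at-risk sum: (1/n) * sum_j kappa_j * 1{W_j >= t} * exp(beta * Z_j)
    n = len(data)
    return sum(k * math.exp(beta * z)
               for (w, delta, z), k in zip(data, kappa) if w >= t) / n

def objective(beta, data, kappa, nu=1e-4):
    # nu-truncated weighted log partial likelihood, a sketch of (2.9);
    # negative kappa's can drive s0 below 0, and the nu floor keeps log defined
    n = len(data)
    total = 0.0
    for (w, delta, z), k in zip(data, kappa):
        if delta == 1:  # sum contributions over observed events
            total += k * (beta * z - math.log(max(s0(beta, w, data, kappa), nu)))
    return total / n

# hypothetical data: (observed time W, event indicator delta, covariate Z)
data = [(0.4, 1, 1.0), (0.6, 1, 0.0), (1.0, 1, 1.0), (1.5, 1, 0.0), (0.3, 0, 1.0)]
kappa = [1.0, 1.0, 1.0, 1.0, -0.8]  # kappa weights may be negative

# crude grid search; a real implementation would call optim()/BFGS
grid = [i / 100 for i in range(-300, 301)]
beta_hat = max(grid, key=lambda b: objective(b, data, kappa))
```

With these toy values the surface is smooth and the grid maximizer lies in the interior of the search range; with larger negative weights the truncation becomes active and the surface flattens, which is exactly the irregularity the modified weighting scheme of the next subsection is designed to avoid.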
2.5. A Modified Weighting Scheme
In principle, the objective function ln,κ̂(β) approaches a limit that is concave, and standard optimization routines are expected to work well when the sample size is large. However, the presence of negative weights κ̂i can sometimes lead to highly irregular surfaces for Un,κ̂(β) and ln,κ̂(β) (see figures in Section S4 of Supplementary Materials) and result in numerical instability for estimating β0. To address this problem, we propose a modified weighting scheme, which avoids negative weights and allows us to obtain β̂ through standard computational routines for the weighted proportional hazards regression, such as the coxph() function in R (Therneau, 2015).
Let U = (W, δ, D, X). We define a modified weight by projecting the original weight κ as follows:
κv = E(κ∣U) = 1 − D{1 − v0(U)}/{1 − P(V = 1∣X)} − (1 − D)v0(U)/P(V = 1∣X), | (2.10) |
where v0(U) = E(V∣U) = P(V = 1∣W, δ, D, X). Adapting the arguments of Abadie et al. (2002), we can show that κv = P(D1 > D0∣U) and κv can play the same role as κ in equation (2.3) (see Section S1 of Supplementary Materials). This result indicates that κv is a probability; thus it is always non-negative and can be regarded as a proper weight. Adopting the weighting scheme by κv can avoid the potential numerical issues present with κ. We propose to estimate κv as follows.
Step 1: Stratify the data by the censoring and treatment status: {(δ = c, D = d)}, c = 0, 1, d = 0, 1.
Step 2: Within each stratum, fit a nonparametric or parametric regression model for V given the covariates (W, X). This provides an estimate of v0(U), denoted by v̂0(U).
Step 3: Calculate the estimated κv as
κ̂v = 1 − D{1 − v̂0(U)}/{1 − p(X; α̂)} − (1 − D)v̂0(U)/p(X; α̂).
In Step 2 above, we may consider non-parametric power series (NPPS) regression or logistic regression for V given (W, X). Based on our extensive numerical experience (including results reported and not reported in Section 3), a second-order logistic regression model with the interaction between W and X works well compared to approaches that estimate v̂0(U) by NPPS or by a first-order logistic regression. When the dimension of X is large, we recommend using penalized logistic regression to obtain a reasonable estimate of v0(U). Note that, with finite sample sizes, the resulting estimator κ̂v may be negative or greater than 1. To circumvent the undesirable numerical properties associated with negative weights, we propose a slightly different modified weight, κ̂v,tr, that truncates κ̂v so that its value lies strictly within an interval contained in (0, 1), say [0.01, 0.99]. Since the true weight κv is between 0 and 1 and the truncation interval can be made arbitrarily close to (0, 1), there should be negligible asymptotic bias induced by such a truncation. Using κ̂v,tr in place of κ̂ in (2.7), we can easily obtain β̂ through the R function coxph() with the weights argument properly specified. In Section 3, we thoroughly examine the performance of the proposed estimator with different choices of weight.
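The projection-and-truncation step above is mechanically simple. The sketch below (plain Python for illustration; the paper fits the actual models in R) computes the projected weight from pre-fitted estimates of v0(U) = P(V = 1∣W, δ, D, X) and of P(V = 1∣X), then clamps it into [0.01, 0.99]; the numeric inputs are hypothetical, including a regression estimate of v0 that strays above 1 to show why clamping is needed.

```python
def kappa_v(d, v0, p):
    """Projected weight kappa_v = E(kappa | U) = P(D1 > D0 | U).

    d: treatment received (0/1);
    v0: estimate of P(V = 1 | W, delta, D, X), assumed pre-fitted;
    p:  estimate of P(V = 1 | X), assumed pre-fitted.
    """
    return 1.0 - d * (1 - v0) / (1 - p) - (1 - d) * v0 / p

def truncate(w, lo=0.01, hi=0.99):
    # kappa_{v,tr}: clamp the estimated weight strictly inside (0, 1)
    return min(max(w, lo), hi)

# finite-sample estimates of kappa_v can fall below 0 ...
w_neg = kappa_v(d=1, v0=0.2, p=0.6)    # evaluates to -1.0
# ... or exceed 1 when an unconstrained fit of v0 strays above 1
w_big = kappa_v(d=1, v0=1.05, p=0.6)   # evaluates to 1.125
print(truncate(w_neg), truncate(w_big))
```

After truncation every weight is a valid positive case weight, which is what allows the estimation to be delegated to off-the-shelf weighted Cox software.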
2.6. Large Sample Results
We assume the following regularity conditions:
(C1): The parameter space for β is compact.
(C2): ∥Z∥ < ∞ and ∣κ∣ < ∞.
(C3): The limit in probability of Sκ̂(0)(β, t) is bounded away from 0 uniformly in β and t.
(C4): Σ0 > 0, where Σ0 is defined in (B.1) of Section S2 of Supplementary Materials.
(C5): α̂ converges in probability to α0.
(C6): There exists an influence function Iα(·) such that n1/2(α̂ − α0) = n−1/2 Σi=1n Iα(Oi) + op(1).
We establish the consistency and the asymptotic normality for the proposed estimator in the following theorems:
Theorem 1. (Consistency) Under conditions (C1)-(C5), β̂ converges in probability to β0.
Theorem 2. (Asymptotic normality) Under conditions (C1)-(C6), n1/2(β̂ − β0) converges in distribution to N(0, Ω), where Ω is defined in Section S2 of Supplementary Materials (see equation (B.12)).
The regularity conditions (C1)-(C2) impose the boundedness of the parameter space and covariates, which are mild and often met in practice. The boundedness of κ is satisfied when Pr(V = 1∣X) is bounded away from 0 and 1. Conditions (C3)-(C4) are standard assumptions for Cox proportional hazards regression methods. For example, condition (C4) ensures the identifiability of β0. Conditions (C5)-(C6) depict reasonable requirements on the estimator of α0, namely consistency and an asymptotic i.i.d. sum representation. The detailed proofs of Theorems 1 and 2 are provided in Section S2 of Supplementary Materials.
It is worth pointing out that the theoretical properties, including consistency and root-n asymptotic normality, can also be established for the proposed estimator based on the modified weighting scheme presented in Section 2.5. The theoretical arguments can follow similar lines as the proofs of Theorems 1–2. The main distinction lies in the derivation of the influence function, which needs to account for the additional variability induced by the estimation of v0(U). The detailed asymptotic results for the estimator with the modified weight are omitted in this paper but are available upon request.
2.7. Variance Estimation
In the proof of Theorem 2, we derive a closed form for the asymptotic variance of β̂; see equation (B.12) of Section S2 of Supplementary Materials. A consistent variance estimator for β̂ (with weight κ̂) can be obtained by Ω̂, where Ω̂ is Ω with unknown quantities replaced by their empirical counterparts or consistent estimators.
An alternative approach to estimating the asymptotic variance of β̂ (with weight κ̂, κ̂v, or κ̂v,tr) is to use bootstrapping: Step 1: Resample n observations from the original dataset with replacement, and add a small amount of noise (e.g. N(0, 10−10)) to avoid the presence of ties in the resampled data; Step 2: Calculate the estimate β̂(b) based on the resampled data, with weights as described in Section 2 (i.e. κ̂, κ̂v, or κ̂v,tr); Step 3: Repeat Steps 1-2 for b = 1, …, B; Step 4: Estimate the asymptotic variance of β̂ by the empirical variance of {β̂(1), …, β̂(B)}.
In the bootstrapping procedure, the computations in Step 2 may fail to converge. In such a case, we would carry out Step 3 until there are B convergent estimates. In addition, repeated resampling may occasionally produce outlier estimates that artificially inflate the empirical variance in Step 4. When this occurs, one may estimate the standard deviation of β̂ by the median absolute deviation, namely, 1.4826 × MAD, where MAD = medianb∣β̂(b) − median(β̂(1), …, β̂(B))∣ (Rousseeuw and Croux, 1993). This alternative approach performs quite well in our numerical experience.
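The robust standard error above is a one-liner in any language. A minimal Python sketch (the bootstrap estimates shown are hypothetical numbers, with one artificial outlier included to show the robustness):

```python
from statistics import median

def robust_boot_se(estimates):
    """Robust bootstrap SE: 1.4826 * MAD (Rousseeuw and Croux, 1993).

    estimates: convergent bootstrap estimates beta^(1), ..., beta^(B).
    """
    m = median(estimates)
    mad = median(abs(b - m) for b in estimates)
    return 1.4826 * mad

# hypothetical bootstrap replicates; 5.0 is an artificial outlier
boots = [0.48, 0.52, 0.50, 0.47, 0.53, 5.0]
print(robust_boot_se(boots))  # ~0.037: the outlier barely moves the estimate
```

The ordinary empirical standard deviation of the same replicates is dominated by the single outlier, which is precisely the inflation the MAD-based estimate guards against.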
3. Simulation Study
We conduct extensive simulations to assess the performance of the proposed estimators. To create data under assumptions (A1) to (A4), we take the following steps:
Step 1: Generate X from a bounded distribution.
Step 2: Generate the latent group membership (i.e. complier, always-taker, or never-taker) from a multinomial distribution.
Step 3: Generate V ~ Bernoulli{P(V = 1∣X)} under a specified model for P(V = 1∣X), and determine D from V and the latent group membership.
Step 4: For compliers, generate the potential survival times T(0) and T(1) from the proportional hazards model (2.1), where the error term ϵ in the equivalent log-linear representation follows the extreme value distribution.
Step 5: For non-compliers, generate T00 or T01 given X, possibly from a non-Cox regression model, and let T00 = T10 and T01 = T11.
Step 6: Set T = Tvd according to the realized (V, D), and draw independent censoring times C ~ Exponential(0.5).
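The data-generating steps above can be sketched end-to-end. The Python fragment below is an illustrative implementation under assumed specifics (a logistic form for P(V = 1∣X), an even split of non-compliers into always- and never-takers, and Scenario 1-style parameters); it is not the exact generator used in the reported simulations, but it follows the same six steps, including the standard fact that a unit exponential scaled by exp(−βTZ) yields a Cox model with coefficient β.

```python
import math
import random

random.seed(1)

def simulate_one(beta_d=-0.5, beta_x=-0.2, p_complier=2/3):
    # Step 1: bounded covariate
    x = random.uniform(-1, 1)
    # Step 2: latent compliance class (no defiers, per assumption (A4));
    # non-compliers split evenly into always- and never-takers (an assumption)
    u = random.random()
    if u < p_complier:
        group = "complier"
    elif u < p_complier + (1 - p_complier) / 2:
        group = "always"
    else:
        group = "never"
    # Step 3: instrument and received treatment (assumed logistic P(V=1|X))
    p_v = 1.0 / (1.0 + math.exp(-0.2 * x))
    v = 1 if random.random() < p_v else 0
    d = v if group == "complier" else (1 if group == "always" else 0)
    # Steps 4-5: survival time; compliers follow the Cox model (unit
    # exponential scaled by exp(-(beta_d*d + beta_x*x))), non-compliers
    # follow a non-Cox log-normal model as in Scenario 1
    if group == "complier":
        t = random.expovariate(1.0) * math.exp(-(beta_d * d + beta_x * x))
    else:
        t = math.exp(-0.02 * x + random.gauss(0, 0.1))
    # Step 6: independent censoring
    c = random.expovariate(0.5)
    return min(t, c), int(t <= c), d, x, v

sample = [simulate_one() for _ in range(1000)]
```

Each returned tuple is one observation (W, δ, D, X, V) in the notation of Section 2.3, ready to be passed to the weighting and estimation steps.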
We consider two basic data generation scenarios with a single covariate X. In scenario 1, for compliers, survival times are generated with β0 = (βd, βx) = (−0.5, −0.2). Survival times for non-compliers in scenario 1 are generated according to T = exp(−0.02X + ϵ1) where ϵ1 ~ N(0, 0.01) (i.e. no treatment effect). In scenario 2, compliers’ survival times are generated with β0 = (−0.3, 0.05). Non-compliers’ survival times are also generated from a Cox proportional hazards regression model, where T = exp(0.5D − 0.05X + ϵ2) and ϵ2 follows the extreme value distribution.
For each scenario, we consider 8 cases with different combinations of rate of compliers, sample size, and covariate distribution. Specifically, in cases 1-4, X follows Uniform(−1, 1) distribution, and in cases 5-8, X follows Bernoulli(0.5) distribution. The sample size n = 1000 in cases 1, 2, 5, and 6, and n = 4000 in cases 3, 4, 7, and 8. The probability of compliers equals 1/3 in cases 1, 3, 5, and 7, and equals 2/3 in cases 2, 4, 6, and 8.
We compare several different methods of estimation: (1) the benchmark estimate based only on the compliers (whose identities are unknown in a real data analysis); (2) the naive estimate, which assumes the entire sample follows the same Cox model; (3) the proposed κ̂-weighted estimate; (4) the modified κ̂v-weighted estimate; and (5) the estimate based on the truncated modified weights κ̂v,tr. Hereafter, we refer to these methods as “Complier”, “Naive”, “κ”, “κv”, and “κv,tr”, respectively.
To estimate κ̂v and κ̂v,tr, we estimate v0(U) = P(V = 1∣W, X, D, δ) using the method described in Section 2.5, with a second-order logistic regression including the interaction between W and X fitted within each of the 4 strata defined by the censoring and treatment status. Unless otherwise noted, estimation using κ̂ and κ̂v follows the algorithms and caveats laid out in Sections 2.4 and 2.5, where β̂ is obtained by maximizing the objective function in (2.9). More specifically, we use the R function optim with the BFGS method option (R Core Team, 2017), considering three different starting values (the naive estimate and the naive estimate ±0.5), to solve the maximization problem. For the method κv,tr, we use the R function coxph to implement the proposed estimation as described in Section 2.5. For each method under comparison, we check whether the resulting estimate solves the proposed estimating equation within some tolerance (e.g. 0.05). We record a failure to converge if such an estimate cannot be produced.
The top row of Figure 1 shows the convergence rates for the three proposed estimators. In Scenario 1, the convergence rates of both the κ and κv methods are close to 100% across the 8 cases considered. In Scenario 2, the convergence rate varies considerably, but generally increases with n and the proportion of compliers P(D1 > D0). Anecdotal examination reveals that the objective and estimating function surfaces for this scenario can be highly irregular. In contrast, as the κ̂v,tr weights are always positive, the resultant surfaces are smooth and the corresponding convergence rates are always 100%. The second row of Figure 1 demonstrates the empirical bias by comparing the treatment and covariate parameter estimates to the truth. The naive parameter estimators generally demonstrate large empirical bias, while the proposed methods reduce the bias considerably.
Figure 1:
Simulation results: convergence rates, mean estimates, and empirical coverage probabilities of 95% confidence intervals: Complier (■); Naive (•); κ (o); κv (+); κv,tr (✳)
In Figure 2, we compare various standard error (SE) estimates to the empirical standard deviations (SD) of the proposed estimators. We denote the mean and median estimated SE based on the analytic variance estimation by Mean SE and Median SE respectively, and denote the mean and median estimated SE based on the bootstrapping variance estimation by Mean Bootstrap SE and Median Bootstrap SE respectively. The empirical standard deviation (SD) is denoted by Empirical. For the method κ, we evaluate both analytic variance estimation and bootstrapping based variance estimation. It is observed that both Mean Bootstrap SE and Median Bootstrap SE are rather close to the corresponding empirical SDs in both Scenarios 1 and 2. As for the analytic variance estimation, Median SEs are in good agreement with the empirical SDs, while in Scenario 2, many Mean SEs considerably depart from the empirical SDs. The latter phenomenon may reflect the unstable performance of the κ-weighted estimator in Scenario 2, which is consistent with the lower convergence rates of the method κ in Scenario 2. For the methods κv and κv,tr, we only examine the bootstrapping based variance estimation. Two extreme outliers are removed when calculating the mean bootstrap SE for the covariate coefficient estimate based on method κv in Case 5 of Scenario 1. We observe fairly small discrepancies among Mean Bootstrap SEs, Median Bootstrap SEs, and empirical SDs for both Scenarios 1 and 2, while the method κv,tr shows slightly better performance.
Figure 2:
Simulation results: the estimated standard errors and empirical standard deviations of the κ̂-, κ̂v-, and κ̂v,tr-weighted estimators: Empirical (□); Mean SE (+); Median SE (∇); Mean Bootstrap SE (×); Median Bootstrap SE (✳)
The bottom row of Figure 1 demonstrates the empirical coverage probabilities of 95% confidence intervals, constructed as β̂ ± 1.96 × SE, where SE stands for the bootstrap-based standard error. The coverage probabilities associated with the method κv,tr are fairly close to the nominal 95% level, dipping to 93% in a few cases. The methods κ and κv have similar and generally more conservative performance in terms of the empirical coverage probabilities. Note that the results presented for these two methods are based only on simulations which produce converged estimates. In Scenario 2, where the convergence rates of κ and κv can be considerably below 1, the results in Figure 1 may overstate the performance of these two methods.
Based on all the simulations, the method κv,tr shows the best performance among the different weighting methods, exhibiting good coverage probabilities, low bias, and reliable convergence.
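As a concrete illustration of the interval construction used in these simulations, the following pure-Python sketch computes a bootstrap SE from replicate estimates and the corresponding Wald-type 95% interval; the numbers are made up for illustration, not simulation output, and in practice each replicate refits both estimation stages on a resampled data set.

```python
import math
import random

def bootstrap_se(boot_reps):
    """Sample standard deviation of the bootstrap replicate estimates."""
    n = len(boot_reps)
    mean = sum(boot_reps) / n
    return math.sqrt(sum((b - mean) ** 2 for b in boot_reps) / (n - 1))

def wald_ci(beta_hat, boot_reps, z=1.959964):
    """95% interval of the form: point estimate +/- 1.96 * bootstrap SE."""
    se = bootstrap_se(boot_reps)
    return (beta_hat - z * se, beta_hat + z * se)

# Hypothetical point estimate and bootstrap replicates.
random.seed(1)
reps = [-0.43 + random.gauss(0, 0.1) for _ in range(500)]
lo, hi = wald_ci(-0.427, reps)
```

The coverage probabilities in Figure 1 are then simply the fraction of simulation runs in which such an interval contains the true parameter value.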
4. Colon Cancer Screening with Flexible Sigmoidoscopy
The Prostate, Lung, Colorectal, and Ovarian (PLCO) Cancer Screening Trial is a multicenter, two-armed randomized trial, sponsored by the National Cancer Institute, of screening tests for prostate, lung, colorectal, and ovarian cancers. Ten centers across the U.S. recruited approximately 155,000 participants between November 1993 and July 2001. Data were collected until December 31, 2009. One objective of the trial is to evaluate the effectiveness of screening with flexible sigmoidoscopy on mortality from colorectal cancer compared to usual care. Prorok et al. (2000) reported further details about this trial.
The original data consist of 154,897 individuals aged 55 to 74 years. They were randomly assigned to either the usual-care (control, N = 77,453) group or the screening with flexible sigmoidoscopy (intervention, N = 77,444) group. For the intervention group, subjects were offered the screening at baseline and again 3 or 5 years later. We discard the data from 187 participants who dropped out, died, were diagnosed with cancer, or had an organ removed before the first screening visit, as well as the data from 4 participants who had no follow-up after randomization. Thus, we only consider 154,706 individuals in our analyses.
Table 1 presents descriptive statistics for the baseline characteristics of the participants stratified by the screening assignment (i.e. V = 0, V = 1) and by the actual screening status (i.e. D = 0, D = 1). We consider risk factors including age (in years), gender, family history of any cancer, family history of colorectal cancer, colorectal polyps, and diabetes. We apply t-tests or chi-square tests to check the balance of these observed risk factors between the groups determined by the screening assignment or by the actual screening status. Based on the p-values reported in Table 1, there is strong evidence that this trial was well randomized, with small and nonsignificant associations between the screening assignment and the risk factors. However, most of these risk factors are unbalanced by the actual screening status. The summary statistics in Table 1 suggest that participants who were older, male, with a family history of any cancer or of colorectal cancer, or with diabetes were more likely to take the colon cancer screening when it was assigned. Thus, there is some evidence to suggest that the study participants’ post-randomization care selections and their potential survival outcomes are dependent. Hence, the traditional ITT or the “as-treated” analysis may be problematic for evaluating the causal effect of flexible sigmoidoscopy screening on colorectal cancer mortality.
Table 1:
Characteristics of the Study Participants
| Characteristics | Control (V = 0) N = 77449 | Intervention (V = 1) N = 77257 | p-value | Not Screened (D = 0) N = 90056 | Screened (D = 1) N = 64650 | p-value |
|---|---|---|---|---|---|---|
| | Number of Participants (%) | | | Number of Participants (%) | | |
| Age* | ||||||
| 62.60 (5.37) | 62.59 (5.39) | 0.8274 | 62.65 (5.39) | 62.52 (5.33) | <.0001 | |
| Age Level | ||||||
| 55-59 yr | 25838 (33.36) | 25789 (33.38) | 29902 (33.20) | 21725 (33.60) | ||
| 60-64 yr | 23767 (30.69) | 23736 (30.72) | 27451 (30.48) | 20052 (31.02) | ||
| 65-69 yr | 17473 (22.56) | 17402 (22.52) | 20352 (22.60) | 14523 (22.46) | ||
| 70-74 yr | 10371 (13.39) | 10330 (13.37) | 0.9967 | 12351 (13.71) | 8350 (12.92) | <.0001 |
| Sex | ||||||
| Male | 38340 (49.50) | 38229 (49.48) | 43529 (48.34) | 33040 (51.11) | ||
| Female | 39109 (50.50) | 39028 (50.52) | 0.9393 | 46527 (51.66) | 31610 (48.89) | <.0001 |
| Family History of Any Cancer | ||||||
| No | 32742 (42.28) | 33327 (43.14) | 37798 (41.97) | 28271 (43.73) | ||
| Yes | 41305 (53.33) | 41971 (54.33) | 0.8735§ | 47137 (52.34) | 36139 (55.90) | 0.0190§ |
| Unknown | 3402 (4.39) | 1959 (2.54) | <.0001 | 5121 (5.69) | 240 (0.37) | <.0001 |
| Family History of Colorectal Cancer | ||||||
| No | 64504 (83.29) | 65203 (84.40) | 73997 (82.17) | 55710 (86.17) | ||
| Yes † | 7320 (9.45) | 7627 (9.87) | 0.0809§ | 8331 (9.25) | 6616 (10.23) | 0.0022§ |
| Possibly ‡/Unknown | 5625 (7.26) | 4427 (5.73) | <.0001 | 7728 (8.58) | 2324 (3.59) | <.0001 |
| Colorectal Polyps | ||||||
| No | 68690 (88.69) | 69910 (90.49) | 78705 (87.40) | 59895 (92.65) | ||
| Yes | 4947 (6.39) | 5185 (6.71) | 0.1565§ | 5739 (6.37) | 4393 (6.80) | 0.7865§ |
| Unknown | 3812 (4.92) | 2162 (2.80) | <.0001 | 5612 (6.23) | 362 (0.56) | <.0001 |
| Diabetes | ||||||
| No | 68028 (87.84) | 69371 (89.79) | 77773 (86.36) | 59626 (92.23) | ||
| Yes | 5699 (7.36) | 5810 (7.52) | 0.9971§ | 6776 (7.52) | 4733 (7.32) | <.0001§ |
| Unknown | 3722 (4.81) | 2076 (2.69) | <.0001 | 5507 (6.12) | 291 (0.45) | <.0001 |
* denotes a continuous variable. Mean and standard deviation are reported.
† indicates colorectal cancer family history in an immediate family member.
‡ indicates colorectal cancer family history in relatives or unclear cancer type.
§ indicates p-value without considering the missing category.
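The balance checks reported in Table 1 rest on standard two-sample comparisons; as a minimal sketch, the Pearson chi-square statistic for an r × c contingency table can be computed as below (the counts are hypothetical, not the PLCO data).

```python
def chi_square_stat(table):
    """Pearson chi-square statistic for an r x c table of counts:
    sum over cells of (observed - expected)^2 / expected, where
    expected = row_total * col_total / grand_total under independence.
    Compare the result to a chi-square with (r-1)(c-1) df."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    grand = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_tot[i] * col_tot[j] / grand
            stat += (obs - exp) ** 2 / exp
    return stat

# Hypothetical 2x2 table: rows = risk factor absent/present,
# columns = not screened (D = 0) / screened (D = 1)
stat = chi_square_stat([[500, 480], [120, 160]])
```

With 1 degree of freedom, a statistic above 3.84 corresponds to a p-value below 0.05, the significance criterion used in Table 1.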
To address this issue, we employ the proposed IV methods, with the survival outcome of interest (T) defined as the time from trial entry (i.e. randomization) to death from colorectal cancer (in years), and with the screening assignment (V) chosen as the IV. In our dataset, 351 and 249 colorectal cancer deaths were observed in the control group (n = 77,098) and the intervention group (n = 77,098), respectively; 409 and 191 colorectal cancer deaths were observed in the group without screening (n = 89,647) and the group with screening (n = 64,459), respectively. In our analysis, deaths due to other causes are competing risks for death from colon cancer. As discussed in Section S3 of the Supplementary Materials, naively treating such competing events as censoring events leads to a valid IV proportional hazards analysis of the cause-specific hazard function for colon cancer death. Our instrumental variable is justified as follows: (i) the screening assignment is highly informative of the actual screening status (D) (i.e. screened vs. not screened); (ii) the screening assignment is random and hence is expected to be independent of unmeasured confounders (given the observed risk factors); (iii) it is reasonable to expect that the impact of the screening assignment on the survival outcome operates only through its influence on the actual screening status.
We first assess the unadjusted causal effect of the flexible sigmoidoscopy screening by fitting model (2.1) without X to the full dataset and by stratifying the analysis by each risk factor. For comparison, we also perform the “as-treated” counterparts (i.e. fitting a Cox model for T with D as the only covariate) and the ITT counterparts (i.e. fitting a Cox model for T with V as the only covariate) of these IV analyses. For the IV analyses, we implement the three methods κ, κv, and κv,tr in the same way as in our simulation studies (see Section 3), except that we use a simple logistic regression model stratified by (δ, D) to estimate v0(U) in (2.10). Table 2 reports the parameter estimates and the associated standard errors. For the IV methods, we present the bootstrap-based standard errors. Table 2 also reports the compliance rate in the intervention group (i.e. the proportion of screened participants in the intervention group), pc.
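The weighting scheme builds on the κ weight of Abadie (2003), cited in the concluding remarks. The sketch below illustrates the classical κ weight, not the paper's exact estimator, whose weight involves v0(U) as defined in (2.10); variable names are hypothetical.

```python
def kappa_weight(d, v, v0):
    """Abadie (2003) kappa weight for a subject with treatment received
    d (0/1), instrument v (0/1), and first-stage fitted v0 = P(V = 1 | U).
    (d, v) patterns compatible with compliance receive weight 1; patterns
    that can only arise under noncompliance receive negative weights,
    which remove the noncomplier contribution in expectation."""
    return 1.0 - d * (1 - v) / (1.0 - v0) - (1 - d) * v / v0

# Hypothetical subjects: (D, V, fitted v0 from a first-stage logistic model)
subjects = [(1, 1, 0.8), (0, 0, 0.8), (1, 0, 0.5), (0, 1, 0.5)]
weights = [kappa_weight(d, v, v0) for d, v, v0 in subjects]
# weights = [1.0, 1.0, -1.0, -1.0]
```

At the second stage, such estimated weights enter the Cox partial likelihood, for example via the `weights` argument of `coxph` in the R `survival` package; the truncated variant κv,tr additionally caps extreme weights, in a manner detailed in the paper's Section 2.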
Table 2:
Analyses for the Unadjusted Screening Effect Based on the Whole Data Set or Stratified by Each Risk Factor
| Data | N | pc | As-Treated | ITT | κ | κv | κv,tr |
|---|---|---|---|---|---|---|---|
| (Subgroup) | | | Parameter Estimates (Standard Errors) | | | | |
| Total | 154706 | 0.84 | −0.442* | −0.343* | −0.427* | −0.427* | −0.427* |
| (0.088) | (0.083) | (0.099) | (0.097) | (0.101) | |||
| Age Level | |||||||
| 55-59 yr | 51627 | 0.84 | −0.572* | −0.380* | −0.496* | −0.496* | −0.496* |
| (0.198) | (0.184) | (0.229) | (0.240) | (0.248) | |||
| 60-64 yr | 47503 | 0.84 | −0.313 | −0.130 | −0.169 | −0.169 | −0.169 |
| (0.160) | (0.153) | (0.201) | (0.198) | (0.193) | |||
| 65-69 yr | 34875 | 0.83 | −0.475* | −0.590* | −0.654* | −0.655* | −0.655* |
| (0.164) | (0.158) | (0.178) | (0.164) | (0.182) | |||
| 70-74 yr | 20701 | 0.81 | −0.420* | −0.264 | −0.351 | −0.350 | −0.350 |
| (0.188) | (0.176) | (0.228) | (0.213) | (0.218) | |||
| Sex | |||||||
| Male | 76569 | 0.86 | −0.549* | −0.445* | −0.536* | −0.536* | −0.536* |
| (0.115) | (0.109) | (0.124) | (0.123) | (0.123) | |||
| Female | 78137 | 0.81 | −0.319* | −0.200 | −0.262 | −0.262 | −0.262 |
| (0.135) | (0.128) | (0.156) | (0.172) | (0.166) | |||
| Family History of Any Cancer | |||||||
| Yes | 83276 | 0.86 | −0.237* | −0.258* | −0.294* | −0.294* | −0.294* |
| (0.114) | (0.111) | (0.120) | (0.124) | (0.127) | |||
| No | 66069 | 0.85 | −0.704* | −0.492* | −0.639* | −0.639* | −0.639* |
| (0.144) | (0.132) | (0.158) | (0.179) | (0.162) | |||
| Family History of Colorectal Cancer | |||||||
| Yes | 14947 | 0.87 | −0.010 | −0.097 | −0.105 | −0.106 | −0.106 |
| (0.241) | (0.239) | (0.251) | (0.271) | (0.254) | |||
| No | 129707 | 0.85 | −0.457* | −0.391* | −0.469* | −0.469* | −0.469* |
| (0.099) | (0.094) | (0.117) | (0.113) | (0.104) | |||
| Colorectal Polyps | |||||||
| Yes | 10132 | 0.85 | 0.315 | 0.288 | 0.335 | 0.336 | 0.336 |
| (0.305) | (0.309) | (0.388) | (0.405) | (0.389) | |||
| No | 138600 | 0.86 | −0.490* | −0.401* | −0.490* | −0.485* | −0.485* |
| (0.093) | (0.089) | (0.111) | (0.112) | (0.110) | |||
| Diabetes | |||||||
| Yes | 11509 | 0.81 | −1.036* | −0.335 | −0.606 | −0.603 | −0.603 |
| (0.311) | (0.253) | (0.451) | (0.454) | (0.438) | |||
| No | 137399 | 0.86 | −0.355* | −0.349* | −0.404* | −0.404* | −0.404* |
| (0.093) | (0.090) | (0.095) | (0.099) | (0.092) | |||
* indicates p-value ≤ 0.05
From Table 2, we observe that the estimates of the causal effect of screening are very similar among the three IV methods. The conclusions regarding the survival impact of screening are generally consistent across the IV, as-treated, and ITT analyses, except for the sub-cohort with baseline age between 70 and 74 years and the sub-cohort with diabetes. In these two cases, rather large, significant benefits of screening are suggested by the as-treated analyses but not by the ITT or IV analyses. Such discrepancies may be explained by the relatively high noncompliance rates (≈ 19%) observed in the intervention group. That is, study participants who refused the assigned screening are likely to be less health-conscious, which may be associated with worse potential survival outcomes. When the non-screened group includes a large proportion of such participants, the as-treated analyses would tend to over-estimate the benefit of screening as a result of ignoring the survival impact of the unmeasured confounding related to health-consciousness. Therefore, in these two cases, it is more plausible to conclude that the flexible sigmoidoscopy screening offers little survival benefit for the participants aged between 70 and 74 years and for the participants with diabetes. Overall, the unadjusted stratified analyses support the benefit of flexible sigmoidoscopy in reducing colorectal cancer mortality, with the greatest benefit in subpopulations with relatively low mortality risk, for example, the 55-59 years age group and subjects without a family history of colorectal cancer.
We next evaluate the causal effect of screening while accounting for other risk factors. Specifically, we fit model (2.1) with X capturing gender, family history of any cancer, family history of colorectal cancer, colorectal polyps, and diabetes, separately for the four age groups, 55-59 years, 60-64 years, 65-69 years, and 70-74 years. Table 3 provides the summary statistics (i.e. counts and percentages) of the risk factors by V and by D within each age group, along with the p-values from chi-square tests of the association of the risk factors with V or D. As in Table 1, within each age group, the risk factors show little association with the screening assignment V but may differ significantly between the participants who were screened and those who were not.
Table 3:
Characteristics of the Study Participants by Age Subgroups
| Age Level | 55-59 yr | | | | | | 60-64 yr | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Covariates | V = 0 | V = 1 | p-value | D = 0 | D = 1 | p-value | V = 0 | V = 1 | p-value | D = 0 | D = 1 | p-value |
| Gender | ||||||||||||
| Male | 11078 (46.8) | 11576 (47.6) | 12403 (46.1) | 10251 (48.7) | 10831 (49.5) | 11145 (50.0) | 12074 (48.4) | 9902 (51.4) | ||||
| Female | 12595 (53.2) | 12724 (52.4) | 0.0661 | 14530 (53.9) | 10789 (51.3) | <.0001* | 11070 (50.5) | 11164 (50.0) | 0.2946 | 12886 (51.6) | 9348 (48.6) | <.0001* |
| Family History of Any Cancer | ||||||||||||
| No | 11148 (47.1) | 11545 (47.5) | 12793 (47.5) | 9900 (47.1) | 9979 (45.6) | 10139 (45.4) | 11427 (45.8) | 8691 (45.1) | ||||
| Yes | 12525 (52.9) | 12755 (52.5) | 0.3633 | 14140 (52.5) | 11140 (52.9) | 0.3361 | 11922 (54.4) | 12170 (54.6) | 0.8138 | 13533 (54.2) | 10559 (54.9) | 0.1882 |
| Family History of Colorectal Cancer | ||||||||||||
| No | 21485 (90.8) | 22000 (90.5) | 24455 (90.8) | 19030 (90.4) | 19681 (89.9) | 19923 (89.3) | 22455 (90.0) | 17149 (89.1) | ||||
| Yes | 2188 (9.2) | 2300 (9.5) | 0.4118 | 2478 (9.2) | 2010 (9.6) | 0.1935 | 2220 (10.1) | 2386 (10.7) | 0.0565 | 2505 (10.0) | 2101 (10.9) | 0.0029* |
| Colorectal Polyps | ||||||||||||
| No | 22691 (95.9) | 23264 (95.7) | 25806 (95.8) | 20149 (95.8) | 20432 (93.3) | 20747 (93.0) | 23259 (93.2) | 17920 (93.1) | ||||
| Yes | 982 (4.1) | 1036 (4.3) | 0.5448 | 1127 (4.2) | 891 (4.2) | 0.8029 | 1469 (6.7) | 1562 (7.0) | 0.2282 | 1701 (6.8) | 1330 (6.9) | 0.7117 |
| Diabetes | ||||||||||||
| No | 22217 (93.8) | 22888 (94.2) | 25223 (93.7) | 19882 (94.5) | 20243 (92.4) | 20648 (92.6) | 23008 (92.2) | 17883 (92.9) | ||||
| Yes | 1456 (6.2) | 1412 (5.8) | 0.1211 | 1710 (6.3) | 1158 (5.5) | 0.0001* | 1658 (7.6) | 1661 (7.4) | 0.6308 | 1952 (7.8) | 1367 (7.1) | 0.0047* |
| Age Level | 65-69 yr | | | | | | 70-74 yr | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| V = 0 | V = 1 | p-value | D = 0 | D = 1 | p-value | V = 0 | V = 1 | p-value | D = 0 | D = 1 | p-value | |
| Gender | ||||||||||||
| Male | 8042 (50.1) | 8192 (50.4) | 8985 (48.7) | 7249 (52.3) | 4579 (48.1) | 4641 (48.5) | 5177 (46.3) | 4043 (51.0) | ||||
| Female | 8015 (49.9) | 8073 (49.6) | 0.6203 | 9483 (51.3) | 6605 (47.7) | <.0001* | 4949 (51.9) | 4925 (51.5) | 0.5368 | 5997 (53.7) | 3877 (49.0) | <.0001* |
| Family History of Any Cancer | ||||||||||||
| No | 7252 (45.2) | 7309 (44.9) | 8399 (45.5) | 6162 (44.5) | 4083 (42.9) | 4153 (43.4) | 4844 (43.4) | 3392 (42.8) | ||||
| Yes | 8805 (54.8) | 8956 (55.1) | 0.6898 | 10069 (54.5) | 7692 (55.5) | 0.0754 | 4083 (57.1) | 4153 (56.6) | 0.4421 | 4844 (56.6) | 3392 (57.2) | 0.4819 |
| Family History of Colorectal Cancer | ||||||||||||
| No | 14320 (89.2) | 14461 (88.9) | 16476 (89.2) | 12305 (88.8) | 4083 (88.4) | 4153 (88.5) | 4844 (88.6) | 3392 (88.2) | ||||
| Yes | 1737 (10.8) | 1804 (11.1) | 0.4415 | 1992 (10.8) | 1549 (11.2) | 0.2686 | 1104 (11.6) | 1102 (11.5) | 0.9029 | 1275 (11.4) | 931 (11.8) | 0.4771 |
| Colorectal Polyps | ||||||||||||
| No | 1737 (91.5) | 1804 (91.2) | 1992 (91.5) | 1549 (91.3) | 8561 (89.9) | 8582 (89.7) | 10036 (89.8) | 7107 (89.7) | ||||
| Yes | 1360 (8.5) | 1426 (8.8) | 0.3509 | 1576 (8.5) | 1210 (8.7) | 0.5387 | 967 (10.1) | 984 (10.3) | 0.7722 | 1138 (10.2) | 813 (10.3) | 0.8750 |
| Diabetes | ||||||||||||
| No | 14652 (91.2) | 14799 (91.0) | 16798 (91.0) | 12653 (91.3) | 967 (90.5) | 984 (89.5) | 1138 (90.1) | 813 (89.8) | ||||
| Yes | 1405 (8.8) | 1466 (9.0) | 0.4169 | 1670 (9.0) | 1201 (8.7) | 0.2506 | 907 (9.5) | 1007 (10.5) | 0.0218* | 1110 (9.9) | 804 (10.2) | 0.6390 |
N (Row Percentage, %)
* indicates p-value ≤ 0.05
Table 4 presents the parameter estimates and the associated standard errors based on the IV methods κ, κv, and κv,tr. The coefficient estimates from the as-treated analysis (i.e. a multivariable Cox model for T given D and X) and the ITT analysis (i.e. a multivariable Cox model for T given V and X) are also presented, along with the corresponding standard errors. From Table 4, we again observe quite good agreement among the three IV estimates. The IV analyses suggest that the flexible sigmoidoscopy screening has a significant protective effect on colorectal cancer mortality in the older age groups, 65-69 years and 70-74 years, but not in the younger age groups, 55-59 years and 60-64 years, after adjusting for gender, family history of any cancer, family history of colorectal cancer, colorectal polyps, and diabetes.
Table 4:
Results of Adjusted Models within Age Subgroups
| Age Level | Covariates | As-Treated | ITT | κ | κv | κv,tr |
|---|---|---|---|---|---|---|
| (pc) | | Point Estimates (Standard Errors) | | | | |
| 55-59 yr | Screening | −0.474* (0.207) | −0.296 (0.196) | −0.373 (0.228) | −0.373 (0.242) | −0.373 (0.246) |
| (0.84) | Female | −0.101 (0.195) | −0.089 (0.195) | −0.003 (0.244) | −0.013 (0.246) | −0.013 (0.232) |
| Family History of Any Cancer | 0.204 (0.208) | 0.201 (0.208) | 0.468 (0.272) | 0.463 (0.280) | 0.465 (0.280) | |
| Family History of Colorectal Cancer | 0.194 (0.313) | 0.192 (0.313) | −0.080 (0.471) | −0.071 (0.468) | −0.073 (0.386) | |
| Colorectal Polyps | 0.276 (0.422) | 0.277 (0.422) | 0.137 (0.736) | 0.135 (1.725) | 0.131 (1.768) | |
| Diabetes | 0.168 (0.392) | 0.179 (0.392) | 0.127 (0.606) | 0.125 (0.591) | 0.126 (0.710) | |
| 60-64 yr | Screening | −0.333* (0.169) | −0.184 (0.163) | −0.228 (0.197) | −0.229 (0.205) | −0.242 (0.181) |
| (0.84) | Female | −0.419* (0.167) | −0.409* (0.166) | −0.579* (0.214) | −0.585* (0.206) | −0.563* (0.189) |
| Family History of Any Cancer | −0.182 (0.176) | −0.183 (0.176) | −0.055 (0.234) | −0.054 (0.231) | −0.071 (0.225) | |
| Family History of Colorectal Cancer | 0.396 (0.260) | 0.391 (0.260) | 0.564* (0.279) | 0.566* (0.275) | 0.556* (0.276) | |
| Colorectal Polyps | −0.124 (0.329) | −0.121 (0.329) | −0.141 (0.429) | −0.147 (0.446) | −0.108 (0.351) | |
| Diabetes | 0.520* (0.258) | 0.526* (0.258) | 0.114 (0.458) | 0.117 (0.554) | 0.206 (0.369) | |
| 65-69 yr | Screening | −0.386* (0.168) | −0.526* (0.165) | −0.564* (0.187) | −0.568* (0.166) | −0.576* (0.188) |
| (0.83) | Female | −0.402* (0.164) | −0.388* (0.164) | −0.426* (0.194) | −0.435* (0.181) | −0.408* (0.182) |
| Family History of Any Cancer | −0.182 (0.176) | −0.185 (0.176) | −0.187 (0.190) | −0.190 (0.196) | −0.198 (0.186) | |
| Family History of Colorectal Cancer | 0.565* (0.129) | 0.563* (0.139) | 0.625* (0.260) | 0.642* (0.242) | 0.627* (0.253) | |
| Colorectal Polyps | −0.306 (0.314) | −0.299 (0.314) | −0.226 (0.331) | −0.243 (0.349) | −0.245 (0.328) | |
| Diabetes | 0.370 (0.251) | 0.377 (0.251) | 0.036 (0.419) | 0.045 (0.411) | 0.138 (0.313) | |
| 70-74 yr | Screening | −0.414* (0.196) | −0.364 (0.186) | −0.437 (0.225) | −0.439* (0.223) | −0.439* (0.223) |
| (0.81) | Female | −0.486* (0.189) | −0.467* (0.189) | −0.472* (0.228) | −0.484* (0.246) | −0.486* (0.228) |
| Family History of Any Cancer | 0.157 (0.195) | 0.152 (0.195) | 0.387 (0.250) | 0.389 (0.268) | 0.388 (0.233) | |
| Family History of Colorectal Cancer | −0.219 (0.318) | −0.222 (0.318) | −0.344 (0.508) | −0.333 (0.405) | −0.342 (0.384) | |
| Colorectal Polyps | 0.088 (0.286) | 0.093 (0.286) | 0.137 (0.387) | 0.124 (0.396) | 0.121 (0.349) | |
| Diabetes | 0.444 (0.270) | 0.451 (0.270) | −0.027 (0.583) | −0.031 (0.589) | −0.037 (0.409) | |
* indicates p-value ≤ 0.05
This finding is generally consistent with that based on the ITT analyses, but moderately disagrees with the results from the as-treated analyses, particularly in the age groups 55-59 years and 60-64 years. The similarity between the ITT analyses and the proposed analyses might be due to the dilution effect commonly seen in screening trials, as discussed in Baker et al. (2002). To understand the discrepancies with the as-treated analyses, we note a more marked imbalance of risk factors by the actual screening status in the two younger age groups, 55-59 years and 60-64 years, compared to that in the older age groups. For example, in the age group 60-64 years, participants who were female, had diabetes, or had no family history of colorectal cancer were significantly less likely to comply with the assigned screening than those who were male, had no diabetes, or had a family history of colorectal cancer. Such associations may bias the estimation of the causal treatment effect by the as-treated analyses, which may explain the discrepancies observed in Table 4 between the as-treated analyses and the IV analyses. In addition, the IV analyses provide strong evidence for the lower colorectal cancer mortality risk in females (versus males) in all age groups beyond the age of 60 years. They also suggest some survival disadvantage (regarding colorectal cancer mortality) associated with the presence of a family history of colorectal cancer.
5. Concluding Remarks
The use of instrumental variables in survival settings with binary treatments has been severely limited by complexities arising from nonlinear model specifications, such as the proportional hazards model. The application of the simple two-stage estimation procedures developed for linear models is challenging and only valid in special cases. Alternative procedures may entail strong modelling assumptions on strata other than that of interest, tend to be complex, both computationally and inferentially, and are not readily implemented in standard software. Our approach, based on a special characterization of instrumental variables, enables a simple two-stage procedure analogous to propensity score weighting. At the first stage, a binary regression model is fit to the instrumental variable; at the second stage, the fitted regression model from the first stage is used to construct a weight that “debiases” the naive estimating equation for the proportional hazards model. Previous work on this approach (Abadie et al., 2002; Abadie, 2003) has only considered iid estimating equations, with limited attention to the practical computational issues. The current paper rigorously demonstrates its validity for the partial likelihood score function. Moreover, the proposed estimators can be easily computed using existing software for the proportional hazards model, with bootstrap variance estimation correctly accounting for the first-stage estimation of the weights, and the approach is generally applicable to instrumental variable estimation of the proportional hazards model in complex survival settings, for example, in the presence of left truncation, competing risks, and recurrent events.
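Schematically, and up to the paper's exact weight definition, the second-stage estimator solves a weighted version of the usual partial-likelihood score. With hypothetical notation — Z_i the regressors (treatment and covariates), N_i(t) the counting process for the observed event, Y_i(t) the at-risk indicator, and κ̂_i the estimated first-stage weight — a sketch of the estimating equation is:

```latex
U(\beta) \;=\; \sum_{i=1}^{n} \hat{\kappa}_i \int_0^\tau
\left\{ Z_i \;-\;
\frac{\sum_{j=1}^{n} \hat{\kappa}_j \, Y_j(t) \, Z_j \, e^{\beta^\top Z_j}}
     {\sum_{j=1}^{n} \hat{\kappa}_j \, Y_j(t) \, e^{\beta^\top Z_j}}
\right\} \mathrm{d}N_i(t) \;=\; 0.
```

This is precisely the score solved by standard Cox software when the κ̂_i are supplied as case weights, which is what makes the implementation with existing software straightforward.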
Acknowledgements
The first two authors have equal contributions to this work. The authors would like to express special thanks to Jerome Mabie, Tom Riley, Ryan Nobel and Josh Rathmell, Information Management Services (IMS) Inc, for supporting and managing the PLCO data. The authors also thank Dr. Stuart G. Baker, National Cancer Institute, for kindly introducing the IMS team for this research. The authors gratefully acknowledge the support from the National Institutes of Health grant R01 HL113548.
Footnotes
Supplementary Materials
Supplementary Materials, which include theoretical justifications and proofs, discussions of generalizations to complex survival settings, and additional figures and tables, are available online.
References
- Abadie A (2003). Semiparametric instrumental variable estimation of treatment response models. Journal of Econometrics 113, 231–263.
- Abadie A, Angrist J, and Imbens G (2002). Instrumental variables estimates of the effect of subsidized training on the quantiles of trainee earnings. Econometrica 70, 91–117.
- Agresti A (2013). Categorical Data Analysis. Wiley Series in Probability and Statistics. Wiley.
- Andersen PK and Gill RD (1982). Cox’s regression model for counting processes: A large sample study. The Annals of Statistics 10, 1100–1120.
- Angrist J and Imbens G (1995). Identification and estimation of local average treatment effects.
- Angrist JD, Imbens GW, and Rubin DB (1996). Identification of causal effects using instrumental variables. Journal of the American Statistical Association 91, 444–455.
- Baiocchi M, Cheng J, and Small DS (2014). Instrumental variable methods for causal inference. Statistics in Medicine 33, 2297–2340.
- Baker SG (1998). Analysis of survival data from a randomized trial with all-or-none compliance: estimating the cost-effectiveness of a cancer screening program. Journal of the American Statistical Association 93, 929–934.
- Baker SG, Kramer BS, and Prorok PC (2002). Statistical issues in randomized trials of cancer screening. BMC Medical Research Methodology 2, 11.
- Baker SG and Lindeman KS (1994). The paired availability design: a proposal for evaluating epidural analgesia during labor. Statistics in Medicine 13, 2269–2278.
- Cuzick J, Sasieni P, Myles J, and Tyrer J (2007). Estimating the effect of treatment in a proportional hazards model in the presence of non-compliance and contamination. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 69, 565–588.
- Gourieroux C and Monfort A (1981). Asymptotic properties of the maximum likelihood estimator in dichotomous logit models. Journal of Econometrics 17, 83–97.
- Holland PW (1986). Statistics and causal inference. Journal of the American Statistical Association 81, 945–960.
- Imbens G and Angrist J (1994). Identification and estimation of local average treatment effects. Econometrica 62, 467–476.
- Joffe MM (2001). Administrative and artificial censoring in censored regression models. Statistics in Medicine 20, 2287–2304.
- Li G and Lu X (2015). A Bayesian approach for instrumental variable analysis with censored time-to-event outcome. Statistics in Medicine 34, 664–684.
- Li J, Fine J, and Brookhart A (2015). Instrumental variable additive hazards models. Biometrics 71, 122–130.
- Li S and Gray RJ (2016). Estimating treatment effect in a proportional hazards model in randomized clinical trials with all-or-nothing compliance. Biometrics 3, 742–750.
- Loeys T and Goetghebeur E (2003). A causal proportional hazards estimator for the effect of treatment actually received in a randomized trial with all-or-nothing compliance. Biometrics 59, 100–105.
- MacKenzie TA, Løberg M, and O’Malley AJ (2016). Patient centered hazard ratio estimation using principal stratification weights: application to the NORCCAP randomized trial of colorectal cancer screening. Observational Studies 2, 29.
- Martinussen T, Nørbo Sørensen D, and Vansteelandt S (2017). Instrumental variables estimation under a structural Cox model. Biostatistics.
- Martinussen T, Vansteelandt S, Tchetgen E, and Zucker DM (2017). Instrumental variables estimation of exposure effects on a time-to-event response using structural cumulative survival models. Biometrics 73, 1140–1149.
- Nie H, Cheng J, and Small DS (2011). Inference for the effect of treatment on survival probability in randomized trials with noncompliance and administrative censoring. Biometrics 67, 1397–1405.
- Prorok PC, Andriole GL, Bresalier RS, Buys SS, Chia D, Crawford ED, et al. (2000). Design of the Prostate, Lung, Colorectal and Ovarian (PLCO) Cancer Screening Trial. Controlled Clinical Trials 21, 273S–309S.
- R Core Team (2017). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria.
- Robins JM and Tsiatis AA (1991). Correcting for non-compliance in randomized trials using rank preserving structural failure time models. Communications in Statistics - Theory and Methods 20, 2609–2631.
- Rousseeuw PJ and Croux C (1993). Alternatives to the median absolute deviation. Journal of the American Statistical Association 88, 1273–1283.
- Tchetgen EJT, Walter S, Vansteelandt S, Martinussen T, and Glymour M (2015). Instrumental variable estimation in a survival context. Epidemiology 26, 402.
- Therneau TM (2015). A Package for Survival Analysis in S. version 2.38.
- Wang L, Tchetgen ET, Martinussen T, and Vansteelandt S (2018). Learning causal hazard ratio with endogeneity. arXiv preprint arXiv:1807.05313.
- Yu W, Chen K, Sobel ME, and Ying Z (2015). Semiparametric transformation models for causal inference in time-to-event studies with all-or-nothing compliance. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 77, 397–415.
- Zeng D and Lin DY (2007). Maximum likelihood estimation in semiparametric regression models with censored data (with discussion). Journal of the Royal Statistical Society, Series B 69, 507–564.