Abstract
Two stage instrumental variable methods are commonly used to estimate the causal effects of treatments on survival in the presence of measured and unmeasured confounding. Two stage residual inclusion (2SRI) has been the method of choice over two stage predictor substitution (2SPS) in clinical studies. We directly compare the bias in the causal hazard ratio estimated by these two methods. Under a principal stratification framework, we derive a closed form solution for the asymptotic bias of the causal hazard ratio among compliers for both the 2SPS and 2SRI methods when survival time follows the Weibull distribution with random censoring. When there is no unmeasured confounding and no always takers, our analytic results show that 2SRI is generally asymptotically unbiased but 2SPS is not. However, when there is substantial unmeasured confounding, 2SPS performs better than 2SRI with respect to bias under certain scenarios. We use extensive simulation studies to confirm the analytic results from our closed-form solutions. We apply these two methods to prostate cancer treatment data from SEER-Medicare and compare the 2SRI and 2SPS estimates to results from two published randomized trials.
Keywords: instrumental variable, two-stage residual inclusion, two-stage predictor substitution, unmeasured confounding, survival, bias
1. Introduction
Evaluating the effectiveness of treatment and identifying the causal relationship between exposure and disease are critical objectives for clinical and health services researchers. Confounding is often a concern when analyzing nonrandomized observational studies and even randomized studies with non-compliance [1]. Instrumental variable (IV) methods are increasingly being used in clinical comparative effectiveness studies to potentially control for both measured and unmeasured confounding. Angrist et al.[2] defined the IV for causal effects of treatment on outcome to be a variable satisfying the following five assumptions: i) the potential outcomes of one subject are unrelated to the treatment assignment of the other subjects; ii) the IV is randomly (or ignorably) assigned; iii) any effect of the IV on the outcome must be mediated by treatment received (the exclusion restriction); iv) the IV has a nonzero effect on treatment received; v) there are no defiers (for details, see Section 2).
In a recent clinical study, we were interested in comparing the effectiveness of two treatments for prostate cancer in elderly men using SEER-Medicare, a large national observational database. Specifically, we planned to use IV methods to estimate the effect of the addition of external beam radiation therapy (EBRT) to androgen suppression therapy (ADT) in improving overall survival in men with locally advanced prostate cancer. We considered a commonly used IV in health services research: local area treatment patterns defined by the percentage of active treatment in hospital referral regions (HRR). This IV has been shown to capture regionally distinct structural variation in care [3]. Such variation is not fully explained by patient characteristics. Further, this IV varies across HRRs and is strongly associated with treatment assignment. Finally, it is balanced across important observed prognostic factors. Although there is an extensive literature on the importance of choosing an appropriate instrument, less attention has been paid to using the appropriate modeling approach once an IV is selected.
Recently, there has been rapid uptake and widespread use of two IV based analytic approaches called two-stage residual inclusion (2SRI) and two-stage predictor substitution (2SPS)[4, 5]. These methods have been used to correct for bias due to endogeneity in non-linear models for both binary and time-to-event outcomes. Of these two IV approaches, 2SRI was shown to consistently estimate a conditional causal parameter under certain assumptions [4] and has been adopted as the method of choice in clinical research studies involving survival outcomes[6, 7, 8]. The conditional causal parameter that Terza et al.[4] consider is only identified by making homogeneity assumptions that go beyond the five assumptions for a valid IV defined in the first paragraph. Angrist et al. [2] showed that under these five assumptions for a valid IV, the only treatment effect that is identified is the average treatment effect for the compliers, where the compliers are the subjects who would take the treatment if encouraged to do so by the IV but would not take the treatment if not encouraged by the IV; this is called the local average treatment effect (LATE). In the context of a binary outcome, Cai et al.[5] demonstrated that both the 2SRI and 2SPS methods generated biased estimates of the LATE among compliers. In this paper, we focus on the properties of 2SPS and 2SRI as estimators of the LATE for time-to-event data.
Despite growing interest in applying two stage IV methods to time-to-event data, little is known about the potential bias of using such methods to estimate the LATE among compliers. We derive closed form expressions of the bias and conduct extensive simulations to quantify this bias. We then apply both two-stage IV methods to our prostate cancer treatment data and compare them to the results from two published randomized clinical trials [9, 10].
2. Notation, Assumptions, Compliance Categories, and Model
2.1. Notation
Following the notation of Cai et al.[5] and Nie et al.[11], an N-dimensional vector of binary IV is represented by Ṟ. An IV value of 1 represents encouragement to receive the active treatment and 0 represents no encouragement to receive the active treatment. In an RCT setting, where the IV is the randomized assignment, an IV value of 1 represents random assignment to treatment and 0 represents random assignment to control; in the prostate cancer observational study described in the introduction, an IV value of 1 represents a high local area rate (above the median) of adding EBRT to ADT and 0 represents a low local area rate (below the median) of adding EBRT to ADT. The ith element Ri = 1 implies that subject i is encouraged to receive the active treatment, whereas Ri = 0 indicates that subject i is not encouraged to receive the active treatment. Let ẔṞ be an N-dimensional vector of potential treatment received given Ṟ; its ith element $Z_i^{\underline{R}} = 1$ indicates that subject i receives the active treatment and $Z_i^{\underline{R}} = 0$ means that subject i receives the control under Ṟ.
Similarly, we define ṮṞ,Ẕ to be an N-dimensional vector of potential survival times under Ṟ and Ẕ, whose ith element is the potential survival time for subject i under Ṟ and Ẕ. Let ḺṞ,Ẕ be an N-dimensional vector of potential censoring times under Ṟ and Ẕ, whose ith element is the potential censoring time for subject i under Ṟ and Ẕ.
We define Y̱Ṟ,Ẕ = min{ṮṞ,Ẕ, ḺṞ,Ẕ}, the elementwise minimum of potential censoring and survival times, to be an N-dimensional vector of potential observed follow up times under Ṟ and Ẕ, whose ith element represents the potential follow up time for subject i under Ṟ and Ẕ. A failure indicator for subject i records whether the subject is observed to terminate by failure (value 1) or by censoring (value 0) given Ṟ and Ẕ. The vector X̱i represents measured confounding variables for subject i.
2.2. Assumptions
The main assumptions we will make for causal modeling are the five assumptions made by Angrist et al. [2], and a random censoring assumption for the survival setting.
- Stable Unit Treatment Value Assumption (SUTVA)[12, 13]: (a) if $R_i = R_i'$, then $Z_i^{\underline{R}} = Z_i^{\underline{R}'}$; (b) if $R_i = R_i'$ and $Z_i = Z_i'$, then $T_i^{\underline{R},\underline{Z}} = T_i^{\underline{R}',\underline{Z}'}$, $L_i^{\underline{R},\underline{Z}} = L_i^{\underline{R}',\underline{Z}'}$, and $Y_i^{\underline{R},\underline{Z}} = Y_i^{\underline{R}',\underline{Z}'}$. The SUTVA assumption says that the potential outcomes for subject i are not related to the treatment status of other subjects, so that we can write $Z_i^{\underline{R}}$, $T_i^{\underline{R},\underline{Z}}$, $L_i^{\underline{R},\underline{Z}}$, $Y_i^{\underline{R},\underline{Z}}$ as $Z_i^{R_i}$, $T_i^{R_i,Z_i}$, $L_i^{R_i,Z_i}$, $Y_i^{R_i,Z_i}$ respectively. The SUTVA assumption also implies the assumption of consistency, such that the value of the potential outcome given a treatment remains unchanged no matter what the treatment assignment mechanism is [12].
- Independence of the instrument Ṟ [14]: Conditional on a vector of confounders X̱, the random vector (Y̱Ṟ,Ẕ, ṮṞ,Ẕ, ḺṞ,Ẕ, ẔṞ) is independent of Ṟ. In a randomized trial where R is the IV, the independence assumption holds without conditioning on X̱.
- Exclusion Restriction: ∀Ẕ, Ṟ, and Ṟ′, we have ṮṞ,Ẕ = ṮṞ′,Ẕ, ḺṞ,Ẕ = ḺṞ′,Ẕ, Y̱Ṟ,Ẕ = Y̱Ṟ′,Ẕ. This assumption implies that any effect of the IV on potential outcomes must be through its effect on treatment actually received. Thus, we can write $T_i^{R_i,Z_i}$, $L_i^{R_i,Z_i}$, $Y_i^{R_i,Z_i}$ as $T_i^{Z_i}$, $L_i^{Z_i}$, $Y_i^{Z_i}$ by combining the exclusion restriction and SUTVA assumptions.
- Non-zero Average Causal Effect of Ṟ on Ẕ: This assumption means the IV is correlated with treatment received.
- Monotonicity [15]: This assumption rules out the existence of defiers. No subject always does the opposite of the treatment assigned.
- Independent censoring: The distribution of potential survival time ṮṞ,Ẕ is independent of the distribution of potential censoring time ḺṞ,Ẕ.
2.3. Compliance Categories
Under the framework of principal stratification and potential outcomes [2, 16], subjects in a two-arm randomized trial can be categorized into 4 principal strata: Always takers (AT) are subjects who always take the treatment regardless of assignments (Z1 = 1, Z0 = 1); Compliers (C) are subjects who comply with their assignments (Z1 = 1, Z0 = 0); Never takers (NT) are the subjects who never take the treatment no matter which group they are assigned to (Z1 = 0, Z0 = 0); Defiers (D) are the subjects who take the treatment opposite of their assignments (Z1 = 0, Z0 = 1).
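The four principal strata can be written out as a small lookup. The following is a minimal sketch for illustration only; the function names `principal_stratum` and `received_treatment` are hypothetical and do not come from the paper.

```python
def principal_stratum(z1: int, z0: int) -> str:
    """Map the pair of potential treatments (Z1 under R=1, Z0 under R=0) to a stratum."""
    strata = {
        (1, 1): "AT",  # always taker: treated under either assignment
        (1, 0): "C",   # complier: follows the assignment
        (0, 0): "NT",  # never taker: untreated under either assignment
        (0, 1): "D",   # defier: ruled out by the monotonicity assumption
    }
    return strata[(z1, z0)]


def received_treatment(stratum: str, r: int) -> int:
    """Treatment actually received, Z, for a subject in `stratum` with IV value `r`."""
    potential = {"AT": (1, 1), "C": (1, 0), "NT": (0, 0), "D": (0, 1)}
    z1, z0 = potential[stratum]
    return z1 if r == 1 else z0


print(principal_stratum(1, 0))      # C
print(received_treatment("AT", 0))  # 1
```

Under this encoding, an always taker assigned to control still receives treatment, which is why the (Z = 1, R = 0) cell is empty only when there are no always takers.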
2.4. Model
We first define the probability of being encouraged to receive the treatment (an IV value of 1) Pr(R = 1) = r (written ρr in Section 5), the probability of being an always taker Pr(AT) = ρa, and the probability of being a complier Pr(C) = ρc. We also define the probability of being a defier Pr(D) = ρd, but under the monotonicity assumption, there are no defiers so that ρd = 0. Hence, the probability of being a never taker Pr(NT) is equal to 1 − ρa − ρc.
We assume both potential censoring time and potential survival time follow the Weibull distribution with the same shape parameter α. The potential censoring time for the subjects in each principal strata follows Weibull(α, λ), and we define the parameters of the probability distribution of potential survival time for each principal strata as follows:
We also examined scenarios in which different shape parameters α are assumed for the potential censoring time and the potential survival time. These details are given in Appendix E. The density of the Weibull distribution is $f(t) = (\alpha/K)(t/K)^{\alpha-1}\exp(-(t/K)^{\alpha})$ and the hazard rate is $h(t) = \alpha K^{-\alpha} t^{\alpha-1}$. In the case of Weibull regression with covariates X, $K^{-\alpha}$ can be reparameterized as exp(βX). The hazard rate for the compliers if treated is . The hazard rate for the compliers if not treated is . Hence, the log causal hazard ratio ϕ for the compliers is the difference between two log hazard rates:
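As a numerical illustration of the Weibull hazard and of a log hazard ratio formed as a difference of log hazards, the sketch below uses generic placeholder names. `K_treated` and `K_control` are hypothetical stand-ins for the complier scale parameters, and the numeric values are arbitrary; neither is taken from the paper.

```python
import math


def weibull_density(t, alpha, K):
    """f(t) = (alpha/K) (t/K)^(alpha-1) exp(-(t/K)^alpha)."""
    return (alpha / K) * (t / K) ** (alpha - 1) * math.exp(-((t / K) ** alpha))


def weibull_survival(t, alpha, K):
    """S(t) = exp(-(t/K)^alpha)."""
    return math.exp(-((t / K) ** alpha))


def weibull_hazard(t, alpha, K):
    """h(t) = alpha K^(-alpha) t^(alpha-1), which equals f(t) / S(t)."""
    return alpha * K ** (-alpha) * t ** (alpha - 1)


def log_hazard_ratio(alpha, K_treated, K_control):
    """With a common shape, t^(alpha-1) cancels and the log ratio is constant in t."""
    return -alpha * (math.log(K_treated) - math.log(K_control))


alpha, K_treated, K_control = 1.5, 2.0, 1.0  # arbitrary illustrative values
phi = log_hazard_ratio(alpha, K_treated, K_control)
for t in (0.5, 1.0, 3.0):
    ratio = weibull_hazard(t, alpha, K_treated) / weibull_hazard(t, alpha, K_control)
    print(t, round(math.log(ratio), 6), round(phi, 6))
```

Because both hazards share the shape parameter, the printed log ratio is the same at every time point, which is what makes a single log hazard ratio a meaningful causal parameter here.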
3. Two Stage Predictor Substitution (2SPS) Method
The 2SPS method is frequently used and simple to implement [4]. In the first stage, the treatment received Z is regressed on the IV-treatment assignment R, and let P = E (Z∣R). In the second stage, a log linear model including P, defined as:
is fitted to estimate the coefficient ξ. This is the 2SPS estimator of the log causal hazard ratio. We first derive a closed form expression for the probability limit of the maximum likelihood estimator (M.L.E) of ξ, then take the difference between this probability limit and the true log causal parameter ϕ to obtain the asymptotic bias of the 2SPS estimator as an estimator of the log causal hazard ratio for compliers.
3.1. Probability limit of M.L.E of causal parameter
Let P̂ denote the predicted value from the estimated binary regression model. i.e., P̂ = Ê(Z∣R). When P̂ is substituted for P, the second stage Weibull model becomes:
Let ξ̂* and ξ̂ denote the estimators (M.L.E) of ξ* and ξ respectively. As sample size n → ∞, P̂ → P, , and . Therefore, . To derive a closed form expression for the asymptotic bias, we need to re-express ξ in terms of parameters specified in Section 2 under the principal stratification framework.
Only always takers receive the treatment when assigned to control (R = 0). Both always takers and compliers take the treatment when assigned to treatment (R = 1). Thus, it can be shown that [5]: $p_0 = E(Z \mid R = 0) = \rho_a$ and $p_1 = E(Z \mid R = 1) = \rho_a + \rho_c$.
Since P = {p0, p1} is a one-to-one transformation of R = {0, 1}, we have the following for the second stage Weibull regression:
(1) |
and,
(2) |
Instead of working with a second stage model involving P, we can work with a model involving R instead. Solving (1) and (2), we have:
(3) |
The log linear model including R assumes two underlying Weibull distributions of the same shape parameter α*, Weibull(α*, K0) and Weibull(α*, K1), for subjects assigned to control (R = 0) and treatment (R = 1) respectively. Thus, (3) can be expressed as:
(4) |
It is worth noting that both follow up times of subjects assigned to control, denoted as Y∣R = 0, and follow up times of subjects assigned to treatment, denoted as Y∣R = 1, actually follow mixture distributions consisting of three different Weibull distributions. Details are given in Appendix A. However, the second stage Weibull model of 2SPS method imposes the two Weibull distributions, with the same shape parameter α* but different scale parameters K0, K1, upon subjects assigned to treatment (R = 1) or assigned to control (R = 0) respectively. Thus, the M.L.E of α*, K0, K1 are derived by maximizing the likelihood function Ln (α*, K0, K1) that consists of products of two Weibull densities: Weibull(α*, K0) and Weibull(α*, K1).
Let α̂* denote the M.L.E of α*. We set , the expectation of the score equation derived from the profile likelihood of α*, equal to 0 and let be the solution. Under the assumptions stated in Section 2 and consistency of the M.L.E, the probability limit of the estimator α̂* is . Details are given in Appendix C. Once the parameters of the principal strata are defined, can be solved numerically using a root-finding algorithm such as the “bisection” method. Let K̂0, K̂1 be the M.L.Es of the two scale parameters K0, K1 respectively. After the value of is determined, the probability limits of the estimators K̂0, K̂1 can be derived as follows:
(5) |
and,
(6) |
The detailed steps of the derivation of (5) and (6) are given in Appendix C. By substituting (5) and (6) into (4), we derive the expression of log causal hazard ratio ξ as the following:
(7) |
Thus, (7) is the closed-form expression of the probability limit of the log causal hazard ratio estimator ξ̂* from the 2SPS Weibull model.
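The root-finding step can be illustrated with a finite-sample analogue. The sketch below is not the Appendix C derivation, which works with the expectation of the score under the mixture distributions. Instead it solves the sample profile score for a shape parameter shared by two right-censored Weibull groups by bisection, then recovers each group's scale estimate. All function names and numeric settings are assumptions made for illustration.

```python
import math
import random


def profile_score(alpha, groups):
    """Profile score for a shape shared across groups of (follow-up time, failure flag)."""
    n_events = sum(d for g in groups for _, d in g)
    score = n_events / alpha
    for g in groups:
        d_g = sum(d for _, d in g)
        s0 = sum(y ** alpha for y, _ in g)
        s1 = sum((y ** alpha) * math.log(y) for y, _ in g)
        score += sum(d * math.log(y) for y, d in g) - d_g * s1 / s0
    return score


def bisect_root(f, lo, hi, tol=1e-8, max_iter=200):
    """Plain bisection; requires f(lo) and f(hi) to have opposite signs."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("root is not bracketed")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def scale_mle(alpha, group):
    """Scale estimate given the shape: K = (sum of y^alpha / number of events)^(1/alpha)."""
    d_g = sum(d for _, d in group)
    return (sum(y ** alpha for y, _ in group) / d_g) ** (1.0 / alpha)


def simulate_group(rng, n, shape, scale, censor_scale):
    """Right-censored Weibull sample; random.weibullvariate takes (scale, shape)."""
    data = []
    for _ in range(n):
        t = rng.weibullvariate(scale, shape)
        c = rng.weibullvariate(censor_scale, shape)
        data.append((min(t, c), 1 if t <= c else 0))
    return data


rng = random.Random(2024)
groups = [simulate_group(rng, 2000, 1.5, 1.0, 3.0),
          simulate_group(rng, 2000, 1.5, 2.0, 3.0)]
alpha_hat = bisect_root(lambda a: profile_score(a, groups), 0.05, 20.0)
k_hat = [scale_mle(alpha_hat, g) for g in groups]
print(round(alpha_hat, 3), [round(k, 3) for k in k_hat])
```

Bisection is a natural choice here because the profile score is strictly decreasing in the shape parameter, so a bracketing interval contains exactly one root.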
3.2. Bias analysis
The asymptotic bias of the causal parameter ξ of the 2SPS Weibull regression model is simply the difference between the true log causal hazard ratio ϕ and the derived closed form expression of ξ, such that
(8) |
We can re-parameterize in (8) with one additional parameter as the following:
(9) |
Δ in (9) is the log hazard ratio between never takers and compliers given no treatment. It can be interpreted as the magnitude of the unmeasured confounding because the differences between principal strata are attributable to the unmeasured confounding [5]. When Δ = 0 or , there is no unmeasured confounding.
We make the following observations about the bias of the 2SPS method from (8): 1) When α = 1 and we treat α* as a known parameter and fix it at 1, the survival outcomes of all principal strata follow exponential distributions and we fit an exponential model in the second stage, rather than estimating the shape parameter of a more general Weibull distribution; 2) When ρc = 1, every subject is a complier and (8) can be simplified as . Then we have . Setting ρc = 1, ρa = 0, and ρn = 0, (8) becomes 0 so that the bias B2sps = 0 when a randomized controlled trial has perfect compliance; 3) When there is no causal effect ( ), all terms in (8) cancel out and we have B2sps = 0; 4) When ρa = 0 and , there is no confounding because there are no always takers and never takers cannot get treatment, so that confounding can only be attributable to the difference between never takers and compliers given no treatment[5]. However, (8) cannot be reduced to 0 under this setting, so the bias of the 2SPS method B2sps is generally not 0 even when there is no confounding; 5) λ, the scale parameter of the censoring distribution, is involved in bias equation (9), which coincides with the results in Struthers and Kalbfleisch[17].
We can analyze how parameters influence the relationship between the magnitude of confounding and bias using the derived closed form expression (9). For the purpose of demonstration only, we create four scenarios in which there are no always takers. The results are shown in Figure 1 (a)-(d).
In Figure 1, we can clearly see that the bias of the 2SPS method is not 0 when there is no confounding. The bias increases with the shape parameter α of the survival function (within each principal stratum). The bias is smallest when we have a decreasing hazard rate (α < 1) and largest when we have an increasing hazard rate (α > 1). By comparing Figure 1 (a) and (b), we also observe that the bias decreases as the compliance rate increases from 0.5 to 0.8. When the scale parameter (θc) is smaller, the bias is also smaller (Figure 1 (a) vs. (c)). Although the probability of being randomly assigned to the treatment group is involved in computing the shape parameter of the second stage Weibull regression model, its effects on the bias are very small (compare Figure 1 (b) to (d)).
4. Two Stage Residual Inclusion (2SRI) Method
Similar to the 2SPS method, the 2SRI method involves two stage modeling [4]. In the first stage, we regress the treatment received Z on the IV-treatment assignment R and calculate the residual term E = Z − E (Z∣R). In the second stage, we fit a log linear model on both treatment received variable Z and residual E as,
(10) |
to estimate the regression coefficient λ1. This is the 2SRI estimator of the log causal hazard ratio. We first derive the probability limit of the M.L.E of λ1 and then calculate the asymptotic bias by taking the difference between this probability limit and the true log causal hazard ratio among compliers.
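The first stage shared by both methods can be sketched for the simplest case of a binary IV and no covariates, where any regression of Z on R with an intercept reproduces the two group means. This is an illustrative sketch only; the function name `first_stage` and the toy data are hypothetical.

```python
def first_stage(z, r):
    """Fit E(Z | R) for a binary IV by group means.

    Returns (p0, p1, p_hat, resid) where p_hat is the 2SPS regressor
    and resid = Z - p_hat is the extra 2SRI regressor.
    """
    n1 = sum(r)
    n0 = len(r) - n1
    p1 = sum(zi for zi, ri in zip(z, r) if ri == 1) / n1
    p0 = sum(zi for zi, ri in zip(z, r) if ri == 0) / n0
    p_hat = [p1 if ri == 1 else p0 for ri in r]
    resid = [zi - pi for zi, pi in zip(z, p_hat)]
    return p0, p1, p_hat, resid


# Toy data with no always takers: nobody with R = 0 is treated.
z = [1, 1, 0, 1, 0, 0, 0, 0]
r = [1, 1, 1, 1, 0, 0, 0, 0]
p0, p1, p_hat, resid = first_stage(z, r)
print(p0, p1)   # 0.0 0.75
print(resid)    # [0.25, 0.25, -0.75, 0.25, 0.0, 0.0, 0.0, 0.0]
```

In the second stage, 2SPS would regress the survival outcome on `p_hat` alone, whereas 2SRI would regress it on the observed `z` together with `resid`.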
4.1. Probability limit of M.L.E of causal parameter
As discussed in a previous study[5], (10) is not the true model for the hazard function h(Y∣Z, E). In fact the true model includes the interaction term between Z and E. However, deriving the closed-form expression for the probability limit of the estimator from (10) is very difficult when (10) is not the true model. With one additional assumption that there are no always takers, (10) becomes the true model. We derive a closed-form expression of the probability limit of the estimator of causal parameter λ1 assuming that there are no always takers and thus (10) is the true model. Let Ê denote the residuals from the estimated binary regression model in the first stage. i.e., Ê = Z − Ê(Z∣R). When Ê is substituted for E, (10) becomes:
Let and λ̂1 be the estimators (M.L.E) of and λ1. As sample size n → ∞, Ê → E, , and . Thus, . To derive a closed form expression for the asymptotic bias, we need to first re-express λ1 in terms of the parameters specified in section 2.3 under the principal stratification framework.
As shown in a previous study[5], under the no always taker assumption, the first stage binary regression is E(Z∣R) = ρa + ρcR and the residual term is E = Z − E(Z∣R); thus the residual term can be re-expressed as E = Z − ρa − ρcR. Since {Z, E} has a one-to-one relationship with {Z, R}, we can establish the following equivalence between the model involving {Z, E} and the model involving {Z, R} for the second stage Weibull model:
(11) |
Under the no always taker assumption, the second stage Weibull regression model defined by (10) assumes three underlying Weibull distributions with the same shape parameter but different scale parameters for subjects in three different subgroups: 1) Y∣(Z = 1, R = 1) ~ Weibull(α*, K0) for those who are assigned to treatment and actually receive it; only compliers are in this group; 2) Y∣(Z = 0, R = 1) ~ Weibull(α*, K1) for those who are assigned to treatment but do not actually receive it; this group has only never takers; 3) Y∣(Z = 0, R = 0) ~ Weibull(α*, K2) for those who are assigned to control and do not receive the treatment; both never takers and compliers are in this group. There are no subjects who are assigned to control but still take the active treatment (Z = 1, R = 0) under the assumption of no always takers. Thus, the M.L.E of α*, K0, K1, K2 are derived by maximizing the likelihood function Ln(α*, K0, K1, K2) that consists of products of three Weibull densities: Weibull(α*, K0), Weibull(α*, K1), and Weibull(α*, K2).
Let α̂* denote the M.L.E of α* and set , the expectation of the score equation derived from the profile likelihood of α*, to 0 and let be the solution. Under the assumptions stated in Section 2 and consistency of the M.L.E, the probability limit of the estimator α̂* is . Details are given in Appendix D. With the parameters of the principal strata defined, can be solved numerically using a root-finding algorithm. Let K̂0, K̂1, K̂2 be the M.L.Es of the three scale parameters K0, K1, K2. Once the value of is determined, we compute the probability limits of the estimators K̂0, K̂1, K̂2 as follows:
(12) |
and
(13) |
and
(14) |
The derivation of (12),(13) and (14) is detailed in Appendix D. Based on (11), we can establish the following three equations with all possible combination of values of Z and R excluding the always takers scenario (Z=1, R=0).
- When Z=1 and R=1, there are only compliers in this subgroup. (15)
- When Z=0 and R=1, there are only never takers in this subgroup. (16)
- When Z=0 and R=0, there is a mixture of both never takers and compliers in this subgroup. (17)
We then derive the closed form expression for the causal parameter λ1 by solving (15),(16), and (17) for λ1 as follows:
4.2. Bias analysis
To compute asymptotic bias of the 2SRI method, we subtract the true log hazard ratio ϕ from the closed-form expression of λ1.
(18) |
We can re-parameterize in (18) in the same way as in Section 3 and let . From the derived expression of the asymptotic bias of the 2SRI estimator, we can make the following observations: 1) When α = 1, the survival outcome within a principal stratum follows an exponential distribution. If we treat α* as known and set α* = 1, it means we fit an exponential regression model in the second stage; 2) When there is perfect compliance (ρc = 1), we have B2SRI = 0. In this scenario, . By plugging ρc = 1 into (18), we can easily verify the results; 3) When there is no confounding ( ), B2SRI = 0; 4) When there is no causal effect ( ), B2SRI is not 0; 5) λ, the scale parameter of the censoring distribution, is involved in bias equation (18), similar to the findings for the 2SPS method.
We can analyze how parameters influence the relationship between the magnitude of confounding and bias from the 2SRI method using (18). Similar to the previous section, four scenarios were created assuming there are no always takers. The results are shown in Figure 2 (a)-(d). In Figure 2, it is apparent that the bias of the 2SRI method is 0 when there is no confounding. Intuitively, under the condition of no confounding, including the estimated residuals in the second stage survival model has no effect on the estimate of the causal parameter. By comparing Figure 2 (a) and (b), we also observe that the bias decreases as the compliance rate increases from 0.5 to 0.8. When the scale parameter (θc) is smaller, the bias tends to be smaller (Figure 2 (a) vs. (c)). The probability of being randomly assigned to the treatment group has a very small impact on the bias (compare Figure 2 (b) to (d)).
5. Simulation
5.1. Simulation algorithm
We follow the five step algorithm used by Cai et al.[5] to generate data for a simulation study. In the first step, a data set of N subjects is generated. Always takers, compliers, and never takers among these subjects are generated from a multinomial distribution with probabilities {ρa, ρc, ρn}. In the second step, treatment assignment status R is generated for each subject with probability P(R = 1) = ρr. Because the outcome in the present study is time to event, we modified step 3 to generate potential survival times {T0, T1} and censoring times {L0, L1} for each principal stratum based on the parameters , , , , , , λ. For instance, if a subject is a complier, the potential time to death under control is generated from Weibull(α, ) and the potential time to death under treatment is generated from Weibull(α, ). The potential censoring times {L0, L1} are generated from Weibull(α, λ). In step 4, we use compliance status (always taker, complier, or never taker) and treatment assignment status R to determine the treatment received status Z. For instance, if a subject is a complier and assigned to the treatment group (R = 1), then Z = 1. If a subject is an always taker but assigned to the control group (R = 0), then Z is still 1, because always takers receive the treatment regardless of assignment. In step 5, the observed survival time and censoring time are generated as follows:
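$T = Z\,T_1 + (1 - Z)\,T_0$ and $L = Z\,L_1 + (1 - Z)\,L_0$,
and finally the observed follow up time and censoring indicator are given as:
$Y = \min(T, L)$ and $I(T \le L)$.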
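The five steps can be sketched for a single subject as follows. This is a minimal illustration under stated assumptions, not the authors' simulation code: the function name `simulate_subject`, the dictionary `example_scales`, and every numeric value are hypothetical, and the second Weibull argument in the text is treated as the scale K of Section 2.4.

```python
import random


def simulate_subject(rng, rho_a, rho_c, rho_r, alpha, scales, lam):
    """Generate one subject; scales maps stratum -> (scale under control, scale under treatment)."""
    # Step 1: draw the principal stratum with probabilities (rho_a, rho_c, 1 - rho_a - rho_c).
    u = rng.random()
    stratum = "AT" if u < rho_a else ("C" if u < rho_a + rho_c else "NT")
    # Step 2: draw the treatment assignment (the IV).
    r = 1 if rng.random() < rho_r else 0
    # Step 3: potential survival and censoring times; weibullvariate takes (scale, shape).
    k0, k1 = scales[stratum]
    t0, t1 = rng.weibullvariate(k0, alpha), rng.weibullvariate(k1, alpha)
    l0, l1 = rng.weibullvariate(lam, alpha), rng.weibullvariate(lam, alpha)
    # Step 4: treatment received depends on stratum and assignment.
    z = 1 if stratum == "AT" else (r if stratum == "C" else 0)
    # Step 5: observed times are the potential times under the treatment received.
    t = t1 if z == 1 else t0
    l = l1 if z == 1 else l0
    y = min(t, l)
    d = 1 if t <= l else 0  # 1 = failure observed, 0 = censored
    return stratum, r, z, y, d


example_scales = {"AT": (1.0, 1.5), "C": (1.0, 2.0), "NT": (1.5, 1.5)}  # arbitrary values
rng = random.Random(7)
rows = [simulate_subject(rng, 0.2, 0.5, 0.5, 1.5, example_scales, 2.0) for _ in range(20000)]
treated_given_r0 = [z for _, r, z, _, _ in rows if r == 0]
treated_given_r1 = [z for _, r, z, _, _ in rows if r == 1]
print(sum(treated_given_r0) / len(treated_given_r0))  # close to rho_a
print(sum(treated_given_r1) / len(treated_given_r1))  # close to rho_a + rho_c
```

The two printed proportions give a quick sanity check that the treatment received mechanism matches the stratum probabilities.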
5.2. Simulation results
To demonstrate the consistency between the derived closed form expressions and the asymptotic biases from the 2SPS and 2SRI approaches under the assumption of no always takers (ρa = 0), we ran the simulation 2000 times, with the sample size n=10000, according to the same parameter settings presented in Figure 1 d) and Figure 2 d). Table 1 shows simulation results from 4 scenarios (α = 0.5, 1, 1.5, 2). As shown in this table, the biases from simulated results are consistent with the values computed with the derived analytic formula for both the 2SPS and 2SRI Weibull models. We also considered 2SPS and 2SRI Cox models (the second stage regression is a Cox model instead of a Weibull model). The pattern of the biases from 2SPS and 2SRI Cox models remains the same as for the 2SPS and 2SRI Weibull models respectively. With decreasing hazard (α = 0.5), the bias from using the 2SPS approach is smaller than the bias from the 2SRI approach. When the hazard is constant or increasing (α ≥ 1), the results are mixed. With stronger negative confounding, the 2SPS method produces smaller bias than the 2SRI method. However, with no confounding or stronger positive confounding, the 2SPS method produces larger bias than the 2SRI method.
Table 1.
α | δ | 2SPS analytic | 2SPS Weibull (sim.) | 2SPS Cox (sim.) | 2SRI analytic | 2SRI Weibull (sim.) | 2SRI Cox (sim.)
---|---|---|---|---|---|---|---
0.5 | 2 | -0.094 | -0.093 | -0.091 | -0.477 | -0.476 | -0.476
0.5 | 1.5 | -0.067 | -0.068 | -0.064 | -0.238 | -0.239 | -0.235
0.5 | 1 | -0.039 | -0.040 | -0.039 | -0.086 | -0.087 | -0.086
0.5 | 0.5 | -0.013 | -0.016 | -0.012 | -0.015 | -0.018 | -0.014
0.5 | 0 | 0.007 | 0.009 | 0.007 | 0.000 | 0.002 | 0.000
0.5 | -0.5 | 0.023 | 0.020 | 0.026 | 0.000 | -0.003 | -0.001
0.5 | -1 | 0.038 | 0.037 | 0.051 | 0.029 | 0.028 | 0.029
0.5 | -1.5 | 0.055 | 0.053 | 0.075 | 0.114 | 0.112 | 0.108
0.5 | -2 | 0.073 | 0.074 | 0.101 | 0.261 | 0.263 | 0.236
1 | 2 | -0.250 | -0.253 | -0.247 | -0.545 | -0.550 | -0.544
1 | 1.5 | -0.177 | -0.175 | -0.177 | -0.285 | -0.284 | -0.284
1 | 1 | -0.096 | -0.093 | -0.097 | -0.110 | -0.107 | -0.112
1 | 0.5 | -0.017 | -0.020 | -0.018 | -0.022 | -0.025 | -0.023
1 | 0 | 0.051 | 0.053 | 0.055 | 0.000 | 0.002 | 0.000
1 | -0.5 | 0.107 | 0.106 | 0.116 | -0.007 | -0.008 | -0.009
1 | -1 | 0.152 | 0.153 | 0.177 | 0.000 | 0.000 | 0.002
1 | -1.5 | 0.193 | 0.191 | 0.232 | 0.057 | 0.053 | 0.055
1 | -2 | 0.230 | 0.232 | 0.280 | 0.175 | 0.176 | 0.157
1.5 | 2 | -0.422 | -0.423 | -0.418 | -0.605 | -0.607 | -0.602
1.5 | 1.5 | -0.285 | -0.285 | -0.284 | -0.326 | -0.325 | -0.326
1.5 | 1 | -0.132 | -0.133 | -0.134 | -0.133 | -0.134 | -0.134
1.5 | 0.5 | 0.019 | 0.023 | 0.021 | -0.028 | -0.027 | -0.029
1.5 | 0 | 0.153 | 0.152 | 0.159 | 0.000 | -0.004 | 0.000
1.5 | -0.5 | 0.261 | 0.266 | 0.274 | -0.015 | -0.012 | -0.015
1.5 | -1 | 0.345 | 0.342 | 0.376 | -0.030 | -0.033 | -0.027
1.5 | -1.5 | 0.412 | 0.412 | 0.461 | -0.005 | -0.008 | -0.002
1.5 | -2 | 0.467 | 0.468 | 0.531 | 0.078 | 0.075 | 0.068
2 | 2 | -0.574 | -0.578 | -0.571 | -0.656 | -0.656 | -0.653
2 | 1.5 | -0.359 | -0.360 | -0.357 | -0.362 | -0.361 | -0.359
2 | 1 | -0.122 | -0.124 | -0.122 | -0.152 | -0.153 | -0.152
2 | 0.5 | 0.111 | 0.115 | 0.112 | -0.034 | -0.032 | -0.036
2 | 0 | 0.317 | 0.320 | 0.324 | 0.000 | 0.003 | 0.002
2 | -0.5 | 0.481 | 0.479 | 0.494 | -0.022 | -0.026 | -0.026
2 | -1 | 0.605 | 0.605 | 0.636 | 0.059 | -0.059 | -0.056
2 | -1.5 | 0.698 | 0.701 | 0.747 | -0.069 | -0.069 | -0.063
2 | -2 | 0.769 | 0.770 | 0.833 | -0.023 | -0.024 | -0.026
2SPS analytic: bias computed using the analytic formula derived for the 2SPS method; 2SPS Weibull (sim.): bias computed via simulation for the 2SPS Weibull accelerated failure time model; 2SPS Cox (sim.): bias computed via simulation for the 2SPS Cox model; 2SRI analytic: bias computed using the analytic formula derived for the 2SRI method; 2SRI Weibull (sim.): bias computed via simulation for the 2SRI Weibull accelerated failure time model; 2SRI Cox (sim.): bias computed via simulation for the 2SRI Cox model.
To evaluate the performance of both the 2SPS and 2SRI methods in the setting where there are always takers, we simulated data with various combinations of parameters based on the following settings: i) the shape parameter α varies among {0.5, 1, 2}, which represent decreasing, constant, and increasing hazard scenarios; ii) the probabilities of being always takers ρa and compliers ρc were set to 3 combinations: {0.2, 0.7}, {0.7, 0.2}, and {0, 0.5}. In this way, low, medium, and high levels of compliance were represented; iii) the probability of being assigned to treatment ρr was set to {0.1, 0.5} to reflect both new and relatively established treatments; iv) the scale parameter of the censoring distribution was set to {0.5, 1, 2}; v) each of the parameters , , was set to {0.5, 1, 3} separately. Thus, 1458 possible combinations were created. For each setting, we generated 10,000 observations and fit the 2SPS and 2SRI models to the data. This process was repeated 2000 times.
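The count of 1458 settings follows from multiplying the number of levels of each factor (3 × 3 × 2 × 3 × 3 × 3 × 3). The sketch below enumerates the grid; the variable names are hypothetical, and the three stratum-level parameters are represented only by their shared set of levels.

```python
from itertools import product

shape_values = [0.5, 1, 2]                                   # alpha
compliance_values = [(0.2, 0.7), (0.7, 0.2), (0.0, 0.5)]     # (rho_a, rho_c)
assignment_values = [0.1, 0.5]                               # rho_r
censoring_scale_values = [0.5, 1, 2]                         # lambda
stratum_param_values = [0.5, 1, 3]                           # levels for each of three parameters

grid = list(product(
    shape_values,
    compliance_values,
    assignment_values,
    censoring_scale_values,
    stratum_param_values,
    stratum_param_values,
    stratum_param_values,
))
print(len(grid))  # 1458
```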
The results are presented in Figure 3. The magnitude of bias increases with increasing magnitudes of unmeasured confounding. As the value of the shape parameter α increases, the magnitude of bias increases. In the scenarios with decreasing hazard, the 2SPS method outperforms the 2SRI method. The 2SRI method tends to have larger asymptotic bias when the magnitude of unmeasured confounding is large. In the scenarios with constant hazard, the 2SPS method slightly outperforms the 2SRI method when the magnitude of unmeasured confounding is large. In the scenarios with increasing hazard, both approaches produce larger biases. The 2SRI method performs better when the magnitude of unmeasured confounding is small. When there are always takers, the 2SRI method could be biased even when there is no unmeasured confounding. We also compared the two methods using mean square error and the conclusions remain the same (4).
6. Seer-Medicare Prostate Cancer Study
Prostate cancer is the highest prevalence non-skin malignancy among American men (In 2011, there were an estimated 2,707,821 men living with prostate cancer in the United States. The number of deaths was 23.0 per 100,000 men per year). Unlike prostate cancers that are diagnosed at an early stage, locally advanced prostate cancer is associated with substantial morbidity and mortality. Radiation therapy is a common treatment for locally advanced prostate cancer. Two randomized trials recently demonstrated that radiation therapy reduces mortality for men with locally advanced tumors who also receive systemic androgen deprivation[9, 10]. However, both trials excluded elderly patients and those with early stage, PSA-screen detected cancer and therefore had less generalizability, a common criticism of randomized evidence. Therefore, we applied two-stage IV methods to evaluate survival outcomes in locally advanced prostate cancer, assessing survival outcomes of androgen deprivation therapy with or without radiation therapy in comparison to the randomized trials.
We analyzed data from the Surveillance, Epidemiology and End Results (SEER)-Medicare database. The SEER-Medicare database links patient demographic and tumor-specific data collected by SEER cancer registries to Medicare claims for inpatient and outpatient care. We considered patients with prostate cancer diagnosed between January 1, 1995 and December 31, 2007 in SEER with follow-up through December 31, 2010 in Medicare. We excluded patients who 1) were older than age 85; 2) had an unknown urban category; 3) lived in hospital referral regions (HRR) with fewer than 50 patients; 4) had an unknown distance to the closest radiation facility; or 5) died within the first 9 months of the study. A total of 31,541 patients were selected and categorized as receiving androgen deprivation with or without radiation therapy.
The cohort was divided into the following three groups: 1) patients with American Joint Committee on Cancer (AJCC) tumor stage (T-stage) T2 or T3, aged 65-75 (called the “RCT Cohort”), who are most comparable to the patients in the two randomized studies of androgen deprivation with or without radiation therapy [9, 10]; 2) elderly patients with T-stage T2 or T3, aged 76-85, who were under-represented in or excluded from the published randomized trials (called the “Elderly Cohort”); and 3) patients with early stage, PSA-screen detected cancer with T-stage T1 disease, who were excluded from the published randomized trials (called the “Screen-Detected Cohort”).
The study by Widmark et al. [9] included men from 47 centers in Europe diagnosed between February 1996 and December 2002. A total of 875 patients with locally advanced prostate cancer (T3; 78%; prostate-specific antigen (PSA) ≤ 70 ng/mL; N0; M0) were enrolled: 439 patients were randomly assigned to androgen deprivation alone and the other 436 patients received androgen deprivation with radiation therapy. The study by Warde et al. [10] enrolled 1,205 patients between 1995 and 2005 with locally advanced (T3 or T4) prostate cancer, or organ-confined disease (T2) with either PSA >40 ng/mL or PSA >20 ng/mL and a Gleason score of 8 or higher. These patients were randomly assigned to receive androgen deprivation alone (n=602) or androgen deprivation with radiation therapy (n=603). The hazard ratios for overall mortality reported in [9] and [10] were 0.68 (95% CI 0.52–0.89) and 0.77 (95% CI 0.61–0.98), respectively. For ease of comparison, we combined the results of the randomized trials using weighted-average meta-analysis. The meta-analytic HR was 0.73 (0.61–0.87).
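The pooled estimate can be illustrated with a short sketch. The text only says "weighted-average meta-analysis"; the code below assumes fixed-effect inverse-variance weights on the log hazard ratio scale, with standard errors backed out from the published 95% confidence intervals. The function name `pool_fixed_effect` is hypothetical. Under these assumptions the calculation returns the reported 0.73 (0.61–0.87).

```python
import math

def pool_fixed_effect(estimates, z=1.96):
    """Inverse-variance pooling of hazard ratios given as (hr, ci_low, ci_high)."""
    weights, log_hrs = [], []
    for hr, lo, hi in estimates:
        # Standard error of log(HR), recovered from the width of the 95% CI.
        se = (math.log(hi) - math.log(lo)) / (2 * z)
        weights.append(1.0 / se**2)
        log_hrs.append(math.log(hr))
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, log_hrs)) / total
    se_pooled = math.sqrt(1.0 / total)
    return (math.exp(pooled),
            math.exp(pooled - z * se_pooled),
            math.exp(pooled + z * se_pooled))

# Overall-mortality hazard ratios from Widmark et al. [9] and Warde et al. [10].
trials = [(0.68, 0.52, 0.89), (0.77, 0.61, 0.98)]
hr, lo, hi = pool_fixed_effect(trials)
print(f"pooled HR = {hr:.2f} ({lo:.2f}-{hi:.2f})")  # pooled HR = 0.73 (0.61-0.87)
```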
To assess the effectiveness of androgen deprivation with or without radiation therapy in reducing overall mortality (death from any cause), we performed two-stage IV Weibull regression analysis (2SPS and 2SRI) using a local area treatment rate instrument and controlling for the propensity score. The local area treatment rate instrument was defined as the proportion of patients who received definitive treatment (surgery or radiation therapy) among all patients with prostate cancer in the hospital referral region (HRR), and we dichotomized this instrument at its median. This IV measures the aggressiveness of local area treatment and captures regionally distinct structural care variation not fully explained by patient characteristics. The IV was strongly associated with treatment assignment and balanced important prognostic factors [3]. The propensity score model included the following potential confounding variables: age, race, ethnicity, clinical T stage, N stage, World Health Organization tumor grade, 17 categories of co-morbid disease, urban residence, and census tract median income.
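The mechanics of the two second-stage fits can be sketched on simulated data. This is only an illustration under stated assumptions: the first stage is taken to be a linear regression of treatment on the binary instrument, the Weibull model is parameterized as h(t | x) = α t^(α−1) exp(b0 + x'b), the propensity score adjustment is omitted, and the helper names `fit_weibull_ph` and `two_stage_hr` as well as all simulation settings are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

def fit_weibull_ph(time, event, covariates):
    """ML fit of h(t | x) = alpha * t**(alpha - 1) * exp(b0 + x @ b).

    Returns (alpha_hat, coefficient vector with the intercept first)."""
    X = np.column_stack([np.ones(len(time)), covariates])
    log_t = np.log(time)

    def neg_loglik(params):
        alpha = np.exp(params[0])  # log-parameterized so that alpha > 0
        eta = X @ params[1:]
        loglik = (event * (np.log(alpha) + (alpha - 1) * log_t + eta)
                  - time**alpha * np.exp(eta))
        return -loglik.mean()

    res = minimize(neg_loglik, np.zeros(X.shape[1] + 1), method="BFGS")
    return np.exp(res.x[0]), res.x[1:]

def two_stage_hr(time, event, treat, instrument):
    """Return (HR_2SPS, HR_2SRI) for a binary treatment and a binary instrument."""
    # Stage 1 (assumed linear): regress treatment received on the instrument.
    A = np.column_stack([np.ones(len(treat)), instrument])
    coef, *_ = np.linalg.lstsq(A, treat, rcond=None)
    predicted = A @ coef
    residual = treat - predicted
    # Stage 2, 2SPS: the predicted treatment replaces the observed treatment.
    _, b_sps = fit_weibull_ph(time, event, predicted)
    # Stage 2, 2SRI: observed treatment plus the first-stage residual.
    _, b_sri = fit_weibull_ph(time, event, np.column_stack([treat, residual]))
    return np.exp(b_sps[1]), np.exp(b_sri[1])

# Hypothetical simulated cohort; every number below is illustrative only.
rng = np.random.default_rng(0)
n = 5000
instrument = rng.integers(0, 2, n)
u = rng.normal(size=n)  # unmeasured confounder
treat = (instrument + 0.5 * u + rng.normal(size=n) > 0.5).astype(float)
alpha_true = 1.5
rate = np.exp(-1.0 - 0.5 * treat + 0.5 * u)  # conditional log HR of treatment = -0.5
t_event = (rng.exponential(size=n) / rate) ** (1 / alpha_true)
t_cens = (rng.exponential(size=n) / np.exp(-1.5)) ** (1 / alpha_true)
time = np.minimum(t_event, t_cens)
event = (t_event <= t_cens).astype(float)

hr_sps, hr_sri = two_stage_hr(time, event, treat, instrument)
print(f"2SPS HR = {hr_sps:.2f}, 2SRI HR = {hr_sri:.2f}")
```

The only difference between the two estimators is the second-stage design matrix: 2SPS uses the first-stage fitted value in place of the treatment, whereas 2SRI keeps the observed treatment and adds the first-stage residual as an extra covariate.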
As shown in Table 2, there is variability in the estimated HRs obtained from the 2SPS and 2SRI methods. We estimated the shape parameter α ≈ 1.6 from the data. Figure 3 shows that the bias for both the 2SPS and 2SRI methods is largest when the hazard is increasing (α > 1), even when the magnitude of unmeasured confounding is relatively small. When the hazard function is decreasing (α < 1), the 2SPS method produces more stable and less biased estimates than the 2SRI method. In this case, 2SPS may be a more appropriate approach to use. In the RCT Cohort, the estimated HRs from both IV methods (0.96 for 2SRI and 0.97 for 2SPS) are much larger than the meta-analytic HR from the two randomized studies. Note that the confidence intervals are also much wider in both IV analyses than in the original RCTs. In the published RCTs, the authors concluded that there was a statistically significant treatment effect (combined therapy is better), whereas we cannot draw this conclusion from our IV analysis. In the total study sample and separately in the RCT Cohort and the Screen-Detected Cohort, the two IV estimates are quite similar. However, for the Elderly Cohort, the estimate from the 2SPS method (0.96) differs from the estimate from the 2SRI method (0.74).
Table 2.
Outcome | Group | 2SRI HR (CI) | 2SPS HR (CI) |
---|---|---|---|
All cause mortality | Total (n=31541) | 0.57 (0.17-1.06) | 0.59 (0.19-1.09) |
All cause mortality | RCT Cohort (n=12924) | 0.96 (0.18-5.81) | 0.97 (0.18-5.94) |
All cause mortality | Elderly Cohort (n=14340) | 0.74 (0.20-1.83) | 0.96 (0.26-2.35) |
All cause mortality | Screen-Detected Cohort (n=4277) | 0.34 (0.02-2.99) | 0.35 (0.03-3.22) |
7. Discussion
Many clinical and health services studies use health care databases to compare the effectiveness of drug and surgical therapies, but such studies are prone to unmeasured confounding. Two stage IV methods have been gaining popularity among clinical researchers because they provide a relatively simple approach to analyzing survival outcomes in the presence of unmeasured confounding. However, current knowledge about potential bias in estimating the log causal hazard ratio is limited. As demonstrated in our prostate cancer study, the large treatment effects estimated from two stage IV methods could be attributable to potential bias. We have derived closed-form expressions for the asymptotic bias of the 2SRI and 2SPS approaches assuming the survival times follow a Weibull distribution with shape parameter α and scale parameter K. We have demonstrated that these analytic results are consistent with our simulation results.
For binary outcomes, two previous studies [5, 18] demonstrated that the bias in the treatment effect estimated using the 2SRI approach increases as the magnitude of confounding increases. In this work, we have shown analytically and by simulation that the 2SRI and 2SPS approaches are both biased in estimating the causal hazard ratio among compliers. In some situations when the hazard is decreasing (e.g., among patients who have recently received a kidney transplant), the 2SPS method is less biased than the 2SRI method and could be a more appropriate method to use. When the hazard is increasing, both IV methods may produce very large bias even under a moderate amount of unmeasured confounding. In this case, we recommend exercising caution when interpreting results from two-stage IV survival models.
We have shown that even when all IV assumptions are met, both the 2SRI and the 2SPS methods could fail to consistently estimate the causal hazard ratio among compliers. Our analytic results for bias may help to guide researchers in deciding when the bias is likely to be small enough that two stage IV methods may reasonably be applied. Furthermore, in a sensitivity analysis approach, one may estimate from the data the shape parameter and the censoring proportion among patients assigned to treatment or control. With the shape parameter and censoring proportions fixed at these estimates, the level of unmeasured confounding can be varied to examine how the estimates would change, as shown in Figures 1 and 2. Alternative methods include partial likelihood estimation [19].
Appendix
Appendix A: Mixture of Weibull Distributions
We prove that the distribution function of the observed survival time T, conditional on random assignment R, can be expressed by the following equations:
(A.1) |
and,
(A.2) |
In the above equations, AT represents always takers, C represents compliers, and NT represents never takers. Other definitions of parameters and distributions that are used in the proof are given below:
no defiers under monotonicity assumption
Proof
F(T∣R = 1) can be expressed as:
F(T∣R = 0) can be expressed as:
Appendix B: Proofs Related to the Derivation of the Closed-Form Solution
- Assume survival time T ~ Weibull(α, K) and censoring time L ~ Weibull(α, λ). Let Y = min(T, L) and δ = I(T ≤ L). Show that
and, (B.1)
Proof: Thus,
- Assume survival time T is a mixture of three Weibull distributions, T1 ~ Weibull(α, K1), T2 ~ Weibull(α, K2), and T3 ~ Weibull(α, K3), with weights p1, p2, p3 and p1 + p2 + p3 = 1. The censoring time L ~ Weibull(α, λ). Let Y = min(T, L) and δ = I(T ≤ L). Show that
(B.2)
Proof:
- Given that X follows a Weibull(α*, K) distribution, show that
(B.3)
Proof:
- Given that X follows a Weibull(α*, K) distribution, show that
(B.4)
Proof:
- Given that X follows a Weibull(α*, K) distribution, show that
(B.5)
Proof:
- Let Ti denote the survival time and Ci denote the censoring time for subject i. Ti and Ci are independent, with Ti ~ Weibull(α, K) and Ci ~ Weibull(α, λ). Let Yi = min(Ti, Ci) denote the observed follow-up time and δi = I(Ti ≤ Ci) be the event indicator. Show that:
(B.6)
Proof:
Let and use (B.1)
Both E(Yi δi) and E(Yi) E (δi) have the same integral functions. Thus,
Similarly, we can establish the following:
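The setting of this appendix can be checked numerically. The sketch below is a Monte Carlo illustration, not part of the proofs, and it assumes the parameterization S(t) = exp(−K t^α), under which standard results for two independent Weibull variables with a common shape give P(δ = 1) = K/(K + λ), E(Y^α) = 1/(K + λ), and Cov(Y, δ) = 0. The parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200_000
alpha, K, lam = 1.5, 0.5, 0.25  # illustrative values, not taken from the paper

# Assumed parameterization: S(t) = exp(-K * t**alpha), so K * T**alpha ~ Exp(1).
T = (rng.exponential(size=n) / K) ** (1 / alpha)    # survival time
L = (rng.exponential(size=n) / lam) ** (1 / alpha)  # censoring time, same shape
Y = np.minimum(T, L)                                # observed follow-up time
delta = (T <= L).astype(float)                      # event indicator

p_event = delta.mean()                              # theory: K / (K + lam)
mean_y_alpha = (Y ** alpha).mean()                  # theory: 1 / (K + lam)
cov_y_delta = np.mean(Y * delta) - Y.mean() * delta.mean()  # theory: 0

print("P(delta = 1):", p_event, "vs", K / (K + lam))
print("E(Y^alpha):  ", mean_y_alpha, "vs", 1 / (K + lam))
print("Cov(Y, delta):", cov_y_delta)
```

The zero covariance relies on the common shape parameter: with proportional event and censoring hazards, the observed time Y carries no information about which of the two occurred first.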
Appendix C: Derivation of probability limits of M.L.E of α, K0, K1 for 2SPS
Let Y = min(T, C) be the observed follow-up time and δ = I(T ≤ C) be the event indicator. The subjects are assigned to either the treatment group (R = 1) or the control group (R = 0). The distribution of each subgroup has a different scale parameter K but the same shape parameter α*. Thus, the likelihood function of the observed follow-up time Y can be written as:
In both the treatment assignment group and the control assignment group, subjects are compliers (c), never takers (nt), or always takers (at). Let nR1 and nR0 denote the numbers of subjects assigned to treatment (R = 1) and control (R = 0). Let nR1,at, nR1,nt, and nR1,c denote the numbers of always takers, never takers, and compliers assigned to the treatment group, so that nR1,at + nR1,nt + nR1,c = nR1. Let nR0,at, nR0,nt, and nR0,c denote the numbers of always takers, never takers, and compliers assigned to the control group, so that nR0,at + nR0,nt + nR0,c = nR0. Therefore, the likelihood can be rewritten as:
Next, the log likelihood function is:
To derive the M.L.E of K0 and K1, we take the first derivative of l(y) with respect to K0 and K1 and set the score equations to 0, which gives
(C.1) |
and,
(C.2) |
To derive the M.L.E of α*, we take the first derivative of l(y) with respect to α*, set the score equation to 0, and replace K1 and K0 with the expressions (C.1) and (C.2), which gives
The M.L.E α̂* is the solution to the above equation. Next, dividing both sides by the total number of subjects n, we have
As nR1, nR0, nR1,at, nR1,nt, nR1,c, nR0,at, nR0,nt, nR0,c → ∞, the score equation converges to the following:
(C.3) |
Using the results from Appendix B, we can derive the following:
and,
Let be the solution to the equation (C.3). By the consistency of M.L.E, Thus, we have Next, substitute α̂* into equation (C.1)
Asymptotically, it converges to
Similarly, K̂1 converges to
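As a reference point for the form of the score equations used in this appendix, the following sketch writes out the censored Weibull log likelihood for two assignment groups that share a shape parameter. It assumes the density f(t) = α* K_g t^(α*−1) exp(−K_g t^(α*)) for group g, which is an assumption about the parameterization, and it is not a substitute for (C.1) to (C.3).

```latex
% Assumed density for assignment group g (g = 0: control, g = 1: treatment):
%   f(t) = \alpha^{*} K_g t^{\alpha^{*}-1} \exp(-K_g t^{\alpha^{*}})
\begin{align*}
l(y) &= \sum_{g=0}^{1} \sum_{i:\,R_i = g}
  \Big\{ \delta_i \big[ \log\alpha^{*} + \log K_g + (\alpha^{*}-1)\log y_i \big]
  - K_g\, y_i^{\alpha^{*}} \Big\}, \\
\frac{\partial l}{\partial K_g}
  &= \frac{\sum_{i:\,R_i=g}\delta_i}{K_g} - \sum_{i:\,R_i=g} y_i^{\alpha^{*}} = 0
  \;\Longrightarrow\;
  \hat{K}_g = \frac{\sum_{i:\,R_i=g}\delta_i}{\sum_{i:\,R_i=g} y_i^{\alpha^{*}}}, \\
\frac{\partial l}{\partial \alpha^{*}}\Big|_{K_g=\hat{K}_g}
  &= \sum_{i}\delta_i\Big(\frac{1}{\alpha^{*}} + \log y_i\Big)
  - \sum_{g=0}^{1} \hat{K}_g \sum_{i:\,R_i=g} y_i^{\alpha^{*}}\log y_i = 0 .
\end{align*}
```

Under this parameterization each scale estimate is the number of observed events divided by the sum of y_i^(α*) within the group, so the probability limits follow from the limits of the corresponding sample averages.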
Appendix D: Derivation of probability limits of M.L.E of α, K0, K1, K2 for 2SRI
Under the assumption of no always takers (AT), we can find an expression for λ1 as follows. The first stage regression can be re-expressed as follows:
Note that (Z, E) and (Z, R) are in one-to-one correspondence: knowing Z and E determines Z and R, and vice versa. Under the no always taker assumption, we observe three subgroups: 1) Z = 1, R = 1, which contains only compliers; 2) Z = 0, R = 1, which contains only never takers; 3) Z = 0, R = 0, which contains both never takers and compliers. There are no patients who are assigned to control but still take the active treatment (Z = 1, R = 0). For the 3 subgroups, we are essentially fitting 3 Weibull distributions with the same shape parameter α* and 3 different scale parameters K0, K1, K2 through the Weibull regression model: log h(t) = λ0 + λ1 Z + λ2 E
The likelihood function is:
The log likelihood is:
Taking the first derivative of l(y) with respect to K0, K1, and K2, respectively, and setting the score equations to 0, we have
(D.1) |
(D.2) |
(D.3) |
Taking the first derivative of l(y) with respect to α* and replacing K0, K1, K2 with the expressions (D.1), (D.2), and (D.3), we have:
The M.L.E α̂* is the solution to the above score equation. Next, dividing the equation by the total sample size n,
As the sample size in each principal stratum → ∞, the score equation will converge to:
(D.4) |
where,
is the solution to the equation (D.4). Thus, . The probability limit of the M.L.E of K0 can be derived as follows:
Similarly, for K1, K2,
Appendix E: Assumption of the same shape parameter for survival and censoring distributions
In Section 2 of the manuscript, we assumed that the time to event and the censoring time have the same shape parameter so that a closed-form solution could be derived. To evaluate the potential impact on the bias when this assumption is violated and the two distributions have different shape parameters, we re-evaluated the scenario in Table 1 with shape parameter α = 0.5, setting the shape parameter of the censoring distribution to 1.2, and compared the results. We found that the differences in the bias of 2SPS between the two scenarios range from 0.01 to 0.018 (as δ varies from -2 to 2). For the 2SRI approach, the differences range from 0.001 to 0.13. These differences are attributable to the different censoring proportions in the two scenarios. The shape of the relationship between bias and δ remains approximately unchanged (data not shown). It should be noted that, when the survival time and the censoring time do share the same shape parameter, the maximum likelihood estimator based on a survival likelihood that does not incorporate this assumption is not fully efficient.
References
- 1. Hernán MA, Robins JM. Instruments for causal inference: an epidemiologist’s dream? Epidemiology. 2006;17(4):360–372. doi: 10.1097/01.ede.0000222409.00878.37.
- 2. Angrist J, Imbens G, Rubin DB. Identification of causal effects using instrumental variables. Journal of the American Statistical Association. 1996;91:444–455.
- 3. Bekelman JE, Mitra N, Handorf E, Uzzo RG, Hahn S, Polsky D, Armstrong K. Effectiveness of Androgen Deprivation Therapy and Radiotherapy for Older Men with Locally Advanced Prostate Cancer. Journal of Clinical Oncology. doi: 10.1200/JCO.2014.57.2743. in press.
- 4. Terza J, Basu A, Rathouz P. Two-stage residual inclusion estimation: addressing endogeneity in health econometric modeling. Journal of Health Economics. 2008;27(3):531–543. doi: 10.1016/j.jhealeco.2007.09.009.
- 5. Cai B, Small D, Ten Have T. Two-stage instrumental variable methods for estimating the causal odds ratio: analysis of bias. Statistics in Medicine. 2011;30(15):1809–1824. doi: 10.1002/sim.4241.
- 6. Gore JL, Litwin MS, Lai J, et al. Use of Radical Cystectomy for Patients with Invasive Bladder Cancer. Journal of the National Cancer Institute. 2010;102(11):802–811. doi: 10.1093/jnci/djq121.
- 7. Hadley J, Yabroff KR, Barrett MJ, Penson DF, Saigal CS, Potosky AL. Comparative effectiveness of prostate cancer treatments: evaluating statistical adjustments for confounding in observational data. Journal of the National Cancer Institute. 2010;102(23):1780–1793. doi: 10.1093/jnci/djq393.
- 8. Tan HJ, Norton EC, Ye Z, Hafez KS, Gore JL, Miller DC. Long-term survival following partial vs radical nephrectomy among older patients with early-stage kidney cancer. The Journal of the American Medical Association. 2012;307(15):1629–1635. doi: 10.1001/jama.2012.475.
- 9. Widmark A, Klepp O, Solberg A, et al. Endocrine treatment, with or without radiotherapy, in locally advanced prostate cancer (SPCG-7/SFUO-3): an open randomised phase III trial. Lancet. 2009;373(9660):301–308. doi: 10.1016/S0140-6736(08)61815-2.
- 10. Warde P, Mason M, Ding K, et al. Combined androgen deprivation therapy and radiation therapy for locally advanced prostate cancer: a randomised, phase 3 trial. Lancet. 2011;378(9809):2104–2111. doi: 10.1016/S0140-6736(11)61095-7.
- 11. Nie H, Cheng J, Small DS. Inference for the effect of treatment on survival probability in randomized trials with noncompliance and administrative censoring. Biometrics. 2011;67(4):1397–1405. doi: 10.1111/j.1541-0420.2011.01575.x.
- 12. Rubin DB. Statistics and causal inference–Which ifs have causal answers. Journal of the American Statistical Association. 1986;81:961–962.
- 13. Rubin DB. Comment: Neyman (1923) and causal inference in experiments and observational studies. Statistical Science. 1990;5:472–480.
- 14. Abadie A. Semiparametric Instrumental Variable Estimation of Treatment Response Models. Journal of Econometrics. 2003;113:231–263.
- 15. Imbens G, Angrist J. Identification and Estimation of Local Average Treatment Effects. Econometrica. 1994;62:467–475.
- 16. Rubin DB. Causal Inference Using Potential Outcomes: Design, Modelling, Decisions. Journal of the American Statistical Association. 2005;100:322–331.
- 17. Struthers CA, Kalbfleisch JD. Misspecified proportional hazard models. Biometrika. 1986;73:363–369.
- 18. Ten Have T, Joffe M, Cary M. Causal logistic models for non-compliance under randomized treatment with univariate binary response. Statistics in Medicine. 2003;22(8):1255–1283. doi: 10.1002/sim.1401.
- 19. Cuzick J, Sasieni P, Myles J, Tyrer J. Estimating the effect of treatment in a proportional hazards model in the presence of non-compliance and contamination. Journal of the Royal Statistical Society, Series B (Methodological) 2007;69:565–88.