Abstract
A major concern in any observational study is unmeasured confounding of the relationship between a treatment and outcome of interest. Instrumental variable (IV) analysis methods are able to control for unmeasured confounding. However, IV analysis methods developed for censored time-to-event data tend to rely on assumptions that may not be reasonable in many practical applications, making them unsuitable for use in observational studies. In this report, we develop weighted estimators of the complier average causal effect (CACE) on the restricted mean survival time (RMST) in the overall population as well as in an evenly matchable population (CACE-m). Our method is able to accommodate instrument-outcome confounding and adjust for covariate dependent censoring, making it particularly suited for causal inference from observational studies. We establish the asymptotic properties and derive easily implementable asymptotic variance estimators for the proposed estimators. Through simulation studies, we show that the proposed estimators tend to be more efficient than instrument propensity score matching based estimators or inverse probability of instrument weighted estimators. We apply our method to compare dialytic modality-specific survival for end stage renal disease (ESRD) patients using data from the United States Renal Data System (USRDS).
Keywords: Complier Average Causal Effect, Dialysis, Instrumental Variables, Unmeasured Confounding, Restricted Mean Survival Time
1. Introduction
A major concern in any study lacking randomized treatment assignment is the potential for confounding of the relationship between the treatment and outcome of interest. In the absence of randomization, estimation of the causal effect of treatment generally requires an untestable and often unrealistic assumption that the treatment is randomly assigned conditional on the observed covariates, i.e., there are no unmeasured confounders of the treatment-outcome association. Unmeasured confounding can be overcome by conducting an instrumental variable (IV) analysis. This requires the availability of an IV, which is a variable that: (a) has an association with the treatment of interest that is not confounded by unmeasured variates; (b) has no direct effect on the outcome, except through the treatment of interest; and (c) has an association with the outcome that is not confounded by unmeasured variates. Such a variable, when available, can be used to identify treatment effects without knowledge of the treatment selection mechanism (Imbens and Angrist, 1994; Angrist et al., 1996). Some common examples of IVs in the binary treatment setting include physician preferences for treatment prescription, randomized encouragement to treatment, and treatment assignment in randomized clinical trials with noncompliance.
There has been relatively little research into developing methods for IV analysis of right censored time-to-event data. For the randomized trial setting, Robins and Tsiatis (1991) developed semiparametric estimators of the treatment effect under a semiparametric structural accelerated failure time model for the outcome. Loeys and Goetghebeur (2003) extended the approach proposed by Robins and Tsiatis (1991) to a proportional hazards model of treatment effect. Baker (1998) considered discrete-time survival data, and developed an estimator of the difference in hazards at a specific time between compliers in the treatment and control groups of a randomized trial. Richardson et al. (2016) considered competing risks data and proposed non-parametric estimators, decomposing the overall causal effect of treatment on survival probability at a fixed time point into the sum of causal effects on cause-specific cumulative incidence functions. Like the other approaches mentioned above, Richardson et al. (2016) assume independent censoring.
A major drawback of all the aforementioned methods and various others (e.g., Elashoff et al., 2012), is that they do not allow for the adjustment of IV-outcome confounding. As such, these methods have somewhat limited applicability for causal inference from observational data. Among the methods that do permit inclusion of covariates, Mark and Robins (1993) considered an accelerated failure time model for the outcome, Cuzick et al. (2007) considered a proportional hazards model and Gong (2008) considered parametric survival models. Nie et al. (2011) utilized the mixture structure implied by the latent compliance model to develop a plug-in non-parametric empirical maximum likelihood estimation approach for the difference between compliers in the treatment and control groups of a randomized trial; the treatment effect was with respect to survival probability at a specific time point. Tchetgen Tchetgen et al. (2015) proposed a control function approach to estimate the difference in hazards at a specific time under an additive hazards model for the outcome; see also Li et al. (2015). Yu et al. (2016) extended the work of Cuzick et al. (2007) to the class of semiparametric linear transformation models. Martinussen et al. (2017a) proposed to estimate the causal effect among the treated using a flexible structural cumulative failure time model (see also Martinussen et al., 2017b). A feature of the methods we propose (that is not shared by the majority of the afore-listed approaches) is the lack of parametrization of the treatment effect.
Of interest in this report is the CACE, which has been studied in several existing works. For instance, Cai et al. (2011) showed that both two-stage predictor substitution (2SPS) and two-stage residual inclusion (2SRI) are biased for the causal odds ratio among compliers. Wan et al. (2015) demonstrated analogous results for the hazard ratio. Recently, Yang et al. (2019) proposed a fix to estimate the local causal hazard ratio using an IV. Note that, with respect to the Average Causal Effect, Wan et al. (2018) showed generally that neither 2SPS nor 2SRI can be directly applied to non-collapsible outcome models; this includes the proportional hazards model. The above-described difficulties in applying IV methods in the survival analysis setting imply the need for treatment effect estimators which are less tied to model specification. As will be described, we address this issue in our work through a CACE estimator which is constructed nonparametrically, rather than summarized by a model parameter.
In this report, we develop estimators of the CACE on the restricted mean survival time (RMST) under unmeasured confounding of the treatment-outcome association using a binary IV analysis. Specifically, we develop estimators of the treatment effect on RMST in a sub-population where assignment to either treatment is possible and compliance to assigned treatment is guaranteed. Being a cumulative treatment effect, the effect of treatment on RMST may be of greater interest than the effect on the hazard function (or the survival function at a specific time point), especially in the presence of a treatment effect that changes over time (Wei and Schaubel, 2008; Schaubel and Wei, 2011). To the best of our knowledge, only one other paper has specifically studied IV analysis of RMST. Kjaersgaard and Parner (2016) proposed a pseudo-outcome approach to determine treatment effects on RMST in a setting with a continuous IV.
Our motivating example involves the comparison between peritoneal dialysis (PD) and haemodialysis (HD), the two most frequently used dialysis modalities, with respect to 5-year RMST among end stage renal disease (ESRD) patients. While many studies have compared HD and PD, results have been conflicting. Some studies have shown PD to be associated with a survival advantage initially but no significant difference afterward (e.g., Fenton et al., 1997; Heaf et al., 2002; Kumar et al., 2014), while others have shown that the mortality rate is higher in patients receiving PD than in those receiving HD (e.g., Kim et al., 2014). A key concern in most of the afore-listed comparisons is strong selection bias; PD patients tend to be healthier, likely in ways not completely captured by the observed covariates. This leads to the question: if unmeasured confounders were accounted for, which dialysis modality would be superior in terms of patient survival?
In an observational setting, the assumption that the instrument of choice is completely randomly assigned might not be valid. For example, the instrument in our setting is the dialysis facility-level variation in PD usage. However, the random assignment requirement may be met after adjusting for a set of observed instrument-outcome confounders, making the instrument conditionally distributed “as good as random”. These measured instrument-outcome confounders can be incorporated by including them in two-stage regression models or through matching. Two-stage regression modeling in the survival setting often requires additional modeling assumptions (Li et al., 2015; Tchetgen Tchetgen et al., 2015). Matching, on the other hand, may be infeasible in the presence of even a moderate number of covariates, as in our setting in which there are > 20 covariates. For such situations, some authors (e.g., Frölich, 2007) have proposed matching using the instrument propensity score, i.e., the conditional probability of assignment to the instrument group encouraging treatment given covariates. An alternative to matching and regression based estimators is provided by inverse weighting based estimators, with weights based on the estimated instrument propensity score. Previously, Tan (2006) proposed an inverse probability of instrument weighted (IPIW) IV estimator, where subjects in each instrument group are weighted by the inverse of the conditional probability of assignment to that instrument group.
In this report, we propose to use weights that tend to produce more efficient estimators than matching or IPIW. Further, unlike matching based estimators, we are able to derive easily implementable asymptotic variance estimators for our proposed treatment effect estimators and thus do not have to rely on resampling based methods.
In Section 2, we describe the notation and assumptions required for our method. We then state the asymptotic properties of our proposed estimators, proofs of which are provided in the Appendix. In Section 3, we present a finite sample evaluation of our estimators. In Section 4, we apply our methods to compare HD and PD modalities using data from the United States Renal Data System (USRDS). Finally, in Section 5, we provide a discussion.
2. Methods
2.1. Notation
Our data consist of n subjects randomized to one of two levels of a binary instrumental variable. Henceforth, we refer to these two levels of the IV as encouragement toward treatment and encouragement toward control. For each subject i, the binary IV is denoted by Zi, with Zi = 1 for subjects randomized to receive encouragement toward treatment and Zi = 0 for subjects randomized to receive encouragement toward control.
We use Rubin’s potential outcomes framework (Rubin, 2005) to define quantities of interest. We first define a vector of potential treatment outcomes for each subject as (Ai(0), Ai(1)), where Ai(0) and Ai(1) denote the treatment that subject i would have received had they been randomized to receive Zi = 0 and Zi = 1, respectively. Then, under the consistency assumption (Rubin, 2005), the observed treatment is Ai = Ai(0)I(Zi = 0) + Ai(1)I(Zi = 1), where I(·) is an indicator function taking the value 1 when its argument is true and 0 otherwise. Based on the vector of potential treatment outcomes, subjects can be grouped into four complier classes: subjects with (Ai(0), Ai(1)) = (0, 1) are compliers (i.e., they receive treatment only if encouraged toward treatment); subjects with (Ai(0), Ai(1)) = (1, 1) are always takers (i.e., they always receive treatment); subjects with (Ai(0), Ai(1)) = (0, 0) are never takers (i.e., they never receive treatment); and subjects with (Ai(0), Ai(1)) = (1, 0) are defiers (i.e., they receive treatment only if encouraged toward control and vice-versa). We further define Ti(z, ai(z)), the potential time-to-event that would be observed if subject i is randomized to Z = z and actually receives treatment ai(z), for all combinations of (z, ai(z)). An IV analysis generally estimates the causal effect of the treatment received on subjects in the complier class, i.e., E{Ti(1, 1) − Ti(0, 0) | Ai(0) = 0, Ai(1) = 1}. This is commonly referred to as the local average treatment effect (LATE) or the complier average causal effect (CACE).
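The compliance classes and the consistency relation above can be illustrated with a short Python sketch. The function names are hypothetical and introduced only for illustration; in practice only one of the two potential treatments is observed for any subject, so the class membership is latent.

```python
def complier_class(a0: int, a1: int) -> str:
    """Map the potential treatments (A(0), A(1)) to a compliance class."""
    classes = {
        (0, 1): "complier",      # treated only if encouraged toward treatment
        (1, 1): "always taker",  # treated regardless of encouragement
        (0, 0): "never taker",   # never treated
        (1, 0): "defier",        # treated only if encouraged toward control
    }
    return classes[(a0, a1)]


def observed_treatment(z: int, a0: int, a1: int) -> int:
    """Consistency: A = A(0) I(Z = 0) + A(1) I(Z = 1)."""
    return a0 * int(z == 0) + a1 * int(z == 1)


# A complier's observed treatment follows the instrument level.
print(complier_class(0, 1), observed_treatment(1, 0, 1), observed_treatment(0, 0, 1))
```

For compliers the observed treatment equals Z, which is why the instrument is informative about the treatment effect in this class.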
Let Ti denote the time-to-event, which is subject to right censoring by Ci. We let Xi be a vector of observed time-independent covariates that, in the absence of adjustment, could potentially confound the instrument-outcome relationship. We let Δi = I(Ti ≤ Ci) be the observed event indicator. The observed data for each subject i then consist of (Ti ∧ Ci, Δi, Ai, Zi, Xi), where Ti ∧ Ci represents the observation time, with a ∧ b = min(a,b). We propose to measure the causal effect in terms of the restricted mean survival time (RMST). The RMST at a fixed (pre-specified) time L is defined as E(T ∧ L). It is required that L ≤ τ, with τ denoting the maximum observation time. Note that RMST is also equal to the area under the survival curve over the interval [0, L]. The causal effect of interest then becomes δ(L) = E{Ti(1, 1) ∧ L − Ti(0, 0) ∧ L | Ai(0) = 0, Ai(1) = 1}.
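The equivalence between E(T ∧ L) and the area under the survival curve over [0, L] can be checked numerically. The following Python sketch is illustrative only: it integrates a right-continuous step survival function, and the function name and toy data are assumptions made for the example.

```python
def rmst_from_step(jump_times, surv_after, L):
    """Area under a step survival curve over [0, L].

    jump_times: increasing times at which the curve drops.
    surv_after: value of S(t) from each jump time until the next jump.
    """
    area, prev_t, height = 0.0, 0.0, 1.0
    for t, s in zip(jump_times, surv_after):
        if t >= L:
            break
        area += height * (t - prev_t)  # rectangle up to this jump
        prev_t, height = t, s
    return area + height * (L - prev_t)  # last rectangle, truncated at L


# Toy data without censoring: the empirical survival curve drops by 1/4
# at each of the four event times.
event_times = [1.0, 2.0, 3.0, 4.0]
surv = [0.75, 0.50, 0.25, 0.0]
L = 3.5
area = rmst_from_step(event_times, surv, L)
mean_restricted = sum(min(t, L) for t in event_times) / len(event_times)
print(area, mean_restricted)
```

In the absence of censoring the two quantities agree exactly; with censoring, the survival curve has to be estimated, which is the role of the weighted estimators developed in Section 2.4.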
2.2. Assumptions
We assume that the observed data vectors are independent and identically distributed across i = 1, …, n. In addition, to estimate the CACE, we make the following six assumptions:
- A1. Stable unit treatment value assumption (SUTVA). Each subject’s potential outcomes are not affected by the randomly assigned encouragement status (IV) of other subjects in the population (‘no interference’). In addition, there is a single value for each level of treatment.
- A2. Independence of instrument. Zi ⊥ {Ai(0), Ai(1), Ti(0, 0), Ti(0, 1), Ti(1, 0), Ti(1, 1)} | Xi. The IV is independent of unmeasured confounders given observed covariates.
- A3. Exclusion Restriction. Ti(0,1) = Ti(1,1), Ti(1,0) = Ti(0,0). The IV can affect the outcome only by affecting the treatment received.
- A4. Non-zero causal effect of Z on A: E{Ai(1) − Ai(0)} > 0.
- A5. Monotonicity: Ai(1) ≥ Ai(0). The existence of the defiers class is ruled out.
- A6. Independent Censoring. Ci ⊥ Ti|Zi, Xi. Censoring is independent of time-to-event given observed covariates and IV.
While the effect of the randomly assigned encouragement status may sometimes be of interest, the goal of an IV analysis is to estimate the causal effect of the treatment actually received. Under assumptions A1-A5, if all event times before L were observed, we can obtain the treatment effect among the complier sub-group (CACE) as:
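$$
\delta(L) = \frac{E_X\{E(T_i \wedge L \mid Z_i = 1, X_i)\} - E_X\{E(T_i \wedge L \mid Z_i = 0, X_i)\}}{E_X\{E(A_i \mid Z_i = 1, X_i)\} - E_X\{E(A_i \mid Z_i = 0, X_i)\}} \tag{1}
$$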
Assumption A6 is required to accommodate censoring, enabling the construction of an inverse probability of censoring weighted estimator which converges in probability to (1).
2.3. Weighting
To account for measured confounders of the instrument-outcome relationship, we re-weight the data using weights based on the instrument propensity score: e(X) = Pr(Z = 1|X). Owing to the balancing property of the propensity score, conditioning on e(X) retains independence of the IV and unmeasured confounders, and using weights based on e(X) sufficiently adjusts for confounding due to X. Furthermore, assumptions A2, A3 and A6 which condition on X, can also be written by conditioning on e(X) instead.
We propose using a matching weight (Li and Greene, 2013),
$$
m(X) = \frac{\min\{e(X),\, 1 - e(X)\}}{Z e(X) + (1 - Z)\{1 - e(X)\}} \tag{2}
$$
developed as a weighting analogue to paired matching on the propensity score. Li and Greene (2013) show that, in the unconfounded setting, the estimator of treatment effect obtained using m(X) is asymptotically equivalent to the estimator from one-to-one paired matching on the propensity score, when matching is done without replacement and within a pre-specified caliper. Essentially, using m(X) then provides a method to make treatment comparisons using all the subjects in the data and thus provides a more efficient alternative to matching where unmatched subjects are discarded. Further, unlike matching, the weighting approach leads to more accurate variance calculation and simpler asymptotic analysis. These appealing attributes motivate us to develop a matching weight (MW) estimator for the setting with unmeasured confounders. The developed estimator can be viewed as a more efficient alternative to the instrument propensity score matching based estimator.
Note that the paired matching estimator in the unconfounded setting estimates a quantity different from the average treatment effect and average treatment effect on the treated (Li and Greene, 2013). When treatment is not evenly distributed in the sample, paired matching creates a cohort where some subjects are discarded due to the unavailability of a similar subject in the opposite treatment group. In the special case where all the treated (or untreated) subjects are retained in the matched cohort, paired matching is able to estimate the average treatment effect among the treated (or untreated). More generally, however, the quantity estimated from pair matched studies can be described formally as the average treatment effect on the evenly matchable population (ATM) (Samuels, 2017). To understand ‘evenly matchable’, it is easiest to consider a single discrete covariate. At each covariate level, subjects of the less frequent treatment level are 1:1 matched without replacement to subjects with the other treatment level. The ATM targets the effect of treatment in the sub-population for whom assignment to either treatment is possible and places a greater weight on subjects in the center of the propensity score distribution. These subjects may often be of most interest when contrasting treatments, as they represent those most likely to be recruited for random assignment in clinical trials (Li and Greene, 2013).
In the setting with unmeasured confounders, paired matching and the matching weight estimators produce an estimate of the CACE on the evenly matchable population across the two treatment assignment (or instrument) levels. To distinguish this quantity from the CACE defined in equation (1), we call it CACE-m. Like the CACE, under assumptions A1 − A5 and in the absence of censoring, the CACE-m can be recovered as:
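$$
\delta_m(L) = \frac{E(T_i \wedge L \mid Z_i = 1, M_i = 1) - E(T_i \wedge L \mid Z_i = 0, M_i = 1)}{E(A_i \mid Z_i = 1, M_i = 1) - E(A_i \mid Z_i = 0, M_i = 1)} \tag{3}
$$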
where Mi = 1 indicates that subject i belongs to the evenly matchable population. In estimating this quantity, explicit identification of the evenly matchable units is bypassed using matching weights or by performing pair matching without replacement on the instrument propensity score. In the simulations in Section 3, we compare these two approaches. At this juncture, it should be noted that, since the MW and IPIW estimators average over different populations in estimating the CACE, the two will tend to have different estimands when the treatment effect is heterogeneous and treatment assignment probability depends strongly on the covariates.
We also compare the performance of the proposed matching weights to that of inverse probability of instrument weights (IPIW). These were first proposed by Tan (2006) and are expressed as: IPIW = [Ze(X) + (1 − Z)(1 − e(X))]⁻¹. The IPIW seek to achieve covariate balance across instrument groups by weighting subjects in each group by the inverse of the probability of being assigned to the observed instrument group. These weights enable the estimation of δ(L), the quantity in equation (1), and thus target a different estimand than the matching weights. A problem with the IPIW is that the weights may become very large when the propensity score approaches 0 or 1, leading to biased and highly inefficient estimates of treatment effect. The matching weights we propose guard against this problem, as they are bounded between 0 and 1.
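The contrast between the two weighting schemes can be made concrete with a small Python sketch. It uses the weight formulas stated in this section and in the table footnotes; the function names and the example propensity score values are illustrative assumptions.

```python
def ipiw(z: int, e: float) -> float:
    """Inverse probability of instrument weight: 1 / Pr(observed Z | X)."""
    return 1.0 / (z * e + (1 - z) * (1.0 - e))


def matching_weight(z: int, e: float) -> float:
    """Matching weight: min{e, 1 - e} / Pr(observed Z | X)."""
    return min(e, 1.0 - e) / (z * e + (1 - z) * (1.0 - e))


# Hypothetical (z, e(X)) pairs, including propensity scores near 0 and 1.
subjects = [(1, 0.90), (0, 0.90), (1, 0.05), (0, 0.50), (0, 0.02)]
for z, e in subjects:
    print(z, e, round(ipiw(z, e), 3), round(matching_weight(z, e), 3))
```

A subject with Z = 1 and e(X) = 0.05 receives an IPIW of about 20 but a matching weight of about 1, which illustrates why the matching weights remain bounded while the IPIW can become very large.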
2.4. Estimation
We now develop an estimator, $\hat{\delta}_m(L)$, of the CACE-m on the RMST in the presence of measured time-independent confounders of the instrument-outcome and instrument-treatment relationship, Xi. We first concentrate on quantities from the numerator of (1).
Let Ni(t) = ΔiI(Ti ≤ t) be the observed event counting process and let Yi(t) = I(Ti ∧ Ci ≥ t) be the at-risk process indicator. Further, define IV level-specific versions dNiz(t) = I(Zi = z)dNi(t) and Yiz(t) = I(Zi = z)Yi(t), where z ∈ {0,1}. If censoring did not depend on Xi (not among our assumptions), then the cumulative hazard Λz(t) when randomly assigned to IV level Z = z could be estimated by the following weighted Nelson-Aalen estimator,
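$$
\hat{\Lambda}_z(t) = \sum_{i=1}^{n} \int_0^t \frac{\hat{m}(X_i)\, dN_{iz}(s)}{\sum_{j=1}^{n} \hat{m}(X_j)\, Y_{jz}(s)} \tag{4}
$$
where $\hat{m}(X_i)$ is defined as in (2), but with e(Xi) replaced by its estimate based on a logistic model for Pr(Zi = 1|Xi), given by $\hat{e}(X_i) = \operatorname{logit}^{-1}(\hat{\beta}^{\top} X_i)$.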
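To adjust for covariate-dependent censoring, we weight each subject’s contribution at time t by the inverse of the probability of being uncensored at time t, i.e., each subject’s contribution in (4) is multiplied by $G_z(t \mid X_i)^{-1}$, where $G_z(t \mid X_i) = \Pr(C_i > t \mid Z_i = z, X_i)$. As the true censoring distribution is unknown in most cases, an estimate can be obtained non-parametrically or by fitting separate Cox proportional hazards models to the censoring distribution at each level of the IV. For example, if the censoring time for subjects assigned to IV level Z = z is modeled using the proportional hazards model $\lambda^{C}_{iz}(t) = \lambda^{C}_{0z}(t) \exp(\theta_z^{\top} X_i)$, then an estimate of $G_z(t \mid X_i)$ when randomly assigned to IV level Z = z is given by $\hat{G}_z(t \mid X_i) = \exp\{-\hat{\Lambda}^{C}_{0z}(t) \exp(\hat{\theta}_z^{\top} X_i)\}$. Thus, in the presence of covariate-dependent censoring, the weighted Nelson-Aalen estimator of the cumulative hazard Λz(t) is given by
$$
\hat{\Lambda}_z(t) = \sum_{i=1}^{n} \int_0^t \frac{\hat{m}(X_i)\, \hat{G}_z(s \mid X_i)^{-1}\, dN_{iz}(s)}{\sum_{j=1}^{n} \hat{m}(X_j)\, \hat{G}_z(s \mid X_j)^{-1}\, Y_{jz}(s)} \tag{5}
$$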
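Let μT,z(L) = E[Ti ∧ L|Zi = z,Mi = 1] denote the average RMST at IV level Z = z among evenly matchable units. An estimator of the numerator in equation (3) is then given by $\hat{\mu}_{T,1}(L) - \hat{\mu}_{T,0}(L)$, where $\hat{\mu}_{T,z}(L) = \int_0^L \hat{S}_z(t)\, dt$, with $\hat{S}_z(t) = \exp\{-\hat{\Lambda}_z(t)\}$.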
The denominator in (3) is estimated as the difference in the weighted average of actual treatment received between the two IV levels. Let μA,z = E[Ai|Zi = z,Mi = 1], then an estimate of the denominator in (3) is given by:
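$$
\hat{\mu}_{A,1} - \hat{\mu}_{A,0}, \qquad \hat{\mu}_{A,z} = \frac{\sum_{i=1}^{n} \hat{m}(X_i)\, I(Z_i = z)\, A_i}{\sum_{i=1}^{n} \hat{m}(X_i)\, I(Z_i = z)} \tag{6}
$$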
Thus, an estimate of the complier average causal effect in the evenly matchable population is given by:
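$$
\hat{\delta}_m(L) = \frac{\hat{\mu}_{T,1}(L) - \hat{\mu}_{T,0}(L)}{\hat{\mu}_{A,1} - \hat{\mu}_{A,0}} \tag{7}
$$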
Note that by replacing the MW with the IPIW in equations (6) - (10), we can obtain an estimator, $\hat{\delta}(L)$, of the CACE, δ(L), in equation (1). Likewise, with $\hat{m}(X_i)$ set equal to the IPIW instead of the matching weights, the subsequently stated theorems can be used to summarize the asymptotic behavior of the estimator $\hat{\delta}(L)$.
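The estimation steps in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the instrument propensity scores are taken as already estimated, the inverse probability of censoring weights are omitted (so the sketch corresponds to the case where censoring does not depend on covariates), and the survival function is taken to be exp{−Λ(t)}. All function names are hypothetical.

```python
import math


def weighted_nelson_aalen(times, events, weights):
    """Weighted Nelson-Aalen: returns (event_times, cumulative hazard)."""
    event_times = sorted({t for t, d in zip(times, events) if d == 1})
    cumhaz, total = [], 0.0
    for t in event_times:
        d = sum(w for u, dl, w in zip(times, events, weights) if u == t and dl == 1)
        y = sum(w for u, w in zip(times, weights) if u >= t)  # weighted at-risk set
        total += d / y
        cumhaz.append(total)
    return event_times, cumhaz


def rmst_from_cumhaz(event_times, cumhaz, L):
    """Integrate S(t) = exp(-Lambda(t)) over [0, L] as a step function."""
    area, prev_t, surv = 0.0, 0.0, 1.0
    for t, h in zip(event_times, cumhaz):
        if t >= L:
            break
        area += surv * (t - prev_t)
        prev_t, surv = t, math.exp(-h)
    return area + surv * (L - prev_t)


def cace_m(times, events, treat, z, e_hat, L):
    """Ratio of matching-weighted RMST and treatment differences across IV levels."""
    w = [min(e, 1 - e) / (zi * e + (1 - zi) * (1 - e)) for zi, e in zip(z, e_hat)]
    mu_t, mu_a = {}, {}
    for arm in (0, 1):
        idx = [i for i, zi in enumerate(z) if zi == arm]
        t_k, cum = weighted_nelson_aalen(
            [times[i] for i in idx], [events[i] for i in idx], [w[i] for i in idx]
        )
        mu_t[arm] = rmst_from_cumhaz(t_k, cum, L)
        mu_a[arm] = sum(w[i] * treat[i] for i in idx) / sum(w[i] for i in idx)
    return (mu_t[1] - mu_t[0]) / (mu_a[1] - mu_a[0])


# Toy data: full compliance, constant propensity score, no censoring.
est = cace_m(
    times=[2.0, 4.0, 1.0, 3.0], events=[1, 1, 1, 1], treat=[1, 1, 0, 0],
    z=[1, 1, 0, 0], e_hat=[0.5, 0.5, 0.5, 0.5], L=5.0,
)
print(est)
```

Swapping the weight formula inside `cace_m` for the IPIW would give the corresponding sketch for δ(L). Handling covariate-dependent censoring would additionally require time-varying censoring weights inside the Nelson-Aalen sums.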
2.5. Asymptotic Properties
The following two theorems summarize the asymptotic behavior of our proposed estimator.
THEOREM 1: Under assumed regularity conditions (a.) to (g.) in Appendix, $\hat{\Lambda}_z(t)$ converges almost surely and uniformly to Λz(t) for t ∈ [0, τ], and $n^{1/2}\{\hat{\Lambda}_z(t) - \Lambda_z(t)\}$ converges asymptotically to a zero mean Gaussian process with covariance function σz(s,t) = E{Φiz(s)Φiz(t)}, where Φiz(t) = Φiz1(t) + Φiz2(t) + Φiz3(t) + Φiz4(t); and
and represent influence functions for the IV assignment and censoring models respectively. We refer to the Appendix for a detailed proof which utilizes some of the ideas in Schaubel and Wei (2011).
THEOREM 2: Under regularity conditions (a.) - (g.) in Appendix, $\hat{\delta}_m(t)$ converges almost surely to δm(t) for t ∈ [0, τ], and $n^{1/2}\{\hat{\delta}_m(t) - \delta_m(t)\}$ converges asymptotically to a mean zero Gaussian process with covariance function , where ; with and .
A detailed proof with definitions for QT and Σi(t) is given in the Appendix. The covariance function is estimated by replacing the limiting values with their empirical counterparts.
3. Simulation Studies
We conducted simulation studies to assess the finite sample properties of the proposed estimator and the associated asymptotic standard error estimator. We also demonstrate the benefits of doing an IV analysis over a ‘naive’ analysis which assumes no unmeasured confounding. In a ‘naive’ analysis, average treatment effect is estimated by adjusting only for the observed covariates. Treatment comparisons can still be made by using propensity score matching or inverse weighting based methods, but (instead of the instrument propensity score) the treatment propensity score (i.e., probability of receiving treatment) is used. For example, to obtain a matching based estimator of the average treatment effect in the ‘naive’ analysis, subjects are matched across treatment groups on the treatment propensity score and the difference estimates of RMST between treated and untreated subjects is averaged across matched sets. Similarly, the inverse-weighting based estimator of the treatment effect is simply equal to the difference in inverse weighted estimates of RMST in the treatment and untreated groups, using weights based on the treatment propensity score.
For our simulations, under assumptions A1 − A6, 1000 datasets were generated for a setting with two observed instrument-outcome confounders, {X1,X2}, and one unmeasured confounder, Xu. For each subject, X2 was generated from a Bern(0.6) distribution and X1 and Xu were generated from two separate univariate N(0,0.5) distributions. Given covariate values, the IV level for each subject was generated from the logistic model, Pr(Z = 1 | X1, X2) = logit⁻¹(−0.5 + 3X1 + X2). Each subject was then assigned to be a complier, never taker or always taker based on the value of their unmeasured confounder. Specifically, the actual treatment receipt status was generated as: A = 1×I(Xu > 0.5) + 0×I(Xu < −0.5) + Z×I(−0.5 ≤ Xu ≤ 0.5), where A = 1 denotes treated and A = 0 denotes untreated. Event times T were then generated from an exponential model with rate λT = 0.01exp{−2 − A − 0.5X1 + 0.5X2 − 0.25Xu}. Censoring times C were generated from an exponential model with rate λc = 0.01exp{−θc0 − 1.5X2}, where θc0 was set to {1.25, 2.5} to correspond to a high (~ 52%) and moderate (~ 35%) level of censoring. For each censoring scenario, the performance of the estimators was evaluated at sample sizes n = 500, 1000, 2000. CACE and CACE-m were estimated at L = 1825 (i.e., 5 years, if the time scale were in days). Note that, since the target estimands for the MW and IPIW methods can be unequal, we used the estimator-specific estimands to compute bias and MSE.
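The data-generating mechanism described above can be sketched in Python using only the standard library. This is an illustrative reading of the text rather than the authors' simulation code: in particular, N(0,0.5) is interpreted here as a normal distribution with variance 0.5, which is an assumption, and the function and field names are hypothetical.

```python
import math
import random


def expit(v: float) -> float:
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-v))


def simulate_subject(rng: random.Random, theta_c0: float = 2.5, xu_coef: float = -0.25):
    """Generate one subject; theta_c0 = 2.5 is the moderate censoring scenario."""
    sd = math.sqrt(0.5)  # assumption: 0.5 is read as the variance
    x1 = rng.gauss(0.0, sd)
    xu = rng.gauss(0.0, sd)  # unmeasured confounder
    x2 = 1 if rng.random() < 0.6 else 0
    z = 1 if rng.random() < expit(-0.5 + 3.0 * x1 + x2) else 0
    if xu > 0.5:
        a = 1  # always taker
    elif xu < -0.5:
        a = 0  # never taker
    else:
        a = z  # complier
    rate_t = 0.01 * math.exp(-2.0 - a - 0.5 * x1 + 0.5 * x2 + xu_coef * xu)
    rate_c = 0.01 * math.exp(-theta_c0 - 1.5 * x2)
    t = rng.expovariate(rate_t)
    c = rng.expovariate(rate_c)
    return {"x1": x1, "x2": x2, "xu": xu, "z": z, "a": a,
            "time": min(t, c), "delta": int(t <= c)}


rng = random.Random(2024)
data = [simulate_subject(rng) for _ in range(2000)]
share_compliers = sum(abs(d["xu"]) <= 0.5 for d in data) / len(data)
print(round(share_compliers, 3))
```

Setting `xu_coef=-2.0` would correspond to the scenario with a stronger unmeasured confounder effect considered later in this section, and `theta_c0=1.25` to the high censoring scenario.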
A comparison of the different weighting estimators and a propensity score matching approach, in both an IV analysis and a naive analysis, at different sample sizes with ~ 35% and ~ 52% censoring is displayed in Tables 1 and 2, respectively. As expected, a naive analysis that ignored unmeasured confounding produced systematically biased estimates at all sample sizes, with absolute relative bias ranging between 7% and 13%. For the IV analysis, the proposed MW estimators outperformed the matching based estimator with respect to efficiency, measured in terms of mean squared error (MSE), at all sample sizes and both levels of censoring, with MSE ranging from 0.75 to 0.86 times that of the matching based estimators. The IPIW estimators, on the other hand, were generally less efficient than matching, with relative MSE above 1 in all but one scenario (N = 500 with ~ 52% censoring). Note that bias and MSE were computed relative to each estimator’s estimand.
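The performance summaries reported in the tables can be computed along the following lines. This Python sketch assumes that percent bias is the mean estimation error as a percentage of the estimator-specific estimand, and that relative MSE is the ratio of an estimator's MSE to that of the matching estimator; the function names and the toy numbers are illustrative and are not simulation output.

```python
import statistics


def percent_bias(estimates, truth):
    """Mean estimation error as a percentage of the target estimand."""
    return 100.0 * (statistics.fmean(estimates) - truth) / truth


def mse(estimates, truth):
    """Mean squared error around the target estimand."""
    return statistics.fmean([(est - truth) ** 2 for est in estimates])


def relative_mse(estimates, truth, ref_estimates, ref_truth):
    """MSE relative to a reference estimator (here, matching)."""
    return mse(estimates, truth) / mse(ref_estimates, ref_truth)


# Toy replicate estimates for two estimators with different estimands.
mw_estimates, mw_truth = [490.0, 510.0, 505.0], 500.0
match_estimates, match_truth = [480.0, 520.0, 500.0], 500.0
print(percent_bias(mw_estimates, mw_truth))
print(relative_mse(mw_estimates, mw_truth, match_estimates, match_truth))
```

Because each estimator is compared with its own estimand, a relative MSE below 1 reflects a gain in efficiency and not a difference in the target quantity.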
Table 1.
Simulation results: Proposed IV estimators and propensity score matching with ≈ 35% censored before L = 1825 and δ(L) = 501, δm(L) = 502
| Method | Naive Analysis: Percent Bias | Naive Analysis: ESD | IV - Correct PS: Percent Bias | IV - Correct PS: Relative MSE | IV - Correct PS: ESD | IV - Correct PS: ASE | IV - Correct PS: CP |
|---|---|---|---|---|---|---|---|
| N = 500 | |||||||
| IPTW / IPIW | −12 | 63 | 1 | 1.06 | 124 | 129 | 0.96 |
| MW | −11 | 59 | 1 | 0.77 | 105 | 114 | 0.97 |
| Matching | −12 | 67 | 1 | 1 | 120 | – | – |
| N = 1000 | |||||||
| IPTW / IPIW | −11 | 44 | 1 | 1.27 | 92 | 100 | 0.97 |
| MW | −11 | 42 | 1 | 0.83 | 74 | 82 | 0.97 |
| Matching | −12 | 46 | −1 | 1 | 82 | – | – |
| N = 2000 | |||||||
| IPTW / IPIW | −11 | 31 | 1 | 1.26 | 62 | 68 | 0.98 |
| MW | −11 | 30 | 2 | 0.86 | 51 | 58 | 0.98 |
| Matching | −13 | 33 | −1 | 1 | 55 | – | – |
For Naive Analysis
IPTW = {Ae(X) + (1 − A)(1 − e(X))}⁻¹
MW - Matching Weight = min{e(X), (1 − e(X))}{Ae(X) + (1 − A)(1 − e(X))}⁻¹
For IV Analysis
IPIW = {Ze(X) + (1 − Z)(1 − e(X))}⁻¹
MW - Matching Weight = min{e(X), (1 − e(X))}{Ze(X) + (1 − Z)(1 − e(X))}⁻¹
Table 2.
Simulation results: Proposed IV estimators and propensity score matching with ≈ 52% censored before L = 1825 and δ(L) = 501, δm(L) = 502
| Method | Naive Analysis: Percent Bias | Naive Analysis: ESD | IV Analysis: Percent Bias | IV Analysis: Relative MSE | IV Analysis: ESD | IV Analysis: ASE | IV Analysis: CP |
|---|---|---|---|---|---|---|---|
| N = 500 | |||||||
| IPTW / IPIW | −9 | 76 | 0 | 0.97 | 148 | 132 | 0.92 |
| MW | −8 | 73 | 1 | 0.75 | 130 | 127 | 0.94 |
| Matching | −7 | 84 | 5 | 1 | 149 | – | – |
| N = 1000 | |||||||
| IPTW / IPIW | −9 | 55 | 1 | 1.15 | 112 | 104 | 0.94 |
| MW | −9 | 52 | 2 | 0.8 | 93 | 93 | 0.94 |
| Matching | −10 | 58 | 1 | 1 | 104 | – | – |
| N = 2000 | |||||||
| IPTW / IPIW | −10 | 39 | 1 | 1.15 | 78 | 77 | 0.96 |
| MW | −10 | 38 | 3 | 0.83 | 65 | 68 | 0.95 |
| Matching | −12 | 42 | 0 | 1 | 73 | – | – |
For Naive Analysis
IPTW = {Ae(X) + (1 − A)(1 − e(X))}⁻¹
MW - Matching Weight = min{e(X), (1 − e(X))}{Ae(X) + (1 − A)(1 − e(X))}⁻¹
For IV Analysis
IPIW = {Ze(X) + (1 − Z)(1 − e(X))}⁻¹
MW - Matching Weight = min{e(X), (1 − e(X))}{Ze(X) + (1 − Z)(1 − e(X))}⁻¹
For evaluation in a setting with a stronger unmeasured confounder effect on the outcome, event times T were generated from an exponential model with rate λT = 0.01 exp{−2 − A − 0.5X1 + 0.5X2 − 2Xu}. In the results presented in Table 3, we see that the matching weight estimators outperform the matching estimators in terms of relative MSE. Further, as observed earlier, the CACE-m estimators seem to be more efficient than the CACE estimators obtained from IPIW. With stronger unmeasured confounding, the naive analysis becomes heavily biased, with absolute percent bias ranging between 63% and 78%.
Table 3.
Simulation results: Proposed IV estimators and propensity score matching with ≈ 52% censored before L = 1825 and δ(L) = 460, δm(L) = 461
| Method | Naive Analysis: Percent Bias | Naive Analysis: ESD | IV Analysis: Percent Bias | IV Analysis: Relative MSE | IV Analysis: ESD | IV Analysis: ASE | IV Analysis: CP |
|---|---|---|---|---|---|---|---|
| N = 500 | |||||||
| IPTW / IPIW | −73 | 80 | −2 | 0.85 | 160 | 141 | 0.93 |
| MW | −68 | 80 | −2 | 0.69 | 143 | 133 | 0.94 |
| Matching | −63 | 95 | 5 | 1 | 172 | – | – |
| N = 1000 | |||||||
| IPTW / IPIW | −77 | 58 | 0 | 0.89 | 119 | 113 | 0.93 |
| MW | −72 | 57 | 1 | 0.68 | 105 | 100 | 0.93 |
| Matching | −69 | 68 | 1 | 1 | 127 | – | – |
| N = 2000 | |||||||
| IPTW / IPIW | −78 | 40 | 2 | 1.05 | 85 | 86 | 0.95 |
| MW | −73 | 40 | 3 | 0.78 | 72 | 73 | 0.95 |
| Matching | −73 | 45 | 0 | 1 | 83 | – | – |
For Naive Analysis
IPTW = {Ae(X) + (1 − A)(1 − e(X))}⁻¹
MW - Matching Weight = min{e(X), (1 − e(X))}{Ae(X) + (1 − A)(1 − e(X))}⁻¹
For IV Analysis
IPIW = {Ze(X) + (1 − Z)(1 − e(X))}⁻¹
MW - Matching Weight = min{e(X), (1 − e(X))}{Ze(X) + (1 − Z)(1 − e(X))}⁻¹
In Tables 1–3, we also show that the proposed asymptotic variance estimator provided 95% confidence intervals with empirical coverage probabilities between 0.92 and 0.98 across all scenarios. Asymptotic variance estimators, and thus coverage probability estimates, are not available for the matching estimators.
4. Application
We applied our methods to compare HD and PD modalities for end stage renal disease (ESRD) patients using data from the United States Renal Data System (USRDS). Previous studies of this question have yielded conflicting results, providing no conclusive evidence for or against the use of PD. This suggests that an IV analysis might be in order to adequately address unmeasured treatment-outcome confounding and provide some valuable insight into the problem.
We conducted an IV analysis to estimate the effect of dialysis modality on the restricted mean survival at 5 years (L = 1825 days) since ESRD incidence. Our study population consisted of incident dialysis patients initiating dialysis between 01/01/2009 and 12/31/2014. A potential instrument is the dialysis facility-level variation in PD usage, defined as the facility-specific proportion of patients initiating dialysis with PD. We used a dichotomized version of this instrument, with patients in facilities with PD usage above the national average defined as receiving the instrument of encouragement toward PD. Owing to the nature of our analysis we excluded small dialysis facilities defined as having < 10 PD patients and < 50 patients in total. After this step, our study cohort had 164,837 patients distributed across 929 dialysis facilities in the United States. To avoid introducing patient-level confounding between the instrument and unmeasured confounders, historical data from 2006–08 was used to determine PD usage. The mean PD usage rate within facility varied from 1.8% to 54.6% with a mean of 14.5% and median of 12.5%. The correlation between facility-level mean PD usage in 2006–08 and 2009–14 was 0.57, and the dichotomized PD encouragement status was significantly associated (β = 0.1, p < 0.0001) with individual PD usage in a model adjusting for available patient-level covariates, suggesting the potential for a good instrument.
To check whether there was an obvious violation of the exclusion restriction assumption, we carried out a graphical assessment of the association between the instrument and outcome given treatment. In Figure 1, we show that the covariate-adjusted standardized mortality ratio at each dialysis facility for patients receiving HD is unrelated to the proportion of patients receiving PD in that facility. This indicates that dialysis facility may indeed be a good instrument: among patients not receiving treatment (i.e., HD patients), the instrument, facility-level PD usage, has no apparent effect on the outcome of interest, mortality.
Figure 1: Assessing exclusion restriction assumption for facility instrument
Table 4 presents a comparison of patients initiating dialysis on PD and HD with respect to age, comorbidities and primary renal diagnosis. On average, PD patients were about 5 years younger and healthier, in terms of having fewer comorbidities, than HD patients. While these patient-level factors are observed and can be adjusted for in a regression analysis, it is plausible that other unmeasured patient-level confounders influence both the choice of dialysis modality and survival, thus necessitating an IV analysis. The likelihood of unmeasured confounding seems greater given that every comorbidity in Table 4 is more prevalent among HD than PD patients.
Table 4.
Analysis of USRDS Data: Description of Study Cohort by Dialysis Modality
| Covariate | Haemodialysis | Peritoneal Dialysis | Standardized Difference |
|---|---|---|---|
| Percent Died | 53 | 36 | −36.3 |
| Age (Years) | 63.6 | 58.1 | −36.5 |
| Primary Renal Diagnosis | | | |
| Diabetes | 46 | 43 | −5.9 |
| Hypertension | 28 | 26 | −4.6 |
| Glomerulonephritis | 8 | 15 | 22.3 |
| Other | 17 | 15 | −5.5 |
| Comorbidities | | | |
| Alcohol Use | 2 | 1 | −11.6 |
| ASHD | 21 | 13 | −21.3 |
| Cancer | 8 | 5 | −11.9 |
| CHF | 33 | 16 | −39.5 |
| COPD | 10 | 4 | −22.4 |
| CVA | 10 | 6 | −13.7 |
| Diabetes | 11 | 7 | −12.1 |
| Drug Use | 1 | 0 | −10.7 |
| PVD | 14 | 9 | −17.4 |
| Tobacco Use | 7 | 6 | −2.9 |
ASHD - atherosclerotic heart disease
CHF - congestive heart failure
COPD - chronic obstructive pulmonary disease
CVA - cerebrovascular accident
PVD - peripheral vascular disease
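For readers unfamiliar with the last column of Table 4, the sketch below computes a standardized difference for a binary covariate using the conventional definition (difference in proportions divided by the square root of the average of the two group variances, times 100). This section does not state the exact formula used for the table, so the definition and the function name are assumptions; the sign convention (PD minus HD) is inferred from the signs in the table.

```python
import math

def std_diff_binary(p_pd, p_hd):
    """Standardized difference (in percent) for a binary covariate.

    p_pd, p_hd: covariate prevalence (as proportions) in the PD and HD groups.
    Assumed conventional definition: 100 * (p_pd - p_hd) / sqrt(average variance).
    """
    avg_var = (p_pd * (1 - p_pd) + p_hd * (1 - p_hd)) / 2
    return 100 * (p_pd - p_hd) / math.sqrt(avg_var)

# Rounded prevalences taken from Table 4.
chf = std_diff_binary(0.16, 0.33)    # CHF: 16% (PD) vs 33% (HD)
glom = std_diff_binary(0.15, 0.08)   # Glomerulonephritis: 15% (PD) vs 8% (HD)
print(round(chf, 1), round(glom, 1))
```

With the rounded percentages as inputs this gives about -40.3 for CHF and 22.1 for glomerulonephritis, close to but not identical to the tabulated -39.5 and 22.3, as one would expect when starting from rounded inputs and an assumed formula.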
Based on prior evidence of their importance as predictors, we included the year of ESRD incidence, gender, race, ethnicity (Hispanic or not), age at dialysis initiation, and binary indicators of important comorbidities and primary renal diagnosis (Table 4) as covariates in the logistic regression model for the instrument propensity score. Censoring was either administrative or due to receiving a kidney transplant. The censoring distribution at each level of the instrument was estimated using separately fitted Cox proportional hazards regression models including all the aforementioned patient-level covariates. The estimated propensity scores and censoring probabilities were then used to construct the weighted estimator in equation (7).
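To make the role of the instrument propensity score concrete, the following sketch contrasts the inverse probability of instrument weight with the matching weight of Li and Greene (2013), taking the estimated score e = P(Z = 1 | X) as given. It is an illustration under that assumption only: the function names are hypothetical, and the censoring weights and the full estimator in equation (7) are not reproduced here.

```python
def ipiw(z, e):
    """Inverse probability of instrument weight: 1 / P(Z = z | X)."""
    return 1.0 / (e if z == 1 else 1.0 - e)

def matching_weight(z, e):
    """Matching weight (Li and Greene, 2013): min(e, 1 - e) / P(Z = z | X)."""
    return min(e, 1.0 - e) * ipiw(z, e)

# (instrument value, instrument propensity score) pairs, invented for illustration.
examples = [(1, 0.50), (1, 0.02), (0, 0.02), (0, 0.90)]
weights = [(z, e, ipiw(z, e), matching_weight(z, e)) for z, e in examples]
for z, e, w_ipiw, w_mw in weights:
    print(f"Z={z} e={e:.2f}  IPIW={w_ipiw:7.3f}  MW={w_mw:.3f}")
```

A subject with Z = 1 and e = 0.02 receives an IPIW of 50 but a matching weight of 1, which illustrates why weights that are capped at 1 are less affected by propensity scores near 0 or 1.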
In Table 5 we present the results from the IV analysis and the corresponding ‘naive’ analysis for each of the proposed inverse-weighting based estimators. For the ‘naive’ analysis, we estimated the causal effect as the difference in inverse weighted estimates of RMST in the treatment and control groups, using weights based on the treatment propensity score; this ignores the presence of unmeasured confounders. Both the IV and naive analyses showed the use of PD to be beneficial. The IV analysis indicated that initiating dialysis using PD may lead to a gain in 5-year RMST of nearly 3 months. The naive analysis indicated a slightly higher benefit of nearly 4 months for using PD. As shown in Table 4, the PD patient population was younger and healthier as measured by the presence of observed comorbidities. It is plausible that other unmeasured health characteristics of the PD population contributed to the slightly higher benefit of PD seen in the naive analysis.
Table 5.
Analysis of USRDS Data: Results from IV and Naive analysis
| Method | IV Analysis: Estimate (days) | IV Analysis: 95% Interval | Naive Analysis: Estimate (days) | Naive Analysis: 95% Interval |
|---|---|---|---|---|
| IPTW / IPIW | 87.5 | (17.8, 157.1) | 118.4 | (107.5, 129.1) |
| MW | 88.5 | (18.1, 158.8) | 102.0 | (88.5, 115.5) |
| Matching | 69.9 | (−12.0, 151.8) | 115.6 | (103.3, 127.9) |
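The estimates in Table 5 are differences in RMST, that is, differences in the area under the survival curve up to the horizon L. As a reminder of what that quantity is, the sketch below computes an RMST from an unweighted Kaplan-Meier curve on toy data. It is deliberately not the proposed estimator: it uses no instrument or censoring weights, and the function names and data are invented for illustration.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time (events: 1 = death, 0 = censored)."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= leaving
    return curve

def rmst(curve, horizon):
    """Area under the step survival curve on [0, horizon]."""
    area, prev_t, prev_s = 0.0, 0.0, 1.0
    for t, s in curve:
        if t >= horizon:
            break
        area += prev_s * (t - prev_t)
        prev_t, prev_s = t, s
    return area + prev_s * (horizon - prev_t)

# Toy follow-up times in days; the second subject is censored at day 200.
curve = kaplan_meier([100, 200, 300, 400], [1, 0, 1, 1])
print(curve)              # [(100, 0.75), (300, 0.375), (400, 0.0)]
print(rmst(curve, 365))   # 274.375
```

An RMST difference is read on the time scale of the data, so with follow-up in days an estimate of 87.5 corresponds to roughly 2.9 months.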
For the sake of comparison, we also performed a naive Cox regression analysis that ignored any unmeasured confounding. In this analysis, we estimated the treatment effect via a Cox proportional hazards regression model which included the aforementioned covariates. The model yielded a hazard ratio of 0.75 (95% CI: 0.74, 0.77) for treatment, also suggesting a strong benefit of initiating dialysis using PD.
5. Discussion
In this report, we develop a weighted estimator of the complier average causal effect on the restricted mean survival time. The proposed method addresses unmeasured confounding of the treatment-outcome relationship in the censored time-to-event setting. A unique feature of our approach is that it accommodates observed instrument-outcome confounding, such that one only needs to assume that the instrument is randomly assigned conditional on some observed covariates. This makes the proposed estimator particularly suited for causal inference from observational studies. The weights we propose to use, namely the matching weights (MW), tend to outperform the inverse probability of instrument weights (IPIW) proposed by Tan (2006) and propensity score matching in terms of MSE, even in the presence of moderate variability in the instrument propensity score. This is mainly because, unlike the IPIW, the MW are bounded between 0 and 1 and are thus less sensitive to extreme propensity scores. Further, an advantage of using the proposed weighted estimators over matching is the availability of easily implementable asymptotic variance estimators, which are derived in this report. Future research could concentrate on improving these estimators further by developing a doubly robust version of the MW based estimators.
To the best of our knowledge, this is the first study of a binary IV analysis of RMST. The only other study of an IV analysis method for estimating causal effects on the RMST (Kjaersgaard and Parner, 2016) was in the context of a continuous IV and used a pseudo-observation approach. As Kjaersgaard and Parner (2016) point out, a major limitation of the pseudo-observation approach is that censoring is required to be independent of covariates and instrument. This limitation matters in many practical settings, including our motivating example. For example, calendar period of dialysis initiation naturally affects the censoring distribution. Moreover, many patient characteristics affect the probability of receiving a kidney transplant. In comparison, the proposed method only requires the assumption that censoring and survival times are independent conditional on covariates and instrument. Future research could concentrate on developing methods that further relax this assumption by allowing censoring to depend on treatment. This is especially useful in our application where censoring, often due to termination of dialysis, is known to depend on some morbidity covariates which are also risk factors for mortality.
It is worth noting that the semiparametric structural cumulative failure time model proposed by Martinussen et al. (2017) can be utilized to estimate causal effects on the RMST scale. The authors note that an assumption of no interaction between the confounder and treatment may be required for an interpretation of the parameters and thereby the estimated causal effect. In comparison, our approach does not assume a specific model structure for the failure time outcome, let alone require additional assumptions for interpretation of the parameters. In addition, we focus on estimating the causal effect in a matchable population, which could be, but is not always, the same as the treated population.
Acknowledgements
This work was supported in part by National Institutes of Health Grant R01-DK070869. The data reported here have been supplied by the United States Renal Data System (USRDS), which is funded by the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK), through National Institutes of Health (NIH) contract HHSN276201400001C. The interpretation and reporting of these data are the responsibility of the authors and in no way should be seen as an official policy or interpretation of the United States government. The funder had no role in the study design, data collection, data analysis, data interpretation, or writing of the report.
Footnotes
Conflict of Interest
The authors have declared no conflict of interest.
Supporting Information for this article is available from the first author.
References
- Angrist JD, Imbens GW and Rubin DB (1996). Identification of causal effects using instrumental variables. Journal of the American Statistical Association 91, 444–455.
- Baker SG (1998). Analysis of survival data from randomized trial with all-or-none compliance: Estimating the cost-effectiveness of a cancer screening program. Journal of the American Statistical Association 93, 929–934.
- Cai B, Small DS, and Ten Have TR (2011). Two-stage instrumental variable methods for estimating the causal odds ratio: Analysis of bias. Statistics in Medicine 30, 1809–1824.
- Cuzick J, Sasieni P, Myles J, and Tyler J (2007). Estimating the Effect of Treatment in a Proportional Hazards Model in the Presence of Non-compliance and Contamination. Journal of the Royal Statistical Society, Series B (Methodological) 69, 565–588.
- Elashoff R, Li G, Zhou Y (2012). Nonparametric inference for assessing treatment efficacy in randomized clinical trials with a time-to-event outcome and all-or-none compliance. Biometrika 99(2), 393–404.
- Fenton SA, Schaubel DE, Desmeules M, Morrison HI, Mao Y, Copleston P, Jeffery JR, and Kjellstrand CM (1997). Hemodialysis versus peritoneal dialysis: a comparison of adjusted mortality rates. American Journal of Kidney Diseases 30(3), 334–342.
- Frolich M (2007). Nonparametric IV estimation of local average treatment effects with covariates. Journal of Econometrics 139, 35–75.
- Heaf JG, Løkkegaard H, and Madsen M (2002). Initial survival advantage of peritoneal dialysis relative to haemodialysis. Nephrology Dialysis Transplantation 17(1), 112–117.
- Imbens GW, and Angrist JD (1994). Identification and estimation of local average treatment effects. Econometrica 62, 467–475.
- Kjaersgaard MI, and Parner ET (2016). Instrumental variable method for time-to-event data using a pseudo-observation approach. Biometrics 72(2), 463–472.
- Kim H, Kim KH, Park K, Kang S-W, Yoo T-H, Ahn SV, Ahn HS, Hann HJ, Lee S, Ryu J-H, et al. (2014). A population-based approach indicates an overall higher patient mortality with peritoneal dialysis compared to hemodialysis in Korea. Kidney International 86(5), 991–1000.
- Kumar VA, Sidell MA, Jones JP, and Vonesh EF (2014). Survival of propensity matched incident peritoneal and hemodialysis patients in a United States health care system. Kidney International 86(5), 1016–1022.
- Li F, Morgan KL, and Zaslavsky AM (2016). Balancing covariates via propensity score weighting. Journal of the American Statistical Association to appear.
- Li J, Fine JP and Brookhart MA (2015). Instrumental variable additive hazards models. Biometrics 71, 122–130.
- Li L, and Greene T (2013). A Weighting Analogue to Pair Matching in Propensity Score Analysis. International Journal of Biostatistics 9, 1–20.
- Mark SD, Robins JM (1993). Estimating the causal effect of smoking cessation in the presence of confounding factors using a rank preserving structural failure time model. Statistics in Medicine 12, 1605–1628.
- Martinussen T, Vansteelandt S, Tchetgen Tchetgen EJ, and Zucker DM (2017). Instrumental variables estimation of exposure effects on a time-to-event endpoint using structural cumulative survival models. Biometrics 73, 1140–1149.
- Martinussen T, Nørbo Sørensen D, and Vansteelandt S (2017). Instrumental variables estimation under a structural Cox model. Biostatistics 20, 65–79.
- Nie H, Cheng J, Small D (2011). Inference for the effect of treatment on survival probability in randomized trials with noncompliance and administrative censoring. Biometrics 67, 1397–1405.
- Richardson A, Hudgens M, Fine JP, and Brookhart MA (2017). Nonparametric binary instrumental variable analysis of competing risks data. Biostatistics 18, 48–61.
- Robins JM and Tsiatis AA (1991). Correcting for non-compliance in randomized trials using rank preserving structural failure time models. Communications in Statistics - Theory and Methods 20, 2609–2631.
- Samuels LR (2017). Aspects of Causal Inference within the Evenly Matchable Population: The Average Treatment Effect on the Evenly Matchable Units, Visually Guided Cohort Selection, and Bagged One-to-One Matching. PhD dissertation, Vanderbilt University.
- Schaubel DE, and Wei G (2011). Double inverse-weighted estimation of cumulative treatment effects under non-proportional hazards and dependent censoring. Biometrics 67, 29–38.
- Tan Z (2006). Regression and weighting methods for causal inference using instrumental variables. Journal of the American Statistical Association 101, 1607–1618.
- Tchetgen Tchetgen EJ, Walter S, Vansteelandt S, Martinussen T and Glymour M (2015). Instrumental variable estimation in a survival context. Epidemiology 26, 402–410.
- Wan F, Small D, Bekelman JE, and Mitra N (2015). Bias in estimating the causal hazard ratio when using two-stage instrumental variable methods. Statistics in Medicine 34, 2235–2265.
- Wan F, Small D, and Mitra N (2018). A general approach to evaluating the bias of 2-stage instrumental variable estimators. Statistics in Medicine 37, 1997–2015.
- Yang F, Cheng J, and Huo D (2019). Instrumental variable approach for estimating a causal hazard ratio: application to the effect of postmastectomy radiotherapy on breast cancer patients. Observational Studies 5, 141–162.
- Yu W, Chen K, Sobel M, Ying Z (2015). Semiparametric transformation models for causal inference in time to event studies with all-or-nothing compliance. Journal of the Royal Statistical Society, Series B (Methodological) 77(2), 397–415.