Abstract
Censored survival data are common in clinical trial studies. We propose a unified framework for sensitivity analysis to censoring at random in survival data using multiple imputation and martingale theory, called SMIM. The proposed framework adopts the δ-adjusted and control-based models, indexed by a sensitivity parameter, encompassing censoring at random and a wide collection of censoring-not-at-random assumptions. It also targets a broad class of treatment effect estimands defined as functionals of treatment-specific survival functions, taking into account missing data due to censoring. Multiple imputation facilitates the use of simple full-sample estimation; however, the standard Rubin’s combining rule may overestimate the variance for inference in the sensitivity analysis framework. We decompose the multiple imputation estimator into a martingale series based on the sequential construction of the estimator and propose wild bootstrap inference by resampling the martingale series. The new bootstrap inference has a theoretical guarantee of consistency and is computationally efficient compared with the nonparametric bootstrap counterpart. We evaluate the finite-sample performance of the proposed SMIM through simulation and an application to an HIV clinical trial.
Keywords: delta adjustment, jump-to-reference, restricted mean survival time, restricted mean time lost, wild bootstrap
1 |. INTRODUCTION
Censored survival outcomes are common in clinical trial research on chronic diseases. Three assumptions about the censoring mechanism have been proposed: censoring completely at random (CCAR), censoring at random (CAR), and censoring not at random (CNAR). Common survival analysis methods assume CCAR or CAR, under which patients censored at time t and patients uncensored at t with the same history have the same distribution of current and future variables. This assumption is violated if sicker subjects are more likely to withdraw from the study, even after accounting for their observed history, leading to CNAR. Unfortunately, censoring assumptions are often not empirically testable, and inappropriate assumptions may lead to biased conclusions. Regulatory agencies, such as the FDA, and the National Research Council (NRC, 2010) recommend sensitivity analyses to assess the robustness of study conclusions to unverifiable assumptions.
In this article, we distinguish different reasons for censoring, including administrative and nonadministrative reasons. For the latter, we consider patient premature dropout, which could be a case of CNAR. Many sensitivity analysis approaches have been developed for CNAR survival data. One approach specifies a range of the residual dependence of the hazard of censoring on the event times through a sensitivity parameter; see, for example, Rotnitzky et al. (2001), Scharfstein and Robins (2002), and Rotnitzky et al. (2007). A different approach directly specifies pattern-mixture models (Little, 1993) for the event times of censored and uncensored patients and imputes the missing outcomes for the censored subjects. Zhao et al. (2014) considered Kaplan–Meier curves to impute data, which, however, cannot incorporate covariates. Alternatively, the δ-adjusted (Jackson et al., 2014; Lipkovich et al., 2016) and control-based (Lu et al., 2015; Atkinson et al., 2019) models are flexible enough to accommodate auxiliary information for sensitivity analysis of unverifiable missing data assumptions. Due to their transparency, these models have been widely used in applied research to handle missing data (e.g., NRC, 2012; Ratitch et al., 2013). For generality, we consider a class of δ-adjusted/control-based Cox models for censoring due to premature dropout, indexed by a sensitivity parameter δ. In δ-adjusted models, δ compares the outcome distribution of subjects after nonadministrative censoring with the outcome distribution of the same subjects had they remained on study. Our framework extends readily to multiple reasons for censoring by adopting different δ’s for different groups. Control-based models assume that the event hazard for censored subjects in the active treatment group is higher than (more conservative) or similar to that in the control group (Gao et al., 2017). In superiority trials, control-based models are appealing to clinical scientists since they produce conservative conclusions about the treatment effect when the experimental treatment is hypothesized to be better than the control treatment.
Another important question concerns the estimand of interest for treatment comparison in the presence of missing data. Following the International Council for Harmonization (ICH) E9 (R1) addendum, estimands should be clearly defined to describe the quantity to be estimated, including how intercurrent events such as premature dropout are handled (ICH, 2019). In this article, we consider a treatment policy strategy, which evaluates the treatment effect on a time-to-event endpoint for all randomized patients regardless of treatment deviations such as taking rescue medication or switching treatment. When the time-to-event data are censored due to premature dropout, the primary analysis often assumes CAR, which implicitly assumes that the hazard function for a dropout patient is the same as that for a non-dropout patient after adjusting for the baseline variables included in the model. For survival sensitivity analysis using δ-adjusted models, Lipkovich et al. (2016) considered a marginal proportional hazards parameter, which imposes an additional structural assumption of a constant ratio of the hazard rates between the treatment groups. However, this parameter may be misleading (Hernán, 2010) if the proportional hazards assumption is violated, as it is under the δ-adjusted models. Alternatively, we consider a broad class of treatment effect estimands defined as functionals of the treatment-specific survival functions, such as the restricted mean survival time (RMST, Chen and Tsiatis, 2001), that is, the expectation of the survival time restricted to a finite time τ. Instead of focusing on a constant hazard ratio, the RMST provides a time-evolving profile of survival for evaluating the treatment effect, without requiring additional model assumptions.
To implement sensitivity analysis, multiple imputation (MI) is the most popular method. It consists of three steps: first, fill in the missing values with plausible draws to create multiple completed data sets; second, apply standard full-sample methods to analyze the multiply imputed data sets; and third, use Rubin’s combining rule to summarize the results for inference. Because of its intuitive appeal, MI is recommended by the NRC as one of its preferred approaches to addressing missing data (NRC, 2012). However, many studies have shown that Rubin’s variance estimator is not always consistent for general purposes (e.g., Yang and Kim, 2016). A sufficient condition for the validity of MI inference is the congeniality condition. Roughly speaking, it requires the imputation model to be correctly specified and the subsequent analysis to be compatible with the imputation model. Even with a correctly specified imputation model, Yang and Kim (2016) showed that MI is not necessarily congenial for method of moments estimation, so common statistical procedures may be incompatible with MI. This phenomenon becomes pronounced when MI is adopted for general sensitivity analysis in clinical trials.
Lu et al. (2015) and Liu and Pang (2016) demonstrated that Rubin’s combining rule is often conservative in control-based imputation. To overcome the conservativeness of Rubin’s combining rule, several authors suggested the nonparametric bootstrap to obtain standard errors (e.g., Lu et al., 2015); however, the nonparametric bootstrap requires repeating the imputation and analysis for all bootstrap samples and therefore carries a heavy computational burden. Recently, Guan and Yang (2019) proposed wild bootstrap inference based on a martingale representation of the MI estimator; however, their method is only applicable to continuous or binary outcomes, not censored survival outcomes. The standard nonparametric bootstrap resamples individual observations and repeats the imputation and analysis procedures; in contrast, the wild bootstrap applies auxiliary zero-mean, unit-variance random multipliers to the martingale residuals for variance estimation without re-imputation.
In this article, we propose a unified framework of survival sensitivity analysis for a class of functional estimands via MI. Specifically, the missing event times are imputed by a δ-adjusted or control-based Cox model for each treatment group. We derive a novel martingale representation of the proposed MI estimator, inspired by the sequential construction of the MI estimator, namely, model parameter estimation followed by imputation. This new representation invokes the easy-to-implement wild bootstrap inference with a theoretical guarantee of consistency. Moreover, unlike the nonparametric bootstrap, we do not require repeating the imputation and analysis for the bootstrap resamples, which greatly reduces the computational burden. The new SMIM (Survival sensitivity analysis using Multiple Imputation and Martingale) framework is flexible enough to accommodate a wide collection of censoring assumptions and treatment effect estimands.
2 |. SETUP
2.1 |. Notation and estimands
Without loss of generality, we focus on randomized clinical trials that compare a new treatment to a control treatment. We assume that the subjects constitute a random sample from a larger population. Let Xi be a vector of covariates for subject i, and let Ai be a binary treatment, 1 for the active treatment and 0 for the control treatment. Let Ti and Ci denote the time to a clinical event and the time to censoring, respectively. The full set of variables is Fi = (Xi, Ai, Ti, Ci). In the presence of censoring, denote Ui = Ti ∧ Ci, where ∧ represents the minimum of two values, and Ii = 1(Ti ≤ Ci). To distinguish different reasons for censoring, denote Ri = 1 if censoring is due to administrative reasons and Ri = 2 if censoring is due to premature dropout. The observed set of variables is Oi = {Xi, Ai, Ui, Ii, (1 − Ii)Ri}. We use O1:k to denote the k copies {O1, … , Ok}. For the total of n subjects, let n1 = Σ_{i=1}^{n} Ai be the number of treated subjects and n0 = n − n1 the number of control subjects. Let the treated subjects be indexed by i = 1, … , n1, and let the control subjects be indexed by i = n1 + 1, … , n.
For treatment comparison, define λa(t) and Sa(t) as the treatment-specific hazard rate and survival function at time t, respectively, for a = 0, 1. Under a proportional hazards assumption, one can focus on estimating the log hazard ratio β = log{λ1(t)/λ0(t)}. However, the proportional hazards assumption may be problematic, especially when the two survival curves cross. Alternatively, we focus on treatment effect estimands defined as functionals of the treatment-specific survival distributions, Δτ = Ψτ{S1(·), S0(·)}, with some prespecified constant τ. This formulation covers a broad class of estimands favored in the context of nonproportional hazards; see the examples of Δτ below.
Example 1 (Treatment effect estimands).
With a proper choice of Ψτ(·), Δτ represents the following measures of treatment effect: (a) the difference in survival at a fixed time point τ, Δτ = S1(τ) − S0(τ); (b) the difference of treatment-specific τ-RMSTs (restricted mean survival times), Δτ = μ1,τ − μ0,τ, where μa,τ = ∫_0^τ Sa(t) dt for a = 0, 1; (c) the difference of weighted τ-RMSTs, Δτ = ∫_0^τ ω(t){S1(t) − S0(t)} dt, where the nonnegative weight function ω(t) assigns differential importance to different times; (d) the ratio of τ-RMTLs (restricted mean times lost), Δτ = (τ − μ1,τ)/(τ − μ0,τ); (e) the difference of τth quantiles (e.g., medians) of survival, Δτ = q1,τ − q0,τ, where qa,τ = inf{q : Sa(q) ≤ τ}.
For identifiability, τ should be chosen properly. For the estimands in (a)–(d), we restrict τ to be smaller than tmin, the minimum of the largest observed survival times in the two treatment groups. Similarly, for the τth quantiles in (e), we require τ > max{S0(tmin), S1(tmin)}.
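To make the estimands in Example 1 concrete, the sketch below evaluates (a)–(e) from two estimated survival curves. It is an illustration only, assuming fully observed event times and a hypothetical data frame `dat` with columns `time`, `status`, and `arm`; the grid-based Riemann sum and all function names are our own devices rather than part of the SMIM methodology.

```r
library(survival)

# Treatment-specific Kaplan-Meier fits (full-data illustration).
km1 <- survfit(Surv(time, status) ~ 1, data = subset(dat, arm == 1))
km0 <- survfit(Surv(time, status) ~ 1, data = subset(dat, arm == 0))
S1 <- stepfun(km1$time, c(1, km1$surv))  # S_1(t) as a step function
S0 <- stepfun(km0$time, c(1, km0$surv))  # S_0(t)

estimands <- function(S1, S0, tau, omega = function(t) 1, level = 0.5) {
  dt   <- tau / 1000
  grid <- seq(dt, tau, by = dt)               # fine grid on (0, tau]
  mu1  <- sum(S1(grid)) * dt                  # tau-RMST, Riemann sum
  mu0  <- sum(S0(grid)) * dt
  list(surv_diff  = S1(tau) - S0(tau),                              # (a)
       rmst_diff  = mu1 - mu0,                                      # (b)
       wrmst_diff = sum(omega(grid) * (S1(grid) - S0(grid))) * dt,  # (c)
       rmtl_ratio = (tau - mu1) / (tau - mu0),                      # (d)
       qtl_diff   = min(grid[S1(grid) <= level]) -                  # (e): q_a = inf{q : S_a(q) <= level}
                    min(grid[S0(grid) <= level]))
}
estimands(S1, S0, tau = 24)
```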
2.2 |. Simple full-sample estimator and asymptotic linearity
If the event times are fully observed, standard full-sample estimators apply directly. To estimate Sa(t), a simple estimator is the sample proportion Ŝa(t) = na^{-1} Σ_{i: Ai = a} 1(Ti > t), for a = 0, 1. Then, a plug-in estimator of Δτ is Δ̂τ = Ψτ{Ŝ1(·), Ŝ0(·)}.
To establish a unified framework, it is important to note that Δ̂τ is asymptotically linear for all estimands given in Example 1. Under mild regularity conditions, we have

Δ̂τ − Δτ = n1^{-1} Σ_{i=1}^{n1} ∫_0^τ ψ1(t){1(Ti > t) − S1(t)} dt + n0^{-1} Σ_{i=n1+1}^{n} ∫_0^τ ψ0(t){1(Ti > t) − S0(t)} dt + op(n^{-1/2}),   (1)

for bounded variation functions ψa(·), where an integrand of the form 1(t = t0) is understood as a point mass at t0.
Lemma 1 (Asymptotic linear characterizations).
For all estimands in Example 1, the full-sample estimators have the following asymptotic linear characterizations. (a) For the difference in the survivals at a fixed time point τ, Δ̂τ = Ŝ1(τ) − Ŝ0(τ), corresponding to (1) with ψ1(t) = −ψ0(t) = 1(t = τ). (b) For the difference of τ-RMSTs, Δ̂τ = ∫_0^τ {Ŝ1(t) − Ŝ0(t)} dt, corresponding to (1) with ψ1(t) = −ψ0(t) = 1. (c) For the difference of weighted τ-RMSTs, Δ̂τ = ∫_0^τ ω(t){Ŝ1(t) − Ŝ0(t)} dt, corresponding to (1) with ψ1(t) = −ψ0(t) = ω(t). (d) For the ratio of τ-RMTLs, Δ̂τ = (τ − μ̂1,τ)/(τ − μ̂0,τ), corresponding to (1) with ψ1(t) = −1/(τ − μ0,τ) and ψ0(t) = Δτ/(τ − μ0,τ). (e) For Δτ = q1,τ − q0,τ, Δ̂τ = q̂1,τ − q̂0,τ, where q̂a,τ = inf{q : Ŝa(q) ≤ τ}, corresponding to (1) with ψ1(t) = 1(t = q1,τ)/f1(q1,τ) and ψ0(t) = −1(t = q0,τ)/f0(q0,τ), where fa(t) = −dSa(t)/dt is the density of the event time in group a.
2.3 |. MI and the outline of the proposed SMIM framework
To facilitate applying full-sample estimators, MI proceeds as described in Table 1. It is well known that Rubin’s combining rule may overestimate the variance of the MI estimator when the full-sample estimators are not self-efficient. We provide an alternative decomposition of the MI estimator, which invokes the wild bootstrap for consistent variance estimation for general imputation models and estimands.
TABLE 1.
Step MI-1. | Create m completed data sets by filling in the missing times to event with imputed values generated from an imputation model. Specifically, to create the jth imputed data set, generate T̃i^{(j)} from the imputation model for each missing Ti. |
Step MI-2. | Apply a full-sample estimator of Δτ to each imputed data set. Denote the point estimator applied to the jth imputed data set by Δ̂τ^{(j)}. Denote the variance estimator applied to the jth imputed data set by V̂^{(j)}. |
Step MI-3. | Use Rubin’s combining rule to summarize the results from the multiply imputed data sets. The MI estimator of Δτ is Δ̂τ^{MI} = m^{-1} Σ_{j=1}^{m} Δ̂τ^{(j)}, and Rubin’s variance estimator is V̂^{Rubin} = Wm + (1 + m^{-1})Bm, where Wm = m^{-1} Σ_{j=1}^{m} V̂^{(j)} and Bm = (m − 1)^{-1} Σ_{j=1}^{m} (Δ̂τ^{(j)} − Δ̂τ^{MI})^2 are the within- and between-imputation variances, respectively. |
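As a concrete sketch of Steps MI-2 and MI-3, the function below pools m per-imputation results with Rubin’s combining rule; the inputs `est` and `vhat`, holding the point and variance estimates of Δτ from the m completed data sets, are assumed to come from Step MI-2.

```r
# Rubin's combining rule for m per-imputation estimates of Delta_tau.
rubin_pool <- function(est, vhat) {
  m     <- length(est)
  point <- mean(est)                # MI point estimate
  W     <- mean(vhat)               # within-imputation variance
  B     <- stats::var(est)          # between-imputation variance
  total <- W + (1 + 1 / m) * B      # Rubin's total variance
  df    <- (m - 1) * (1 + W / ((1 + 1 / m) * B))^2  # Rubin's degrees of freedom
  c(estimate = point, se = sqrt(total), df = df)
}
```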
In Step MI-1, we consider δ-adjusted and control-based Cox imputation models for sensitivity analysis. For example, the δ-adjusted Cox model assumes the treatment-specific hazard rate of failing at time t is λa(t | Xi) without premature dropout and δλa(t | Xi) after dropout, for a = 0, 1.
Based on MI with Rubin’s combining rule in Step MI-3, the variance estimator V̂^{Rubin} overestimates the true variance of Δ̂τ^{MI}. To rectify this, we propose a wild bootstrap variance estimator (Wu, 1986) to replace Rubin’s combining rule; Theorem 2 in Section 4 shows that the proposed variance estimator is consistent for general imputation models and treatment effect estimands. The wild bootstrap procedure does not require repeating the missing data imputation step (i.e., Step MI-1) or recalculating the point estimator (i.e., Step MI-2) on resampled data, and is therefore computationally efficient compared with the naive bootstrap.
The wild bootstrap variance estimator is motivated by a novel martingale representation of the MI estimator. Specifically, we show in Section 4 that the MI estimator of Δτ can be represented as Δ̂τ^{MI} − Δτ = Σ_{k=1}^{(1+m)n} ξn,k + op(n^{-1/2}), where the series {ξn,k}, along with properly defined σ-fields, forms a martingale array. This representation invokes the wild bootstrap procedure that provides valid inference for the MI estimator of Δτ (Pauly, 2011).
3 |. DELTA-ADJUSTED AND CONTROL-BASED MODELS
3.1 |. Primary analysis with the CAR benchmark assumption
To motivate the imputation models for sensitivity analysis, we first consider a CAR assumption that Ci ⊥ Ti | (Ai, Xi), analogous to the missingness at random assumption (Rubin, 1976) or the coarsening at random assumption (Tsiatis, 2006). Under CAR, the conditional survival function Sa(t | Xi) = pr(Ti > t | Ai = a, Xi) is identifiable from the observed data, for a = 0, 1. Then, we can derive Sa(t) = E{Sa(t | Xi)}.
Following the common survival analysis literature (e.g., Chen and Tsiatis, 2001), we posit a conditional treatment-specific Cox regression with covariate Xi; that is,
λa(t | Xi) = λa(t) exp(Xi⊤βa),   (2)
where λa(t) is an unknown baseline hazard function and βa is a vector of unknown parameters, for a = 0, 1. Importantly, under model (2), we do not impose the restrictive proportional hazards assumption on the treatment effect because both λa(t) and βa can differ between the two treatment groups. Let θ = {λa(·), βa : a = 0, 1} summarize the infinite-dimensional parameter in the Cox models. Under CAR, we can estimate θ using standard software such as coxph in R.
We adopt the counting process framework (Andersen and Gill, 1982) to introduce the estimators and their large-sample properties. Define the counting process Ni(t) = 1(Ui ≤ t, Ii = 1) of observing the event and the at-risk process Yi(t) = 1(Ui ≥ t). Let β̂a be the maximum partial likelihood estimator of βa, for a = 0, 1. We can estimate the cumulative baseline hazard Λa(t) = ∫_0^t λa(u) du by the Breslow (1974) estimator

Λ̂a(t) = ∫_0^t {Σ_{j: Aj = a} Yj(u) exp(Xj⊤β̂a)}^{-1} Σ_{i: Ai = a} dNi(u),

and estimate Sa(t | Xi) by Ŝa(t | Xi) = exp{−Λ̂a(t) exp(Xi⊤β̂a)}.
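As a sketch of this step, the code below fits the working model (2) separately by treatment arm with `coxph`, extracts the Breslow cumulative baseline hazard via `basehaz`, and returns the implied conditional survival function; the column names (`time`, `status`, `arm`, `x1`, `x2`) are illustrative placeholders.

```r
library(survival)

# Fit the CAR Cox model (2) in arm a and return S_a(t | X) as a function.
fit_arm <- function(dat, a) {
  d   <- dat[dat$arm == a, ]
  fit <- coxph(Surv(time, status) ~ x1 + x2, data = d)
  bh  <- basehaz(fit, centered = FALSE)      # Breslow estimate of Lambda_a(t)
  Lam <- stepfun(bh$time, c(0, bh$hazard))   # cumulative baseline hazard as a step function
  list(fit  = fit,
       # S_a(t | X) = exp{-Lambda_a(t) exp(X' beta_a)}; t may be a vector
       surv = function(t, x) exp(-Lam(t) * exp(sum(coef(fit) * x))))
}
```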
3.2 |. Sensitivity analyses with δ-adjusted and control-based models
CAR is not empirically testable and may be questionable for censoring due to premature dropout. We propose sensitivity analyses using δ-adjusted and control-based models.
Assumption 1 (Delta-adjusted Cox model).
The treatment-specific hazard rate of failing at time t is λa(t | Xi) given in (2) without premature dropout and is δλa(t | Xi) after premature dropout (Ri = 2), for a = 0, 1, where δ > 0.
It can be seen that δ quantifies the degree of departure from the CAR assumption. If δ = 1, CAR holds. If δ > 1, the hazard increases after dropout, indicating a worsening condition after dropout; if δ < 1, the hazard decreases after dropout, indicating an improving condition after dropout. The further δ lies from one, the larger the deviation from CAR. Without retrieving information on the nonadministratively censored subjects, δ cannot be ascertained; therefore, it is recommended to vary δ over a wide range of plausible values for sensitivity analysis. To fix ideas, we use the same δ for both treatment groups, but it is easy to accommodate different δ values reflecting different worsening/improvement conditions in the two groups. For example, if the control group is a placebo group, it is reasonable to set δ to one for control subjects who were nonadministratively censored. We illustrate the use of different δ values for different treatment groups in Section 5 (an application) and Section S7 (simulation studies).
Control-based models (e.g., Carpenter et al., 2013) are another popular and appealing class of sensitivity models because of their reduced bias in favor of the experimental treatment.
Assumption 2 (Control-based Cox model).
The treatment-specific hazard rate of failing at time t is λa(t | Xi) given in (2) for a = 0, 1 and is δλ0(t | Xi) after dropout (Ri = 2) for the treated, where δ ≤ 1.
The control-based Cox model with δ = 1 becomes the jump-to-reference model (Atkinson et al., 2019). It assumes that censored subjects on the active arm follow the same distribution as similar subjects in the control group after the censoring time. This model is plausible, for example, in superiority trials where subjects on the control arm receive the standard of care and censoring on the active arm occurs because subjects revert to the standard of care. For generality, we also allow δ to be less than 1, so that the treatment effect can be bracketed by the treatment effect under CAR and that for the control arm (Lu et al., 2015).
In fact, censoring due to dropout can be interpreted as a time-dependent binary covariate, and δ-adjusted and control-based sensitivity models entail time-dependent Cox models. Let the history of the information up to time t be Hi(t) = {Xi, Ri, Ni(u), Yi(u) : u < t}. Because we use Ri = 2 to indicate premature dropout, Assumption 1 describes the time-dependent Cox model
λa^{δ-adj}{t | Hi(t)} = λa(t | Xi) δ^{1(Ri = 2, t > Ui)} = λa(t) exp{Xi⊤βa + 1(Ri = 2, t > Ui) log δ}, for a = 0, 1.   (3)
Assumption 2 describes the time-dependent Cox model with the hazard function, for a = 0, 1,
λa^{cb}{t | Hi(t)} = λa(t | Xi){1 − a·1(Ri = 2, t > Ui)} + δλ0(t | Xi)·a·1(Ri = 2, t > Ui).   (4)
The de facto estimand for the treatment policy strategy takes into account the likely attenuation of the treatment effect after dropout. By (3) and (4), the de facto survival function Sa^{sen}(t) is the marginal survival function induced by the hazard λa^{sen}{t | Hi(t)}, for a = 0, 1. Here we use the superscript “sen” to denote either “δ-adj” or “cb” for the δ-adjusted or control-based sensitivity model. The de facto treatment effect estimand becomes Δτ^{sen} = Ψτ{S1^{sen}(·), S0^{sen}(·)}. If the sensitivity parameter δ is not one, Δτ^{sen} differs from Δτ in general. By varying δ over a certain range, Δτ^{sen} provides valuable insight into the impact of possible departures from CAR, allowing an investigator to assess the extent to which the censoring assumption alters the treatment effect estimate.
MI requires generating the missing values from the imputation model in Step MI-1. From (3) or (4), one can derive the conditional survival function Sa^{sen}{t | Hi(t); δ, θ} for imputation. Consider the δ-adjusted model for example: if a treated subject i withdrew from the treatment at Ui, the conditional survival at t > Ui is
S1^{δ-adj}{t | Hi(t); δ, θ} = pr(Ti > t | Ti > Ui, Xi) = {S1(t | Xi)/S1(Ui | Xi)}^δ.   (5)
Unlike parametric models, sampling from the semiparametric Cox model is difficult. Following Lipkovich et al. (2016), we introduce a general inverse transform sampling scheme. Suppose we would like to generate T̃i from (5) for t ≥ Ui. First, generate a random number ui from Unif[0, pi], where pi = {S1(Ui | Xi)}^δ. Second, solve {S1(t | Xi)}^δ = ui for t, and denote the solution by T̃i. Then, we can show that given the observed data O1:n,

pr(T̃i > t | O1:n) = {S1(t | Xi)/S1(Ui | Xi)}^δ,  t ≥ Ui,

is the target imputation model (5).
In practice, we need numerical approximations to obtain T̃i. Let Ta,max be the largest observed event time in treatment group a, for a = 0, 1. Because Ŝa(t | Xi) is semiparametric, it is available only for t ≤ Ta,max. Thus, we require τ to be smaller than min(T1,max, T0,max), and the imputed value can be truncated at Ta,max.
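A minimal sketch of this inverse-transform draw for a dropout in group a follows, reusing the conditional survival function returned by the `fit_arm()` sketch in Section 3.1; the grid of candidate times stands in for the set of realized event times used in Table 2 below.

```r
# Impute an event time for a dropout at time U with covariates x under the
# delta-adjusted model: p = S_a(U | x)^delta, u ~ Unif(0, p), and return the
# smallest candidate time t with S_a(t | x)^delta <= u, truncated at the
# largest observed event time t_max. Because S_a is decreasing and u < p,
# the draw always exceeds U.
impute_dropout <- function(surv, x, U, delta, t_grid, t_max) {
  p  <- surv(U, x)^delta
  u  <- runif(1, 0, p)
  ok <- t_grid[surv(t_grid, x)^delta <= u]
  if (length(ok) == 0) t_max else min(ok)
}
```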
To summarize, the MI procedure proceeds as in Table 2, where Step MI-1–3 and Step MI-1–3’ are used for the δ-adjusted and the control-based imputation models, respectively.
TABLE 2.
Step MI-1–1. | Fit the treatment-specific Cox models (2) assuming CAR; denote the estimated parameter by θ̂ = {Λ̂a(·), β̂a : a = 0, 1}. |
Step MI-1–2. | For an administratively censored subject i with (Ai, Ii, Ri) = (a, 0, 1), compute pi = Ŝa(Ui | Xi). Draw a uniform random value ui ~ Unif[0, pi]. Impute the event time T̃i as the solution of Ŝa(t | Xi) = ui. Numerically, we use T̃i = min{t ∈ Ta : Ŝa(t | Xi) ≤ ui}, where Ta is the set of realized times to event or censoring in group a, with the largest value being Ta,max. This ensures that the imputed event time falls between the censoring time and Ta,max. |
Step MI-1–3. | For a nonadministratively censored subject i with (Ai, Ii, Ri) = (a, 0, 2), compute pi = {Ŝa(Ui | Xi)}^δ. Draw a uniform random value ui ~ Unif[0, pi]. Impute the event time T̃i as the solution of {Ŝa(t | Xi)}^δ = ui. Numerically, we use T̃i = min{t ∈ Ta : {Ŝa(t | Xi)}^δ ≤ ui}. |
Step MI-1–3’. | For a nonadministratively censored subject i with (Ai, Ii, Ri) = (0, 0, 2), draw T̃i by Step MI-1–3 with a = 0 and δ = 1. For a nonadministratively censored subject i with (Ai, Ii, Ri) = (1, 0, 2), draw T̃i by Step MI-1–3 with a = 0 and δ, that is, using the corresponding control distribution. |
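Putting Table 2 together, the sketch below runs Step MI-1 once over all censored subjects to build one completed data set, reusing `fit_arm()` and `impute_dropout()` from the earlier sketches; the `reason` column (1 = administrative, 2 = dropout) and the two covariates are illustrative placeholders.

```r
# One pass of Step MI-1: administrative censoring (reason = 1) is imputed
# under CAR (delta = 1); dropouts (reason = 2) use delta, or additionally the
# control-arm fit under the control-based model (Step MI-1-3').
impute_once <- function(dat, delta, t_grid, control_based = FALSE) {
  fits  <- list(`0` = fit_arm(dat, 0), `1` = fit_arm(dat, 1))
  t_max <- max(dat$time[dat$status == 1])
  out   <- dat
  for (i in which(dat$status == 0)) {
    a  <- dat$arm[i]
    cb <- control_based && a == 1 && dat$reason[i] == 2  # jump to reference
    f  <- if (cb) fits[["0"]] else fits[[as.character(a)]]
    d  <- if (dat$reason[i] == 2) delta else 1
    out$time[i]   <- impute_dropout(f$surv, unlist(dat[i, c("x1", "x2")]),
                                    dat$time[i], d, t_grid, t_max)
    out$status[i] <- 1  # imputed times enter the analysis as events
  }
  out
}
# Calling impute_once() m times yields the m completed data sets of Step MI-1.
```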
4 |. WILD BOOTSTRAP INFERENCE BASED ON MARTINGALE SERIES
4.1 |. A novel martingale representation
For variance estimation, the key insight is that the MI estimator is intrinsically created in a sequential manner: first, the imputation model is fitted based on the observed data; second, the missing data are drawn from the imputation model conditional on the observed data. This conceptualization leads to a martingale representation of the MI estimator, expressing it in terms of a series of random variables that have mean zero conditional on the σ-algebra generated by the preceding variables. We provide heuristic steps below toward linearizing the MI estimator and forming the proper σ-algebra, and relegate details to the Web Appendix.
We first focus on treatment group a = 1. To unify the notation, let T̃i^{(j)} denote the jth imputed value for subject i if subject i was censored, and the observed Ti if subject i’s event time was observed. By the imputation mechanism, T̃i^{(j)} follows the conditional survival distribution S1^{sen}{t | Hi(t); θ̂} for t ≥ Ui, where θ = {λa(·), βa : a = 0, 1}. Then, letting Ŝ1^{MI}(t) = (mn1)^{-1} Σ_{i=1}^{n1} Σ_{j=1}^{m} 1(T̃i^{(j)} > t), for t ∈ [0, τ], it is insightful to express

Ŝ1^{MI}(t) − S1^{sen}(t) = n^{-1} Σ_{i=1}^{n1} (n/n1) m^{-1} Σ_{j=1}^{m} [1(T̃i^{(j)} > t) − S1^{sen}{t | Hi(t); θ̂}]   (6)

 + n^{-1} Σ_{i=1}^{n1} (n/n1) [S1^{sen}{t | Hi(t); θ̂} − S1^{sen}(t)].   (7)
Here, we use the total sample size n for scaling; we will use the same scaling for the estimators for the control group and the treatment effect.
We analyze the two terms in (6) and (7) separately. First, because the imputations are independent given the observed data, the individual terms in (6) are independent mean-zero variables conditional on the observed data. Second, because the term in (7) depends on θ̂, by exploiting the counting process theory, we express

n^{-1} Σ_{i=1}^{n1} (n/n1) [S1^{sen}{t | Hi(t); θ̂} − S1^{sen}(t)] = n^{-1} Σ_{i=1}^{n1} (n/n1) [S1^{sen}{t | Hi(t); θ} − S1^{sen}(t)]   (8)

 + n^{-1} Σ_{i=1}^{n} {ϕ11,i(t) + ϕ10,i(t)} + op(n^{-1/2}),   (9)

where the exact expressions of ϕ11,i(t) and ϕ10,i(t) are given in Section S4. Importantly, ϕ11,i(t) reflects the estimation of {λ1(·), β1}, ϕ10,i(t) reflects the estimation of {λ0(·), β0}, and E{ϕ11,i(t)} = E{ϕ10,i(t)} = 0. Note that in the sensitivity analysis using the δ-adjusted models, the imputation for the treated group uses information only from the treated group, so ϕ10,i(t) = 0 for all i; in the sensitivity analysis using the control-based models, the imputation for the treated group uses information from both treatment groups, so ϕ11,i(t) ≠ 0 and ϕ10,i(t) ≠ 0 for all i. Also, by definition, the expectation of the term in (8) is zero. Together, Ŝ1^{MI}(t) − S1^{sen}(t) decomposes into the summation of the three terms (6), (8), and (9) with (conditional) mean zero, and n^{1/2}{Ŝ1^{MI}(t) − S1^{sen}(t)} converges to a Gaussian process on [0, τ]. We obtain a similar asymptotic linearization of Ŝ0^{MI}(t) − S0^{sen}(t), given in (S3)–(S5).
We now leverage the unified linear characterization (1) to express the MI estimator for the various treatment effect estimands. Combining (1) with the above decompositions of Ŝ1^{MI}(t) and Ŝ0^{MI}(t), we derive

Δ̂τ^{MI} − Δτ^{sen} = Σ_{k=1}^{(1+m)n} ξn,k + op(n^{-1/2}),   (10)

where

ξn,k = n^{-1} ∫_0^τ ψ1(t) [(n/n1){S1^{sen}(t | Hi(t); θ) − S1^{sen}(t)} + ϕ11,i(t) + ϕ10,i(t)] dt, for k = i, 1 ≤ i ≤ n1;   (11)

ξn,k = (mn1)^{-1} ∫_0^τ ψ1(t) [1(T̃i^{(j)} > t) − S1^{sen}{t | Hi(t); θ̂}] dt, for k = n1 + (i − 1)m + j, 1 ≤ i ≤ n1, 1 ≤ j ≤ m;   (12)

ξn,k = n^{-1} ∫_0^τ (ψ0(t)[(n/n0){S0^{sen}(t | Hi(t); θ) − S0^{sen}(t)} + ϕ0,i(t)] + ψ1(t){ϕ11,i(t) + ϕ10,i(t)}) dt, for k = (1 + m)n1 + (i − n1), n1 + 1 ≤ i ≤ n;   (13)

ξn,k = (mn0)^{-1} ∫_0^τ ψ0(t) [1(T̃i^{(j)} > t) − S0^{sen}{t | Hi(t); θ̂}] dt, for k = (1 + m)n1 + n0 + (i − n1 − 1)m + j, n1 + 1 ≤ i ≤ n, 1 ≤ j ≤ m.   (14)
To gain intuition from the decomposition in (10), the first n1 terms of ξn,k contribute to the variability of Ŝ1^{MI}(t) because of the unknown parameters, and the next mn1 terms of ξn,k contribute to the variability of Ŝ1^{MI}(t) because of the imputations given the estimated parameter values, reflecting the sequential MI procedure. The remaining terms have similar interpretations.
We form the σ-algebra Fn,k, the information accumulated up to the kth term of the series, for all k; thus,

E(ξn,k | Fn,k−1) = 0, for k = 1, … , (1 + m)n,   (15)

that is, the partial sums of {ξn,k} form a martingale array with respect to {Fn,k}.
We focus on the ξn,k terms in (11) and (12) for treatment group a = 1, because the discussion for the ξn,k terms in (13) and (14) for treatment group a = 0 is similar and is presented in the Web Appendix. Obviously, for k = i (1 ≤ i ≤ n1) and ξn,k in (11), we have E(ξn,k) = 0 and ξn,k depends only on Oi and θ, and thus we let Fn,k = σ(O1:k). For k = n1 + (i − 1)m + j, where i = 1, … , n1 and j = 1, … , m, and ξn,k in (12), under the regularity conditions, we have E(ξn,k | Fn,k−1) = 0, and thus we let Fn,k = σ(O1:n together with the imputed values drawn up to the kth term).
The martingale representation allows us to characterize the asymptotic distribution of Δ̂τ^{MI}, with the proof presented in Section S4.
Theorem 1.
Under Assumption 1 or 2, and S1 (regularity conditions), n^{1/2}(Δ̂τ^{MI} − Δτ^{sen}) → N(0, σ²) in distribution, as n → ∞, where σ² is a finite variance given in (S23).
4.2 |. Wild bootstrap for the MI estimator
The martingale representation invokes the wild, or weighted, bootstrap procedure (Wu, 1986; Liu, 1988), which provides valid variance estimation and inference for linear statistics of martingale difference arrays. Pauly (2011) proved the validity of wild bootstrap resampling under the conditions of a general central limit theorem (CLT). Guan and Yang (2019) applied the wild bootstrap to a martingale series in the context of causal inference with observational studies.
Based on the martingale representation (10), we propose the wild bootstrap procedure to estimate the variance of Δ̂τ^{MI}. The martingale representation relies on unknown quantities, which require approximation. We estimate (i) Sa^{sen}(t) by Ŝa^{MI}(t), (ii) ϕ11,i(t), ϕ10,i(t), and ϕ0,i(t) by their empirical counterparts ϕ̂11,i(t), ϕ̂10,i(t), and ϕ̂0,i(t), and (iii) Sa^{sen}{t | Hi(t); θ} by Ŝa^{sen}{t | Hi(t); θ̂}, for a = 0, 1.
Based on the above approximations, the wild bootstrap inference proceeds as in Table 3.
TABLE 3.
Step WB-1. | Sample i.i.d. multipliers uk, for k = 1, … , (1 + m)n, that satisfy E(uk) = 0 and V(uk) = 1. |
Step WB-2. | Compute the bootstrap replicate Δ̂τ^{MI*} = Δ̂τ^{MI} + Σ_{k=1}^{(1+m)n} uk ξ̂n,k, where ξ̂n,k is the empirical version of ξn,k obtained by replacing the unknown quantities with their estimators and the one-dimensional integrals with numerical integration. |
Step WB-3. | Repeat Steps WB-1 and WB-2 B times, and estimate the variance of Δ̂τ^{MI} by the sample variance of the B copies of Δ̂τ^{MI*}. |
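A minimal sketch of Table 3 follows, assuming the (1 + m)n estimated increments ξ̂n,k have already been assembled into a numeric vector `xi_hat`; Rademacher multipliers are one common choice satisfying the moment conditions of Step WB-1.

```r
# Wild bootstrap standard error: perturb the estimated martingale increments
# with i.i.d. mean-zero, unit-variance weights; no re-imputation is needed.
wild_boot_se <- function(xi_hat, B = 1000) {
  reps <- replicate(B, {
    u <- sample(c(-1, 1), length(xi_hat), replace = TRUE)  # E(u) = 0, V(u) = 1
    sum(u * xi_hat)       # centered bootstrap replicate of the MI estimator
  })
  sd(reps)
}
```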
Theorem 2 shows the asymptotic validity of the above bootstrap inference method.
Theorem 2.
Under Assumption 1 or 2, and S1 (regularity conditions), we have

sup_x | pr*{n^{1/2}(Δ̂τ^{MI*} − Δ̂τ^{MI}) ≤ x} − pr{n^{1/2}(Δ̂τ^{MI} − Δτ^{sen}) ≤ x} | → 0,

in probability, as n → ∞, where pr* denotes the probability measure induced by the wild bootstrap multipliers conditional on the observed and imputed data.
We provide the proof of Theorem 2 in the Web Appendix, which draws on the martingale central limit theory (Hall and Heyde, 1980) and the asymptotic property of weighted sampling of martingale difference arrays (Pauly, 2011). Theorem 2 indicates that the distribution of the wild bootstrap statistic consistently estimates the distribution of the MI estimator.
5 |. AN APPLICATION
We apply the proposed semiparametric δ-adjusted and control-based Cox models to an HIV clinical trial. The randomized, double-blinded ACTG175 trial was conducted to compare the treatment effect of a single nucleoside with that of two nucleosides in adults with HIV (Hammer et al., 1996). The data set is available in the R package speff2trial. The event of interest was disease progression, defined as the first occurrence of a more than 50% decline in the CD4 cell count or death. For illustration, we compare the treatment effect between Zidovudine monotherapy and Zidovudine plus Didanosine combination therapy in the subgroup of participants who never took any antiretroviral therapy before randomization. In this subgroup, there were 197 subjects in the monotherapy group and 185 subjects in the combination therapy group; 144 (73.0%) subjects in the monotherapy group and 152 (82.2%) in the combination therapy group were censored. We focus on estimating the RMST with truncation time point τ = 24 months because the ACTG175 study required at least 24 months of follow-up. In re-analyzing the data, we assume CAR in the primary analysis and assume that the event times follow a Cox model adjusting for age and an indicator of symptomatic status. This model assumption is assessed with the proportional hazards test (Grambsch and Therneau, 1994), which gives p > 0.05, and is thus adopted in the analyses. The estimated RMST with 95% confidence interval is 22.1 (21.5, 22.8) months in the monotherapy group versus 23.0 (22.6, 23.5) months in the combination therapy group. The estimated between-group RMST difference with 95% confidence interval is 0.92 (0.15, 1.68) months with p = 0.019, indicating a statistically significant improvement of the combination therapy over the monotherapy. We also analyze the data using a direct estimator of the RMST (Tian et al., 2014) without imputation, via the survRM2 package. The results are close to those of the δ-adjusted method with δ = 1 because both methods assume CAR; the direct estimator, however, does not require a Cox model for missing data imputation.
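For reference, below is a sketch of the primary CAR analysis using the packages named above. The subgroup definition (`str2 == 0` for antiretroviral-naive participants) and the arm codes (`arms` 0 = Zidovudine, 1 = Zidovudine plus Didanosine) are our reading of the `speff2trial` documentation, so the output may not match Table 4 exactly.

```r
library(speff2trial)  # ships the ACTG175 data set
library(survRM2)      # direct RMST estimation

d <- subset(ACTG175, str2 == 0 & arms %in% c(0, 1))  # antiretroviral-naive subgroup
rmst2(time   = d$days / 30.4375,                     # follow-up in months
      status = d$cens,                               # event indicator
      arm    = as.integer(d$arms == 1),              # 1 = combination therapy
      tau    = 24)                                   # 24-month restriction
```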
We conduct sensitivity analyses based on the δ-adjusted and control-based methods to evaluate the impact of plausible departures from CAR in the primary analysis. One of the main objectives of the ACTG175 trial was to evaluate the additional benefit of the combination therapy on top of Zidovudine. Therefore, we treat the Zidovudine monotherapy group as the control group and the Zidovudine plus Didanosine combination therapy group as the test treatment group. In the sensitivity analysis, we consider subjects censored before 24 months as censored for nonadministrative reasons and subjects censored after 24 months as censored for administrative reasons. For the imputation models in both the δ-adjusted and control-based methods, we assume CAR for administratively censored subjects in the combination therapy group and for all censored subjects in the monotherapy group. In the δ-adjusted method, the δ-adjustment is applied to the primary Cox model for subjects who were nonadministratively censored in the combination therapy group. The analysis model is the resulting pattern-mixture model, carried out by MI with m = 50. We estimate the standard errors by Rubin’s combining rule and by the proposed wild bootstrap method with B = 100.
Table 4 summarizes the results. The estimated within- and between-group standard errors from the wild bootstrap are smaller than those from Rubin’s combining rule for all evaluated methods, coherent with the findings in the simulation study. From the p-values across values of δ, the estimated tipping point of the sensitivity analysis is larger than 5 using the wild bootstrap and lies between 4 and 5 using Rubin’s rule. The proposed wild bootstrap method thus provides stronger evidence for the robustness of the primary analysis than the conservative Rubin’s rule. Based on the wild bootstrap sensitivity results, to eliminate the statistical significance of the treatment effect, the hazard of the subjects who were nonadministratively censored would have to be more than five times that of subjects with observed event times in the same group. The control-based method also yields p-values smaller than 0.05 under both the wild bootstrap and Rubin’s rule. Therefore, the findings from the primary analysis are robust to the censoring assumption.
TABLE 4.
| Method | Zidovudine (n = 197) RMST | SE | Zidovudine plus Didanosine (n = 185) RMST | SE | Difference RMST (95% CI) | SE | p-Value |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Primary and sensitivity analysis with wild bootstrap | | | | | | | |
| δ = 1 | 22.10 | 0.28 | 23.04 | 0.22 | 0.92 (0.15, 1.68) | 0.39 | 0.019 |
| δ = 2 | 22.10 | 0.28 | 23.00 | 0.23 | 0.88 (0.11, 1.64) | 0.39 | 0.024 |
| δ = 3 | 22.10 | 0.28 | 22.97 | 0.23 | 0.84 (0.18, 1.61) | 0.39 | 0.031 |
| δ = 4 | 22.10 | 0.28 | 22.93 | 0.23 | 0.81 (0.04, 1.58) | 0.39 | 0.038 |
| δ = 5 | 22.10 | 0.28 | 22.90 | 0.23 | 0.78 (0.02, 1.55) | 0.39 | 0.047 |
| Control-based | 22.12 | 0.31 | 23.00 | 0.23 | 0.88 (0.12, 1.65) | 0.39 | 0.023 |
| Primary and sensitivity analysis with Rubin’s combining rule | | | | | | | |
| δ = 1 | 22.12 | 0.31 | 23.04 | 0.24 | 0.92 (0.14, 1.69) | 0.39 | 0.020 |
| δ = 2 | 22.12 | 0.31 | 23.00 | 0.25 | 0.88 (0.10, 1.67) | 0.40 | 0.027 |
| δ = 3 | 22.12 | 0.31 | 22.97 | 0.25 | 0.84 (0.06, 1.63) | 0.40 | 0.034 |
| δ = 4 | 22.12 | 0.31 | 22.93 | 0.26 | 0.81 (0.02, 1.60) | 0.40 | 0.043 |
| δ = 5 | 22.12 | 0.31 | 22.90 | 0.26 | 0.78 (−0.01, 1.59) | 0.40 | 0.054 |
| Control-based | 22.12 | 0.31 | 23.00 | 0.25 | 0.87 (0.08, 1.65) | 0.40 | 0.030 |
| Direct estimator (Tian et al., 2014) | 22.11 | 0.31 | 23.05 | 0.24 | 0.88 (0.11, 1.66) | 0.40 | 0.026 |
Note: In the δ-adjusted sensitivity analyses, δ is applied to subjects who were nonadministratively censored in the Zidovudine plus Didanosine group.
6 |. CONCLUDING REMARKS
In this article, we provide a general framework for survival sensitivity analysis based on semiparametric δ-adjusted and control-based Cox models to assess the impact of plausible departures from CAR. The δ-adjusted/control-based models are flexible enough to accommodate different censoring mechanisms by varying the sensitivity parameter. MI facilitates the use of a simple full-sample estimator; however, the standard Rubin’s combining rule may be conservative or anti-conservative when the analysis method is uncongenial to the imputation model (Robins and Wang, 2000). This is likely to occur in our general sensitivity analysis framework when the full-sample estimator is not an efficient estimator under the combined data and imputation models. To overcome this issue, Wang and Robins (1998) proposed consistent variance estimators for imputation estimators in the missing data literature under a parametric imputation model, which, however, is not applicable to our survival sensitivity analysis. We reformulate the MI estimator as a martingale series based on its sequential construction and propose wild bootstrap inference based on resampling the martingale series, with a theoretical guarantee of consistency. The current framework considers only baseline covariates. If time-dependent covariates were available, including them would make the CAR assumption more plausible. Even then, the δ-adjusted and control-based models remain useful for conducting sensitivity analysis of assumptions about post-censoring behavior. Extending SMIM to incorporate time-dependent covariates will be our future work.
The proposed inferential framework targets consistent estimation of the repeated-sampling variance of the MI estimator. It appears paradoxical that the repeated-sampling variance of the MI estimator may decrease as the missingness rate increases; however, this phenomenon can happen because the true value of the estimand changes with the missingness rate under the control-based imputation models. Alternatively, to avoid this seemingly paradoxical phenomenon, Cro et al. (2019) proposed a principle of information-anchored analysis, under which the ratio of information between the analysis with missing data and the analysis with full data is similar for the primary analysis and the sensitivity analysis. Their research suggested that control-based imputation with Rubin’s variance estimator provides an information-anchored analysis. In survival sensitivity analysis using control-based imputation models, Atkinson et al. (2019) showed by simulation that Rubin’s combining rule is information-anchored.
ACKNOWLEDGMENT
Yang is partially supported by NSF grant DMS-1811245 and NIH grants 1R01AG066883 and 1R01ES031651.
Funding information
National Institutes of Health, Grant/Award Numbers: 1R01AG066883, 1R01ES031651; National Science Foundation, Grant/Award Number: DMS 1811245
SUPPORTING INFORMATION
Web Appendices and Tables referenced in Sections 3 and 4 and the R package for implementing the proposed methods are available with this paper at the Biometrics website on Wiley Online Library.
REFERENCES
- Andersen PK & Gill RD (1982) Cox’s regression model for counting processes: a large sample study. Annals of Statistics, 10, 1100–1120.
- Atkinson A, Kenward MG, Clayton T & Carpenter JR (2019) Reference-based sensitivity analysis for time-to-event data. Pharmaceutical Statistics, 18, 645–658.
- Breslow N (1974) Covariance analysis of censored survival data. Biometrics, 30, 89–99.
- Carpenter JR, Roger JH & Kenward MG (2013) Analysis of longitudinal trials with protocol deviation: a framework for relevant, accessible assumptions, and inference via multiple imputation. Journal of Biopharmaceutical Statistics, 23, 1352–1371.
- Chen P-Y & Tsiatis AA (2001) Causal inference on the difference of the restricted mean lifetime between two groups. Biometrics, 57, 1030–1038.
- Cro S, Carpenter JR & Kenward MG (2019) Information-anchored sensitivity analysis: theory and application. Journal of the Royal Statistical Society: Series A, 182, 623–645.
- Gao F, Liu GF, Zeng D, Xu L, Lin B, Diao G, et al. (2017) Control-based imputation for sensitivity analyses in informative censoring for recurrent event data. Pharmaceutical Statistics, 16, 424–432.
- Grambsch PM & Therneau TM (1994) Proportional hazards tests and diagnostics based on weighted residuals. Biometrika, 81, 515–526.
- Guan Q & Yang S (2019) A unified framework for causal inference with multiple imputation using martingale. Preprint, arXiv:1911.04663.
- Hall P & Heyde C (1980) Martingale limit theory and applications. New York, NY: Academic Press.
- Hammer SM, Katzenstein DA, Hughes MD, Gundacker H, Schooley RT, Haubrich RH, et al. (1996) A trial comparing nucleoside monotherapy with combination therapy in HIV-infected adults with CD4 cell counts from 200 to 500 per cubic millimeter. New England Journal of Medicine, 335, 1081–1090.
- Hernán MA (2010) The hazards of hazard ratios. Epidemiology, 21, 13–15.
- ICH (2019) ICH harmonised guideline E9 (R1): addendum on estimands and sensitivity analysis in clinical trials. Geneva: International Council for Harmonization.
- Jackson D, White IR, Seaman S, Evans H, Baisley K & Carpenter J (2014) Relaxing the independent censoring assumption in the Cox proportional hazards model using multiple imputation. Statistics in Medicine, 33, 4681–4694.
- Lipkovich I, Ratitch B & O’Kelly M (2016) Sensitivity to censored-at-random assumption in the analysis of time-to-event endpoints. Pharmaceutical Statistics, 15, 216–229.
- Little RJ (1993) Pattern-mixture models for multivariate incomplete data. Journal of the American Statistical Association, 88, 125–134.
- Liu GF & Pang L (2016) On analysis of longitudinal clinical trials with missing data using reference-based imputation. Journal of Biopharmaceutical Statistics, 26, 924–936.
- Liu RY (1988) Bootstrap procedures under some non-iid models. Annals of Statistics, 16, 1696–1708.
- Lu K, Li D & Koch GG (2015) Comparison between two controlled multiple imputation methods for sensitivity analyses of time-to-event data with possibly informative censoring. Statistics in Biopharmaceutical Research, 7, 199–213.
- NRC (2010) The prevention and treatment of missing data in clinical trials. Washington, DC: National Academies Press.
- NRC (2012) The prevention and treatment of missing data in clinical trials. New England Journal of Medicine, 367, 1355–1360.
- Pauly M (2011) Weighted resampling of martingale difference arrays with applications. Electronic Journal of Statistics, 5, 41–52.
- Ratitch B, O’Kelly M & Tosiello R (2013) Missing data in clinical trials: from clinical assumptions to statistical analysis using pattern mixture models. Pharmaceutical Statistics, 12, 337–347.
- Robins JM & Wang N (2000) Inference for imputation estimators. Biometrika, 87, 113–124.
- Rotnitzky A, Farall A, Bergesio A & Scharfstein D (2007) Analysis of failure time data under competing censoring mechanisms. Journal of the Royal Statistical Society: Series B, 69, 307–327.
- Rotnitzky A, Scharfstein D, Su T-L & Robins J (2001) Methods for conducting sensitivity analysis of trials with potentially nonignorable competing causes of censoring. Biometrics, 57, 103–113.
- Rubin DB (1976) Inference and missing data. Biometrika, 63, 581–592.
- Scharfstein DO & Robins JM (2002) Estimation of the failure time distribution in the presence of informative censoring. Biometrika, 89, 617–634.
- Tian L, Alizadeh AA, Gentles AJ & Tibshirani R (2014) A simple method for estimating interactions between a treatment and a large number of covariates. Journal of the American Statistical Association, 109, 1517–1532.
- Tsiatis A (2006) Semiparametric theory and missing data. New York, NY: Springer.
- Wang N & Robins JM (1998) Large-sample theory for parametric multiple imputation procedures. Biometrika, 85, 935–948.
- Wu C-FJ (1986) Jackknife, bootstrap and other resampling methods in regression analysis. Annals of Statistics, 14, 1261–1295.
- Yang S & Kim JK (2016) A note on multiple imputation for method of moments estimation. Biometrika, 103, 244–251.
- Zhao Y, Herring AH, Zhou H, Ali MW & Koch GG (2014) A multiple imputation method for sensitivity analyses of time-to-event data with possibly informative censoring. Journal of Biopharmaceutical Statistics, 24, 229–253.