Biostatistics (Oxford, England). 2020 Apr 4;23(1):34–49. doi: 10.1093/biostatistics/kxaa008

A Bayesian nonparametric approach for evaluating the causal effect of treatment in randomized trials with semi-competing risks

Yanxun Xu 1, Daniel Scharfstein 2, Peter Müller 3, Michael Daniels 4
PMCID: PMC10448950  PMID: 32247284

Summary

We develop a Bayesian nonparametric (BNP) approach to evaluate the causal effect of treatment in a randomized trial where a nonterminal event may be censored by a terminal event, but not vice versa (i.e., semi-competing risks). Based on the idea of principal stratification, we define a novel estimand for the causal effect of treatment on the nonterminal event. We introduce identification assumptions, indexed by a sensitivity parameter, and show how to draw inference using our BNP approach. We conduct simulation studies and illustrate our methodology using data from a brain cancer trial. The R code implementing our model and algorithm is available for download at https://github.com/YanxunXu/BaySemiCompeting.

Keywords: Bayesian nonparametrics, Brain cancer trial, Causal inference, Identification assumptions, Principal stratification, Sensitivity analysis

1. Introduction

Semi-competing risks (Fine and others, 2001) occur in studies where observation of a nonterminal event (e.g., progression) may be pre-empted by a terminal event (e.g., death), but not vice versa. In randomized clinical trials to evaluate treatments of life-threatening diseases, patients are often observed for specific types of disease progression and survival. Often, the primary outcome is patient survival, resulting in data analyses focusing on the terminal event using standard survival analysis tools (Ibrahim and others, 2005). However, there may also be interest in understanding the causal effect of treatment on nonterminal outcomes such as progression, readmission, etc. An example is a randomized trial for the treatment of malignant brain tumors, where one of the important progression endpoints is based on deterioration of the cerebellum. An important feature of this progression endpoint is that it is biologically plausible that a patient could die without cerebellar deterioration. Thus, analyzing the effect of treatment on progression needs to account for the fact that progression is not well-defined after death.

Varadhan and others (2014) review models that have been proposed for analyzing semi-competing risks data. These models can be classified into two broad categories: models for the distribution of the observable data, e.g., cause-specific hazards, subdistribution functions (Fix and Neyman, 1951; Hougaard, 1999; Xu and others, 2010; Lee and others, 2015), and models for the distribution of the latent failure times (Robins, 1995a,b; Lin and others, 1996; Wang, 2003; Peng and Fine, 2007; Ding and others, 2009; Peng and Fine, 2012; Chen, 2012; Hsieh and Huang, 2012; Comment and others, 2019). Xu and others (2010) argued against the use of latent failure time models because the marginal distribution of the nonterminal event is hypothetical. This is because the joint distribution of the nonterminal event (Inline graphic) and terminal event (Inline graphic) is only identified on a wedge of Inline graphic. Rather, they argued that “semi-competing risks data are better modeled using an illness-death compartment model,” where “a subject can either transit directly to the terminal event or first to the nonterminal event and then to the terminal event.” They proposed a Markov shared frailty model for the transition rates. Lee and others (2015) proposed a Bayesian semi-parametric extension, which focused on estimation of regression parameters, characterization of dependence between event times, and prediction of event times for specific covariate profiles. The latent failure approaches of Fine and others (2001), Wang (2003), and Peng and Fine (2007) have focused on estimating regression parameters and estimating dependence between nonterminal and terminal event times using copula models. Robins (1995a,b) focused solely on estimating regression parameters and discussed causal interpretability. Recently, Comment and others (2019) proposed a causal estimand similar to the one we discuss here, but used different models (i.e., parametric frailty models) and different causal assumptions (i.e., latent ignorability).

In this article, we are interested in estimating the causal effect of treatment on the nonterminal endpoint from a randomized trial generating semi-competing risk data. Using the potential outcomes framework (Rubin, 1974), we propose a principal stratification estimand (Frangakis and Rubin, 2002) to quantify the causal effect. Our estimand is a time-varying version of the survival average causal effect (see, e.g., Zhang and Rubin, 2003; Tchetgen Tchetgen, 2014), quantified on a relative risk scale. We introduce assumptions that utilize baseline covariates to identify this estimand from the distribution of the observable data and propose a Bayesian nonparametric (BNP) approach for modeling this distribution. An important feature of BNP models is their large support. For example, a Dirichlet process (Ferguson and others, 1973) location-scale mixture of normals (one of the most widely used BNP models), has full support on the space of absolutely continuous distributions (Lo, 1984). To handle covariates, our approach is based on the dependent Dirichlet process (DDP) prior introduced by MacEachern (1999).

The article is organized as follows. Section 2 introduces the motivating brain tumor study. The formal definition of the causal estimand is introduced in Section 3. We introduce the BNP model in Section 4. A simulation study is summarized in Section 5. We analyze the brain tumor data in Section 6, and conclude with a brief discussion in Section 7.

2. Motivating brain tumor study

The methodology is motivated by a randomized, placebo-controlled Phase II trial of 222 patients with recurrent malignant gliomas who were scheduled for tumor resection (Brem and others, 1995). Eligible patients had a single focus of tumor in the cerebrum, had a Karnofsky score greater than 60, had completed radiation therapy, had not taken nitrosoureas within 6 weeks of enrollment, and had not had systemic chemotherapy within 4 weeks of enrollment. The data include 11 baseline prognostic measures and a baseline evaluation of cerebellar function. The former include age, race, Karnofsky performance score, local vs. whole brain radiation, percent of tumor resection, previous use of nitrosoureas, and tumor histology (glioblastoma, anaplastic astrocytoma, oligodendroglioma, or other) at implantation. Patients were randomized to receive surgically implanted biodegradable polymer discs with or without 3.85% of carmustine. The follow-up duration was 1 year. Of the 219 patients with complete baseline measures, 204 were observed to die and 100 were observed to progress prior to death. Of the 15 patients who did not die, 4 were observed to have cerebellar progression. Our goal is to estimate the causal effect of treatment on time to cerebellar progression.

3. Causal estimand and identification assumptions

3.1. Potential outcomes and causal estimand

Let Inline graphic, Inline graphic, and Inline graphic denote progression time, death time, and censoring time, under treatment Inline graphic. Here, Inline graphic represents control and treatment group, respectively. All event times are log-transformed. Fundamental to our setting is that Inline graphic (i.e., progression cannot happen after death).

The causal estimand of interest is the function

$$\tau(u)=\frac{\Pr[Y_P^1<u \mid Y_D^0\ge u,\; Y_D^1\ge u]}{\Pr[Y_P^0<u \mid Y_D^0\ge u,\; Y_D^1\ge u]}, \tag{3.1}$$

where $\tau(u)$ is a smooth function of $u$. Among patients who survive to time $u$ under both treatments, this estimand contrasts the risk of progression prior to time $u$ for treatment 1 relative to treatment 0, which is a causal effect in a subgroup defined by potential outcomes. This estimand is an example of a principal stratum causal effect (Frangakis and Rubin, 2002).
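To make the definition of the estimand concrete, the following minimal sketch evaluates the ratio in (3.1) on a set of simulated potential outcomes. It is purely illustrative: in real data the two potential death times are never observed together, and the function name `tau` and the tuple layout are assumptions introduced here, not part of the authors' R code.

```python
def tau(u, draws):
    """Relative risk of progression before u among units that survive to u
    under BOTH treatments.

    draws: list of (yp0, yd0, yp1, yd1) tuples of simulated potential
    outcomes (progression/death under control and under treatment).
    """
    # Principal stratum: alive at u under control and under treatment.
    stratum = [d for d in draws if d[1] >= u and d[3] >= u]
    if not stratum:
        return float("nan")
    risk1 = sum(d[2] < u for d in stratum) / len(stratum)  # treatment arm
    risk0 = sum(d[0] < u for d in stratum) / len(stratum)  # control arm
    return risk1 / risk0 if risk0 > 0 else float("nan")


# Toy example: the last unit dies before u = 2 and is excluded.
toy = [(1.0, 3.0, 2.5, 3.0), (1.0, 3.0, 1.0, 3.0),
       (2.5, 3.0, 2.5, 3.0), (0.5, 1.0, 0.5, 1.0)]
print(tau(2.0, toy))
```

Because the stratum is defined by both potential death times, the same set of units appears in the numerator and the denominator, which is what gives the ratio its causal interpretation.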

3.2. Observed data

Let Inline graphic denote treatment assignment and Inline graphic denote a vector of the baseline covariates. Let Inline graphic, Inline graphic, and Inline graphic. Let Inline graphic, Inline graphic, Inline graphic, and Inline graphic denote the observed event times and event indicators. The observed data for each patient are Inline graphic. We assume that we observe Inline graphic i.i.d. copies of Inline graphic. Throughout, variables subscripted by Inline graphic will denote data specific to patient Inline graphic.

3.3. Identification assumptions

We introduce the following four assumptions that are sufficient for identifying our causal estimand.

Assumption 1: Treatment is randomized, i.e.,

graphic file with name Equation2.gif

and Inline graphic.

This assumption holds by design in randomized trials such as the one considered here.

Assumption 2: Censoring is noninformative in the sense that

graphic file with name Equation3.gif

and Inline graphic for all Inline graphic.

Let $\lambda_{xz}(t)$ and $G_{xz}(t)$ denote the conditional hazard function and conditional distribution function of $Y_D^z$ given $X=x$, respectively. Under Assumptions 1 and 2, $\lambda_{xz}$ and $G_{xz}$ are identified via the following formulae:

graphic file with name Equation4.gif

and

$$G_{xz}(t)=1-\exp\left\{-\int_0^t\lambda_{xz}(s)\,ds\right\}. \tag{3.2}$$
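The relationship in (3.2) between a hazard function and its distribution function can be checked numerically. The sketch below is an illustration only; it assumes a user-supplied hazard function and uses a simple trapezoidal rule. The function name `cdf_from_hazard` and its arguments are hypothetical.

```python
import math


def cdf_from_hazard(hazard, t, lower=0.0, steps=1000):
    """Return 1 - exp(-integral of hazard from `lower` to `t`),
    with the integral approximated by the trapezoidal rule."""
    if t <= lower:
        return 0.0
    h = (t - lower) / steps
    total = 0.5 * (hazard(lower) + hazard(t))
    for k in range(1, steps):
        total += hazard(lower + k * h)
    return 1.0 - math.exp(-total * h)


# A constant hazard of 0.5 gives an exponential distribution function.
print(cdf_from_hazard(lambda s: 0.5, 2.0), 1.0 - math.exp(-1.0))
```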

Furthermore, the conditional subdistribution function of Inline graphic given Inline graphic and Inline graphic, Inline graphic, is identified via the following formula:

$$V_{xz}(s\mid t)=\Pr[T_1\le s,\ \delta=1 \mid T_2=t,\ \xi=1,\ X=x,\ Z=z], \tag{3.3}$$

where Inline graphic. Together Inline graphic and Inline graphic identify the joint subdistribution Inline graphic for Inline graphic given Inline graphic.

Assumption 3: The conditional joint distribution function of Inline graphic given Inline graphic, Inline graphic, follows a Gaussian copula model, i.e.,

$$G_x(v,w;\rho)=\Phi_{2,\rho}\left[\Phi^{-1}\{G_{x0}(v)\},\ \Phi^{-1}\{G_{x1}(w)\}\right], \tag{3.4}$$

where $\Phi$ is a standard normal c.d.f. and $\Phi_{2,\rho}$ is a bivariate normal c.d.f. with mean 0, marginal variances 1, and correlation $\rho$. For fixed $\rho$, $G_x$ is identified since $G_{x0}$ and $G_{x1}$ are identified. Similar assumptions have been used in the causal mediation literature (Daniels and others, 2012).
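A Gaussian copula of the form (3.4) can be simulated by drawing correlated standard normals, mapping them to uniforms with the normal c.d.f., and then applying the inverse of each marginal distribution function. The sketch below illustrates this under the assumption that the marginal quantile functions are available as Python callables; the names `sample_death_times`, `inv_g0`, and `inv_g1` are hypothetical.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def sample_death_times(inv_g0, inv_g1, rho, n, seed=0):
    """Draw n pairs of death times (under control, under treatment) whose
    dependence follows a Gaussian copula with correlation rho.

    inv_g0, inv_g1: quantile functions of the two marginal distributions.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        z0 = rng.gauss(0.0, 1.0)
        # Second normal score with correlation rho to the first.
        z1 = rho * z0 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        u0, u1 = STD_NORMAL.cdf(z0), STD_NORMAL.cdf(z1)
        pairs.append((inv_g0(u0), inv_g1(u1)))
    return pairs


# Example with normal marginals on the log-time scale (illustrative values).
pairs = sample_death_times(NormalDist(5, 1).inv_cdf,
                           NormalDist(6, 2).inv_cdf, rho=0.8, n=5)
print(pairs)
```

Only the dependence between the two margins is governed by `rho`; the margins themselves are left untouched, which is why the correlation can serve as a sensitivity parameter.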

Assumption 4: Progression time under treatment Inline graphic is conditionally independent of death time under treatment Inline graphic given death time under treatment Inline graphic and covariates Inline graphic, i.e.,

graphic file with name Equation8.gif

Under Assumptions 1–4, $\tau(u)$ is identified from the distribution of the observed data as follows:

$$\tau(u)=\frac{\int_x\int_{s<u}\int_{v\ge u}\int_{t\ge u} dV_{x1}(s\mid t)\,dG_x(v,t)\,dK(x)}{\int_x\int_{s<u}\int_{v\ge u}\int_{t\ge u} dV_{x0}(s\mid t)\,dG_x(v,t)\,dK(x)}, \tag{3.5}$$

where $K(\cdot)$ is the empirical distribution of $X$.
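One step that may help in reading the identification formula: the conditioning event in (3.1) is the same in the numerator and the denominator, so its probability cancels and the estimand is a ratio of joint probabilities. Written out using only the symbols of (3.1), and assuming the probability of surviving to $u$ under both treatments is positive:

```latex
\begin{align*}
\tau(u)
 &= \frac{\Pr[Y_P^1<u,\ Y_D^0\ge u,\ Y_D^1\ge u]\,/\,\Pr[Y_D^0\ge u,\ Y_D^1\ge u]}
         {\Pr[Y_P^0<u,\ Y_D^0\ge u,\ Y_D^1\ge u]\,/\,\Pr[Y_D^0\ge u,\ Y_D^1\ge u]} \\[4pt]
 &= \frac{\Pr[Y_P^1<u,\ Y_D^0\ge u,\ Y_D^1\ge u]}
         {\Pr[Y_P^0<u,\ Y_D^0\ge u,\ Y_D^1\ge u]}.
\end{align*}
```

Each joint probability is then expressed as an integral over the covariates and the two death times, which is where the copula of Assumption 3 and the conditional independence of Assumption 4 enter.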

4. Bayesian regression model

In this section, we propose a BNP survival regression model on the unknown conditional (on Inline graphic) distribution of Inline graphic. Alternative Bayesian survival regression models could also be implemented (Hanson and Johnson, 2002; Gelfand and Kottas, 2003; Zhou and Hanson, 2018; Sparapani and others, 2016; Xu and others, 2019); however, the first three are restrictive in how covariates are entered and the fourth one is semi-parametric.

4.1. Dependent Dirichlet process—Gaussian process prior

We start with a review of the Dirichlet process (DP) as a prior for an unknown distribution and step by step extend it to the dependent Dirichlet process—Gaussian process prior.

The DP prior has been widely used as a prior model for a random unknown probability distribution. We write Inline graphic if a random distribution Inline graphic of a Inline graphic-dimensional random vector Inline graphic follows a DP prior, where Inline graphic is known as the total mass parameter and Inline graphic is known as the base measure. Sethuraman (1994) provides a constructive definition of a DP, where Inline graphic, Inline graphic, Inline graphic and Inline graphic. In many applications, the discrete nature of Inline graphic is not appropriate. A DP mixture model extends the DP model by replacing each point mass Inline graphic with a continuous kernel. For example, a DP mixture of normals takes the form: Inline graphic, where Inline graphic is the density function of a multivariate normal random vector with mean vector Inline graphic and variance–covariance matrix Inline graphic.
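The constructive (stick-breaking) definition of Sethuraman (1994) can be sketched in a few lines. In that construction each weight is a Beta(1, total mass) fraction of the probability mass not yet assigned, and the atoms are independent draws from the base measure. The sketch below generates truncated weights only; the function name, the truncation level, and the choice to give the leftover mass to the last atom are assumptions made for illustration.

```python
import random


def stick_breaking_weights(alpha, n_atoms, seed=0):
    """Truncated stick-breaking weights for a DP with total mass alpha."""
    rng = random.Random(seed)
    weights, remaining = [], 1.0
    for _ in range(n_atoms - 1):
        v = rng.betavariate(1.0, alpha)  # fraction of the remaining stick
        weights.append(v * remaining)
        remaining *= 1.0 - v
    weights.append(remaining)  # leftover mass goes to the last atom
    return weights


w = stick_breaking_weights(alpha=1.0, n_atoms=20)
print(sum(w))
```

Smaller values of the total mass parameter concentrate the weights on the first few atoms, while larger values spread mass over more atoms.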

To introduce a prior on the conditional (on covariates Inline graphic) distribution (Inline graphic) of Inline graphic, the DP mixture model has been extended to a dependent DP (DDP) by replacing Inline graphic in each term with Inline graphic, which is a multivariate stochastic process indexed by Inline graphic. A DDP mixture of normals takes the form:

$$dH_x(v)=\sum_{h=1}^{\infty} w_h\,\phi(v;\theta_h(x),\Sigma)\,dv. \tag{4.6}$$

To complete the prior specification, we need to posit a stochastic process prior for Inline graphic. A common specification is independent Gaussian process (GP) priors (MacEachern, 1999; Xu and others, 2016) on Inline graphic. A GP prior is specified such that for all Inline graphic and (Inline graphic, the distribution of Inline graphic follows a multivariate normal distribution with mean vector Inline graphic and Inline graphic covariance matrix where the Inline graphic entry is Inline graphic). We write Inline graphic. For an extensive review of the GP priors, see Rasmussen and Williams (2006) and MacKay (1999). We model the mean function Inline graphic as a linear regression on covariates Inline graphic with covariance process specified as

$$R_j(x_l,x_{l'})=\exp\left\{-\sum_{d=1}^{D}(x_{ld}-x_{l'd})^2\right\}+\delta_{ll'}\,\epsilon^2, \tag{4.7}$$

where Inline graphic is the dimension of the covariate vector, Inline graphic and Inline graphic is a small constant (e.g., Inline graphic) used to ensure that the covariance function is positive definite. To ensure a reasonable covariance structure, continuous covariates should be standardized to have mean 0 and variance 1. More flexible covariance functions can be considered if desired. Additional priors are introduced on the Inline graphic’s and Inline graphic, the details of which are discussed in Appendix A.1. We write Inline graphic.
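As an illustration of the covariance function (4.7), the sketch below builds the covariance matrix for a small set of standardized covariate vectors: a squared-exponential term plus a small constant added on the diagonal. The function name and the default value of `eps` are placeholders chosen for this example, not values taken from the article.

```python
import math


def gp_covariance(xs, eps=0.1):
    """Covariance matrix with entries
    exp(-sum_d (x_ld - x_md)^2) + eps^2 * 1{l == m}.

    xs: list of covariate vectors (lists of floats), assumed standardized.
    eps: small jitter constant (hypothetical default).
    """
    n = len(xs)
    cov = [[0.0] * n for _ in range(n)]
    for l in range(n):
        for m in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(xs[l], xs[m]))
            cov[l][m] = math.exp(-sq_dist) + (eps ** 2 if l == m else 0.0)
    return cov


for row in gp_covariance([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]):
    print([round(v, 4) for v in row])
```

Because the kernel has no length-scale parameter, the scale of each covariate directly controls how quickly the correlation decays, which is one reason standardizing continuous covariates matters here.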

4.2. Application to semi-competing risks data

Separately for each treatment group Inline graphic, we posit independent DDP-GP’s on the unknown conditional (on Inline graphic) probability measure (Inline graphic) of Inline graphic. Since Inline graphic (Inline graphic) and Inline graphic, the prior on Inline graphic induces priors on Inline graphic and Inline graphic (identified under Assumptions 1 and 2) and together with the Gaussian copula for Inline graphic implies a prior on the estimand Inline graphic. The prior on Inline graphic also induces priors on non-identified quantities which have no impact on our analysis. More specifics about our prior are presented in Appendix A.1.

Before transitioning to the posterior sampling algorithm, note that the relevant portion of the observed data likelihood for individual Inline graphic, with data Inline graphic is

graphic file with name Equation12.gif

We include the second equality because it allows us to see that, using data augmentation to replace the integrals, the joint full data likelihood is Inline graphic. This will allow us to use existing posterior simulation techniques for DDP-GP models.

4.3. Posterior simulation

The details of the Markov chain Monte Carlo (MCMC) algorithm are presented in Appendix A.2. Here, we focus on individuals assigned to treatment Inline graphic and suppress the dependence of the notation on Inline graphic. As noted above, the MCMC implementation is based on the full data likelihood. While Inline graphic is an infinite mixture of normals, we approximate it by a finite mixture with Inline graphic components. This finite mixture model for Inline graphic can be replaced by a hierarchical model where (1) Inline graphic is a latent variable that selects mixture component Inline graphic (Inline graphic) with probability Inline graphic (properly normalized to handle the finite number of mixture components) and (2) given Inline graphic, the pair Inline graphic follows a multivariate normal distribution with mean Inline graphic and variance Inline graphic.

Posterior simulation is based on this hierarchical model characterization. Importantly, all of the full conditionals in the MCMC algorithm have a closed form representation. Details of the MCMC posterior simulation can be found in Appendix A.2.

5. Simulation studies

5.1. Simulation setup

We considered three simulation scenarios to evaluate the performance of our proposed approach with 500 repeated simulations for each scenario. We generated Inline graphic. Independently of Inline graphic, we generated two independent covariates Inline graphic and Inline graphic, where Inline graphic followed a truncated normal distribution with mean 4.5, variance 1, and truncation interval Inline graphic and Inline graphic. For the first two simulation scenarios, we simulated progression time and death time on the log scale as follows:

graphic file with name Equation13.gif

In Scenario 1, we assumed Inline graphic followed a bivariate normal distribution with mean Inline graphic, marginal variances Inline graphic, and correlation Inline graphic. In Scenario 2, we assumed Inline graphic to be a scaled multivariate Inline graphic distribution with degrees of freedom Inline graphic, mean Inline graphic, marginal variance Inline graphic, and correlation Inline graphic. Scenario 3 explored performance under a nonlinear covariate effect specification on progression and death times. We generated Inline graphic and Inline graphic, with Inline graphic following the same distribution as in Scenario 1.

In all scenarios, the censoring time Inline graphic on the log scale was generated independently according to a Inline graphic distribution. In Scenario 1, 56.6% of the patients’ deaths and progressions were both observed (Inline graphic), 2% of the patients’ deaths and progressions were both censored (Inline graphic), 36.4% of the patients’ deaths were observed and progressions were censored (Inline graphic), 5% of the patients’ deaths were censored and progressions were observed (Inline graphic). In Scenario 2, 55.8% of the patients’ deaths and progressions were both observed, 4.8% of the patients’ deaths and progressions were both censored, 33.6% of the patients’ deaths were observed and progressions were censored, 5.8% of the patients’ deaths were censored and progressions were observed. In Scenario 3, these percentages were 69.4%, 3.4%, 10.6%, and 16.6%, respectively. For the joint distribution of Inline graphic and Inline graphic in (3.4), we set Inline graphic in the Gaussian copula as the truth. We generated Inline graphic for Inline graphic independent patients and then coarsened to Inline graphic.
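The coarsening step from full data to observed data can be sketched as follows. The exact definitions of the observed times and indicators are not recoverable from this copy of the text, so the sketch assumes the usual semi-competing risks convention: death is observed if it precedes censoring, and progression is observed if it precedes both death and censoring. The function name `coarsen` is hypothetical.

```python
def coarsen(y_p, y_d, c):
    """Map (progression time, death time, censoring time) to observed data
    (t1, t2, delta, xi) under the assumed convention described above."""
    t2 = min(y_d, c)                  # observed death/censoring time
    xi = 1 if y_d <= c else 0         # 1 if death is observed
    t1 = min(y_p, t2)                 # observed progression time
    delta = 1 if y_p <= t2 else 0     # 1 if progression is observed
    return t1, t2, delta, xi


# The four observation patterns (delta, xi) described in the text.
print(coarsen(1.0, 2.0, 3.0))  # both observed
print(coarsen(2.5, 3.0, 2.0))  # both censored
print(coarsen(2.5, 2.0, 3.0))  # death observed, progression censored
print(coarsen(1.0, 3.0, 2.0))  # death censored, progression observed
```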

To explore sensitivity of Inline graphic with respect to Inline graphic, we conducted inference for Inline graphic under several values of Inline graphic. For all three scenarios, we specified hyperparameters as described in Appendix A.1.

For comparative purposes, we implemented two alternative models. The first one is a naive Bayesian (Naive) model, which assumes that the conditional probability measure (Inline graphic) of Inline graphic follows a multivariate normal distribution with mean Inline graphic and variance–covariance matrix Inline graphic, with conjugate multivariate normal priors on Inline graphic and Inline graphic and an inverse Wishart prior on Inline graphic (i.e., Inline graphic, Inline graphic, and Inline graphic, Inline graphic). The second one is the linear DDP (LinearDDP) model proposed in De Iorio and others (2009), which simplifies the proposed BNP model by assuming that Inline graphic in (4.6) is a linear regression on Inline graphic, instead of a Gaussian process prior on Inline graphic used in the proposed BNP model.

For each analysis, we ran 5000 MCMC iterations with an initial burn-in of 2000 iterations and a thinning factor of 10. Convergence diagnostics computed with the R package coda showed no evidence of practical convergence problems.
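For orientation, the sketch below counts the posterior draws retained by such a schedule. It assumes that the burn-in is counted within the 5000 iterations, which the text does not state explicitly; under that assumption 300 draws are kept. The function name is hypothetical.

```python
def retained_iterations(n_iter=5000, burn_in=2000, thin=10):
    """1-based indices of MCMC iterations kept after burn-in and thinning,
    assuming the burn-in is part of the n_iter total."""
    return list(range(burn_in + thin, n_iter + 1, thin))


kept = retained_iterations()
print(len(kept), kept[0], kept[-1])
```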

5.2. Simulation results

We first report on the performance in terms of recovering the true treatment-specific marginal survival functions for time to death. For the BNP approach, Figure 1 shows, for each of the three simulation scenarios and by treatment group (first and second rows refer to treatments 0 and 1, respectively), the true survival functions (solid line), the posterior mean survival functions averaged over simulated data sets (dashed line), and 95% point-wise credible intervals (computed using quantiles) averaged over simulated data sets (dotted lines) on the original time scale (days). As another metric of performance, we computed, for each simulated data set, the root mean squared error (RMSE) taken as the square root of the average of the squared errors at 34 equally spaced grid points in log-scaled time interval Inline graphic. For each scenario, Table 1(a) summarizes the mean and standard deviation of RMSE across the 500 simulated data sets. Both Figure 1 and Table 1(a) show that our proposed BNP procedure performs well, for each of the three scenarios, in terms of recovering the true survival function.
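The RMSE metric described above can be written compactly. The sketch below takes an estimated and a true survival function and averages the squared error over equally spaced grid points. The interval endpoints are arguments because they are not legible in this copy of the text; the function name is hypothetical.

```python
import math


def rmse_on_grid(estimate, truth, lower, upper, n_points=34):
    """Root mean squared error between two functions on an equally
    spaced grid of n_points values in [lower, upper]."""
    step = (upper - lower) / (n_points - 1)
    grid = [lower + k * step for k in range(n_points)]
    mse = sum((estimate(t) - truth(t)) ** 2 for t in grid) / n_points
    return math.sqrt(mse)


# Toy check: an estimate that is off by a constant has RMSE equal to it.
print(rmse_on_grid(lambda t: 0.6, lambda t: 0.5, 0.0, 3.3))
```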

Fig. 1.

For each simulation scenario and by treatment group (first and second rows refer to treatments 0 and 1, respectively), the true survival functions (solid line), the posterior mean survival functions averaged over simulated data sets (dashed line), and 95% point-wise credible intervals (computed using quantiles) averaged over simulated data sets (dotted lines). Survival times are on the original scale (days).

Table 1.

(a) For each scenario, mean and standard deviation of RMSE across 500 simulated data sets under the proposed BNP method, the naive Bayesian method (Naive), and the LinearDDP method. Bold values indicate that the proposed BNP yields the smallest mean RMSE when Inline graphic. (b) Means and standard deviations of RMSE for estimating Inline graphic across 500 simulations in three scenarios under the proposed BNP approach, the naive Bayesian method (Naive), and the LinearDDP method, respectively.

(a)

| Scenario | Inline graphic: BNP | Naive | LinearDDP | Inline graphic: BNP | Naive | LinearDDP |
|---|---|---|---|---|---|---|
| 1 | 0.012 (0.007) | 0.013 (0.007) | 0.014 (0.007) | 0.012 (0.006) | 0.013 (0.007) | 0.015 (0.008) |
| 2 | 0.042 (0.022) | 0.088 (0.032) | 0.063 (0.020) | 0.019 (0.007) | 0.073 (0.035) | 0.058 (0.023) |
| 3 | 0.012 (0.006) | 0.013 (0.007) | 0.014 (0.007) | 0.012 (0.007) | 0.014 (0.007) | 0.016 (0.008) |

(b)

| Scenario | Inline graphic: BNP | Naive | LinearDDP | Inline graphic: BNP | Naive | LinearDDP |
|---|---|---|---|---|---|---|
| 1 | 0.286 (0.087) | 0.328 (0.126) | 0.332 (0.087) | 0.059 (0.035) | 0.073 (0.051) | 0.091 (0.016) |
| 2 | 0.277 (0.128) | 0.493 (0.250) | 0.449 (0.189) | 0.090 (0.062) | 0.199 (0.169) | 0.179 (0.123) |
| 3 | 0.106 (0.032) | 0.105 (0.038) | 0.115 (0.043) | 0.033 (0.016) | 0.035 (0.021) | 0.043 (0.027) |

| Scenario | Inline graphic: BNP | Naive | LinearDDP |
|---|---|---|---|
| 1 | 0.185 (0.037) | 0.207 (0.047) | 0.181 (0.042) |
| 2 | 0.261 (0.070) | 0.243 (0.111) | 0.203 (0.084) |
| 3 | 0.086 (0.028) | 0.097 (0.034) | 0.086 (0.035) |

Table 1(a) also shows the mean and standard deviation of RMSE for the Naive and the LinearDDP models. In Scenario 1, the two models match the true simulation model, thereby yielding results comparable to those of the proposed BNP model. In contrast, the Naive and the LinearDDP models perform worse than the BNP model in Scenario 2, where the fitted model does not match the simulation truth. In Scenario 3, the BNP model performs slightly better than the Naive and the LinearDDP models. Overall, the proposed BNP model is more robust than the Naive and the LinearDDP models.

Evaluation of Inline graphic requires evaluation of Inline graphic as the second marginal under Inline graphic. Expression (3.5) allows us now to estimate Inline graphic. Both the numerator and denominator can be evaluated as functionals of the currently imputed random probability measure Inline graphic of time to log progression Inline graphic and time to log death Inline graphic under treatment Inline graphic, marginalizing with respect to the empirical distribution of covariates Inline graphic’s. Each iteration of the posterior MCMC simulation evaluates a point-wise estimate and we estimate the posterior mean of Inline graphic as Inline graphic across iterations. We also report the mean RMSE in estimating the Inline graphic by averaging over 500 repeated simulations under the proposed BNP, the Naive, and the LinearDDP models. Table 1(b) summarizes the results.
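The posterior summarization described here (a point-wise mean across iterations, with quantile-based credible intervals as used in the figures) can be sketched as below. The input layout, one list of estimand values per retained MCMC iteration on a common grid, and the function names are assumptions for illustration; the quantile rule is simple linear interpolation between order statistics.

```python
import math


def _quantile(sorted_vals, q):
    """Linear-interpolation quantile of an already sorted list."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1.0 - frac) + sorted_vals[hi] * frac


def summarize_tau_draws(draws, level=0.95):
    """Point-wise posterior mean and credible interval.

    draws: list over MCMC iterations; each element is a list of estimand
    values evaluated on the same grid of time points.
    Returns one (mean, lower, upper) tuple per grid point.
    """
    tail = (1.0 - level) / 2.0
    summary = []
    for j in range(len(draws[0])):
        col = sorted(d[j] for d in draws)
        summary.append((sum(col) / len(col),
                        _quantile(col, tail),
                        _quantile(col, 1.0 - tail)))
    return summary


toy = [[0.8 + 0.001 * m, 0.9] for m in range(300)]
print(summarize_tau_draws(toy)[0])
```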

Figure 2 shows Inline graphic versus Inline graphic in the three scenarios, respectively, using Inline graphic and Inline graphic. As shown in Figure 2, in all three scenarios, when Inline graphic, the estimates under the proposed BNP model reliably recover the simulated true Inline graphic and avoid the excessive bias seen with other Inline graphic values. This agrees with the results reported in Table 1(b) that Inline graphic always yields the smallest mean RMSE in all three scenarios. Furthermore, when Inline graphic, the proposed BNP model has smaller mean RMSE compared to the Naive and the LinearDDP models. When Inline graphic or Inline graphic, the BNP model performs better than or comparably to the Naive model in terms of providing smaller mean RMSE and variability of RMSE across simulations.

Fig. 2.

The posterior estimates (dashed lines) of Inline graphic versus Inline graphic on the original scale (days) for the three scenarios using Inline graphic, respectively. The solid lines represent the simulation truth using Inline graphic. The dotted lines represent 95% point-wise credible intervals (computed using quantiles) averaged over simulated datasets.

6. Brain tumor data analysis

An initial analysis of the brain tumor death outcome using Kaplan–Meier is given in Figure 3, indicating that the treatment group has higher estimated survival probabilities. The estimated difference at 365 days is 2.6% (95% CI −8.1% to 13.3%). Figure 3 plots the estimated posterior survival curves for treatment and control groups marginalized over the distribution of covariates with 95% credible intervals; panels (a), (b), and (c) display the results for the BNP, Naive, and LinearDDP approaches, respectively. Using the BNP approach, the estimated posterior difference in survival at 365 days is 6.2% (95% CI −1.2% to 13.3%). For the Naive approach, the estimated posterior difference in survival at 365 days is 8.4% (95% CI 0.2% to 17.9%). The LinearDDP approach estimated the posterior difference in survival at 365 days to be 9.9% (95% CI 0.9% to 20.8%). The BNP approach produces comparable or higher treatment-specific estimates of survival and greater treatment differences than Kaplan–Meier. In contrast, the Naive and LinearDDP approaches produce comparable or lower (higher) estimates of survival for the control (treatment) group than Kaplan–Meier. Comparatively speaking, the Naive and LinearDDP approaches produce lower treatment-specific posterior estimates of survival than the BNP approach. When we compare the fit to the observed survival data of the three approaches using the log-pseudo marginal likelihood (LPML; Geisser and Eddy, 1979), a leave-one-out cross-validation statistic, we see that the BNP approach performs better. Specifically, the LPML for the treatment arm is −144, −161, and −147 for the BNP, Naive, and LinearDDP approaches, respectively. The corresponding numbers for the control arm are −137, −174, and −139.
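For readers unfamiliar with LPML, a common way to compute it from MCMC output is through the conditional predictive ordinate (CPO): for each subject, the CPO is the harmonic mean of that subject's likelihood across posterior draws, and LPML is the sum of the log CPOs. The sketch below follows that standard estimator and is not taken from the authors' code; the input layout and function name are assumptions.

```python
import math


def lpml(loglik_draws):
    """Log-pseudo marginal likelihood from per-draw log-likelihoods.

    loglik_draws[m][i] is the log-likelihood contribution of subject i
    evaluated at posterior draw m.
    """
    n_draws = len(loglik_draws)
    n_subjects = len(loglik_draws[0])
    total = 0.0
    for i in range(n_subjects):
        # log CPO_i = -log( mean over draws of 1 / likelihood ),
        # computed with a log-sum-exp shift for numerical stability.
        neg = [-row[i] for row in loglik_draws]
        shift = max(neg)
        log_mean_inv = shift + math.log(
            sum(math.exp(v - shift) for v in neg) / n_draws)
        total -= log_mean_inv
    return total


# One subject, likelihoods 0.5 and 0.25: harmonic mean is 1/3.
print(lpml([[math.log(0.5)], [math.log(0.25)]]), math.log(1.0 / 3.0))
```

Larger (less negative) LPML values indicate better leave-one-out predictive fit, which is the direction of comparison used in the text.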

Fig. 3.

The dashed lines in (a) represent the estimated posterior mean survival curves for the proposed BNP method. The dot-dash lines in (b) and (c) represent the estimated posterior mean survival curves for the Naive method and LinearDDP method, respectively. In all panels, the solid lines represent the Kaplan–Meier curves of the observed survival data in control and treatment groups, and the dotted lines represent 95% point-wise credible intervals of the posterior estimated survival curves. Survival times are on the original scale (days).

For the BNP (panel (a)), Naive (panel (b)), and LinearDDP (panel (c)) approaches, Figure 4 plots the posterior estimates (along with point-wise 95% credible intervals) of the causal estimand $\tau(u)$ versus $u$ for three choices of $\rho$: 0.2, 0.5, and 0.8. Except near Inline graphic, there are no appreciable differences between the approaches. In addition, the results are insensitive to the choice of $\rho$. Overall, this analysis shows that there is a lower estimated risk of progression for treatment versus control at all time points, except near zero. However, there is appreciable uncertainty, characterized by wide posterior credible intervals, that precludes more definitive conclusions about the difference between treatment groups with regard to progression. When we compare the fit of the three approaches to the observed survival and progression data using LPML, we see that the approaches perform comparably. Specifically, the LPML for the treatment arm is −227, −232, and −235 for the BNP, Naive, and LinearDDP approaches, respectively. The corresponding numbers for the control arm are −215, −214, and −219.

Fig. 4.

Posterior estimated Inline graphic versus Inline graphic on the original scale (days) in brain tumor data analysis for different Inline graphic’s under the proposed BNP method, the Naive method, and the LinearDDP method, respectively. The solid lines represent the posterior estimated Inline graphic, and the dashed lines represent 95% point-wise credible intervals. (a) BNP, (b) Naive, and (c) LinearDDP.

7. Discussion

In this article, we proposed a causal estimand for characterizing the effect of treatment on progression in randomized trials with a semi-competing risks data structure. We introduced a set of identification assumptions, indexed by a non-identifiable sensitivity parameter that quantifies the correlation between survival under treatment and survival under control. Selecting a range of the sensitivity parameter $\rho$ in a specific trial will depend on clinical considerations. For example, in a trial of a biomarker-targeted therapy, one might expect weaker correlation, since survival under control might be primarily determined by co-morbidities and survival under treatment might be determined more by the presence of the targeted molecular aberration. For instance, a recent FDA-approved drug, LOXO-101 (Hyman and others, 2017), targeting NTRK fusion has an overall response rate of 78% in the treatment group, compared with only 10% in the control group. In contrast, for some chemotherapies, the same factors that impact survival under control may equally impact survival under treatment, e.g., co-morbidities, social support (Kaufman and others, 2015). In that case, we would suggest a medium or high $\rho$, say Inline graphic. Fortunately, the sensitivity parameter is bounded between −1 and 1 and, in most settings, should be positive; a range should be selected in close collaboration with subject matter experts.

We proposed a flexible BNP approach for modeling the distribution of the observed data. Since the causal estimand is a functional of the distribution of the observed data and $\rho$, we draw inference about it using posterior summarization. Our procedure can easily be extended to accommodate a prior distribution on $\rho$, which will allow for integrated inference. Our procedure also allows for posterior inferences about other identified causal contrasts such as the distribution of survival under treatment versus under control. The procedure can also be used for predictive inference for patients with specific covariate profiles.

Acknowledgments

The authors would like to thank Drs Henry Brem and Steven Piantadosi for providing access to data from the brain cancer trial.

Conflict of Interest: None declared.

Appendix A

A.1 Determining prior hyperparameters

As priors for Inline graphic in the GP mean function, we assume Inline graphic. We assume Inline graphic, where Inline graphic. The precision parameter Inline graphic in the DDP is assumed to be distributed Inline graphic.

In applications of Bayesian inference with small to moderate sample sizes, a critical step is to fix values for all hyperparameters Inline graphic. Poorly chosen numerical values can introduce inappropriate prior information and lead to inaccurate posterior inference. We use an empirical Bayes method to obtain Inline graphic by fitting a bivariate normal distribution to the responses of patients under treatment Inline graphic, Inline graphic. For Inline graphic, we assume a diagonal matrix with all diagonal entries equal to 10. After an empirical estimate of Inline graphic is computed, we tune Inline graphic and Inline graphic so that the prior mean of Inline graphic matches the empirical estimate, Inline graphic and Inline graphic. Finally, we assume Inline graphic.
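
The empirical Bayes step can be sketched in Python as below. The sketch makes two assumptions that are not confirmed by the text: that the bivariate normal is fitted by the sample mean and sample covariance of paired responses, and that the prior being tuned is an inverse-gamma distribution with shape `shape` and scale `scale`, whose mean is `scale / (shape - 1)`. All function names are hypothetical.

```python
def fit_bivariate_normal(pairs):
    """Sample mean vector and covariance matrix of paired responses."""
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs) / (n - 1)
    syy = sum((y - mean_y) ** 2 for _, y in pairs) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs) / (n - 1)
    return (mean_x, mean_y), [[sxx, sxy], [sxy, syy]]


def inverse_gamma_scale_for_mean(target_mean, shape):
    """Scale that makes an inverse-gamma prior mean equal target_mean.

    Uses E[X] = scale / (shape - 1), which requires shape > 1.
    """
    if shape <= 1.0:
        raise ValueError("the prior mean exists only for shape > 1")
    return target_mean * (shape - 1.0)


# Toy usage with made-up paired responses (e.g. two log event times).
pairs = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
mean, cov = fit_bivariate_normal(pairs)
# Match the prior mean of a variance parameter to the empirical variance.
scale = inverse_gamma_scale_for_mean(cov[1][1], shape=3.0)
print(mean, cov, scale)
```

Fixing the shape and solving for the scale is only one way to match a prior mean; a smaller shape gives a more diffuse prior around the same mean.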

A.2 MCMC computational details

Unless required for clarity, we suppress dependence of the notation on treatment Inline graphic. Here, Inline graphic is used to denote endpoint (Inline graphic for progression and Inline graphic for death). We define

graphic file with name Equation14.gif

Let Inline graphic and Inline graphic, Inline graphic, Inline graphic (Inline graphic), Inline graphic is an Inline graphic matrix where the Inline graphicth row contains the Inline graphic-dimensional covariate vector Inline graphic for the Inline graphicth patient, Inline graphic is an Inline graphic matrix where the Inline graphic entry is Inline graphic, Inline graphic is an Inline graphic matrix where the Inline graphicth row refers to the Inline graphicth patient in Inline graphic, the Inline graphicth column refers to patient Inline graphic and Inline graphic element is the indicator that the patient in Inline graphicth row is the same as the patient in the Inline graphicth column, Inline graphic is an Inline graphic identity matrix, Inline graphic where Inline graphic and Inline graphic, Inline graphic, and Inline graphic.

For Inline graphic, we iterate through the following six updating steps:

  • (1) Update Inline graphic
    graphic file with name Equation15.gif
    here Inline graphic is the number of observations such that Inline graphic. Then, Inline graphic and Inline graphic.
  • (2) Update Inline graphic

    Assuming that Inline graphic,
    graphic file with name Equation16.gif
    here Inline graphic is generated from Step 1.
  • (3) Update Inline graphic
    graphic file with name Equation17.gif
  • (4) Update Inline graphic, Inline graphic
    graphic file with name Equation18.gif
  • (5) Update Inline graphic
    graphic file with name Equation19.gif
    here Inline graphic.
  • (6) Update Inline graphic, where Inline graphic.

    We write Inline graphic as Inline graphic where Inline graphic includes Inline graphic.

    • If Inline graphic
      graphic file with name Equation20.gif
      Inline graphic is point mass at Inline graphic.
    • If Inline graphic (i.e., Inline graphic),
      graphic file with name Equation21.gif
      graphic file with name Equation22.gif
      here Inline graphic.
    • If Inline graphic and Inline graphic
      graphic file with name Equation23.gif
      graphic file with name Equation24.gif
      here Inline graphic.
    • If Inline graphic and Inline graphic
      graphic file with name Equation25.gif
      graphic file with name Equation26.gif
      here Inline graphic.
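
The equations for the six steps are not reproduced in this version of the text. Steps 1 and 2 refer to the number of observations per component, to weights built from the quantities drawn in Step 1, and to a prior on the precision parameter. That description matches the standard blocked Gibbs update for a truncated stick-breaking representation (Sethuraman, 1994). The Python sketch below shows that standard update, assuming a truncation level `H` and a Gamma(`a`, `b`) prior (shape, rate) on the precision parameter `alpha`. It should not be read as the authors' exact sampler, and every name in it is hypothetical.

```python
import math
import random


def update_stick_breaking(labels, H, alpha, rng):
    """Draw stick-breaking fractions v and mixture weights w.

    labels[i] in {0, ..., H-1} is the component of observation i.
    v_h ~ Beta(1 + n_h, alpha + number of observations in components > h).
    """
    counts = [0] * H
    for r in labels:
        counts[r] += 1
    v = []
    remaining = len(labels)
    for h in range(H - 1):
        remaining -= counts[h]  # observations in components after h
        draw = rng.betavariate(1.0 + counts[h], alpha + remaining)
        v.append(min(draw, 1.0 - 1e-12))  # keep log(1 - v) finite
    v.append(1.0)  # truncation: the last fraction uses up the stick
    weights, stick = [], 1.0
    for vh in v:
        weights.append(vh * stick)
        stick *= 1.0 - vh
    return v, weights


def update_alpha(v, a, b, rng):
    """Draw alpha from Gamma(a + H - 1, rate = b - sum log(1 - v_h))."""
    H = len(v)
    rate = b - sum(math.log(1.0 - vh) for vh in v[:-1])
    # random.gammavariate takes (shape, scale), so pass 1 / rate.
    return rng.gammavariate(a + H - 1, 1.0 / rate)


# Toy usage with made-up component labels.
rng = random.Random(42)
labels = [0, 0, 1, 2, 0, 1, 3, 0]
v, w = update_stick_breaking(labels, H=5, alpha=1.0, rng=rng)
alpha = update_alpha(v, a=1.0, b=1.0, rng=rng)
print(w, alpha)
```

Setting the last fraction to 1 guarantees that the truncated weights sum to one, which is what makes the finite approximation a proper mixture.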

Contributor Information

Yanxun Xu, Department of Applied Mathematics and Statistics, Johns Hopkins University, 3400 N. Charles Street, Baltimore, MD 21218, USA, yanxun.xu@jhu.edu.

Daniel Scharfstein, Department of Biostatistics, Johns Hopkins University, 615 N Wolfe St, Baltimore, MD 21205, USA.

Peter Müller, Department of Mathematics, The University of Texas at Austin, 2515 Speedway, RLM 8.100, Austin, TX 78712, USA.

Michael Daniels, Department of Statistics, University of Florida, Union Rd, Gainesville, FL 32603, USA.

Funding

This research is supported by National Institutes of Health (NIH) grants CA183854 and GM 112327, and National Science Foundation (NSF) grant 1918854.

References

  1. Brem, H., Piantadosi, S., Burger, P. C., Walker, M., Selker, R., Vick, N. A., Black, K., Sisti, M., Brem, S., Mohr, G. and others. (1995). Placebo-controlled trial of safety and efficacy of intraoperative controlled delivery by biodegradable polymers of chemotherapy for recurrent gliomas. The Lancet 345, 1008–1012.
  2. Chen, Y.-H. (2012). Maximum likelihood analysis of semicompeting risks data with semiparametric regression models. Lifetime Data Analysis 18, 36–57.
  3. Comment, L., Mealli, F., Haneuse, S. and Zigler, C. (2019). Survivor average causal effects for continuous time: a principal stratification approach to causal inference with semicompeting risks. arXiv preprint arXiv:1902.09304.
  4. Daniels, M. J., Roy, J. A., Kim, C., Hogan, J. W. and Perri, M. G. (2012). Bayesian inference for the causal effect of mediation. Biometrics 68, 1028–1036.
  5. De Iorio, M., Johnson, W. O., Müller, P. and Rosner, G. L. (2009). Bayesian nonparametric nonproportional hazards survival modeling. Biometrics 65, 762–771.
  6. Ding, A., Shi, G., Wang, W. and Hsieh, J.-J. (2009). Marginal regression analysis for semi-competing risks data under dependent censoring. Scandinavian Journal of Statistics 36, 481–500.
  7. Ferguson, T. S. (1973). A Bayesian analysis of some nonparametric problems. The Annals of Statistics 1, 209–230.
  8. Fine, J. P., Jiang, H. and Chappell, R. (2001). On semi-competing risks data. Biometrika 88, 907–919.
  9. Fix, E. and Neyman, J. (1951). A simple stochastic model of recovery, relapse, death and loss of patients. Human Biology, 205–241.
  10. Frangakis, C. E. and Rubin, D. B. (2002). Principal stratification in causal inference. Biometrics 58, 21–29.
  11. Geisser, S. and Eddy, W. F. (1979). A predictive approach to model selection. Journal of the American Statistical Association 74, 153–160.
  12. Gelfand, A. E. and Kottas, A. (2003). Bayesian semiparametric regression for median residual life. Scandinavian Journal of Statistics 30, 651–665.
  13. Hanson, T. and Johnson, W. O. (2002). Modeling regression error with a mixture of Polya trees. Journal of the American Statistical Association 97, 1020–1033.
  14. Hougaard, P. (1999). Multi-state models: a review. Lifetime Data Analysis 5, 239–264.
  15. Hsieh, J.-J. and Huang, Y.-T. (2012). Regression analysis based on conditional likelihood approach under semi-competing risks data. Lifetime Data Analysis 18, 302–320.
  16. Hyman, D. M., Laetsch, T. W., Kummar, S., DuBois, S. G., Farago, A. F., Pappo, A. S., Demetri, G. D., El-Deiry, W. S., Lassen, U. N., Dowlati, A. and others. (2017). The efficacy of larotrectinib (LOXO-101), a selective tropomyosin receptor kinase (TRK) inhibitor, in adult and pediatric TRK fusion cancers. Journal of Clinical Oncology 18_suppl, LBA2501.
  17. Ibrahim, J. G., Chen, M.-H. and Sinha, D. (2005). Bayesian Survival Analysis. Hoboken, NJ: Wiley Online Library.
  18. Kaufman, P. A., Awada, A., Twelves, C., Yelle, L., Perez, E. A., Velikova, G., Olivo, M. S., He, Y., Dutcus, C. E. and Cortes, J. (2015). Phase III open-label randomized study of eribulin mesylate versus capecitabine in patients with locally advanced or metastatic breast cancer previously treated with an anthracycline and a taxane. Journal of Clinical Oncology 33, 594.
  19. Lee, K. H., Haneuse, S., Schrag, D. and Dominici, F. (2015). Bayesian semiparametric analysis of semicompeting risks data: investigating hospital readmission after a pancreatic cancer diagnosis. Journal of the Royal Statistical Society: Series C (Applied Statistics) 64, 253–273.
  20. Lin, D. Y., Robins, J. M. and Wei, L. J. (1996). Comparing two failure time distributions in the presence of dependent censoring. Biometrika 83, 381–393.
  21. Lo, A. Y. (1984). On a class of Bayesian nonparametric estimates: I. Density estimates. The Annals of Statistics 12, 351–357.
  22. MacEachern, S. N. (1999). Dependent nonparametric processes. In: ASA Proceedings of the Section on Bayesian Statistical Science. Alexandria, VA: American Statistical Association, pp. 50–55.
  23. MacKay, D. (1999). Introduction to Gaussian processes. Technical Report. Cambridge University. http://wol.ra.phy.cam.ac.uk/mackay/GP/.ter.
  24. Peng, L. and Fine, J. P. (2007). Regression modeling of semicompeting risks data. Biometrics 63, 96–108.
  25. Peng, L. and Fine, J. P. (2012). Rank estimation of accelerated lifetime models with dependent censoring. Journal of the American Statistical Association.
  26. Rasmussen, C. E. and Williams, C. K. I. (2006). Gaussian Processes for Machine Learning. MIT Press.
  27. Robins, J. M. (1995a). An analytic method for randomized trials with informative censoring: Part II. Lifetime Data Analysis 1, 417–434.
  28. Robins, J. M. (1995b). An analytic method for randomized trials with informative censoring: Part I. Lifetime Data Analysis 1, 241–254.
  29. Rubin, D. B. (1974). Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology 66, 688.
  30. Sethuraman, J. (1994). A constructive definition of Dirichlet priors. Statistica Sinica 4, 639–650.
  31. Sparapani, R. A., Logan, B. R., McCulloch, R. E. and Laud, P. W. (2016). Nonparametric survival analysis using Bayesian Additive Regression Trees (BART). Statistics in Medicine 35, 2741–2753.
  32. Tchetgen Tchetgen, E. J. (2014). Identification and estimation of survivor average causal effects. Statistics in Medicine 33, 3601–3628.
  33. Varadhan, R., Xue, Q.-L. and Bandeen-Roche, K. (2014). Semicompeting risks in aging research: methods, issues and needs. Lifetime Data Analysis 20, 538–562.
  34. Wang, W. (2003). Estimating the association parameter for copula models under dependent censoring. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 65, 257–273.
  35. Xu, J., Kalbfleisch, J. D. and Tai, B. (2010). Statistical analysis of illness–death processes and semicompeting risks data. Biometrics 66, 716–725.
  36. Xu, Y., Müller, P., Wahed, A. S. and Thall, P. F. (2016). Bayesian nonparametric estimation for dynamic treatment regimes with sequential transition times. Journal of the American Statistical Association 111, 921–950.
  37. Xu, Y., Thall, P. F., Hua, W. and Andersson, B. S. (2019). Bayesian non-parametric survival regression for optimizing precision dosing of intravenous busulfan in allogeneic stem cell transplantation. Journal of the Royal Statistical Society: Series C (Applied Statistics) 68, 809–828.
  38. Zhang, J. L. and Rubin, D. B. (2003). Estimation of causal effects via principal stratification when some outcomes are truncated by “death”. Journal of Educational and Behavioral Statistics 28, 353–368.
  39. Zhou, H. and Hanson, T. (2018). A unified framework for fitting Bayesian semiparametric models to arbitrarily censored survival data, including spatially-referenced data. Journal of the American Statistical Association 113, 571–581.

Articles from Biostatistics (Oxford, England) are provided here courtesy of Oxford University Press
