Abstract
In semicompeting risks problems, nonterminal time-to-event outcomes, such as time to hospital readmission, are subject to truncation by death. These settings are often modeled with illness-death models for the hazards of the terminal and nonterminal events, but evaluating causal treatment effects with hazard models is problematic due to conditioning on survival – a post-treatment outcome – that is embedded in the definition of a hazard. Extending an existing survivor average causal effect (SACE) estimand, we frame the evaluation of treatment effects in the context of semicompeting risks with principal stratification and introduce two new causal estimands: the time-varying survivor average causal effect (TV-SACE) and the restricted mean survivor average causal effect (RM-SACE). These principal causal effects are defined among units that would survive regardless of assigned treatment. We adopt a Bayesian estimation procedure that parameterizes illness-death models for both treatment arms. We outline a frailty specification that can accommodate within-person correlation between nonterminal and terminal event times, and we discuss potential avenues for adding model flexibility. The method is demonstrated in the context of hospital readmission among late-stage pancreatic cancer patients.
Keywords: Causal inference, Hospital readmission, Principal stratification, Semicompeting risks, Survivor average causal effect
1. Introduction
In end-of-life care settings, doctors, patients, and policymakers often make decisions based on outcomes related to quality of the remaining lifespan, e.g., hospitalization, onset of dementia, or loss of independence. Comparing treatments on the basis of non-mortality time-to-event outcomes is complicated by the fact that interventions on quality of life can also affect mortality. As a motivating example, consider a hospital tasked with reducing hospital readmission among late-stage cancer patients. Death is a “terminal” or “truncating” event because its occurrence precludes future hospitalizations, and a hospital readmission is a “nonterminal” event which does not truncate death. An intervention that increases risk of readmission may simultaneously increase risk of death, leading a naïve analysis to attribute the lower observed readmission rates to a beneficial effect of the intervention. This danger stems from a problem known as “truncation by death” and has been addressed in the causal inference literature using principal stratification (Zhang and Rubin, 2003).
Principal stratification handles truncation by death by restricting causal contrasts to groups that would not experience the truncating event under either treatment at a fixed time point. Initially introduced by Robins (1986) and formalized in Rubin (2000) and Zhang and Rubin (2003), the traditional survivor average causal effect (SACE) is the causal effect of the treatment on the truncated (i.e., non-mortality) outcome among the subpopulation that would survive regardless of treatment assignment. A number of papers have discussed the nonparametric identifiability conditions and assumptions for the SACE (Long and Hudgens, 2013; Zhang and Rubin, 2003; Robins, 1986; Tchetgen Tchetgen, 2014). The time point for defining always-survivorship is often implicit, such as “by the end of the study,” and typically only one such time is considered. With a time-to-event structure for both the terminal (truncating) event and the nonterminal event, explicit definition of causal effects is indexed by both: (1) the time defining the “always survivors,” denoted with $r$, and (2) the time interval over which treatment contrasts on the nonterminal event are evaluated, $t$. For example, interest may lie in the causal effect on cumulative incidence of readmission at 30 days post-discharge ($t = 30$) among patients who would survive under either treatment at 60 days ($r = 60$). Examining such quantities across different values of $(r, t)$ can serve different inferential purposes. We use the term “snapshot causal effect” to describe survivor causal effects with $r = t$.
When the truncated outcome is time-to-event, estimating the SACE at a single time point can be problematic. First, providing one snapshot effect does not give decision makers information about the sensitivity of conclusions to the (possibly arbitrary) choice of $r$. It also does not account for the fact that timing matters for the nonterminal event. With hospital readmission, being hospitalized earlier may lead to more total hospitalizations or accelerate death. In other contexts – such as the onset of dementia – an earlier occurrence of the nonterminal event means more time spent in an unfavorable state, even if total lifespan remains unaffected. These concerns motivate the development of principal stratification methods that explicitly account for the time-to-event nature of the nonterminal outcome.
Others have partially grappled with principal stratification defined over time. For example, methods exist for treatment noncompliance in longitudinal contexts (Lin et al., 2008; Dai et al., 2012). But unlike treatment compliance status, which can vary over time arbitrarily, death by time $r$ under either treatment condition necessarily precludes membership in an always-alive state at any $r' \geq r$. Defining strata on the basis of survival is also closely linked to principal strata generated by other continuously-scaled quantities (Jin and Rubin, 2008; Bartolucci and Grilli, 2011; Schwartz et al., 2011), which can, in principle, create an infinite number of strata that are difficult to collapse into meaningful subpopulations and can entail estimation strategies complicated by problems of partial identifiability. Others have deployed similar perspectives to time-to-event problems where survival is intertwined with interest in a nonterminal event. Mattei et al. (2024) consider a clinical trial where patients could switch treatment, with principal strata defined based on the timing of treatment switching. Nevo and Gorfine (2022) provide estimands anchored to potential outcomes but based on a principal stratification defined by the ordering of a terminal and a nonterminal event, focusing inference on estimating effects on both event times. Lyu et al. (2023) build on a Bayesian estimation approach more similar to the one pursued here, focusing estimation on snapshot effects of a treatment on recurrent nonterminal events by a given time $t$, among principal strata who would survive under both treatments up to that time. Xu et al. (2022) provide a flexible Bayesian nonparametric estimation strategy targeting similar snapshot effects.
In the survival analysis literature, the problem of nonterminal time-to-event outcomes which may be truncated by terminal events is referred to as semicompeting risks because the terminal event acts as a competing risk for the nonterminal event, but the reverse is not true (Fine et al., 2001). Models which accommodate this semicompeting risks structure have been applied to a wide range of settings, including hospital readmission (Lee et al., 2015), cancer recurrence (Xu et al., 2010, 2022), career advancement (Pan and Gastwirth, 2013), and subscription product upgrades (Chen et al., 2017). Shared subject- and/or cluster-specific random effects, termed “frailties,” can allow for within-subject or across-subject correlation between event times that is induced by unmeasured factors (Xu et al., 2010). Such models are typically constructed on the hazard scale and account for truncation by removing individuals from nonterminal risk sets after the time of their observed terminal event; this removal is akin to what occurs with cause-specific hazards in competing risks problems. Joint modelling of the time-to-event outcomes is used to describe cumulative incidences or hazard-based predictive models (Lee et al., 2015). With a few notable exceptions (Xu et al., 2022; Lyu et al., 2023; Nevo and Gorfine, 2022; Mattei et al., 2024), analyses of semicompeting risks emanating from the survival analysis literature typically do not focus explicitly on causal inference for treatment effects, and inference for regression parameters in those approaches does not necessarily support causal interpretation. We offer new causal estimands that describe how the causal treatment effect on the nonterminal event varies across strata of the population defined by how the treatment causally affects their survival for the terminal event, i.e., their “survivorship.” Further clarifications of the relationship to causal estimands for competing events such as total effects and controlled direct effects appear in Young et al. (2020); Stensrud et al. (2020); Stensrud and Dukes (2022); Mattei et al. (2024).
This paper adapts existing semicompeting risks models, anchoring them to a principal stratification framework for the purpose of drawing causal inferences. As a first contribution, we outline a framework for principal strata defined by a continuous time-to-event truncating variable, such as death time. Second, we motivate and define two new causal estimands for truncated time-to-event outcomes, which represent extensions over previously-considered snapshot effects. We offer practical guidance for interpreting time-varying estimates of these quantities and describe their relationship to different types of decision making about the intervention. Third, we describe a density factorization which is innovative for principal stratification problems and that allows for explicit links to (non-causal) semicompeting risks models. Lastly, a Bayesian estimation procedure is provided with accompanying software.
The work here has important commonalities with work in Xu et al. (2022), who propose a Bayesian nonparametric estimation procedure for a distinct set of estimands for semicompeting risks problems, the aforementioned work in Nevo and Gorfine (2022), who provide causal estimands based on an alternative “population stratification,” and work in Lyu et al. (2023), which builds on the same foundation as our framework.
2. A potential outcomes approach for semicompeting risks data
2.1. Notation
Consider the evaluation of a binary intervention $Z$ (0=control, 1=treated), where interest lies in its effect on the times to a nonterminal event, $T^R$, and a terminal event, $T^D$, the occurrence of which may leave $T^R$ ill defined. We continue with the motivating setting of late-stage cancer care, where $Z$ is an intervention intended to reduce hospital readmission among recently discharged patients, $T^R$ is the time to hospital readmission, and $T^D$ is the time to death. The occurrence of death leaves future readmission undefined. Using the potential outcomes framework, let $T_i^R(z)$ and $T_i^D(z)$ denote the potential event times for readmission and death for person $i$, respectively, that would occur if the person were treated with $Z = z$. One or both of these events may be right censored by the potential censoring time $C_i(z)$. If death occurs without readmission, we set $T_i^R(z)$ to be $\bar{\infty}$, a non-real value. The observed times are $Y_i^R = \min\{T_i^R(Z_i), T_i^D(Z_i), C_i(Z_i)\}$ and $Y_i^D = \min\{T_i^D(Z_i), C_i(Z_i)\}$, where $\min\{a, \bar{\infty}\}$ is defined to be $a$ for any real $a$. The nonterminal event indicator $\delta_i^R$ is one if the nonterminal event is observed to occur and zero otherwise. The analogous death event indicator is $\delta_i^D$. The set of covariates available at baseline, denoted by $X_i$, may consist of confounders, predictors of censoring, and measured baseline predictors of either event type. Together, the observed data for individual $i$ is $D_i = (Y_i^R, \delta_i^R, Y_i^D, \delta_i^D, Z_i, X_i)$.
2.1.1. Principal stratification for continuous time
A principal stratification is a partition of the population into subpopulations defined by joint values of the potential outcomes under all treatment conditions. Our basic principal strata are defined by the pair of potential death times $(T_i^D(0), T_i^D(1))$. Since potential outcomes are not affected by treatment assignment, stratifications based on the basic principal strata – and unions of these strata – exist prior to treatment assignment and can play a role similar to baseline covariates. While the basic principal strata describe a unit’s survival experience under both treatments across the entire time scale, it is useful to derive related quantities. For any $r > 0$, let $V_i(r)$ denote the time-varying principal state (Lin et al., 2008; Dai et al., 2012) implied by the basic principal strata:

$$V_i(r) = \begin{cases} AA & \text{if } T_i^D(0) > r \text{ and } T_i^D(1) > r \\ CK & \text{if } T_i^D(0) \leq r < T_i^D(1) \\ TK & \text{if } T_i^D(1) \leq r < T_i^D(0) \\ DD & \text{if } T_i^D(0) \leq r \text{ and } T_i^D(1) \leq r \end{cases}$$
The value of $V_i(r)$ represents a union of basic principal strata, depending on whether the individual is alive at $r$ in both arms ($AA$, for “always alive”), alive only under treatment ($CK$, for “control-killed”) or only under control ($TK$, for “treatment-killed”), or dead under both ($DD$, for “doubly dead”). In the context of hospital readmission for cancer patients, we may be interested in readmission differences among the “always-alive” at 30 days, i.e., those with $V_i(30) = AA$, as well as the net difference in 30-day survival probabilities, $P\{T^D(1) > 30\} - P\{T^D(0) > 30\}$.
The set of individuals with $V_i(r) = AA$ can also be viewed as a cohort with a well-defined and time-varying nonterminal event causal contrast function on the interval $(0, r]$. For various $r$, we can define survivorship cohorts, denoted by $\mathcal{A}(r)$:

$$\mathcal{A}(r) = \left\{ i : T_i^D(0) > r \text{ and } T_i^D(1) > r \right\} = \left\{ i : V_i(r) = AA \right\}$$
We note that $\mathcal{A}(r) \subseteq \mathcal{A}(r')$ for $r \geq r'$. In the context of hospital readmission, $\mathcal{A}(90)$ refers to the cohort of patients who would survive at least 90 days regardless of treatment assignment. Like the principal states, these principal strata are defined solely in terms of potential terminal event times. Within a cohort $\mathcal{A}(r)$, there can be no treatment effect on survival during the interval $(0, r]$; this fact ensures the time at risk for the nonterminal event is the same under both treatment and control conditions, allowing us to separate the intervention’s effect on the nonterminal event from the effect on survivorship. Time-varying estimands defined across different survivorship cohorts can inform downstream decision making about whether the intervention impacts different types of patients differently, as we will discuss in Section 2.3.
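To make the principal-state construction concrete, the following minimal sketch (in Python; the function and variable names are ours, not from the accompanying software) classifies individuals into the four states at a time $r$ given a pair of potential death times. In practice these times are never jointly observed, so the function would be applied to imputed draws as in Section 3.3.

```python
import numpy as np

def principal_state(t_death_0, t_death_1, r):
    """Classify units into principal states at time r from potential death
    times under control (t_death_0) and treatment (t_death_1).
    Returns an array of labels: AA, CK, TK, or DD."""
    t0 = np.asarray(t_death_0, dtype=float)
    t1 = np.asarray(t_death_1, dtype=float)
    state = np.empty(t0.shape, dtype=object)
    state[(t0 > r) & (t1 > r)] = "AA"    # always-alive at r
    state[(t0 <= r) & (t1 > r)] = "CK"   # control-killed: alive only under treatment
    state[(t0 > r) & (t1 <= r)] = "TK"   # treatment-killed: alive only under control
    state[(t0 <= r) & (t1 <= r)] = "DD"  # doubly dead
    return state

# Example: the survivorship cohort A(90) is the set with state "AA" at r = 90
t0 = np.array([120.0, 45.0, 200.0])
t1 = np.array([100.0, 95.0, 30.0])
print(principal_state(t0, t1, r=90))  # ['AA' 'CK' 'TK']
```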
2.2. Causal estimands for semicompeting risks
2.2.1. The time-varying survivor average causal effect (TV-SACE)
On the cumulative incidence scale, the existing “snapshot” survivor average causal effect is
$$\text{SACE}(t) = P\left\{T^R(1) < t \mid V(t) = AA\right\} - P\left\{T^R(0) < t \mid V(t) = AA\right\} \tag{1}$$
As previously discussed, snapshot estimands do not describe time-varying effects for any well-defined population. When they are estimated at a single time point, as is typically done, it is also unclear how sensitive conclusions are to the choice of $t$. To address these limitations, we define a new quantity, the time-varying survivor average causal effect (TV-SACE). This estimand is a function taking two arguments $r$ and $t \leq r$, and it conveys the difference in the cumulative incidence of nonterminal events by time $t$ among the group that survives past $r$ regardless of assigned treatment:
$$\text{TV-SACE}(r, t) = P\left\{T^R(1) < t \mid V(r) = AA\right\} - P\left\{T^R(0) < t \mid V(r) = AA\right\}, \quad t \leq r \tag{2}$$
The TV-SACE captures the causal effect of $Z$ on the nonterminal event that has manifested by time $t$, among the always-survivors at $r$. For example, hospitals may be interested in comparing 30-day ($t = 30$) and 90-day ($t = 90$) readmission rates among the cancer patients always-surviving at least 90 days post-discharge ($r = 90$). When $r = t$, the $\text{TV-SACE}(r, t)$ of Equation 2 coincides with the $\text{SACE}(t)$ as defined in Equation 1.
Joint indexing of the TV-SACE by both $r$ and $t$ is essential for characterizing causal effects. For a fixed $r$, the function $\text{TV-SACE}(r, t)$ (as a function of $t$) is a time-varying causal effect within the $\mathcal{A}(r)$ cohort. It describes the accumulation of benefit causally attributable to treatment among the well-defined – if latent – survivorship cohort. The shapes of these curves for cohorts defined by different $r$ describe possible treatment effect heterogeneity on the nonterminal outcome across subpopulations with different underlying risks of death, and may reveal, for example, whether treatment effects steadily accrue or decay with time differently across the survivorship cohorts.
Importantly, viewing $\text{TV-SACE}(r, t)$ as a function of $r$ does not characterize a time-varying causal effect in a static population, but a function of snapshot effects, each defined within a different cohort. In particular, the shape of the function in $r$ captures the sensitivity of causal effect estimation – in both sign and magnitude – to the moment in time used to define the always-survivorship group. Implications for decision makers are addressed in Section 2.3.
2.2.2. The restricted mean survivor average causal effect (RM-SACE)
Another causal estimand is a variation of the restricted mean survival time (RMST) and captures the length of the delay in the nonterminal event among always-survivors. This effect may be particularly relevant if the nonterminal event represents a permanent state change, such as the onset of irreversible dementia. We define the restricted mean survivor average causal effect (RM-SACE) as

$$\text{RM-SACE}(r, t) = E\left[\min\{T^R(1), t\} \mid V(r) = AA\right] - E\left[\min\{T^R(0), t\} \mid V(r) = AA\right], \quad t \leq r$$
In the context of preventing hospital readmission, the $\text{RM-SACE}(r, t)$ captures how much expected hospitalization-free time the treatment causes one to accumulate by time $t$, defined among the always-survivors $\mathcal{A}(r)$. Within the cohort, $\text{RM-SACE}(r, t)$ as a function of $t$ describes the timing of benefit accrual. If, within $\mathcal{A}(r)$, the effect of treatment on the nonterminal events arises solely by delaying early events, the benefit accrues quickly and eventually levels off as $t$ increases. If different survivorship cohorts have very different curves, then the effect on the nonterminal event is heterogeneous with respect to the underlying risk of death.
Just as with the TV-SACE, the function $\text{RM-SACE}(r, r)$ conveys the sensitivity of the snapshot version to the choice of $r$. If $\text{RM-SACE}(r, r)$ increases steadily in $r$, the choice of $r$ matters greatly, and reporting the effect for only a single time point understates the total impact of the treatment on delaying the nonterminal event. On the other hand, if $\text{RM-SACE}(r, r)$ levels off at some $r^*$, then any benefits attributable to treatment can be fully captured by an estimate at $r^*$.
2.3. Using principal stratum estimands for decision making
Nonterminal and terminal events meaningfully characterize the patient experience, but decision makers rarely weigh them equally. Typically both event types are undesirable, but improvement in the nonterminal event rate that comes with an increase in terminal events is unacceptable. When estimates of the principal stratum causal effects in Sections 2.2.1 and 2.2.2 agree qualitatively with estimates of the causal effect on terminal event times (e.g., the treatment is estimated to prolong survival and delay readmission for all or most survival cohorts), decision making is straightforward. When these disagree, the relevance of the principal stratum effects depends on the probability of belonging to $\mathcal{A}(r)$ (i.e., $P\{V(r) = AA\}$) for a population, or for subpopulations defined by $X$. We discuss considerations for three types of decisions: (1) policymakers evaluating whether an intervention should be made available to future patients and, if so, which ones (“access decisions”), (2) decision makers choosing whether to pursue an intervention for a specific individual (“personal decisions”), and (3) regulators selecting metrics to monitor the quality of an intervention (“monitoring decisions”).
Access decisions prescribe an intervention to populations where it is judged favorably and prohibit it where the risk outweighs the demonstrated benefit. For example, a beneficial principal stratum effect involving hospital readmission (e.g., a TV-SACE indicating reduced readmission incidence) may justify access to an intervention despite a null effect or modestly negative effect on mortality. More likely, the intervention should be targeted to subgroups defined by observable baseline covariates $X$ for which the benefit is large. Accordingly, researchers that wish to inform access decisions should report $\text{TV-SACE}(r, t)$ or $\text{RM-SACE}(r, t)$ across a wide range of $(r, t)$ to facilitate synthesis across studies.
Personal decisions mimic access decisions focused on a narrow subpopulation matching the patient of interest. If the effect is highly dependent on $X$, conditional effects like $\text{TV-SACE}(r, t \mid X = x)$ should be explored and reported in order to support these decisions. Additionally, individuals are sometimes expected to know more about their underlying latent health than is adequately captured by $X$. If evidence was initially generated in a randomized setting, exceptionally healthy patients may infer that they are more likely to belong in $\mathcal{A}(r)$, increasing the relevance of the TV-SACE and RM-SACE.
Lastly, quality monitoring assesses whether an intervention continues to offer new patients a sufficiently favorable risk-benefit profile. Here, a regulator may decide to add SACE metrics to complement existing terminal event metrics such as 30-day mortality, or may use the estimated effect surfaces over $(r, t)$ to decide if there is risk that standard nonterminal event quality of care metrics create perverse incentives (e.g., if variation in readmission tends to come at the cost of mortality). If SACE metrics are to be implemented, regulators must also choose an $(r, t)$ pair such that $\text{TV-SACE}(r, t)$ or $\text{RM-SACE}(r, t)$ reliably summarizes the intervention’s impact on the nonterminal event. Metrics will be most relevant for populations where $P\{V(r) = AA\}$ is high, since this is the subpopulation for which the intervention has demonstrated value for the nonterminal event. For a candidate $(r, t)$, the TV-SACE and RM-SACE capture the benefit most completely when the corresponding curves are fairly stable as $r$ and $t$ increase.
2.4. Structural assumptions
We now review a set of assumptions essential to our estimation strategy for the TV-SACE and RM-SACE. For clarity, our exposition focuses on non-recurrent nonterminal events, where any individual who experiences the nonterminal event is no longer at risk for that event. For nonterminal events which are nonpermanent and that in principle could recur – like a second hospital readmission – the proposed framework may still be relevant with careful definition of the nonterminal event (e.g., time to first readmission).
Assumption 1
Consistency of potential outcomes.
Consistency is a standard assumption throughout the causal inference literature which connects the observables $Y_i^R$ and $Y_i^D$ to their corresponding potential outcomes. Briefly, the treatment is well-defined such that there are no hidden variations within treatment level (Rubin, 1990).
Assumption 2
Conditional exchangeability (no unmeasured confounding).
The observed treatment assignment $Z_i$ does not depend on the potential outcomes after accounting for the set of measured covariates $X_i$.
In a randomized trial, this assumption holds by design since treatment assignment is independent of all measured and unmeasured variables. For observational settings, interpreting effect estimates as causal effects requires a sufficiently comprehensive $X_i$.
Assumption 3
Shared, non-informative censoring of event times.
The potential censoring times are shared across treatment arms (i.e., $C_i(0) = C_i(1) = C_i$). Furthermore, the vector of potential censoring times is conditionally independent of all potential event times given $X_i$.
Non-informative censoring is required for the consistent estimation of cumulative distribution functions. With administrative censoring, this assumption is satisfied by design.
2.5. Connection to traditional semicompeting risks models
We state a key simplifying assumption that builds a bridge to the semicompeting risks literature. With closely related nonterminal and terminal event processes, it is unrealistic to assume that any measured baseline set $X$ will contain all sources of dependence between the potential event times in $(T^R(0), T^D(0), T^R(1), T^D(1))$. However, if the cause of the dependence is baseline heterogeneity in the patient population, it may be reasonable to assume that baseline factors can be summarized by a one-dimensional subject-specific latent trait $\gamma_i$. As with any random effect, $\gamma_i$ cannot adjust for unmeasured confounding. However, $\gamma_i$ can be used to model sources of dependence in event times across treatment arms which are independent of the treatment assignment mechanism (i.e., unmeasured predictors).
Assumption 4
Independence of potential outcomes conditional on covariates and latent frailty.
The pairs of potential nonterminal and terminal event times under each treatment are independent conditional on $X_i$ and an individual-level latent trait $\gamma_i$.
Assumption 4 suggests a factorization of the joint density of the four potential outcomes $T^R(0)$, $T^D(0)$, $T^R(1)$, and $T^D(1)$ that is unusual within the principal stratification literature. Traditional model-based principal stratification approaches build a model for stratum membership given covariates (the “S-model”), and a model for the joint distribution of the potential outcomes conditional on the principal strata and covariates (the “Y-model”) (Schwartz et al., 2011). Instead, we choose an alternative factorization, shown in Equation 3, which further simplifies to Equation 4 under Assumption 4.
$$p\left(T^R(1), T^D(1), T^R(0), T^D(0) \mid X\right) = \int p\left(T^R(1), T^D(1), T^R(0), T^D(0) \mid X, \gamma\right) p(\gamma)\, d\gamma \tag{3}$$
$$p\left(T^R(1), T^D(1), T^R(0), T^D(0) \mid X\right) = \int p\left(T^R(1), T^D(1) \mid X, \gamma\right) p\left(T^R(0), T^D(0) \mid X, \gamma\right) p(\gamma)\, d\gamma \tag{4}$$
This arrangement makes it easy to enforce that $T^D(z)$ must exceed $T^R(z)$ whenever the nonterminal event occurs (i.e., whenever $T^R(z) \neq \bar{\infty}$). We can also leverage existing illness-death transition models from the semicompeting risks literature to obtain a general form of the likelihood.
2.6. Likelihood
Within a single treatment condition, the semicompeting risks structure of the potential outcomes $T^R(z)$ and $T^D(z)$ can be seen as an illness-death transition model characterizing transitions among the event-free (“healthy”), nonterminal only (“ill”), and post-terminal (“dead”) states. Hazards can be defined for the three types of event transitions: (1) healthy-ill, (2) healthy-dead, and (3) ill-dead.
The treatment arm-specific hazards conditional on covariates and frailty are denoted $h_1^{(z)}(t \mid X, \gamma; \theta)$, $h_2^{(z)}(t \mid X, \gamma; \theta)$, and $h_3^{(z)}(t \mid t^R, X, \gamma; \theta)$, where $\theta$ is a vector of unknown parameters. With cumulative hazards $H_j^{(z)}$, the observed data likelihood conditional on $\gamma_i$ is given by
$$L_i(\theta \mid \gamma_i) = \left\{ h_1^{(Z_i)}(Y_i^R) \right\}^{\delta_i^R} \left\{ h_2^{(Z_i)}(Y_i^R) \right\}^{(1 - \delta_i^R)\delta_i^D} \left\{ h_3^{(Z_i)}(Y_i^D \mid Y_i^R) \right\}^{\delta_i^R \delta_i^D} \exp\left\{ -H_1^{(Z_i)}(Y_i^R) - H_2^{(Z_i)}(Y_i^R) - \delta_i^R H_3^{(Z_i)}(Y_i^D \mid Y_i^R) \right\} \tag{5}$$

where dependence of the hazards on $X_i$, $\gamma_i$, and $\theta$ is suppressed for brevity.
If the frailties are included as unknown parameters in an expanded parameter set $(\theta, \gamma_1, \ldots, \gamma_n)$, the dimension of the parameter space is large and grows linearly with $n$, rendering estimation impracticable for large data sets. For computational efficiency and scalability, we use the marginalized likelihood $L_i(\theta) = \int L_i(\theta \mid \gamma)\, p(\gamma)\, d\gamma$ rather than the conditional likelihood in our estimation algorithm. For selected choices of the frailty distribution $p(\gamma)$, the form of $L_i(\theta)$ can be obtained analytically, but numerical integration within the Markov chain Monte Carlo (MCMC) computational algorithm can be used to accommodate arbitrary $p(\gamma)$. Computationally feasible estimation strategies are the focus of the next section.
3. Bayesian model-based estimation of causal effects
3.1. Identifiability in the Bayesian framework
We propose a Bayesian approach anchored to illness-death models for state transitions. Note that the likelihood in Section 2.6 does not support point identifiability of the principal stratum causal effects, a problem which also arises with the more traditional (i.e., snapshot) SACE (Long and Hudgens, 2013). This motivates our use of a Bayesian estimation procedure. In addition to the ability to handle large amounts of missing data (including unobserved potential outcomes) in much the same way as unknown parameters, the Bayesian procedure with proper prior distributions will yield proper posterior inference, even in the face of flat portions of the likelihood. In these instances, some of the unknown parameters in $\theta$ are only “partially identified”: even with infinite amounts of data, the posterior distribution converges to a non-degenerate distribution over a range of possible values that is smaller than that specified in the prior, but not equal to a single point (Gustafson, 2010).
3.2. Implementation with parametric illness-death models
In this paper we focus on hazards parameterized using Weibull regression models for each of the six possible transitions. Although alternative specifications are possible, we elect to use a semi-Markov model for the terminal event after the occurrence of the nonterminal event (i.e., for $t > T^R$, the terminal event hazard at $t$ depends on $T^R$ only through the sojourn time $t - T^R$) (Lee et al., 2015), and we require a non-negative correlation among transition hazards. For $z \in \{0, 1\}$ and $j \in \{1, 2, 3\}$, the Weibull shape for transition $j$ under $Z = z$ is denoted $\alpha_j^{(z)}$, and the baseline hazard rate is $\kappa_j^{(z)}$, giving hazard equations:

$$h_1^{(z)}(t \mid x_i, \gamma_i) = \gamma_i\, \kappa_1^{(z)} \alpha_1^{(z)} t^{\alpha_1^{(z)} - 1} \exp\{x_i^\top \beta_1^{(z)}\}$$
$$h_2^{(z)}(t \mid x_i, \gamma_i) = \gamma_i\, \kappa_2^{(z)} \alpha_2^{(z)} t^{\alpha_2^{(z)} - 1} \exp\{x_i^\top \beta_2^{(z)}\}$$
$$h_3^{(z)}(t \mid t^R, x_i, \gamma_i) = \gamma_i\, \kappa_3^{(z)} \alpha_3^{(z)} (t - t^R)^{\alpha_3^{(z)} - 1} \exp\{x_i^\top \beta_3^{(z)}\}, \quad t > t^R$$
The complete parameter vector for the above model specification is $\theta = \left(\kappa_j^{(z)}, \alpha_j^{(z)}, \beta_j^{(z)}, \sigma^2\right)$ for $j \in \{1, 2, 3\}$ and $z \in \{0, 1\}$. For computational convenience we suppose that the independent subject-specific frailties arise from a gamma distribution constrained to have a mean of 1 with unknown variance $\sigma^2$, i.e., $\gamma_i \sim \text{Gamma}(1/\sigma^2, 1/\sigma^2)$. This parametric assumption allows the marginal likelihood to be computed analytically, regardless of the specific models used for the baseline hazards. Equation 6 gives the likelihood marginalizing over independent gamma-distributed frailties
$$L_i(\theta) = \left\{ h_1^{(Z_i)}(Y_i^R) \right\}^{\delta_i^R} \left\{ h_2^{(Z_i)}(Y_i^R) \right\}^{(1 - \delta_i^R)\delta_i^D} \left\{ h_3^{(Z_i)}(Y_i^D - Y_i^R) \right\}^{\delta_i^R \delta_i^D} (1 + \sigma^2)^{\delta_i^R \delta_i^D} \left( 1 + \sigma^2 A_i \right)^{-(1/\sigma^2 + \delta_i^R + \delta_i^D)} \tag{6}$$
where $h_j^{(z)}$ is the reference level ($\gamma_i = 1$) transition hazard for $j \in \{1, 2, 3\}$ and $A_i = H_1^{(Z_i)}(Y_i^R) + H_2^{(Z_i)}(Y_i^R) + \delta_i^R H_3^{(Z_i)}(Y_i^D - Y_i^R)$ is the corresponding total cumulative hazard. Details of this marginalization can be found in Appendix B.
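As a concrete illustration, the minimal sketch below (in Python, assuming the Weibull proportional hazards parameterization of Section 3.2; the function names and argument layout are ours, not from the accompanying software) evaluates one observation’s marginalized log-likelihood contribution in the form of Equation 6.

```python
import numpy as np

def weibull_log_haz(t, alpha, kappa, lp):
    # log h(t) = log(kappa * alpha) + (alpha - 1) * log(t) + x'beta
    return np.log(kappa * alpha) + (alpha - 1.0) * np.log(t) + lp

def weibull_cum_haz(t, alpha, kappa, lp):
    # H(t) = kappa * t^alpha * exp(x'beta)
    return kappa * t**alpha * np.exp(lp)

def marginal_loglik_i(yR, dR, yD, dD, pars, x, sigma2):
    """Gamma-frailty-marginalized log-likelihood contribution (Eq. 6) for one
    subject in one treatment arm. pars[j] = (alpha_j, kappa_j, beta_j) for
    transitions j = 1 (healthy-ill), 2 (healthy-dead), 3 (ill-dead, which is
    semi-Markov in the sojourn time yD - yR)."""
    (a1, k1, b1), (a2, k2, b2), (a3, k3, b3) = pars
    lp1, lp2, lp3 = x @ b1, x @ b2, x @ b3
    # total reference-level cumulative hazard A_i
    A = weibull_cum_haz(yR, a1, k1, lp1) + weibull_cum_haz(yR, a2, k2, lp2)
    ll = 0.0
    if dR:
        ll += weibull_log_haz(yR, a1, k1, lp1)
        A += weibull_cum_haz(yD - yR, a3, k3, lp3)
        if dD:
            ll += weibull_log_haz(yD - yR, a3, k3, lp3) + np.log1p(sigma2)
    elif dD:
        ll += weibull_log_haz(yD, a2, k2, lp2)
    ll -= (1.0 / sigma2 + dR + dD) * np.log1p(sigma2 * A)
    return ll

# Example with made-up parameter values for a subject observed to have
# both events (readmission at day 20, death at day 35)
x = np.array([0.5, -1.0])
pars = [(1.1, 0.01, np.array([0.2, -0.1]))] * 3
print(marginal_loglik_i(yR=20.0, dR=1, yD=35.0, dD=1, pars=pars, x=x, sigma2=1.3))
```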
As with any Bayesian procedure, prior distributions must be placed on all unknown parameters. For the frailty variance $\sigma^2$, we suggest eliciting weakly informative priors from subject matter experts since we encountered convergence problems with dispersed starting values and vague priors (e.g., the prior on the precision $1/\sigma^2$ suggested by Lee et al. (2015)). Priors for other components of $\theta$ are intended to be weakly informative; details can be found in Appendix A. Analyses can be performed with different prior distributions to gauge sensitivity of substantive conclusions to the choice of prior.
3.3. Estimation algorithm
The estimation procedure for the causal quantities can be summarized in four steps: (1) estimating regression model parameters $\theta$ using MCMC, (2) sampling latent frailties $\gamma_i$ conditional on the posterior of $\theta$, (3) imputing missing factual and counterfactual outcomes conditional on the posterior of $(\theta, \gamma_i)$, and (4) using imputed potential outcomes to calculate causal estimands of interest.
We obtain posterior samples of $\theta$ with a modified Hamiltonian Monte Carlo No-U-Turn Sampler (NUTS) using the marginalized form of the likelihood in Stan, a programming language for Bayesian modeling, inference, and posterior analysis. Suppose there are $B$ post-warmup MCMC parameter samples $\theta^{(1)}, \ldots, \theta^{(B)}$. The closed form of the full conditional distribution of $\gamma_i$ is a gamma distribution (see Appendix D), which facilitates sampling from the posterior for $\gamma_i$. Using these posterior samples of $(\theta, \gamma_i)$, we can draw from the posterior predictive distribution of the full set of potential outcomes $(T_i^R(0), T_i^D(0), T_i^R(1), T_i^D(1))$. Full details of the imputation procedure can be found in Appendix D.
The final step of finite sample causal inference is straightforward once all potential outcomes have been either directly observed or imputed. For a sequence of time points $r_1 < r_2 < \cdots < r_K$, with $K$ dictated by the scientific question, the principal state vector is a deterministic function of $T_i^D(0)$ and $T_i^D(1)$. For MCMC iteration $b$, denote the principal state for person $i$ at time point $r$ by $V_i^{(b)}(r)$. Given $b$ and $r$, let $n_{AA}^{(b)}(r)$ be the number in the always-alive state at $r$ (i.e., with $V_i^{(b)}(r) = AA$). For any $t \leq r$, a posterior draw of the sample time-varying survivor average causal effect is given by

$$\text{TV-SACE}^{(b)}(r, t) = \frac{1}{n_{AA}^{(b)}(r)} \sum_{i\,:\,V_i^{(b)}(r) = AA} \left[ \mathbb{1}\left\{ T_i^{R(b)}(1) < t \right\} - \mathbb{1}\left\{ T_i^{R(b)}(0) < t \right\} \right]$$
Similarly, the $b$th posterior draw of the sample restricted mean survivor average causal effect is

$$\text{RM-SACE}^{(b)}(r, t) = \frac{1}{n_{AA}^{(b)}(r)} \sum_{i\,:\,V_i^{(b)}(r) = AA} \left[ \min\left\{ T_i^{R(b)}(1), t \right\} - \min\left\{ T_i^{R(b)}(0), t \right\} \right]$$
As with any posterior sample, the draws of $\text{TV-SACE}^{(b)}(r, t)$ or $\text{RM-SACE}^{(b)}(r, t)$ can be summarized using means, medians, or quantile-based credible intervals for each $(r, t)$ pair.
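The aggregation step is simple once each MCMC iteration carries a complete set of potential outcomes. A minimal sketch for one posterior draw (names are ours; $\bar{\infty}$ is represented by `np.inf`, for which the indicator and restricted-minimum operations below behave correctly):

```python
import numpy as np

def sace_draws(tR0, tR1, tD0, tD1, r, t):
    """One posterior draw of the sample TV-SACE(r, t) and RM-SACE(r, t) from
    complete (observed or imputed) potential outcomes. np.inf encodes
    readmission times for units that die before any readmission."""
    aa = (tD0 > r) & (tD1 > r)      # indicator of V_i(r) = AA
    n_aa = int(aa.sum())
    if n_aa == 0:
        return np.nan, np.nan, 0
    tv = (tR1[aa] < t).mean() - (tR0[aa] < t).mean()
    rm = (np.minimum(tR1[aa], t) - np.minimum(tR0[aa], t)).mean()
    return tv, rm, n_aa
```

Looping this function over MCMC iterations $b = 1, \ldots, B$ and over a grid of $(r, t)$ pairs yields the posterior samples summarized in Section 4 and Section 5.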
4. Simulation study
Our simulation study investigates operating characteristics of the estimation strategy and evaluates performance of the method under two possible misspecifications of the underlying frailty. To align with time frames for existing quality metrics, we simulate data with administrative censoring at 90 days and consider the TV-SACE and RM-SACE estimators at Days 30, 60, and 90 (with $r = t$), in data sets of size similar to the pancreatic cancer and hospital readmission data application. The data generation processes (DGPs) are as follows: correct model specification (DGP1), parametric misspecification of the frailty distribution, where estimation assumes a gamma-distributed frailty while the true data generation process is log-normal (DGP2), and coefficient misspecification of the frailty distribution, where estimation incorrectly assumes that the frailty enters every hazard model in the same way (DGP3). For $z \in \{0, 1\}$ and $j \in \{1, 2, 3\}$, all three scenarios can be expressed as special cases of a data generation process with hazards

$$h_j^{(z)}(t \mid x_i, \gamma_i) = \gamma_i^{\phi_j^{(z)}}\, \kappa_j^{(z)} \alpha_j^{(z)} t^{\alpha_j^{(z)} - 1} \exp\{x_i^\top \beta_j^{(z)}\}$$

with $t$ replaced by the sojourn time $t - t^R$ for transition $j = 3$.
DGP1 and DGP2 differ only in the frailty distribution $p(\gamma)$; for both DGP1 and DGP2, $\phi_j^{(z)} = 1$ for all $j$ and $z$. With DGP3, the inclusion of $\phi_j^{(z)} \neq 1$ allows the importance of the frailty to vary by treatment and transition type; values of $\phi_j^{(z)} > 1$ mean that the frailty is more important for transition $j$ in treatment group $z$, while values of $\phi_j^{(z)} < 1$ mean the individual frailty is less important. Values for $\phi_j^{(z)}$ ranged from 0.81 to 1.23.
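A sketch of how one arm’s event times can be generated under this DGP, using the standard latent-failure-time construction for illness-death models and inverse-CDF sampling for the Weibull transitions (all names are ours; `phi` holds the frailty exponents):

```python
import numpy as np

def rweibull_ph(alpha, rate, rng):
    # inverse-CDF draw from S(t) = exp(-rate * t^alpha); rate may be an array
    u = rng.uniform(size=np.shape(rate))
    return (-np.log(u) / rate) ** (1.0 / alpha)

def simulate_arm(x, beta, alpha, kappa, gamma, phi, rng):
    """Simulate (T_R, T_D) for one arm of the illness-death DGP above.
    beta, alpha, kappa, phi each hold one entry per transition j = 1, 2, 3."""
    rate = [gamma ** phi[j] * kappa[j] * np.exp(x @ beta[j]) for j in range(3)]
    t1 = rweibull_ph(alpha[0], rate[0], rng)    # latent healthy-ill time
    t2 = rweibull_ph(alpha[1], rate[1], rng)    # latent healthy-dead time
    ill_first = t1 < t2
    tR = np.where(ill_first, t1, np.inf)        # inf encodes "no readmission"
    sojourn = rweibull_ph(alpha[2], rate[2], rng)
    tD = np.where(ill_first, t1 + sojourn, t2)  # semi-Markov death after illness
    return tR, tD

rng = np.random.default_rng(2023)
n = 1000
x = rng.normal(size=(n, 2))
gamma = rng.gamma(1 / 1.3, 1.3, size=n)         # mean-1 frailty, variance 1.3
beta = [np.array([0.2, -0.1])] * 3
tR, tD = simulate_arm(x, beta, alpha=[1.0, 1.1, 0.9],
                      kappa=[0.01, 0.005, 0.02], gamma=gamma,
                      phi=[1.0, 1.0, 1.0], rng=rng)
```

Applying the function twice with a shared frailty vector (and arm-specific parameters) yields all four potential outcomes per unit, after which administrative censoring at 90 days is imposed on the factual arm.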
Parameters used for data generation were selected based on a simplistic analysis of the readmission data in Section 5 to approximate a setting of end-of-life cancer care. Four independent binary baseline covariates were simulated, plus one continuous covariate. Regression coefficients $\beta_j^{(z)}$ were chosen to specify moderate hazard ratios ranging from 0.72 to 1.85. All frailty distributions were specified to have a variance of 1.3 to approximately match the readmission data. Appendix E provides more detail on these data generation processes, the process used to construct weakly data-driven priors for the baseline hazards and shapes (Appendix A), thresholds for adequate MCMC convergence, and sampler tuning parameters. Each replicate data set was analyzed with model fits using the final 2000 iterations of a 5000 iteration chain.
4.1. Simulation results
Table 1 shows bias, mean squared error (MSE), and 95% credible interval coverage estimates for the true population quantities. In the correctly specified Scenario 1, there is small bias and low MSE, particularly for the fraction always-alive and the TV-SACE. However, note the slight under-coverage for the fraction always-alive at all three times and the corresponding under-coverage of the true population estimates of the TV-SACE and RM-SACE.
Table 1.
Bias, Mean Squared Error (MSE), and 95% interval coverage for estimators of the population proportion always-alive, time-varying survivor average causal effect (TV-SACE), and restricted mean survivor average causal effect (RM-SACE) from 200 replicates for correct specification (Scenario 1), plus two incorrectly specified frailty scenarios (Scenarios 2 and 3).
| Scenario | Quantity | Day ($r = t$) | Bias | MSE | Coverage |
|---|---|---|---|---|---|
| 1 | Fraction always-alive | 30 | 0.002 | 3.917e-05 | 0.845 |
| | | 60 | 0.001 | 3.956e-05 | 0.900 |
| | | 90 | 0.001 | 5.163e-05 | 0.855 |
| | TV-SACE | 30 | 0.001 | 9.313e-05 | 0.935 |
| | | 60 | 0.002 | 1.707e-04 | 0.915 |
| | | 90 | 0.000 | 2.142e-04 | 0.915 |
| | RM-SACE | 30 | −0.016 | 3.310e-02 | 0.930 |
| | | 60 | −0.079 | 2.327e-01 | 0.945 |
| | | 90 | −0.060 | 6.976e-01 | 0.935 |
| 2 | Fraction always-alive | 30 | −0.006 | 8.058e-05 | 0.680 |
| | | 60 | −0.005 | 7.189e-05 | 0.745 |
| | | 90 | −0.002 | 6.038e-05 | 0.845 |
| | TV-SACE | 30 | 0.000 | 1.145e-04 | 0.910 |
| | | 60 | 0.003 | 1.917e-04 | 0.895 |
| | | 90 | 0.004 | 2.327e-04 | 0.940 |
| | RM-SACE | 30 | 0.094 | 5.732e-02 | 0.830 |
| | | 60 | −0.024 | 2.857e-01 | 0.920 |
| | | 90 | −0.246 | 8.786e-01 | 0.930 |
| 3 | Fraction always-alive | 30 | 0.003 | 4.811e-05 | 0.805 |
| | | 60 | 0.003 | 4.940e-05 | 0.810 |
| | | 90 | 0.004 | 6.904e-05 | 0.755 |
| | TV-SACE | 30 | 0.002 | 9.410e-05 | 0.925 |
| | | 60 | 0.008 | 2.089e-04 | 0.885 |
| | | 90 | 0.013 | 3.744e-04 | 0.810 |
| | RM-SACE | 30 | −0.044 | 3.768e-02 | 0.895 |
| | | 60 | −0.412 | 3.829e-01 | 0.845 |
| | | 90 | −0.956 | 1.548e+00 | 0.775 |
Figure 1 provides 95% credible intervals for quantities from Scenario 1 defined at a single time point, depicted against the superpopulation truth (vertical gray lines) and finite sample (i.e., individual replicate data set) truths. We find roughly nominal coverage for the finite sample fraction always-alive intervals (subplot A); thus, we recommend interpreting posterior uncertainty for the fraction always-alive primarily as an in-sample quantity. Credible intervals for the survivor average causal effects show similar coverage levels for the superpopulation and finite-sample causal TV-SACE and RM-SACE estimands (subplots B and C).
Figure 1.
Simulation Scenario 1: 95% credible intervals across 200 replicates for the fraction always-alive (A), the TV-SACE (B), and the RM-SACE (C) at a fixed time point, compared to the replicates’ finite sample truth (points) and the population average (vertical gray lines).
Table 1 shows the expected deterioration in performance for the misspecified Scenarios 2 and 3, with both misspecified scenarios resulting in bias and lower coverage probabilities. The RM-SACE seems particularly sensitive to misspecification, but the TV-SACE has reasonably low MSE. In general, these misspecifications resulted in slight overestimates of the TV-SACE, moderate to substantial underestimates of the RM-SACE, and unpredictable directions of bias for the fraction always-alive.
5. Evaluation of supportive home care effects on mortality and hospital readmission among pancreatic cancer patients
5.1. Medicare Part A pancreatic cancer readmission data
We demonstrate our method in an analysis of hospital readmission using a data set of 12,091 newly diagnosed pancreatic cancer patients in the United States, adopting many of the principles involved in the emulation of target trials with observational data (Hernán and Robins, 2016) to support clear interpretation of the proposed causal estimands. The initial sample consisted of 17,685 Medicare Part A enrollees in California from 2000 to 2012 who were hospitalized and later discharged with a diagnosis of pancreatic cancer. We limited our analysis to the 12,091 patients who were healthy enough to be discharged to home (i.e., not hospice or a skilled nursing facility). The baseline was set to the date of discharge from the index hospitalization during which the cancer was diagnosed. Use of hospital readmission as a proxy for quality of care usually focuses on a short window after the index hospitalization. To focus on these short-term effects, administrative censoring was applied at 90 days. More information can be found in Lee et al. (2015).
The scientific question of interest is whether in-home supportive care leads to lower rates of hospital readmission than discharging to home without additional support. In the terminology of Section 2.3, a related “access decision” determines whether additional support should be available to patients whose doctors request it, a “personal decision” answers whether additional support safely reduces readmission for specific covariate profiles, and a “monitoring decision” assesses whether readmission and mortality metrics adequately summarize benefit and are therefore suitable for ongoing evaluation of home support efficacy. Of the 12,091 patients discharged to home, 3,140 (26%) were sent home with supportive care. A major concern was that patients discharged without care would be systematically healthier than those discharged with support, presenting a strong threat of confounding. To reduce dependence on model-based confounding adjustment, a logistic regression propensity score model for the receipt of home care was constructed using all available baseline covariates: non-White race, age, dichotomized Charlson-Deyo comorbidity score, admission route, and length of stay during index hospitalization. Estimated propensity scores were used to match (without replacement) 3,140 of the 8,951 patients discharged without care for comparison with those receiving supportive care (Ho et al., 2007), so that estimated effects correspond to average treatment effects on the treated (i.e., the supported). Appendix F depicts covariate balance checks after matching.
Hazard regression models included the same covariate set, with all covariates mean centered. Age and length of stay were scaled to have a standard deviation of 1 to facilitate specification of priors for the coefficients. As proposed in Section 3.2, we adopted Weibull transition hazards with a semi-Markov specification for the post-readmission hazard of death. The baseline hazard and Weibull shape parameters were allowed to freely vary across treatment arm and transition type, but adjustment coefficients were assumed to be the same across treatment arms (i.e., $\beta_j^{(0)} = \beta_j^{(1)}$ for $j \in \{1, 2, 3\}$). Prior distributions were specified as in Section 3.2. In particular, the prior for $\sigma^2$ is specified as Inverse-Gamma(21, 7.1), which is an informative prior chosen to ensure stable MCMC inference for the remaining model parameters.
Posterior draws of $\theta$ were obtained from 4 chains of 4,000 MCMC iterations each, with the first 3,000 iterations removed as warmup. Gelman-Rubin potential scale reduction factors and effective sample sizes were calculated for each parameter (Gelman et al., 2013; Carpenter et al., 2017). Using the procedure outlined in Section 3.3, posterior draws of the frailties and missing potential outcomes were obtained using the 4,000 post-warmup samples of $\theta$.
5.2. Readmission and mortality results
All Gelman-Rubin $\hat{R}$ values were below 1.01, indicating good mixing of the chains, and the minimum effective sample size across all parameters was 2,770.
Part A of Figure 2 shows the posterior mean survival curves for each treatment group and their implications for the posterior mean size of the always-survivor subpopulation. The fastest-declining survival curve, shown in green, is the in-sample average survival curve for the time to first potential death (i.e., $\min\{T_i^D(0), T_i^D(1)\}$); its “survival” probability at $r$ equals the fraction always-alive, $P\{V(r) = AA\}$. The other two curves in Part A show the mean for the counterfactual survival probabilities if everyone in the (matched) sample had been treated with extra care ($P\{T^D(1) > r\}$, dashed orange) or discharged home without extra care ($P\{T^D(0) > r\}$, dotted navy). Based on these covariate-adjusted survival curves, the treatment of receiving additional support at home leads to reduced lifespan across the 90 days, i.e., $P\{T^D(1) > r\} < P\{T^D(0) > r\}$ for $r \in (0, 90]$. For all curves, uncertainty increases with time because of the decreasing number of subjects used to estimate survival. Relative to the survival curves, there is additional uncertainty in the always-alive fraction estimates due to the joint uncertainty in $T_i^D(0)$ and $T_i^D(1)$.
Figure 2.
Posterior mean survival curves among newly diagnosed pancreatic cancer patients discharged home, with supportive care ($z = 1$) and without ($z = 0$), with the corresponding implications for always-alive principal stratum size (A) and posterior mean population composition of always-alive ($AA$), treatment-killed ($TK$), control-killed ($CK$), and doubly dead ($DD$) principal states (B)
Part B of Figure 2 shows the posterior mean proportion of the population in each principal state over time. For small $r$, nearly the entire population is in the $AA$ state because few deaths are observed or imputed under either treatment. With time, more deaths accumulate among patients discharged home with support, leading to a greater proportion of the population in the $TK$ state than the $CK$ state. The population fractions in $TK$ and $CK$ stay relatively constant after approximately day 45, suggesting that most patients who would die only under one of the discharge conditions will do so relatively early in the 90-day time frame. The overall effect is that depletion of the always-alive principal stratum occurs more during the early part of the 90-day window.
5.2.1. Population-level causal effects
Part A of Figure 3 shows posterior means for $\text{TV-SACE}(r, t)$ for five always-survivor cohorts $\mathcal{A}(r)$. In all cohorts, support leads to greater incidence of hospital readmission. In the first days after discharge from the index hospitalization, the healthier, longer-surviving cohorts like $\mathcal{A}(90)$ have treatment effects on readmission rates which are slightly larger than cohorts with less stringent survivorship requirements (i.e., smaller $r$). However, effects among the longer-surviving cohorts grow more similar over time. This may point to heterogeneity in the reasons requiring a readmission; that is, readmissions occurring in the first week or so after diagnosis may be caused by a different mixture of proximate causes than the admissions during the rest of the 90 days.
Figure 3.
Estimated time-varying (TV-SACE) and restricted mean (RM-SACE) survivor average causal effects of home care (vs. no additional care at home) on the cumulative incidence of hospital readmission among 6,280 newly diagnosed late-stage pancreatic cancer patients
Plot B of Figure 3 shows estimated curves of the snapshot effect $\text{TV-SACE}(r, r)$ across $r$, with each of the 1,000 lines derived from a representative posterior draw of the parameters and potential outcomes. The color of the lines at each $r$ gives the proportion of the study population in the always-alive state at $r$ according to that set of posterior predictive potential outcome samples. The shape suggests that, for the cumulative incidence scale, there is no natural time point for evaluating the causal effect of discharge support on hospital readmission because $\text{TV-SACE}(r, r)$ never completely levels off. However, the direction of the effect (i.e., higher cumulative incidence in the group discharged with care) is largely consistent over time.
Like the time-varying survivor average causal effect, the restricted mean effects also suggest that being discharged home with support increases readmissions. Part C of Figure 3 shows the within-cohort accumulation of readmission-free days attributable to being discharged with support. Because the accumulation is negative, this finding is consistent with faster and ultimately greater cumulative incidence of readmission among the treated (i.e., supported) group. In part due to the natural ceiling imposed by $t$ in the definition of the restricted mean, the estimated snapshot function $\text{RM-SACE}(r, r)$ in Part D of Figure 3 steadily grows in magnitude over the course of the 90 days.
5.2.2. Implications for individual-level decision making
The posterior distribution for the frailty variance $\sigma^2$ shows substantial remaining variability in prognosis that is not explained by the covariates included in the models, with a mean of 1.31 (95% CI: 1.12, 1.51). To put this estimate into perspective, $\sigma^2 = 1.31$ corresponds to patients in the 90th percentile of the latent frailty experiencing event hazards that are 41.9 times the hazards for comparable patients in the 10th percentile. Relative to the variation in prognoses explained by predictive covariates, large values for $\sigma^2$ pose additional difficulties for tailored decision making. Nevertheless, covariate-specific posterior predictions may be used to differentiate treatment recommendations.
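The hazard ratio implied by a frailty variance can be computed directly from gamma quantiles, since the frailty acts multiplicatively on every hazard. A quick check of this kind of calculation (a sketch using scipy, with the mean-one gamma parameterization of Section 3.2):

```python
from scipy.stats import gamma

sigma2 = 1.31                    # posterior mean frailty variance
shape = 1.0 / sigma2             # mean-one gamma: shape = rate = 1/sigma2
q10 = gamma.ppf(0.10, a=shape, scale=sigma2)  # scale = 1/rate = sigma2
q90 = gamma.ppf(0.90, a=shape, scale=sigma2)
print(q90 / q10)                 # ratio of 90th- to 10th-percentile hazards
```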
Table 2 gives examples of tailored prognoses for two individuals with selected covariate patterns. As expected, the posterior predictive state probabilities show that – for comparable levels of underlying frailness – a younger White woman is much more likely to be in the always-alive state at 90 days than an older non-White man with the same comorbidity score and duration of index hospitalization. However, the magnitude of this survival advantage varies greatly. For individuals in the 10th or 90th percentile of latent health (i.e., the 90th or 10th percentile for $\gamma$), the difference in the probability of being always-alive at Day 90 is approximately 0.08; for individuals of average frailty, the difference is more pronounced at 0.312 (0.420 vs. 0.108). We can also conclude that frail patients of either covariate pattern are unlikely to be in the always-alive state at 90 days. Together, these findings highlight the degree to which an individual can tailor decision making based on additional information.
Table 2.
Posterior predictive means for principal state probabilities and principal stratum causal effects for new patients of two covariate patterns
The columns $AA$, $CK$, and $TK$ give principal state probabilities at the given day¹; the final two columns give, if always-alive, the causal effect of being discharged to home with support (vs. without).

| Patient characteristics | Latent health² | Day | AA | CK | TK | Difference in readmission incidence by the given day | Additional readmission-free days accumulated by the given day |
|---|---|---|---|---|---|---|---|
| Nonwhite male aged 85, average comorbidity score and hospital length of stay | Frail | 30 | 0.180 | 0.150 | 0.362 | 0.037 | −0.296 |
| | | 90 | 0.003 | 0.020 | 0.115 | 0.023 | −3.622 |
| | Average | 30 | 0.550 | 0.126 | 0.263 | 0.041 | −0.449 |
| | | 90 | 0.108 | 0.124 | 0.353 | 0.079 | −5.439 |
| | Healthy | 30 | 0.972 | 0.009 | 0.019 | 0.006 | −0.065 |
| | | 90 | 0.895 | 0.032 | 0.071 | 0.023 | −0.912 |
| White female aged 65, average comorbidity score and hospital length of stay | Frail | 30 | 0.495 | 0.140 | 0.278 | 0.029 | −0.346 |
| | | 90 | 0.090 | 0.117 | 0.308 | 0.005 | −1.346 |
| | Average | 30 | 0.813 | 0.061 | 0.116 | 0.045 | −0.550 |
| | | 90 | 0.420 | 0.147 | 0.312 | 0.043 | −3.337 |
| | Healthy | 30 | 0.994 | 0.002 | 0.004 | 0.008 | −0.092 |
| | | 90 | 0.976 | 0.008 | 0.016 | 0.030 | −1.191 |
¹ Always-alive (AA), dead only under control (CK), and dead only under treatment (TK)
² Frail and healthy correspond to the 90th and 10th percentiles of γ, while average health corresponds to γ = 1
6. Conclusions
In this paper we have proposed a general approach to principal stratification where the strata are defined by potential times to a truncating event and there is administrative censoring. From a decision making perspective, this stratification is a natural one because it groups units according to their time horizon for comparing quality of care. To quantify those differences, we formulated two new causal estimands, the TV-SACE and the RM-SACE, for contrasting nonterminal time-to-event outcomes that are truncated by death. We then described a Bayesian model-based estimation procedure that builds upon existing strategies for semicompeting risks models. Our innovative factorization scheme facilitates connections to existing illness-death models, putting a sharper causal focus on this literature and clarifying how such models can be adapted to yield causally interpretable quantities.
The methods outlined here have several limitations that warrant discussion. First, in our implementation, the latent trait $\gamma_i$ is assumed to be gamma-distributed, and the effect of the frailty is constrained to be identical across all hazard types and treatment arms. Both assumptions were made largely for computational convenience because they allow parameter sampling using the marginal likelihood. The need to balance computational stability also required an informative prior for the frailty variance, with less informative prior specifications difficult to assess due to issues with MCMC convergence. Other parametric distributions could be assumed for the latent trait (e.g., log-normal), and shared frailty models have previously incorporated transition-specific coefficients for the log-frailty (Liu et al., 2004). These adaptations do not result in analytically tractable marginal likelihoods, although numerical integration can be used. In practice, we found MCMC performance using the unmarginalized likelihood to be inconsistent, slow, and prone to divergent NUTS transitions. Second, parametric Weibull hazard models may not be appropriate for all scenarios. More flexible baseline hazard specifications could be achieved with splines (Royston and Parmar, 2002) or Bayesian nonparametrics (Lee et al., 2015), although posterior prediction would become more difficult due to problems extrapolating beyond the observed time scale. The Bayesian nonparametric approach of Xu et al. (2022) sidesteps parametric assumptions but requires randomized treatment assignment. While our approach can be used in observational settings, confounding will only be controlled with correct specification of the propensity score or outcome models, which may be difficult with parametric restrictions.
Notwithstanding these limitations, this work offers a new, causally informed approach to the analysis of semicompeting risks data. Illness-death models pose two challenges for causal inference on the non-terminal outcome: (1) the use of hazard-based estimation strategies, which implicitly condition on the post-treatment outcome of survival, and (2) handling truncation when the treatments also affect survival. By formulating causal estimands using potential outcomes notation, we separate the model estimation from the choice of causal estimand. Our method also indirectly addresses decision makers’ need to balance nonmortality considerations with treatment impacts on survival; this is achieved by quantifying, for every time point, the relative size of the population for whom quality of care contrasts are relevant. While we consider the case of administrative censoring, other contexts with more complex censoring may motivate extensions of the TV-SACE and RM-SACE to additionally depend on potential censoring times. The use of posterior predictive sampling to estimate the effects allows for the innovative density factorization which connects to an existing semicompeting risks approach. Analogous factorizations may prove useful for truncated outcomes which are not time-to-event. Lastly, because we operate in the Bayesian framework, we properly account for uncertainty due to partial identifiability of the causal effects.
Future work on these and other principal stratification models for non-survival outcomes in high-mortality settings may be extended to incorporate utility functions within a formal decision-theoretic framework. These methods open up more possibilities for causally valid research on non-mortality outcomes related to quality of life or quality of care among high-mortality patient populations. In turn, this provides evidence that is more directly useful to the individuals and policymakers making decisions on the basis of multiple criteria.
Acknowledgements
Support for this work was provided by NIH grants T32CA009337 and T32ES007142 (LC), R01CA181360 (SH), and R01ES026217 and R01GM111339 (CZ), as well as EPA grant RD835872 (CZ). FM received support from Dipartimenti Eccellenti 2018–2022 ministerial funds. LC was also supported by a Rose Traveling Fellowship. We thank Alessandra Mattei for helpful discussion.
A. Additional details on the prior specification
For binary covariates and continuous variables rescaled to have unit variance, hazard ratios are unlikely to exceed 5; therefore we set the priors for the components of $\beta_j^{(z)}$ to be mean-zero normal distributions scaled so that hazard ratios greater than 5 are improbable, for $j \in \{1, 2, 3\}$ and $z \in \{0, 1\}$. With mean-centered covariates and $\gamma_i = 1$, the baseline hazard corresponds to the hazard experienced by those at the sample mean covariate values. Thus, reasonable priors for the log-baseline hazards are $\log \kappa_j \sim N\left(\log(d_j / R_j),\ 2.3^2\right)$, where $d_j$ is the number of observed events and $R_j$ is the total at-risk person-time for transition $j$, pooling across treatment arms. For exponential hazards, this prior asserts that the true hazard experienced at the sample mean covariate value has only ≈ 5% probability of being more than two orders of magnitude away from the crude (pooled) event rate. The data-driven prior specification for the log-baseline hazards makes the model invariant to the time scale of the data (i.e., days vs. years). (An alternative approach would be to rescale all times so that the mean event times were ≈ 1.) Lastly, the Weibull shape parameters $\alpha_j^{(z)}$ are given priors concentrated near 1 to express moderate belief that any changes in the hazards occur slowly rather than quickly decaying or exploding. Assuming variation independence of the different parameter blocks $\kappa$, $\alpha$, $\beta$, and $\sigma^2$, we can construct a joint prior as the product of the block-specific priors.
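A sketch of the data-driven prior construction for one transition’s log-baseline hazard (names are ours): two orders of magnitude correspond to $\log(100) \approx 4.6$, so a normal standard deviation of 2.3 leaves roughly 5% of prior mass outside that range.

```python
import numpy as np

def log_baseline_hazard_prior(n_events, person_time):
    """Normal prior for log(kappa_j): centered at the crude (pooled) event
    rate for transition j, with sd 2.3 so that ~95% of prior mass lies
    within two orders of magnitude of the crude rate."""
    crude_rate = n_events / person_time
    return {"mean": np.log(crude_rate), "sd": 2.3}

# Example: 150 observed transitions over 9,000 person-days at risk
print(log_baseline_hazard_prior(150, 9000.0))
```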
B. Marginalization of conditional likelihood over frailties
Let $h_j$ be the instantaneous hazards and $H_j$ be the cumulative hazards for transition $j \in \{1, 2, 3\}$ at the reference frailty level $\gamma_i = 1$, suppressing the treatment arm and covariate indexing for this proof. First, note that $h_j(t \mid \gamma_i) = \gamma_i h_j(t)$ for all $j$ and $z$ by definition, regardless of specific distributional assumptions. We have the following equivalence:

$$h_j(t \mid \gamma_i) = \gamma_i\, h_j(t) \iff H_j(t \mid \gamma_i) = \gamma_i\, H_j(t)$$

Then the conditional likelihood contribution of individual $i$ is

$$L_i(\theta \mid \gamma_i) = \left\{\gamma_i h_1(y_i^R)\right\}^{\delta_i^R} \left\{\gamma_i h_2(y_i^R)\right\}^{(1 - \delta_i^R)\delta_i^D} \left\{\gamma_i h_3(y_i^D - y_i^R)\right\}^{\delta_i^R \delta_i^D} \exp\{-\gamma_i A_i\} \tag{7}$$

with $A_i = H_1(y_i^R) + H_2(y_i^R) + \delta_i^R H_3(y_i^D - y_i^R)$, as in the main text. The marginal likelihood across all observations can be written as

$$L(\theta) = \prod_{i=1}^{n} \int_0^\infty L_i(\theta \mid \gamma)\, \frac{(1/\sigma^2)^{1/\sigma^2}}{\Gamma(1/\sigma^2)}\, \gamma^{1/\sigma^2 - 1} e^{-\gamma/\sigma^2}\, d\gamma = \prod_{i=1}^{n} \left\{h_1(y_i^R)\right\}^{\delta_i^R} \left\{h_2(y_i^R)\right\}^{(1 - \delta_i^R)\delta_i^D} \left\{h_3(y_i^D - y_i^R)\right\}^{\delta_i^R \delta_i^D} (1 + \sigma^2)^{\delta_i^R \delta_i^D} \left(1 + \sigma^2 A_i\right)^{-(1/\sigma^2 + \delta_i^R + \delta_i^D)} \tag{8}$$

This was first stated in Xu et al. (2010), but we provide a proof here.

First, we define some shorthand notation to suppress indexing that is unnecessary for this proof: let $\theta_\gamma = 1/\sigma^2$, $d = \delta_i^R + \delta_i^D$, and $A = A_i$,

and note a general property of the gamma function that for $b, c > 0$

$$\int_0^\infty u^{c-1} e^{-bu}\, du = \frac{\Gamma(c)}{b^c} \tag{9}$$

To see the final line of Equation 8, consider the 4 possible values taken on by the binary indicators $(\delta_i^R, \delta_i^D)$:

Case 1: $(\delta_i^R, \delta_i^D) = (0, 0)$

$$\int_0^\infty e^{-\gamma A}\, \frac{\theta_\gamma^{\theta_\gamma}}{\Gamma(\theta_\gamma)}\, \gamma^{\theta_\gamma - 1} e^{-\gamma \theta_\gamma}\, d\gamma = \frac{\theta_\gamma^{\theta_\gamma}}{(\theta_\gamma + A)^{\theta_\gamma}} = \left(1 + \sigma^2 A\right)^{-1/\sigma^2}$$

Cases 2 and 3: $(\delta_i^R, \delta_i^D) = (0, 1)$ and $(1, 0)$

With the single hazard factor $h$ equal to $h_2(y_i^D)$ or $h_1(y_i^R)$, respectively,

$$h \int_0^\infty \gamma\, e^{-\gamma A}\, \frac{\theta_\gamma^{\theta_\gamma}}{\Gamma(\theta_\gamma)}\, \gamma^{\theta_\gamma - 1} e^{-\gamma \theta_\gamma}\, d\gamma = h\, \frac{\theta_\gamma^{\theta_\gamma}\, \Gamma(\theta_\gamma + 1)}{\Gamma(\theta_\gamma)\, (\theta_\gamma + A)^{\theta_\gamma + 1}} = h \left(1 + \sigma^2 A\right)^{-(1/\sigma^2 + 1)}$$

where the rightmost equality is true by Equation 9 and the identity $\Gamma(\theta_\gamma + 1) = \theta_\gamma \Gamma(\theta_\gamma)$.

Case 4: $(\delta_i^R, \delta_i^D) = (1, 1)$

$$h_1(y_i^R)\, h_3(y_i^D - y_i^R) \int_0^\infty \gamma^2 e^{-\gamma A}\, \frac{\theta_\gamma^{\theta_\gamma}}{\Gamma(\theta_\gamma)}\, \gamma^{\theta_\gamma - 1} e^{-\gamma \theta_\gamma}\, d\gamma = h_1(y_i^R)\, h_3(y_i^D - y_i^R)\, (1 + \sigma^2) \left(1 + \sigma^2 A\right)^{-(1/\sigma^2 + 2)}$$

using $\Gamma(\theta_\gamma + 2) = (\theta_\gamma + 1)\theta_\gamma \Gamma(\theta_\gamma)$. This proves the marginal likelihood has the form stated in the main text:

$$L_i(\theta) = \left\{h_1(y_i^R)\right\}^{\delta_i^R} \left\{h_2(y_i^R)\right\}^{(1 - \delta_i^R)\delta_i^D} \left\{h_3(y_i^D - y_i^R)\right\}^{\delta_i^R \delta_i^D} (1 + \sigma^2)^{\delta_i^R \delta_i^D} \left(1 + \sigma^2 A_i\right)^{-(1/\sigma^2 + \delta_i^R + \delta_i^D)}$$
C. Log-likelihood contributions by observed data pattern
The log-likelihood marginalized over the frailties is

$$\ell(\theta) = \sum_{i=1}^{n} \left[ \delta_i^R \log h_1(y_i^R) + (1 - \delta_i^R)\delta_i^D \log h_2(y_i^R) + \delta_i^R \delta_i^D \left\{ \log h_3(y_i^D - y_i^R) + \log(1 + \sigma^2) \right\} - \left( \frac{1}{\sigma^2} + \delta_i^R + \delta_i^D \right) \log\left(1 + \sigma^2 A_i\right) \right]$$

with $A_i = H_1(y_i^R) + H_2(y_i^R) + \delta_i^R H_3(y_i^D - y_i^R)$. This is the likelihood that gets added to the target function within Stan.

The marginal likelihood in the main text corresponds to 4 types of marginal likelihood and log-likelihood contributions: (1) neither event occurring, (2) nonterminal occurrence only, (3) terminal occurrence only, and (4) both events occurring.

- Observe neither event ($\delta_i^R = 0$, $\delta_i^D = 0$, so $y_i^R = y_i^D = c_i$): $L_i = \left(1 + \sigma^2 A_i\right)^{-1/\sigma^2}$ with $A_i = H_1(y_i^R) + H_2(y_i^R)$
- Observe only nonterminal ($\delta_i^R = 1$, $\delta_i^D = 0$): $L_i = h_1(y_i^R)\left(1 + \sigma^2 A_i\right)^{-(1/\sigma^2 + 1)}$ with $A_i = H_1(y_i^R) + H_2(y_i^R) + H_3(y_i^D - y_i^R)$
- Observe only terminal ($\delta_i^R = 0$, $\delta_i^D = 1$, so $y_i^R = y_i^D$): $L_i = h_2(y_i^D)\left(1 + \sigma^2 A_i\right)^{-(1/\sigma^2 + 1)}$ with $A_i = H_1(y_i^D) + H_2(y_i^D)$
- Observed both events ($\delta_i^R = 1$, $\delta_i^D = 1$): $L_i = h_1(y_i^R)\, h_3(y_i^D - y_i^R)\, (1 + \sigma^2)\left(1 + \sigma^2 A_i\right)^{-(1/\sigma^2 + 2)}$ with $A_i = H_1(y_i^R) + H_2(y_i^R) + H_3(y_i^D - y_i^R)$
D. Frailty marginalization and posterior predictive imputation
D.1. Full conditional form of frailties
Omitting terms which do not depend on $\gamma_i$, the conditional likelihood as a function of $\gamma_i$ is

$$L_i(\theta \mid \gamma_i) \propto \gamma_i^{\delta_i^R + \delta_i^D} \exp\{-\gamma_i A_i\}$$

This demonstrates that the posterior distribution of $\gamma_i$, conditional on $\theta$, only depends on the data through $\delta_i^R + \delta_i^D$ and $A_i$. The only other place $\gamma_i$ appears in the posterior is in its gamma prior, which has kernel $\gamma_i^{1/\sigma^2 - 1} \exp\{-\gamma_i/\sigma^2\}$. Thus, the full conditional distribution for $\gamma_i$ is

$$p(\gamma_i \mid D_i, \theta) \propto \gamma_i^{1/\sigma^2 + \delta_i^R + \delta_i^D - 1} \exp\left\{-\gamma_i \left(\frac{1}{\sigma^2} + A_i\right)\right\}$$

which can be recognized as the kernel of a $\text{Gamma}(a_i, b_i)$ distribution with shape $a_i = 1/\sigma^2 + \delta_i^R + \delta_i^D$ and rate $b_i = 1/\sigma^2 + A_i$.
D.2. Frailty imputation
For each individual $i$ and MCMC iteration $b$, sample $\gamma_i^{(b)}$ as

$$\gamma_i^{(b)} \sim \text{Gamma}\left(\frac{1}{\sigma^{2(b)}} + \delta_i^R + \delta_i^D,\ \frac{1}{\sigma^{2(b)}} + A_i^{(b)}\right)$$
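A one-line draw from this full conditional (a sketch with our own names; note that numpy parameterizes the gamma by shape and scale rather than rate):

```python
import numpy as np

def sample_frailty(delta_R, delta_D, A, sigma2, rng):
    """Draw gamma_i from its gamma full conditional:
    shape = 1/sigma2 + delta_R + delta_D, rate = 1/sigma2 + A_i."""
    shape = 1.0 / sigma2 + delta_R + delta_D
    rate = 1.0 / sigma2 + A
    return rng.gamma(shape, 1.0 / rate)  # scale = 1 / rate

rng = np.random.default_rng(7)
print(sample_frailty(delta_R=1, delta_D=1, A=0.9, sigma2=1.3, rng=rng))
```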
D.3. Imputation of censored outcomes
Censoring is the cause of missing outcome data in the factual treatment arm. In the presence of censoring for individual $i$, there is only partial information on one or both of $(T_i^R(z_i), T_i^D(z_i))$. Given censoring time $c_i$ and draw $b$ of the posterior parameter and frailty vectors $(\theta^{(b)}, \gamma_i^{(b)})$, we can impute $T_i^{R(b)}(z_i)$ or $T_i^{D(b)}(z_i)$. The hazards specified in the main text lead to a simple imputation strategy based on Weibull random deviates. The resulting draws are compatible with the observed $(y_i^R, \delta_i^R, y_i^D, \delta_i^D)$. If individual $i$ was censored before the nonterminal event occurred, we impute the missing event times according to the following algorithm.
1. Impute a candidate nonterminal event time $\tilde{T}^R$ from a Weibull distribution with shape parameter $\alpha_1^{(z_i)}$ and rate parameter $\gamma_i^{(b)} \kappa_1^{(z_i)} \exp\{x_i^\top \beta_1^{(z_i)}\}$ (in the parameterization with survival function $S(t) = \exp\{-\lambda t^\alpha\}$) that is truncated to have no mass below $c_i$.
2. Impute a candidate death time $\tilde{T}^D$ from a Weibull distribution with shape parameter $\alpha_2^{(z_i)}$ and rate parameter $\gamma_i^{(b)} \kappa_2^{(z_i)} \exp\{x_i^\top \beta_2^{(z_i)}\}$ that is truncated to have no mass below $c_i$. If $\tilde{T}^D < \tilde{T}^R$, set $T_i^{D(b)}(z_i) = \tilde{T}^D$ and $T_i^{R(b)}(z_i) = \bar{\infty}$. This gives us a complete pair, and the imputation process concludes. Otherwise, set $T_i^{R(b)}(z_i) = \tilde{T}^R$ and continue to Step 3.
3. Impute a sojourn time $\tilde{S}$ from a Weibull distribution with shape parameter $\alpha_3^{(z_i)}$ and rate parameter $\gamma_i^{(b)} \kappa_3^{(z_i)} \exp\{x_i^\top \beta_3^{(z_i)}\}$.
4. Set the imputed death time to $T_i^{D(b)}(z_i) = T_i^{R(b)}(z_i) + \tilde{S}$.
For individuals censored after the nonterminal event, the procedure starts at Step 3 with the modification that the distribution of the sojourn time must be truncated to have no mass below the observed partial sojourn $c_i - y_i^R$. After imputation, each individual has a complete set of four potential outcomes for all MCMC iterations.
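The truncated Weibull draws above have a closed inverse-CDF form: conditional on $T > c$, $S(t)/S(c) = \exp\{-\lambda(t^\alpha - c^\alpha)\}$, so a uniform draw $U$ maps to $t = (c^\alpha - \log(U)/\lambda)^{1/\alpha}$. A sketch (our names):

```python
import numpy as np

def rweibull_ph_trunc(alpha, rate, lower, rng, size=1):
    """Draw from a Weibull with survival S(t) = exp(-rate * t^alpha),
    truncated to (lower, infinity), by inverting the conditional CDF."""
    u = rng.uniform(size=size)
    return (lower**alpha - np.log(u) / rate) ** (1.0 / alpha)

# Example: candidate readmission time for a subject censored at day 40
rng = np.random.default_rng(11)
print(rweibull_ph_trunc(alpha=1.1, rate=0.01, lower=40.0, rng=rng))
```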
D.4. Imputation of counterfactual potential outcomes
Missingness in the outcome pair $\left( T_{1i}(1 - z_i), T_{2i}(1 - z_i) \right)$ is due to the so-called fundamental problem of causal inference. From assuming independent gamma-distributed frailties as in the main text, we have that posterior draws of these counterfactual event times depend only on the frailty $\gamma_i^{(m)}$, the parameters $\theta_{1-z_i}^{(m)}$ governing the unassigned arm, and the covariates $x_i$.
Imputation of outcomes in the treatment arm counter to fact is actually simpler because there is no need to truncate so that the imputed values agree with the observed $(y_{1i}, y_{2i}, \delta_{1i}, \delta_{2i})$. Replace $\theta_{z_i}^{(m)}$ with $\theta_{1-z_i}^{(m)}$ and $c_i$ with 0, then follow the algorithm in Section D.3.
E. Additional details for simulation study
- Sample size $n$
- 4 independent Bernoulli covariates with success probability $p$
- One continuous covariate
- Mean-centering of design matrices
- Regression coefficients, baseline hazard rates, and Weibull shape parameters match the posterior medians for the first 5 covariates in a simplified version of the data application, where the covariate effects were assumed to be shared across treatment arms
- Frailty variance equal to 1.314846, the posterior median of $\sigma$ in the data application
Scenario 1: Correct specification
- $\gamma_i \sim \text{Gamma}(1/\sigma, 1/\sigma)$, with the frailty entering each transition hazard linearly (exponent 1) for all $j$ and $z$
Scenario 2: Log-normal frailty
A log-normal $\text{LN}(\mu, \tau^2)$ distribution has the following properties:
- Mean: $\exp(\mu + \tau^2/2)$
- Variance: $\left( \exp(\tau^2) - 1 \right) \exp(2\mu + \tau^2)$
For identifiability reasons, we constrain the mean of the distribution to be one (i.e., require $\mu = -\tau^2/2$), yielding the curved exponential family distribution $\text{LN}(-\tau^2/2, \tau^2)$. To have the variance of the log-normal frailties match the variance $\sigma$ in the other scenarios, we choose $\tau^2 = \log(1 + \sigma)$ for the $\sigma$ stated above.
- $\gamma_i \sim \text{LN}(-\tau^2/2, \tau^2)$, with frailty exponent 1 for all $j$ and $z$
Scenario 3: Variable frailty exponents
The final scenario retains a gamma distribution for the frailty, but the relevance of the frailty differs across transition models.
- $\gamma_i \sim \text{Gamma}(1/\sigma, 1/\sigma)$
- The exponents vary: the frailty enters the hazard for transition $j$ as $\gamma_i^{p_j}$, with the exponents $p_j$ differing across transitions.
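The three frailty-generation schemes can be checked by simulation. In the sketch below, the Scenario 2 moment-matching follows the derivation above, while the Scenario 3 exponents are placeholders rather than the values used in the simulation study:

```python
import numpy as np

rng = np.random.default_rng(2024)
sigma = 1.314846   # frailty variance shared across scenarios
n = 100_000

# Scenario 1: gamma frailties with mean 1 and variance sigma
g1 = rng.gamma(1.0 / sigma, sigma, size=n)

# Scenario 2: log-normal frailties; mu = -tau^2/2 forces mean 1 and
# tau^2 = log(1 + sigma) matches the variance, as derived above
tau2 = np.log1p(sigma)
g2 = rng.lognormal(mean=-tau2 / 2, sigma=np.sqrt(tau2), size=n)

# Scenario 3: gamma frailties entering transition j's hazard as gamma**p[j];
# these exponents are placeholders, not the values used in the paper
p = (1.0, 0.5, 2.0)
g3 = [g1 ** pj for pj in p]

print(g1.mean(), g1.var())  # ~ (1, sigma)
print(g2.mean(), g2.var())  # ~ (1, sigma)
```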
E.1. Prior specification
The mean of the frailty distribution must be constrained to 1 for identifiability purposes, and the frailties must be assumed to arise from a gamma distribution in order to arrive at the marginalized likelihood. These two criteria are met by distributions of the form $\text{Gamma}(1/\sigma, 1/\sigma)$. In all simulation scenarios, data were analyzed with an inverse-gamma prior for $\sigma$:
- $\sigma \sim \text{Inverse-Gamma}(a_\sigma, b_\sigma)$
The numerators $b_\sigma$ of the hyperparameters were chosen to approximately center the prior around the underlying $\sigma$; decreasing $a_\sigma$ leads to vaguer priors for $\sigma$ and offers a path to performing sensitivity analysis. Simulation results presented here arise from one such hyperparameter setting.
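As a quick check on the centering logic, note that an Inverse-Gamma$(a, b)$ distribution has mean $b/(a-1)$ for $a > 1$, so choosing the numerator $b = (a - 1)\sigma_0$ centers the prior at $\sigma_0$. The hyperparameter values in this sketch are illustrative only:

```python
from scipy import stats

sigma0 = 1.314846  # value around which the prior is centered

# Hypothetical hyperparameter choices: smaller a gives a vaguer prior
for a in (2.0, 3.0, 5.0):
    b = (a - 1.0) * sigma0
    prior = stats.invgamma(a, scale=b)
    print(a, b, prior.mean(), prior.std())  # mean is sigma0 in every case
```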
E.2. MCMC specification
Each replicate data set was analyzed with model fits that used the final 2000 iterations of a 5000-iteration chain.
Across 200 replicates in each of the three scenarios, none of the fits displayed signs of poor convergence, defined as having any of the following characteristics:
- Energy fraction of missing information (E-FMI) below 0.2 in any chain
- One or more post-warmup HMC divergences in any chain
- One or more post-warmup transitions reaching the maximum tree depth
- Effective post-warmup sample size at or below 500 for any monitored parameter
Monte Carlo standard errors were calculated following the jackknife-after-bootstrap formula from Koehler et al. (2009). Specifically, let $D_1, \ldots, D_R$ designate the simulated observed data sets, and let $\hat{\phi}$ be a Monte Carlo estimate of an operating characteristic like the 95% credible interval coverage for $\sigma$. The estimated Monte Carlo standard error for the operating characteristic is
\[ \widehat{\text{MCSE}}(\hat{\phi}) = \sqrt{ \frac{R-1}{R} \sum_{r=1}^{R} \left( \hat{\phi}_{(r)} - \bar{\phi}_{(\cdot)} \right)^2 }, \]
where $\hat{\phi}_{(r)}$ is an estimate calculated with the $r$th replicate removed and $\bar{\phi}_{(\cdot)} = R^{-1} \sum_{r} \hat{\phi}_{(r)}$.
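For a mean-type operating characteristic such as coverage, this jackknife formula reduces to the familiar $s/\sqrt{R}$. A minimal Python sketch with simulated coverage indicators:

```python
import numpy as np

def mcse_jackknife(x):
    """Jackknife Monte Carlo standard error for an operating characteristic
    computed as a mean over simulation replicates (e.g., x holds 0/1
    coverage indicators across R simulated data sets)."""
    x = np.asarray(x, dtype=float)
    R = x.size
    loo = (x.sum() - x) / (R - 1)  # leave-one-out estimates, one per replicate
    return np.sqrt((R - 1) / R * np.sum((loo - loo.mean()) ** 2))

# Example: coverage of nominal 95% intervals across 200 replicates
rng = np.random.default_rng(0)
cover = rng.binomial(1, 0.95, size=200)
print(cover.mean(), mcse_jackknife(cover))
```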
F. Hospital readmission data application
Figure 4 shows the degree to which matching on the estimated propensity score reduces imbalances in the observed covariates. All treated patients (i.e., patients discharged to home care with support) were matched to similar control patients (i.e., patients discharged to home care without additional support). While matching substantially reduces covariate imbalance, some imbalance remains on the length of the initial hospital stay.
Figure 4. Reduction in covariate imbalance after propensity score matching of late-stage pancreatic cancer patients discharged to home care with support.
Conflict of Interest
The authors have declared no conflict of interest.
References
- Bartolucci F and Grilli L (2011). Modeling partial compliance through copulas in a principal stratification framework. JASA 106, 469–479.
- Carpenter B, Gelman A, Hoffman M, Lee D, Goodrich B, Betancourt M, Brubaker M, Guo J, Li P, and Riddell A (2017). Stan: a probabilistic programming language. Journal of Statistical Software 76(1), 1–32.
- Chen D, Li J, and Chong JK (2017). Hazards regression for freemium products and services: a competing risks approach. Journal of Statistical Computation and Simulation 87, 1863–1876.
- Dai J, Gilbert P, and Mâsse B (2012). Partially hidden Markov model for time-varying principal stratification in HIV prevention trials. JASA 107, 52–65.
- Fine JP, Jiang H, and Chappell R (2001). On semi-competing risks data. Biometrika 88, 907–919.
- Gelman A, Carlin J, Stern H, Dunson D, Vehtari A, and Rubin D (2013). Bayesian Data Analysis. Chapman and Hall/CRC.
- Gustafson P (2010). Bayesian inference for partially identified models. The International Journal of Biostatistics 6(2).
- Hernán MA and Robins JM (2016). Using big data to emulate a target trial when a randomized trial is not available. American Journal of Epidemiology 183, 758–764.
- Ho D, Imai K, King G, and Stuart E (2007). Matching as nonparametric preprocessing for reducing model dependence in parametric causal inference. Political Analysis 15, 199–236.
- Jin H and Rubin D (2008). Principal stratification for causal inference with extended partial compliance. JASA 103, 101–111.
- Koehler E, Brown E, and Haneuse SJ-P (2009). On the assessment of Monte Carlo error in simulation-based statistical analyses. The American Statistician 63, 155–162.
- Lee K, Haneuse S, Schrag D, and Dominici F (2015). Bayesian semiparametric analysis of semicompeting risks data: investigating hospital readmission after a pancreatic cancer diagnosis. JRSS:C 64, 253–273.
- Lin J, Ten Have T, and Elliott M (2008). Longitudinal nested compliance class model in the presence of time-varying noncompliance. JASA 103, 462–473.
- Liu L, Wolfe R, and Huang X (2004). Shared frailty models for recurrent events and a terminal event. Biometrics 60, 747–756.
- Long D and Hudgens M (2013). Sharpening bounds on principal effects with covariates. Biometrics 69, 812–819.
- Lyu T, Bornkamp B, Mueller-Velten G, and Schmidli H (2023). Bayesian inference for a principal stratum estimand on recurrent events truncated by death. Biometrics (early view).
- Mattei A, Ding P, Ballerini V, and Mealli F (2024). Assessing causal effects in the presence of treatment switching through principal stratification. Bayesian Analysis, to appear.
- Nevo D and Gorfine M (2022). Causal inference for semi-competing risks data. Biostatistics 23, 1115–1132.
- Pan Q and Gastwirth J (2013). Estimating restricted mean job tenures in semi-competing risk data compensating victims of discrimination. The Annals of Applied Statistics 7, 1474–1496.
- Robins J (1986). A new approach to causal inference in mortality studies with a sustained exposure period—application to control of the healthy worker survivor effect. Mathematical Modelling 7, 1393–1512.
- Royston P and Parmar M (2002). Flexible parametric proportional-hazards and proportional-odds models for censored survival data, with application to prognostic modelling and estimation of treatment effects. Statistics in Medicine 21, 2175–2197.
- Rubin D (1990). Comment: Neyman (1923) and causal inference in experiments and observational studies. Statistical Science 5, 472–480.
- Rubin D (2000). Causal inference without counterfactuals: comment. JASA 95, 435–438.
- Schwartz S, Li F, and Mealli F (2011). A Bayesian semiparametric approach to intermediate variables in causal inference. JASA 106, 1331–1344.
- Stensrud MJ and Dukes O (2022). Translating questions to estimands in randomized clinical trials with intercurrent events. Statistics in Medicine 41, 3211–3228.
- Stensrud MJ, Young JG, Didelez V, Robins JM, and Hernán MA (2020). Separable effects for causal inference in the presence of competing events. JASA, 1–9.
- Tchetgen Tchetgen E (2014). Identification and estimation of survivor average causal effects. Statistics in Medicine 33, 3601–3628.
- Xu J, Kalbfleisch J, and Tai B (2010). Statistical analysis of illness–death processes and semicompeting risks data. Biometrics 66, 716–725.
- Xu Y, Scharfstein D, Müller P, and Daniels M (2022). A Bayesian nonparametric approach for evaluating the causal effect of treatment in randomized trials with semi-competing risks. Biostatistics 23, 34–49.
- Young JG, Stensrud MJ, Tchetgen Tchetgen EJ, and Hernán MA (2020). A causal framework for classical statistical estimands in failure-time settings with competing events. Statistics in Medicine 39, 1199–1236.
- Zhang J and Rubin D (2003). Estimation of causal effects via principal stratification when some outcomes are truncated by "death". Journal of Educational and Behavioral Statistics 28, 353–368.