Biostatistics (Oxford, England). 2021 Dec 30;23(4):1115–1132. doi: 10.1093/biostatistics/kxab049

Causal inference for semi-competing risks data

Daniel Nevo, Malka Gorfine
PMCID: PMC9566449  PMID: 34969069

Summary

The causal effects of Apolipoprotein E ε4 allele (APOE) on late-onset Alzheimer’s disease (AD) and death are complicated to define because AD may occur under one intervention but not under the other, and because AD occurrence may affect age of death. In this article, this dual outcome scenario is studied using the semi-competing risks framework for time-to-event data. Two event times are of interest: a nonterminal event time (age at AD diagnosis), and a terminal event time (age at death). AD diagnosis time is observed only if it precedes death, which may occur before or after AD. We propose new estimands for capturing the causal effect of APOE on AD and death. Our proposal is based on a stratification of the population with respect to the order of the two events. We present a novel assumption utilizing the time-to-event nature of the data, which is more flexible than the often-invoked monotonicity assumption. We derive results on partial identifiability, suggest a sensitivity analysis approach, and give conditions under which full identification is possible. Finally, we present and implement nonparametric and semiparametric estimation methods under right-censored semi-competing risks data for studying the complex effect of APOE on AD and death.

Keywords: Alzheimer’s disease, Bounds, Frailty model, Illness-death model, Principal stratification

1. Introduction

Having at least one Apolipoprotein E ε4 allele (APOE) has been associated with a substantial increase in risk of late-onset Alzheimer’s disease (AD) (Alzheimer’s Association, 2019). With the development of clustered regularly interspaced short palindromic repeats (CRISPR) for possible gene editing, researchers have begun contemplating gene editing of APOE as a preventive intervention against AD (Safieh and others, 2019). However, the standard average causal effect on AD is ill-defined because some study participants die without an AD diagnosis, and it is unknown whether APOE has a causal effect on survival (Corder and others, 1993; Dal Forno and others, 2002). When considering the impact of intervening on APOE, a holistic approach focusing on effects on both AD and death is desirable to help public health officials weigh potential benefits and risks and conclude whether gene editing of APOE is a viable intervention.

The semi-competing risks (SCR) framework has been used as a basis for studying AD and death jointly (Varadhan and others, 2014; Lee and others, 2017; Nevo and others, 2021a). More generally, SCR data arise when times to two events of interest are studied: a nonterminal event (e.g., disease diagnosis) and a terminal event (e.g., death). The nonterminal event is observed only if it precedes the terminal event, which may occur before or after the nonterminal event.

The illness-death model is often used for studying SCR data. It focuses on transition times between three different states: an initial state (state “0,” healthy), a diseased state (“1,” AD), and an absorbing state (“2,” death). Possible transitions are 0→1, 0→2, and 1→2. Various time-to-event models have been suggested to analyze SCR data in this framework. Typically, inter-personal dependence between illness and death times is modeled via a random effect, known as frailty (Xu and others, 2010; Lee and others, 2017; Gorfine and others, 2021).
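To make the SCR data structure concrete, the following minimal sketch maps latent illness, death, and censoring times to what would be observed for one unit. The function name `observe_scr` and the returned field names are hypothetical and are not taken from the CausalSemiComp package; a unit that never develops the disease is encoded here, by assumption, with an infinite illness time.

```python
import math


def observe_scr(t1, t2, c):
    """Map latent times to observed semi-competing risks data.

    t1: latent time of the nonterminal event (math.inf if it never occurs)
    t2: latent time of the terminal event
    c:  right-censoring time
    """
    # The nonterminal event is seen only if it precedes death and censoring.
    delta1 = int(t1 <= t2 and t1 <= c)
    # The terminal event is seen only if it precedes censoring.
    delta2 = int(t2 <= c)
    return {
        "time1": min(t1, t2, c),  # follow-up time for the nonterminal event
        "delta1": delta1,
        "time2": min(t2, c),      # follow-up time for the terminal event
        "delta2": delta2,
    }


# Diagnosed at 70, died at 85, censoring would have occurred at 90.
print(observe_scr(70, 85, 90))
# Died at 80 without a diagnosis: the nonterminal time is cut off by death.
print(observe_scr(math.inf, 80, 90))
```

Note the asymmetry that defines semi-competing risks: death truncates the illness time, whereas illness does not stop follow-up for death.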

In the causal inference paradigm, the related “truncation by death” problem has been studied (Zhang and Rubin, 2003; Long and Hudgens, 2013; Tchetgen Tchetgen, 2014; Ding and Lu, 2017; Zehavi and Nevo, 2021). This problem arises when researchers aspire to compare outcomes between the treated and untreated groups at a fixed time point after the treatment was assigned, but some study participants may die prior to outcome collection, hence the average causal effect is not well-defined. Using principal stratification (Frangakis and Rubin, 2002), the survivor average causal effect (SACE) is the causal effect within the stratum of people who would have survived under both treatment values. A similar problem arises in vaccination studies, as vaccine effects on post-infection outcomes are well-defined only among those who would have been infected with or without vaccinating (Shepherd and others, 2007). The SACE is nonparametrically unidentifiable, unless strong assumptions are made. Therefore, researchers have developed bounds (Zhang and Rubin, 2003; Long and Hudgens, 2013) and sensitivity analyses (Shepherd and others, 2007). However, none of these papers apply to SCR data, where the time-to-event nature of both AD diagnosis and death must be addressed.

Robins (1995) utilized additional assumptions to show the average causal effect (not SACE) is identifiable when censoring time is always observed, and without distinguishing between censoring due to death or other reasons. More recently, two papers have considered causal questions for SCR data (Comment and others, 2019; Xu and others, 2020). These papers suggested Bayesian methods to estimate the time-varying SACE (TV-SACE) and restricted mean SACE (RM-SACE). These papers provide a major improvement toward causal reasoning for SCR data. One of the works relies on parametric frailty-based illness-death models (Comment and others, 2019), while the other coupled a parametric copula-based sensitivity analysis with a nonparametric Bayesian method under an additional identifying assumption (Xu and others, 2020). Questions of nonparametric partial identifiability under weaker assumptions, sensitivity analyses without the copula model, and frequentist estimation were not addressed.

In this article, we define new causal estimands for capturing the causal effect of APOE on AD and death times within the SCR framework. The general idea resembles principal stratification in that we propose to stratify the population according to their potential outcomes. In Section 2, we propose new causal estimands for SCR data that focus on effects defined within a fixed, not time-varying, population. As with principal stratification, membership status of this subpopulation is unobserved. In Section 3, we present novel causal assumptions under the SCR data structure that are motivated by the study of APOE and AD. Our new assumption, termed order preservation, is more flexible than the traditionally used monotonicity assumption (Zhang and Rubin, 2003). Section 3 also provides partial identifiability results for the proposed estimands under the new assumptions and full identifiability results under a bivariate frailty semiparametric illness-death model. In Section 4, we propose kernel-based nonparametric methods and a sensitivity analysis built upon a semiparametric illness-death frailty model. A fully reproducible simulation study is described in Section 5, before turning to the study of APOE effects on AD and death, at length, in Section 6. Concluding remarks are given in Section 7. The R package CausalSemiComp implements all the developed methodologies and is available from https://github.com/daniel258/CausalSemiComp and the reproducibility materials are available from https://github.com/daniel258/CausalSemiCompReproduce.

2. Notation and causal estimands

Let Inline graphic and Inline graphic be the times to AD diagnosis and death, respectively, had we set APOE level to Inline graphic, with Inline graphic indicating having at least one APOE ε4 allele, and Inline graphic indicating no APOE ε4 alleles. Let Inline graphic be the joint density function of Inline graphic on Inline graphic for Inline graphic. Because death may occur before AD diagnosis, Inline graphic. The event Inline graphic or, stated differently, Inline graphic corresponds to not having AD under Inline graphic. We defer assumptions on the joint distribution of Inline graphic and Inline graphic for later; these two pairs could never be observed simultaneously.

Consider the average causal effect of APOE on the survival time Inline graphic. Because APOE affects AD incidence (Alzheimer’s Association, 2019), and because AD patients are likely to die earlier than they would have without AD (Tom and others, 2015), we would expect Inline graphic, even if APOE does not carry an effect on age at death other than the effect resulting from earlier AD onset. Additionally, when focus is on AD, Inline graphic is ill-defined, because Inline graphic for some study units for either or both Inline graphic and Inline graphic.

Therefore, causal inference for SCR data is entangled. Principal stratification effects suggested to address this challenge are defined within strata of the population created by the potential values of a postrandomization variable, which is not the object of inquiry (Frangakis and Rubin, 2002). For SCR data, principal stratification approaches can be divided into two types: time-fixed and time-varying estimands. The time-fixed approach focuses on a single time point Inline graphic, say age 80, and defines the binary potential outcomes Inline graphic and Inline graphic as the survival status and AD status at time Inline graphic under APOE level Inline graphic. The SACE Inline graphic compares the population-level risk of AD by age Inline graphic with and without APOE only among those who would have survived until age Inline graphic under both APOE statuses. Typically, strong assumptions are needed for identifying the SACE. One commonly made assumption is the monotonicity assumption Inline graphic (Zhang and Rubin, 2003; Ding and Lu, 2017; Zehavi and Nevo, 2021). That is, those who would survive beyond time Inline graphic with APOE would also survive beyond Inline graphic had they not had APOE. A more recent approach (Comment and others, 2019; Xu and others, 2020) focuses on the TV-SACE causal effect Inline graphic for Inline graphic. The TV-SACE has two major advantages over the standard SACE: at each survival age Inline graphic it compares nonterminal event rates at multiple time points, and it respects that the population of always-survivors depends on the chosen time Inline graphic. Of note is that because the population changes with time, a change in the TV-SACE as a function of Inline graphic may reflect a change in the population or a change in the efficacy of the intervention.

2.1. Population stratification and alternative estimands

We propose alternative estimands, based on a single partition of the population. Unlike the partitions defining the SACE and TV-SACE, this single partition is not defined with respect to a specific (possibly varying) time point. Instead, we divide the population into four strata according to their AD and death “order.” Because our partition involves the dual outcomes, we term it population stratification. Define the following four subpopulations with respect to the order of Inline graphic for Inline graphic. The “always-diseased” (ad) will be diagnosed with AD at some point of their life, regardless of their APOE status (Inline graphic for Inline graphic); the “never-diseased” (nd) will die without having AD first, regardless of their APOE status (Inline graphic for Inline graphic); the “harmed” (dh) would be diagnosed with AD prior to death only had they had APOE (Inline graphic); and the “protected” (dp) would be diagnosed with the disease only without APOE (Inline graphic). Let Inline graphic, Inline graphic, Inline graphic and Inline graphic be the respective subpopulation proportions; see also Table 1.

Table 1.

Population stratification for semi-competing risks data

Name | Stratum definition | Stratum proportion
Always-diseased (ad) | Inline graphic & Inline graphic | Inline graphic
Never-diseased (nd) | Inline graphic & Inline graphic | Inline graphic
Disease-harmed (dh) | Inline graphic & Inline graphic | Inline graphic
Disease-protected (dp) | Inline graphic & Inline graphic | Inline graphic
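As an illustration of the partition in Table 1, the sketch below assigns a unit to a stratum from its four potential event times. It is a conceptual aid only: both pairs of potential times are never observed for the same unit, so this classification cannot be carried out on real data. The function name `stratum` and the convention of encoding "no AD before death" as an infinite AD time are assumptions made for the example.

```python
import math


def stratum(t1_0, t2_0, t1_1, t2_1):
    """Return the stratum label implied by the order of the potential times.

    t1_a, t2_a: potential AD and death times under APOE level a (0 or 1).
    A unit has AD under level a when its AD time does not exceed its death time.
    """
    diseased_without_apoe = t1_0 <= t2_0
    diseased_with_apoe = t1_1 <= t2_1
    if diseased_without_apoe and diseased_with_apoe:
        return "ad"  # always-diseased
    if not diseased_without_apoe and not diseased_with_apoe:
        return "nd"  # never-diseased
    if diseased_with_apoe:
        return "dh"  # disease-harmed: AD only with APOE
    return "dp"      # disease-protected: AD only without APOE


print(stratum(80, 90, 72, 85))              # ad
print(stratum(math.inf, 88, 79, 86))        # dh
print(stratum(math.inf, 84, math.inf, 81))  # nd
```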

Consider the stratum-specific causal estimands Inline graphic for Inline graphic and Inline graphic. Of note is that Inline graphic. Furthermore, the contrast Inline graphic is hard to interpret because under APOE, all dh would have AD before their death, but none of them would have AD before death under no APOE. A similar concern arises for the dp stratum. While Inline graphic is the additional proportion of AD patients in the population had the entire population had APOE versus had none of the population had APOE, this quantity should be interpreted with care. The dh stratum may include people for whom APOE induces early AD onset without affecting age of death and people for whom APOE prolongs survival long enough to be diagnosed with AD. Therefore, we henceforth focus on the following effects defined in the ad and nd strata

graphic file with name Equation11.gif (2.1)
graphic file with name Equation12.gif (2.2)
graphic file with name Equation13.gif (2.3)

Causal effects that better summarize the impact of Inline graphic could be based on the restricted mean survival time (RMST). For a given time point Inline graphic, let Inline graphic, and define

graphic file with name Equation14.gif (2.4)
graphic file with name Equation15.gif (2.5)
graphic file with name Equation16.gif (2.6)
graphic file with name Equation17.gif (2.7)

Each of the estimands characterizes a different causal effect of Inline graphic on the outcomes within the ad or nd stratum. Effect (2.4) is the mean (possibly negative) gain in life expectancy caused by APOE, (2.5) is the effect of APOE on AD diagnosis age, and (2.6), which is obtained by subtracting (2.5) from (2.4), is the effect on the residual lifetime with AD. If, for example, the latter equals zero, it means the same life expectancy with AD is expected for those with and without APOE within the ad stratum. Effect (2.7) is also of great interest, as it is the effect of APOE on life expectancy among the nd population, thus it captures another form of APOE effect on survival. Other summary measures (e.g., median time-to-events) can also be used to define causal effects.
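Because the displayed equations did not survive extraction, the following is only a hedged sketch of how the verbal descriptions of (2.4)–(2.7) could be written. The notation is assumed rather than recovered: $T_1(a)$ and $T_2(a)$ stand for the potential AD and death times under APOE level $a$, $x \wedge \tau = \min(x, \tau)$ for the restriction time $\tau$, and "ad" and "nd" denote membership in the strata of Table 1. The published displays may differ in form or notation.

```latex
% Assumed notation (not recovered from the source):
%   T_1(a), T_2(a): potential AD and death times under APOE level a
%   x \wedge \tau = \min(x, \tau)
\begin{align*}
\text{(2.4)}\quad & E\{T_2(1)\wedge\tau - T_2(0)\wedge\tau \mid \mathrm{ad}\}, \\
\text{(2.5)}\quad & E\{T_1(1)\wedge\tau - T_1(0)\wedge\tau \mid \mathrm{ad}\}, \\
\text{(2.6)}\quad & E\{[T_2(1)\wedge\tau - T_1(1)\wedge\tau]
                     - [T_2(0)\wedge\tau - T_1(0)\wedge\tau] \mid \mathrm{ad}\}, \\
\text{(2.7)}\quad & E\{T_2(1)\wedge\tau - T_2(0)\wedge\tau \mid \mathrm{nd}\}.
\end{align*}
```

Written this way, the third line equals the first minus the second, which matches the description of (2.6) as the difference between (2.4) and (2.5).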

2.2. Utility of proposed estimands and comparison to existing approaches

Had we ignored follow-up after AD diagnosis, we would have a classical competing risks setup with the events AD diagnosis and AD-free death. Recently, there have been major advancements towards causal inference in the presence of competing events (Young and others, 2020; Stensrud and others, 2020; Nevo and others, 2021b). The total effect (Young and others, 2020) contrasts the proportion of AD patients by age Inline graphic under APOE and no APOE for the entire population. A total effect can also be defined for the APOE effect on survival. The total effects are useful to answer a multitude of research questions, but one should be aware that interpretation can be complicated. For example, it may seem as if APOE protects against AD-free death, simply because it induces earlier onset of AD. The direct effect (Young and others, 2020) of APOE on AD contrasts the counterfactual AD risk with and without APOE had we eliminated the competing event for the entire population. To our knowledge, such an intervention is not available for either AD or death. Stensrud and others (2020) proposed alternative causal estimands for competing-risks data, based on conceptual separation of the treatment into components such that each component affects a different competing event. The SCR framework has been increasingly used for aging research; see Varadhan and others (2014) for review and comparisons with competing risks. Unlike competing risks, the SCR viewpoint additionally considers follow-up after the occurrence of the nonterminal event. Therefore, the SCR framework presents an opportunity to study the dual outcomes of interest holistically, while considering simultaneously benefits and risks with respect to both events.

As we demonstrate below, the proposed estimands can be used, for example, for studying the effect of APOE on AD age-of-onset among those who would have AD regardless of APOE status, or the effect of APOE on survival, separately for those who would and those who would not have AD regardless of APOE status. The latter effect cannot be studied if follow-up after AD is ignored. In addition, SCR data provide partial information about the dependence of the two event times (conditionally on Inline graphic), which is unidentifiable from competing-risks data, a fact often leveraged in models for SCR data.

A second approach views AD diagnosis as a time-varying and possibly death-truncated mediator of the effect of APOE on survival. Interventional direct and indirect effects are more appealing for such a task, as the assumption of no common causes of AD and death affected by APOE seems too strong; see Lin and others (2017). The mediation approach is desirable when the question of interest is the extent to which AD mediates the effect of APOE on survival. In this work, we make no direct attempt to answer such questions. Our approach does not specify the role of AD as a mediator. For example, APOE may affect cognitive decline underpinning AD, and this cognitive decline may affect death not only through earlier AD onset. Therefore, it is possible that AD is a proxy of an unmeasured mediator in addition to its possible, though not necessary, own mediating role.

Nevertheless, a connection of the proposed estimands to the mediation literature can be made. Let Inline graphic. One may argue that estimands (2.4) and (2.7) resemble principal stratum direct effects (PSDEs) under Inline graphic and Inline graphic, respectively. As highlighted by VanderWeele (2011), PSDEs are generally interpretable as direct effects. In our motivating example, one may consider (2.4) as the effect of APOE on survival which is not mediated by the occurrence of AD.

From a public health policy point of view, the decision whether CRISPR should be used to eliminate APOE must consider the population effect on the dual outcomes of AD and death. Our proposed estimands offer several pieces of information needed when considering the population health and economic value of such an intervention. In Section 6, we estimate the ad proportion to be 35% and the median gain in AD-free years by age 100 (analog of estimand (2.5)) to be 2.01 years (95% CI 1.92–2.09). This means that eliminating APOE would delay AD diagnosis by at least 2 years for at least half of those who would have AD under any APOE status. To complement this quantity, (2.4) is the gain in survival in the same population; the median gain is estimated to be 3.00 years (95% CI 2.93–3.07), meaning that eliminating APOE extends life for at least half of those within the ad stratum by at least 3 years. The APOE intervention effect is not limited to the ad stratum. In Section 6, we estimate the nd proportion to be 53%, and the median gain in survival by age 100 to be 1.99 years (95% CI 1.90–2.08). Considered collectively, the results provide initial evidence for desirable effects of eliminating APOE on both AD and survival in the population. A careful inspection will help regulators and public health officials to weigh the consequences of APOE intervention at scale. The TV-SACE and RM-SACE do not provide the above-described information, but they can complement it by providing insight into heterogeneity of the effect across time.

3. Identifiability

The focus of this section is on assumptions and identifiability. Methods for estimation and inference under right censoring are developed in Section 4. Let Inline graphic be the actual APOE and event times. Throughout this article, we assume no interference between individuals and the following two standard assumptions.

Assumption 1

Consistency. Inline graphic and Inline graphic.

Assumption 2

Weak ignorability. Inline graphic for Inline graphic.

In many observational studies that lack randomization, a conditional version of Assumption 2, conditionally on covariates, is more plausible; see also Section 3.4.

3.1. Bounds

Because principal stratum effects are not identified from the data unless additional assumptions are leveraged, researchers often develop bounds (Zhang and Rubin, 2003; Long and Hudgens, 2013). Here, we present various bounds for the causal effects based on the identifiable distribution of Inline graphic, constructed differently depending on the assumptions a researcher is willing to make. First, we present a new assumption, tailored for SCR data.

Assumption 3

Order preservation. If Inline graphic, then Inline graphic.

Under Assumption 3, anyone who would have been diagnosed with AD without having APOE would also have AD had they had APOE, but not necessarily at the same age. It is consistent with the overwhelming evidence that APOE significantly increases the risk of AD (Alzheimer’s Association, 2019). Assumption 3 is more flexible than the often-invoked monotonicity assumption, in that it acknowledges the realistic scenario that Inline graphic may impact survival time in either direction. Specifically, it allows that Inline graphic and Inline graphic are of different signs at the unit level. To see this, observe that Inline graphic implies Inline graphic. Figure 1 presents potential AD and death times for seven hypothetical units, differing in the timing and order of the potential event times. The figure depicts examples of units that can exist under both assumptions, and units whose existence violates one of the assumptions (or both). Of note is that for a unit with no AD under either APOE status, Assumption 3 holds, while monotonicity would not hold if APOE prolongs the lifetime of that unit. Finally, Assumption 3 implies that the dp stratum is empty, that is, Inline graphic.
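The contrast between the two assumptions can be sketched at the unit level. The two predicates below encode order preservation as stated in Assumption 3 and monotonicity at a single time point as described in Section 2 (surviving beyond that time with APOE implies surviving beyond it without APOE). The function names and the example units are hypothetical and are not the units of Figure 1; an infinite AD time is again assumed to mean that death occurs without AD.

```python
import math


def order_preserved(t1_0, t2_0, t1_1, t2_1):
    """Assumption 3: AD before death without APOE implies AD before death with APOE."""
    return (not t1_0 <= t2_0) or (t1_1 <= t2_1)


def monotone_at(t, t2_0, t2_1):
    """Monotonicity at time t: survival beyond t with APOE implies survival beyond t without."""
    return (not t2_1 > t) or (t2_0 > t)


# Never diseased, and APOE prolongs life from 80 to 86.
print(order_preserved(math.inf, 80, math.inf, 86))  # True: the premise never applies
print(monotone_at(83, 80, 86))                      # False: alive at 83 only with APOE

# AD at 78 without APOE, but with APOE death at 75 comes before any AD.
print(order_preserved(78, 88, math.inf, 75))        # False: a "protected" unit
print(monotone_at(80, 88, 75))                      # True
```

The first unit satisfies order preservation but not monotonicity, and the second does the reverse, so neither assumption implies the other at the unit level.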

Fig. 1.

Illustration of Assumption 3 and its distinction compared to the monotonicity assumption using seven hypothetical units. Units 1 and 2 are allowed under both assumptions, Assumption 3 and monotonicity. For unit 1, APOE expedites both AD and death, but in a way that AD remains before death. For unit 2, APOE impedes AD but expedites death. Nevertheless, unit 2 would live long enough under APOE for AD to occur. A similar protective effect occurs for unit 3; however, this unit violates monotonicity at time Inline graphic (as an example). Unit 4 also violates monotonicity (at Inline graphic and Inline graphic) but not Assumption 3. Unit 5 does not suffer from AD under any APOE value, hence it is allowed by Assumption 3, but it does violate monotonicity (at Inline graphic and Inline graphic) as APOE carries a protective effect against death. Unit 6 violates Assumption 3 but not monotonicity. Arguably, for this unit the APOE effect on death is so strong that the unit would have died under APOE before getting the chance to have AD. Finally, unit 7 violates both assumptions.

As is typically the case in truncation-by-death problems, even under Assumptions 1–3, none of the causal effects (2.1)–(2.7) are nonparametrically identifiable. These effects are however partially identifiable, in the sense that the data, coupled with the assumptions, provide bounds on these effects. For any event Inline graphic, let Inline graphic, Inline graphic and Inline graphic. Under Assumptions 1–3, Inline graphic and Inline graphic are identified, while Inline graphic and Inline graphic are only partially identified. The following proposition establishes partial identifiability under Assumptions 1–3.

Proposition 1

Under Assumptions 1–3, effects (2.1)–(2.3) are partially identified by


Proposition 1 (3.8)

Proposition 1 (3.9)

Proposition 1 (3.10)

where


Proposition 1 (3.11)

The proof is given in Section A.1 of the Supplementary material available at Biostatistics online. Under Assumptions 1–3, the strata proportions are identified by Inline graphic, Inline graphic, Inline graphic. RMST effects (2.5)–(2.7) are (partially) identified by recalling that for any nonnegative random variable Inline graphic and any fixed time Inline graphic, we have Inline graphic. Had we had unlimited uncensored data, we could have calculated the bounds using Inline graphic, Inline graphic and Inline graphic. As for Inline graphic, we could have used Inline graphic.
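Two steps of this paragraph can be sketched numerically for the idealized case of uncensored data. The stratum-proportion formulas below are derived here from the stated assumptions rather than copied from the lost expressions: with the dp stratum empty, units with AD when APOE is absent are exactly the ad stratum, and units without AD when APOE is present are exactly the nd stratum. The second function checks the restricted-mean identity, under which the mean of min(X, tau) equals the integral of the survival function over [0, tau]. All function names and the toy data are hypothetical.

```python
import math


def strata_proportions(arm0, arm1):
    """Stratum proportions under Assumptions 1-3 from uncensored (t1, t2) pairs.

    arm0, arm1: lists of (AD time, death time) for units without and with APOE.
    """
    p_ad = sum(t1 <= t2 for t1, t2 in arm0) / len(arm0)
    p_nd = sum(t1 > t2 for t1, t2 in arm1) / len(arm1)
    return {"ad": p_ad, "nd": p_nd, "dh": 1.0 - p_ad - p_nd, "dp": 0.0}


def restricted_mean_direct(times, tau):
    """Sample mean of min(X, tau)."""
    return sum(min(x, tau) for x in times) / len(times)


def restricted_mean_from_survival(times, tau):
    """Integrate the empirical survival function S(u) = mean(X > u) over [0, tau]."""
    n = len(times)
    cuts = sorted({min(x, tau) for x in times} | {tau})
    area, left = 0.0, 0.0
    for right in cuts:
        surv = sum(x > left for x in times) / n  # S is constant on [left, right)
        area += surv * (right - left)
        left = right
    return area


arm0 = [(70, 80), (math.inf, 85), (math.inf, 90), (math.inf, 75)]
arm1 = [(65, 78), (72, 80), (math.inf, 88), (math.inf, 70)]
print(strata_proportions(arm0, arm1))

times = [2.0, 5.0, 9.0]
print(restricted_mean_direct(times, 6.0), restricted_mean_from_survival(times, 6.0))
```

With right censoring, the empirical proportions and survival function used here would need to be replaced by estimators that account for censoring, as developed in Section 4.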

Generally, bounds derived under additional assumptions might be narrower (Zhang and Rubin, 2003; Long and Hudgens, 2013). Of course, these assumptions need to be plausible in the scientific problem at hand for the bounds to be valid. One particular bound that may not be informative enough is Inline graphic. As Inline graphic increases, the second term Inline graphic gets closer to one while the first term does not, because Inline graphic. As a remedy for this problem, consider the following assumption and proposition providing a more informative lower bound for (2.3).

Assumption 4

For a given Inline graphic, Inline graphic.

This assumption resembles the ranked average score assumption (Zhang and Rubin, 2003), but tailored for time-to-event data. Assumption 4 reflects a researcher's belief that the ad are generally more frail (with respect to AD), and hence are expected to be diagnosed at earlier ages than those in the dh stratum. This may be the case, for example, if those in the ad stratum are more likely than those in the dh stratum to have other AD risk factors. In practice, Assumption 4 can hold for some, all, or none of the values of Inline graphic.

Proposition 2

Under Assumptions 1–3, for any Inline graphic for which Assumption 4 holds, Inline graphic is partially identified by Inline graphic, where Inline graphic and Inline graphic is given in Proposition 1.

The proof is given in Section A.2 of the Supplementary material available at Biostatistics online. In some studies, the obtained bounds, with or without additional assumptions, might be too wide to have practical utility. Therefore, following Long and Hudgens (2013), we develop narrower bounds utilizing a preintervention discrete covariate Inline graphic. In our study, Inline graphic can be, for example, gender. An adjusted version of Proposition 1 can be obtained under the following assumption.

Assumption 5

Inline graphic for Inline graphic.

For example, we show in Section A.3 of the Supplementary material available at Biostatistics online that under Assumptions 1, 3, and 5, adjusted bounds for (2.1) are given by Inline graphic, where

graphic file with name Equation22.gif (3.12)

and Inline graphic. Similarly, adjusted bounds Inline graphic and Inline graphic can be constructed. Furthermore, the following proposition establishes that the adjusted bounds are always within the unadjusted bounds.

Proposition 3

Under Assumptions 1, 3 and 5, Inline graphic and Inline graphic.

The proof is given in Section A.4 of the Supplementary material available at Biostatistics online. Proposition 3 is the analog of Proposition 1 in Long and Hudgens (2013), who developed bounds for binary outcomes. Of note is that due to finite-sample variation, Proposition 3 may not hold empirically. Therefore, as suggested by Long and Hudgens (2013), the bounds Inline graphic and Inline graphic can be used.
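One simple way to guard against adjusted bounds that fall outside the unadjusted ones in a finite sample is to intersect the two intervals at each time point. The sketch below takes the larger of the two lower bounds and the smaller of the two upper bounds; this is an assumption about the form of the combined bounds, whose exact expressions were not recovered from the source. The function name and the numbers are hypothetical.

```python
def combine_bounds(lower, upper, lower_adj, upper_adj):
    """Intersect unadjusted and covariate-adjusted bounds, pointwise over a time grid."""
    combined_lower = [max(lo, lo_adj) for lo, lo_adj in zip(lower, lower_adj)]
    combined_upper = [min(up, up_adj) for up, up_adj in zip(upper, upper_adj)]
    return combined_lower, combined_upper


# Toy bounds at two time points; the adjusted lower bound at the first
# point is below the unadjusted one because of sampling variation.
lower, upper = [-0.30, -0.25], [0.20, 0.22]
lower_adj, upper_adj = [-0.32, -0.10], [0.12, 0.15]
print(combine_bounds(lower, upper, lower_adj, upper_adj))
# ([-0.3, -0.1], [0.12, 0.15])
```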

Figure 2 illustrates the various bounds under three different scenarios. The data-generating mechanism (DGM) and the resulting per-stratum cumulative distribution functions are described in Section D.1 of the Supplementary material available at Biostatistics online. Under Scenario (I) (top row), Assumptions 3 and 4 hold, and Inline graphic shorten the Inline graphic time-to-disease and time-to-death, and Inline graphic die faster under Inline graphic. The bounds Inline graphic, Inline graphic, and Inline graphic (in pink shade) are quite wide in this scenario. The bound Inline graphic (dashed green line) is far more informative, being only slightly lower than the true difference (black solid curve). The adjusted bounds (blue dotted lines) were narrower than the unadjusted bounds, most notably for Inline graphic values for which the unadjusted bounds are very wide. For the causal effect on time-to-disease within the Inline graphic (top left corner) the narrower bound is obtained as Inline graphic. The bounds for time-to-death among the Inline graphic are quite wide. This is likely because the observed data carry the least information on this group, as the true stratum probabilities were Inline graphic.

Fig. 2.

True causal effects (2.1)–(2.3) and different bounds under three simulated scenarios. The bounds derived in Proposition 1, Inline graphic, Inline graphic, and Inline graphic, are in pink shade. The blue dotted lines are the adjusted bounds Inline graphic, Inline graphic, and Inline graphic. The green dashed lines are the bounds Inline graphic, Inline graphic.

For further illustration of the latter point, consider the two bottom rows of Figure 2, both under null effects. In Scenario (II), the Inline graphic stratum comprised most of the population, Inline graphic, and the bounds were quite narrow for the within Inline graphic effect (2.2), and less informative for the Inline graphic. In Scenario (III), Inline graphic was substantial, Inline graphic, and the bounds were quite narrow only for causal effects within the Inline graphic. In conclusion, if one of the proportions Inline graphic or Inline graphic is large enough, the bounds can be useful, and a covariate Inline graphic may help to create narrower bounds. Under Assumption 4, the lower bound Inline graphic is quite informative.

3.2. Identification by frailty assumptions

We turn to present stronger assumptions under which effects (2.1)–(2.3) are identified from the data. The proposed overarching strategy formulates two illness-death models and ties them together via a bivariate frailty random variable Inline graphic. For Inline graphic, let

graphic file with name Equation23.gif

be the three cause-specific hazard functions under the SCR setting (Xu and others, 2010), associated with the transitions Inline graphic (0: healthy, 1: AD, 2: death); see Figure C.1 of the Supplementary material available at Biostatistics online for illustration. The strategy of how to model the distribution of Inline graphic dictates two forms of dependence, cross-world and within-world. A frailty-based approach with parametric assumptions on the hazard functions, coupled with Bayesian estimation was considered by Comment and others (2019). The authors assumed Inline graphic leading to identifiability of the causal effects. That Inline graphic implies the same dependence between AD and death times in both worlds. However, it would generally not hold unless all effect modifiers can be measured and correctly accounted for. We extend this approach by incorporating a sensitivity parameter Inline graphic that encapsulates the cross-world, and unidentifiable, dependence.
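For readers without access to the displayed equation, the three transition hazards of an illness-death model are commonly defined along the following lines, in the spirit of Xu and others (2010). This is a hedged sketch in assumed notation, with $T_1$ and $T_2$ denoting the illness and death times; the superscript for the APOE level and any conditioning on the frailty are suppressed, so the published display may differ.

```latex
% Assumed notation; APOE level and frailty conditioning are suppressed.
\begin{align*}
\lambda_{01}(t) &= \lim_{\Delta \to 0} \Delta^{-1}
  \Pr\{T_1 \in [t, t+\Delta) \mid T_1 \ge t,\; T_2 \ge t\}, \\
\lambda_{02}(t) &= \lim_{\Delta \to 0} \Delta^{-1}
  \Pr\{T_2 \in [t, t+\Delta) \mid T_1 \ge t,\; T_2 \ge t\}, \\
\lambda_{12}(t \mid t_1) &= \lim_{\Delta \to 0} \Delta^{-1}
  \Pr\{T_2 \in [t, t+\Delta) \mid T_1 = t_1,\; T_2 \ge t\}, \qquad t > t_1 .
\end{align*}
```

The first two hazards describe leaving the healthy state, to illness or directly to death, and the third describes death after illness onset at time $t_1$.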

Assumption 6

There exists a bivariate random variable Inline graphic such that

  • (i) Inline graphic for Inline graphic.
  • (ii) Given Inline graphic, the joint distribution of the potential event times can be factored as follows
    graphic file with name Equation24.gif (3.13)
    where Inline graphic denotes a density function of a possibly multivariate random variable.
  • (iii) The frailty variable operates multiplicatively on the hazard function Inline graphic for Inline graphic and Inline graphic for Inline graphic, for some Inline graphic functions.
  • (iv) The probability density function of Inline graphic, Inline graphic, is known up to a finite dimensional parameter Inline graphic that is identifiable from the observed data distribution.

Part (i) of Assumption 6 adapts Assumption 2 for the frailty. Part (ii) implies a cross-world independence conditionally on the unobserved Inline graphic. That is, the inclusion of the frailty represents a model for the collection of unobserved covariates governing the cross-world dependence. This model induces marginal (with respect to Inline graphic) complex dependence and conditional independence between Inline graphic and Inline graphic. Parts (iii) and (iv) guarantee identification of the observed data distribution. As with any parametric model, one cannot expect the model to hold exactly. Nevertheless, it is well known that misspecification of the frailty distribution usually leads to only small bias in the estimated cumulative incidence function (of the observed data distribution), integrated over the frailty distribution; see Gorfine and others (2021) and references therein.
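To make the multiplicative frailty structure of Assumption 6(iii) concrete, the following is a minimal simulation sketch of a single-world illness-death model. It assumes constant (exponential) transition hazards and a Gamma frailty shared by all three transitions; the function name `simulate_illness_death` and the rate arguments `lam01`, `lam02`, `lam12` are hypothetical and are not taken from the article or its R package.

```python
import random

def simulate_illness_death(n, lam01, lam02, lam12, theta, seed=0):
    """Simulate n pairs (T1, T2) from an illness-death model.

    Assumptions of this sketch (not the article's DGM): constant
    transition hazards, and one Gamma frailty with mean 1 and
    variance theta multiplying every hazard.
    T1 is None when death occurs before the disease.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        # Gamma(shape=1/theta, scale=theta) has mean 1 and variance theta.
        g = rng.gammavariate(1.0 / theta, theta) if theta > 0 else 1.0
        t01 = rng.expovariate(g * lam01)  # healthy -> disease
        t02 = rng.expovariate(g * lam02)  # healthy -> death
        if t01 < t02:
            t12 = rng.expovariate(g * lam12)  # time from disease to death
            pairs.append((t01, t01 + t12))
        else:
            pairs.append((None, t02))
    return pairs
```

Because the frailty multiplies both hazards out of the healthy state, the probability of disease before death in this sketch equals `lam01 / (lam01 + lam02)` whatever the frailty value; the frailty affects the timing and the dependence between the two event times.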

While other specifications are possible, we assume that Inline graphic are correlated Gamma variables, with mean one, variances Inline graphic and Inline graphic, and correlation Inline graphic. The frailty variances Inline graphic and Inline graphic are identifiable from the data, while Inline graphic is unidentifiable and is supplied by the researcher. Taking Inline graphic is analogous to the approach of Comment and others (2019), and results in point identification. The following proposition provides identification as a function of Inline graphic, thus allowing a more flexible consideration of the unidentifiable cross-world dependence.
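One way to picture correlated Gamma frailties with mean one and a researcher-supplied correlation is the additive shared-component construction sketched below. This is only an illustration under the simplifying assumption of equal variances; the construction actually used by the authors is described in the Supplementary material and may differ. The function name `correlated_gamma_pair` is hypothetical.

```python
import random

def correlated_gamma_pair(theta, rho, rng):
    """Draw two Gamma frailties with mean 1, variance theta, correlation rho.

    Sketch assumptions: equal variances and 0 <= rho <= 1. Each frailty
    is the sum of a shared Gamma component and its own independent Gamma
    component, all with scale theta, so the sum is again Gamma.
    """
    shape_shared = rho / theta
    shape_own = (1.0 - rho) / theta

    def draw(shape):
        # A Gamma variable with shape 0 is degenerate at 0.
        return rng.gammavariate(shape, theta) if shape > 0 else 0.0

    shared = draw(shape_shared)
    return shared + draw(shape_own), shared + draw(shape_own)
```

With `rho = 1` the two frailties coincide, which mirrors the common-frailty case; with `rho = 0` they are independent.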

Proposition 4

Under Assumptions 1 and 6, the causal effects (2.1)–(2.3) are identified by


Proposition 4

where for any event Inline graphic, Inline graphic, Inline graphic.

The proof is given in Section A.5 of the Supplementary material available at Biostatistics online. RMST-like estimands (2.4)–(2.7) are identified by suitable integration over Inline graphic. Figure 3 illustrates the RMST effects (2.4), (2.5), and (2.7) under the DGM described in Section 4 and Section D.1 of the Supplementary material available at Biostatistics online, with equal variances Inline graphic and as a function of Inline graphic. Stratum proportions varied between 41–45% for the Inline graphic stratum and 8–12% for the Inline graphic stratum. The causal RMST effects, calculated at Inline graphic, were sensitive to Inline graphic when the stratum proportion was very small and the frailty variance was large and were otherwise quite robust to Inline graphic.
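The integration step behind RMST-type quantities can be illustrated generically: a restricted mean is the area under a survival curve up to a chosen horizon. The sketch below uses a plain trapezoid rule on an arbitrary survival function; the helper name `rmst` and the grid size are hypothetical choices, and the stratum-specific survival functions of Proposition 4 would have to be supplied by the user.

```python
import math

def rmst(survival, t_max, n_grid=10000):
    """Trapezoid-rule approximation of the integral of survival(u) on [0, t_max]."""
    h = t_max / n_grid
    total = 0.5 * (survival(0.0) + survival(t_max))
    for k in range(1, n_grid):
        total += survival(k * h)
    return total * h

def exponential_survival(rate):
    """Survival function of an exponential time, used only as a check case."""
    return lambda u: math.exp(-rate * u)
```

For an exponential time with rate 0.1 and horizon 15, the closed form (1 − exp(−1.5)) / 0.1 provides a convenient check of the numerical integral.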

Fig. 3.

Causal restricted mean and median survival time effects at Inline graphic as a function of the frailty variance Inline graphic (corresponding to Kendall’s Inline graphic) and of the correlation between the frailty variables Inline graphic.

3.3. Right censoring

Time-to-event outcome data are often not fully observed for all individuals due to right censoring. Two issues arise when considering right censoring. The first is identification of the causal estimand, considered in this section, and the second is how to estimate the different quantities and provide inference, which is embedded in the nonparametric and semiparametric methods presented in Section 4. Let Inline graphic be the right-censoring time under Inline graphic. Assumption 2 is adjusted for the case of censoring by assuming Inline graphic for Inline graphic. We also assume within-world independent censoring, namely that Inline graphic for Inline graphic. In our study, this assumption asserts that, under each APOE value, the two event times are independent of the censoring time. We also assume that Inline graphic for Inline graphic with Inline graphic being the end of follow-up time. Combining the weak ignorability and within-world independent-censoring assumptions, and by Assumption 1, we get the standard independent censoring assumption for SCR data Inline graphic, with Inline graphic being the censoring time under Inline graphic.

3.4. Covariates inclusion

Often the ignorability and independent censoring assumptions are more plausible when made conditionally on covariates Inline graphic. Assumption 6 is also more plausible when made conditionally on covariates. For simplicity, we use the same vector Inline graphic for all assumptions to hold. In Section A.6 of the Supplementary material available at Biostatistics online, we adapt Propositions 1 and 4 to include covariates. An additional reason to include covariates is to improve estimators’ efficiency. In our study (Section 6), we use gender and race as covariates. Note that the role of Inline graphic is different from the role of the covariate Inline graphic we utilized in Section 3.1.

While a variety of statistical models have been used for SCR data, how to elucidate information on causal effects is typically overlooked and, in practice, focus is often on specific regression coefficients. Therefore, our work enables translation of existing model results into knowledge on causal effects. With this in mind, we focus on frequentist estimation of causal effects (2.4)–(2.7) under semiparametric frailty models that align with Assumption 6.

4. Estimation and inference

We now turn to estimation and inference from right-censored SCR data. We first propose nonparametric estimators, before considering semiparametric models. For the nonparametric estimators, the quantities of interest are the building blocks of the bounds, namely Inline graphic, Inline graphic, Inline graphic, Inline graphic, Inline graphic, Inline graphic, and Inline graphic (for Inline graphic). For the semiparametric estimation, we focus on a frailty-based proportional-hazard model that allows estimation of causal effects utilizing Proposition 4.

For each participant Inline graphic denote by Inline graphic a covariate that is used for obtaining sharper bounds (when available), by Inline graphic a vector of baseline covariates (when available), by Inline graphic, the APOE value, and by Inline graphic the AD (Inline graphic) and death (Inline graphic) times. Let also Inline graphic be the censoring time. The data for each unit are Inline graphic where Inline graphic and Inline graphic are the observed times, and Inline graphic and Inline graphic are the indicators of AD and death, respectively. Finally, Inline graphic are the at-risk processes.
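The mapping from latent event and censoring times to the observed SCR data can be sketched as follows. The exact notation of the article is not recoverable here, so this follows the conventional semi-competing risks definition: the nonterminal time is observed only if it precedes both death and censoring. The function name `observe` is hypothetical, and `math.inf` is used to encode "disease never occurs before death".

```python
import math

def observe(t1, t2, c):
    """Return (obs1, delta1, obs2, delta2) for one unit.

    t1: disease time (math.inf if the disease never precedes death)
    t2: death time
    c:  right-censoring time
    """
    obs1 = min(t1, t2, c)
    delta1 = 1 if t1 <= min(t2, c) else 0  # disease observed
    obs2 = min(t2, c)
    delta2 = 1 if t2 <= c else 0           # death observed
    return obs1, delta1, obs2, delta2
```

The four possible indicator patterns correspond to the four groups reported for the ACT data in Section 6: censored before any event, diagnosed and then censored, died without a diagnosis, and diagnosed and then died.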

4.1. Nonparametric estimation

We suggest estimating Inline graphic by the Kaplan–Meier (KM) estimator. Regarding Inline graphic,

graphic file with name Equation26.gif (4.14)

because for Inline graphic, Inline graphic, that is, among those who die at time Inline graphic, no new disease cases are possible later than Inline graphic. The function Inline graphic is well-defined, and thus can be estimated nonparametrically.
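For readers who want to see the KM building block explicitly, here is a minimal, dependency-free sketch of the product-limit estimator. It is a generic textbook implementation and not the code used in the article (which relies on R); the function name `kaplan_meier` is hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one event.

    times:  observed (possibly censored) times
    events: 1 if the event was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Group all observations tied at time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events > 0:
            surv *= 1.0 - n_events / n_at_risk
            curve.append((t, surv))
        n_at_risk -= n_removed
    return curve
```

For example, with times 1, 2, 3, 4 and the second observation censored, the estimate drops to 3/4 at time 1, to 3/8 at time 3, and to 0 at time 4.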

We propose to estimate Inline graphic by a kernel-smoothed KM estimator (Beran, 1981) within the Inline graphic group. Let Inline graphic be a kernel function (Inline graphic, Inline graphic), and let Inline graphic be a sequence of bandwidths. In practice, one might consider a separate bandwidth for each level of Inline graphic. For simplicity of presentation, we assume Inline graphic is the same for both Inline graphic. Let Inline graphic, and Inline graphic. The kernel-smoothed KM estimator for Inline graphic is then defined by

graphic file with name Equation27.gif (4.15)

We propose to estimate Inline graphic in similar fashion utilizing that

graphic file with name Equation28.gif (4.16)

In Section C.1 of the Supplementary material available at Biostatistics online, we give similar explicit formulas for Inline graphic, and Inline graphic. Covariate-adjusted bounds (3.12) are estimated at each level of Inline graphic and averaged by Inline graphic. A sketch of the proof for the asymptotic properties of the proposed estimators using existing results for the KM estimator and kernel-smoothed KM estimator is provided in Section C.2 of the Supplementary material available at Biostatistics online; see also Akritas and Van Keilegom (2003).
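The idea of a kernel-smoothed KM estimator can be sketched in its standard Beran (1981) form, in which each unit contributes to the product-limit estimate with a kernel weight that depends on how close its covariate value is to the target value. Since display (4.15) is not reproduced here, the code below should be read as the generic estimator and not as a transcription of the article's formula. It assumes a Gaussian kernel and no tied observation times; the name `beran_survival` is hypothetical.

```python
import math

def beran_survival(t, x0, times, events, covs, bandwidth):
    """Kernel-weighted product-limit estimate of P(T > t | X = x0).

    Sketch assumptions: Gaussian kernel, no tied observation times.
    """
    weights = [math.exp(-0.5 * ((x - x0) / bandwidth) ** 2) for x in covs]
    order = sorted(range(len(times)), key=lambda i: times[i])
    at_risk_weight = sum(weights)  # total kernel weight still at risk
    surv = 1.0
    for i in order:
        if times[i] > t:
            break
        if events[i] == 1 and at_risk_weight > 0:
            # Clamp at zero to guard against floating-point round-off.
            surv *= max(0.0, 1.0 - weights[i] / at_risk_weight)
        at_risk_weight -= weights[i]
    return surv
```

When all units share the same covariate value the weights are equal and the estimate reduces to the ordinary KM estimator, which gives a simple sanity check.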

4.2. Semiparametric frailty models

Under Assumption 6, we propose to use the following proportional hazard specification

graphic file with name Equation29.gif (4.17)

where Inline graphic are unspecified baseline hazard functions. Define also the cumulative baseline hazard functions, Inline graphic, and let Inline graphic, and Inline graphic for Inline graphic. Finally, let Inline graphic be the collection of parameters to be estimated from the data. For prospectively collected data with all participants being AD-free at study starting time, the likelihood function is proportional to

graphic file with name Equation30.gif (4.18)

where Inline graphic, Inline graphic is the Inline graphicth derivative of Inline graphic with respect to Inline graphic, Inline graphic, and Inline graphic The likelihood derivation follows from SM3 of Gorfine and others (2021). In Section C.3 of the Supplementary material available at Biostatistics online, we present an EM algorithm for maximizing (4.18). The estimation phase ignores the unidentifiable component Inline graphic. Assumptions on the joint distribution of Inline graphic are used to map the estimator Inline graphic into estimates of causal effects. Upon estimating Inline graphic, the quantities in the identifying formulas from Proposition 4 can be calculated by numerical integration or Monte Carlo simulations, as we do in Section 5.
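The Monte Carlo step mentioned above, averaging a quantity over draws from the estimated frailty distribution, can be illustrated on the simplest possible case. For a Gamma frailty with mean one and variance theta, the frailty-averaged survival at cumulative hazard H has the closed form (1 + theta·H)^(−1/theta), the Laplace transform of the Gamma distribution, so the simulation can be checked against it. The function names below are hypothetical, and treating a single cumulative hazard value as the input is a simplification of the identifying formulas in Proposition 4.

```python
import math
import random

def gamma_laplace(s, theta):
    """Laplace transform E[exp(-s * G)] of a Gamma frailty G (mean 1, variance theta)."""
    return (1.0 + theta * s) ** (-1.0 / theta)

def marginal_survival_mc(cum_hazard, theta, n_draws=100000, seed=0):
    """Monte Carlo estimate of E[exp(-G * cum_hazard)] over Gamma frailty draws."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        g = rng.gammavariate(1.0 / theta, theta)
        total += math.exp(-g * cum_hazard)
    return total / n_draws
```

In the causal setting the same averaging would be carried out over draws of the bivariate frailty for a given value of the sensitivity parameter, where no closed form is generally available.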

5. Simulations

To assess the finite-sample properties of the proposed estimators, we conducted simulations under various scenarios with 1,000 simulation iterations per scenario. Sample size was Inline graphic and different censoring rates were considered. The DGM initially followed (4.17) with Weibull baseline hazards, two covariates, and Gamma frailty as described in Section 3.2 with Inline graphic for simplicity. Treatment was randomized with Inline graphic and observed event times (pre-censoring) were created by Assumption 1. Details on the different DGMs, analyses, and comprehensive results are presented in Section D of the Supplementary material available at Biostatistics online and are described briefly here.

For the nonparametric estimation, we considered the scenarios previously used in Figure 2. Assumption 3 was imposed by re-simulating those initially at the Inline graphic stratum. For the smoothed KM estimator, default choices of the R package Inline graphic were used (Gerds, 2019). Standard errors were estimated by bootstrap (sampling at the unit level) with 100 repetitions. For all parameters considered, bias was negligible, standard errors were well estimated and 95% Wald-type confidence intervals had satisfactory empirical coverage rate.
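The unit-level bootstrap and Wald-type interval described above follow a generic recipe, sketched below for an arbitrary statistic. The helper names `bootstrap_se` and `wald_ci` are hypothetical, and the default of 100 replicates simply mirrors the number used in the simulations.

```python
import random
import statistics

def bootstrap_se(units, statistic, n_boot=100, seed=0):
    """Bootstrap standard error of statistic(units), resampling whole units."""
    rng = random.Random(seed)
    n = len(units)
    replicates = []
    for _ in range(n_boot):
        # Resample n units with replacement; each unit keeps all its data.
        sample = [units[rng.randrange(n)] for _ in range(n)]
        replicates.append(statistic(sample))
    return statistics.stdev(replicates)

def wald_ci(estimate, se, z=1.959964):
    """Two-sided 95% Wald-type confidence interval (z is the normal quantile)."""
    return estimate - z * se, estimate + z * se
```

In the SCR setting each unit would be the full record of a participant (observed times, event indicators, treatment, covariates), so that the dependence between the two event times is preserved within every bootstrap sample.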

For the semiparametric estimation, data were generated under model (4.17) and were not restricted to follow Assumption 3. Gamma frailties were simulated with Inline graphic and equal variances Inline graphic, corresponding to Kendall’s Inline graphic between Inline graphic and Inline graphic. For each simulated dataset, we fitted two illness-death frailty models for Inline graphic using the EM algorithm (Section C.3 of the Supplementary material available at Biostatistics online). Frailty variances were estimated separately and then combined (Section D.2.2 of the Supplementary material available at Biostatistics online). RMST causal effects were estimated by a Monte Carlo integration using the estimated baseline hazards Inline graphic, coefficients Inline graphic and Inline graphic (Section D.2 of the Supplementary material available at Biostatistics online). Standard errors were estimated using bootstrap with 100 repetitions. Generally, the empirical bias of the estimated mean and median RMST effects was negligible, standard errors were well estimated, and coverage probabilities were satisfactory. The regression coefficients, baseline hazard functions, and frailty variance were all also well estimated (Section D.3 of the Supplementary material available at Biostatistics online).
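The correspondence between a Gamma frailty variance and Kendall's tau can be made explicit. The sketch below uses the standard relation for a shared Gamma frailty, tau = theta / (theta + 2); that the article uses exactly this mapping is an assumption, and the function names are hypothetical.

```python
def kendall_tau_from_gamma_variance(theta):
    """Kendall's tau implied by a shared Gamma frailty with variance theta."""
    return theta / (theta + 2.0)

def gamma_variance_from_kendall_tau(tau):
    """Inverse mapping: the Gamma frailty variance that yields a given tau (< 1)."""
    return 2.0 * tau / (1.0 - tau)
```

Under this relation a frailty variance of zero gives independence (tau equal to 0), and tau increases toward 1 as the variance grows.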

6. Data analysis

The Adult Changes in Thought (ACT) Study is a prospective cohort study focused on dementia in the elderly. Starting from 1994, AD-free participants of age 65 and older from the Seattle metropolitan area have been recruited (Kukull and others, 2002). Our analyses include the binary variables gender and race (white/nonwhite). Other available covariates (e.g., years of education) were not used as they are determined post-intervention. Out of 4453 participants, 1783 (40%) were censored prior to any event, 211 (5%) were diagnosed with AD and then censored, 1635 (37%) died without an AD diagnosis, and 824 (19%) were diagnosed with AD and died during follow-up. APOE was present for 1073 (24%) study participants. Because APOE is inherited and independent of other inherited traits, the ignorability assumptions seem plausible in this study.

Figure 4 presents the estimated bounds for causal effects (2.1)–(2.3) under Assumption 3 (pink shade), utilizing gender as an additional covariate (blue), and under the additional Assumption 4 (green, for (2.3) only). Assumptions 3 and 4 seem reasonably plausible in this application, because APOE is a well-known strong predictor of AD. The Inline graphic proportion Inline graphic was estimated to be 35% (95% CI 32%–37%). The bounds Inline graphic (bottom green and top blue) demonstrate that APOE induces earlier AD onset within the Inline graphic stratum. The bounds do not allow for definite conclusions regarding the effect of APOE on death, and the inclusion of the variable gender had only a minor effect. Focusing on the Inline graphic stratum, its proportion was estimated to be 53% (95% CI 49%–58%), meaning the two strata of interest together comprise the vast majority of the population. Whether APOE had a positive or negative effect on survival in the Inline graphic is unclear, although the lower bound was larger than zero for early ages. The Inline graphic stratum proportion was estimated to be 12% (95% CI 7%–17%), meaning that had the entire population had APOE, we would expect a 12% increase in the proportion of people with an AD diagnosis sometime in their lifetime, compared to the scenario in which none of the population had APOE.

Fig. 4.

Bounds for principal effects of APOE on AD and death based on the ACT data. In pink shade the simple bounds, in blue bounds adjusted for gender, and in green bounds based on Assumption 4.

In the semiparametric analysis under Assumption 6, two illness-death models (4.17) were fitted using the EM algorithm, and included gender and race as covariates Inline graphic (SM Table E.15 presents the estimated coefficients). Because participants in the ACT were recruited at varying ages, data were left truncated. We followed the approximation of Nielsen and others (1992), by replacing all risk indicators Inline graphic with Inline graphic, where Inline graphic is the age at recruitment. The estimated frailty variance was Inline graphic (SE: Inline graphic), corresponding to Kendall’s Inline graphic of 0.02. Turning to causal effects (Table 2), significant long-term negative effects of APOE on AD onset within the Inline graphic stratum were estimated; those with APOE are expected to receive an AD diagnosis approximately 2 years earlier. Additionally, death at an earlier age was expected in both strata under APOE. For Inline graphic (age 80), effects were zero for the Inline graphic stratum and negative though small for the Inline graphic. For Inline graphic (ages 90 and 100), effects were negative in both strata, but stronger for the Inline graphic. The estimated effects did not change substantially as a function of Inline graphic.
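The left-truncation adjustment described above amounts to counting a participant as at risk only from the age at recruitment onward. A minimal sketch of such delayed-entry risk indicators follows; the exact form of the indicators in the article is not recoverable here, so the inequalities used (entry age ≤ t ≤ observed time) are an assumption, as are the function names.

```python
def at_risk(t, entry_age, observed_time):
    """Delayed-entry risk indicator: 1 if the unit is under observation at age t."""
    return 1 if entry_age <= t <= observed_time else 0

def risk_set_size(t, entry_ages, observed_times):
    """Number of units at risk at age t under left truncation."""
    return sum(at_risk(t, v, x) for v, x in zip(entry_ages, observed_times))
```

Without the entry-age condition, participants recruited at older ages would be counted as at risk at ages before they joined the study, which would inflate the risk sets at younger ages.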

Table 2.

Estimated RMST causal effects of APOE on AD and death times within the Inline graphic and Inline graphic strata. Effects presented are mean (ATE) and median (MTE) differences. Results presented for Inline graphic. The empirical distribution of gender and race in the data was used for calculating RMST effects

| Stratum | Inline graphic | AD: Inline graphic (95% CI) | AD: Inline graphic (95% CI) | Death: Inline graphic (95% CI) | Death: Inline graphic (95% CI) |
|---|---|---|---|---|---|
| Inline graphic | Inline graphic | −0.04 (−0.12 to 0.04) | 0.00 (−0.07 to 0.07) | −0.61 (−0.7 to −0.52) | −1.00 (−1.09 to −0.91) |
| | Inline graphic | −1.28 (−1.35 to −1.20) | −2.00 (−2.07 to −1.92) | −2.52 (−2.61 to −2.43) | −3.00 (−3.08 to −2.92) |
| | Inline graphic | −1.82 (−1.89 to −1.74) | −2.01 (−2.09 to −1.92) | −2.42 (−2.5 to −2.35) | −3.00 (−3.07 to −2.93) |
| Inline graphic | Inline graphic | | | −0.13 (−0.22 to −0.05) | 0.00 (−0.08 to 0.08) |
| | Inline graphic | | | −0.56 (−0.65 to −0.48) | −1.00 (−1.08 to −0.92) |
| | Inline graphic | | | −1.75 (−1.84 to −1.67) | −1.99 (−2.08 to −1.90) |

To summarize the findings, APOE was found to expedite AD development within the Inline graphic stratum. APOE was shown to have a harmful effect on long-term age at death in both strata only under Assumption 6 and the semiparametric modeling.

7. Discussion

It has been increasingly acknowledged that considering dual outcomes may provide unique information about the scientific problem at hand, especially in the presence of death as a competing event. While the SCR framework allows for consideration of dual outcomes, questions of causality under this framework have only recently begun to be studied. To study causal effects of APOE on AD and death, we proposed new estimands built upon a population stratification approach.

Population stratification may suffer from disadvantages similar to those of principal stratification: one cannot know at baseline the stratum a unit belongs to, and the Inline graphic and Inline graphic strata may be small or irregular subsets of the population. When the Inline graphic and/or the Inline graphic strata comprise a small part of the population, the TV-SACE and TV-RMST can be more informative. The TV-SACE and RM-SACE are also attractive when heterogeneity in treatment effect across time is of interest. When the goal is to assess the extent to which an intervention effect is mediated through a disease, a mediation approach should be preferred. Finally, when interest lies in the disease event only, causal estimands from the competing events literature might be more appealing.

We presented two overarching strategies towards studying the proposed causal estimands. The large-sample bounds we developed are built upon a set of assumptions a researcher is willing to make explicitly on the relation between potential outcomes under two different interventions. The frailty-based approach models cross-world dependence via semiparametric assumptions and carries out a sensitivity analysis as a function of the sensitivity parameter. One disadvantage of the latter is that the exact restrictions imposed on the potential outcomes beyond the conditional independence are implicit. Our sensitivity analysis strategy resembles the approach of Xu and others (2020) in that a correlation parameter for the cross-world dependence is introduced together with a distribution assumption. We used a frailty approach, while Xu and others (2020) assumed a Gaussian copula model for the joint distribution function of Inline graphic conditionally on Inline graphic. Xu and others (2020) then considered a flexible Bayesian regression model.

Supplementary Material

kxab049_Supplementary_Data

Acknowledgments

The authors thank two anonymous reviewers and the Associate Editor for helpful comments that improved the article.

Conflict of Interest: None declared.

Contributor Information

Daniel Nevo, Department of Statistics and Operations Research, Tel Aviv University, Tel Aviv, Israel.

Malka Gorfine, Department of Statistics and Operations Research, Tel Aviv University, Tel Aviv, Israel.

8. Software

Our R package Inline graphic (Version 1.0.0) implements the proposed methodology and is available from https://github.com/daniel258/CausalSemiComp.

Supplementary material

Supplementary material is available at http://biostatistics.oxfordjournals.org.

Funding

The ACT study was funded by the National Institute on Aging (U01 AG006781). The authors gratefully acknowledge the Adult Changes in Thought Study team for sharing the data for the motivating example. The authors gratefully acknowledge support from the Israel Science Foundation (DN: ISF 827/21; MG: ISF 1067/17) and the U.S.-Israel Binational Science Foundation (MG: BSF 2016126).

References

  1. Akritas, M. G. and Van Keilegom, I. (2003). Estimation of bivariate and marginal distributions with censored data. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 65, 457–471.
  2. Alzheimer’s Association. (2019). 2019 Alzheimer’s disease facts and figures. Alzheimer’s & Dementia 15, 321–387.
  3. Beran, R. (1981). Nonparametric regression with randomly censored survival data. Technical Report, Department of Statistics, University of California, Berkeley.
  4. Comment, L., Mealli, F., Haneuse, S. and Zigler, C. (2019). Survivor average causal effects for continuous time: a principal stratification approach to causal inference with semi-competing risks. arXiv preprint arXiv:1902.09304.
  5. Corder, E., Saunders, A., Strittmatter, W., Schmechel, D., Gaskell, P., Small, G., Roses, A., Haines, J. and Pericak-Vance, M. (1993). Gene dose of Apolipoprotein E type 4 allele and the risk of Alzheimer’s disease in late onset families. Science 261, 921–923.
  6. Dal Forno, G., Carson, K. A., Brookmeyer, R., Troncoso, J., Kawas, C. H. and Brandt, J. (2002). APOE genotype and survival in men and women with Alzheimer’s disease. Neurology 58, 1045–1050.
  7. Ding, P. and Lu, J. (2017). Principal stratification analysis using principal scores. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 79, 757–777.
  8. Frangakis, C. E. and Rubin, D. B. (2002). Principal stratification in causal inference. Biometrics 58, 21–29.
  9. Gerds, T. A. (2019). prodlim: Product-Limit Estimation for Censored Event History Analysis. R package version 2019.11.13.
  10. Gorfine, M., Keret, N., Ben Arie, A., Zucker, D. and Hsu, L. (2021). Marginalized frailty-based illness-death model: application to the UK-Biobank survival data. Journal of the American Statistical Association 116, 1155–1167. DOI 10.1080/01621459.2020.1831922.
  11. Kukull, W. A., Higdon, R., Bowen, J. D., McCormick, W. C., Teri, L., Schellenberg, G. D., van Belle, G., Jolley, L. and Larson, E. B. (2002). Dementia and Alzheimer disease incidence: a prospective cohort study. Archives of Neurology 59, 1737–1746.
  12. Lee, K-H, Rondeau, V. and Haneuse, S. (2017). Accelerated failure time models for semi-competing risks data in the presence of complex censoring. Biometrics 73, 1401–1412.
  13. Lin, S-H, Young, J. G., Logan, R. and VanderWeele, T. J. (2017). Mediation analysis for a survival outcome with time-varying exposures, mediators, and confounders. Statistics in Medicine 36, 4153–4166.
  14. Long, D. M. and Hudgens, M. G. (2013). Sharpening bounds on principal effects with covariates. Biometrics 69, 812–819.
  15. Nevo, D., Blacker, D., Larson, E. B. and Haneuse, S. (2021a). Modeling semi-competing risks data as a longitudinal bivariate process. Biometrics, 1–15.
  16. Nevo, D., Ogino, S. and Wang, M. (2021b). Reflection on modern methods: causal inference considerations for heterogeneous disease etiology. International Journal of Epidemiology 50, 1030–1037.
  17. Nielsen, G. G., Gill, R. D., Andersen, P. K. and Sørensen, T. I. A. (1992). A counting process approach to maximum likelihood estimation in frailty models. Scandinavian Journal of Statistics 19, 25–43.
  18. Robins, J. M. (1995). An analytic method for randomized trials with informative censoring: part 1. Lifetime Data Analysis 1, 241–254.
  19. Safieh, M., Korczyn, A. D. and Michaelson, D. M. (2019). ApoE4: an emerging therapeutic target for Alzheimer’s disease. BMC Medicine 17, 1–17.
  20. Shepherd, B. E., Gilbert, P. B. and Lumley, T. (2007). Sensitivity analyses comparing time-to-event outcomes existing only in a subset selected postrandomization. Journal of the American Statistical Association 102, 573–582.
  21. Stensrud, M. J., Young, J. G., Didelez, V., Robins, J. M. and Hernán, M. A. (2020). Separable effects for causal inference in the presence of competing events. Journal of the American Statistical Association, 1–9.
  22. Tchetgen Tchetgen, E. J. (2014). Identification and estimation of survivor average causal effects. Statistics in Medicine 33, 3601–3628.
  23. Tom, S. E., Hubbard, R. A., Crane, P. K., Haneuse, S. J., Bowen, J., McCormick, W. C., McCurry, S. and Larson, E. B. (2015). Characterization of dementia and Alzheimer’s disease in an older population: updated incidence and life expectancy with and without dementia. American Journal of Public Health 105, 408–413.
  24. VanderWeele, T. J. (2011). Principal stratification–uses and limitations. The International Journal of Biostatistics 7.
  25. Varadhan, R., Xue, Q-L. and Bandeen-Roche, K. (2014). Semi-competing risks in aging research: methods, issues and needs. Lifetime Data Analysis 20, 538–562.
  26. Xu, J., Kalbfleisch, J. D. and Tai, B. (2010). Statistical analysis of illness–death processes and semi-competing risks data. Biometrics 66, 716–725.
  27. Xu, Y., Scharfstein, D., Müller, P. and Daniels, M. (2020). A Bayesian nonparametric approach for evaluating the causal effect of treatment in randomized trials with semi-competing risks. Biostatistics, kxaa008. DOI 10.1093/biostatistics/kxaa008.
  28. Young, J. G., Stensrud, M. J., Tchetgen Tchetgen, E. J. and Hernán, M. A. (2020). A causal framework for classical statistical estimands in failure-time settings with competing events. Statistics in Medicine 39, 1199–1236.
  29. Zehavi, T. and Nevo, D. (2021). A matching framework for truncation by death problems. arXiv preprint arXiv:2110.10186.
  30. Zhang, J. L. and Rubin, D. B. (2003). Estimation of causal effects via principal stratification when some outcomes are truncated by “death”. Journal of Educational and Behavioral Statistics 28, 353–368.

Articles from Biostatistics (Oxford, England) are provided here courtesy of Oxford University Press
