Author manuscript; available in PMC: 2015 Dec 1.
Published in final edited form as: Epidemiol Methods. 2014 Mar 11;3(1):1–19. doi: 10.1515/em-2012-0001

Identification, estimation and approximation of risk under interventions that depend on the natural value of treatment using observational data

Jessica G Young, Miguel A Hernán, James M Robins
PMCID: PMC4387917  NIHMSID: NIHMS674386  PMID: 25866704

1 Introduction

Robins et al. (2004) introduced the extended g-formula to estimate from observational data the risk of failure under hypothetical interventions wherein a subject’s treatment at time k is assigned based on the natural value of treatment at k; that is, the value of treatment that would have been observed at k were the intervention discontinued right before k. Several authors (Robins et al. 2004; Taubman et al. 2009; Lajous et al. 2013; Danaei et al. 2013; García-Aymerich et al. 2013) have parametrically applied this approach to estimate the risk of failure in observational studies under hypothetical time-varying interventions of the following form: “If a subject’s natural value of treatment at k is below a particular threshold (or above in the case of a harmful exposure) then set treatment to this threshold value. Otherwise, do not intervene on this subject at k.”

Taubman et al. (2008) referred to this special case of an intervention that depends on the natural value of treatment as a threshold intervention. For example, Taubman et al. (2009) used the parametric extended g-formula to estimate the 20-year risk of coronary heart disease (CHD) in the Nurses’ Health Study (NHS) under the following hypothetical threshold intervention on daily minutes of exercise on all days of follow-up: “If a subject’s natural value of exercise by the end of day k is less than 30 minutes, set her exercise on day k to exactly 30 minutes. Otherwise, do not intervene on this subject on day k.” Threshold interventions guarantee that a continuous treatment is maintained within a pre-specified range (e.g. at least 30 minutes per day) throughout the follow-up while minimizing the number of subjects requiring intervention at each time.

Nonparametrically, the extended g-formula differs from the (non-extended) g-formula of Robins (1986) in that it includes (i) a specific user-supplied intervention density that depends on the natural value of treatment at each k and (ii) the density of natural treatment itself at each k conditional on past measured confounders (Robins et al. 2004). Richardson and Robins (2013) recently defined a condition such that the extended g-formula nonparametrically identifies risk under an intervention that depends on the natural value of treatment associated with the user-supplied intervention density in (i), provided this expression is well-defined. In this paper, we complement this result by showing the algebraic equivalence between the extended g-formula associated with a user-supplied intervention density (i) and the (non-extended) g-formula associated with a particular random dynamic regime that does not depend on the natural value of treatment and may, at most, depend on the measured confounders.

Provided the identifying condition of Richardson and Robins (2013) holds, this algebraic equivalence gives

  1. a sufficient positivity condition such that the extended g-formula is well-defined and thus nonparametrically identifies risk under an intervention that depends on the natural value of treatment in an observational study and

  2. semi-parametric alternatives to the parametric extended g-formula for estimation.

Given this equivalence, these results follow immediately from previous work on identification and estimation of the effects of random dynamic regimes that do not depend on the natural value of treatment and may, at most, depend on the measured confounders. For example, see Robins (1986, 1997); Pearl (2000); Murphy et al. (2001); van der Laan et al. (2005); Hernán et al. (2006); Tian (2008); Dawid and Didelez (2008); Robins and Hernán (2009); Orellana et al. (2010a,b); Cain et al. (2010); Stitelman et al. (2010); Dawid and Didelez (2010); Young et al. (2011); Picciotto et al. (2012); Díaz Muñoz and van der Laan (2012).

Finally, there has been no consideration of the limits on physical implementation of interventions that depend on the natural value of treatment. For example, once we observe that a subject has exercised 20 minutes by the end of day k we cannot subsequently intervene and make her exercise any more (or any fewer) minutes by the end of that day. Therefore, given a hypothetical intervention that depends on the natural value of treatment, we define a plausible (implementable) approximation to this intervention. We also provide an untestable assumption that, when satisfied, would give exact equivalence.

The structure of the paper is as follows. In §2, we define the observational data structure of interest and give a classification of hypothetical interventions that do not depend on the natural value of treatment and may, at most, depend on the measured confounders, including random dynamic regimes. In §3, we review a set of conditions that nonparametrically identifies risk by the end of follow-up in the observational study under any hypothetical intervention within this classification by the (non-extended) g-formula. In §4, we show the algebraic equivalence between the extended g-formula associated with an intervention that depends on the natural value of treatment and the (non-extended) g-formula associated with a particular random dynamic regime. In §5, we review the parametric extended g-formula estimator and give a semi-parametric alternative that follows immediately from the results of §4 given previous semi-parametric results in the context of random dynamic regimes. In §6, we define a plausible approximation to an intervention that depends on the natural value of treatment and an assumption for exact equivalence.

2 A classification of interventions that do not depend on the natural value of treatment

Consider an observational study in which the following random variables are measured during each follow-up time (e.g., day) k = 0, … , K + 1 for each of i = 1, … , n subjects. We assume subjects are independent and identically distributed and thus suppress the i subscript. Let $D_k$ be an indicator of failure (e.g., CHD) by k, $L_k$ a vector of measured confounders at the start of k (e.g., smoking, body mass index [BMI], diet), and $A_k$ the treatment observed during k (e.g., number of minutes of actual daily exercise). During any given time k, $D_k$ precedes $(L_k, A_k)$. We denote the history of a random variable using overbars. For example, $\bar{A}_k = (A_0, \ldots, A_k)$ is the observed treatment history through k. For notational convenience we set $\bar{L}_{-1}$ and $\bar{A}_{-1}$ to be identically 0 and, by definition, $\bar{D}_0 = 0$. We use lower-case letters to denote possible realizations of a random variable, e.g., $a_k$ is a possible realization of treatment $A_k$. For simplicity, we assume that no subjects are lost to follow-up or die from competing risks and that all variables are perfectly measured. If a subject has failed by k, i.e. $D_k = 1$, then by convention, we will set $L_k = A_k = 0$.

Our goal is to estimate the risk of failure that would have been observed by the end of follow-up K + 1 had all subjects in this study population followed a hypothetical intervention or treatment regime. Generally define a treatment regime that does not depend on the natural value of treatment as a rule that assigns treatment at k as an independent draw from an intervention density $f^{int}(a_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$ that may, at most, depend on $(\bar{a}_{k-1}, \bar{l}_k)$, k = 0, … , K (Robins 1986).

Treatment regimes can be either deterministic or random. A regime is deterministic if $f^{int}(a_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$ may only equal zero or one for all $(\bar{a}_k, \bar{l}_k)$ and k = 0, … , K. Otherwise, it is random. In particular, we denote $g = (g_0, \ldots, g_K)$ to be the deterministic regime associated with the intervention density defined by $f^{int}(a_k \mid \bar{l}_k, \bar{a}^g_{k-1}, \bar{D}_k = 0) = 1$ if $a_k = a^g_k$ and 0 otherwise, where $a^g_s = g_s(\bar{l}_s, \bar{a}^g_{s-1})$ is any component of $\bar{a}^g_k = (a^g_0, \ldots, a^g_k)$ and $a^g_s$ is recursively defined by the function $g_s$ of $(\bar{l}_s, \bar{a}^g_{s-1})$, s = 0, … , k.

Treatment regimes can further be classified as static or dynamic. A deterministic regime g is static if $a^g_k$ does not depend on any component of $\bar{l}_k$ for all k. Otherwise g is dynamic. Analogously, a random regime may be classified as static if the intervention density $f^{int}(a_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$ does not depend on any component of $\bar{l}_k$ for all k. Otherwise, this random regime can be classified as dynamic. As noted by Picciotto et al. (2012), and as made explicit in our notation, treatment assignment under any regime $f^{int}(a_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$ within the current classification depends on surviving to k (i.e. the event $D_k = 0$).

To fix ideas, let us consider some examples of treatment regimes in the context of interventions on daily exercise:

  1. Deterministic static regime: “Set daily exercise to 30 minutes on every day k for all subjects” or $f^{int}(a_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0) = 1$ if $a_k = 30$ and 0 otherwise for all k = 0, … , K. For this regime g, $a^g_k = 30$ for any k and confounder history $\bar{l}_k$.

  2. Deterministic dynamic regime: “If a subject’s BMI at the start of day k is ≥ 25, then set her exercise to exactly 30 minutes on that day. Otherwise, set her exercise to exactly 60 minutes” or, for $L_{1,k}$ the component of $L_k$ corresponding to the day k BMI measurement,
    • if $l_{1,k} \geq 25$ then $f^{int}(a_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0) = 1$ if $a_k = 30$ and 0 otherwise
    • if $l_{1,k} < 25$ then $f^{int}(a_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0) = 1$ if $a_k = 60$ and 0 otherwise
    k = 0, … , K. For this regime g, $a^g_k = 30$ if $l_{1,k} \geq 25$ and $a^g_k = 60$ otherwise for all k.

  3. Random static regime: “Randomly assign a subject’s exercise on day k such that the probability of receiving 30 minutes is 0.8 and the probability of receiving 60 minutes is 0.2” or $f^{int}(a_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0) = 0.8$ if $a_k = 30$, 0.2 if $a_k = 60$, and 0 otherwise. The intervention density $f^{int}(a_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$ may take on values between 0 and 1 but its value does not depend on $\bar{l}_k$ for any k.

  4. Random dynamic regime: “If a subject’s BMI at the start of day k is ≥ 25, randomly assign her exercise on day k such that the probability of receiving 30 minutes is 0.8 and the probability of receiving 60 minutes is 0.2. Otherwise, set her exercise to 60 minutes on that day” or
    • if $l_{1,k} \geq 25$ then $f^{int}(a_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0) = 0.8$ if $a_k = 30$, 0.2 if $a_k = 60$ and 0 otherwise
    • if $l_{1,k} < 25$ then $f^{int}(a_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0) = 1$ if $a_k = 60$ and 0 otherwise
    k = 0, … , K. The intervention density $f^{int}(a_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$ may take on values between 0 and 1 and its value depends on $\bar{l}_k$ for some k.
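The four example regimes can be written as small functions that return the intervention density for a given day. The following is only an illustrative sketch: the function names, the representation of a density as a dictionary from minutes to probability, and the two helper checks are assumptions made for this example and are not part of the paper. Past treatment is omitted as an argument because none of the four examples depends on it.

```python
from typing import Callable, Dict, Sequence

# Hypothetical representation: an intervention density is a function of the
# day-k BMI that returns {exercise minutes: probability}.
Density = Callable[[float], Dict[int, float]]


def deterministic_static(bmi: float) -> Dict[int, float]:
    # Example 1: everyone receives 30 minutes.
    return {30: 1.0}


def deterministic_dynamic(bmi: float) -> Dict[int, float]:
    # Example 2: 30 minutes if BMI >= 25, otherwise 60 minutes.
    return {30: 1.0} if bmi >= 25 else {60: 1.0}


def random_static(bmi: float) -> Dict[int, float]:
    # Example 3: 30 minutes w.p. 0.8 and 60 minutes w.p. 0.2, whatever the BMI.
    return {30: 0.8, 60: 0.2}


def random_dynamic(bmi: float) -> Dict[int, float]:
    # Example 4: randomize only when BMI >= 25, otherwise assign 60 minutes.
    return {30: 0.8, 60: 0.2} if bmi >= 25 else {60: 1.0}


def is_deterministic(f_int: Density, bmi_values: Sequence[float]) -> bool:
    # Deterministic: the density only takes the values 0 or 1.
    return all(p in (0.0, 1.0) for b in bmi_values for p in f_int(b).values())


def is_static(f_int: Density, bmi_values: Sequence[float]) -> bool:
    # Static: the density is the same for every confounder value checked.
    first = f_int(bmi_values[0])
    return all(f_int(b) == first for b in bmi_values)
```

Evaluating the two checks over a grid of BMI values that straddles 25 reproduces the classification given in the list above.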

3 Identifying risk under interventions that do not depend on the natural value of treatment

In observational studies, treatment is not under the control of the investigator but is assigned by some unknown treatment rule that generally differs from the hypothetical regime of interest $f^{int}(a_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$. In this section, we will review a set of conditions under which data from an observational study can still be used to identify the risk had all subjects, contrary to fact, followed a treatment regime characterized by $f^{int}(a_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$.

Let $\bar{D}^g_{K+1}$, $\bar{A}^g_{K+1}$ and $\bar{L}^g_{K+1}$ represent the counterfactual outcome, treatment and confounder histories, respectively, under a deterministic treatment regime g. We now define three g-specific identifying conditions for each k = 0, … , K:

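  1. Consistency: If $\bar{A}_{k+1} = \bar{A}^g_{k+1}$ then $\bar{D}_{k+1} = \bar{D}^g_{k+1}$ and $\bar{L}_{k+1} = \bar{L}^g_{k+1}$.

  2. Exchangeability:
    $$(D^g_{k+1}, \ldots, D^g_{K+1}) \perp\!\!\!\perp A_k \mid \bar{L}_k = \bar{l}_k, \bar{A}_{k-1} = \bar{a}^g_{k-1}, D_k = 0 \tag{1}$$
    Exchangeability (1) encodes the assumption that the measured history $(\bar{L}_k, \bar{A}_{k-1})$ is sufficient to control confounding for the effect of treatment at k on future outcomes. It is often referred to as the assumption of no unmeasured confounding and the vector $\bar{L}_k$ the measured confounder history at k.

  3. Positivity:
    $$f_{\bar{A}_{k-1}, \bar{L}_k, D_k}(\bar{a}^g_{k-1}, \bar{l}_k, 0) \neq 0 \implies f_{A_k \mid \bar{L}_k, \bar{A}_{k-1}, D_k}(a^g_k \mid \bar{l}_k, \bar{a}^g_{k-1}, 0) \equiv f^{obs}(a^g_k \mid \bar{l}_k, \bar{a}^g_{k-1}, \bar{D}_k = 0) > 0 \ \text{w.p.1.} \tag{2}$$
    where $f^{obs}(a_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$ denotes the observed treatment density, i.e., the conditional density of treatment at k in the observational study evaluated at a particular $(\bar{a}_k, \bar{l}_k)$.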

Under the three g-specific identifying assumptions stated above for each deterministic regime $g \in G$, where $G$ is the set of all deterministic regimes, the risk by K + 1 under an intervention characterized by any $f^{int}(a_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$ is equivalent to the g-formula (Robins 1986):

$$\sum_{\bar{a}_K} \sum_{\bar{l}_K} \sum_{k=0}^{K} \Pr[D_{k+1} = 1 \mid \bar{L}_k = \bar{l}_k, \bar{A}_k = \bar{a}_k, \bar{D}_k = 0] \times \prod_{j=0}^{k} \Big\{ \Pr[D_j = 0 \mid \bar{L}_{j-1} = \bar{l}_{j-1}, \bar{A}_{j-1} = \bar{a}_{j-1}, \bar{D}_{j-1} = 0] \times f(l_j \mid \bar{l}_{j-1}, \bar{a}_{j-1}, \bar{D}_j = 0) \times f^{int}(a_j \mid \bar{l}_j, \bar{a}_{j-1}, \bar{D}_j = 0) \Big\} \tag{3}$$

where $f(l_k \mid \bar{l}_{k-1}, \bar{a}_{k-1}, \bar{D}_k = 0)$ and $\Pr[D_{k+1} = 1 \mid \bar{L}_k = \bar{l}_k, \bar{A}_k = \bar{a}_k, \bar{D}_k = 0]$ are the observed joint density of the confounders at k and probability of the outcome by k + 1, respectively, conditional on past treatment, confounders, and survival to k, with $\bar{l}_k$ the first k + 1 components of $\bar{l}_K$, k = 0, … , K. A proof of this equivalence under the current data structure and notation is provided in the appendix of Young et al. (2011) following Lemma 4.2 of Robins (1986).

One minus expression (3) is equivalent to survival by K + 1 under a treatment regime characterized by $f^{int}(a_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$, k = 0, … , K. This survival can be written as a weighted average of deterministic survival probabilities associated with the deterministic regimes $g \in G$ with weights defined in terms of $f^{int}(a_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$. Appendix A reviews this equivalence and provides a simplified numerical example in a low-dimensional setting. Note for a given choice of $f^{int}(a_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$, the three identifying assumptions need only hold for the subset of deterministic regimes g that contribute a non-zero weight to the weighted average.

In settings with high-dimensional confounders and/or multiple follow-up times, it will often be quite cumbersome (if not impossible) to list every deterministic regime in the set $G$ with non-zero weights corresponding to a particular choice of $f^{int}(a_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$. An exception is the case where $f^{int}(a_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$ is defined in terms of a single deterministic regime g. In this special case, all weight is given to this single deterministic regime and expression (3) reduces to:

$$\sum_{\bar{l}_K} \sum_{k=0}^{K} \Pr[D_{k+1} = 1 \mid \bar{L}_k = \bar{l}_k, \bar{A}_k = \bar{a}^g_k, \bar{D}_k = 0] \times \prod_{j=0}^{k} \Big\{ \Pr[D_j = 0 \mid \bar{L}_{j-1} = \bar{l}_{j-1}, \bar{A}_{j-1} = \bar{a}^g_{j-1}, \bar{D}_{j-1} = 0] \times f(l_j \mid \bar{l}_{j-1}, \bar{a}^g_{j-1}, \bar{D}_j = 0) \Big\} \tag{4}$$

which may be more familiar to some readers.
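In a low-dimensional discrete setting, expression (3) can be evaluated by direct enumeration. The sketch below is a minimal illustration under stated assumptions: it takes the confounder density, the discrete failure hazard and the intervention density as already known functions (all of them hypothetical toy inputs, not estimates from any study) and uses the recursive form of (3), in which the risk from time k onward is the hazard at k + 1 plus the probability of surviving times the risk from k + 1 onward.

```python
from typing import Callable, Sequence, Tuple

History = Tuple[float, ...]


def g_formula_risk(K: int, l_vals: Sequence[float], a_vals: Sequence[float],
                   f_l: Callable, f_int: Callable, hazard: Callable,
                   l_hist: History = (), a_hist: History = ()) -> float:
    """Risk by K + 1 under f_int, by enumeration over discrete histories.

    f_l(l, l_hist, a_hist)   : confounder density at k given the past
    f_int(a, l_hist, a_hist) : intervention density (l_hist includes l_k)
    hazard(l_hist, a_hist)   : Pr[D_{k+1} = 1 | histories through k, survival]
    """
    k = len(a_hist)
    if k > K:
        return 0.0
    total = 0.0
    for l in l_vals:
        lh = l_hist + (l,)
        p_l = f_l(l, l_hist, a_hist)
        for a in a_vals:
            p_a = f_int(a, lh, a_hist)
            if p_l * p_a == 0.0:
                continue  # this history receives no weight under the regime
            ah = a_hist + (a,)
            h = hazard(lh, ah)
            later = g_formula_risk(K, l_vals, a_vals, f_l, f_int, hazard, lh, ah)
            total += p_l * p_a * (h + (1.0 - h) * later)
    return total


# Hypothetical toy inputs: a binary confounder and exercise of 30 or 60 minutes.
def toy_f_l(l, l_hist, a_hist):
    return 0.5


def toy_hazard(l_hist, a_hist):
    return (0.2 if a_hist[-1] == 30 else 0.1) + 0.05 * l_hist[-1]


def always_60(a, l_hist, a_hist):
    return 1.0 if a == 60 else 0.0


def always_30(a, l_hist, a_hist):
    return 1.0 if a == 30 else 0.0


def random_static(a, l_hist, a_hist):
    return {30: 0.8, 60: 0.2}[a]
```

With a deterministic `f_int` such as `always_60`, the inner loop keeps a single treatment value per history, which is the reduction from (3) to (4). Under the random static regime the result lies between the two deterministic static risks, in line with the weighted-average representation mentioned above.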

4 Identifying risk under interventions that depend on the natural value of treatment

Given an intervention, define the natural value of treatment at k as the value of treatment that would have been observed at time k were the intervention discontinued right before k. We denote the natural value of treatment at k as $A^*_k$ where, for notational simplicity, we suppress dependence on the associated intervention. Thus far we have only considered interventions that may, at most, depend on the measured confounders as classified in §2. We now extend our consideration to interventions that may also depend on the natural value of treatment at k. We shall represent such a hypothetical intervention by its intervention density, $f^{d}(a_k \mid a^*_k, \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$. An example of $f^{d}(a_k \mid a^*_k, \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$ is the threshold intervention of Taubman et al. (2009) on daily exercise stated in §1 such that

$$\begin{aligned} &\text{If } a^*_k \leq 30 \text{ then } f^{d}(a_k \mid a^*_k, \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0) = 1 \text{ if } a_k = 30 \text{ and } 0 \text{ o.w.} \\ &\text{If } a^*_k > 30 \text{ then } f^{d}(a_k \mid a^*_k, \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0) = 1 \text{ if } a_k = a^*_k \text{ and } 0 \text{ o.w.} \end{aligned} \tag{5}$$

Note that, in an observational study, the natural value of treatment at k, $A^*_k$, is equivalent to the observed treatment $A_k$ as no intervention has been made.

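Robins et al. (2004) defined the extended g-formula for risk by K + 1 associated with an intervention density $f^{d}(a_k \mid a^*_k, \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$:

$$\sum_{\bar{a}^*_K} \sum_{\bar{a}_K} \sum_{\bar{l}_K} \sum_{k=0}^{K} \Pr[D_{k+1} = 1 \mid \bar{L}_k = \bar{l}_k, \bar{A}_k = \bar{a}_k, \bar{D}_k = 0] \times \prod_{j=0}^{k} \Big\{ \Pr[D_j = 0 \mid \bar{L}_{j-1} = \bar{l}_{j-1}, \bar{A}_{j-1} = \bar{a}_{j-1}, \bar{D}_{j-1} = 0] \times f^{d}(a_j \mid a^*_j, \bar{l}_j, \bar{a}_{j-1}, \bar{D}_j = 0) \times f(a^*_j \mid \bar{l}_j, \bar{a}_{j-1}, \bar{D}_j = 0) \times f(l_j \mid \bar{l}_{j-1}, \bar{a}_{j-1}, \bar{D}_j = 0) \Big\} \tag{6}$$

where we stress that $f(a^*_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$ is the conditional density of $A^*_k = A_k$ in the observational study evaluated at $a^*_k$ given past treatment, confounders, and survival to k, k = 0, … , K. To emphasize this fact, we sometimes write this density as $f^{obs}(a^*_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$.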

Richardson and Robins (2013) defined a condition such that (6) identifies from observational data the risk by K + 1 under a hypothetical intervention $f^{d}(a_k \mid a^*_k, \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$ provided this expression is well-defined. We can informally understand this condition as the assumption that $A^*_k$ is not a confounder and has no effect on the outcome except through future treatment. We consider this condition more formally in the context of a simple example in Appendix B.

Consider one particular intervention that does not depend on $A^*_k$ within the classification of §2 specifically chosen as

$$f^{int}(a_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0) = \sum_{a^*_k} f^{d}(a_k \mid a^*_k, \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0) \, f^{obs}(a^*_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0) \tag{7}$$

for any $(\bar{a}_k, \bar{l}_k)$. We will say that this choice of $f^{int}(a_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$ is an implied treatment rule because it is a marginalization of the user-supplied density $f^{d}(a_k \mid a^*_k, \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$ over the observational data density of $A^*_k = A_k$. For this particular choice of $f^{int}(a_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$, the extended g-formula (6) is equivalent to the (non-extended) g-formula (3). This equivalence follows by the absence of $A^*_k$ from the conditioning statement of the conditional probability of the outcome at any time k + 1, … , K + 1 in (6).
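The step behind this equivalence can be written out explicitly. The following is a sketch in the notation of (6) and (7), not a full proof. For each $j \leq k$, the natural value $a^*_j$ enters the $k$th term of (6) only through the two factors $f^{d}$ and $f^{obs}$ at time $j$: neither the outcome probabilities nor the confounder densities condition on it. The sum over $a^*_j$ can therefore be taken factor by factor:

```latex
\begin{align*}
&\sum_{a^*_j}
  f^{d}(a_j \mid a^*_j, \bar{l}_j, \bar{a}_{j-1}, \bar{D}_j = 0)\,
  f^{obs}(a^*_j \mid \bar{l}_j, \bar{a}_{j-1}, \bar{D}_j = 0)\,
  f(l_j \mid \bar{l}_{j-1}, \bar{a}_{j-1}, \bar{D}_j = 0) \\
&\qquad = f(l_j \mid \bar{l}_{j-1}, \bar{a}_{j-1}, \bar{D}_j = 0)
  \sum_{a^*_j}
  f^{d}(a_j \mid a^*_j, \bar{l}_j, \bar{a}_{j-1}, \bar{D}_j = 0)\,
  f^{obs}(a^*_j \mid \bar{l}_j, \bar{a}_{j-1}, \bar{D}_j = 0) \\
&\qquad = f(l_j \mid \bar{l}_{j-1}, \bar{a}_{j-1}, \bar{D}_j = 0)\,
  f^{int}(a_j \mid \bar{l}_j, \bar{a}_{j-1}, \bar{D}_j = 0).
\end{align*}
```

Substituting this into each factor of the product in (6) gives the product in (3), with the implied rule (7) in the place of the intervention density.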

By this equivalence, it immediately follows that, with $f^{int}(a_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$ defined by (7), the positivity condition (2) of §3 guarantees that both the (non-extended) g-formula (3) and the extended g-formula (6) are well-defined. Note, again, for this $f^{int}(a_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$, this condition need only hold for the subset of deterministic regimes g that contribute a non-zero weight to the associated weighted average of deterministic regimes. Díaz Muñoz and van der Laan (2011, 2012) and Haneuse and Rotnitzky (2013) noted a similar result in the point treatment setting for random dynamic regimes that might be interpreted in terms of implied random dynamic regimes based on an explicit deterministic mechanism depending on the natural value of treatment. The regimes considered by these authors are discussed in §5.2.

The implied intervention density (7) is a function of the observed treatment density $f^{obs}(a_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$, which is generally unknown in high-dimensional observational data (although it may be estimated). Therefore, the implied $f^{int}(a_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$ will also generally be unknown. For example, for $f^{d}(a_k \mid a^*_k, \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$ as defined in (5), the marginalization (7) evaluates to

$$f^{int}(a_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0) = \begin{cases} \Pr^{obs}(A_k \leq a_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0) & \text{if } a_k = 30, \\ f^{obs}(a_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0) & \text{if } a_k > 30, \\ 0 & \text{if } a_k < 30. \end{cases} \tag{8}$$

The implied rule (8) is a random dynamic regime by the classification given in §2 as $f^{obs}(a_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$ will generally be a nondegenerate density.
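The marginalization (7) and its threshold special case (8) are easy to check numerically when treatment takes a few discrete values. The sketch below is illustrative only: the observed density `f_obs` is a made-up example for a single fixed history, and the function names are assumptions introduced here.

```python
from typing import Callable, Dict, Sequence


def threshold_rule(a: float, a_star: float, threshold: float = 30) -> float:
    """Threshold intervention (5): f_d(a | a_star) for one fixed history."""
    target = threshold if a_star <= threshold else a_star
    return 1.0 if a == target else 0.0


def implied_density(f_d: Callable[[float, float], float],
                    f_obs: Dict[float, float],
                    support: Sequence[float]) -> Dict[float, float]:
    """Marginalization (7): f_int(a) = sum over a_star of f_d(a | a_star) * f_obs(a_star)."""
    return {a: sum(f_d(a, a_star) * f_obs[a_star] for a_star in support)
            for a in support}


# Hypothetical observed density of daily exercise minutes for one history.
f_obs = {0: 0.30, 15: 0.20, 30: 0.10, 45: 0.25, 60: 0.15}
f_int = implied_density(threshold_rule, f_obs, sorted(f_obs))
```

In this toy example all observed mass at or below 30 minutes is moved onto exactly 30 minutes, values above 30 keep their observed density, and values below 30 receive density zero, which is the pattern stated in (8). The support passed to `implied_density` has to contain the threshold value itself for this to work.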

Finally, while the extended g-formula (6) and the (non-extended) g-formula (3) associated with the random dynamic regime (7) require the same positivity condition by their equivalence, the conditions required for risk identification under an intervention $f^{d}(a_k \mid a^*_k, \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$ and under (7) are not generally equivalent. In particular, the identifying condition defined by Richardson and Robins (2013) for an intervention $f^{d}(a_k \mid a^*_k, \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$ is generally more stringent than that required for the random dynamic mechanism (7), the latter of which is equivalent to the exchangeability condition (1) of §3. An exception is under the null; here the two conditions are equivalent. For details, see section 5.6 of Richardson and Robins (2013) and Appendix B.

5 Estimating an intervention risk using observational data

In low-dimensional settings we can non-parametrically estimate expression (3) by first enumerating all possible treatment and confounder histories under a specified intervention $f^{int}(a_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$, calculating each component proportion, and then taking the overall sum. When $f^{int}(a_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$ is implied by the sum (7) then we must additionally enumerate all possible natural treatment histories and calculate this implied rule. In high-dimensional settings, such that K is large and/or there are continuously measured covariates, such an approach is not feasible. In this case, parametric or semi-parametric approaches may be used.

5.1 Parametric estimation

Robins (1986) described a parametric estimator of the (non-extended) g-formula given in (3) which involves parametrically modelling each component density and using Monte Carlo simulation to approximate the sum over all possible histories under an intervention that does not depend on the natural value of treatment as in §2. Robins et al. (2004) and Taubman et al. (2009) generalized this algorithm to allow for an intervention that depends on the natural value of treatment as in §4. Briefly, this more general approach involves the following steps:

  1. Parametrically estimate the joint density of natural treatment and confounders at each follow-up time (except baseline) given survival and past treatment and confounders.

  2. Parametrically estimate the probability of failure at each follow-up time given survival and past measured treatment and confounders.

  3. Recursively, for each k = 0, … , K
    • (a) Set baseline confounders and natural treatment to the observed sample values. For k > 0, generate time k confounders and natural treatment based on the estimated model coefficients and previously generated treatment and confounders under intervention.
    • (b) Assign time k treatment under intervention based on the rule of interest which may be an explicitly specified $f^{int}(a_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$ depending at most on the past measured confounders or an explicitly specified $f^{d}(a_k \mid a^*_k, \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$ depending on the natural value of treatment at k.
    • (c) Calculate the discrete failure hazard at k + 1 given only past generated treatment and confounders under the intervention (ignoring the natural treatment value).
  4. Calculate the cumulative probability of failure by K + 1 using the k + 1 specific failure hazards for each generated treatment and confounder history under intervention.

  5. Calculate the average cumulative probability of failure by K + 1 over all generated intervention histories.
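Steps 3 to 5 above can be sketched as a short Monte Carlo loop. This is a simplified illustration, not the SAS macro or the authors' implementation: steps 1 and 2 (model fitting) are skipped, and the "fitted" models are replaced by hypothetical toy functions (`draw_l`, `draw_natural_a`, `hazard`) whose forms and coefficients are invented for the example.

```python
import random
from typing import Callable, List, Sequence, Tuple


def simulate_risk(baseline: Sequence[Tuple[float, float]], K: int,
                  draw_l: Callable, draw_natural_a: Callable,
                  rule: Callable, hazard: Callable, seed: int = 0) -> float:
    """Average cumulative risk by K + 1 over simulated intervention histories."""
    rng = random.Random(seed)
    risks: List[float] = []
    for l0, a0_star in baseline:
        l_hist: List[float] = []
        a_hist: List[float] = []
        surv, risk = 1.0, 0.0
        for k in range(K + 1):
            # Step 3(a): observed values at baseline, model draws afterwards.
            if k == 0:
                l_k, a_star = l0, a0_star
            else:
                l_k = draw_l(rng, l_hist, a_hist)
                a_star = draw_natural_a(rng, l_hist + [l_k], a_hist)
            l_hist.append(l_k)
            # Step 3(b): assign treatment; the rule may use the natural value.
            a_hist.append(rule(a_star, l_hist, a_hist))
            # Step 3(c): hazard depends on assigned treatment, not on a_star.
            h = hazard(l_hist, a_hist)
            # Step 4: accumulate the cumulative probability of failure.
            risk += surv * h
            surv *= 1.0 - h
        risks.append(risk)
    # Step 5: average over all generated histories.
    return sum(risks) / len(risks)


# Hypothetical stand-ins for fitted models (BMI and exercise minutes).
def draw_l(rng, l_hist, a_hist):
    return rng.gauss(l_hist[-1], 1.0)


def draw_natural_a(rng, l_hist, a_hist):
    return max(0.0, rng.gauss(45.0 - 2.0 * (l_hist[-1] - 25.0), 20.0))


def hazard(l_hist, a_hist):
    return 0.02 if a_hist[-1] >= 30 else 0.04


def threshold_rule(a_star, l_hist, a_hist):
    return max(a_star, 30.0)  # raise to 30 minutes, otherwise leave as is


def natural_course(a_star, l_hist, a_hist):
    return a_star  # no intervention
```

Passing `natural_course` instead of `threshold_rule` shows the point made in the text that follows: the natural value is generated at every time whether or not the rule uses it.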

Robins et al. (2004); Taubman et al. (2009); Lajous et al. (2013); Danaei et al. (2013) and García-Aymerich et al. (2013) have applied the above approach to estimate failure risk under time-varying threshold interventions on lifestyle factors that depend on the natural value of treatment in various observational studies including the NHS, the Offspring Framingham Heart Study (FHS) and the Health Professionals Follow-up Study (HPFS). A more technical description of this algorithm is given in Appendix C and may be implemented using a SAS macro publicly available at www.hsph.harvard.edu/causal/software.

This estimation algorithm effectively ignores that, for an intervention $f^{d}(a_k \mid a^*_k, \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$, the implied treatment rule depending only on the measured confounders is $f^{int}(a_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$ as defined by the marginalization (7). The natural value of treatment $A^*_k$ is generated at each k regardless of whether the explicit intervention of interest depends on it or not. If the intervention does not depend on it, then $A^*_k$ is generated but not used. Note that expression (3) can be rewritten as

$$\sum_{\bar{a}^*_K} \sum_{\bar{a}_K} \sum_{\bar{l}_K} \sum_{k=0}^{K} \Pr[D_{k+1} = 1 \mid \bar{L}_k = \bar{l}_k, \bar{A}_k = \bar{a}_k, \bar{D}_k = 0] \times \prod_{j=0}^{k} \Big\{ \Pr[D_j = 0 \mid \bar{L}_{j-1} = \bar{l}_{j-1}, \bar{A}_{j-1} = \bar{a}_{j-1}, \bar{D}_{j-1} = 0] \times h^{user}(\bar{a}_j, a^*_j, \bar{l}_j) \times f(z_j \mid \bar{l}_{j-1}, \bar{a}_{j-1}, \bar{D}_j = 0) \Big\} \tag{9}$$

where $f(z_k \mid \bar{l}_{k-1}, \bar{a}_{k-1}, \bar{D}_k = 0)$ is the joint density of $Z_k$, an arbitrarily ordered vector including $A^*_k$ and $L_k$, conditional on survival and past treatment and confounder history. Here, $h^{user}(\bar{a}_k, a^*_k, \bar{l}_k)$ may be selected as an explicitly specified $f^{int}(a_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$ that may at most depend on the measured confounder history as in the examples given in §2 or an explicitly specified $f^{d}(a_k \mid a^*_k, \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$. Under the latter choice, expression (9) is equivalent to expression (3) with $f^{int}(a_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$ defined by (7) and, thus (by the arguments of §4), also equivalent to the extended g-formula (6).

5.2 Semi-parametric estimation

The parametric g-formula may be subject to bias due to model misspecification and to the g-null paradox (Robins and Wasserman 1997). As an alternative, several authors have described semi-parametric estimators of risk under explicitly specified random dynamic regimes that may, at most, depend on the measured confounder history (Murphy et al. 2001; Cain et al. 2010; Stitelman et al. 2010; Díaz Muñoz and van der Laan 2012). These approaches do not require specification of the likelihood and may be more robust to model misspecification. Here we describe how an inverse-probability weighted (IPW) risk estimator can be extended to implied random dynamic regimes such as that defined by equation (7).

Following Cain et al. (2010), consider the following IPW estimator of risk by K + 1 under an explicitly specified $f^{int}(a_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$. Let $\hat{\psi}$ be the solution to the estimating equation

$$\sum_{i=1}^{n} \sum_{k=0}^{K} U_{i,k}(\psi, \hat{\alpha}) = 0, \tag{10}$$

with respect to ψ where

$$U_{i,k}(\psi, \hat{\alpha}) = (D_{i,k+1} - \lambda(k, \psi))(1 - D_{i,k}) W_{i,k}(\hat{\alpha}),$$

λ(k, ψ) is a flexible function of k and the parameter vector ψ and

$$W_{i,k}(\alpha) = \frac{\prod_{j=0}^{k} f^{int}(A_{i,j} \mid \bar{L}_{i,j}, \bar{A}_{i,j-1}, \bar{D}_j = 0)}{\prod_{j=0}^{k} f^{obs}(A_{i,j} \mid \bar{L}_{i,j}, \bar{A}_{i,j-1}, \bar{D}_j = 0; \alpha)}, \tag{11}$$

with $\hat{\alpha}$ the MLE of α given the model $f^{obs}(a_j \mid \bar{l}_j, \bar{a}_{j-1}, \bar{D}_j = 0; \alpha)$ for the observed treatment density as defined in (2) with $\alpha_0$ the true population value of α.

If this treatment model is correctly specified and there exists $\psi_0$ such that

$$\lambda(k, \psi_0) = \frac{E[D_{k+1}(1 - D_k) W_k(\alpha_0)]}{E[(1 - D_k) W_k(\alpha_0)]} \tag{12}$$

then we have

$$E[U_k(\psi_0, \alpha_0)] = 0 \tag{13}$$

for all k and the estimator $\hat{\psi}$ is consistent for $\psi_0$ and asymptotically normal. Note that, under these assumptions, the g-formula (3) is equivalent to

$$\sum_{k=0}^{K} \lambda(k, \psi_0) \prod_{j=0}^{k-1} (1 - \lambda(j, \psi_0)) \tag{14}$$

The IP weighted estimator of (3) is then given by the plug-in estimator where $\psi_0$ in (14) is replaced by the IP weighted estimate $\hat{\psi}$. Analogous to Cain et al. (2010), we might impose a Cox marginal structural model (MSM) if few individuals are following $f^{int}(a_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$ to borrow information from individuals following other interventions (Robins 2000). Note that in the case where $f^{int}(a_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$ corresponds to a single deterministic regime g, $\prod_{j=0}^{k} f^{int}(A_{i,j} \mid \bar{L}_{i,j}, \bar{A}_{i,j-1}, \bar{D}_j = 0)$ in the numerator of the weight (11) becomes $\prod_{j=0}^{k} I(A_{i,j} = A^g_j)$ which renders an estimating equation more familiar to some readers.
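A minimal sketch of this plug-in estimator follows, under one simplifying assumption that is not made in the text: the hazard λ(k, ψ) is taken to be saturated in k (one free parameter per interval), so that each λ(k) is estimated by the sample analogue of (12). The data layout and function names are hypothetical, and `f_obs` stands for an already fitted treatment model evaluated at its estimated parameters.

```python
from typing import Callable, List, Sequence, Tuple

# One subject: (confounders l_0..l_K, treatments a_0..a_K, outcomes d),
# where d[k] is the failure indicator D_{k+1}; entries after failure are unused.
Subject = Tuple[Sequence[float], Sequence[float], Sequence[int]]


def ipw_risk(subjects: Sequence[Subject], K: int,
             f_int: Callable, f_obs: Callable) -> float:
    """Plug-in version of (14) with a saturated hazard, using weights (11)."""
    num: List[float] = [0.0] * (K + 1)
    den: List[float] = [0.0] * (K + 1)
    for l, a, d in subjects:
        w = 1.0
        for k in range(K + 1):
            if k > 0 and d[k - 1] == 1:
                break  # subject failed by k, so (1 - D_k) = 0 from here on
            l_hist, a_past = list(l[:k + 1]), list(a[:k])
            # Cumulative weight (11): product over j <= k of f_int / f_obs.
            w *= f_int(a[k], l_hist, a_past) / f_obs(a[k], l_hist, a_past)
            num[k] += w * d[k]
            den[k] += w
    risk, surv = 0.0, 1.0
    for k in range(K + 1):
        if den[k] == 0.0:
            raise ValueError(f"no weighted subjects at risk at time {k}")
        lam = num[k] / den[k]  # sample analogue of (12)
        risk += surv * lam     # term k of (14)
        surv *= 1.0 - lam
    return risk


# Hypothetical toy data with K = 1 and exercise of 30 or 60 minutes.
toy_subjects = [
    ([0, 0], [30, 30], [0, 0]),
    ([0, 0], [60, 60], [1, 1]),
    ([1, 1], [30, 60], [0, 1]),
    ([1, 0], [60, 30], [0, 0]),
]


def toy_f_obs(a, l_hist, a_past):
    return 0.5


def toy_f_int(a, l_hist, a_past):
    return {30: 0.8, 60: 0.2}.get(a, 0.0)
```

When `f_int` equals `f_obs` every weight is one and the estimate reduces to the unweighted discrete-time risk. Raising an error when no weighted subject remains at risk is a design choice of this sketch; it flags an empirical positivity problem rather than silently returning a value.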

To extend the IP weighted estimator described above (and related semi-parametric approaches) to explicitly specified interventions of the form $f^{d}(a_k \mid a^*_k, \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$, we must replace $f^{int}(a_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$ for $(\bar{A}_{i,k}, \bar{L}_{i,k}) = (\bar{a}_k, \bar{l}_k)$ in the weight (11) with the marginalization (7) for every possible treatment and confounder history observed in the data. Thus, in contrast to the parametric g-formula estimator described above, semi-parametric methods cannot “ignore” the fact that the explicit treatment rule of interest $f^{d}(a_k \mid a^*_k, \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$ implies the marginalization (7).

In general, the computational complexity of this marginalization will, of course, depend on the form of $f^{d}(a_k \mid a^*_k, \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$ and $f^{obs}(a_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$. For example, the implied rule (8) requires knowledge of $f^{obs}(a_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$ which must be estimated to calculate the denominator of the weights. If this is based on a parametric model then one must also estimate $\Pr(A_k \leq a_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$ based on that model, which will be used for the numerator of the weights for any subject with $A_k = 30$.
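For the threshold rule, the time-k factor of the weight (11) follows directly from (8). The sketch below assumes a discrete treatment and takes the estimated observed density for one subject's history as a dictionary. Both the representation and the function name are illustrative assumptions.

```python
from typing import Dict


def threshold_weight_factor(a_k: float, f_obs: Dict[float, float],
                            threshold: float = 30) -> float:
    """Time-k factor f_int(a_k | history) / f_obs(a_k | history) under rule (8).

    f_obs maps each possible treatment value to its estimated observed
    probability for this subject's history (hypothetical input).
    """
    if a_k < threshold:
        return 0.0  # the implied rule never assigns values below the threshold
    if a_k > threshold:
        return 1.0  # numerator and denominator are both f_obs(a_k | history)
    # a_k equals the threshold: numerator is Pr_obs(A_k <= threshold | history).
    below_or_at = sum(p for value, p in f_obs.items() if value <= threshold)
    return below_or_at / f_obs[threshold]


# Hypothetical estimated observed density for one history.
f_obs_example = {0: 0.30, 15: 0.20, 30: 0.10, 45: 0.25, 60: 0.15}
```

In this toy example a subject observed at exactly 30 minutes receives a factor of about 6, which hints at how unstable the weights can become when few subjects are observed at the threshold value itself.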

Other authors have considered semi-parametric estimators of risk under random dynamic regimes that might be interpreted in terms of implied random dynamic regimes based on an explicit deterministic mechanism depending on the natural value of treatment (Díaz Muñoz and van der Laan 2011, 2012; Haneuse and Rotnitzky 2013). For example, Díaz Muñoz and van der Laan (2012) considered various semi-parametric estimators of risk under a random dynamic regime on a point treatment that shifts the observed treatment density by a certain amount. They allowed this shift to, at most, depend on values of the measured confounders, considering interventions on physical activity as a particular example.

Specifically, extending to our more general time-varying setting, this shift $\delta(\bar{l}_k, \bar{a}_{k-1})$ could be achieved by the following mechanism “On each day k, if a subject with treatment and confounder history $(\bar{l}_k, \bar{a}_{k-1})$ has exercised $a^*_k$ minutes under no intervention by the end of the day then have her, instead, exercise $a^*_k + \delta(\bar{l}_k, \bar{a}_{k-1})$ on that day”. If we fix $\delta(\bar{l}_k, \bar{a}_{k-1}) = 30$ for all $(\bar{l}_k, \bar{a}_{k-1})$ then this intervention maintains exercise at or above 30 minutes per day for all subjects and corresponds to a particular choice of $f^{d}(a_k \mid a^*_k, \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$, such that $f^{d}(a_k \mid a^*_k, \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0) = 1$ if $a_k = a^*_k + \delta(\bar{l}_k, \bar{a}_{k-1})$ and 0 otherwise.

For this choice of $f^{d}(a_k \mid a^*_k, \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$, the marginalization (7) is conveniently equivalent to $f^{obs}(a^*_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$ evaluated at $a^*_k = a_k - \delta(\bar{l}_k, \bar{a}_{k-1})$, for all values of $a_k$. As noted by Díaz Muñoz and van der Laan (2012), this choice of $f^{int}(a_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$ also may render practical violations of positivity less influential on the performance of the estimators. See Petersen et al. (2012) for a detailed discussion of the potential influence of practical positivity violations on various estimators.

6 A plausible approximation to interventions that depend on the natural value of treatment

In the previous sections we have considered hypothetical interventions at k that depend on the natural value of treatment also at k. Such interventions are generally not plausible in practice. For example, once an individual has exercised less than 30 minutes by the end of day k, she cannot, instead, have exercised 30 minutes by the end of that day. It follows that, even given “perfect” conditions (e.g. identifiability and no model misspecification) it is unclear how to use observational estimates associated with such interventions to inform real-world future policy or the design of future randomized experiments.

We might, however, approximate such interventions with a plausible (implementable) experiment. Let $X_k$ be a subject’s stated intention with respect to treatment on day k measured at the start of that day (e.g., intended daily minutes of exercise at the start of day k). Given an intervention $f^{d}(a_k \mid a^*_k, \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$, denote $f^{d}(a_k \mid x_k, \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$ as a plausible approximation that assigns treatment according to the same rule as $f^{d}(a_k \mid a^*_k, \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$ at each k but replacing $A^*_k$ with $X_k$.

For example, given the threshold intervention on exercise of Taubman et al. (2009) characterized by (5), a plausible approximation is “If a subject’s intention at the start of day k is to exercise less than 30 minutes on that day then ensure she exercises exactly 30 minutes by the end of day k. Otherwise, ensure she exercises her intended amount” or

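$$\begin{aligned} &\text{If } x_k \leq 30 \text{ then } f^{d}(a_k \mid x_k, \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0) = 1 \text{ if } a_k = 30 \text{ and } 0 \text{ o.w.} \\ &\text{If } x_k > 30 \text{ then } f^{d}(a_k \mid x_k, \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0) = 1 \text{ if } a_k = x_k \text{ and } 0 \text{ o.w.} \end{aligned} \tag{15}$$

Suppose treatment is assigned according to $f^{d}(a_k \mid x_k, \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$ and the following assumption held: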

Natural Value of Treatment Assumption

Under any intervention, for all k, every subject’s intended minutes of exercise at the start of day k is equal to what her subsequent behavior would be on that day were the intervention based on intention discontinued right before k.

Under this assumption, the plausible rule fd(akxk,lk,a¯k1,D¯k=0) is not an approximation but exactly equal to fd(akak,lk,a¯k1,D¯k=0). Further, under the reasonable assumption that intention has no direct effect on the outcome except through future treatment, the risks by K + 1 under these two rules will be equivalent. Thus all identification and estimation results of §4 and §5 apply.

In an actual experiment where treatment is assigned according to fd(akxk,lk,a¯k1,D¯k=0) it is impossible to empirically examine whether this assumption holds, even given Xk is measured. However, in an observational study, this relationship can be examined given Xk is measured. In particular, in an observational study (i.e., under no intervention), the natural value of treatment assumption implies that for each subject and all k

If $X_k = x_k$ and $A_k = a_k$ then $x_k = a_k$. (16)

Here, again, the natural value of treatment $A_k^*$ is equivalent to the measured treatment $A_k$ for all subjects as no intervention is made. Note that, while (16) implies that the natural value of treatment assumption holds for the observational study, (16) does not guarantee this assumption will hold under an intervention $f^d(a_k \mid x_k, \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$.
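In an observational data set with measured intention, condition (16) could be screened with a few lines of code. The sketch below assumes a hypothetical record layout `(subject, day, x_k, a_k)` and a hypothetical tolerance argument `tol`; neither comes from this paper, and the toy records are invented for illustration only.

```python
def violations_of_16(records, tol=0.0):
    """Return the person-day records on which intention and observed treatment differ.

    Each record is a tuple (subject, day, x_k, a_k). A record violates (16)
    when |x_k - a_k| exceeds the tolerance.
    """
    return [r for r in records if abs(r[2] - r[3]) > tol]


# Invented toy records: subject s1 exercises more than intended on day 1.
records = [
    ("s1", 0, 20.0, 20.0),
    ("s1", 1, 30.0, 45.0),
    ("s2", 0, 60.0, 60.0),
]
bad = violations_of_16(records)
```

An empty result is consistent with (16) in the observed data, but it says nothing about whether the assumption would continue to hold once an intervention is in place.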

Finally, we point out that when (16) does not hold, $f^d(a_k \mid x_k, \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$ is simply an example of a deterministic dynamic regime g by the classification given in §2 with $L_k$ in §2 replaced with $(X_k, L_k)$. This deterministic dynamic regime g is specifically defined such that $a_k^g = 30$ if $x_k \le 30$ and $a_k^g = x_k$ otherwise. Further, by the arguments of §3, given the assumptions of that section for this choice of g, risk under this regime is identified by the deterministic regime g-formula (4), again replacing $L_k$ with $(X_k, L_k)$. Note that, in this setting, any of the conditional densities in the g-formula (4) may depend on $\bar{X}_k$ without restriction.

By contrast, if assumption (16) holds in the observational study, positivity as defined in (2) immediately fails for this g. Specifically, $a_k^g \neq x_k$ for $x_k < 30$ because we must have $a_k^g \ge 30$ for all k and $(\bar{x}_k, \bar{l}_k)$ under this definition of g. Therefore, given (16), $f^{obs}(a_k^g \mid \bar{x}_k, \bar{l}_k, \bar{a}_{k-1}^g, \bar{D}_k = 0) = 0$ whenever $x_k < 30$. As a consequence of this positivity violation, $\Pr[D_{k+1} = 1 \mid \bar{X}_k = \bar{x}_k, \bar{L}_k = \bar{l}_k, \bar{A}_k = \bar{a}_k^g, \bar{D}_k = 0]$ in expression (4) is undefined for all histories such that $x_k < 30$, k = 0, … , K.
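The positivity failure can be seen in a toy calculation. The sketch below uses invented person-day records `(x_k, a_k)` and ignores the covariate and treatment history for brevity, so it is a marginal illustration rather than the conditional density in (2). The function names are assumptions made for this example.

```python
def regime_treatment(x_k, threshold=30.0):
    """Treatment a_k^g implied by the regime g given the intention x_k."""
    return threshold if x_k <= threshold else x_k


def share_following_regime(records, threshold=30.0):
    """Among person-days with x_k < threshold, the proportion on which the
    observed treatment a_k equals a_k^g. Returns None if no such person-days exist."""
    below = [(x, a) for x, a in records if x < threshold]
    if not below:
        return None
    followers = sum(a == regime_treatment(x, threshold) for x, a in below)
    return followers / len(below)


# Invented data in which (16) holds: observed treatment equals intention.
records_16 = [(10.0, 10.0), (25.0, 25.0), (40.0, 40.0)]
# Invented data in which (16) fails on one person-day.
records_not_16 = [(10.0, 30.0), (25.0, 25.0), (40.0, 40.0)]

share_16 = share_following_regime(records_16)
share_not_16 = share_following_regime(records_not_16)
```

When observed treatment always equals intention, nobody with an intention below 30 minutes is observed at 30 minutes, so the empirical share is zero and the conditional risk for those histories cannot be estimated from the data.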

7 Conclusions

In this paper, we showed the equivalence between the extended g-formula associated with an intervention that depends on the natural value of treatment and the (non-extended) g-formula of Robins (1986) associated with a particular random dynamic regime that does not depend on this value. This equivalence immediately gives a sufficient positivity condition that guarantees the extended g-formula is well-defined. This positivity result, coupled with the results of Richardson and Robins (2013), now provides a formal causal framework for previously published applications of the parametric extended g-formula to estimate risk under threshold interventions in observational studies. It also immediately gives semi-parametric alternatives to the parametric extended g-formula. Finally, we considered limits on the practical implementation of threshold interventions along with possible real-world approximations.

The assumption of positivity is often informally described as the assumption that there are at least some subjects in the observational study who are observed to follow the hypothetical intervention of interest within every possible level of the “past”. By this understanding, it would appear that positivity must be violated for the threshold intervention on exercise considered by Taubman et al. (2009). Specifically, no subject who exercised less than 30 minutes on day k can be following the intervention at k. Our positivity result makes clear that, given appropriate identification conditions, it is not necessary to observe such patterns in the observational study. It is only necessary to observe some individuals following the implied random dynamic regime (7).

Appendix A.

A.1 Representing the g-formula characterized by a random dynamic regime as a weighted average of deterministic regimes

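Given $g \in G$, let $f_k(a_k^g \mid \bar{a}_{k-1}^g, \bar{l}_k)$ equal the intervention density $f^{int}(a_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$ evaluated at $\bar{a}_k^g$. In the following we assume $L_k$ is discrete and we choose an ordering such that $\{l_{k,1}, \ldots, l_{k,|\mathcal{L}_k|}\} = \mathcal{L}_k$, with $\mathcal{L}_k$ the support of $L_k$ and $|\mathcal{L}_k|$ its cardinality. Let $q_K^g(\bar{l}_{K-1}) = \prod_{h=1}^{|\mathcal{L}_K|} f_K(a_K^g \mid \bar{a}_{K-1}^g, \bar{l}_{K-1}, l_{K,h})$. For k = K − 1, … , 0, let

$q_k^g(\bar{l}_{k-1}) = \prod_{h=1}^{|\mathcal{L}_k|} f_k(a_k^g \mid \bar{a}_{k-1}^g, \bar{l}_{k-1}, l_{k,h}) \, q_{k+1}^g(\bar{l}_{k-1}, l_{k,h})$

Let $wt(g) = q_0^g$.

By Lemma 4.2 of Robins (1986), given a particular $f^{int}(a_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$, $k \le K$, and defining wt(g) as above for all $g \in G$, one minus expression (3) equals

$\sum_{g \in G} wt(g)$ “Pr”$[D_{K+1}^g = 0]$

where “Pr”$[D_{K+1}^g = 0]$ is equivalent to one minus expression (4) and $\sum_{g \in G} wt(g) = 1$.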

A.2 Simplified numerical example

Figure 1 uses a structural tree graph (Robins 1986) to depict a hypothetical sequentially randomized trial where treatment is assigned at each time based on a particular intervention density $f^{int}(a_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$. For numerical simplicity we will consider a short follow-up with K = 1 and all binary treatment and covariates. For additional simplicity we will assume that no subject fails prior to the end of follow-up (i.e. $\bar{D}_1 = 0$ for all subjects). We will also assume that all subjects have the same value of the baseline covariate $L_0 = l_0$. The intervention density is defined by the probability of receiving a given level of treatment given the past, read directly off the graph. These probabilities imply that $f^{int}(a_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$ corresponds to a random dynamic regime. For example, following the top branch of the graph, the probability of receiving treatment at k = 1 given $(l_0, a_0 = 1, l_1 = 1)$ is 10/20 or 0.5.

Figure 1. A hypothetical sequentially randomized trial for K = 1 and all binary $(A_0, L_1, A_1)$.

The survival probability for the disease of interest in this hypothetical sequentially randomized trial is simply the overall proportion of those who did not get the disease at the end of follow-up out of the total number at risk at baseline. Specifically, 52 subjects at the end have $D_2 = 0$ out of the 100 subjects at risk at baseline; thus survival in this hypothetical trial characterized by $f^{int}(a_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$ is 52/100. We will now show that 52/100 is equivalent to a weighted average of the g-formula for survival over all deterministic regimes that it is possible to follow in this hypothetical sequentially randomized trial, with weights defined as in the previous section.

First the set G contains the following subset of deterministic regimes (g1, g2, g3, g4, g5, g6):

  • g1: $(a_0^{g_1}, a_1^{g_1}) = (0, 1)$; the static regime “do not treat at time 0; treat at time 1”

  • g2: $(a_0^{g_2}, a_1^{g_2}) = (1, 0)$; the static regime “treat at time 0; do not treat at time 1”

  • g3: $(a_0^{g_3}, a_1^{g_3}) = (1, 1)$; the static regime “always treat”

  • g4: $(a_0^{g_4}, a_1^{g_4}) = (0, 1 - l_1)$; the dynamic regime “do not treat at time 0; if l1 = 1 then do not treat at time 1; otherwise treat at time 1”

  • g5: $(a_0^{g_5}, a_1^{g_5}) = (1, 1 - l_1)$; the dynamic regime “treat at time 0; if l1 = 1 then do not treat at time 1; otherwise treat at time 1”

  • g6: $(a_0^{g_6}, a_1^{g_6}) = (1, l_1)$; the dynamic regime “treat at time 0; if l1 = 0 then do not treat at time 1; otherwise treat at time 1”

Note that G contains additional deterministic regimes but we exclude these from the above subset as we observe no individuals following these regimes in the trial depicted in Figure 1. For example, in this trial, we observe no individuals who are untreated at both time 0 and time 1 with L1 = 0. Any deterministic static or dynamic regime that allows this treatment and covariate pattern will contribute a zero weight by the definition of the previous section. Examples of deterministic regimes that would contribute a zero weight are g7 = (0, 0) and g8 = (0, l1).

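Using the definition of the previous section such that, again, $f_k(a_k^g \mid \bar{a}_{k-1}^g, \bar{l}_k) \equiv f^{int}(a_k^g \mid \bar{l}_k, \bar{a}_{k-1}^g, \bar{D}_k = 0)$, we define wt(g) as follows for each g in the subset above:

$wt(g) = f_0(a_0^g \mid l_0) \times f_1(a_1^g \mid l_1 = 0, a_0^g, l_0) \times f_1(a_1^g \mid l_1 = 1, a_0^g, l_0)$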

Specifically, we have by Figure 1:

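  • $wt(g_1) = \frac{40}{100} \times \frac{10}{10} \times \frac{15}{30} = \frac{1}{5}$

  • $wt(g_2) = \frac{60}{100} \times \frac{30}{40} \times \frac{10}{20} = \frac{9}{40}$

  • $wt(g_3) = \frac{60}{100} \times \frac{10}{40} \times \frac{10}{20} = \frac{3}{40}$

  • $wt(g_4) = \frac{40}{100} \times \frac{10}{10} \times \frac{15}{30} = \frac{1}{5}$

  • $wt(g_5) = \frac{60}{100} \times \frac{10}{40} \times \frac{10}{20} = \frac{3}{40}$

  • $wt(g_6) = \frac{60}{100} \times \frac{30}{40} \times \frac{10}{20} = \frac{9}{40}$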

We leave to the reader to confirm that the sum of these weights is one.

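Each “Pr”$[D_2^g = 0]$ is defined by the g-formula

$\sum_{l_1} \Pr[D_2 = 0 \mid \bar{A}_1 = \bar{a}_1^g, L_1 = l_1, L_0 = l_0] \times f(l_1 \mid a_0^g) \times f(l_0)$

where $f(l_0) = 1$. Here we evaluate this expression for all g in the subset:

  • “Pr”$[D_2^{g_1} = 0] = \left(\frac{5}{10} \times \frac{10}{40}\right) + \left(\frac{10}{15} \times \frac{30}{40}\right) = \frac{5}{8}$

  • “Pr”$[D_2^{g_2} = 0] = \left(\frac{15}{30} \times \frac{40}{60}\right) + \left(\frac{5}{10} \times \frac{20}{60}\right) = \frac{1}{2}$

  • “Pr”$[D_2^{g_3} = 0] = \left(\frac{40}{60} \times \frac{10}{10}\right) + \left(\frac{2}{10} \times \frac{20}{60}\right) = \frac{44}{60}$

  • “Pr”$[D_2^{g_4} = 0] = \left(\frac{5}{10} \times \frac{10}{40}\right) + \left(\frac{5}{15} \times \frac{30}{40}\right) = \frac{3}{8}$

  • “Pr”$[D_2^{g_5} = 0] = \left(\frac{10}{10} \times \frac{40}{60}\right) + \left(\frac{5}{10} \times \frac{20}{60}\right) = \frac{5}{6}$

  • “Pr”$[D_2^{g_6} = 0] = \left(\frac{15}{30} \times \frac{40}{60}\right) + \left(\frac{2}{10} \times \frac{20}{60}\right) = \frac{6}{15}$

Finally we have that $\sum_{g \in G}$ “Pr”$[D_2^g = 0] \, wt(g)$ is equivalent to

$\left(\frac{5}{8} \times \frac{1}{5}\right) + \left(\frac{1}{2} \times \frac{9}{40}\right) + \left(\frac{44}{60} \times \frac{3}{40}\right) + \left(\frac{3}{8} \times \frac{1}{5}\right) + \left(\frac{5}{6} \times \frac{3}{40}\right) + \left(\frac{6}{15} \times \frac{9}{40}\right) = \frac{52}{100}$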
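The arithmetic of this example can be checked with exact fractions. The sketch below is an illustration only: the cell counts in `CELLS` are inferred from the fractions quoted in the text (Figure 1 itself is not reproduced here), and the function names and the encoding of a regime as `(a0, (a1 if l1 = 0, a1 if l1 = 1))` are assumptions made for this example.

```python
from fractions import Fraction as F
from itertools import product

# (a0, l1, a1) -> (number of subjects, number with D2 = 0),
# inferred from the fractions quoted in the text.
CELLS = {
    (0, 0, 1): (10, 5),
    (0, 1, 0): (15, 5),
    (0, 1, 1): (15, 10),
    (1, 0, 0): (30, 15),
    (1, 0, 1): (10, 10),
    (1, 1, 0): (10, 5),
    (1, 1, 1): (10, 2),
}


def n(a0=None, l1=None, a1=None):
    """Number of subjects whose (a0, l1, a1) matches the arguments that are not None."""
    return sum(cnt for (x, y, z), (cnt, _) in CELLS.items()
               if a0 in (None, x) and l1 in (None, y) and a1 in (None, z))


def weight(g):
    """wt(g) = f0(a0 | l0) x f1(a1 | l1 = 0, a0, l0) x f1(a1 | l1 = 1, a0, l0)."""
    a0, rule = g
    w = F(n(a0=a0), n())
    for l1 in (0, 1):
        w *= F(n(a0, l1, rule[l1]), n(a0, l1))
    return w


def survival(g):
    """g-formula for Pr[D2 = 0] under regime g (only called when wt(g) > 0)."""
    a0, rule = g
    total = F(0)
    for l1 in (0, 1):
        cnt, alive = CELLS[(a0, l1, rule[l1])]
        total += F(alive, cnt) * F(n(a0, l1), n(a0=a0))
    return total


# All eight deterministic regimes for binary (A0, L1, A1).
regimes = [(a0, (r0, r1)) for a0, r0, r1 in product((0, 1), repeat=3)]
weights = {g: weight(g) for g in regimes}
positive = {g: w for g, w in weights.items() if w > 0}

mixture = sum(w * survival(g) for g, w in positive.items())
trial = F(sum(alive for _, alive in CELLS.values()), n())
```

Under these inferred counts, six regimes receive positive weight, the weights sum to one, and `mixture` equals `trial`, which is 52/100 in lowest terms (13/25).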

Appendix B.

Richardson and Robins (2013) defined a graphical condition based on a d-separation relation (i.e. checking for the absence of “backdoor paths”) that gives general identification for any intervention considered in the classification of §2 or an intervention that depends on the history of the natural value of treatment using the (non-extended) g-formula and extended g-formula, respectively. They further show that, given an appropriate consistency assumption, this graphical condition for identification implies an exchangeability condition analogous to the condition (1) given in §3. In the restricted case, where the intervention does not depend on the history of the natural value of treatment, then this condition is equivalent to the condition (1). We refer the reader to Richardson and Robins (2013) for details of this more general exchangeability condition.

The d-separation condition of Richardson and Robins (2013) is applied to a transformation of a causal DAG (Spirtes et al. 1993; Pearl 2000) representing assumptions on the underlying data generating process that produced the data in the observational study. Richardson and Robins (2013) call this transformation a Single World Intervention Graph (SWIG). We now illustrate how to evaluate identification for different interventions on a time-varying treatment under a simple set of underlying observed data generating assumptions using SWIGs. The examples given here are similar to examples depicted in Figures 19 and 21 in Richardson and Robins (2013).

Remark on notation: In describing how to construct a SWIG associated with any hypothetical intervention under an assumed observed data generating mechanism we will adopt, for this section of the appendix only, the notation of Richardson and Robins (2013). This will create two inconsistencies with notation used in the main text which we now describe, along with our motivation behind this choice. Specifically, in this appendix, we will denote any hypothetical dynamic intervention as g which may, or may not, depend on the natural value of treatment. In the main text, this notation was reserved only for deterministic regimes (dynamic or static) that do not depend on the natural value of treatment. Further, we will change the meaning of one instance of counterfactual notation used in the main text. In particular, $A_k^g$ was used in the main text to denote the counterfactual value of treatment assigned under an intervention g. Here, to be consistent with Richardson and Robins (2013), $A_k^{+g}$ will be used to denote this counterfactual and $A_k^g$ will, alternatively, be used to denote the counterfactual natural value of treatment under g.

We chose not to adopt this more complex notational convention of Richardson and Robins (2013) in the main text as the primary results regarding positivity and semi-parametric estimation of the main text do not require formalization of a counterfactual natural value of treatment. This allows simpler notation in the main text that is consistent with previous work on interventions that do not depend on the natural value of treatment. It also allows a notational bridge to the motivating work by Robins et al. (2004) and Taubman et al. (2009). While we could have used notation fully consistent with the main text in this section of the appendix, we chose to adopt that of Richardson and Robins (2013), the foundational paper on SWIGs, in order to avoid confusion within the newly emerging literature on this topic. We now proceed with our examples.

Consider the simple time-varying observational study depicted in the causal DAG of Figure 2(i) where, as in the example of §A.2, we assume a short follow-up (K = 1) and that no subject fails prior to the end of follow-up. In Figure 2(i), $H_1$ represents an unmeasured common cause of $A_0^* = A_0$ and $A_1^* = A_1$ and $H_2$ an unmeasured common cause of the covariate L and the outcome D.

Figure 2. (i): A causal DAG representing underlying data generating assumptions for a simple time-varying observed data structure. (ii): A SWIG $G(\bar{a})$ based on a transformation of the causal DAG in (i).

The d-separation condition of Richardson and Robins (2013) is evaluated for a given dynamic intervention g based on the following sets of transformations applied to a causal DAG:

  1. Split each treatment node at k into two nodes, with one node containing the natural value of treatment at k and the other a constant value $a_k$.

  2. Index all random variables after time 0 as counterfactuals under a static deterministic intervention $\bar{a}$, including the natural value of treatment.

  3. All arrows out of the observed $A_k$ on the original DAG should now be out of $a_k$ and all arrows into the observed $A_k$ on the original DAG should now be into the counterfactual natural value of treatment $A_k^{\bar{a}_{k-1}}$ (equivalent to the observed $A_0$ at baseline as no intervention has yet been made).

Figure 2(ii) depicts a SWIG derived from the causal DAG in Figure 2(i) under this first set of transformations. A SWIG constructed from these transformations is a non-dynamic SWIG denoted $G(\bar{a})$.

To assess identification for a dynamic intervention g we apply the following additional transformations:

  1. Index all counterfactuals on $G(\bar{a})$ by g rather than by $\bar{a}$ or a subvector thereof.

  2. Replace each constant $a_k$ with the counterfactual $A_k^{+g}$.

  3. Add dashed arrows from any variable temporally prior to $A_k^{+g}$ into $A_k^{+g}$ if treatment at k is assigned by this variable under the intervention g.

A SWIG constructed by applying this second set of transformations to $G(\bar{a})$ is a dynamic SWIG denoted $G(g)$.

Richardson and Robins (2013) prove that a dynamic intervention g is identified if, for each time k, $A_k^g$ and $D^g$ are d-separated conditional on $\bar{A}_{k-1}^g$, $\bar{L}_k^g$, $\bar{A}_{k-1}^{+g}$ in $G(g)$ once we apply the additional k-specific transformation of removing all dashed arrows out of $A_k^g$. This final transformation is only required to evaluate identification when g depends on the history of the natural value of treatment. Richardson and Robins (2013) define this last k-specific transformation of the SWIG $G(g)$ as a new SWIG associated with what they term a perturbed regime at k. Richardson and Robins (2013) note that the aforementioned d-separation holds if and only if there is no unblocked backdoor path between $A_k^g$ and $D^g$ conditional on the same set of variables.

Figure 3 depicts two dynamic SWIGs created from transformations of the non-dynamic SWIG of Figure 2(ii) which differ only by their dependence on the history of the natural value of treatment. The intervention under Figure 3(i) does not depend on any function of the history of the natural value of treatment, as shown by the absence of any dashed arrows from $A_0$ into either $A_0^{+g}$ or $A_1^{+g}$ and the absence of a dashed arrow from $A_1^g$ into $A_1^{+g}$. By contrast, the intervention under Figure 3(ii) depends on this history, as shown by the presence of dashed arrows from $A_0$ into $A_0^{+g}$ and from $A_1^g$ into $A_1^{+g}$.

Figure 3. (i): A SWIG $G(g)$ under which the intervention g does not depend on the history of the natural value of treatment. (ii): A SWIG $G(g)$ under which the intervention g does depend on some function of this history.

By the d-separation condition of Richardson and Robins (2013), we can see that the intervention g in Figure 3(i), under which treatment assignment does not depend on the history of the natural value of treatment, is identified under our data generating assumptions. Specifically, there are no unblocked backdoor paths between $A_0$ and $D^g$. Further, conditional on $L^g$, $A_0^{+g}$ and $A_0$ there is no unblocked backdoor path between $A_1^g$ and $D^g$.

By contrast, we can see that the intervention g in Figure 3(ii), under which treatment assignment does depend on the history of the natural value of treatment, is not identified under our data generating assumptions. Again, applying the d-separation condition of Richardson and Robins (2013), following the transformation to the k = 0 perturbed regime (i.e. removal of the dashed arrow from $A_0$ into $A_0^{+g}$) we still have the unblocked backdoor path $A_0 \leftarrow H_1 \rightarrow A_1^g \rightarrow A_1^{+g} \rightarrow D^g$.

These examples illustrate that, even given we have identification for an intervention that does not depend on the history of the natural value of treatment – e.g., the random dynamic intervention (7)– it is not guaranteed that we will have identification for an intervention that does depend on some function of this history – e.g., the threshold interventions of Taubman et al. (2009) – for all underlying observed data generating mechanisms. However, under additional restrictions on the original data generating assumptions depicted in Figure 2(i), we achieve identification for both of the dynamic regimes considered in Figure 3. For example, this would be the case under either of the following restrictions applied to our initial set of data generating assumptions in Figure 2(i):

  1. The null is true (i.e. the arrows from A0 and A1 into D are removed).

  2. The common cause H1 of A0 and A1 is removed.

Appendix C.

Let $Z_k = (Z_{1,k}, \ldots, Z_{p,k})$ be an arbitrary permutation of the p components in $(L_k, A_k^*)$, noting that

$f(z_k \mid \bar{l}_{k-1}, \bar{a}_{k-1}, \bar{D}_k = 0) = \prod_{j=1}^{p} m_{j,k}^z$

in expression (9) for any k = 0, … , K where $(m_{1,k}^z, \ldots, m_{p,k}^z)$ are conditional densities based on the factorization implied by the user-selected permutation.

For user-chosen K and $h^{user}(\bar{a}_k, a_k^*, \bar{l}_k)$, k = 0, … , K, we do the following:

STEP I: Parametric modelling of conditional densities

Using the n individuals in the data set, for each k = 0, … , K:

  1. If k > 0, fit parametric models for the conditional densities $m_{j,k}^z$, j = 1, … , p.

  2. Fit a parametric model for the conditional probability of the outcome $\Pr[D_{k+1} = 1 \mid \bar{L}_k = \bar{l}_k, \bar{A}_k = \bar{a}_k, \bar{D}_k = 0]$.

STEP II: Monte Carlo simulation under the user-chosen $h^{user}(\bar{a}_k, a_k^*, \bar{l}_k)$

For k = 0, … , K and v = 1, … , n:

  1. If k = 0, set $z_{0,v}$ to the observed values of $Z_0$ for subject v. Otherwise, if k > 0, recursively draw $z_{k,v}$ from the nested conditional densities estimated in step I.1 based on previously drawn confounders through k − 1, $\bar{l}_{k-1,v}$, and assigned treatment $\bar{a}_{k-1,v}$ under the user-chosen intervention.

  2. Assign the treatment $a_{k,v}$ according to the user-chosen intervention. For example, for $h^{user}(\bar{a}_k, a_k^*, \bar{l}_k)$ chosen as $f^d(a_k \mid a_k^*, \bar{l}_k, \bar{a}_{k-1}, \bar{D}_k = 0)$ we set $a_{k,v} = 30$ if $a_{k,v}^* \le 30$ and otherwise set $a_{k,v} = a_{k,v}^*$.

  3. Estimate the probability of failure by k + 1 given survival to k for the vth simulated treatment and confounder history $(\bar{a}_{k,v}, \bar{l}_{k,v})$ based on the estimated coefficients from step I.2.

STEP III: Computation of disease risk by k + 1 under $h^{user}(\bar{a}_k, a_k^*, \bar{l}_k)$

Estimate expression (9), or equivalently expression (3), as

$\frac{1}{n} \sum_{v=1}^{n} \sum_{k=0}^{K} \hat{\Pr}[D_{k+1} = 1 \mid \bar{L}_k = \bar{l}_{k,v}, \bar{A}_k = \bar{a}_{k,v}, \bar{D}_k = 0] \times \prod_{j=0}^{k} \left\{1 - \hat{\Pr}[D_j = 1 \mid \bar{L}_{j-1} = \bar{l}_{j-1,v}, \bar{A}_{j-1} = \bar{a}_{j-1,v}, \bar{D}_{j-1} = 0]\right\}$ (17)

where

$\hat{\Pr}[D_{k+1} = 1 \mid \bar{L}_k = \bar{l}_{k,v}, \bar{A}_k = \bar{a}_{k,v}, \bar{D}_k = 0]$,

k = 0, … , K, is obtained in step II.3.

As discussed in Young et al. (2011), both steps I and II may be modified to avoid reliance on parametric models for histories such that a priori subject matter knowledge on the observed data structure is available.
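Steps II and III can be sketched in a few lines once the models of step I are available. In the Python sketch below the fitted models are replaced by user-supplied callables, and the function names, argument layout, threshold argument, and toy models are all assumptions made for illustration; it is not the software used in the cited applications. The j = 0 factor of the product in (17) is taken to be one because all subjects are at risk at baseline.

```python
import random


def simulate_risk(baseline, draw_z, hazard, K, threshold=30.0, seed=0):
    """Monte Carlo estimate of risk by K + 1 under the threshold rule.

    baseline : list of observed (l_0, a*_0) pairs, one per subject
    draw_z   : (k, l_hist, a_hist, rng) -> (l_k, a*_k); stands in for the
               conditional densities fitted in step I.1 (used for k > 0)
    hazard   : (k, l_hist, a_hist) -> estimated Pr[D_{k+1} = 1 | history, D_k = 0];
               stands in for the model fitted in step I.2
    """
    rng = random.Random(seed)
    total = 0.0
    for z0 in baseline:
        l_hist, a_hist = [], []
        surviving, risk = 1.0, 0.0
        for k in range(K + 1):
            # Step II.1: observed values at k = 0, simulated values afterwards.
            l_k, a_star = z0 if k == 0 else draw_z(k, l_hist, a_hist, rng)
            l_hist.append(l_k)
            # Step II.2: threshold intervention on the natural value a*_k.
            a_hist.append(threshold if a_star <= threshold else a_star)
            # Step II.3: discrete hazard for this simulated history.
            h = hazard(k, l_hist, a_hist)
            # Step III: add the k-th term of the inner sum in (17).
            risk += h * surviving
            surviving *= 1.0 - h
        total += risk
    return total / len(baseline)


# Toy stand-ins for the fitted models (purely illustrative).
def toy_draw_z(k, l_hist, a_hist, rng):
    l_k = rng.random() < 0.3            # binary covariate
    a_star = rng.uniform(0.0, 60.0)     # natural minutes of exercise
    return l_k, a_star


def toy_hazard(k, l_hist, a_hist):
    return 0.05 if a_hist[-1] >= 30.0 else 0.20


baseline = [(False, 10.0), (True, 45.0), (False, 30.0)]
risk = simulate_risk(baseline, toy_draw_z, toy_hazard, K=2)
```

In this toy setup every simulated subject is assigned at least 30 minutes at each time, so the hazard is 0.05 throughout and the estimate equals 1 − 0.95^3. With fitted models in place of the toy callables, the number of simulated histories need not equal n, although the algorithm above uses one simulated history per subject.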

References

  1. Cain LE, Robins JM, Lanoy E, Logan R, Costagliola D, Hernán MA. When to start treatment? a systematic approach to the comparison of dynamic regimes using observational data. International Journal of Biostatistics. 2010;6 Article 18. doi: 10.2202/1557-4679.1212.
  2. Danaei G, Pan A, Hu FB, Hernán MA. Hypothetical lifestyle interventions in middle-aged women and risk of type 2 diabetes: a 24-year prospective study. Epidemiology. 2013;24:122–128. doi: 10.1097/EDE.0b013e318276c98a.
  3. Dawid AP, Didelez V. Identifying optimal sequential decisions. In: McAllester D, Nicholson A, editors. Proceedings of the Twenty-Fourth Annual Conference on Uncertainty in Artificial Intelligence (UAI-08); Corvallis, Oregon: AUAI Press; 2008. pp. 113–120.
  4. Dawid AP, Didelez V. Identifying the consequences of dynamic treatment strategies: A decision-theoretic overview. Statistics Surveys. 2010;4:184–231.
  5. Díaz Muñoz I, van der Laan MJ. Population intervention causal effects based on stochastic interventions. U.C. Berkeley Division of Biostatistics Working Paper Series; 2011; URL http://www.bepress.com/ucbbiostat/paper289. Working Paper 289.
  6. Díaz Muñoz I, van der Laan MJ. Population intervention causal effects based on stochastic interventions. Biometrics. 2012;68:541–549. doi: 10.1111/j.1541-0420.2011.01685.x.
  7. García-Aymerich J, Varraso R, Danaei G, Camargo CA, Hernán MA. Incidence of adult-onset asthma after hypothetical interventions on body mass index and physical activity: an application of the parametric g-formula. American Journal of Epidemiology. 2013 in press. doi: 10.1093/aje/kwt229.
  8. Haneuse S, Rotnitzky A. Estimation of the effect of interventions that modify the received treatment. 2013 Submitted. doi: 10.1002/sim.5907.
  9. Hernán MA, Lanoy E, Costagliola D, Robins JM. Comparison of dynamic treatment regimes via inverse probability weighting. Basic & Clinical Pharmacology & Toxicology. 2006;98:237–242. doi: 10.1111/j.1742-7843.2006.pto_329.x.
  10. Lajous M, Willett WC, Robins JM, Young JG, Rimm E, Mozaffarian D, Hernán MA. Changes in fish consumption in midlife and the risk of coronary heart disease in men and women. American Journal of Epidemiology. 2013;178(3):382–391. doi: 10.1093/aje/kws478.
  11. Murphy SA, van der Laan MJ, Robins JM. Marginal mean models for dynamic regimes. Journal of the American Statistical Association. 2001;96(456):1410–23. doi: 10.1198/016214501753382327.
  12. Orellana L, Rotnitzky A, Robins JM. Dynamic regime marginal structural mean models for estimation of optimal dynamic treatment regimes, Part I: Main Content. International Journal of Biostatistics. 2010a;6 Article 7.
  13. Orellana L, Rotnitzky A, Robins JM. Dynamic regime marginal structural mean models for estimation of optimal dynamic treatment regimes, Part II: Proofs and Additional Results. International Journal of Biostatistics. 2010b;6 Article 8. doi: 10.2202/1557-4679.1242.
  14. Pearl J. Causality. Cambridge University Press; Cambridge, UK: 2000.
  15. Petersen ML, Porter KE, Gruber S, Wang Y, van der Laan MJ. Diagnosing and responding to violations in the positivity assumption. Statistical Methods in Medical Research. 2012;21(1):31–54. doi: 10.1177/0962280210386207.
  16. Picciotto S, Hernán MA, Page JH, Young JG, Robins JM. Structural nested cumulative failure time models to estimate the effects of interventions. Journal of the American Statistical Association. 2012;107(499):886–900. doi: 10.1080/01621459.2012.682532.
  17. Richardson TS, Robins JM. Single World Intervention Graphs (SWIGs): a unification of the counterfactual and graphical approaches to causality. Center for the Statistics and the Social Sciences, University of Washington Series; 2013. URL http://www.csss.washington.edu/Papers/. Working Paper Number 128.
  18. Robins JM. A new approach to causal inference in mortality studies with a sustained exposure period: application to the healthy worker survivor effect. Mathematical Modelling. 1986;7:1393–1512. Errata: Computers and Mathematics with Applications. 1987;14:917–921. Addendum: Computers and Mathematics with Applications. 1987;14:923–945. Errata: Computers and Mathematics with Applications. 1987;18:477.
  19. Robins JM. Causal inference from complex longitudinal data. In: Berkane M, editor. Latent Variable Modeling and Applications to Causality. Lecture notes in statistics 120. Springer-Verlag; 1997. pp. 69–117.
  20. Robins JM. Statistical Models in Epidemiology. Springer; New York: 2000. Marginal structural models versus structural nested models as tools for causal inference; pp. 95–133.
  21. Robins JM, Hernán MA. Estimation of the causal effects of time-varying exposures. In: Fitzmaurice G, Davidian M, Verbeke G, Molenberghs G, editors. Advances in Longitudinal Data Analysis. Chapman and Hall/CRC Press; Boca Raton, FL: 2009. pp. 553–599.
  22. Robins JM, Wasserman L. Estimation of effects of sequential treatments by reparameterizing directed acyclic graphs. In: Geiger D, Shenoy P, editors. Proceedings of the Thirteenth Conference on Uncertainty in Artificial Intelligence; San Francisco: Morgan Kaufmann; 1997. pp. 409–420.
  23. Robins JM, Hernán MA, Siebert U. Effects of multiple interventions. In: Ezzati M, Lopez AD, Rodgers A, Murray CJL, editors. Comparative Quantification of Health Risks: Global and Regional Burden of Disease Attributable to Selected Major Risk Factors. World Health Organization; Geneva: 2004.
  24. Spirtes P, Glymour C, Scheines R. Causation, Prediction and Search. Springer-Verlag; New York: 1993.
  25. Stitelman OM, Hubbard AE, Jewell NP. The impact of coarsening the explanatory variable of interest in making causal inferences: Implicit assumptions behind dichotomizing variables. U.C. Berkeley Division of Biostatistics Working Paper Series. 2010 URL http://www.bepress.com/ucbbiostat/paper264. Working Paper 264.
  26. Taubman SL, Mittleman MA, Robins JM, Hernán MA. Alternative approaches to estimating the effects of hypothetical interventions. JSM Proceedings, Health Policy Statistics Section; Alexandria, VA. American Statistical Association; 2008.
  27. Taubman SL, Robins JM, Mittleman MA, Hernán MA. Intervening on risk factors for coronary heart disease: an application of the parametric g-formula. International Journal of Epidemiology. 2009;38(6):1599–611. doi: 10.1093/ije/dyp192.
  28. Tian J. Identifying dynamic sequential plans. Twenty-fourth Conference on Uncertainty in Artificial Intelligence; AUAI Press; 2008.
  29. van der Laan MJ, Petersen ML, Joffe MM. History-adjusted marginal structural models and statically-optimal dynamic treatment regimens. International Journal of Biostatistics. 2005;1(1) Article 4.
  30. Young JG, Cain LE, Robins JM, O’Reilly EJ, Hernán MA. Comparative effectiveness of dynamic treatment regimes: an application of the parametric g-formula. Statistics in Biosciences. 2011. doi: 10.1007/s12561-011-9040-7.
