Robust and Flexible Estimation of Stochastic Mediation Effects: A Proposed Method and Example in a Randomized Trial Setting

Kara E Rudolph; Oleg Sofrygin; Wenjing Zheng; Mark J van der Laan

doi:10.1515/em-2017-0007

. Author manuscript; available in PMC: 2021 May 20.

Published in final edited form as: Epidemiol Methods. 2017 Dec 13;7(1):20170007. doi: 10.1515/em-2017-0007

Robust and Flexible Estimation of Stochastic Mediation Effects: A Proposed Method and Example in a Randomized Trial Setting

Kara E Rudolph ¹, Oleg Sofrygin ², Wenjing Zheng ³, Mark J van der Laan ²

PMCID: PMC8136358 NIHMSID: NIHMS1554809 PMID: 34026421

Abstract

Background:

Causal mediation analysis can improve understanding of the mechanism s underlying epidemiologic associations. However, the utility of natural direct and indirect effect estimation has been limited by the assumption of no confounder of the mediator-outcome relationship that is affected by prior exposure (which we call an intermediate confounder)–-an assumption frequently violated in practice.

Methods:

We build on recent work that identified alternative estimands that do not require this assumption and propose a flexible and double robust targeted minimum loss-based estimator for stochastic direct and indirect effects. The proposed method intervenes stochastically on the mediator using a distribution which conditions on baseline covariates and marginalizes over the intermediate confounder.

Results:

We demonstrate the estimator’s finite sample and robustness properties in a simple simulation study. We apply the method to an example from the Moving to Opportunity experiment. In this application, randomization to receive a housing voucher is the treatment/instrument that influenced moving with the voucher out of public housing, which is the intermediate confounder. We estimate the stochastic direct effect of randomization to the voucher group on adolescent marijuana use not mediated by change in school district and the stochastic indirect effect mediated by change in school district. We find no evidence of mediation.

Conclusions:

Our estimator is easy to implement in standard statistical software, and we provide annotated R code to further lower implementation barriers.

Keywords: mediation, direct effect, indirect effect, double robust, targeted minimum loss-based estimation, targeted maximum likelihood estimation, data-dependent

1. Introduction

Mediation allows for an examination of the mechanisms driving a relationship. Much of epidemiology entails reporting exposure-outcome associations w here the exposure may be multiple steps removed from the outcome. For example, risk-factor epidemiology demonstrated that obesity increases the risk of type 2 diabetes, but biochemical mediators linking the two have advanced our understanding of the causal relationship (Kahn, Hull, and Utzschneider 2006). Mediators have been similarly important in understanding how social exposures act to affect health outcomes. In the illustrative example we consider in this paper, the Moving to Opportunity (MTO) experiment randomized families living in public housing to receive a voucher that they could use to rent housing on the private market, which reduced their exposure to neighborhood poverty (Kling, Liebman, and Katz 2007). Ultimately, being randomized to receive a voucher affected subsequent adolescent drug use (Orr et al. 2003; Rudolph et al. 2017). In the illustrative example, we test the extent to which the effect operates through a change in the adolescent’s school environment.

Causal mediation analysis (Imai, Keele, and Tingley 2010; Valeri and VanderWeele 2013; Zheng and van der Laan 2012b) (also called mediation analysis using the counterfactual framework (Valeri and VanderWeele 2013)) shares similar goals with the standard mediation approaches, e.g., structural equation modeling and the widely used Baron and Kenny “product method” approach (Baron and Kenny 1986; Valeri and VanderWeele 2013). They all aim to test mechanisms and estimate direct and indirect effects. Advantages of causal mediation analysis include that estimates have a causal interpretation (under specified identifying assumptions) and some approaches make fewer restrictive parametric modeling assumptions. For example, in contrast to traditional approaches, approaches within the causal mediation framework (1) allow for interaction between the treatment and mediator (VanderWeele 2009), (2) allow for modeling nonlinear relationships between mediators and outcomes (VanderWeele 2009), and (3) allow for incorporation of data-adaptive machine learning methods and double robust estimation (Zheng and van der Laan 2012b; Tchetgen and Shpitser 2014).

However, despite these advantages, the assumptions required to estimate certain causal mediation effects may sometimes be untenable; for example, the assumption that there is no confounder of the mediator-outcome relationship that is affected by treatment (in the literature, such a confounder is referred to as confounding by a causal intermediate (Petersen, Sinisi, and van der Laan 2006), a time-varying confounder affected by prior exposure (VanderWeele and Tchetgen Tchetgen 2017), or time-dependent confounding by an intermediate covariate (van der Laan and Gruber 2012)). For brevity, we will refer to such a variable as an intermediate confounder.

There have been recently proposed causal mediation estimands, called randomized (i.e., stochastic) interventional direct effects and interventional indirect effects that do not require this assumption (Didelez, Dawid, and Geneletti 2006; van der Laan and Petersen 2008; Zheng and van der Laan 2012a; VanderWeele, Vansteelandt, and Robins 2014; Vansteelandt and Daniel 2017; VanderWeele and Tchetgen Tchetgen 2017; Zheng and van der Laan 2017). We build on this work, proposing a robust and flexible estimator for these effects, which we call stochastic direct and indirect effects (SDE and SIE)

This paper is organized as follows. In the following section, we review and compare common causal mediation estimands, providing the assumptions necessary for their identification. Then, we describe our proposed estimator, its motivation, and its implementation in detail. Code to implement this method is provided in the Appendix. We then provide results from a limited simulation study demonstrating the estimator’s finite sample performance and robustness properties. Lastly, we apply the method in a longitudinal, randomized trial setting.

2. Notation and causal mediation estimands

Let observed data: O = (W, A, Z, M, Y) with n i.i.d. copies O₁, …, O_n ~ P₀, where W is a vector of pre-treatment covariates, A is the treatment, Z is the intermediate confounder affected by A, M is the mediator, and Y is the outcome. For simplicity, we assume that A, Z, M_, and Y are binary. In our illustrative example, A is an instrument, so it is reasonable to assume that M and Y are not affected by A except through its effect on Z. Mirroring the structural causal model (SCM) of our illustrative example, we assume that M is affected by {Z, W} but not A, and that Y is affected by {M, Z, W} but not A. We assume exogenous random errors: (U_W, U_A, U_Z, U_M, U_Y). This SCM is represented in Figure 1 and the following causal models: W = f (U_W), A = f (U_A), Z = f (A, W, U_Z), M = f (Z, W, U_M), and Y = f (M, Z, W, U_Y). Note that this SCM (including that U_Y and U_M are not affected by A) puts the following assumptions on the probability distribution: P(Y|M, Z, A, W) = P(Y|M, Z, W) and P(M|Z, A, W) = P(M|Z, W). However, our approach generalizes to scenarios where A also affects M and Y as well as to scenarios where A is not random. We provide details and discuss these generalizations in the Appendix. We can factorize the likelihood for the SCM reflecting our illustrative example as follows: P(O) = P(Y|M, Z, W)P(M|Z, W)P(Z|A, W)P(A)P(W).

Figure 1: — Structural causal model reflecting the illustrative example.

Causal mediation analysis typically involves estimating one of two types of estimands: controlled direct effects (CDE) or natural direct and indirect effects (NDE, NIE). Controlled direct effects involve comparing expected outcomes under different values of the treatment and setting the value of the mediator for everyone in the sample. For example the CDE can be defined: E(Y_a,M − Y_a*,m), where Y_a,m, Y_a*,m, is the counterfactual outcome setting treatment A equal to a or a*, respectively (the two treatment values being compared), and setting mediator M equal to m. In contrast, the NDE can be defined: $E (Y_{a, M_{a *}} - Y_{a *, M_{a *}})$ , where $Y_{a, M_{a *}}$ , $Y_{a *, M_{a *}}$ is the counterfactual outcome setting A equal to a or a* but this time setting M_a* to be the counterfactual value of the mediator had A been set to a* (possibly contrary to fact). Similarly, the NIE can be defined: $E (Y_{a, M_{a}} - Y_{a, M_{a *}})$ . Natural direct and indirect effects are frequently used in epidemiology and have the appealing property of adding to the total effect (Pearl 2001).

Although the NDE and NIE are popular estimands, their identification assumptions may sometimes be untenable. Broadly, identification of their causal effects relies on the sequential randomization assumption on intervention nodes A and M and positivity. Two specific ignorability assumptions are required to identify CDEs and NDE/NIEs: (1) A ⊥ Y_a,m|W and (2) M ⊥ Y_a,m|W, A (Pearl 2001). The positivity assumptions are: P(M = m|A = a, W) > 0 a.e. and P(A = a|W) > 0 a.e. Two additional ignorability assumptions are required to identify NDE/NIEs: (3) A ⊥ M_a|W and (4) M_a* ⊥ Y_a,m|W (Pearl 2001). This last assumption states that, conditional on W, knowledge of M in the absence of treatment A provides no information of the effect of A on Y (Petersen, Sinisi, and van der Laan 2006). This assumption is violated when there is a confounder of the M − Y relationship that is affected by A (i.e., an intermediate confounder) (Avin, Shpitser, and Pearl 2005; Petersen, Sinisi, and van der Laan 2006; VanderWeele and Tchetgen Tchetgen 2017). This assumption is also problematic because it involves independence of counterfactuals under separate worlds (a and a*) which can never simultaneously exist.

This last assumption that there is no confounder of the mediator-outcome relationship affected by prior treatment is especially concerning for epidemiology studies where longitudinal cohort data may reflect a data structure in which a treatment affects an individual characteristic measured at follow-up that in turn affects both a mediating variable and the outcome variable (see (Bild et al. 2012; Eaton and Kessler 2012; Phair et al. 1992) for some examples). It is also problematic for mediation analyses involving instrumental variables such as randomized encouragement-design interventions where an instrument, A, encourages treatment uptake, Z, which then may influence Y potentially through M. Such a design is present in the MTO experiment that we will use as an illustrative example. Randomization to receive a housing voucher (A) was the instrument that “encouraged” the treatment uptake, moving with the voucher out of public housing (Z, which we will call the intermediate confounder). In turn, Z may influence subsequent drug use among adolescent participants at follow-up (Y), possibly through a change the children’s school environment (M). In the illustrative example, our goal was to examine mediation of the effect of receiving a housing voucher (A) on subsequent drug use (Y) by changing school districts (M) in the MTO data.

There has been recent work to relax the assumption of no intermediate confounder, M_a* ⊥ Y_a,m|W, by using a stochastic intervention on M (Didelez, Dawid, and Geneletti 2006; van der Laan and Petersen 2008; VanderWeele and Tchetgen Tchetgen 2017; VanderWeele, Vansteelandt, and Robins 2014; Vansteelandt and Daniel 2017; Zheng and van der Laan 2017; 2012a). In this paper, we build on the approach described by VanderWeele and Tchetgen Tchetgen (2017) in which they defined the stochastic distribution on M as: g_M|a,W or g_M|a*,W, where

g_{M | A, W} (m, a *, W) \equiv g_{M | a *, W} (W) = \sum_{z = 0}^{1} P (M = 1 | Z = z, W) P (Z = z | A = a *, W) .

(1)

(eq. 1 is an example of VanderWeele and Tchetgen Tchetgen (2017)’s work.) In other words, instead of formulating the individual counterfactual values of M_a or M_a*, values are stochastically drawn from the distribution of M, conditional on covariates W but marginal over intermediate confounder Z, setting A = a or A = a*, respectively. The corresponding estimands of interest are the $SDE = E (Y_{a, g_{M | a *, W}}) - E (Y_{a *, g_{M | a *, W}})$ , and $SIE = E (Y_{a, g_{M | a, W}}) - E (Y_{a, g_{M | a *, W}})$ .

Others have taken a similar approach. For example, Zheng and van der Laan (2017) formulate a stochastic intervention on M that is fully conditional on the past:

g_{M | Z, A, W} (m, Z, a *, W) \equiv g_{M | Z, a *, W} (Z, W) = P (M = 1 | Z, A = a *, W)

(2)

(note that per our SCM, P(M = 1|Z, W) = P(M = 1|Z, A, W), so in our case, g_M|Z,a*,W(Z, W) = P(M = 1_|Z, W).) The corresponding estimands are the stochastic direct and indirect effects fully conditional on the past: $CSDE = E (Y_{a, g_{M | Z, a *, W}}) - E (Y_{a *, g_{M | Z, a *, W}})$ , and $CSIE = E (Y_{a, g_{M | Z, a, W}}) - E (Y_{a, g_{M | Z, a *, W}})$ . However, Zheng and van der Laan (2017)’s formulation shown in eq. 2 is not useful for understanding mediation under the instrumental variable SCM we consider here, as there is no direct pathway from A to M. Because of the restriction on our statistical model that P(M|Z, A, W) = P(M|Z, W), g_M|Z,a*,W(Z, W) = g_M|Z,a,W(Z, W), so CSIE’s under this model would equal 0. Thus, in this scenario, the NDE and CSDE are very different parameters. We note that it is also because of these restrictions on our statistical model stemming from the instrumental variable SCM that the sequential mediation analysis approach proposed by VanderWeele and Vansteelandt (2014) would also result in indirect effects equal to 0.

Because the CSIE and CSDE do not aid in understanding the role of M as a potential mediator in this scenario, we focus instead on VanderWeele and Tchetgen Tchetgen (2017)’s SDE and SIE that condition on W but marginalize over Z, thus completely blocking arrows into M (similar to an NDE and NIE). The SDE and SIE coincide with the NDE and NIE in the absence of intermediate confounders (VanderWeele and Tchetgen Tchetgen 2017).

2.1. SDE and SIE estimands and identification

Our proposed estimator can be used to estimate two versions of the SDE and SIE: (1) fixed parameters that assume an unknown, true g_M|a*,W; and (2) data-dependent parameters that assume known g_M|a*,W, estimated from the observed data, which we call ĝ_M|a*,W. Researchers may have defensible reasons for choosing one version over the other, which we explain further below. The fixed SDE and SIE can be identified from the observed data distribution using the g-computation formula as discussed by VanderWeele and Tchetgen Tchetgen (2017), assuming the sequential randomization assumption on intervention nodes A and M: (1) A ⊥ Y_a,m|W, (2) M ⊥ Y_a,m|W, A = a, Z, and (3) A ⊥ M_a|W, for a particular a and g_M|a*,W. The data-dependent SDE And SIE can be identified similarly but need only the first two assumptions, (1) A ⊥ Y_a,m|W and (2) M ⊥ Y_a,m|W, A = a, Z, because ĝ_M|a*,W is assumed known. If any of the above identification assumptions are violated, then the statistical estimands will not converge to their true causal quantities.

The estimand $E (Y_{a, {\hat{g}}_{M | a * W}})$ can be identified via sequential regression, which provides the framework for our proposed estimator that follows. For intervention (A = a, M = ĝ_M|a*,W), we have ${\bar{Q}}_{M}^{\hat{g}} (Z, W) \equiv \int_{m \in M} \int_{y \in Y} p (y | m, Z, W) d μ_{Y} (y) {\hat{g}}_{m | a *, W} d μ_{M} (m)$ , where we integrate out M under our stochastic intervention ĝ_M|a*,W, and where M has support $M$ and Y has support $Y$ and where dμ_Y(y) and dμ_M(m) are some dominating measures. This is accomplished by evaluating E(Y|M = m, Z = z, W) at each m and multiplying it by the probability that M = m under ĝ_M|a*,W, summing over all m. We then integrate out Z and set A = a: ${\bar{Q}}_{Z}^{a} (W) \equiv \int_{z \in Z} {\bar{Q}}_{M}^{\hat{g}} (z, W | A = a, W) p (z | A = a, W) d μ_{Z} (z)$ , where $Z$ denotes the support of random variable Z and dμ_Z(_z) is some dominating measure. Marginalizing over the distribution of W gives the statistical parameter: $Ψ (P) (a, {\hat{g}}_{M | a *, W}) = \int_{w \in W} {\bar{Q}}_{Z}^{a} (w) p (w) d μ_{W} (w)$ , where $W$ denotes the support of random variable W and dμ_W(w) is some dominating measure.

In the next section, we propose a novel, robust substitution estimator that can be used to estimate both the fixed and parametric versions of the SDE and SIE. Inference for the fixed SDE and SIE can be obtained by using the bootstrapped variance, which requires parametric models for the nuisance parameters P(A) and P(M|Z, A, W). This is the same inference strategy as proposed by VanderWeele and Tchetgen Tchetgen (2017). However, researchers may encounter scenarios for which fitting parametric models is unappealing and using machine learning approaches is preferred. The data-dependent SDE and SIE with inference based on the efficient influence curve (EIC) may be preferable in such scenarios. In contrast to the EIC for an assumed known g_M|a*,W, the EIC an unknown g_M|a*,W is complicated due to the dependence of the unknown marginal stochastic intervention on the data distribution. Such an EIC would include an M component, the form of which would be more complex due to the distribution of M being marginalized over Z. No statistical tools for solving an EIC of that form currently exist. Solving the EIC for the the parameter Ψ(P)(a, g_M|a*,W) for an unknown g_M|a*,W is ongoing work.

3. Targeted minimum loss-based estimator

Our proposed estimator uses targeted minimum loss-based estimation (TMLE) (van der Laan and Rubin 2006), targeting the stochastic, counterfactual outcomes that comprise the SDE and SIE. To our knowledge, it is the first such estimator appropriate for instrumental variable scenarios. TMLE is a substitution estimation method that solves the EIC estimating equation. Its robustness properties differ for the fixed and data-dependent parameters. For the data-dependent SDE and SIE, if either the Y model is correct or the A and M models given the past are correct, then one obtains a consistent estimator of the parameter. Robustness to misspecification of the treatment model is relevant under an SCM with nonrandom treatment; we discuss the generalization of our proposed estimator to such an SCM in the Appendix. Note that ĝ_M|a*,W for the stochastic intervention is not the same as the conditional distribution of M given the past, so the first could be inconsistent while the latter is consistent. For the fixed SDE and SIE, we also need to assume consistent estimation of g_M|a*,W, since it does not target g_M|a*,W (and is therefore not a full TMLE for the fixed parameters).

The estimator integrates two previously developed TMLEs: one for stochastic interventions (Muñoz and van der Laan 2012) and one for multiple time-point interventions (van der Laan and Gruber 2012), which is built on the iterative/recursive g-computation approach (Bang and Robins 2005). This TMLE is not efficient under the SCM considered, because of the restriction on our statistical model that P(Y|M, Z, A, W) = P(Y|M, Z, W). However, it is still a consistent estimator if that restriction on our model does not hold (i.e., P(Y|M, Z, A, W) ≠ P(Y|M, Z, W)), because the targeting step adds dependence on A.

The TMLE is constructed using the sequential regressions described in the above section with an additional targeting step after each regression. The TMLE solves the EIC for the target parameter that treats g_M|a*,W as given. A similar EIC has been described previously (Bang and Robins 2005; van der Laan and Gruber 2012).

The EIC for the parameter Ψ(P)(a, g_M|a*,W) for a given g_M|a*,W is given by:

D * (a, g_{M | a *, W}) = \sum_{k = 0}^{2} D_{k}^{*} (a, g_{M | a *, W}), where D_{0}^{*} (a, g_{M | a *, W}) = {\bar{Q}}_{Z}^{a} (W) - Ψ (P) (a, g_{M | a *, W}) D_{1}^{*} (a, g_{M | a^{*}, W}) = \frac{I (A = a)}{P (A = a | W)} ({\bar{Q}}_{M}^{g} (Z, W) - {\bar{Q}}_{Z}^{a} (W)) D_{2}^{*} (a, g_{M | a *, W}) = \frac{I (A = a) {I (M = 1) g_{M | a *, W} + I (M = 0) (1 - g_{M | a *, W})}}{P (A = a | W) {I (M = 1) g_{M | z, W} + I (M = 0) (1 - g_{M | z, W})}} \times (Y - {\bar{Q}}_{Y} (M, Z, W)) .

(3)

We now describe how to compute the TMLE. In doing so, we use parametric model/regression language for simplicity but data-adaptive estimation approaches that incorporate machine learning (e.g.,van der Laan, Polley, and Hubbard 2007) may be substituted and may be preferable (we use such a data-adaptive approach in the illustrative example analysis). We note that survey or censoring weights could be incorporated into this estimator as described previously (Rudolph et al. 2014). We use notation reflecting estimation of the data-dependent parameters, but note that estimation of the fixed parameters would be identical–-in the fixed parameter case, the notation would refer to g_M|a*,W instead of ĝ_M|a*,W.

First, one estimates ĝ_M|a*,W(W), which is the estimate of g_M|a*,W(W), defined in eq. 1, using observed data. Consider a binary Z. We estimate g_Z|a*,W(W) = P(Z = 1|A = a*, W). We then estimate g_M|z,W(W) = P(M = 1|Z = z, W) for z ∈ {0, 1}. We use these quantities to calculate ĝ_M|a*,W = ĝ_M|z=1,Wĝ_Z|a*,W + ĝ_M|z=0,W(1 − ĝ_Z|a*,W). We can obtain ĝ_Z|a*,W(W) from a logistic regression of Z on A, W setting A = a*, and ĝ_M|z,W(W) from a logistic regression of M on Z, W, setting Z = {0, 1}. We will then use this stochastic intervention in the TMLE, whose implementation is described as follows.

Let ${\bar{Q}}_{Y, n} (M, Z, W)$ be an estimate of ${\bar{Q}}_{Y} (M, Z, W) \equiv E (Y | M, Z, W)$ . To obtain ${\bar{Q}}_{Y, n} (M, Z, W)$ , predict values of Y from a regression of Y on M, Z, W.
Estimate the weights to be used for the initial targeting step: $h_{1} (a) = \frac{I (A = a) {I (M = 1) {\hat{g}}_{M | a *, W} + I (M = 0) (1 - {\hat{g}}_{M | a *, W})}}{P (A = a) {I (M = 1) g_{M | Z, W} + I (M = 0) (1 - g_{M | Z, W})}}$ , where estimates of g_M|Z,W are predicted probabilities from a logistic regression of M = m on Z and W. Let ĥ_1,n(a) denote the estimate of h₁(a).
Target the estimate of ${\bar{Q}}_{Y, n} (M, Z, W)$ by considering a univariate parametric submodel { ${\bar{Q}}_{Y, n} (M, Z, W) (ϵ) : ϵ$ } defined as: $logit ({\bar{Q}}_{Y, n} (ϵ) (M, Z, W)) = logit ({\bar{Q}}_{Y, n} (M, Z, W)) + ϵ$ . Let ϵ_n be the MLE fit of ϵ. We obtain ϵ_n by setting ϵ as the intercept of a weighted logistic regression model of Y with $logit ({\bar{Q}}_{Y, n} (M, Z, W))$ as an offset and weights ${\hat{h}}_{1, n} (a)$ . (Note that this is just one possible TMLE.)The update is given by ${\bar{Q}}_{Y, n}^{*} (M, Z, W) = {\bar{Q}}_{Y, n} (ϵ_{n}) (M, Z, W)$ can be bounded to the [0,1] scale as previously recommended (Gruber and van der Laan 2010).
Let ${\bar{Q}}_{M, n}^{g} (Z, W)$ be an estimate of ${\bar{Q}}_{M}^{g} (Z, W)$ . To obtain ${\bar{Q}}_{M, n}^{g} (Z, W)$ , we integrate out M to from ${\bar{Q}}_{Y, n}^{*} (M, Z, W)$ . First, we estimate ${\bar{Q}}_{Y, n}^{*} (M, Z, W)$ setting m = 1 and m = 0, giving ${\bar{Q}}_{Y}^{*} (m = 1, z, w)$ and ${\bar{Q}}_{γ}^{*} (m = 0, z, w)$ . Then, multiply these predicted values by their probabilities under ĝ_M|a*,W(W) (for a ∈ {a, a*}), and add them together (i.e., ${\bar{Q}}_{M, n}^{\tilde{g}} (Z, W) = {\hat{Q}}_{Y}^{*} (m = 1, z, w) {\hat{g}}_{M | a *, W} + {\hat{Q}}_{Y}^{*} (m = 0, z, w) (1 - {\hat{g}}_{M | a *, W})$ ).
We now fit a regression of ${\bar{Q}}_{M, n}^{\hat{g}, *} (Z, W)$ on W among those with A = a. We call the predicted values from this regression ${\bar{Q}}_{Z, n}^{a} (W)$ . The empirical mean of these predicted values is the TMLE estimate of $Ψ (P) (a, {\hat{g}}_{M | a *, W})$ .
Repeat the above steps for each of the interventions. For example, for binary A, we would execute these steps a total of three times to estimate: 1) Ψ(P)(1, ĝ_M|1,W), 2) Ψ(P)(1, ĝ_M|0W), and 3) Ψ(P)(0, ĝ_M|0,W).
The SDE can then be obtained by substituting estimates of parameters Ψ(P)(a, ĝ_M|a*,W) − Ψ(P)(a*, ĝ_M|a*,W) and the SIE can be obtained by substituting estimates of parameters Ψ(P)(a, ĝ_M|a,W) − Ψ(P)(a, ĝ_M|a*,W).
For the fixed parameters, the variance can be estimated using the bootstrap. For the data-dependent parameters, the variance of each estimate from Step 6 can be estimated as the sample variance of the EIC (defined above, substituting in the targeted fits ${\bar{Q}}_{Y, n}^{*} (M, Z, W)$ and ${\bar{Q}}_{z, n}^{a, *} (W)$ ) divided by n. First, we estimate the EIC for each component of the data-dependent SDE/SIE, which we call $E I C_{Ψ (P) (a, {\hat{g}}_{M | a *, W})}$ . Then we estimate the EIC for the estimand of interest by subtracting the EICs corresponding to the components of the estimand. For example $E I C_{SDE} = E I C_{Ψ (P) (a, {\hat{g}}_{M | a *, W})} - E I C_{Ψ (P) (a *, {\hat{g}}_{M | a *, W})}$ . The sample variance of this EIC divided by n is the influence curve (IC)-based variance of the data-dependent estimator.

4. Simulation

4.1. Data generating mechanism

We conduct a simulation study to examine finite sample performance of the TMLE estimators for the fixed SDE and SIE and data-dependent SDE and SIE from the data-generating mechanism (DGM) shown in Table 1. Under this DGM, the data-dependent SDE is: $SDE = E (Y_{1, {\hat{g}}_{M | 0, W}}) - E (Y_{0, {\hat{g}}_{M | 0, W}})$ and the SIE is: $SIE = E (Y_{1, {\hat{g}}_{M | 1, W}}) - E (Y_{1, {\hat{g}}_{M | 0, W}})$ . The fixed versions are defined with respect to the unknown, true g_M|1,W and g_M|0,W. Table 1 uses the same notation and SCM as in Section 2, with the addition of Δ, an indicator of selection into the sample (which corresponds to the MTO data used in the empirical illustration where one child from each family is selected to participate).

Table 1:

Simulation data-generating mechanism.

W₁ ~ Ber(0.5)	P(W₁ = 1) = 0.50
W₂ ~ Ber(0.4 + 0.2W₁)	P(W₂ = 1) = 0.50
Δ ~ Ber(–1 + log(4)W₁ + log(4)W₂	P(Δ = 1) = 0.58
A = ΔA, where A ~ Ber (0.5)	P(A = 1) = 0.50
Z = ΔZ, where Z ~ Ber(log(4)A – log(2)W₂)	P(Z = 1) = 0.58
M = ΔM, where M ~ Ber(−log(3) + log(10)Z – log(1.4)W₂)	P(M = 1) = 0.52
Y = ΔY, where Y ~ Ber(log(1.2) + log(3)Z + log(3)M − log(1.2)W₂ + log(1.2)ZW₂)	P(Y = 1) = 0.76

Open in a new tab

We compare performance of the TMLE estimator to an inverse-probability weighted estimator (IPTW) and estimator that solves the EIC estimating equation (EE) but differs from TMLE in that it lacks the targeting steps and is not a plug-in estimator, so its estimates are not guaranteed to lie within the parameter space (which may be particularly relevant for small sample sizes). Variance for the fixed SDE and SIE parameters is calculated using 500 bootstrapped samples for each simulation iteration. Variance for the data-dependent SDE and SIE is calculated using the EIC. We show estimator performance in terms of absolute bias, percent bias, closeness to the efficiency bound (mean estimator standard error (SE) × the square root of the number of observations), 95% confidence interval (CI) coverage, and mean squared error (MSE) across 1,000 simulations for sample sizes of N=5,000, N=500, and N=100. In addition, we consider (1) correct specification of all models, and (2) misspecification of the Y model that included a term for Z only.

4.2. Performance

Table 2 gives simulation results under correct model specification for fixed SDE and SIE using bootstrap-based variance. Table 3 gives simulation results under correct model specification for data-dependent SDE and SIE using IC-based variance. Both Tables 2 and 3 show that the TMLE, IPTW, and EE estimators are consistent when all models are correctly specified, showing biases of around 1% or less under large sample size (N=5,000) and slightly larger biases with smaller sample sizes. The 95% CIs for the TMLE and EE estimators result in similar coverage that is close to 95%, except for estimation of the SIE with a sample size of N=100, which has coverage closer to 90%. Confidence intervals for the IPTW estimator for the fixed parameter are close to 95% but are conservative and close to 100% for the data-dependent parameter. As expected, IPTW is less efficient than TMLE or EE; the TMLE and EE estimators perform similarly and close to the efficiency bound for all sample sizes.

Table 2:

Simulation results for fixed SDE and SIE using bootstrapped-based variance (500 boostrapped samples) under correct specification of all parametric models for various sample sizes. 1,000 simulations. Estimation methods compared include targeted minimum loss-based estimation (TMLE), inverse probability weighting estimation (IPTW), and solving the estimating equation (EE). Bias and MSE values are averages across the simulations. The estimator standard error × $\sqrt{n}$ should be compared to the efficiency bound, which is 1.07 for the SDE and 0.24 for the SIE.

Estimand	Bias	%Bias	SE× $\sqrt{n}$	95%CI Cov	mse
		N=5,000
TMLE
SDE	−3.95e-04	−0.58	1.11	94.1	2.50e-04
SIE	5.51e-05	0.17	0.25	94.9	1.29e-05
IPTW
SDE	5.29e-04	0.78	1.69	95.2	3.39e-04
SIE	8.96e-06	0.03	0.42	94.5	5.42e-04
EE
SDE	7.90e-04	1.17	1.11	93.7	2.76e-04
SIE	2.53e-04	0.96	0.25	94.8	1.22e-05
		N=500
TMLE
SDE	1.87e-04	0.27	1.31	94.4	2.46e-03
SIE	−2.91e-04	−1.10	0.41	94.7	7.89e-04
IPTW
SDE	−2.79e-03	−4.12	1.68	94.7	5.60e-03
SIE	1.44e-03	5.45	0.44	95.0	3.64e-04
EE
SDE	−1.69e-03	−2.49	1.10	94.0	2.49e-03
SIE	3.22e-04	1.22	0.26	94.0	1.27e-04
		N=100
TMLE
SDE	6.78e-03	10.02	1.09	98.4	1.36e-02
SIE	−1.58e-03	−5.98	0.25	87.9	1.26e-04
IPTW
SDE	−3.20e-03	−4.73	1.68	94.1	2.96e-02
SIE	−5.77e-04	−2.18	0.53	95.2	2.02e-03
EE
SDE	2.83e-03	4.18	1.09	94.9	1.19e-02
SIE	−5.25e-04	−1.98	0.31	93.0	7.49e-04

Open in a new tab

Table 3:

Simulation results for data-dependent SDE and SIE using influence curve-based variance under correct specification of all parametric models for various sample sizes. 1,000 simulations. Estimation methods compared include targeted minimum loss-based estimation (TMLE), inverse probability weighting estimation (IPTW), and solving the estimating equation (EE). Bias and MSE values are averages across the simulations. The estimator standard error × $\sqrt{n}$ should be compared to the efficiency bound, which is 1.07 for the SDE and 0.24 for the SIE.

Estimand	Bias	%Bias	SE× $\sqrt{n}$	95%CI Cov	mse
TMLE		N=5,000
SDE	1.08e-03	1.61	1.11	93.09	2.77e-04
SIE	8.21e-06	0.03	0.24	94.89	1.10e-05
IPTW
SDE	7.87e-04	1.12	2.28	99.40	6.16e-04
SIE	6.51e-06	0.05	1.18	100.00	3.74e-05
EE
SDE	1.20e-03	1.79	1.12	93.71	2.76e-04
SIE	1.85e-05	0.06	0.24	95.21	1.09e-05
		N=500
TMLE
SDE	7.55e-04	1.14	1.10	95.50	2.29e-03
SIE	−4.33e-04	−1.36	0.23	94.59	1.20e-04
IPTW
SDE	6.28e-03	9.00	2.29	98.80	5.75e-03
SIE	−1.90e-03	−6.44	1.19	100.00	3.76e-04
EE
SDE	8.27e-04	1.24	1.11	95.51	2.32e-03
SIE	−3.35e-04	−0.92	0.24	94.31	1.24e-04
TMLE		N=100

SDE	6.34e-03	8.81	1.07	95.50	1.30e-02
SIE	−1.90e-03	−7.85	0.21	87.99	7.45e-04
IPTW
SDE	−1.07e-02	−16.87	2.31	99.40	2.84e-02
SIE	−1.35e-03	−5.91	1.18	100.00	2.29e-03
EE
SDE	1.29e-03	1.27	1.10	97.01	1.21e-02
SIE	2.44e-04	0.15	0.23	90.12	7.94e-04

Open in a new tab

Table 4 gives simulation results under misspecification of the outcome model that only includes a term for Z for fixed SDE and SIE using bootstrap-based variance. Thus, comparing results in Table 4 to Table 2 demonstrates robustness to misspecification of the outcome model. As all three of the estimators evaluated are theoretically robust to misspecification of this model, we would expect similar results between the two Tables, and we see that is indeed the case.

Table 4:

Simulation results for fixed SDE and SIE using bootstrapped-based variance (500 bootstrapped samples) under misspecification of the outcome model for various sample sizes. 1,000 simulations. Estimation methods compared include targeted minimum loss-based estimation (TMLE), inverse probability weighting estimation (IPTW), and solving the estimating equation (EE). Bias and MSE values are averages across the simulations. The estimator standard error × $\sqrt{n}$ should be compared to the efficiency bound, which is 1.07 for the SDE and 0.24 for the SIE.

Estimand	Bias	%Bias	SE× $\sqrt{n}$	95%CI Cov	mse
		N=5,000
TMLE
SDE	−2.21e-03	0.16	1.13	95.3	2.38e-04
SIE	1.79e-04	0.678	0.25	95.5	1.23e-05
IPTW
SDE	−1.25e-03	−1.85	1.68	93.8	5.76e-04
SIE	4.56e-04	1.72	0.42	93.7	3.59e-05
EE
SDE	−1.81e-04	−0.27	1.13	94.8	2.36e-04
SIE	3.07e-04	1.16	0.26	94.3	1.34e-05
		N=500
TMLE
SDE	1.58e-03	2.33	1.12	94.6	2.49e-03
SIE	7.00e-05	0.26	0.26	95.1	1.30e-04
IPTW
SDE	−3.08e-03	−4.55	1.70	94.8	5.74e-03
SIE	2.25e-04	0.85	0.44	94.5	3.86e-04
EE
SDE	1.86e-03	2.75	1.12	93.1	2.63e-03
SIE	−4.24e-04	−1.60	0.26	94.1	1.28e-04
		N=100
TMLE
SDE	6.93e-04	1.02	1.11	93.4	1.33e-02
SIE	−7.91e-04	−2.99	0.28	90.1	6.99e-04
IPTW
SDE	6.76e-03	9.99	1.77	94.4	3.05e-02
SIE	−2.21e-03	−8.37	0.55	94.7	2.09e-03
EE
SDE	4.81e-03	7.10	1.16	93.6	1.28e-02
SIE	1.09e-03	4.12	0.34	93.1	8.85e-04

Open in a new tab

5. Empirical illustration

5.1. Overview and set-up

We now apply our proposed estimator to MTO: a longitudinal, randomized trial that is described above. Because we wish to use machine learning for this empirical illustration, we will estimate the data-dependent SDE of being randomized to receive a housing voucher (A) on marijuana use (Y) not mediated by change in school district (M) and the data-dependent SIE mediated by M among adolescent boys in the Boston site in the presence of an intermediate confounder (Z), moving with the voucher out of public housing.

We restrict to adolescents less than 18 years old who were present at interim follow-up, as those participants had school data and were eligible to be asked about marijuana use. We restrict to boys in the Boston site as previous work has shown important quantitative and qualitative differences in MTO’s effects by sex (Sanbonmatsu et al. 2011; Kling, Ludwig, and Katz 2005; Orr et al. 2003; Leventhal and Dupéré 2011; Osypuk et al. 2012b; 2012a) and by city (Rudolph et al. 2017). We choose to present results from a restricted analysis instead of a stratified analysis, as our goal is to illustrate the proposed method. A more thorough mediation analysis considering all sexes and sites is the subject of a future paper.

Marijuana use was self-reported by adolescents at the interim follow-up, which occurred 4–7 years after baseline, and is defined as ever versus never use. Change in school district is defined as the school at follow-up and school at randomization being in the same district. Numerous baseline characteristics included individual and family sociodemographics, motivation for participating in the study, neighborhood perceptions, school-related characteristics of the adolescent, and predictive interactions.

We use machine learning to flexibly and data-adaptively model the following relationships: instrument to intermediate confounder, intermediate confounder to mediator, and mediator to outcome. Specifically, we use least absolute shrinkage and selection operator (lasso) (Tibshirani 1996) and choose the model that improves 10-fold cross-validation prediction error, while always including age and race/ethnicity and relevant A, Z, and M variables.

5.2. Results

Figure 2 shows the data-dependent SDE and SIE estimates by type of estimator (TMLE, IPTW, and EE) for boys in the Boston MTO site (N=228). SDE and SIE estimates are similar across estimators. We find no evidence that change in school district mediated the effect of being randomized to the voucher group on marijuana use, with null SIE estimates (TMLE risk difference: − 0.003, 95% CI: − 0.032, 0.026). The direct effect of randomization to the housing voucher group on marijuana use suggests that boys who were randomized to this group were 9% more likely to use marijuana than boys in the control group, though this difference is not statistically significant (TMLE risk difference: 0.090, 95% CI: −0.065–0.245).

6. Discussion

We proposed robust targeted minimum loss-based estimators to estimate fixed and data-dependent stochastic direct and indirect effects that are the first to naturally accommodate instrumental variable scenarios. These estimators build on previous work identifying and estimating the SDE and SIE (VanderWeele and Tchetgen Tchetgen 2017). The SDE and SIE have the appealing properties of (1) relaxing the assumption of no intermediate confounder affected by prior exposure, and (2) utility in studying mediation in the context of instrumental variables that adhere to the exclusion restriction assumption (a common assumption of instrumental variables which states that there is no direct effect between A and Y or between A and M (Angrist, Imbens, and Rubin 1996)) due to completely blocking arrows into the mediator by marginalizing over the intermediate confounder, Z. Given the restrictions that this assumption places on the statistical model, several alternative estimands are not appropriate for understanding mediation in this context as the indirect effect would always equal zero (e.g., Tchetgen Tchetgen 2013; VanderWeele and Vansteelandt 2014; Zheng and van der Laan 2017).

Inference for the fixed SDE and SIE can be obtained from bootstrapping, using parametric models for nuisance parameters. Inference for the data-dependent SDE and SIE can be obtained from the data-dependent EIC that assumes known ${\hat{g}}_{M | a *, W}$ estimated from the data, and is appropriate for integrating machine learning in modeling nuisance parameters. The ability to incorporate machine learning is a significant strength in this case; if using the parametric alternative, multiple models would need to be correctly specified (VanderWeele and Tchetgen Tchetgen 2017). IC-based variance is possible in estimating the data-dependent SDE and SIE, because the data-dependent EIC has a form that is solvable using existing statistical tools; in contrast, the EIC for the fixed parameters is more complex and is not solvable with current statistical tools.

Our proposed estimator for the fixed and data-dependent parameters is simple to implement in standard statistical software, and we provide R code to lower implementation barriers. Another advantage of our TMLE estimator, which is shared with other estimating equation approaches, is that it is robust to some model misspecification. In estimating the data-dependent SDE and SIE, one could obtain a consistent estimate as long as either the Y model or the A and M models given the past were correctly specified. Obtaining a consistent estimate of the fixed SDE and SIE would also require consistent estimation of g_M|a*,W. In addition, our proposed estimation strategy is less sensitive to positivity violations than weighting-based approaches. First, TMLE is usually less sensitive to these violations than weighting estimators, due in part to it being a substitution estimator, which means that its estimates lie within the global constraints of the statistical model. This is in contrast to alternative estimating equation approaches, which may result in estimates that lie outside the parameter space. Second, we formulate our TMLE such that the targeting is done as a weighted regression, which may smooth highly variable weights (Stitelman, De Gruttola, and van der Laan 2012). In addition, moving the targeting into the weights improves computation time (Stitelman, De Gruttola, and van der Laan 2012).

However, there are also limitations to the proposed approach. We have currently only implemented it for a binary A and M, though extensions to multinomial or continuous versions of those variables are possible (Rosenblum and van der Laan 2010; Diaz and Rosenblum 2015). Extending the estimator to allow for a high-dimensional M is less straightforward, though it is of interest and an area for future work as allowing for high-dimensional M is a strength of other mediation approaches (Tchetgen Tchetgen 2013; Zheng and van der Laan 2017). We also plan to focus future work on developing a full TMLE for the fixed SDE and SIE parameters.

Supplementary Material

Supplement

NIHMS1554809-supplement-Supplement.pdf^{(285.9KB, pdf)}

Footnotes

Supplemental Material: The online version of this article offers supplementary material (https://doi.org/10.1515/em-2017-0007).

References

Angrist JD, Imbens GW, and Rubin DB (1996). Identification of causal effects using instrumental variables. Journal of the American statistical Association, 91: 444–455. [Google Scholar]
Avin C, Shpitser I, and Pearl J (2005). Identifiability of Path-specific Effects, 357–363 San Francisco, CA: Morgan Kaufmann Publishers Inc. [Google Scholar]
Bang H and Robins JM (2005). Doubly robust estimation in missing data and causal inference models. Biometrics, 61:962–973. [DOI] [PubMed] [Google Scholar]
Baron RM, and Kenny DA (1986). The moderator–mediator variable distinction in social psychological research: Conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51: 1173. [DOI] [PubMed] [Google Scholar]
Bild DE, Bluemke DA, Burke GL, Detrano R, Roux AVD, Folsom AR, Greenland P, Jacobs DR Jr., Kronmal R, Liu K, et al. (2002). Multi-ethnic study of atherosclerosis: objectives and design. American Journal of Epidemiology, 156: 871–881. [DOI] [PubMed] [Google Scholar]
Diaz I, and Rosenblum M. (2015). Targeted maximum likelihood estimation using exponential families. The International Journal of Biostatistics, 11: 233–251. [DOI] [PubMed] [Google Scholar]
Didelez V, Dawid AP, and Geneletti S (2006). Direct and indirect effects of sequential treatments. In: Proceedings of the Twenty-Second Conference on Uncertainty in Artificial Intelligence, 138–146. AUAI Press. [Google Scholar]
Eaton WW, and Kessler LG. (2012). Epidemiologic field methods in psychiatry: the NIMH Epidemiologic Catchment Area Program. Orlando, FL: Academic Press. [Google Scholar]
Gruber S, and van der Laan MJ (2010).A targeted maximum likelihood estimator of a causal effect on a bounded continuous outcome. The International Journal of Biostatistics, 6. [DOI] [PMC free article] [PubMed] [Google Scholar]
Imai K, Keele L, and Tingley D (2010).A general approach to causal mediation analysis. Psychological Methods, 15:309. [DOI] [PubMed] [Google Scholar]
Kahn SE, Hull RL, and Utzschneider KM (2006). Mechanisms linking obesity to insulin resistance and type 2 diabetes. Nature, 444:840–846. [DOI] [PubMed] [Google Scholar]
Kling JR, Liebman JB, and Katz LF (2007). Experimental analysis of neighborhood effects. Econometrica, 75:83–119. [Google Scholar]
Kling JR, Ludwig J, and Katz LF. Neighborhood effects on crime for female and male youth: evidence from a randomized housing voucher experiment. The Quarterly Journal of Economics 2005, 120(1):87–130. [Google Scholar]
Leventhal T, and Dupéré V (2011).Moving to opportunity: Does long-term exposure to low-povertyneighborhoods make a difference for adolescents? Social Science & Medicine, 73:737–743. [DOI] [PubMed] [Google Scholar]
Muñoz ID, and van der Laan M (2012). Population intervention causal effects based on stochastic interventions. Biometrics, 68:541–549. [DOI] [PMC free article] [PubMed] [Google Scholar]
Orr L, Feins J, Jacob R, Beecroft E, Sanbonmatsu L, Katz LF, Liebman JB, and Kling JR. (2003). Moving to opportunity: interim impacts evaluation. Washington, D.C.: U.S. Department of Housing and Urban Development. [Google Scholar]
Osypuk TL, Schmidt NM, Bates LM, Tchetgen-Tchetgen EJ, Earls FJ, and Glymour MM (2012a). Gender and crime victimization modify neighborhood effects on adolescent mental health. Pediatrics, 130:472–481. [DOI] [PMC free article] [PubMed] [Google Scholar]
Osypuk TL, Tchetgen EJT, Acevedo-Garcia D, Earls FJ, Lincoln A, Schmidt NM, and Glymour MM (2012b). Differential mental health effects of neighborhood relocation among youth in vulnerable families: results from a randomized trial. Archives of General Psychiatry, 69:1284–1294. [DOI] [PMC free article] [PubMed] [Google Scholar]
Pearl J (2001). Direct and Indirect Effects. Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence, 411–420. San Francisco, CA: Morgan Kaufmann Publishers Inc. [Google Scholar]
Petersen ML, Sinisi SE, and van der Laan MJ (2006). Estimation of direct causal effects. Epidemiology, 17:276–284. [DOI] [PubMed] [Google Scholar]
Phair J, Jacobson L, Detels R, Rinaldo C, Saah A, Schrager L, and Muñoz A (1992). Acquired immune deficiency syndrome occurring within 5 years of infection with human immunodeficiency virus type-1: the multicenter aids cohort study. JAIDS Journal of Acquired Immune Deficiency Syndromes, 5:490–496. [PubMed] [Google Scholar]
Rosenblum M, and van der Laan MJ. 2010. Targeted maximum likelihood estimation of the parameter of a marginal structural model. The International Journal of Biostatistics 6(2): Article 19. DOI: 10.2202/1557-4679.1238. [DOI] [PMC free article] [PubMed] [Google Scholar]
Rudolph KE, Díaz I, Rosenblum M, and Stuart EA. 2014. Estimating population treatment effects from a survey subsample. American Journal of Epidemiology, 180(7):737–748. [DOI] [PMC free article] [PubMed] [Google Scholar]
Rudolph KE, Schmidt NM, Glymour MM, Crowder RE, Galin J, Ahern J, and Osypuk TL (2017). Composition or context: using transportability to understand drivers of site differences in a large-scale housing experiment. Epidemiology. DOI: 10.1097/EDE.0000000000000774. [DOI] [PMC free article] [PubMed] [Google Scholar]
Sanbonmatsu L, Ludwig J, Katz LF, Gennetian LA, Duncan GJ, Kessler RC, Adam E, McDade TW, and Lindau ST. (2011). Moving to opportunity for fair housing demonstration program–final impacts evaluation. Washington, DC: U.S. Department of Housing and Urban Development, Office of Policy Development and Research. [Google Scholar]
Stitelman OM, De Gruttola V, and van der Laan MJ. 2012. A general implementation of tmle for longitudinal data applied to causal inference in survival analysis. The International Journal of Biostatistics 8(1): Article 26.DOI: 10.1515/1557-4679.1334. [DOI] [PubMed] [Google Scholar]
Tchetgen ET, and Shpitser I (2014).Estimation of a semiparametric natural direct effect model incorporating baseline covariates. Biometrika, 101:849–864. [DOI] [PMC free article] [PubMed] [Google Scholar]
Tchetgen Tchetgen EJ (2013). Inverse odds ratio-weighted estimation for causal mediation analysis. Statistics in Medicine, 32:4567–4580. [DOI] [PMC free article] [PubMed] [Google Scholar]
Tibshirani R 1996. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society. Series B (Methodological), 58(1): 267–288. [Google Scholar]
Valeri L, and VanderWeele TJ (2013). Mediation analysis allowing for exposure–mediator interactions and causal interpretation: Theoretical assumptions and implementation with sas and spss macros. Psychological Methods, 18:137. [DOI] [PMC free article] [PubMed] [Google Scholar]
van der Laan MJ, and Gruber S. 2012. “Targeted Minimum Loss Based Estimation of Causal Effects of Multiple Time Point Interventions.” The International Journal of Biostatistics, 9(1): Article 9. DOI: 10.1515/1557-4679.1370. [DOI] [PubMed] [Google Scholar]
van der Laan MJ, and Petersen ML. (2008). Direct effect models. The International Journal of Biostatistics 4(1):Article 23. DOI: 10.2202/1557-4679.1064. [DOI] [PubMed] [Google Scholar]
van der Laan MJ, Polley EC, and Hubbard AE. (2007). Super learner. Statistical Applications in Genetics and Molecular Biology, 6(1): Article 25. DOI: 10.2202/1544-6115.1309. [DOI] [PubMed] [Google Scholar]
van der Laan MJ, and Rubin D. (2006). Targeted maximum likelihood learning. The International Journal of Biostatistics, 2(1): Article 11. DOI: 10.2202/1557-4679.1043. [DOI] [PMC free article] [PubMed] [Google Scholar]
VanderWeele T, and Vansteelandt S (2014). Mediation analysis with multiple mediators. Epidemiologic Methods, 2:95–115. [DOI] [PMC free article] [PubMed] [Google Scholar]
VanderWeele TJ (2009). Marginal structural models for the estimation of direct and indirect effects. Epidemiology, 20:18–26. [DOI] [PubMed] [Google Scholar]
VanderWeele TJ, and Tchetgen Tchetgen EJ. (2017). Mediation analysis with time varying exposures and mediators. Journal of the Royal Statistical Society. Series B (Statistical Methodology), 79(3): 917–938. [DOI] [PMC free article] [PubMed] [Google Scholar]
VanderWeele TJ, Vansteelandt S, and Robins JM. (2014). Effect decomposition in the presence of an exposure-induced mediator-outcome confounder. Epidemiology, 25(2): 300–306. [DOI] [PMC free article] [PubMed] [Google Scholar]
Vansteelandt S, and Daniel RM. (2017). Interventional effects for mediation analysis with multiple mediators. Epidemiology, 28(2): 258–265. [DOI] [PMC free article] [PubMed] [Google Scholar]
Zheng W, and van der Laan M. 2017. Longitudinal mediation analysis with time-varying mediators and exposures, with application to survival outcomes. Journal of Causal Inference, 5(2). DOI: 10.1515/jci-2016-0006. [DOI] [PMC free article] [PubMed] [Google Scholar]
Zheng W, and van der Laan MJ. (2012a) Causal mediation in a survival setting with time-dependent mediators. U.C. Berkeley Division of Biostatistics Working Paper Series, Working Paper 295. [Google Scholar]
Zheng W, and van der Laan MJ (2012b). Targeted maximum likelihood estimation of natural direct effects. The International Journal of Biostatistics, 8:1–40. [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Supplement

NIHMS1554809-supplement-Supplement.pdf^{(285.9KB, pdf)}

[R1] Angrist JD, Imbens GW, and Rubin DB (1996). Identification of causal effects using instrumental variables. Journal of the American statistical Association, 91: 444–455. [Google Scholar]

[R2] Avin C, Shpitser I, and Pearl J (2005). Identifiability of Path-specific Effects, 357–363 San Francisco, CA: Morgan Kaufmann Publishers Inc. [Google Scholar]

[R3] Bang H and Robins JM (2005). Doubly robust estimation in missing data and causal inference models. Biometrics, 61:962–973. [DOI] [PubMed] [Google Scholar]

[R4] Baron RM, and Kenny DA (1986). The moderator–mediator variable distinction in social psychological research: Conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51: 1173. [DOI] [PubMed] [Google Scholar]

[R5] Bild DE, Bluemke DA, Burke GL, Detrano R, Roux AVD, Folsom AR, Greenland P, Jacobs DR Jr., Kronmal R, Liu K, et al. (2002). Multi-ethnic study of atherosclerosis: objectives and design. American Journal of Epidemiology, 156: 871–881. [DOI] [PubMed] [Google Scholar]

[R6] Diaz I, and Rosenblum M. (2015). Targeted maximum likelihood estimation using exponential families. The International Journal of Biostatistics, 11: 233–251. [DOI] [PubMed] [Google Scholar]

[R7] Didelez V, Dawid AP, and Geneletti S (2006). Direct and indirect effects of sequential treatments. In: Proceedings of the Twenty-Second Conference on Uncertainty in Artificial Intelligence, 138–146. AUAI Press. [Google Scholar]

[R8] Eaton WW, and Kessler LG. (2012). Epidemiologic field methods in psychiatry: the NIMH Epidemiologic Catchment Area Program. Orlando, FL: Academic Press. [Google Scholar]

[R9] Gruber S, and van der Laan MJ (2010).A targeted maximum likelihood estimator of a causal effect on a bounded continuous outcome. The International Journal of Biostatistics, 6. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R10] Imai K, Keele L, and Tingley D (2010).A general approach to causal mediation analysis. Psychological Methods, 15:309. [DOI] [PubMed] [Google Scholar]

[R11] Kahn SE, Hull RL, and Utzschneider KM (2006). Mechanisms linking obesity to insulin resistance and type 2 diabetes. Nature, 444:840–846. [DOI] [PubMed] [Google Scholar]

[R12] Kling JR, Liebman JB, and Katz LF (2007). Experimental analysis of neighborhood effects. Econometrica, 75:83–119. [Google Scholar]

[R13] Kling JR, Ludwig J, and Katz LF. Neighborhood effects on crime for female and male youth: evidence from a randomized housing voucher experiment. The Quarterly Journal of Economics 2005, 120(1):87–130. [Google Scholar]

[R14] Leventhal T, and Dupéré V (2011).Moving to opportunity: Does long-term exposure to low-povertyneighborhoods make a difference for adolescents? Social Science & Medicine, 73:737–743. [DOI] [PubMed] [Google Scholar]

[R15] Muñoz ID, and van der Laan M (2012). Population intervention causal effects based on stochastic interventions. Biometrics, 68:541–549. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R16] Orr L, Feins J, Jacob R, Beecroft E, Sanbonmatsu L, Katz LF, Liebman JB, and Kling JR. (2003). Moving to opportunity: interim impacts evaluation. Washington, D.C.: U.S. Department of Housing and Urban Development. [Google Scholar]

[R17] Osypuk TL, Schmidt NM, Bates LM, Tchetgen-Tchetgen EJ, Earls FJ, and Glymour MM (2012a). Gender and crime victimization modify neighborhood effects on adolescent mental health. Pediatrics, 130:472–481. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R18] Osypuk TL, Tchetgen EJT, Acevedo-Garcia D, Earls FJ, Lincoln A, Schmidt NM, and Glymour MM (2012b). Differential mental health effects of neighborhood relocation among youth in vulnerable families: results from a randomized trial. Archives of General Psychiatry, 69:1284–1294. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R19] Pearl J (2001). Direct and Indirect Effects. Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence, 411–420. San Francisco, CA: Morgan Kaufmann Publishers Inc. [Google Scholar]

[R20] Petersen ML, Sinisi SE, and van der Laan MJ (2006). Estimation of direct causal effects. Epidemiology, 17:276–284. [DOI] [PubMed] [Google Scholar]

[R21] Phair J, Jacobson L, Detels R, Rinaldo C, Saah A, Schrager L, and Muñoz A (1992). Acquired immune deficiency syndrome occurring within 5 years of infection with human immunodeficiency virus type-1: the multicenter aids cohort study. JAIDS Journal of Acquired Immune Deficiency Syndromes, 5:490–496. [PubMed] [Google Scholar]

[R22] Rosenblum M, and van der Laan MJ. 2010. Targeted maximum likelihood estimation of the parameter of a marginal structural model. The International Journal of Biostatistics 6(2): Article 19. DOI: 10.2202/1557-4679.1238. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R23] Rudolph KE, Díaz I, Rosenblum M, and Stuart EA. 2014. Estimating population treatment effects from a survey subsample. American Journal of Epidemiology, 180(7):737–748. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R24] Rudolph KE, Schmidt NM, Glymour MM, Crowder RE, Galin J, Ahern J, and Osypuk TL (2017). Composition or context: using transportability to understand drivers of site differences in a large-scale housing experiment. Epidemiology. DOI: 10.1097/EDE.0000000000000774. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R25] Sanbonmatsu L, Ludwig J, Katz LF, Gennetian LA, Duncan GJ, Kessler RC, Adam E, McDade TW, and Lindau ST. (2011). Moving to opportunity for fair housing demonstration program–final impacts evaluation. Washington, DC: U.S. Department of Housing and Urban Development, Office of Policy Development and Research. [Google Scholar]

[R26] Stitelman OM, De Gruttola V, and van der Laan MJ. 2012. A general implementation of tmle for longitudinal data applied to causal inference in survival analysis. The International Journal of Biostatistics 8(1): Article 26.DOI: 10.1515/1557-4679.1334. [DOI] [PubMed] [Google Scholar]

[R27] Tchetgen ET, and Shpitser I (2014).Estimation of a semiparametric natural direct effect model incorporating baseline covariates. Biometrika, 101:849–864. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R28] Tchetgen Tchetgen EJ (2013). Inverse odds ratio-weighted estimation for causal mediation analysis. Statistics in Medicine, 32:4567–4580. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R29] Tibshirani R 1996. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society. Series B (Methodological), 58(1): 267–288. [Google Scholar]

[R30] Valeri L, and VanderWeele TJ (2013). Mediation analysis allowing for exposure–mediator interactions and causal interpretation: Theoretical assumptions and implementation with sas and spss macros. Psychological Methods, 18:137. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R31] van der Laan MJ, and Gruber S. 2012. “Targeted Minimum Loss Based Estimation of Causal Effects of Multiple Time Point Interventions.” The International Journal of Biostatistics, 9(1): Article 9. DOI: 10.1515/1557-4679.1370. [DOI] [PubMed] [Google Scholar]

[R32] van der Laan MJ, and Petersen ML. (2008). Direct effect models. The International Journal of Biostatistics 4(1):Article 23. DOI: 10.2202/1557-4679.1064. [DOI] [PubMed] [Google Scholar]

[R33] van der Laan MJ, Polley EC, and Hubbard AE. (2007). Super learner. Statistical Applications in Genetics and Molecular Biology, 6(1): Article 25. DOI: 10.2202/1544-6115.1309. [DOI] [PubMed] [Google Scholar]

[R34] van der Laan MJ, and Rubin D. (2006). Targeted maximum likelihood learning. The International Journal of Biostatistics, 2(1): Article 11. DOI: 10.2202/1557-4679.1043. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R35] VanderWeele T, and Vansteelandt S (2014). Mediation analysis with multiple mediators. Epidemiologic Methods, 2:95–115. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R36] VanderWeele TJ (2009). Marginal structural models for the estimation of direct and indirect effects. Epidemiology, 20:18–26. [DOI] [PubMed] [Google Scholar]

[R37] VanderWeele TJ, and Tchetgen Tchetgen EJ. (2017). Mediation analysis with time varying exposures and mediators. Journal of the Royal Statistical Society. Series B (Statistical Methodology), 79(3): 917–938. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R38] VanderWeele TJ, Vansteelandt S, and Robins JM. (2014). Effect decomposition in the presence of an exposure-induced mediator-outcome confounder. Epidemiology, 25(2): 300–306. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R39] Vansteelandt S, and Daniel RM. (2017). Interventional effects for mediation analysis with multiple mediators. Epidemiology, 28(2): 258–265. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R40] Zheng W, and van der Laan M. 2017. Longitudinal mediation analysis with time-varying mediators and exposures, with application to survival outcomes. Journal of Causal Inference, 5(2). DOI: 10.1515/jci-2016-0006. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R41] Zheng W, and van der Laan MJ. (2012a) Causal mediation in a survival setting with time-dependent mediators. U.C. Berkeley Division of Biostatistics Working Paper Series, Working Paper 295. [Google Scholar]

[R42] Zheng W, and van der Laan MJ (2012b). Targeted maximum likelihood estimation of natural direct effects. The International Journal of Biostatistics, 8:1–40. [DOI] [PMC free article] [PubMed] [Google Scholar]

PERMALINK

Robust and Flexible Estimation of Stochastic Mediation Effects: A Proposed Method and Example in a Randomized Trial Setting

Kara E Rudolph

Oleg Sofrygin

Wenjing Zheng

Mark J van der Laan

Abstract

Background:

Methods:

Results:

Conclusions:

1. Introduction

2. Notation and causal mediation estimands

Figure 1:

2.1. SDE and SIE estimands and identification

3. Targeted minimum loss-based estimator

4. Simulation

4.1. Data generating mechanism

Table 1:

4.2. Performance

Table 2:

Table 3:

Table 4:

5. Empirical illustration

5.1. Overview and set-up

5.2. Results

Figure 2:

6. Discussion

Supplementary Material

Footnotes

References

Associated Data

Supplementary Materials

ACTIONS

PERMALINK

RESOURCES

Cite

Add to Collections

PERMALINK

Robust and Flexible Estimation of Stochastic Mediation Effects: A Proposed Method and Example in a Randomized Trial Setting

Kara E Rudolph

Oleg Sofrygin

Wenjing Zheng

Mark J van der Laan

Abstract

Background:

Methods:

Results:

Conclusions:

1. Introduction

2. Notation and causal mediation estimands

Figure 1:

2.1. SDE and SIE estimands and identification

3. Targeted minimum loss-based estimator

4. Simulation

4.1. Data generating mechanism

Table 1:

4.2. Performance

Table 2:

Table 3:

Table 4:

5. Empirical illustration

5.1. Overview and set-up

5.2. Results

Figure 2:

6. Discussion

Supplementary Material

Footnotes

References

Associated Data

Supplementary Materials

ACTIONS

PERMALINK

RESOURCES

Similar articles

Cited by other articles

Links to NCBI Databases