A Note on formulae for causal mediation analysis in an odds ratio context

Eric J Tchetgen Tchetgen

doi:10.1515/em-2012-0005

Author manuscript; available in PMC: 2014 Oct 10.

Published in final edited form as: Epidemiol Methods. 2014 Jan 3;2(1):21–31. doi: 10.1515/em-2012-0005

PMCID: PMC4193811 NIHMSID: NIHMS575549 PMID: 25309848

Abstract

In a recent manuscript, VanderWeele and Vansteelandt (American Journal of Epidemiology, 2010,172:1339–1348) (hereafter VWV) build on results due to Judea Pearl on causal mediation analysis and derive simple closed-form expressions for so-called natural direct and indirect effects in an odds ratio context for a binary outcome and a continuous mediator. The expressions obtained by VWV make two key simplifying assumptions:

A. The mediator is normally distributed with constant variance,
B. The binary outcome is rare.

Assumption A may not be appropriate in settings where, as can happen in routine epidemiologic applications, the distribution of the mediator variable is highly skewed. However, in this note, the author establishes that under a key assumption of “no mediator-exposure interaction” in the logistic regression model for the outcome, the simple formulae of VWV continue to hold even when the normality assumption of the mediator is dropped. The author further shows that when the “no interaction” assumption is relaxed, the formula of VWV for the natural indirect effect in this setting continues to apply when assumption A is also dropped. However, an alternative formula to that of VWV for the natural direct effect is required in this context and is provided in an appendix. When the disease is not rare, the author replaces assumptions A and B with an assumption C that the mediator follows a so-called Bridge distribution, in which case simple closed-form formulae are again obtained for the natural direct and indirect effects.

Recent advances in causal inference have provided a mathematical formalization of mediation analysis.^1–3 Specifically, the counterfactual language of causal inference has allowed for new definitions of causal effects in the mediation context, accompanied by formal identification conditions, and corresponding nonparametric formulae for computing these new types of causal effects.^1–9 In a recent manuscript, VanderWeele and Vansteelandt⁶ (VWV) build on results due to Judea Pearl^2,3 on causal mediation analysis and derive simple closed-form expressions for so-called natural direct and indirect effects in an odds ratio context for a binary outcome and a continuous mediator. General definitions and identifying assumptions of natural direct and indirect effects in an odds ratio context are described in great detail in VWV and are not reproduced here. However, to obtain closed-form expressions for natural direct and indirect effects, VWV require two key simplifying assumptions which are reproduced here:

A. The mediator is normally distributed with constant variance,
B. The binary outcome is rare.

Assumption A may not be appropriate in settings where, as can happen in routine epidemiologic applications, the distribution of the mediator variable is highly skewed. However, in this note, the author establishes that under a key assumption of “no mediator-exposure interaction” in the logistic regression model for the outcome, the simple formulae of VWV continue to hold even when the normality assumption of the mediator is dropped. The author further shows that when the “no interaction” assumption is relaxed, the formula of VWV for the natural indirect effect in this setting continues to apply when assumption A is also dropped. However, an alternative formula to that of VWV for the natural direct effect is derived in this context. When the disease is not rare, the author replaces assumptions A and B with an assumption C that the mediator follows a so-called Bridge distribution, in which case simple closed-form formulae are again obtained for the natural direct and indirect effects.¹⁰

Relaxing the normality assumption

To proceed, consider the statistical model studied by VWV. In their basic setup, they assume independent and identically distributed data (C, A, M, Y) are observed on n individuals, where Y is the binary outcome of interest, A is the exposure, M is a continuous mediator variable measured prior to Y and subsequently to A, and C are pre-exposure confounders of the effects of (A, M) on Y. VWV assume the following regression models:

logit \Pr (Y = 1 ∣ A = a, M = m, C = c) = θ_{0} + θ_{1} a + θ_{2} m + θ_{4}^{'} c

(1)

and

E [M ∣ A = a, C = c] = β_{0} + β_{1} a + β_{2}^{'} c

(2)

where, under (2) the error term Δ = (M − E[M|A, C]) for the linear regression of [M|A, C] is normally distributed with constant variance. VWV show that, under the nonparametric identifying assumptions 1–4 of their paper, assumptions A and B given above, and the parametric modeling assumptions (1) and (2), odds ratio natural direct and indirect effects are given by the simple formulae

O R_{a, a^{*} ∣ c}^{N D E} (a^{*}) = \exp (θ_{1} (a - a^{*}))

(3)

O R_{a, a^{*} ∣ c}^{N I E} (a^{*}) = \exp (θ_{2} β_{1} (a - a^{*}))

(4)

so that given a fixed value a*, the total causal effect of A on Y within levels of C, comparing the odds of Y when A = a versus when A = a*

O R_{a, a^{*}}^{T E} = \frac{\Pr (Y = 1 ∣ A = a, C = c) \Pr (Y = 0 ∣ A = a^{*}, C = c)}{\Pr (Y = 0 ∣ A = a, C = c) \Pr (Y = 1 ∣ A = a^{*}, C = c)}

can be decomposed on the odds ratio scale into natural direct and indirect causal effects according to⁶:

\begin{aligned} O R_{a, a^{*}}^{T E} & = O R_{a, a^{*} ∣ c}^{N D E} (a^{*}) \times O R_{a, a^{*} ∣ c}^{N I E} (a^{*}) \\ & = \exp ((θ_{1} + θ_{2} β_{1}) (a - a^{*})) \end{aligned}

(5)
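As a minimal sketch of how formulae (3) to (5) are evaluated once models (1) and (2) have been fitted, the Python function below maps coefficient values to the three odds ratios. The function name and the numerical inputs are illustrative assumptions, not quantities taken from VWV.

```python
import math

def vwv_odds_ratios(theta1, theta2, beta1, a, a_star):
    """Natural direct, indirect and total odds ratios from formulae (3)-(5).

    Assumes no exposure-mediator interaction and a rare outcome.
    """
    nde = math.exp(theta1 * (a - a_star))          # formula (3)
    nie = math.exp(theta2 * beta1 * (a - a_star))  # formula (4)
    return nde, nie, nde * nie                     # formula (5): TE = NDE x NIE

# Hypothetical coefficient values, for illustration only.
nde, nie, te = vwv_odds_ratios(theta1=0.40, theta2=0.25, beta1=0.80, a=1, a_star=0)
print(f"NDE={nde:.3f}  NIE={nie:.3f}  TE={te:.3f}")
```

The product structure in (5) means that the total effect on the log odds ratio scale is simply the sum of the direct and indirect log odds ratios.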

In the appendix, it is established that the formulae (3),(4) and therefore formula (5) continue to hold even when the normality assumption is replaced by the weaker assumption:

A'. The error term Δ for the linear regression (2) for M is independent of (A, C).

Thus, by eliminating the requirement that the mediator is normally distributed, the result considerably broadens the scope of settings in which the methodology of VWV remains appropriate. In fact, the result states that their formulae (3) and (4) continue to hold even when, as can occur in epidemiologic applications, the mediator M is not normally distributed, provided that the regression model (2) completely characterizes the relation between the mediator and the exposure and confounding variables, i.e. the residual Δ does not further depend on (A, C).
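The claim can be illustrated numerically. The following Monte Carlo sketch, under assumed parameter values, draws a skewed mean-zero residual independently of the exposure (assumption A'), computes the natural direct and indirect odds ratios directly from their definitions, and compares them with formulae (3) and (4). Confounders are held fixed and absorbed into the intercepts; the gamma residual, the sample size and all coefficient values are hypothetical choices made only for this illustration.

```python
import numpy as np

def expit(x):
    return 1.0 / (1.0 + np.exp(-x))

def odds(p):
    return p / (1.0 - p)

# Hypothetical parameter values; theta0 is very negative so the outcome is rare.
theta0, theta1, theta2 = -9.0, 0.5, 0.3
beta0, beta1 = 1.0, 1.0
a, a_star = 1.0, 0.0

rng = np.random.default_rng(0)
# Skewed, mean-zero residual, drawn independently of the exposure (assumption A').
delta = rng.gamma(shape=2.0, scale=1.0, size=200_000) - 2.0

def g(a_outcome, a_mediator):
    """Monte Carlo average of Pr(Y=1 | A=a_outcome, M) over M drawn given A=a_mediator."""
    m = beta0 + beta1 * a_mediator + delta
    return expit(theta0 + theta1 * a_outcome + theta2 * m).mean()

# Odds ratios computed from the definitions of the natural effects.
nie_exact = odds(g(a, a)) / odds(g(a, a_star))
nde_exact = odds(g(a, a_star)) / odds(g(a_star, a_star))

# Closed-form values from formulae (4) and (3).
nie_formula = np.exp(theta2 * beta1 * (a - a_star))
nde_formula = np.exp(theta1 * (a - a_star))

print("NIE:", nie_exact, "vs formula", nie_formula)
print("NDE:", nde_exact, "vs formula", nde_formula)
```

The two sets of values should agree closely here because the outcome is rare; the agreement is expected to deteriorate as theta0 is moved toward zero.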

The above result depends on the crucial “no exposure-mediator interaction” assumption imposed in the logistic regression model (1). VWV also considered mediation analyses under an alternative more general model for the risk of the outcome:

logit \Pr (Y = 1 ∣ A = a, M = m, C = c) = θ_{0} + θ_{1} a + θ_{2} m + θ_{3} m a + θ_{4}^{'} c

(6)

where θ₃ now encodes the interaction (on the odds ratio scale) between the exposure and the mediator variables, and the special case θ₃ = 0 recovers model (1). Under the nonparametric identifying assumptions 1–4 of their paper, assumptions A and B given above, and the parametric modeling assumptions (2) and (6), VWV establish that

O R_{a, a^{*} ∣ c}^{N I E} (a^{*}) = \exp {(θ_{2} + θ_{3} a) β_{1} (a - a^{*})}

In the appendix, it is shown that the formula in the above display continues to hold when assumption A is replaced by the weaker assumption A'. However, the formula for $O R_{a, a^{*} ∣ c}^{NDE} (a^{*})$ given in VWV under model (6) no longer applies under assumption A' if assumption A does not also hold. An alternative expression for $O R_{a, a^{*} ∣ c}^{NDE} (a^{*})$ in this latter setting is given in an online appendix. For inference, standard errors of estimators of $O R_{a, a^{*} ∣ c}^{NIE} (a^{*})$ and $O R_{a, a^{*} ∣ c}^{NDE} (a^{*})$ under the various modeling assumptions considered above can be obtained as in VWV by straightforward application of the delta method; details are relegated to the online appendix.
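To make the delta-method step concrete for the natural indirect effect under model (6), the sketch below computes log OR^NIE = (θ₂ + θ₃a)β₁(a − a*) together with a standard error from a user-supplied covariance matrix of the three estimated coefficients. The function name, the coefficient values and the diagonal covariance matrix are assumptions made for illustration; in practice the covariance matrix would come from the fitted logistic and linear regressions.

```python
import numpy as np

def log_nie_and_se(theta2, theta3, beta1, cov, a, a_star):
    """Delta-method standard error for log OR^NIE under the interaction model (6).

    cov is the 3x3 covariance matrix of (theta2_hat, theta3_hat, beta1_hat).
    """
    d = a - a_star
    log_nie = (theta2 + theta3 * a) * beta1 * d
    # Gradient of log_nie with respect to (theta2, theta3, beta1).
    grad = np.array([beta1 * d, a * beta1 * d, (theta2 + theta3 * a) * d])
    se = float(np.sqrt(grad @ np.asarray(cov, dtype=float) @ grad))
    return float(log_nie), se

# Hypothetical estimates and covariance matrix, for illustration only.
cov = np.diag([0.01, 0.0025, 0.04])
log_nie, se = log_nie_and_se(theta2=0.5, theta3=0.1, beta1=2.0, cov=cov, a=1.0, a_star=0.0)
lower, upper = np.exp(log_nie - 1.96 * se), np.exp(log_nie + 1.96 * se)
print(f"OR_NIE={np.exp(log_nie):.3f}, 95% CI=({lower:.3f}, {upper:.3f})")
```

Working on the log scale and exponentiating the interval endpoints keeps the confidence limits positive.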

Relaxing the rare disease assumption

In this section, simple closed-form formulae are derived for the natural direct and indirect odds ratios $O R_{a, a^{*} ∣ c}^{NDE} (a^{*})$ and $O R_{a, a^{*} ∣ c}^{NIE} (a^{*})$ in a setting where the outcome of interest is not rare. The formulae are obtained upon replacing both assumption A (or its weaker form A') and assumption B with the following alternative distributional assumption for the mediator density:

C. The conditional density of [Δ|A, C] follows a so-called “Bridge distribution” (more specifically, a Bridge distribution for the logit link):¹⁰
$f_{Δ} [d ∣ A = a, C = c] = \frac{1}{2 π} \frac{\sin (π ϕ)}{\cos (π ϕ) + \cosh (ϕ d)}; - \infty < d < \infty, 0 < ϕ < 1$
where $\cosh (x) = \frac{\exp (x) + \exp (- x)}{2}$
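As a numerical sanity check on the density in assumption C, the sketch below evaluates the bridge density for the logit link, written with its normalizing constant 1/(2π) as in Wang and Louis, on a fine grid. It checks that the density integrates to one and that its second moment matches the bridge variance (π²/3)(ϕ⁻² − 1). The value of ϕ, the grid width and the step size are arbitrary choices for illustration.

```python
import numpy as np

def bridge_pdf(d, phi):
    """Bridge density for the logit link, including the constant 1/(2*pi)."""
    return np.sin(np.pi * phi) / (2.0 * np.pi * (np.cos(np.pi * phi) + np.cosh(phi * d)))

phi = 0.5                      # hypothetical rescaling parameter, 0 < phi < 1
step = 0.001
grid = np.arange(-200.0, 200.0 + step, step)
pdf = bridge_pdf(grid, phi)

total_mass = pdf.sum() * step                  # Riemann sum of the density
variance = (grid ** 2 * pdf).sum() * step      # second moment; the mean is 0 by symmetry
target = np.pi ** 2 / 3.0 * (phi ** -2 - 1.0)  # (pi^2 / 3) (phi^-2 - 1)

print("mass:", total_mass)
print("variance:", variance, "target:", target)
```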

The bridge density given above is denoted B_l(0, ϕ), where the first argument indicates that it has mean zero, ϕ is a rescaling parameter and the subscript l stands for logistic. The variance of B_l(0, ϕ) is given by the simple formula:

\frac{π^{2}}{3} (ϕ^{- 2} - 1)

so that the variance of B_l(0, ϕ) approaches zero as ϕ approaches one. B_l(0, ϕ) is symmetric and has a different shape from that of the Gaussian distribution.¹⁰ When standardized to have unit variance, the bridge density can be shown to have slightly heavier tails than the standard normal and lighter tails than the standard logistic. Wang and Louis¹⁰ provide a detailed study of B_l(0, ϕ). The bridge distribution B_l(0, ϕ) is of interest in the present setting because it is the unique covariate distribution under which marginalization of a standard multiple logistic regression model with respect to a single covariate (with a bridge distribution) produces a marginal regression that is again a standard logistic regression with regression parameters rescaled by an amount determined by ϕ. More specifically, consider the standard logistic regression model (1) for the conditional density of [Y|A, M, C]; then under model (2) paired with assumption C, we have that the marginal (with respect to M) regression model of [Y|A, C] is again a standard logistic regression:

logit \Pr (Y = 1 ∣ A = a, C = c) = γ_{0} + γ_{1} a + γ_{4}^{'} c

where

γ_{1} = k (θ_{1} + θ_{2} β_{1})

and

k = {θ_{2}^{2} (ϕ^{- 2} - 1) + 1}^{- 1 ∕ 2}

Similar expressions relating γ₀ and γ₄ to θ₁, θ₂, θ₄ and ϕ are provided in the online appendix. The main point is that standard multiple logistic regression is closed under marginalization of a continuous covariate with a bridge distribution. A more general formulation of the above result is used in the online appendix to establish that under the nonparametric identifying assumptions 1–4 of VWV, the parametric modeling assumptions (1) and (2), and assumption C:

O R_{a, a^{*} ∣ c}^{NDE} (a^{*}) = \exp (k θ_{1} (a - a^{*}))

(7)

O R_{a, a^{*} ∣ c}^{NIE} (a^{*}) = \exp (k θ_{2} β_{1} (a - a^{*}))

(8)

Note the similarity between formulae (3) and (4), and formulae (7) and (8) where the factor k in the latter two expressions accounts for a non-rare outcome under assumption C that the mediator follows a bridge distribution. Analogous formulae are provided in the online appendix that incorporate an interaction between the mediator and exposure variables under model (6).
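The following sketch evaluates the factor k and formulae (7) and (8) for given coefficient values. Its final function checks, by numerical integration, the Wang and Louis marginalization identity in its basic form, that is, with a mediator coefficient equal to one, so that k = ϕ and the marginal log odds equal ϕ times the conditional linear predictor. All function names and numerical inputs are assumptions chosen for illustration.

```python
import numpy as np

def k_factor(theta2, phi):
    """Rescaling factor k = {theta2^2 (phi^-2 - 1) + 1}^(-1/2)."""
    return (theta2 ** 2 * (phi ** -2 - 1.0) + 1.0) ** -0.5

def bridge_odds_ratios(theta1, theta2, beta1, phi, a, a_star):
    """Formulae (7) and (8): natural direct and indirect odds ratios."""
    k = k_factor(theta2, phi)
    nde = float(np.exp(k * theta1 * (a - a_star)))
    nie = float(np.exp(k * theta2 * beta1 * (a - a_star)))
    return nde, nie

def bridge_pdf(d, phi):
    """Bridge density for the logit link, with normalizing constant 1/(2*pi)."""
    return np.sin(np.pi * phi) / (2.0 * np.pi * (np.cos(np.pi * phi) + np.cosh(phi * d)))

def marginal_logit(eta, phi, step=0.001, half_width=200.0):
    """logit of E[expit(eta + Delta)] with Delta ~ B_l(0, phi), by a Riemann sum."""
    grid = np.arange(-half_width, half_width + step, step)
    p = float(np.sum(bridge_pdf(grid, phi) / (1.0 + np.exp(-(eta + grid)))) * step)
    return float(np.log(p / (1.0 - p)))

# Hypothetical values, for illustration only.
print(bridge_odds_ratios(theta1=0.4, theta2=0.25, beta1=0.8, phi=0.6, a=1, a_star=0))
print(marginal_logit(1.2, 0.5), k_factor(1.0, 0.5) * 1.2)
```

As ϕ approaches one the mediator residual degenerates, k approaches one, and (7) and (8) coincide with (3) and (4).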

Concluding remarks

In this note, the author has extended the results of VWV in a number of directions, by providing weaker conditions under which their simple estimators of natural direct and indirect effects remain valid, and by providing alternative distributional assumptions under which the assumption of a rare outcome can be dropped and yet simple formulae are still available for routine use in epidemiologic practice. However, it is important to note that, as in VWV, the methods described herein rely on fairly strong modeling assumptions and can deliver severely biased inferences under modeling error of regression models such as models (2) and (6). As a possible remedy, alternative so-called multiply robust estimators have recently been proposed that deliver valid inferences about natural direct and indirect effects even when, as can happen in practice, a statistical model for the likelihood of [Y, M, A|C] is partially mis-specified.^7–9

APPENDIX

Closed form expressions for $O R_{a, a^{*} ∣ c}^{NDE} (a^{*})$ and $O R_{a, a^{*} ∣ c}^{NIE} (a^{*})$

Under the nonparametric identifying assumptions 1–4 of VWV, assumptions A' and B given in the paper, and the parametric modeling assumptions (2) and (6), we have that

\begin{aligned} g (a, a^{*}, c) & = \int \Pr (Y = 1 ∣ A = a, C = c, M = m) f (m ∣ A = a^{*}, C = c) d m \\ & \approx \int \exp (θ_{0} + θ_{1} a + θ_{2} m + θ_{3} m a + θ_{4}^{'} c) f (m ∣ A = a^{*}, C = c) d m \\ & = \exp (θ_{0} + θ_{1} a + θ_{4}^{'} c) \int \exp (θ_{2} m + θ_{3} m a) f (m ∣ A = a^{*}, C = c) d m \\ & = \exp (θ_{0} + θ_{1} a + θ_{4}^{'} c) \int \exp ((θ_{2} + θ_{3} a) m) f (m ∣ A = a^{*}, C = c) d m \\ & = \exp (θ_{0} + θ_{1} a + θ_{4}^{'} c) M_{M ∣ A = a^{*}, C = c} (θ_{2} + θ_{3} a) \end{aligned}

(9)

where $M_{M ∣ A = a^{*}, C = c} (\cdot)$ is the moment generating function of [M|A = a*, C = c] evaluated at (·).

Note that under our assumptions,

M_{M ∣ A = a^{*}, C = c} (θ_{2} + θ_{3} a) = \exp {(θ_{2} + θ_{3} a) (β_{0} + β_{1} a^{*} + β_{2}^{'} c)} M_{Δ} (θ_{2} + θ_{3} a)

(10)

where $M_{Δ} (\cdot)$ is the moment generating function of [Δ|A = a*, C = c] evaluated at (·). We conclude that by a result due to Pearl^2,3 (also see VWV⁶)

\begin{aligned} O R_{a, a^{*} ∣ c}^{NDE} (a^{*}) & \approx \frac{g (a, a^{*}, c)}{g (a^{*}, a^{*}, c)} \\ & = \frac{\exp (θ_{0} + θ_{1} a + θ_{4}^{'} c) \exp {(θ_{2} + θ_{3} a) (β_{0} + β_{1} a^{*} + β_{2}^{'} c)} M_{Δ} (θ_{2} + θ_{3} a)}{\exp (θ_{0} + θ_{1} a^{*} + θ_{4}^{'} c) \exp {(θ_{2} + θ_{3} a^{*}) (β_{0} + β_{1} a^{*} + β_{2}^{'} c)} M_{Δ} (θ_{2} + θ_{3} a^{*})} \\ & = \exp [{θ_{1} + θ_{3} (β_{0} + β_{1} a^{*} + β_{2}^{'} c)} (a - a^{*})] M_{Δ} (θ_{3} (a - a^{*})) \end{aligned}

and

\begin{aligned} O R_{a, a^{*} ∣ c}^{NIE} (a^{*}) & \approx \frac{g (a, a, c)}{g (a, a^{*}, c)} \\ & = \frac{\exp (θ_{0} + θ_{1} a + θ_{4}^{'} c) \exp {(θ_{2} + θ_{3} a) (β_{0} + β_{1} a + β_{2}^{'} c)} M_{Δ} (θ_{2} + θ_{3} a)}{\exp (θ_{0} + θ_{1} a + θ_{4}^{'} c) \exp {(θ_{2} + θ_{3} a) (β_{0} + β_{1} a^{*} + β_{2}^{'} c)} M_{Δ} (θ_{2} + θ_{3} a)} \\ & = \exp {β_{1} (θ_{2} + θ_{3} a) (a - a^{*})} \end{aligned}

which reduces to the formulae provided in the text for the special case where θ₃ = 0. For inference when θ₃ ≠ 0, estimation of $O R_{a, a^{*} ∣ c}^{NDE} (a^{*})$ requires an estimator of $M_{Δ} (θ_{3} (a - a^{*}))$ . To motivate a simple estimator of the latter quantity, note that under model (2) and assumption A':

M_{Δ} (θ_{3} a) = \frac{E [\exp {θ_{3} a M}]}{E [\exp {θ_{3} a (β_{0} + β_{1} A + β_{2}^{'} C)}]}

since the numerator is equal to

E [\exp {θ_{3} a M}] = E [\exp {θ_{3} a (β_{0} + β_{1} A + β_{2}^{'} C)}] M_{Δ} (θ_{3} a)

and thus, similarly we have that

M_{Δ} (θ_{3} (a - a^{*})) = \frac{E [\exp {θ_{3} (a - a^{*}) M}]}{E [\exp {θ_{3} (a - a^{*}) (β_{0} + β_{1} A + β_{2}^{'} C)}]}

which gives

O R_{a, a^{*} ∣ c}^{NDE} (a^{*}) \approx \exp [{θ_{1} + (θ_{3} (β_{0} + β_{1} a^{*} + β_{2}^{'} c))} (a - a^{*})] \frac{E [\exp {θ_{3} (a - a^{*}) M}]}{E [\exp {θ_{3} (a - a^{*}) (β_{0} + β_{1} A + β_{2}^{'} C)}]}

We conclude that $M_{Δ} (θ_{3} (a - a^{*}))$ and therefore $O R_{a, a^{*} ∣ c}^{NDE} (a^{*})$ are consistently estimated upon substituting empirical averages for unknown marginal expectations and consistent estimates for unknown parameters in the equation in the above display. Note that consistent estimators of $θ = {(θ_{0}, θ_{1}, θ_{2}, θ_{3}, θ_{4}^{'})}^{'}$ and $β = {(β_{0}, β_{1}, β_{2}^{'})}^{'}$ are readily obtained by standard logistic regression ( $\hat{θ}$ ) and ordinary least squares ( $\hat{β}$ ), respectively.
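A plug-in computation in the same spirit can be sketched as follows. The function evaluates the ratio g(a, a*, c)/g(a*, a*, c) directly from (9) and (10), replacing the moment generating function of Δ by the empirical moment generating function of least-squares residuals. This is an illustrative sketch under assumptions A' and B, not a reproduction of the estimator described above; the function name, argument layout and simulated residuals are hypothetical.

```python
import numpy as np

def nde_rare_interaction(theta1, theta2, theta3, beta0, beta1, beta2, c, resid, a, a_star):
    """Plug-in value of g(a, a*, c) / g(a*, a*, c) built from (9) and (10).

    resid holds residuals M - (beta0 + beta1 A + beta2'C) from the mediator regression.
    """
    resid = np.asarray(resid, dtype=float)
    lin_star = beta0 + beta1 * a_star + float(np.dot(beta2, c))  # E[M | A=a*, C=c]
    t_a = theta2 + theta3 * a
    t_star = theta2 + theta3 * a_star
    mgf_a = np.exp(t_a * resid).mean()        # empirical M_Delta(theta2 + theta3 a)
    mgf_star = np.exp(t_star * resid).mean()  # empirical M_Delta(theta2 + theta3 a*)
    log_main = (theta1 + theta3 * lin_star) * (a - a_star)
    return float(np.exp(log_main) * mgf_a / mgf_star)

# Hypothetical inputs: skewed residuals and arbitrary coefficient values.
rng = np.random.default_rng(1)
resid = rng.gamma(shape=2.0, scale=1.0, size=50_000) - 2.0
print(nde_rare_interaction(0.4, 0.3, 0.1, 1.0, 0.8, [0.2], [1.0], resid, a=1.0, a_star=0.0))
```

When θ₃ = 0 the two empirical moment generating functions coincide and the function returns exp(θ₁(a − a*)), in agreement with formula (3).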

The variance-covariance matrix of the resulting estimator $\log ({\hat{O R}}_{a, a^{*} ∣ c}^{NIE} (a^{*}))$ of $\log (O R_{a, a^{*} ∣ c}^{NIE} (a^{*}))$ is obtained using a straightforward application of the delta method, and details can be found in VWV. The variance-covariance matrix of $\log ({\hat{O R}}_{a, a^{*} ∣ c}^{NDE} (a^{*}))$ is similarly obtained under the “no interaction” assumption. However, more generally when θ₃ ≠ 0, the variance of $\log ({\hat{O R}}_{a, a^{*} ∣ c}^{NDE} (a^{*}))$ requires derivations not included in VWV. To proceed, let IF_θ,β denote the influence function of $(\hat{θ}, \hat{β})$ . Let

Φ_{1} (β, θ) = {θ_{1} + θ_{3} (β_{0} + β_{1} a^{*} + β_{2}^{'} c)} (a - a^{*}),

Φ_{2} (β, θ) = \log E [\exp {θ_{3} (a - a^{*}) M}],

Φ_{3} (β, θ) = \log E [\exp {θ_{3} (a - a^{*}) (β_{0} + β_{1} A + β_{2}^{'} C)}].

Then one can show that the influence function of [Φ₁ (β, θ), Φ₂ (β, θ), Φ₃ (β, θ)]′ is given by

I F_{Φ} = {[I F_{Φ 1}, I F_{Φ 2}, I F_{Φ 3}]}^{'}

where:

I F_{Φ 1} = G_{1} I F_{θ, β}

with

G_{1} = [0, (a - a^{*}), 0, (β_{0} + β_{1} a^{*} + β_{2}^{'} c) (a - a^{*}), 0^{'}, θ_{3} (a - a^{*}), θ_{3} a^{*} (a - a^{*}), θ_{3} c^{'} (a - a^{*})],

I F_{Φ 2} = E [\exp {θ_{3} (a - a^{*}) M}]^{- 1} U_{Φ 2}

with

U_{Φ 2} = \exp {θ_{3} (a - a^{*}) M} - E [\exp {θ_{3} (a - a^{*}) M}] + E [(a - a^{*}) M \exp {θ_{3} (a - a^{*}) M}] [0, 0, 0, 1, 0, 0, 0, 0] I F_{θ, β}

and

I F_{Φ 3} = {E [\exp {θ_{3} (a - a^{*}) (β_{0} + β_{1} A + β_{2}^{'} C)}]}^{- 1} U_{Φ 3}

with

\begin{aligned} U_{Φ 3} & = \exp {θ_{3} (a - a^{*}) (β_{0} + β_{1} A + β_{2}^{'} C)} - E [\exp {θ_{3} (a - a^{*}) (β_{0} + β_{1} A + β_{2}^{'} C)}] \\ & \quad + [0, 0, 0, E [(a - a^{*}) (β_{0} + β_{1} A + β_{2}^{'} C) \exp {θ_{3} (a - a^{*}) (β_{0} + β_{1} A + β_{2}^{'} C)}], 0, \\ & \qquad E [θ_{3} (a - a^{*}) \exp {θ_{3} (a - a^{*}) (β_{0} + β_{1} A + β_{2}^{'} C)}], \\ & \qquad E [θ_{3} A (a - a^{*}) \exp {θ_{3} (a - a^{*}) (β_{0} + β_{1} A + β_{2}^{'} C)}], \\ & \qquad E [θ_{3} (a - a^{*}) C^{'} \exp {θ_{3} (a - a^{*}) (β_{0} + β_{1} A + β_{2}^{'} C)}]] \times I F_{θ, β} \end{aligned}

Thus, the large sample variance of $\log ({\hat{O R}}_{a, a^{*} ∣ c}^{NDE} (a^{*}))$ is approximately given by

n^{- 1} [1, 1, - 1] E (I F_{Φ} I F_{Φ}^{T}) {[1, 1, - 1]}^{'}

A consistent estimator of the above quantity is obtained by substituting empirical expectations for all unknown expectations, and consistent estimators of unknown parameters. The above construction requires the influence function IF_θ,β for standard logistic regression and ordinary least squares estimation, which is of the form:

(\begin{matrix} E {(X_{1} X_{1}^{T})}^{- 1} X_{1} ε \\ E {(X_{2} X_{2}^{T})}^{- 1} X_{2} Δ \end{matrix})

with X₁ = [1, A, M, AM, C′], X₂ = [1, A, C′] and ε = Y – Pr(Y = 1|A, M, C).

Closed form expressions for $O R_{a, a^{*} ∣ c}^{NDE} (a^{*})$ and $O R_{a, a^{*} ∣ c}^{NIE} (a^{*})$ under a Bridge distribution

Consider the logistic regression model

logit \Pr (Y = 1 ∣ A = a, M = m, C = c) = θ_{0} + θ_{1} a + θ_{2} m + θ_{3} m a + θ_{4}^{'} c

where

M = β_{0} + β_{1} A + β_{2}^{'} C + Δ

and

[Δ ∣ A, C] ~ B_{l} (0, ϕ)

Note that

\begin{aligned} g (a, a^{*}, c) & = \int \Pr (Y = 1 ∣ A = a, M = m, C = c) f (m ∣ a^{*}, c) d m \\ & = \int expit {θ_{0} + θ_{1} a + (θ_{2} + θ_{3} a) (β_{0} + β_{1} a^{*} + β_{2}^{'} c) + (θ_{2} + θ_{3} a) Δ + θ_{4}^{'} c} f (Δ) d Δ \\ & = \int expit {θ_{0} + θ_{1} a + (θ_{2} + θ_{3} a) (β_{0} + β_{1} a^{*} + β_{2}^{'} c) + \tilde{Δ} + θ_{4}^{'} c} f (\tilde{Δ}) d \tilde{Δ} \end{aligned}

where $expit (logit (x)) = x$ , $\tilde{Δ} = (θ_{2} + θ_{3} a) Δ$ , and $f (\tilde{Δ})$ is a bridge density with rescaling parameter

\tilde{ϕ} (a) = \tilde{ϕ} (a; θ_{2}, θ_{3}, ϕ) = {{(θ_{2} + θ_{3} a)}^{2} (ϕ^{- 2} - 1) + 1}^{- 1 ∕ 2} .

Then, a result due to Wang and Louis¹⁰ implies that

g (a, a^{*}, c) = expit (\tilde{ϕ} (a) {θ_{0} + θ_{1} a + (θ_{2} + θ_{3} a) (β_{0} + β_{1} a^{*} + β_{2}^{'} c) + θ_{4}^{'} c})

and therefore

O R_{a, a^{*} ∣ c}^{NDE} (a^{*}) = \frac{\exp (\tilde{ϕ} (a) {θ_{0} + θ_{1} a + (θ_{2} + θ_{3} a) (β_{0} + β_{1} a^{*} + β_{2}^{'} c) + θ_{4}^{'} c})}{\exp (\tilde{ϕ} (a^{*}) {θ_{0} + θ_{1} a^{*} + (θ_{2} + θ_{3} a^{*}) (β_{0} + β_{1} a^{*} + β_{2}^{'} c) + θ_{4}^{'} c})}

\begin{aligned} {OR}_{a, a^{*} ∣ c}^{NIE} (a^{*}) & = \frac{\exp (\tilde{ϕ} (a) {θ_{0} + θ_{1} a + (θ_{2} + θ_{3} a) (β_{0} + β_{1} a + β_{2}^{'} c) + θ_{4}^{'} c})}{\exp (\tilde{ϕ} (a) {θ_{0} + θ_{1} a + (θ_{2} + θ_{3} a) (β_{0} + β_{1} a^{*} + β_{2}^{'} c) + θ_{4}^{'} c})} \\ & = \exp (β_{1} (θ_{2} + θ_{3} a) \tilde{ϕ} (a) (a - a^{*})) \end{aligned}

Under the “no interaction” assumption θ₃ = 0, we obtain

\begin{aligned} {OR}_{a, a^{*} ∣ c}^{NDE} (a^{*}) & = \exp ({θ_{2}^{2} (ϕ^{- 2} - 1) + 1}^{- 1 ∕ 2} θ_{1} (a - a^{*})) \\ & = \exp (k θ_{1} (a - a^{*})) \end{aligned}

{OR}_{a, a^{*} ∣ c}^{NIE} (a^{*}) = \exp (k β_{1} θ_{2} (a - a^{*}))
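The expressions of this section can be evaluated with a few lines of code. The sketch below implements the rescaling parameter ϕ̃(a), the log odds of g(a, a*, c), and the resulting natural direct and indirect odds ratios with an exposure-mediator interaction, then confirms that setting θ₃ = 0 returns exp(kθ₁(a − a*)) and exp(kβ₁θ₂(a − a*)). Function names, the tuple layout of the parameters and all numerical values are assumptions made for illustration.

```python
import numpy as np

def phi_tilde(a, theta2, theta3, phi):
    """Rescaling parameter {(theta2 + theta3 a)^2 (phi^-2 - 1) + 1}^(-1/2)."""
    return ((theta2 + theta3 * a) ** 2 * (phi ** -2 - 1.0) + 1.0) ** -0.5

def logit_g(a_out, a_med, c, theta, beta, phi):
    """Log odds of g(a_out, a_med, c) under model (6), model (2) and assumption C."""
    theta0, theta1, theta2, theta3, theta4 = theta
    beta0, beta1, beta2 = beta
    mean_m = beta0 + beta1 * a_med + np.dot(beta2, c)
    eta = theta0 + theta1 * a_out + (theta2 + theta3 * a_out) * mean_m + np.dot(theta4, c)
    return phi_tilde(a_out, theta2, theta3, phi) * eta

def bridge_nde_nie(a, a_star, c, theta, beta, phi):
    """Natural direct and indirect odds ratios as ratios of odds of g."""
    nde = np.exp(logit_g(a, a_star, c, theta, beta, phi)
                 - logit_g(a_star, a_star, c, theta, beta, phi))
    nie = np.exp(logit_g(a, a, c, theta, beta, phi)
                 - logit_g(a, a_star, c, theta, beta, phi))
    return float(nde), float(nie)

# Hypothetical parameter values with theta3 = 0 (no interaction).
theta = (-2.0, 0.4, 0.5, 0.0, [0.1])   # (theta0, theta1, theta2, theta3, theta4)
beta = (0.2, 0.8, [0.3])               # (beta0, beta1, beta2)
nde, nie = bridge_nde_nie(a=1.0, a_star=0.0, c=[1.0], theta=theta, beta=beta, phi=0.7)
k = phi_tilde(0.0, 0.5, 0.0, 0.7)
print(nde, np.exp(k * 0.4))
print(nie, np.exp(k * 0.8 * 0.5))
```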

A consistent estimator $\hat{ϕ}$ of ϕ is obtained by the method of moments upon noting that ϕ = ϕ (α) = expit(α) solves the population equation:

E {U_{α} (α; β)} = 0

where $U_{α} (α; β) = Δ {(β)}^{2} - \frac{π^{2}}{3} ({[expit (α)]}^{- 2} - 1)$ . It can then be shown that the influence function of $(\hat{θ}, \hat{β}, \hat{α})$ is given by

I F_{θ, β, α} = (\begin{matrix} E {(X_{1} X_{1}^{T})}^{- 1} X_{1} ε \\ E {(X_{2} X_{2}^{T})}^{- 1} X_{2} Δ \\ E {(\frac{\partial U_{α} (α; β)}{\partial α})}^{- 1} U_{α} (α; β) \end{matrix})
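The method-of-moments step has a closed-form solution: equating the mean squared residual to (π²/3)(ϕ⁻² − 1) gives ϕ = (1 + 3σ²/π²)^(−1/2), where σ² denotes the residual second moment. The sketch below applies this to a vector of residuals from the mediator regression and also returns α = logit(ϕ). The function name and the toy residuals are illustrative assumptions.

```python
import numpy as np

def phi_hat_from_residuals(resid):
    """Method-of-moments estimate of phi and alpha = logit(phi).

    Solves mean(resid^2) = (pi^2 / 3) (phi^-2 - 1) for phi.
    """
    s2 = float(np.mean(np.square(resid)))
    phi = (1.0 + 3.0 * s2 / np.pi ** 2) ** -0.5
    alpha = float(np.log(phi / (1.0 - phi)))
    return phi, alpha

# Toy residuals whose mean square equals pi^2, chosen only for illustration.
phi_hat, alpha_hat = phi_hat_from_residuals([np.pi, -np.pi])
print(phi_hat, alpha_hat)
```

Larger residual variance yields a smaller estimate of ϕ, while residuals that are identically zero give ϕ = 1, a boundary value outside the range 0 < ϕ < 1 at which α = logit(ϕ) is undefined.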

Let ${\hat{O R}}_{a, a^{*} ∣ c}^{NDE} (a^{*})$ and ${\hat{O R}}_{a, a^{*} ∣ c}^{NIE} (a^{*})$ denote the estimators of $O R_{a, a^{*} ∣ c}^{NDE} (a^{*})$ and $O R_{a, a^{*} ∣ c}^{NIE} (a^{*})$ , respectively, obtained upon substituting $(\hat{θ}, \hat{β}, \hat{ϕ})$ for (θ, β, ϕ). The large sample variances of $\log ({\hat{O R}}_{a, a^{*} ∣ c}^{NDE} (a^{*}))$ and $\log ({\hat{O R}}_{a, a^{*} ∣ c}^{NIE} (a^{*}))$ are then obtained by a straightforward application of the delta method, namely:

var (\log {\hat{OR}}_{a, a^{*} ∣ c}^{NDE} (a^{*})) \approx n^{- 1} H_{1}^{'} E ({IF}_{θ, β, α} {IF}_{θ, β, α}^{'}) H_{1}

where

H_{1} = \frac{\partial}{\partial {(θ^{'}, β^{'}, α)}^{'}} {(\tilde{ϕ} (a; θ_{2}, θ_{3}, ϕ (α)) {θ_{0} + θ_{1} a + (θ_{2} + θ_{3} a) (β_{0} + β_{1} a^{*} + β_{2}^{'} c) + θ_{4}^{'} c}) - (\tilde{ϕ} (a^{*}; θ_{2}, θ_{3}, ϕ (α)) {θ_{0} + θ_{1} a^{*} + (θ_{2} + θ_{3} a^{*}) (β_{0} + β_{1} a^{*} + β_{2}^{'} c) + θ_{4}^{'} c})}

and

var (\log {\hat{OR}}_{a, a^{*} ∣ c}^{NIE} (a^{*})) \approx n^{- 1} H_{2}^{'} E ({IF}_{θ, β, α} {IF}_{θ, β, α}^{'}) H_{2}

where

H_{2} = \frac{\partial}{\partial {(θ^{'}, β^{'}, α)}^{'}} {β_{1} (θ_{2} + θ_{3} a) \tilde{ϕ} (a; θ_{2}, θ_{3}, ϕ (α)) (a - a^{*})}

References

[1]. Robins JM, Greenland S. Identifiability and exchangeability for direct and indirect effects. Epidemiology. 1992;3:143–155. doi: 10.1097/00001648-199203000-00013.
[2]. Pearl J. Direct and indirect effects. Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence; San Francisco, CA: Morgan Kaufmann; 2001. pp. 411–420.
[3]. Pearl J. The Mediation Formula: A guide to the assessment of causal pathways in nonlinear models. In: Berzuini C, Dawid P, Bernardinelli L, editors. Causality: Statistical Perspectives and Applications. Forthcoming, 2011.
[4]. van der Laan M, Petersen M. Direct Effect Models. U.C. Berkeley Division of Biostatistics Working Paper Series, Working Paper 187. 2005. http://www.bepress.com/ucbbiostat/paper187.
[5]. Imai K, Keele L, Yamamoto T. Identification, inference and sensitivity analysis for causal mediation effects. Statistical Science. 2010;25:51–71.
[6]. VanderWeele TJ, Vansteelandt S. Odds ratios for mediation analysis with a dichotomous outcome. American Journal of Epidemiology. 2010;172:1339–1348. doi: 10.1093/aje/kwq332.
[7]. Tchetgen Tchetgen EJ, Shpitser I. Semiparametric Theory for Causal Mediation Analysis: efficiency bounds, multiple robustness and sensitivity analysis. 2011. doi: 10.1214/12-AOS990. http://www.bepress.com/harvardbiostat/paper130/
[8]. Tchetgen Tchetgen EJ, Shpitser I. Semiparametric Estimation of Models for Natural Direct and Indirect Effects. 2011. http://www.bepress.com/harvardbiostat/paper129.
[9]. Tchetgen Tchetgen EJ. On Causal Mediation Analysis with a Survival Outcome. The International Journal of Biostatistics. 2011;7(1):Article 33. doi: 10.2202/1557-4679.1351.
[10]. Wang Z, Louis T. Matching conditional and marginal shapes in binary random intercept models using a bridge distribution function. Biometrika. 2003;90(4):765–775.
