Summary
A mediation effect explains the relationship of a risk factor and an outcome through a mediator variable which is a step in their pathway. Under the assumption of no cycling in the causal relationship, we consider various situations in which a fourth variable may interfere with the estimation of a mediation effect as a confounding factor. Our asymptotic results, which are supported by a Monte Carlo study, show that adjusting for confounding factors under certain conditions might lead to biased estimates. A general guideline is provided for when it is appropriate to adjust for confounding factors in estimating a mediation effect. We apply the guideline to the estimation of the mediation effect of Alzheimer’s disease pathology in the relationship between the Apolipoprotein E ε4 allele and cognitive function among 125 deceased participants from the Religious Orders Study, a longitudinal, clinical-pathologic study of aging and Alzheimer’s disease.
Keywords: Alzheimer’s disease, Confounding factor, Direct Effect, Indirect Effect, Mediation
1 Introduction
A mediation model describes how a third variable (M) intervenes in the causal relationship between an independent variable (X) and a dependent variable (Y). More specifically, the mediation model assumes a pathway, in which an independent variable (X) affects a mediator (M), which then affects a dependent variable (Y). We represent this pathway schematically as X → M → Y. Our interest lies in the mediation effect: the effect of X on Y through the mediator M.
A general approach to evaluating the mediation effect is based on the product of coefficients associated with each path in a path model (Alwin and Hauser, 1975; Baron and Kenny, 1986; Bollen, 1987; Fox, 1980; Sobel, 1982). Consider as an example the following model
$$M = c_M + \alpha_0 X + \varepsilon_M \tag{1}$$
$$Y = c_Y + \beta_0 M + \tau X + \varepsilon_Y \tag{2}$$
where εM is a mean zero random variable that is independent of X and εY, εY is a mean zero random variable that is independent of X and M, and cM and cY represent constant intercepts. Here α0 is the coefficient associated with the pathway X → M while β0 is the coefficient associated with the pathway M → Y after controlling for X. The mediation effect (also called “indirect effect”) of X on Y through the mediator M is defined to be α0β0, under the approach of the product of coefficients. The remaining association between X and Y, denoted by τ, is called the “direct effect,” which may include unidentified indirect effects through some unknown pathways as well as a direct effect of X on Y if it exists. The summation of the indirect effect and the direct effect, viz α0β0 + τ, is referred to as the “total effect” of X on Y.
To estimate the mediation effect, one typically estimates α0 and β0 by ordinary least squares (OLS) regression based on equations (1) and (2). When these two equations characterize the true causal relationships, the OLS estimator (α̂, β̂) of (α0, β0) is consistent. As a result, the product of α̂ and β̂ provides a consistent estimate of the mediation effect α0β0.
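As a minimal sketch of the product-of-coefficients estimate under the reference model, the following Python snippet simulates data from a model of the form (1) and (2) and computes α̂β̂ from sample covariances. The parameter values, the sample size, the seed, and the helper name `cov` are illustrative assumptions, not part of the original analysis.

```python
import random

def cov(u, v):
    """Sample covariance of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (n - 1)

random.seed(12345)                      # arbitrary seed, for reproducibility
n = 20000                               # illustrative sample size
alpha0, beta0, tau = 0.14, 0.39, 0.2    # illustrative parameter values

# Data generated with zero intercepts and standard normal errors.
x = [random.gauss(0.0, 1.0) for _ in range(n)]
m = [alpha0 * xi + random.gauss(0.0, 1.0) for xi in x]
y = [beta0 * mi + tau * xi + random.gauss(0.0, 1.0) for mi, xi in zip(m, x)]

# First stage: OLS slope of M on X (intercept included implicitly).
alpha_hat = cov(x, m) / cov(x, x)

# Second stage: OLS coefficient on M in the regression of Y on M and X.
sxx, smm, sxm = cov(x, x), cov(m, m), cov(x, m)
beta_hat = (sxx * cov(m, y) - sxm * cov(x, y)) / (sxx * smm - sxm ** 2)

indirect_hat = alpha_hat * beta_hat
print(f"alpha_hat={alpha_hat:.3f} beta_hat={beta_hat:.3f} "
      f"indirect={indirect_hat:.4f} (true {alpha0 * beta0:.4f})")
```

With no confounder in the data generating process, the estimate should settle near α0β0 as the sample grows.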
The above observation depends crucially on the assumption that no other variables interfere as a confounding factor in the pathway of the independent variable, the mediator and the dependent variable. If there is a variable Z that interferes with some or all of the relationships among the three variables (X, M, Y), then the simple estimator α̂β̂ described above is no longer consistent. For example, suppose Z → M and Z → X; then one component of εM is Z. As a result, X is correlated with εM and the OLS estimator of α0 is biased even in large samples. In a recent paper, Herting (2002) demonstrates that, without incorporating a confounding factor, it is quite easy to reject a mediation effect when a true mediation effect exists.
In this paper, we consider all possible ways that a fourth variable Z can interfere in the pathway X → M → Y as a confounding factor. We investigate the properties of various estimators of the mediation effect under all the scenarios we consider. Asymptotic biases of different estimators are provided. Some simulation experiments are conducted to evaluate the accuracy of the asymptotic results in finite samples. Based on the asymptotic results and numerical evidence, we give some guidelines on how to choose an estimator in empirical applications.
Our approach is applied to the estimation of a mediation effect in a study of risk factors for clinically diagnosed Alzheimer’s disease (AD), where age is a possible confounding factor in the causal pathways. AD is a progressive brain disorder that gradually destroys a person’s memory and ability to learn new information, reason, make judgments, communicate and carry out daily activities. Increasing age is associated with increased risk of AD. Nearly 5 million people in the US alone have AD (Hebert et al., 2003), and this number is expected to grow substantially worldwide in the coming decades as the population ages (Ferri et al., 2005). Recent evidence suggests that the clinical manifestations of AD are a complex function of multiple genetic and environmental factors interacting with pathological and biochemical changes in the brain. For example, while the pathologic hallmarks of AD are neuritic plaques and neurofibrillary tangles, these lesions may add to other brain pathologies such as cerebral infarctions to cause cognitive impairment (Petrovitch et al., 2005). By contrast, environmental risk factors may modify the relation of AD pathology to cognition (Mortimer et al., 2005). The presence of an apolipoprotein E ε4 allele (Apoe ε4, a common polymorphism of the gene coding for apolipoprotein E) is a major genetic risk factor for the disease (Tang et al., 1998). The neurobiologic mechanism through which the ε4 allele is associated with an elevated risk of clinically diagnosed AD is not well understood. Previous histopathologic studies (e.g., Bennett et al., 2003) suggest that the effect of the ε4 allele on cognitive impairment may be mediated by an increase in the rate at which AD pathology accumulates. Since AD pathology may add to or interact with other factors to cause cognitive impairment, a variety of alternate mechanisms could also account for the association.
Because cognition, AD pathology, and many risk factors for AD are related to age, it is important to be able to adjust for the potential confounding effects of age in mediation analysis of common chronic conditions of older persons. We apply different strategies to evaluate the effect of the confounding factor, age at death, in the estimation of the mediating effect of AD pathology in the relationship between the presence of an Apoe ε4 allele and level of cognitive function before death among 125 subjects in the Religious Orders Study, a longitudinal, clinical-pathologic study of aging and AD.
The rest of the paper is organized as follows. Section 2 presents all the possible ways that Z may interfere in the pathway X → M → Y as a confounding factor. In our application, Z, X, M and Y correspond to age at death, Apoe ε4, AD pathology and cognitive function, respectively. Section 3 examines the asymptotic properties of estimators of the mediation effect under different estimation strategies. Simulation results are reported in Section 4 and an application is included in Section 5. Section 6 concludes and gives some advice on the choice of estimators.
2 Pathway Patterns
We begin by assuming a pathway pattern X → M → Y. We further assume X, M, and Y are interrelated in a linear fashion, as illustrated, for example, in equations (1) and (2). A complete mediation occurs when τ equals zero, where the relationship between X and Y is fully explained by the mediator M such that X has no direct effect on Y. In reality, a complete mediation is unlikely and a direct effect term τ is usually kept in the mediation model even when it is statistically non-significant.
Assuming no cycling in the pathway, where a cycle means that a variable could affect itself through other variables in the pathway, Table 1 provides all possible combinations of pathway patterns among a fourth confounding factor Z and X, M, and Y. The pathway patterns can be classified into four different categories. The first category, presented as the reference case, consists of Case 0.0, wherein Z has no association with X, M, and Y. The second category consists of seven patterns (Case 1.1 – Case 1.7) where the fourth variable Z is at the beginning of the pathway to X, M, and/or Y. The third category consists of seven patterns (Case 2.1 – Case 2.7) where X, M, and/or Y is at the beginning of the pathway to Z. The last category consists of the remaining five patterns (Case 3.1 – Case 3.5) that involve more complicated path relations among Z and X, M, and Y.
Table 1.
Relationship Between Z and X → M → Y Causal Pathway
| Case | X | M | Y |
|---|---|---|---|
| 0.0 | | | |
| 1.1 | Z → X | | |
| 1.2 | | Z → M | |
| 1.3 | | | Z → Y |
| 1.4 | | Z → M | Z → Y |
| 1.5 | Z → X | | Z → Y |
| 1.6 | Z → X | Z → M | |
| 1.7 | Z → X | Z → M | Z → Y |
| 2.1 | X → Z | | |
| 2.2 | | M → Z | |
| 2.3 | | | Y → Z |
| 2.4 | | M → Z | Y → Z |
| 2.5 | X → Z | | Y → Z |
| 2.6 | X → Z | M → Z | |
| 2.7 | X → Z | M → Z | Y → Z |
| 3.1 | | M → Z | Z → Y |
| 3.2 | X → Z | M → Z | Z → Y |
| 3.3 | X → Z | Z → M | |
| 3.4 | X → Z | Z → M | Z → Y |
| 3.5 | X → Z | | Z → Y |
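The count of 20 patterns can be checked by brute force. The sketch below is an illustration only: it assumes the base edges X → M, M → Y and X → Y are always present, lets Z relate to each of X, M and Y in one of three ways (no edge, Z → V, or V → Z), and keeps the combinations whose graph has no directed cycle. The function names and the edge encoding are hypothetical.

```python
from itertools import product

# Base causal structure assumed throughout: X -> M -> Y plus the direct X -> Y.
BASE_EDGES = [("X", "M"), ("M", "Y"), ("X", "Y")]

def has_cycle(edges):
    """Depth-first search for a directed cycle among the nodes X, M, Y, Z."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
    state = {}  # node -> "open" (on the current path) or "done"

    def visit(node):
        if state.get(node) == "open":
            return True
        if state.get(node) == "done":
            return False
        state[node] = "open"
        if any(visit(nxt) for nxt in graph.get(node, [])):
            return True
        state[node] = "done"
        return False

    return any(visit(node) for node in "XMYZ")

def z_edge(var, relation):
    """Edge between Z and var for relation 'none', 'Z->' or '->Z'."""
    if relation == "Z->":
        return [("Z", var)]
    if relation == "->Z":
        return [(var, "Z")]
    return []

patterns = []
for rx, rm, ry in product(("none", "Z->", "->Z"), repeat=3):
    z_edges = z_edge("X", rx) + z_edge("M", rm) + z_edge("Y", ry)
    if not has_cycle(BASE_EDGES + z_edges):
        patterns.append(z_edges)

print(len(patterns))  # 20
```

Of the 27 raw combinations, the seven that are dropped are those in which Z causes a variable upstream of one of its own causes, for example Z → X together with M → Z.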
Each pathway can be represented by a path diagram. For example, Case 0.0, the reference, can be simply represented as X → M → Y and the corresponding model is given in equations (1) and (2). As a second example, Figure 1 provides the path diagram for Case 3.4, where the independent variable X has an indirect effect on Y through the mediator M, an indirect effect on Y through the fourth variable Z, and a direct effect on Y. In addition, the fourth variable Z has a direct effect on the mediator M. The true model for Case 3.4 can be written as
Figure 1.

Path Diagram for the Causal Pattern Case 3.4.
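$$X = c_X + \varepsilon_X \tag{3}$$
$$Z = c_Z + \gamma_{XZ} X + \varepsilon_Z \tag{4}$$
$$M = c_M + \alpha_0 X + \gamma_{ZM} Z + \varepsilon_M \tag{5}$$
$$Y = c_Y + \beta_0 M + \tau X + \gamma_{ZY} Z + \varepsilon_Y \tag{6}$$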
where εX, εZ, εM, εY are mean zero random variables with respective variances σ²X, σ²Z, σ²M, and σ²Y, and the c’s are constant intercepts. In the above model, we use a subscript ‘ZM’ on γ to signify the effect of Z on M. The other γ’s are similarly defined. In the application, X is the presence of Apoe ε4, Y is the level of cognitive function before death, M is the level of AD pathology, and Z is age at death.
In equations (3) – (6), each of the ε’s is independent of the right hand side variables in the corresponding equation. This independence is a direct implication of unidirectional causality. Without this assumption, the ε’s in general depend on the right hand side variables in the corresponding equation. The assumption of unidirectional causality is the cornerstone of our mediation framework and applies to all mediation models. We maintain this assumption throughout the paper.
For most of the pathways, the mediation effect through M is α0β0. There are a few exceptions (Cases 3.1 – 3.4) in which there exist two pathways from X to Y that go through M. For example, in Case 3.4, the two pathways are X → M → Y and X → Z → M → Y. In the first pathway, X has a direct effect on M of α0. In the second pathway, Z acts as a mediator between X and M with an indirect effect γXZγZM. The total effect of X on M is the sum of the indirect effect γXZγZM and the direct effect α0. The total effect of X on M multiplied by β0, the direct effect of M on Y, provides the mediation effect of X on Y through M, denoted by δ0:
$$\delta_0 = (\alpha_0 + \gamma_{XZ}\gamma_{ZM})\,\beta_0.$$
The mediation effects for the remaining cases are reported in the second column of Table 2.
Table 2.
Indirect Effect through Mediator M: True Value and Probability Limits of the Estimators under Strategies A – D
| Case | True Indirect Effect | A | B | C | D |
|---|---|---|---|---|---|
| 0.0 | α0β0 | α0β0 | α0β0 | α0β0 | α0β0 |
| 1.1 | α0β0 | α0β0 | α0β0 | α0β0 | α0β0 |
| 1.2 | α0β0 | α0β0 | α0β0 | α0β0 | α0β0 |
| 1.3 | α0β0 | α0β0 | α0β0 | α0β0 | α0β0 |
| 1.4 | α0β0 | α0β* | α0β0 | α0β* | α0β0 |
| 1.5 | α0β0 | α0β0 | α0β0 | α0β0 | α0β0 |
| 1.6 | α0β0 | α*β0 | α*β0 | α0β0 | α0β0 |
| 1.7 | α0β0 | α*β* | α*β0 | α0β* | α0β0 |
| 2.1 | α0β0 | α0β0 | α0β0 | α0β0 | α0β0 |
| 2.2 | α0β0 | α0β0 | α0β0 | α*β0 | α*β0 |
| 2.3 | α0β0 | α0β0 | α0β* | α*β0 | α*β* |
| 2.4 | α0β0 | α0β0 | α0β* | α*β0 | α*β* |
| 2.5 | α0β0 | α0β0 | α0β* | α*β0 | α*β* |
| 2.6 | α0β0 | α0β0 | α0β0 | α*β0 | α*β0 |
| 2.7 | α0β0 | α0β0 | α0β* | α*β0 | α*β* |
| 3.1 | α0(β0 + γMZγZY) | α0(β0+ γMZγZY) | α0β0 | α*(β0 + γMZγZY) | α*β0 |
| 3.2 | α0(β0 + γMZγZY) | α0(β0+ γMZγZY) | α0β0 | α*(β0 + γMZγZY) | α*β0 |
| 3.3 | (α0 + γXZγZM)β0 | (α0+γXZγZM)β0 | (α0+ γXZγZM)β0 | α0β0 | α0β0 |
| 3.4 | (α0 + γXZγZM)β0 | (α0+γXZγZM)β* | (α0+ γXZγZM)β0 | α0β* | α0β0 |
| 3.5 | α0β0 | α0β0 | α0β0 | α0β0 | α0β0 |
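Note: α* and β* are biased estimates of α0 and β0, respectively. For example, writing σ²V for the variance of εV, Case 1.4: β* = β0 + γZMγZYσ²Z / (γZM²σ²Z + σ²M); Case 1.6: α* = α0 + γZXγZMσ²Z / (γZX²σ²Z + σ²X); Case 3.1: α* = α0σ²Z / (γMZ²σ²M + σ²Z); Case 3.2: α* = (α0σ²Z − γXZγMZσ²M) / (γMZ²σ²M + σ²Z).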
3 Estimation of the Mediation Effect
Section 2 enumerates 20 causal patterns where a fourth confounding variable could intervene in the causal pathway X → M → Y. In accounting for the confounding factor, we consider four different strategies to estimate the mediation effect.
3.1 Four Different Estimation Strategies
The first estimation strategy, called strategy A, is to ignore the confounding factor and to fit regression equations without the variable Z. The regression equations are given by
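$$\hat{M} = \hat{c}_M + \hat{\alpha}_{XM} X \tag{7}$$
$$\hat{Y} = \hat{c}_Y + \hat{\beta}_{MY\cdot X} M + \hat{\tau}_{XY\cdot M} X \tag{8}$$
where the parameter with a hat denotes the OLS estimator and M̂ and Ŷ are the predicted values of M and Y, respectively, from the OLS regression.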
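A few words on notation are in order. The subscript ‘XM’ in α̂XM signifies that α̂XM is the coefficient for ‘X’ in the regression of ‘M’ on ‘X’; and the subscript ‘MY • X’ in β̂MY•X signifies that β̂MY•X is the coefficient for ‘M’ in the regression of ‘Y’ on ‘M’, adjusting for ‘X’. We use the same convention in the rest of the paper. We refer to regression (7) as the first stage regression and regression (8) as the second stage regression. With the OLS estimators α̂XM and β̂MY•X, the estimated mediation effect is
$$\hat{\delta}_A = \hat{\alpha}_{XM}\,\hat{\beta}_{MY\cdot X}.$$
The second estimation strategy, called strategy B, is to ignore the confounding variable in the X → M causal path. The regression equations for strategy B are given by:
$$\hat{M} = \hat{c}_M + \hat{\alpha}_{XM} X,$$
$$\hat{Y} = \hat{c}_Y + \hat{\beta}_{MY\cdot XZ} M + \hat{\tau}_{XY\cdot MZ} X + \hat{\gamma}_{ZY\cdot XM} Z.$$
The estimated mediation effect is
$$\hat{\delta}_B = \hat{\alpha}_{XM}\,\hat{\beta}_{MY\cdot XZ}.$$
The third estimation strategy, called strategy C, ignores the confounding variable in the M → Y causal path. Its regression equations are given by:
$$\hat{M} = \hat{c}_M + \hat{\alpha}_{XM\cdot Z} X + \hat{\gamma}_{ZM\cdot X} Z,$$
$$\hat{Y} = \hat{c}_Y + \hat{\beta}_{MY\cdot X} M + \hat{\tau}_{XY\cdot M} X.$$
The resulting estimate of the mediation effect is
$$\hat{\delta}_C = \hat{\alpha}_{XM\cdot Z}\,\hat{\beta}_{MY\cdot X}.$$
Finally, the fourth estimation strategy, called strategy D, includes the confounding variable Z in both regression equations, leading to
$$\hat{M} = \hat{c}_M + \hat{\alpha}_{XM\cdot Z} X + \hat{\gamma}_{ZM\cdot X} Z,$$
$$\hat{Y} = \hat{c}_Y + \hat{\beta}_{MY\cdot XZ} M + \hat{\tau}_{XY\cdot MZ} X + \hat{\gamma}_{ZY\cdot XM} Z.$$
The estimated mediation effect is
$$\hat{\delta}_D = \hat{\alpha}_{XM\cdot Z}\,\hat{\beta}_{MY\cdot XZ}.$$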
In application, usually one of the four estimation strategies is applied without knowledge of the relationship between the confounding factor Z and the X → M → Y causal pathway. In the next subsection, we present the probability limits of δ̂A, δ̂B, δ̂C, and δ̂D for all of the possible causal patterns given in Table 1.
3.2 Asymptotic Biases
For each causal pattern, we derive the probability limit of each estimator. The difference between this limit and the true mediation effect is defined to be the asymptotic bias. According to this definition, when the asymptotic bias is zero, the estimator is consistent for the true mediation effect, given in the second column of Table 2.
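We first use Case 3.4 to demonstrate the derivation of the asymptotic bias. Using equations (3) – (5), we can deduce that
$$M = c + (\alpha_0 + \gamma_{XZ}\gamma_{ZM})\,X + \gamma_{ZM}\varepsilon_Z + \varepsilon_M,$$
where c is a constant and X is independent of the composite error term γZMεZ + εM. Because the OLS estimator is consistent for the underlying model parameter, the probability limit of the OLS estimator obtained by regressing M on a constant and X is
$$\operatorname{plim}\,\hat{\alpha}_{XM} = \alpha_0 + \gamma_{XZ}\gamma_{ZM} \tag{9}$$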
The above limit applies to strategies A and B as both ignore Z in their first stage regression. When Z is included in the first stage regression as in strategies C and D, using equation (5), we can deduce that
$$\operatorname{plim}\,\hat{\alpha}_{XM\cdot Z} = \alpha_0 \tag{10}$$
We now turn to the second stage regression. Strategies B and D incorporate covariate Z into the regression. In this case, the probability limit of the OLS estimator is
$$\operatorname{plim}\,\hat{\beta}_{MY\cdot XZ} = \beta_0 \tag{11}$$
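For strategies A and C, covariate Z is omitted from the regression. The probability limit of the OLS estimator is
$$\operatorname{plim}\,\hat{\beta}_{MY\cdot X} = \beta_0 + \gamma_{ZY}\,\frac{\operatorname{Var}(X)\operatorname{Cov}(M,Z) - \operatorname{Cov}(X,M)\operatorname{Cov}(X,Z)}{\operatorname{Var}(X)\operatorname{Var}(M) - \operatorname{Cov}(X,M)^2} \tag{12}$$
It follows from equations (3) – (6) that
$$\begin{aligned}
\operatorname{Var}(X) &= \sigma_X^2, \qquad \operatorname{Cov}(X,Z) = \gamma_{XZ}\sigma_X^2, \qquad \operatorname{Cov}(X,M) = (\alpha_0 + \gamma_{XZ}\gamma_{ZM})\sigma_X^2,\\
\operatorname{Cov}(M,Z) &= \gamma_{XZ}(\alpha_0 + \gamma_{XZ}\gamma_{ZM})\sigma_X^2 + \gamma_{ZM}\sigma_Z^2,\\
\operatorname{Var}(M) &= (\alpha_0 + \gamma_{XZ}\gamma_{ZM})^2\sigma_X^2 + \gamma_{ZM}^2\sigma_Z^2 + \sigma_M^2.
\end{aligned} \tag{13}$$
Plugging the above expressions into (12) yields
$$\operatorname{plim}\,\hat{\beta}_{MY\cdot X} = \beta_0 + \frac{\gamma_{ZY}\gamma_{ZM}\sigma_Z^2}{\gamma_{ZM}^2\sigma_Z^2 + \sigma_M^2} \tag{14}$$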
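Combining (9), (10), (11) with (14), we obtain the probability limit of each estimator:
$$\begin{aligned}
\operatorname{plim}\,\hat{\delta}_A &= (\alpha_0 + \gamma_{XZ}\gamma_{ZM})\,\beta^*, & \operatorname{plim}\,\hat{\delta}_B &= (\alpha_0 + \gamma_{XZ}\gamma_{ZM})\,\beta_0,\\
\operatorname{plim}\,\hat{\delta}_C &= \alpha_0\,\beta^*, & \operatorname{plim}\,\hat{\delta}_D &= \alpha_0\,\beta_0,
\end{aligned}$$
where β* = β0 + γZYγZMσ²Z / (γZM²σ²Z + σ²M) is the probability limit of β̂MY•X. Let δ0 = (α0 + γXZγZM) β0 be the true mediation effect, then the asymptotic bias of each estimator is given by
$$\begin{aligned}
\operatorname{plim}\,\hat{\delta}_A - \delta_0 &= (\alpha_0 + \gamma_{XZ}\gamma_{ZM})(\beta^* - \beta_0), & \operatorname{plim}\,\hat{\delta}_B - \delta_0 &= 0,\\
\operatorname{plim}\,\hat{\delta}_C - \delta_0 &= \alpha_0(\beta^* - \beta_0) - \gamma_{XZ}\gamma_{ZM}\beta_0, & \operatorname{plim}\,\hat{\delta}_D - \delta_0 &= -\gamma_{XZ}\gamma_{ZM}\beta_0.
\end{aligned}$$
Therefore, δ̂B is asymptotically unbiased while δ̂A, δ̂C, and δ̂D are asymptotically biased with the bias depending on the underlying model parameters.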
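To make the Case 3.4 limits concrete, the sketch below evaluates them numerically. It assumes unit error variances and the parameter values α0 = 0.14, β0 = 0.39, γ = 0.59 used later in the simulation; the expression for the limit of β̂MY•X is our own omitted-variable calculation from the Case 3.4 model, and all variable names are illustrative.

```python
# Illustrative parameter values (those used later in the simulation study).
alpha0, beta0, gamma = 0.14, 0.39, 0.59
g_xz = g_zm = g_zy = gamma          # all Z-related path coefficients equal gamma
var_z = var_m = 1.0                 # assumed unit error variances

# Limits of the first stage estimators: total vs. direct effect of X on M.
alpha_total = alpha0 + g_xz * g_zm  # limit of alpha_hat_XM (Z omitted)
alpha_direct = alpha0               # limit of alpha_hat_XM.Z (Z included)

# Limits of the second stage estimators.
beta_adj = beta0                    # Z included
beta_star = beta0 + g_zy * g_zm * var_z / (g_zm ** 2 * var_z + var_m)  # Z omitted

delta0 = alpha_total * beta0        # true mediation effect in Case 3.4

plim = {
    "A": alpha_total * beta_star,
    "B": alpha_total * beta_adj,
    "C": alpha_direct * beta_star,
    "D": alpha_direct * beta_adj,
}
rel_bias = {s: (v - delta0) / delta0 for s, v in plim.items()}

for s in "ABCD":
    print(f"strategy {s}: plim = {plim[s]:.4f}, relative bias = {rel_bias[s]:+.3f}")
```

Under these assumptions the asymptotic relative biases come out near 0.66, 0, −0.52 and −0.71 for strategies A to D, so only strategy B recovers δ0.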
To understand the bias properties of different estimators, note that in the construction of δ̂B, Z is correctly included in the second stage regression. Had Z been omitted, the effect of M on Y would be inconsistently estimated by the second stage OLS regression. This is the case for estimators δ̂A and δ̂C, which explains their inconsistency. On the other hand, Z is not included in the first stage regression of strategy B. Given that Z causes M and X causes Z, the first stage OLS estimator α̂XM that ignores the effect of Z seems to suffer from the omitted variable bias. However, our objective is to estimate the total effect of X on M. When Z is omitted, the first stage OLS estimator α̂XM captures not only the direct effect of X on M but also the indirect effect of X on M through the intermediate Z. Hence α̂XM delivers exactly what we want. In contrast, by including Z in the first stage regression, the first stage OLS estimator α̂XM•Z captures only the direct effect of X on M. As a result, the estimator δ̂D, which is based on α̂XM•Z, is inconsistent for the true mediation effect.
Next, we consider the general cases. The probability limits of estimators under the four estimation strategies for the different causal patterns are summarized in the last four columns of Table 2. As most of the probability limits have complicated forms, only a few examples are given in Table 2. These probability limits equal the limit of the first stage OLS estimator multiplied by that of the second stage OLS estimator. If either of the two estimators is inconsistent, the resulting estimator for the mediation effect is inconsistent. The configurations that lead to the inconsistency of the two estimators can be described as follows.
First, the first stage estimator α̂XM is inconsistent only when the causal diagram contains X ← Z → M, in which case the omitted covariate Z affects M and is correlated with the included covariate X in the first stage regression. Omitting Z leads to the well-known omitted-variable bias.
Second, the first stage estimator α̂XM•Z is inconsistent only when the causal diagram contains one of the following: X → Z → M, X → M → Z, or X → M → Y → Z.
The first case is easy to understand. α̂XM•Z is inconsistent for α0 + γXZγZM, the total effect of X on M, because α̂XM•Z converges to α0, the direct effect of X on M. See also the discussion for Case 3.4 above. For the last two cases, Z is at the end of the causal chain X → M. Including Z in the first stage regression confounds the causal relationship between X and M. Because M causes Z, M and Z are statistically correlated. Regressing M on X controlling for the effect of Z gives us the statistical association between X and M but not the causal relationship that X causes M. Therefore, including Z as a regressor invalidates the causal interpretation of the regression coefficients in the first stage regression. As a result, α̂XM•Z does not provide an asymptotically unbiased estimator of the causal relationship from X to M.
Third, the second stage estimator β̂MY•X is inconsistent only when the causal diagram contains M ← Z → Y, in which case the omitted variable bias is present.
Finally, the second stage estimator β̂MY•XZ is inconsistent only when the causal diagram contains one of the following: M → Z → Y or M → Y → Z.
In the first case, the OLS estimator β̂MY•XZ only accounts for the direct effect of M on Y and ignores the indirect effect through Z. In the second case, Z is at the end of the causal chain M → Y → Z. The same reason for the asymptotic bias of α̂XM•Z applies to β̂MY•XZ.
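The four configurations can be written as simple rules and checked against the grouping of the 20 patterns. The sketch below is illustrative: the edge strings, the function name, and the table of cases are hypothetical encodings of Table 1 and of the groups in Table 3, and the rules are our reading of the configurations described above.

```python
def consistent_strategies(z_edges):
    """Return the strategies (a subset of 'ABCD') that the rules mark as consistent."""
    e = set(z_edges)
    # First stage without Z fails only for X <- Z -> M.
    a_xm_ok = not {"Z->X", "Z->M"} <= e
    # First stage with Z fails for X -> Z -> M, or when Z lies downstream of M.
    a_xm_z_ok = not ({"X->Z", "Z->M"} <= e or "M->Z" in e or "Y->Z" in e)
    # Second stage without Z fails only for M <- Z -> Y.
    b_my_x_ok = not {"Z->M", "Z->Y"} <= e
    # Second stage with Z fails for M -> Z -> Y or M -> Y -> Z.
    b_my_xz_ok = not ({"M->Z", "Z->Y"} <= e or "Y->Z" in e)
    ok = {
        "A": a_xm_ok and b_my_x_ok,
        "B": a_xm_ok and b_my_xz_ok,
        "C": a_xm_z_ok and b_my_x_ok,
        "D": a_xm_z_ok and b_my_xz_ok,
    }
    return "".join(s for s in "ABCD" if ok[s])

# case -> (edges involving Z, group of consistent strategies as listed in Table 3)
CASES = {
    "0.0": ("", "ABCD"), "1.1": ("Z->X", "ABCD"), "1.2": ("Z->M", "ABCD"),
    "1.3": ("Z->Y", "ABCD"), "1.4": ("Z->M Z->Y", "BD"),
    "1.5": ("Z->X Z->Y", "ABCD"), "1.6": ("Z->X Z->M", "CD"),
    "1.7": ("Z->X Z->M Z->Y", "D"), "2.1": ("X->Z", "ABCD"),
    "2.2": ("M->Z", "AB"), "2.3": ("Y->Z", "A"), "2.4": ("M->Z Y->Z", "A"),
    "2.5": ("X->Z Y->Z", "A"), "2.6": ("X->Z M->Z", "AB"),
    "2.7": ("X->Z M->Z Y->Z", "A"), "3.1": ("M->Z Z->Y", "A"),
    "3.2": ("X->Z M->Z Z->Y", "A"), "3.3": ("X->Z Z->M", "AB"),
    "3.4": ("X->Z Z->M Z->Y", "B"), "3.5": ("X->Z Z->Y", "ABCD"),
}

mismatches = [case for case, (edges, group) in CASES.items()
              if consistent_strategies(edges.split()) != group]
print(mismatches)  # [] when the rules reproduce every group
```

An empty list indicates that the four rules alone are enough to recover the grouping of all 20 causal patterns.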
On the basis of the asymptotic bias, the 20 causal patterns can be grouped into seven categories, listed in the second column of Table 3. More details are given in Section 4.2.
Table 3.
Empirical Relative Bias of Estimators A – D (n = 1000; α0 = 0.14, β0 = 0.39, γ = 0.59, τ = 0.2, X ~ N(0, 1))
| Case | Group | A | B | C | D |
|---|---|---|---|---|---|
| Case 2.3 | A | 0.00 | −0.26 | −0.17 | −0.39 |
| Case 2.4 | A | 0.00 | −0.92 | −0.68 | −0.97 |
| Case 2.5 | A | 0.00 | −0.26 | −0.88 | −0.91 |
| Case 2.7 | A | 0.00 | −0.92 | −2.38 | −1.11 |
| Case 3.1 | A | −0.01 | −0.48 | −0.27 | −0.62 |
| Case 3.2 | A | 0.00 | −0.47 | −2.10 | −1.58 |
| Case 2.2 | AB | 0.00 | 0.00 | −0.26 | −0.26 |
| Case 2.6 | AB | 0.01 | 0.01 | −2.09 | −2.09 |
| Case 3.3 | AB | 0.00 | 0.00 | −0.72 | −0.72 |
| Case 0.0 | ABCD | 0.01 | 0.01 | 0.01 | 0.01 |
| Case 1.1 | ABCD | 0.00 | 0.00 | −0.01 | −0.01 |
| Case 1.2 | ABCD | 0.00 | 0.00 | 0.00 | 0.00 |
| Case 1.3 | ABCD | 0.00 | 0.00 | 0.00 | 0.00 |
| Case 1.5 | ABCD | 0.00 | 0.01 | 0.00 | 0.01 |
| Case 2.1 | ABCD | −0.02 | −0.02 | −0.02 | −0.02 |
| Case 3.5 | ABCD | −0.01 | −0.01 | −0.01 | −0.01 |
| Case 3.4 | B | 0.66 | 0.00 | −0.53 | −0.72 |
| Case 1.4 | BD | 0.64 | −0.02 | 0.64 | −0.02 |
| Case 1.6 | CD | 1.83 | 1.82 | −0.01 | −0.02 |
| Case 1.7 | D | 3.36 | 1.85 | 0.53 | 0.01 |
4 Simulation
We use SAS® (Version 9.1) for all statistical simulation and analysis. Variables are generated from the normal distribution using the SAS RANNOR function with seed=1,000,000. We consider sample sizes of 100, 200, 500 and 1000. For simplicity, we assume that all the path coefficients between Z and X, M, and Y are the same and equal γ. In reality, this assumption certainly does not hold. Adopting the procedures by MacKinnon et al. (2002), parameter values α0, β0, and γ are chosen to correspond to effect sizes of small (2% of partial variance in the dependent variable), medium (13% of partial variance in the dependent variable), and large (26% of the partial variance in the dependent variable), as described in Cohen (1988, pp. 412–414). These parameters are 0.14, 0.39, and 0.59, corresponding to partial correlations of 0.14, 0.36, and 0.51, respectively. The direct effect τ is chosen to be 0 (complete mediation) and 0.2 for a partial mediation. Variables M, Y, and Z are simulated as continuous variables following a normal distribution. The independent variable X is assumed to follow either a normal distribution or a Bernoulli distribution with success probability 0.3. In the application, the probability of having at least one Apoe ε4 allele is 0.29. Because the intercept does not affect the estimation of the mediation effect, without loss of generality, we set all the intercepts to be zero in the simulation of data generation, but include them in the model fitting. All the random noise terms are assumed to be independent, identically and normally distributed with mean zero and variance one.
In summary, the simulation uses a 3 × 3 × 3 × 2 × 2 × 4 × 20 factorial design. We vary the factors of effect size of path α0 (0.14 for small, 0.39 for medium, and 0.59 for large), effect size of path β0 (0.14 for small, 0.39 for medium, and 0.59 for large), effect size of path γ (0.14 for small, 0.39 for medium, and 0.59 for large), direct effect τ (0 and 0.2), distribution of X (standard normal and Bernoulli with probability 0.3), sample size (100, 200, 500, and 1000), and the 20 causal patterns in Table 1, for a total of 8640 different data generating processes (DGP). For each DGP, 500 replications are conducted. To compare bias across different levels of the mediation effects, we calculate the empirical relative bias, defined by
$$\text{relative bias} = \frac{\hat{\delta} - \delta_0}{\delta_0},$$
where δ̂ is the estimated mediation effect and δ0 is the true indirect effect, as defined in Section 3. The relative biases are summarized across the 500 replications to evaluate the empirical performance of the four estimators under each causal pattern, as reported in Section 4.2.
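The size of the factorial design and the relative bias calculation can be sketched in a few lines. The snippet below is illustrative only; the factor labels and the function names `relative_bias` and `mean_relative_bias` are hypothetical, and averaging over replications is one natural way to summarize the relative bias.

```python
from itertools import product
from statistics import mean

EFFECT_SIZES = (0.14, 0.39, 0.59)                 # small, medium, large
CASES = (["0.0"] + [f"1.{i}" for i in range(1, 8)]
         + [f"2.{i}" for i in range(1, 8)] + [f"3.{i}" for i in range(1, 6)])

design = list(product(
    EFFECT_SIZES,                 # alpha0
    EFFECT_SIZES,                 # beta0
    EFFECT_SIZES,                 # gamma
    (0.0, 0.2),                   # direct effect tau
    ("normal", "bernoulli"),      # distribution of X
    (100, 200, 500, 1000),        # sample size
    CASES,                        # causal pattern
))
print(len(design))  # 8640

def relative_bias(estimate, delta0):
    """Relative bias of a single estimate of the mediation effect."""
    return (estimate - delta0) / delta0

def mean_relative_bias(estimates, delta0):
    """Average relative bias over a collection of replications."""
    return mean(relative_bias(e, delta0) for e in estimates)
```

Dividing by δ0 puts estimates from data generating processes with very different true effects on a common scale.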
4.1 Example
We use Case 3.4, shown in Figure 1, to illustrate how the data are generated. When α0 = 0.14, β0 = 0.39, γ = 0.59, τ = 0.2, and X follows a standard normal distribution, the sample is generated by
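$$\begin{aligned}
X &= \varepsilon_X,\\
Z &= 0.59\,X + \varepsilon_Z,\\
M &= 0.14\,X + 0.59\,Z + \varepsilon_M,\\
Y &= 0.39\,M + 0.2\,X + 0.59\,Z + \varepsilon_Y,
\end{aligned} \tag{15}$$
where εX, εZ, εM, εY ~ i.i.d. N(0, 1). To generate a binary variable X, the distribution of X, in equation (15), is replaced by X ~ Bernoulli(0.3).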
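The data generation for Case 3.4 and the four estimation strategies can be sketched as follows. This is an illustration in standard-library Python rather than the SAS code used for the study; the helper names, the seed, and the single large sample (chosen so that sampling noise is small) are assumptions, and a library least-squares routine would normally replace the small hand-written solver.

```python
import random

def ols_slopes(y, *regressors):
    """OLS of y on an intercept plus the given regressors; returns the slopes."""
    cols = [[1.0] * len(y)] + [list(r) for r in regressors]
    k = len(cols)
    # Augmented normal equations [X'X | X'y].
    a = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols]
         + [sum(u * v for u, v in zip(ci, y))] for ci in cols]
    # Gauss-Jordan elimination with partial pivoting.
    for i in range(k):
        p = max(range(i, k), key=lambda r: abs(a[r][i]))
        a[i], a[p] = a[p], a[i]
        for r in range(k):
            if r != i:
                f = a[r][i] / a[i][i]
                a[r] = [u - f * v for u, v in zip(a[r], a[i])]
    return [a[i][k] / a[i][k - k + i] for i in range(1, k)]

def simulate_case_3_4(n, rng, alpha0=0.14, beta0=0.39, gamma=0.59, tau=0.2):
    """Draw one sample from the Case 3.4 model with standard normal errors."""
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    z = [gamma * xi + rng.gauss(0.0, 1.0) for xi in x]
    m = [alpha0 * xi + gamma * zi + rng.gauss(0.0, 1.0) for xi, zi in zip(x, z)]
    y = [beta0 * mi + tau * xi + gamma * zi + rng.gauss(0.0, 1.0)
         for mi, xi, zi in zip(m, x, z)]
    return x, z, m, y

def four_strategies(x, z, m, y):
    """Product-of-coefficients estimates under strategies A, B, C and D."""
    a_xm = ols_slopes(m, x)[0]           # first stage, Z omitted
    a_xm_z = ols_slopes(m, x, z)[0]      # first stage, Z included
    b_my_x = ols_slopes(y, m, x)[0]      # second stage, Z omitted
    b_my_xz = ols_slopes(y, m, x, z)[0]  # second stage, Z included
    return {"A": a_xm * b_my_x, "B": a_xm * b_my_xz,
            "C": a_xm_z * b_my_x, "D": a_xm_z * b_my_xz}

rng = random.Random(2024)                # arbitrary seed
x, z, m, y = simulate_case_3_4(20000, rng)
delta0 = (0.14 + 0.59 * 0.59) * 0.39     # true mediation effect in Case 3.4
rel_bias = {s: (d - delta0) / delta0
            for s, d in four_strategies(x, z, m, y).items()}
for s in "ABCD":
    print(f"strategy {s}: relative bias {rel_bias[s]:+.2f}")
```

Replacing the first line of `simulate_case_3_4` with draws from a Bernoulli(0.3) distribution would give the binary-X variant of the same design.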
4.2 Results
The simulation results for different effect sizes of α0, β0, and γ are similar across each of these 20 causal patterns. Regardless of the magnitude of the effect sizes, or the distribution of the independent variable, or the magnitude of the direct effect (0 or 0.2), the relative bias demonstrates similar patterns for each causal pattern and the results are quite stable even at sample size 100. In Figure 2, we plot the asymptotic relative bias and the empirical relative bias (500 replications) for various sample sizes for Case 3.4 when α0 = 0.14, β0 = 0.39, γ = 0.59, τ = 0.2, and X follows a standard normal distribution. Only strategy B is asymptotically unbiased. For all estimation strategies, the empirical relative bias is very close to the asymptotic relative bias. Strategy A overestimates the causal relationship of M → Y and thus overestimates the mediation effect; strategy C underestimates the causal relationship of X → M, overestimates the causal relationship of M → Y, and in combination underestimates the mediation effect; strategy D underestimates the causal relationship of X → M and thus underestimates the mediation effect. Table 3 presents the relative bias when α0 = 0.14, β0 = 0.39, γ = 0.59, τ = 0.2, n = 1000, and X follows a standard normal distribution, for all 20 causal patterns.
Figure 2.

Case 3.4: Asymptotic Relative Bias (horizontal line) and Empirical Relative Bias (horizontal bar ± 1 SD) for Estimators from Strategies A – D, with True Values α0 = 0.14, β0 = 0.39, γ = 0.59, τ = 0.2, X ~ N(0, 1).
Similar to the asymptotic relative bias results, we can classify our simulation results for these 20 causal patterns into seven different groups, listed in the second column of Table 3. Group “A” contains six causal patterns (Case 2.3, 2.4, 2.5, 2.7, 3.1, and 3.2) where only estimator δ̂A is consistent; group “AB” consists of three causal patterns (Case 2.2, 2.6, and 3.3), for which δ̂A and δ̂B are consistent; group “ABCD” consists of seven causal patterns (Case 0.0, 1.1, 1.2, 1.3, 1.5, 2.1, and 3.5), for which all four estimators are consistent; group “B” consists of one causal pattern (Case 3.4) where only δ̂B provides a consistent estimate of the mediation effect; group “BD” consists of one causal pattern (Case 1.4) where both δ̂B and δ̂D provide consistent estimates; group “CD” consists of causal pattern Case 1.6 where both δ̂C and δ̂D are consistent; the last group “D” describes one causal pattern, Case 1.7, where only δ̂D is consistent. In summary, not a single estimation strategy is unbiased for all causal patterns.
Notice that under Cases 3.3 and 3.4, the confounding factor contributes to the total mediation effect as part of the causal pathway. When the investigator is interested in estimating the partial mediation effect that does not go through the confounding factor, strategy D provides an asymptotically unbiased estimate.
5 Application
As demonstrated in Section 4, there is no gold standard strategy currently available for the adjustment of potentially confounding factors when estimating a mediation effect. Thus, the choice of strategy depends on a variety of factors. In this section, we present an application to illustrate one potential approach for selecting an appropriate estimation strategy.
The clinical manifestations of Alzheimer’s disease (AD) are a complex function of multiple genetic and environmental factors causing or interacting with the pathology of AD, and other pathological and biochemical changes in the brain. Bennett et al. (2003) used data from 125 deceased persons participating in the Religious Orders Study, a longitudinal, clinical-pathologic study of aging and AD, to test the hypothesis that the Apoe ε4 allele, a known risk factor for clinical AD, is associated with level of cognitive function through an association with measures of AD pathology rather than other brain lesions. In their analysis, the independent variable was the presence of one or two Apoe ε4 alleles, the mediator was AD pathology defined as neuritic plaques and neurofibrillary tangles standardized and combined into a composite measure of global pathology score, and the dependent variable was level of cognitive function before death defined as 19 cognitive function tests standardized and combined into a composite global measure of cognition. In summary, the hypothesized causal path is Apoe ε4 → AD Pathology → Cognitive Function.
While younger people may get AD, the disease usually begins after age 65 and risk increases substantially with age. Fewer than 5 percent of men and women ages 65 to 74 have AD, and nearly half of those age 85 and older may have the disease (Evans et al., 1989). It is important to note, however, that AD is not a normal part of aging. In the Apoe ε4 → AD Pathology → Cognitive Function association, age is strongly related to both AD pathology and cognitive function and constitutes a major confounding factor.
In this section, we provide estimates of the mediation effect of Apoe ε4 through AD pathology by using all four estimation strategies, and we use this example to illustrate an approach to choosing an appropriate estimation strategy to adjust for the potential confounding effects of age in the estimation of the mediation effects in the study of common chronic age related conditions. The general approach is summarized into three steps. In the first step, one needs to identify all possible causal patterns between the confounding factor age and Apoe ε4, Pathology, and Cognitive Function conceptually. A person is born with or without Apoe ε4, thus the causal link age (Z) → Apoe ε4 (X) does not hold and, in reference to Table 1, Cases 1.1, 1.5, 1.6, and 1.7 can be excluded. By contrast, because there is evidence that Apoe ε4 is related to mortality (Hayden et al., 2005), we cannot entirely exclude the possibility of Apoe ε4 → age (Cases 2.1, 2.5 – 2.7, 3.2 – 3.5 in Table 1). At the same time, increasing age is associated with both the accumulation of AD pathology (M) and loss of cognitive function (Y). This identifies Case 1.4 and Case 3.4 (Table 1) as two possible causal patterns. In the second step, one needs to identify the appropriate estimation strategies for the identified causal patterns. According to Table 3, for Case 1.4, both estimator B and estimator D provide unbiased estimates. For Case 3.4, estimator B provides an unbiased result, but estimator D has an asymptotic bias of –γXZγZMβ0. This asymptotic bias is created by the mediation effect through the causal path X → Z → M → Y, which was inappropriately adjusted for in the first estimation stage. In the third step, we obtain the four estimates, and then compare them to see whether estimates B and D are close to one another, and markedly different from estimates A and C.
Finding similar results from estimates B and D and different results from estimates A and C would provide strong evidence in favor of the conceptual causal pattern 1.4 identified in the first step. In our application, the estimates B (−0.387, 95% CI (−0.611, −0.182)) and D (−0.378, 95% CI (−0.606, −0.177)) are quite close, and are very different from estimates A (−0.445, 95% CI (−0.707, −0.221)) and C (−0.435, 95% CI (−0.699, −0.209)). Confidence intervals are obtained using the bootstrap method. The results appear to confirm our conceptual understanding of the relationship between age and Apoe ε4, AD Pathology, and Cognitive Function. An appropriate point estimate of the mediation effect would be −0.387 (estimate B). The difference between estimates D and B (0.009) would be an empirical estimate of –γXZγZMβ0 in Case 3.4, and random noise in Case 1.4. Because strategy D over-adjusts the effect of the confounding factor in Case 3.4, one should choose strategy B to estimate the mediation effect (−0.387).
6 Discussion
As presented in our paper, no single strategy fits all 20 causal patterns. To ease strategy selection and estimation of the indirect effect, we provide a general guideline covering the various causal patterns. When estimating the indirect effect, one first needs to consider all possible ways in which a fourth variable Z might interfere in the causal path X → M → Y as a confounding factor. Depending on whether there are one or two pathways from X to M and then to Y, we offer the following guidelines:
First, consider the cases where there are two pathways. When X → Z → M but Z ↛ Y, δ̂A should be used. One may also use δ̂B in this case but it incurs the unnecessary cost of collecting Z. When X → Z → M and Z → Y, δ̂B should be used. When M → Z → Y, δ̂A should be used.
Second, consider the cases where there is only one pathway. If Z → M and Z → Y but Z ↛ X, both δ̂B and δ̂D could be used. If Z → M and Z → Y and Z → X, δ̂D should be used. If Z → X and Z → M but Z ↛ Y, both δ̂C and δ̂D could be used. For all other cases, δ̂A is recommended.
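The guideline above can be read as a small decision rule. The sketch below is one possible transcription into Python and is not part of the original analysis: the function name `recommend_estimators` and its boolean arguments (one per hypothesized arrow involving Z) are illustrative assumptions, and the check for cycles is only partial.

```python
def recommend_estimators(x_to_z=False, z_to_x=False, z_to_m=False,
                         m_to_z=False, z_to_y=False):
    """Return the estimator label(s) suggested by the guideline.

    Each flag states whether the corresponding causal arrow is assumed
    to exist. This is a hypothetical encoding of the verbal guideline,
    not a substitute for checking the full table of 20 causal patterns.
    """
    # The guideline assumes no cycling; reject a few obvious cycles.
    if (x_to_z and z_to_x) or (z_to_m and m_to_z) or (m_to_z and z_to_x):
        raise ValueError("the hypothesized arrows form a cycle")

    # Two pathways from X to M: X -> M and X -> Z -> M.
    if x_to_z and z_to_m:
        # B is required when Z also affects Y; otherwise A suffices
        # and B is usable at the extra cost of collecting Z.
        return ["B"] if z_to_y else ["A", "B"]

    # Two pathways from M to Y: M -> Y and M -> Z -> Y.
    if m_to_z and z_to_y:
        return ["A"]

    # One pathway, Z affects both M and Y.
    if z_to_m and z_to_y:
        return ["D"] if z_to_x else ["B", "D"]

    # One pathway, Z affects X and M but not Y.
    if z_to_x and z_to_m:
        return ["C", "D"]

    # All other cases.
    return ["A"]


# Z -> M and Z -> Y only (the pattern called Case 1.4 in the application).
print(recommend_estimators(z_to_m=True, z_to_y=True))               # ['B', 'D']
# X -> Z -> M and Z -> Y (the pattern called Case 3.4 in the application).
print(recommend_estimators(x_to_z=True, z_to_m=True, z_to_y=True))  # ['B']
```

Both example calls agree with the estimators identified for Cases 1.4 and 3.4 in the application.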
In summary, the researcher needs to know, a priori, what the model is or have a good idea how to restrict the choices before causal modeling can be reasonably applied. In many instances, such a priori knowledge is unavailable. Using data from a pilot study, for example, for each potential confounding factor, one can hypothesize a possible causal pattern, conduct the analysis using all four strategies, and then examine the 20 causal patterns listed in Table 3 to see whether the actual results are consistent with the conceptual causal path.
In application, an investigator usually adopts either strategy A, completely ignoring the possible confounding factor, or strategy D, adjusting for the possible confounding factor at every stage of the regression. The common misconception underlying the selection of strategy D is that an unbiased estimate is obtained only after adjusting for the potential confounding factor in all regressions. However, one should bear in mind that strategy A fails in Cases 1.4, 1.6, 1.7, and 3.4, where the causal path Z → M exists together with at least one other causal path, Z → X or Z → Y, and that strategy D fails in Cases 2.2 – 3.4, where Z is at the end of a causal path from X, M, and/or Y. Therefore, adopting estimation strategy A or D without further consideration of the possible causal patterns can lead to bias.
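A small simulation can make the failure of strategies A and D concrete for the pattern X → Z → M with Z → Y (Case 3.4). The sketch below is illustrative only. It assumes, based on the references to first-stage and second-stage adjustment in this paper, that strategy A adjusts for Z in neither regression, B only in the regression for Y, C only in the regression for M, and D in both. The parameter values, the sample size, and the helper names `ols` and `simulate_case_3_4` are arbitrary choices of ours.

```python
import random


def ols(y, columns):
    """Least-squares slopes of y on the given columns (intercept fitted, not returned)."""
    design = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    p = len(design[0])
    # Augmented normal equations [X'X | X'y].
    aug = [[sum(row[j] * row[k] for row in design) for k in range(p)]
           + [sum(row[j] * yi for row, yi in zip(design, y))]
           for j in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(aug[r][c]))
        aug[c], aug[pivot] = aug[pivot], aug[c]
        for r in range(p):
            if r != c:
                f = aug[r][c] / aug[c][c]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[c])]
    return [aug[j][p] / aug[j][j] for j in range(1, p)]


def simulate_case_3_4(n=20000, seed=1):
    """Simulate X -> Z -> M, X -> M, M -> Y, Z -> Y, X -> Y and apply the four strategies."""
    rng = random.Random(seed)
    alpha0, beta0, tau = 0.5, 0.4, 0.3   # X -> M, M -> Y, X -> Y
    g_xz, g_zm, g_zy = 0.6, 0.5, 0.7     # X -> Z, Z -> M, Z -> Y
    x = [rng.gauss(0, 1) for _ in range(n)]
    z = [g_xz * xi + rng.gauss(0, 1) for xi in x]
    m = [alpha0 * xi + g_zm * zi + rng.gauss(0, 1) for xi, zi in zip(x, z)]
    y = [tau * xi + beta0 * mi + g_zy * zi + rng.gauss(0, 1)
         for xi, mi, zi in zip(x, m, z)]
    a_plain = ols(m, [x])[0]        # first stage without Z
    a_adj = ols(m, [x, z])[0]       # first stage adjusted for Z
    b_plain = ols(y, [x, m])[1]     # second stage without Z
    b_adj = ols(y, [x, m, z])[1]    # second stage adjusted for Z
    return {"A": a_plain * b_plain, "B": a_plain * b_adj,
            "C": a_adj * b_plain, "D": a_adj * b_adj}


est = simulate_case_3_4()
for label in "ABCD":
    print(label, round(est[label], 3))
```

With these assumed values the target mediation effect is (α0 + γXZγZM)β0 = 0.32 and –γXZγZMβ0 = −0.12, so estimate B should land near 0.32, estimate D near 0.20, and estimate A well away from 0.32.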
In a mediation analysis, the investigator should try to collect all possible confounding factors that might directly affect both the mediator and the dependent variable (Case 1.4 and Case 1.7), or both the independent variable and the mediator (Case 1.6 and Case 1.7), and adjust for them in the analysis. When a confounding factor is an intermediate variable between the independent variable and the mediator and it also affects the dependent variable (Case 3.4), information for this confounding factor should be collected and adjusted for in the second stage regression. For all other scenarios, data collection on the confounding factor is unnecessary.
This paper relies on several crucial assumptions: 1) no cycling in the causal pathway; 2) a univariate confounding factor, covariate, and mediator; and 3) linear relationships among the variables. We can relax the second assumption by allowing multiple confounding factors and covariates. To relax the third assumption, we can specify appropriate link functions in equations (1) – (2) and change the four estimation strategies accordingly. Further research is warranted to study the properties of the four estimators under these relaxed assumptions.
We use the bias patterns in Table 2 to help deduce and confirm the underlying causal pathways. In cases where two or more candidate pathways have the same bias pattern, or where the four estimates are not distinct, the investigator should resort to the scientific literature to resolve the ambiguity. Notice that incorrectly accounting for the effect of Z not only biases the point estimate of the mediation effect, but also affects the standard error. In Figure 2, the standard errors are largest for estimator A and smallest for estimator D. Further research is needed to study the sampling distributions of the mediation effects obtained from the four estimation strategies. It will be of great interest to investigate whether we can rule out certain pathways or pin down the correct pathway by comparing the four estimates.
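Sampling variability of a product-of-coefficients estimate can be examined empirically with a percentile bootstrap, the general device used for the confidence intervals in our application. The sketch below is a minimal illustration on synthetic data, not the SAS program used for the paper. It covers only the unadjusted estimate (no Z in either regression), and the function names, parameter values, and number of resamples are our own assumptions.

```python
import random


def cov(u, v):
    """Sample covariance of two equal-length sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)


def indirect_effect(x, m, y):
    """Slope of M on X times the partial slope of Y on M given X."""
    sxx, smm = cov(x, x), cov(m, m)
    sxm, sxy, smy = cov(x, m), cov(x, y), cov(m, y)
    alpha_hat = sxm / sxx
    beta_hat = (smy * sxx - sxm * sxy) / (sxx * smm - sxm ** 2)
    return alpha_hat * beta_hat


def percentile_ci(x, m, y, n_boot=500, level=0.95, seed=0):
    """Percentile bootstrap interval from resampling subjects with replacement."""
    rng = random.Random(seed)
    n = len(x)
    draws = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        draws.append(indirect_effect([x[i] for i in idx],
                                     [m[i] for i in idx],
                                     [y[i] for i in idx]))
    draws.sort()
    lower = draws[int((1 - level) / 2 * n_boot)]
    upper = draws[int((1 + level) / 2 * n_boot) - 1]
    return lower, upper


# Synthetic data with arbitrary coefficients (true indirect effect -0.2).
rng = random.Random(42)
x = [rng.gauss(0, 1) for _ in range(200)]
m = [0.5 * xi + rng.gauss(0, 1) for xi in x]
y = [0.2 * xi - 0.4 * mi + rng.gauss(0, 1) for xi, mi in zip(x, m)]

est = indirect_effect(x, m, y)
lo, hi = percentile_ci(x, m, y)
print(round(est, 3), (round(lo, 3), round(hi, 3)))
```

Repeating the resampling loop for each of the four strategies would give a rough empirical view of how their standard errors differ for a given data set.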
The terms mediation, confounder, and confounding differ on conceptual grounds (Baron and Kenny, 1986; Greenland and Morgenstern, 2001; MacKinnon, Krull and Lockwood, 2000). In this article, we consider only the situation in which a fourth variable acts as a confounding factor in the estimation of a mediation effect. It might seem that the 20 causal patterns are exhaustive. However, in application, the true underlying causal pattern might be much more complicated: the assumption of no cycling might not hold, or the fourth variable might act as an effect modifier (moderator), or as both an effect modifier (moderator) and a confounding factor.
Acknowledgments
Support for this research was provided by U. S. National Institute on Aging grants P30 AG10161, R01 AG15819, and R01 AG17917. We thank Kristopher Preacher and Andrew Hayes for providing the SAS® program for their 2004 paper; and we thank Kristopher Preacher for helpful comments on an earlier draft. SAS is a trademark or registered trademark of SAS Institute Inc., Cary, NC, USA; ® indicates USA registration.
Footnotes
The SAS program and a worked example will be available at http://www.rush.edu/radc or https://biostat.ucsd.edu/~yli.
References
- Alwin DF, Hauser RM. The decomposition of effects in path analysis. American Sociological Review. 1975;40:37–47.
- Baron RM, Kenny DA. The moderator-mediator distinction in social psychological research: Conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology. 1986;51:1173–1182. doi: 10.1037//0022-3514.51.6.1173.
- Bennett DA, Wilson RS, Schneider JA, Evans DA, Aggarwal NT, Arnold SE, Cochran EJ, Berry-Kravis E, Bienias JL. Apolipoprotein E ε4 allele, AD pathology, and the clinical expression of Alzheimer’s disease. Neurology. 2003;60:246–252. doi: 10.1212/01.wnl.0000042478.08543.f7.
- Bollen KA. Total, direct, and indirect effects in structural equation models. Sociological Methodology. 1987;17:37–69.
- Cohen J. Statistical Power Analysis for the Behavioral Sciences. Hillsdale, NJ: Erlbaum; 1988.
- Evans DA, Funkenstein HH, Albert MS, Scherr PA, Cook NR, Chown MJ, Hebert LE, Hennekens CH, Taylor JO. Prevalence of Alzheimer’s disease in a community population of older persons: Higher than previously reported. The Journal of the American Medical Association. 1989;262(18):2552–2556.
- Ferri CP, Prince M, Brayne C, Brodaty H, Fratiglioni L, Ganguli M, Hall K, Hasegawa K, Hendrie H, Huang Y, Jorm A, Mathers C, Menezes PR, Rimmer E, Scazufca M; Alzheimer’s Disease International. Global prevalence of dementia: a Delphi consensus study. Lancet. 2005;366:2112–2117. doi: 10.1016/S0140-6736(05)67889-0.
- Fox J. Effect analysis in structural equation models. Sociological Methods and Research. 1980;9:3–28.
- Greenland S, Morgenstern H. Confounding in health research. Annual Review of Public Health. 2001;22:189–212. doi: 10.1146/annurev.publhealth.22.1.189.
- Hayden KM, Zandi PP, Lyketsos CG, Tschanz JT, Norton MC, Khachaturian AS, Pieper CF, Welsh-Bohmer KA, Breitner JC. Apolipoprotein E genotype and mortality: Findings from the Cache County Study. Journal of the American Geriatrics Society. 2005;53:935–942. doi: 10.1111/j.1532-5415.2005.53301.x.
- Hebert LE, Scherr PA, Bienias JL, Bennett DA, Evans DA. Alzheimer disease in the US population: Prevalence estimates using the 2000 census. Archives of Neurology. 2003;60:1119–1122. doi: 10.1001/archneur.60.8.1119.
- Herting J. Evaluating and rejecting true mediation models: A cautionary note. Prevention Science. 2002;3:285–289. doi: 10.1023/a:1020828709115.
- MacKinnon DP, Krull JL, Lockwood CM. Equivalence of the mediation, confounding and suppression effect. Prevention Science. 2000;1:173–181. doi: 10.1023/a:1026595011371.
- MacKinnon DP, Lockwood CM, Hoffman JM, West SG, Sheets V. A comparison of methods to test mediation and other intervening variable effects. Psychological Methods. 2002;7:83–104. doi: 10.1037/1082-989x.7.1.83.
- Mortimer JA, Borenstein AR, Gosche KM, Snowdon DA. Very early detection of Alzheimer neuropathology and the role of brain reserve in modifying its clinical expression. Journal of Geriatric Psychiatry and Neurology. 2005;18:218–223. doi: 10.1177/0891988705281869.
- Petrovitch H, Ross GW, Steinhorn SC, Abbott RD, Markesbery W, Davis D, Nelson J, Hardman J, Masaki K, Vogt MR, Launer L, White LR. AD lesions and infarcts in demented and non-demented Japanese-American men. Annals of Neurology. 2005;57:98–103. doi: 10.1002/ana.20318.
- Preacher KJ, Hayes AF. SPSS and SAS procedures for estimating indirect effects in simple mediation models. Behavior Research Methods, Instruments, & Computers. 2004;36:717–731. doi: 10.3758/bf03206553.
- Sobel ME. Asymptotic confidence intervals for indirect effects in structural equation models. Sociological Methodology. 1982;13:290–312.
- Tang MX, Stern Y, Marder K, Bell K, Gurland B, Lantigua R, Andrews H, Feng L, Tycko B, Mayeux R. The APOE-epsilon4 allele and the risk of Alzheimer disease among African Americans, whites, and Hispanics. The Journal of the American Medical Association. 1998;279:751–755. doi: 10.1001/jama.279.10.751.
