Educational and Psychological Measurement
2015 Nov 30;76(6):889–911. doi: 10.1177/0013164415618992

A Note on Testing Mediated Effects in Structural Equation Models

Reconciling Past and Current Research on the Performance of the Test of Joint Significance

Matthew J. Valente, Oscar Gonzalez, Milica Miočević, David P. MacKinnon
PMCID: PMC5098906  NIHMSID: NIHMS776730  PMID: 27833175

Abstract

Methods to assess the significance of mediated effects in education and the social sciences are well studied and fall into two categories: single sample methods and computer-intensive methods. A popular single sample method to detect the significance of the mediated effect is the test of joint significance, and a popular computer-intensive method is the bias-corrected bootstrap. Both methods are used for testing the significance of mediated effects in structural equation models (SEMs). A recent study by Leth-Steensen and Gallitto (2016) provided evidence that the test of joint significance was more powerful than the bias-corrected bootstrap method for detecting mediated effects in SEMs, which is inconsistent with previous research on the topic. The goal of this article was to investigate this surprising result and describe two issues related to testing the significance of mediated effects in SEMs that explain the inconsistent results regarding the power of the test of joint significance and the bias-corrected bootstrap found by Leth-Steensen and Gallitto (2016). The first issue was that the bias-corrected bootstrap method was conducted incorrectly: the bias-corrected bootstrap was used to estimate the standard error of the mediated effect as opposed to creating confidence intervals. The second issue was that the correlation between the path coefficients of the mediated effect was ignored as an important aspect of testing the significance of the mediated effect in SEMs. The results of the replication study confirmed prior research on testing the significance of mediated effects: the bias-corrected bootstrap method was more powerful than the test of joint significance, and the bias-corrected bootstrap method had elevated Type 1 error rates in some cases. Additional methods for testing the significance of mediated effects in SEMs were considered, and limitations and future directions were discussed.

Keywords: mediation, structural equation model, SEM, test of joint significance, bootstrap methods

Introduction

Mediating variables are central to theories in education and the social sciences whereby a mediating variable is hypothesized to transmit the effect of an independent variable to a dependent variable. Statistical mediation analysis quantifies the influence of the independent variable on the dependent variable through the mediator by the mediated effect (MacKinnon, 2008). There are several methods used to test the statistical significance of mediated effects (Hayes & Scharkow, 2013; MacKinnon, Lockwood, Hoffmann, West, & Sheets, 2002; Shrout & Bolger, 2002), as well as several methods used to create confidence intervals for mediated effects (M. W. Cheung, 2007, 2009; G. W. Cheung & Lau, 2007; Lau & Cheung, 2012; MacKinnon, Lockwood, & Williams 2004; MacKinnon, Warsi, & Dwyer, 1995).

A recent simulation study by Leth-Steensen and Gallitto (2016) provided contradictory evidence regarding the test of joint significance and the bias-corrected (BC) bootstrap methods for testing the statistical significance of the mediated effect in simple structural equation models (SEMs). The results of that article are misleading for two reasons. First, the authors incorrectly applied the BC bootstrap method to test the significance of the mediated effect by using the BC bootstrap standard error instead of testing the significance by using BC bootstrap confidence intervals. The method the authors used is less powerful than using BC bootstrap confidence intervals, which is the norm when using bootstrap procedures to test the significance of mediated effects (MacKinnon, 2008). Second, the authors were apparently unaware that the test of joint significance assumes that the a and b paths are independent, when in fact the a and b paths in SEMs may be correlated; thus, an accurate test of the significance of the mediated effect needs to take this correlation into account (MacKinnon, 2008; Morgan-Lopez, Saavedra, & Cole, 2012). This article demonstrates the correct procedure for testing the mediated effect with the BC bootstrap and details the problem with ignoring the correlation between the a and b paths in SEMs.

The single mediator model is represented by three linear regression equations (MacKinnon & Dwyer, 1993). Equation (1) represents the total effect of X on Y (c coefficient), Equation (2) represents the effect of X on M (a coefficient), and Equation (3) represents the effect of X on Y adjusted for M (c′ coefficient) and the effect of M on Y adjusted for X (b coefficient). An interaction between X and M provides a test of whether the relation between the mediator and the dependent variable differs across levels of X but is not included in the equations below. The mediated effect can be conceptualized in two different ways in the single mediator model. The product of coefficients method consists of computing the product of a and b coefficients from Equations (2) and (3), respectively. The mediated effect can also be computed as the difference between the total effect of X on Y and the direct effect of X on Y adjusted for M (c-c′) from Equations (1) and (3), respectively, which is referred to as the difference in coefficients method (MacKinnon et al., 2002):

Y = i1 + cX + e1  (1)
M = i2 + aX + e2  (2)
Y = i3 + c′X + bM + e3  (3)

These two methods for computing the mediated effect are algebraically equivalent in the linear single mediator model with continuous variables, but the product of coefficients method is more widely used (MacKinnon & Dwyer, 1993; MacKinnon et al., 1995). One of the most accurate standard error formulas for the product of a and b is the second-order Taylor series variance, which is exact under independence of a and b, presented in Equation (4). The standard error of the product ab is a function of the path estimates (i.e., a and b) and their variances (i.e., sa² and sb²). Most methods for testing mediated effects involve testing either the significance of the product ab or the significance of a and b separately.

sab = √(sa²b² + sb²a² + sa²sb²)  (4)
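As a concrete illustration, the standard error in Equation (4) can be computed directly from the path estimates and their standard errors. The sketch below uses Python with hypothetical values of our own choosing; the function name is ours, not from any package.

```python
import math

def second_order_se(a, b, s_a, s_b):
    """Second-order standard error of the product a*b (Equation 4),
    which assumes the estimates of a and b are independent."""
    return math.sqrt(s_a**2 * b**2 + s_b**2 * a**2 + s_a**2 * s_b**2)

# Hypothetical path estimates and standard errors
a, s_a = 0.39, 0.10
b, s_b = 0.25, 0.08

se_ab = second_order_se(a, b, s_a, s_b)
z = (a * b) / se_ab  # normal-theory test statistic for the mediated effect
```

Dividing ab by this standard error and comparing with a standard normal distribution is the asymptotic normal theory test discussed below.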

This article will provide a brief review of some of the accurate methods for testing the statistical significance of the mediated effect and accurate confidence interval estimation of the mediated effect. Methods reviewed will be those commonly used in the literature, each making a series of assumptions that yield differences in their statistical power and confidence interval coverage. The methods presented fall into two categories: single sample approaches and computer-intensive approaches. The single sample methods presented include the distribution of the product (MacKinnon et al., 2002, 2004; MacKinnon, Fritz, Williams, & Lockwood, 2007), the test of joint significance (MacKinnon et al., 2002), and the asymptotic normal theory test (MacKinnon & Dwyer, 1993). The computer-intensive methods presented include the percentile bootstrap and BC bootstrap methods (Bollen & Stine, 1990; MacKinnon et al., 2004; Preacher & Hayes 2008; Shrout & Bolger, 2002), and the Monte Carlo method to test for mediation (MacKinnon et al., 2004; Tofighi & MacKinnon, 2011). Other methods in the statistical mediation literature not presented in this study include the randomization tests of the mediated effect (Taylor & MacKinnon, 2012), and Bayesian mediation analysis (Miočević & MacKinnon, 2014; Yuan & MacKinnon, 2009).

Best Methods for Detecting Mediated Effects in the Single Mediator Model

Simulation work has provided evidence regarding the best methods for testing the significance of the mediated effect and creating accurate confidence intervals for the mediated effect. In terms of power and confidence interval coverage, the BC bootstrap method has been demonstrated to have the most statistical power of the bootstrap methods (i.e., methods that rely on sampling with replacement to create empirical distributions of parameter estimates when the standard error of the parameter estimate is unknown), followed by the percentile bootstrap method. The best single sample, or analytical, method (i.e., methods that use an analytical formula for the standard error of a parameter estimate from a single sample of data) is the distribution of the product method, which creates asymmetric confidence intervals for the mediated effect (MacKinnon et al., 2004). Other studies have provided evidence that bootstrapping methods are superior to assuming the product of a and b is normally distributed (i.e., the asymptotic normal theory test; M. W. Cheung, 2007, 2009; G. W. Cheung & Lau, 2007; Fritz & MacKinnon, 2007; Hayes & Scharkow, 2013). There is evidence that the test of joint significance performs similarly to the distribution of the product and even outperforms it in some cases when the effect size of the mediated effect is small, but this advantage in power is slight and disappears as the effect size increases (MacKinnon et al., 2002). However, the BC bootstrap typically outperforms the test of joint significance in terms of statistical power but can result in Type 1 error rates above the nominal 0.05 level when either the a or the b path is small (e.g., 0.14), the other path is zero, and the sample size is between 500 and 1,000, or when either the a or the b path is moderate to large (e.g., 0.39 to 0.59), the other path is zero, and the sample size is less than 500 (M. W. Cheung, 2007; Fritz, Taylor, & MacKinnon, 2012; Hayes & Scharkow, 2013; MacKinnon et al., 2004). To date, only one simulation study has provided evidence that the test of joint significance outperformed the BC bootstrap for testing mediated effects in a simple SEM (Leth-Steensen & Gallitto, 2016).

Distributional Assumptions of Two-Path Mediated Effects

The results from Leth-Steensen and Gallitto (2016) contradict other results regarding the power of the test of joint significance and the BC bootstrap because of how the BC bootstrap test was carried out. To assess the power of the BC bootstrap method for detecting the significance of mediated effects in their proposed SEMs, Leth-Steensen and Gallitto (2016) appear to have tested the significance by dividing the estimates of ab by the BC bootstrap standard error and then comparing this test statistic with a standard normal distribution. Evidence of this can be found on page 6, where the authors state that

BC bootstrapping, based on 1,000 resamples, was used when fitting the models to determine the standard errors for each of the direct and indirect path parameters (from which a determination of whether they significantly differed from 0 could be made). (Leth-Steensen & Gallitto, 2016)

This practice is problematic because the sampling distribution of the mediated effect is not likely to be normally distributed except when the sample size and the sizes of a and b become large (Bollen & Stine, 1990; Lockwood & MacKinnon, 1998; MacKinnon, 2008; MacKinnon et al., 1995; MacKinnon et al., 2002; MacKinnon, Lockwood, & Hoffman, 1998), which was not the case in the study conducted by Leth-Steensen and Gallitto (2016) because of the limited sample sizes tested.

Kisbu-Sakarya, MacKinnon, and Miočević (2014) demonstrated that confidence intervals for the product (ab) that rely on normal theory have poor coverage and imbalance because of the skewness and kurtosis of the distribution of the mediated effect. As the absolute value of skewness increased, the coverage of the confidence intervals decreased and the imbalance increased. As kurtosis approached 0, the coverage approached the nominal value of 95%, but as kurtosis increased, so did the imbalance of the confidence interval. There was also a significant interaction between the skewness and kurtosis of the distribution of the product in the prediction of the coverage and imbalance of the confidence intervals. Tests of significance of the mediated effect (ab) that rely on the standard normal distribution are thus underpowered compared with tests that do not rely on the standard normal distribution (M. W. Cheung, 2007; G. W. Cheung & Lau, 2007; Fritz & MacKinnon, 2007; Hayes & Scharkow, 2013; MacKinnon, 2008; MacKinnon et al., 2002; MacKinnon et al., 2004).
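The skewness of the product is easy to verify by simulation. The sketch below, with hypothetical path values of our own choosing, draws from the sampling distributions of a and b and computes the skewness of their product:

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical sampling distributions of the a-hat and b-hat estimates
a_draws = rng.normal(0.25, 0.10, 200_000)
b_draws = rng.normal(0.25, 0.10, 200_000)
products = a_draws * b_draws

# Standardized third central moment (skewness) of the product distribution
skew = np.mean((products - products.mean())**3) / products.std()**3
```

The skewness is clearly nonzero here, so symmetric normal-theory intervals centered at ab will be imbalanced.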

To overcome the limitations of tests of significance of mediated effects that rely on the normal distribution, bootstrap methods can be used. The strength of bootstrap methods for testing the significance of mediated effects, or of other effects that are not normally distributed, lies in creating confidence intervals from the empirical sampling distribution of the mediated effect (Efron & Tibshirani, 1993). When confidence intervals are created (using the bootstrap estimates that correspond to the α/2 and 1 − α/2 percentiles, or BC percentiles, of the empirical distribution created by resampling), we do not have to make distributional assumptions regarding the sampling distribution of the mediated effect.
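The distinction matters in practice. The sketch below, under simplified assumptions (a single-mediator model with observed variables, ordinary least squares for each path, and hypothetical simulated data; all function names are ours), contrasts the percentile and BC bootstrap confidence intervals for ab. Neither relies on a normality assumption for the product.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

def ab_hat(x, m, y):
    """Estimate a (X -> M) and b (M -> Y adjusting for X) by OLS; return a*b."""
    a = np.polyfit(x, m, 1)[0]
    design = np.column_stack([np.ones_like(x), x, m])
    b = np.linalg.lstsq(design, y, rcond=None)[0][2]
    return a * b

def bootstrap_cis(x, m, y, n_boot=1000, alpha=0.05):
    """Percentile and bias-corrected bootstrap confidence intervals for ab."""
    n = len(x)
    est = ab_hat(x, m, y)
    boots = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, n)          # resample cases with replacement
        boots[i] = ab_hat(x[idx], m[idx], y[idx])
    perc = np.percentile(boots, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    # Bias correction: shift the percentiles by z0, the normal quantile of the
    # proportion of bootstrap estimates falling below the original estimate.
    z0 = stats.norm.ppf(np.mean(boots < est))
    z_lo, z_hi = stats.norm.ppf(alpha / 2), stats.norm.ppf(1 - alpha / 2)
    bc = np.percentile(boots, [100 * stats.norm.cdf(2 * z0 + z_lo),
                               100 * stats.norm.cdf(2 * z0 + z_hi)])
    return perc, bc

# Hypothetical data generated with true a = b = 0.5
n = 200
x = rng.normal(size=n)
m = 0.5 * x + rng.normal(size=n)
y = 0.5 * m + rng.normal(size=n)
perc_ci, bc_ci = bootstrap_cis(x, m, y)
```

The mediated effect is declared significant when the interval excludes zero; no test statistic is divided by a bootstrap standard error at any point.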

Testing the Mediated Effect in SEMs

SEM is a statistical technique used to simultaneously estimate parameters in a system of equations (Bollen, 2002). For example, in an SEM framework, the parameters in Equations (2) and (3) (e.g., a, b, and c′) would be estimated simultaneously. An additional advantage of SEMs is the ability to model relations between variables (e.g., M and Y) while adjusting for measurement error in these variables (Bollen, 2002). This adjustment for measurement error is accomplished by using multiple indicator variables to represent latent variables (i.e., unobserved variables). For example, extending a single mediator model with observed variables X, M, and Y to an SEM, we could have three indicators of each variable to represent X, M, and Y as latent variables free of measurement error. When mediated effects are estimated in SEMs, the a path coefficient estimate may be correlated with the b path coefficient estimate (MacKinnon, 2008). If the path coefficients of the mediated effect are correlated, then the standard error formula for the product ab presented in Equation (4) will be incorrect (MacKinnon & Cox, 2012), and consequently, some methods to test the mediated effect in these SEMs may be incorrect. When the a and b path coefficients are correlated, the correct standard error formula for their product takes this correlation into account (Craig, 1936), as in Equation (5), and so does the point estimate, as in Equation (6).

sab = √(a²sb² + b²sa² + 2abrsasb + sa²sb² + (rsasb)²)  (5)
Mediated effect = ab + rsasb  (6)

Testing the Mediated Effect When the a and b Paths Are Correlated

The standard error formula presented in Equation (5) has two additional terms, compared with the formula presented in Equation (4), to account for the correlation between the a and b paths: the 2abrsasb term and the (rsasb)² term. The r in each of the new terms is the correlation between the a and b paths, and rsasb is equivalent to the covariance between the a and b paths. The point estimate of the mediated effect is also affected by the correlation between the a and b paths: it is now the product ab plus the covariance between the a and b paths (or the correlation, if the estimates are standardized).

Notice that depending on the sign (+ or −) of the mediated effect (ab) and the sign of the correlation between the a and b path coefficients (i.e., in the 2abrsasb term), the standard error of the product ab can become either larger or smaller than if we were to ignore the correlation altogether. When the mediated effect (ab) is nonzero and positive and the correlation is negative, the standard error will become smaller. If the mediated effect is nonzero and positive and the correlation is positive, the standard error will become larger. When the mediated effect is nonzero and negative and the correlation is negative, the standard error will become larger. If the mediated effect is negative and the correlation is positive, the standard error will become smaller. If the mediated effect is zero, the sign of the correlation will not affect the size of the standard error differentially. Methods for testing the statistical significance of the mediated effect that take this correlation into account, either directly or indirectly, would be more accurate than methods that do not take this correlation into account.
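These sign patterns can be checked numerically. The sketch below implements Equations (5) and (6) with hypothetical values of our own choosing; setting r = 0 recovers Equation (4).

```python
import math

def se_ab_correlated(a, b, s_a, s_b, r):
    """Standard error of the product a*b when the estimates of a and b
    correlate r (Equation 5); r = 0 reduces to Equation (4)."""
    cov = r * s_a * s_b
    return math.sqrt(a**2 * s_b**2 + b**2 * s_a**2 + 2 * a * b * cov
                     + s_a**2 * s_b**2 + cov**2)

def mediated_effect(a, b, s_a, s_b, r):
    """Point estimate of the mediated effect (Equation 6): ab + cov(a, b)."""
    return a * b + r * s_a * s_b

# Hypothetical values: positive a and b, negative correlation between paths
a, b, s_a, s_b = 0.25, 0.25, 0.10, 0.10
se_ignoring = se_ab_correlated(a, b, s_a, s_b, 0.0)
se_negative = se_ab_correlated(a, b, s_a, s_b, -0.08)
# With ab > 0 and r < 0, the corrected standard error is the smaller of the two
```

The dominant new term is 2abrsasb, which carries the sign of ab times r; the (rsasb)² term is always positive but typically negligible.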

Of the methods we have mentioned so far, the distribution of the product test using asymmetric confidence intervals (MacKinnon et al., 2007; Tofighi & MacKinnon, 2011) can adequately account for this correlation and provide an accurate test of significance and accurate confidence intervals of the mediated effect. In addition to the distribution of the product test, both percentile and BC bootstrapping procedures also provide accurate tests of significance and confidence intervals of the mediated effect. Bootstrapping procedures are accurate regardless of whether the a and b paths are correlated because we do not assume anything about the probability distribution that generated the parameter of interest (i.e., ab; Efron & Tibshirani, 1993). On the other hand, the test of joint significance does not take the possible correlation between the a and the b paths into account either analytically or indirectly (MacKinnon, 2008). By testing the statistical significance of the a and the b path separately, the test of joint significance assumes that the a and b paths are independent. For example, Equation (7) demonstrates the rationale of the test of joint significance. The probability of observing some value of a or greater times the probability of observing some value of b or greater is equal to the joint probability of a and b. This formula is only true when a and b are uncorrelated.

P(a)P(b) = P(ab)  (7)

If a and b are correlated, the joint probability formula becomes that in Equation (8). Currently, the test of joint significance does not take this conditional relation between the a and b paths into account because P(b) ≠ P(b|a). Other shortcomings of the test of joint significance are its inability to test the total mediated effect in multiple mediator models, as well as the fact that it produces only a binary decision regarding the mediated effect, such as "the mediated effect is significant" or "the mediated effect is not significant," unlike confidence intervals for the mediated effect, which also provide information regarding the precision of estimates (MacKinnon, 2008; MacKinnon, Cheong, & Pirlott, 2012; MacKinnon et al., 2004).

P(a)P(b|a) = P(ab)  (8)
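The decision rule of the test of joint significance is simple to state in code; a minimal sketch (the function name is ours):

```python
from scipy import stats

def joint_significance(a, se_a, b, se_b, alpha=0.05):
    """Test of joint significance: declare the mediated effect significant
    only if both the a path and the b path are individually significant."""
    z_crit = stats.norm.ppf(1 - alpha / 2)
    return abs(a / se_a) > z_crit and abs(b / se_b) > z_crit
```

Because each path is tested separately, nothing in this rule can reflect a nonzero correlation between the estimates of a and b.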

Purpose

Because of this inconsistency regarding the power of the test of joint significance versus the power of the BC bootstrap method, a replication of the Leth-Steensen and Gallitto (2016) simulation study was conducted. The goal of this replication was threefold. First, we sought to reproduce the results from the simulation to confirm the findings of the original study. Second, we computed the BC bootstrap confidence intervals for each of the mediated effects in the original study to demonstrate the accurate way to use the BC bootstrap procedure when testing the significance of the mediated effect. Third, we sought to evaluate other methods to test the significance of the mediated effect that take the correlation between the a and b paths into account (i.e., the percentile bootstrap, the distribution of the product, asymptotic normal theory confidence limits, and the Monte Carlo methods).

Method

For the replication of the Leth-Steensen and Gallitto (2016) simulation study, data sets were generated under the two different SEMs the authors investigated (see Figure 1 for SEM with paths labeled as a, b, or c′ paths). Each of the models consisted of four latent variables, each defined by five indicators with decreasing reliabilities. The latent variables were identified by constraining the first item to have a factor loading of 1 and the rest of the factor loadings were .9, .8, .7, and .6. In Model 1 (Figure 2), all the path coefficients among latent variables were specified to have a magnitude of .25, to represent a small effect size. In Model 2 (Figure 3), three of the path coefficients were doubled to .5, and one of the path coefficients (i.e., b2 in Figure 1) that led to the outcome was constrained to be zero. In both models, the residual variances for the latent variables were constrained to the value of 1. Consequently, Model 1 was used to study power and Model 2 was used to study power and Type 1 error rates.

Figure 1. Structural equation model from Leth-Steensen and Gallitto (2016) with paths labeled as a, b, or c′ paths.

Figure 2. Model 1 population values for structural equation model from Leth-Steensen and Gallitto (2016).

Figure 3. Model 2 population values for structural equation model from Leth-Steensen and Gallitto (2016).

The data generation and analysis were carried out in the Mplus 7.1 Monte Carlo routine (Muthén & Muthén, 1998-2013) with the exact input and output files provided in the original article, and with the seed provided by the authors of the original article. A second replication of the study used the run date as the seed. A current limitation of the Mplus internal Monte Carlo routine is that the bootstrap procedure is not available for each Monte Carlo generated data set. As a result, an external Monte Carlo study was performed, where individual data sets were generated, saved, and individually analyzed, with each having its own unique Mplus input and output file. Finally, the MplusAutomation package was used in the R statistical environment to process and extract parameter information from the Mplus output files (Hallquist & Wiley, 2013). Example syntax to automate the simulation and other resources are provided in the appendix. Given the complexity of how the results from the Mplus MODEL INDIRECT command are reported in the output file, the specific mediated effects were defined as new variables with the Mplus MODEL CONSTRAINT command as the products of individual paths that form the mediated effect. The reported mediated effect estimates and standard errors from the MODEL INDIRECT and MODEL CONSTRAINT commands are mathematically equivalent. The goal of using MODEL CONSTRAINT was to overcome the limitation of the original article of using automated software and not running the model replications by hand.

For each of the models, 500 data sets were generated and analyzed with sample sizes of 150 and 400 cases (only two conditions per model). Four specific mediated effects were tested, where three of them consisted of two paths and one consisted of three paths. The main goal was to compare the proportions of times the specific mediated effects were statistically significant across seven different statistical methods. Both of the statistical methods used in the original article (Leth-Steensen & Gallitto, 2016) are presented (i.e., the test of joint significance and the significance test using the BC bootstrap standard errors). Five additional methods that build confidence intervals around the mediated effect estimate are presented: the BC bootstrap, percentile bootstrap, distribution of the product, Monte Carlo method, and the asymptotic normal theory test.

Statistical Tests in the Original Article

For the test of joint significance, the proportion of times in which each of the components of the mediated effect (i.e., paths) were jointly significant was calculated. For the BC bootstrap, 1,000 bootstrap samples per generated data set were drawn and the sampling distribution was used to estimate a BC bootstrap standard error. The proportion of times the estimate of the mediated effect divided by the BC bootstrap standard error was statistically significant was recorded. A limitation of this method is that using the BC bootstrap standard error when creating a test statistic (i.e., Z = ab/BC bootstrap standard error of ab) assumes that the estimate of the mediated effect is normally distributed, which is rarely the case (MacKinnon, 2008; MacKinnon et al., 2007). As a result, alternative approaches to forming confidence intervals were investigated.

Proposed Statistical Tests

To overcome the limitation of the significance test using the BC bootstrap standard error, BC bootstrap confidence intervals were requested in Mplus. The proportion of times the confidence intervals for the estimate of the mediated effect did not contain zero was recorded. However, the BC bootstrap has inflated Type 1 error rates for certain parameter combinations, whereas the percentile bootstrap does not (MacKinnon et al., 2004). Consequently, percentile bootstrap confidence intervals were also requested in Mplus. The RMediation package in the R statistical environment was used to calculate confidence intervals using the distribution of the product, the Monte Carlo method, and the asymptotic normal theory test (Tofighi & MacKinnon, 2011).

For the distribution of the product method, confidence intervals are determined based on an analytical solution for the distribution of the product of the two coefficients (Meeker & Escobar, 1994). An advantage of the distribution of the product method is that the analytical solution takes into account that the component paths of the specific mediated effect can be correlated. For the Monte Carlo method, the sample estimates for the mediated effect in each generated data set were used as population values, and confidence limits were calculated in the sampling distribution of 1,000 Monte Carlo samples (MacKinnon et al., 2004). Finally, for the normal theory test, the analytical formula for the standard error of the product of two normal random variables (see Equation 5) was used to form the confidence interval for the mediated effect estimate. The proportion of times that zero was not included in the estimated interval was recorded for all described interval estimation methods. The distribution of the product and the asymptotic normal theory test can only be computed for mediated effects consisting of two paths, and Monte Carlo confidence limits for mediated effects consisting of more than two paths can only be computed in a beta version of an updated RMediation package (see Tofighi & MacKinnon, 2016). Therefore, the distribution of the product, Monte Carlo method, and asymptotic normal theory test are not presented for the mediated effect composed of three paths.
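A minimal sketch of the Monte Carlo method for a two-path mediated effect, assuming the path estimates and their standard errors are available (e.g., from SEM software output); function name and numeric values are ours. Drawing from a bivariate normal distribution lets this version incorporate the correlation r between the a and b estimates.

```python
import numpy as np

def monte_carlo_ci(a, se_a, b, se_b, r=0.0, n_draws=100_000,
                   alpha=0.05, seed=0):
    """Monte Carlo confidence limits for ab: draw a and b from their joint
    (bivariate normal) sampling distribution and take percentiles of the
    product of the draws."""
    rng = np.random.default_rng(seed)
    cov = r * se_a * se_b
    draws = rng.multivariate_normal([a, b],
                                    [[se_a**2, cov], [cov, se_b**2]],
                                    size=n_draws)
    products = draws[:, 0] * draws[:, 1]
    return np.percentile(products, [100 * alpha / 2, 100 * (1 - alpha / 2)])

# Hypothetical estimates with a negative correlation between the paths
lo, hi = monte_carlo_ci(0.25, 0.10, 0.25, 0.10, r=-0.08)
```

As with the bootstrap intervals, the mediated effect is declared significant when the interval excludes zero.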

Results

The first goal of the two replication studies of the simulations in the article by Leth-Steensen and Gallitto (2016) was to compare the power and Type 1 error rates of the test of joint significance and the BC bootstrap point estimates divided by the corresponding BC bootstrap standard errors, and to investigate the correlations between the a and b paths. Table 1 contains the average correlations between the a and b paths for the two-path mediated effects across both replications, Model 1 and Model 2, and both sample sizes. Because the average correlations were negative and the a and b paths were both positive, the test of joint significance will have less statistical power than a test of the mediated effect that takes these correlations into account. That is, when both the a and b paths are positive and the correlation between these paths is negative, the standard error of the mediated effect that takes this correlation into account will be smaller than the standard error that ignores it, resulting in more precise confidence intervals around the estimated mediated effects. Although the point estimate of the mediated effect will also be smaller when taking this negative correlation into account (see Equation [6]), the reduction in the standard error (Equation [5] relative to Equation [4]) is larger than the reduction in the point estimate (Equation [6] relative to the product ab that ignores the correlation).

Table 1.

Average Correlations Between a and b Paths for Two-Path Mediated Effects.

Model  Rep  N    a1b1    a2b2    a3b2
1      1    150  −0.081  −0.070  −0.067
1      1    400  −0.083  −0.071  −0.068
1      2    150  −0.082  −0.068  −0.069
1      2    400  −0.083  −0.068  −0.070
2      1    150  −0.086  −0.039  −0.027
2      1    400  −0.086  −0.040  −0.028
2      2    150  −0.083  −0.037  −0.029
2      2    400  −0.085  −0.036  −0.031

Note. Average correlations between a and b paths for both replications (Rep) across both models and sample size (N).

As expected, the two replications found the same pattern of results as the original study (i.e., the test of joint significance had more power and larger Type 1 error rates than the BC bootstrap test of significance). However, the discrepancies between the test of joint significance and the BC bootstrap in power and Type 1 error rates were larger in the replication studies than in the original study. In the replications, the test of joint significance had more power than in the original study, the BC bootstrap significance test had less power than in the original study, and the Type 1 error rates for both methods were larger than in the original study.

In addition to the methods investigated in the original study, we also investigated the power and Type 1 error rates for these models using the percentile bootstrap, distribution of the product, the Monte Carlo method, and asymptotic normal theory confidence limits. The BC bootstrap was the most powerful method, followed closely by the test of joint significance. The results indicated that the method of dividing the BC bootstrap point estimate by the BC bootstrap standard error was vastly underpowered compared with both the test of joint significance and the BC bootstrap confidence intervals, as well as compared with the distribution of the product, percentile bootstrap, and the Monte Carlo method.

The power values obtained for the BC bootstrap point estimates divided by the BC bootstrap standard error were most similar to the power of asymptotic normal theory confidence limits, but the former approach had lower power even relative to asymptotic normal theory confidence limits (Tables 2-5). The BC bootstrap confidence limits had higher Type 1 error rates than all other methods as previously mentioned (MacKinnon et al., 2002). However, for the parameter combinations in this simulation study, only one of the cells had Type 1 error rates greater than 0.075, the upper limit using Bradley’s (1978) robustness criterion. There was less discrepancy between power values of the different methods at N = 400 than at N = 150. Overall, the simulation study found that the accurate procedure of using BC bootstrap confidence limits had slightly more power than the test of joint significance for parameter combinations tested in this study, which is the opposite of what Leth-Steensen and Gallitto (2016) concluded.

Table 2.

Power Values for the Mediated Effect a1b1.

Method          TJS     Distr. Prod  Monte Carlo  Normal theory  Perc Boot  BC Boot int  BC Boot point
N = 150
Model 1
 Rep 1          0.318   0.288        0.286        0.110          0.232      0.356        0.068
 Rep 2          0.370   0.318        0.318        0.136          0.264      0.406        0.070
Model 2
 Rep 1          0.418   0.410        0.406        0.316          0.376      0.436        0.216
 Rep 2          0.474   0.462        0.464        0.364          0.426      0.506        0.240
N = 400
Model 1
 Rep 1          0.900   0.902        0.902        0.789          0.886      0.922        0.766
 Rep 2          0.898   0.896        0.900        0.806          0.888      0.918        0.762
Model 2
 Rep 1          0.888   0.888        0.888        0.870          0.864      0.876        0.840
 Rep 2          0.876   0.878        0.876        0.864          0.866      0.886        0.848

Note. Power values for a1b1 from two replications (Rep 1 and Rep 2) of the Leth-Steensen and Gallitto (2016) study with additional methods for computing the interval estimates of the mediated effect. The methods are TJS (test of joint significance), Distr. Prod (distribution of the product), Monte Carlo, asymptotic normal theory (confidence limits using analytical standard error), Perc Boot (percentile bootstrap), BC Boot int (interval estimates obtained using the bias-corrected bootstrap), and BC Boot point (significance of the mediated effect was obtained by dividing the point estimate from the bias-corrected bootstrap by the BC bootstrap standard error).

Table 3.

Power and Type 1 Error Rate Values for the Mediated Effect a2b2.

                  TJS     Distr. Prod  Monte Carlo  Normal theory  Perc Boot  BC Boot int  BC Boot point
N = 150
Model 1
  Rep 1           0.334   0.288        0.290        0.132          0.248      0.380        0.076
  Rep 2           0.332   0.300        0.298        0.114          0.256      0.364        0.078
Model 2
  Rep 1           0.052   0.050        0.050        0.014          0.030      0.064        0.004
  Rep 2           0.052   0.042        0.042        0.020          0.040      0.062        0.006
N = 400
Model 1
  Rep 1           0.888   0.890        0.888        0.792          0.854      0.912        0.730
  Rep 2           0.894   0.890        0.890        0.776          0.862      0.904        0.726
Model 2
  Rep 1           0.056   0.056        0.056        0.046          0.050      0.068        0.032
  Rep 2           0.052   0.052        0.052        0.036          0.050      0.062        0.028

Note. Power (Model 1 rows) and Type 1 error rate (Model 2 rows) values for a2b2 from two replications (Rep 1 and Rep 2) of the Leth-Steensen and Gallitto (2016) study with additional methods for computing the interval estimates of the mediated effect. The methods are TJS (test of joint significance), Distr. Prod (distribution of the product), Monte Carlo, asymptotic normal theory (confidence limits using analytical standard error), Perc Boot (percentile bootstrap), BC Boot int (interval estimates obtained using the bias-corrected bootstrap), and BC Boot point (significance of the mediated effect was obtained by dividing the point estimate from the bias-corrected bootstrap by the BC bootstrap standard error).

Table 4.

Power and Type 1 Error Rate Values for the Mediated Effect a3b2.

                  TJS     Distr. Prod  Monte Carlo  Normal theory  Perc Boot  BC Boot int  BC Boot point
N = 150
Model 1
  Rep 1           0.334   0.296        0.296        0.130          0.236      0.356        0.058
  Rep 2           0.380   0.346        0.342        0.138          0.276      0.386        0.072
Model 2
  Rep 1           0.052   0.048        0.048        0.012          0.030      0.062        0.006
  Rep 2           0.052   0.050        0.050        0.022          0.050      0.068        0.010
N = 400
Model 1
  Rep 1           0.918   0.918        0.920        0.826          0.892      0.934        0.768
  Rep 2           0.912   0.914        0.912        0.848          0.888      0.920        0.792
Model 2
  Rep 1           0.056   0.056        0.056        0.046          0.050      0.064        0.036
  Rep 2           0.052   0.052        0.052        0.042          0.050      0.058        0.036

Note. Power (Model 1 rows) and Type 1 error rate (Model 2 rows) values for a3b2 from two replications (Rep 1 and Rep 2) of the Leth-Steensen and Gallitto (2016) study with additional methods for computing the interval estimates of the mediated effect. The methods are TJS (test of joint significance), Distr. Prod (distribution of the product), Monte Carlo, asymptotic normal theory (confidence limits using analytical standard error), Perc Boot (percentile bootstrap), BC Boot int (interval estimates obtained using the bias-corrected bootstrap), and BC Boot point (significance of the mediated effect was obtained by dividing the point estimate from the bias-corrected bootstrap by the BC bootstrap standard error).

Table 5.

Power and Type 1 Error Rate Values for the Three-Path Mediated Effect a1a3b2.

                  TJS     Perc Boot  BC Boot int  BC Boot point
N = 150
Model 1
  Rep 1           0.208   0.102      0.278        0.006
  Rep 2           0.250   0.118      0.304        0.004
Model 2
  Rep 1           0.052   0.030      0.068        0.002
  Rep 2           0.052   0.046      0.078        0.004
N = 400
Model 1
  Rep 1           0.884   0.796      0.902        0.432
  Rep 2           0.874   0.810      0.886        0.456
Model 2
  Rep 1           0.056   0.050      0.066        0.026
  Rep 2           0.052   0.046      0.062        0.022

Note. Power (Model 1 rows) and Type 1 error rate (Model 2 rows) values for a1a3b2 from two replications (Rep 1 and Rep 2) of the Leth-Steensen and Gallitto (2016) study with additional methods for computing the interval estimates of the mediated effect. The methods are TJS (test of joint significance), Perc Boot (percentile bootstrap), BC Boot int (interval estimates obtained using the bias-corrected bootstrap), and BC Boot point (significance of the mediated effect was obtained by dividing the point estimate from the bias-corrected bootstrap by the BC bootstrap standard error). The interval estimators using the distribution of the product and asymptotic normal theory confidence limits are presently not available for three-path mediated effects.

Discussion

The first purpose of this article was to reproduce the results from the simulation of Leth-Steensen and Gallitto (2016) to confirm the findings of the original study. The two replication simulations we carried out generally reproduced the results of the original study and demonstrated that the correlations between the a and b paths in these simulations were negative on average, leading to underpowered tests of the mediated effect when these correlations are ignored. When the BC bootstrap standard error was used in a significance test of the mediated effect, the power to detect the effect was much lower than that of the test of joint significance. This result occurred because dividing the point estimate of the mediated effect by the bootstrap standard error assumes that the sampling distribution of the mediated effect is normal, which does not hold at small to moderate sample sizes with small to moderate effect sizes. Further evidence for the inappropriateness of the normality assumption is the similarity of the power values of the asymptotic normal theory test and the BC bootstrap significance test: even though the two tests use different standard errors, both assume that the sampling distribution of the mediated effect is normal, which yields lower power than methods that do not make this assumption.
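The nonnormality argument can be illustrated with a small simulation (a sketch with illustrative values, not the study's code): even when the two path estimates are themselves normally distributed, their product is noticeably skewed, so a z-test on ab is miscalibrated.

```python
import numpy as np

# Illustrative sketch: draws from the sampling distributions of the a and b
# path estimates (here independent normals; means and SEs are made up).
rng = np.random.default_rng(0)
a_hat = rng.normal(0.3, 0.1, size=200_000)   # draws of the a-path estimate
b_hat = rng.normal(0.3, 0.1, size=200_000)   # draws of the b-path estimate
ab = a_hat * b_hat                           # mediated-effect estimates

dev = ab - ab.mean()
skew = (dev**3).mean() / ab.std()**3         # empirical skewness of ab
print(skew)                                  # clearly positive: not normal
```

Analytically, the third central moment of the product of two independent normals is 6·μ_a·μ_b·σ_a²·σ_b², which is nonzero whenever both means are nonzero, so the skewness only vanishes asymptotically.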

The second purpose of this article was to compare the power and Type 1 error rates of the BC bootstrap interval estimation and BC bootstrap significance test approaches, and to demonstrate the correct way to use the BC bootstrap procedure when testing the significance of the mediated effect. To clarify, the BC bootstrap ought to be used to construct confidence intervals for the mediated effect; it should not be used to obtain a bootstrap estimate of the standard error for a significance test of the point estimate, ab. The replication studies found that the BC bootstrap confidence intervals were the most powerful test of the mediated effect, which is consistent with prior research (M. W. Cheung, 2007; Fritz & MacKinnon, 2007; Hayes & Scharkow, 2013; MacKinnon et al., 2004). Also consistent with prior research are the slightly elevated Type 1 error rates observed with the BC bootstrap confidence intervals. While consistent with prior research, these findings are informative because most prior research on the power and Type 1 error rates of different methods for detecting mediated effects focused on models that were not SEMs (Hayes & Scharkow, 2013; MacKinnon et al., 2004). Therefore, these findings build on prior research by extending the existing literature on power and Type 1 error rates of the mediated effect to SEMs (see also M. W. Cheung, 2007, and G. W. Cheung & Lau, 2008).
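The correct use of the BC bootstrap can be sketched as follows (illustrative data-generating values and a deliberately simplified mediation fit, not the authors' models): resample the data, compute ab in each bootstrap sample, and form the interval from bias-corrected percentiles of the bootstrap distribution rather than from a bootstrap standard error.

```python
import numpy as np
from statistics import NormalDist

# Illustrative single-mediator data: x -> m -> y with a = b = 0.5.
rng = np.random.default_rng(1)
n = 200
x = rng.normal(size=n)
m = 0.5 * x + rng.normal(size=n)            # a path = 0.5
y = 0.5 * m + rng.normal(size=n)            # b path = 0.5, no direct x -> y effect

def ab_estimate(x, m, y):
    # OLS slopes; because this simulated model has no direct effect of x on y,
    # regressing y on m alone is adequate for the sketch.
    a = np.cov(x, m)[0, 1] / np.var(x, ddof=1)
    b = np.cov(m, y)[0, 1] / np.var(m, ddof=1)
    return a * b

ab_hat = ab_estimate(x, m, y)

B = 2000
boot = np.empty(B)
for i in range(B):
    idx = rng.integers(0, n, n)             # resample cases with replacement
    boot[i] = ab_estimate(x[idx], m[idx], y[idx])

nd = NormalDist()
z0 = nd.inv_cdf((boot < ab_hat).mean())     # bias-correction constant
zc = nd.inv_cdf(0.975)
lo = np.quantile(boot, nd.cdf(2 * z0 - zc)) # BC-adjusted lower percentile
hi = np.quantile(boot, nd.cdf(2 * z0 + zc)) # BC-adjusted upper percentile
print(lo, hi)
```

With these settings the interval excludes zero, so the mediated effect would be judged significant; by contrast, dividing ab_hat by the standard deviation of the bootstrap estimates and comparing with 1.96 would impose a normal shape that the bootstrap distribution need not have.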

The third purpose of this article was to evaluate methods other than the test of joint significance and BC bootstrap for testing the significance of mediated effects in SEMs. These other methods included the percentile bootstrap, the distribution of the product method, the Monte Carlo method, and the asymptotic normal theory test. These methods were investigated because they each take into account the correlation between the a and b paths in SEMs either analytically (distribution of the product and normal theory methods) or indirectly (percentile bootstrap and Monte Carlo methods). The percentile bootstrap, distribution of the product, and Monte Carlo methods performed similarly to each other and had higher power than the asymptotic normal theory test. These methods also had more accurate Type 1 error rates than the BC bootstrap confidence interval method, which is consistent with prior research (Hayes & Scharkow, 2013; MacKinnon et al., 2004).
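The Monte Carlo method in particular is easy to sketch (all numeric inputs below are illustrative): draw the a and b estimates jointly from a bivariate normal distribution that includes their covariance, and take quantiles of the products.

```python
import numpy as np

# Illustrative inputs: path estimates, their SEs, and their correlation
# (in Mplus, the covariance between estimates is reported in TECH3 output).
a, b = 0.25, 0.25            # path estimates from a fitted model
se_a, se_b = 0.08, 0.08      # their standard errors
r = -0.20                    # correlation between the two estimates

cov = np.array([[se_a**2, r * se_a * se_b],
                [r * se_a * se_b, se_b**2]])
rng = np.random.default_rng(2)
draws = rng.multivariate_normal([a, b], cov, size=100_000)
products = draws[:, 0] * draws[:, 1]          # simulated a*b values
lo, hi = np.quantile(products, [0.025, 0.975])
print(lo, hi)                # asymmetric interval around a*b
```

Because the interval is read off the simulated distribution of the product, it is asymmetric and automatically reflects the sign and magnitude of the covariance between the estimates.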

Recommendations

Because the a and b paths estimated in SEMs are potentially correlated with one another, it is recommended that researchers test the significance of mediated effects and create confidence intervals for mediated effects in SEMs with one of the methods that accurately takes this potential correlation into account. Methods that accurately take the correlation into account and are more powerful than the test of joint significance for SEMs are the distribution of the product method with asymmetric confidence intervals, percentile or BC bootstrap confidence intervals, and Monte Carlo confidence intervals. Although the BC bootstrap confidence interval method has been demonstrated to be a powerful method for detecting mediated effects, it can produce Type 1 error rates higher than the nominal level of 0.05 (M. W. Cheung, 2007; Fritz et al., 2012; Hayes & Scharkow, 2013; MacKinnon et al., 2004). Therefore, care should be taken when using BC bootstrap confidence intervals to assess mediated effects when either the a or b path is small, the other path is zero, and the sample size is large (e.g., N > 500), or when either the a or b path is moderate to large, the other path is zero, and the sample size is moderate to small (e.g., N < 500; Fritz et al., 2012).

Limitations

This study was limited in that it did not directly manipulate the correlation between the a and b paths. When the a and b paths are correlated, both the point estimate of the mediated effect and its standard error are affected by this correlation. A correlation between these paths can occur in SEMs (MacKinnon, 2008), and correct methods for testing the significance of mediated effects need to take it into consideration either analytically or indirectly (e.g., bootstrap methods). Ideally, the size of the correlation between the a and b paths would have been a manipulated factor in the simulation study, to help determine the effects of the magnitude and sign of the correlation on the power and Type 1 error rates of the methods presented in this article. It is expected that methods that do not take this correlation into account (i.e., the test of joint significance) will generally be less accurate than methods that do.
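The role of the covariance is visible in the first-order (delta method) variance of the mediated effect, Var(ab) ≈ b²σ_a² + a²σ_b² + 2ab·Cov(â, b̂): with ab > 0, a negative covariance shrinks the true sampling variance, so any method that omits the covariance term uses a standard error that is too large and loses power. A sketch with illustrative values:

```python
import math

# Illustrative path estimates, standard errors, and a negative covariance
# between the estimates (the situation observed in these replications).
a, b = 0.25, 0.25
se_a, se_b = 0.08, 0.08
cov_ab = -0.2 * se_a * se_b          # Cov(a_hat, b_hat) < 0

var_ignoring = b**2 * se_a**2 + a**2 * se_b**2
var_with_cov = var_ignoring + 2 * a * b * cov_ab

se_ignoring = math.sqrt(var_ignoring)
se_with_cov = math.sqrt(var_with_cov)
print(se_with_cov, se_ignoring)      # ignoring a negative covariance inflates the SE
```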

Future Directions

One future direction is to derive a test of joint significance that takes the correlation between the a and b paths into account. One way to do this is to adjust the estimate of the b path based on its dependence on the a path, so that the test of significance of the b path is a test of a relation conditional on the a path rather than one that assumes the a and b paths are independent. If the test of joint significance can be adapted to handle this correlation, the method should be more accurate, and thus more powerful, for testing mediated effects in simple SEMs with two-path mediated effects, and it would be comparable to the methods that take this correlation into account either analytically or indirectly.

Another future direction is to investigate methods for creating confidence intervals to accompany the test of joint significance. A method that combines the information in the two paths into a confidence interval would provide more information about the mediated effect than the significance, or lack thereof, of its point estimate. A plausible approach is joint confidence intervals (Fox, 2015). Joint confidence intervals have an interpretation similar to that of confidence intervals for single regression coefficients but apply to all combinations of values of the parameters under investigation (e.g., a and b) that are accepted at the specified alpha (i.e., Type 1 error) level (Fox, 2015).
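As a rough sketch of how a joint confidence region could yield a mediation decision (our illustration, not a procedure given by Fox, 2015): for a 95% joint confidence ellipse (θ − θ̂)ᵀΣ⁻¹(θ − θ̂) ≤ χ²₂(.95) around θ̂ = (â, b̂), the minimum of the quadratic form over the line a = 0 is â²/SE(â)², and likewise for b = 0, so the ellipse excludes every point with ab = 0 exactly when both squared estimate-to-SE ratios exceed the chi-square critical value. All numeric inputs below are illustrative.

```python
import math

# Illustrative path estimates and standard errors.
a, se_a = 0.25, 0.08
b, se_b = 0.25, 0.08

# chi-square 0.95 quantile with 2 df; for 2 df the CDF is 1 - exp(-x/2),
# so the quantile is -2*ln(alpha) = 5.991.
crit = -2 * math.log(0.05)

# Mediation judged present only if the joint ellipse excludes both axes,
# i.e., both |estimate/SE| exceed sqrt(5.991) ~ 2.45 rather than 1.96.
reject = min((a / se_a)**2, (b / se_b)**2) > crit
print(reject)
```

Note that this region-based decision is more conservative than the usual test of joint significance (critical ratio about 2.45 instead of 1.96), which is the price of obtaining a simultaneous statement about both paths.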

This article replicated results from Leth-Steensen and Gallitto (2016) and described why the results from the original article were misleading. This article also extended the results from the original article by demonstrating accurate and powerful ways to detect the mediated effect in SEMs and provided an important contribution to testing mediated effects in SEMs, although additional research is needed in this area.

Appendix

Automating Monte Carlo Simulations in Mplus

The Monte Carlo procedure in Mplus (Muthén & Muthén, 1998-2013) provides extensive flexibility for studying statistical models, for example, by examining their robustness under different violations of assumptions and by comparing properties of statistical methods such as Type 1 error rates and statistical power. As of Mplus 7.1, however, the Monte Carlo procedure does not save some of the information needed to perform and compare some of the most accurate tests for mediated effects. For example, for each Monte Carlo data set, Mplus cannot save the correlations between parameter estimates reported in TECH3, which are needed for the distribution of the product method, and it cannot draw the bootstrap samples needed to calculate BC bootstrap and percentile bootstrap confidence intervals. Because of this limitation, Leth-Steensen and Gallitto (2016) note that it ". . . necessitated analyzing each individual set of simulated data separately after generating it (and collating those results by hand)" (p. 6), which in turn limited the number of conditions the authors could study.

Because such hand computations are error-prone and time-consuming, syntax to automate the Mplus Monte Carlo procedure is presented below. The R package MplusAutomation (Hallquist & Wiley, 2013) was used to streamline the three main steps of the Monte Carlo simulation: creating input files for Mplus, estimating the models, and extracting parameter information from the Mplus output files into R. The three MplusAutomation commands used in this simulation were createModels, runModels, and readModels. The package is featured and recommended as a resource on the Mplus website (http://www.statmodel.com/usingmplusviar.shtml), and the package author provides a vignette explaining its use (https://cran.r-project.org/web/packages/MplusAutomation/vignettes/Vignette.pdf).

Generally, input files for Mplus are created from a text template file written by the user, such as the one presented below. The text template file is divided into two sections: the "iterator" section and the Mplus syntax. The iterator section starts with "[[init]]" and ends with "[[/init]]" and defines the "iterators," the variables whose values change across syntax files. In this example, the iterator is "n," which takes on the values 150 and 400, the sample sizes of interest in the simulation study; more iterators can be specified. The output directory to which each file is saved, along with the Mplus input file name, is also indicated in the iterator section. The second section of the text template file is ordinary Mplus syntax, except that any value that should vary across input files, such as the sample size, is written as [[n]] rather than as a fixed number. The double brackets around the variable name tell R to treat it as an iterator and to create a separate input file for each of the values of n specified in the iterator section. In this example there is one iterator with two values, so two input files are created: one with sample size 150 and one with sample size 400. After saving the text template file, the user opens R, loads the MplusAutomation package, and runs the createModels command, with the text template file name as input, to create the two Mplus input files.

After the Mplus input files are created, they can be analyzed from R with the runModels command. Although this command accepts many options, at its core it scans a specified directory, identifies the Mplus input files, and calls Mplus to run them; the Mplus output files are saved in the same directory as the input files. After running all models of interest, the user calls the readModels command to extract the information in the Mplus output files and bring it into R as a list. Once the Mplus results are stored in the R environment, standard R commands can be used for the data manipulation and basic analyses common in Monte Carlo simulations, such as summaries, analysis of variance, or logistic regression.

The replication of the Leth-Steensen and Gallitto (2016) study was organized around four text template files: one for the Monte Carlo data-generation procedure, one for the BC bootstrap analysis of each of the 500 individual data sets per condition, one for the percentile bootstrap analysis of each individual data set, and one for an analysis without resampling used to extract the correlations between parameter estimates from Mplus TECH3. First, the Monte Carlo template file was used to create the two Monte Carlo input files with the createModels command in R, and the data were generated with the runModels command. Then, the three input files per individual Monte Carlo data set (500 × 3 input files) were created with createModels and analyzed with runModels. Finally, the information from the three analyses of each Monte Carlo data set was brought into R as lists for processing and summary.

Below, the MplusAutomation Monte Carlo template text file for Model 1 is presented. The template text file can be generalized for the analysis of individual data sets given the similar format.

[[init]]
iterators = n;
n = 150 400;
outputDirectory = "C:/Oscar_/TJS1/sample_size[[n]]";
filename = "MC-sample_size[[n]].inp";
[[/init]]

title: This is an example of a SEM with continuous factor indicators;
montecarlo:
  names = y1-y20;
  nobs = [[n]];
  nreps = 500;
  seed = 59245;
  REPSAVE = ALL;
  save = mc1A150rep*.dat;
model montecarlo:
  neigh by y6@1 y7*.9 y8*.8 y9*.7 y10*.6;
  depress by y11@1 y12*.9 y13*.8 y14*.7 y15*.6;
  parent by y16@1 y17*.9 y18*.8 y19*.7 y20*.6;
  ext by y1@1 y2*.9 y3*.8 y4*.7 y5*.6;
  y1-y20*1;
  neigh*1;
  depress-ext*1;
  depress on neigh*.25;
  parent on neigh*.25 depress*.25;
  ext on neigh*.25 depress*.25 parent*.25;
model:
  neigh by y6-y10;
  depress by y11-y15;
  parent by y16-y20;
  ext by y1-y5;
  depress on neigh;
  parent on neigh depress;
  ext on depress parent neigh;
MODEL INDIRECT:
  ext IND depress neigh;
  ext IND parent neigh;
  ext IND parent depress;
  ext IND parent depress neigh;
OUTPUT: ;

Footnotes

Authors’ Note: Any opinion, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.

Declaration of Conflicting Interests: The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding: The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported in part by the National Institute on Drug Abuse Grant No. R01 DA09757. This material is based on work supported by the National Science Foundation Graduate Research Fellowship under Grant No. DGE-1311230.

References

  1. Bollen K. A. (2002). Latent variables in psychology and the social sciences. Annual Review of Psychology, 53, 605-634. doi: 10.1146/annurev.psych.53.100901.135239
  2. Bollen K. A., Stine R. (1990). Direct and indirect effects: Classical and bootstrap estimates of variability. Sociological Methodology, 20(1), 115-140.
  3. Bradley J. V. (1978). Robustness? British Journal of Mathematical and Statistical Psychology, 31, 144-152.
  4. Cheung G. W., Lau R. S. (2008). Testing mediation and suppression effects of latent variables: Bootstrapping with structural equation models. Organizational Research Methods, 11, 296-325. doi: 10.1177/1094428107300343
  5. Cheung M. W. (2007). Comparison of approaches to constructing confidence intervals for mediating effects using structural equation models. Structural Equation Modeling, 14, 227-246. doi: 10.1080/10705510709336745
  6. Cheung M. W. (2009). Comparison of methods for constructing confidence intervals of standardized indirect effects. Behavior Research Methods, 41, 425-438. doi: 10.3758/BRM.41.2.425
  7. Craig C. C. (1936). On the frequency function of xy. Annals of Mathematical Statistics, 7, 1-15.
  8. Efron B., Tibshirani R. J. (1993). An introduction to the bootstrap. Boca Raton, FL: CRC Press.
  9. Fox J. (2015). Applied regression analysis and generalized linear models. Thousand Oaks, CA: Sage.
  10. Fritz M. S., MacKinnon D. P. (2007). Required sample size to detect the mediated effect. Psychological Science, 18, 233-239. doi: 10.1111/j.1467-9280.2007.01882.x
  11. Fritz M. S., Taylor A. B., MacKinnon D. P. (2012). Explanation of two anomalous results in statistical mediation analysis. Multivariate Behavioral Research, 47, 61-87.
  12. Hallquist M., Wiley J. (2013). MplusAutomation: Automating Mplus model estimation and interpretation [Computer software]. Retrieved from https://cran.r-project.org/web/packages/MplusAutomation/index.html
  13. Hayes A. F., Scharkow M. (2013). The relative trustworthiness of inferential tests of the indirect effect in statistical mediation analysis: Does method really matter? Psychological Science, 24, 1918-1927. doi: 10.1177/0956797613480187
  14. Kisbu-Sakarya Y., MacKinnon D. P., Miočević M. (2014). The distribution of the product explains normal theory mediation confidence interval estimation. Multivariate Behavioral Research, 49, 261-268. doi: 10.1080/00273171.2014.903162
  15. Lau R. S., Cheung G. W. (2012). Estimating and comparing specific mediation effects in complex latent variable models. Organizational Research Methods, 15, 3-16. doi: 10.1177/1094428110391673
  16. Leth-Steensen C., Gallitto E. (2016). Testing mediation in structural equation modeling: The effectiveness of the test of joint significance. Educational and Psychological Measurement, 76, 339-351.
  17. Lockwood C. M., MacKinnon D. P. (1998, March). Bootstrapping the standard error of the mediated effect. In Goostre S. (Conference Chair), Proceedings of the 23rd annual meeting of SAS Users Group International (pp. 997-1002). Cary, NC: SAS Institute.
  18. MacKinnon D. P. (2008). Introduction to statistical mediation analysis. New York, NY: Taylor & Francis.
  19. MacKinnon D. P., Cheong J., Pirlott A. G. (2012). Statistical mediation analysis. In Cooper H., Camic P. M., Long D. L., Panter A. T., Rindskopf D., Sher K. J. (Eds.), APA handbook of research methods in psychology, Vol. 2: Research designs: Quantitative, qualitative, neuropsychological, and biological (pp. 313-331). Washington, DC: American Psychological Association.
  20. MacKinnon D. P., Cox M. G. (2012). Commentary on "Mediation analysis and categorical variables: The final frontier" by Dawn Iacobucci. Journal of Consumer Psychology, 22, 600-602. doi: 10.1016/j.jcps.2012.03.009
  21. MacKinnon D. P., Dwyer J. H. (1993). Estimating mediated effects in prevention studies. Evaluation Review, 17, 144-158.
  22. MacKinnon D. P., Fritz M. S., Williams J., Lockwood C. M. (2007). Distribution of the product confidence limits for the indirect effect: Program PRODCLIN. Behavior Research Methods, 39, 384-389. doi: 10.3758/BF03193007
  23. MacKinnon D. P., Lockwood C. M., Hoffman J. (1998, June). A new method to test for mediation. Paper presented at the annual meeting of the Society for Prevention Research, Park City, UT.
  24. MacKinnon D. P., Lockwood C. M., Hoffman J. M., West S. G., Sheets V. (2002). A comparison of methods to test mediation and other intervening variable effects. Psychological Methods, 7, 83-104. doi: 10.1037/1082-989X.7.1.83
  25. MacKinnon D. P., Lockwood C. M., Williams J. (2004). Confidence limits for the indirect effect: Distribution of the product and resampling methods. Multivariate Behavioral Research, 39, 99-128. doi: 10.1207/s15327906mbr3901_4
  26. MacKinnon D. P., Warsi G., Dwyer J. H. (1995). A simulation study of mediated effect measures. Multivariate Behavioral Research, 30, 41-62.
  27. Meeker W. Q., Escobar L. A. (1994). An algorithm to compute the CDF of the product of two normal random variables. Communications in Statistics-Simulation and Computation, 23, 271-280.
  28. Miočević M., MacKinnon D. P. (2014, March 23-26). SAS® for Bayesian mediation analysis. In Proceedings of the SAS Global Forum 2014 Conference. Cary, NC: SAS Institute.
  29. Morgan-Lopez A. A., Saavedra L. M., Cole V. T. (2012, May). A cautionary note on non-zero covariance between a and b in continuous latent variable mediation. Paper presented at the 20th Annual Meeting of the Society for Prevention Research, Washington, DC.
  30. Muthén L. K., Muthén B. O. (1998-2013). Mplus user's guide (6th ed.). Los Angeles, CA: Muthén & Muthén.
  31. Preacher K. J., Hayes A. F. (2008). Asymptotic and resampling strategies for assessing and comparing indirect effects in multiple mediator models. Behavior Research Methods, 40, 879-891. doi: 10.3758/BRM.40.3.879
  32. Shrout P. E., Bolger N. (2002). Mediation in experimental and nonexperimental studies: New procedures and recommendations. Psychological Methods, 7, 422-445. doi: 10.1037/1082-989X.7.4.422
  33. Taylor A. B., MacKinnon D. P. (2012). Four applications of permutation methods to testing a single-mediator model. Behavior Research Methods, 44, 806-844. doi: 10.3758/s13428-011-0181-x
  34. Tofighi D., MacKinnon D. P. (2011). RMediation: An R package for mediation analysis confidence intervals. Behavior Research Methods, 43, 692-700. doi: 10.3758/s13428-011-0076-x
  35. Tofighi D., MacKinnon D. P. (2016). Monte Carlo confidence intervals for complex functions of indirect effects. Structural Equation Modeling, 23, 194-205.
  36. Yuan Y., MacKinnon D. P. (2009). Bayesian mediation analysis. Psychological Methods, 14, 301-322. doi: 10.1037/a0016972
