Abstract
R2 effect-size measures are presented to assess variance accounted for in mediation models. The measures offer a means to evaluate both component paths and the overall mediated effect in mediation models. Statistical simulation results indicate acceptable bias across varying parameter and sample-size combinations. The measures are applied to a real-world example using data from a team-based health promotion program to improve the nutrition and exercise habits of firefighters. SAS and SPSS computer code are also provided for researchers to compute the measures in their own data.
The purpose of the present study is to present and evaluate R2 effect-size measures for mediation analysis. Although both effect size and mediation have gained attention in recent years, little research has focused on effect-size measures for mediation models. Such measures may be especially useful for mediation analysis in program evaluation research, where identifying the practical utility of hypothesized mediators can guide more efficient, cost-effective program implementations. Before mediation effect-size measures can be routinely implemented in research, however, further study is needed to assess statistical performance of the measures.
Effect-Size Measures
Effect-size measures have become increasingly important as researchers shift focus from the statistical significance of research findings to the practical significance of their work. Unlike statistical significance testing, which considers whether an effect is absent or present, effect-size measures consider the magnitude of an effect, providing information on the extent to which a null hypothesis is false (Cohen, 1988). Much of the framework for effect size is based on Cohen’s research in regression analysis that evaluates the ratio of variance explained to the total variance in a model, independent of sample size. This ratio may be defined in terms of mean differences, simple bivariate correlations, and both multiple and partial correlations. Cohen defined small effects as those that are not readily perceptible to the naked eye, and large effects as those that, although visible to an attentive observer, are not so blatantly obvious that the phenomenon is not worth studying. All else being equal, smaller effects require larger sample sizes than larger effects do in order to be detected by a significance test.
Mediation Analysis
Cohen’s (1988) measures of effect size for regression and ANOVA provide information on the practical significance of an effect. It may also be of interest for one to examine the process by which an effect occurs. Mediation is a third variable effect that informs the relation between two variables by explaining how or why the two variables are related. Mediation analysis implies a causal process that connects the variables by modeling how an intervening, or mediator, variable, M, transmits the influence of an independent variable, X, onto an outcome, Y (see Figure 1).
The utility of mediation analysis is highlighted in prevention and treatment research, where modeling and testing hypothesized mediators can explain how a prevention or treatment program achieved its effects. Identifying mediator variables that can be predicted by the program, and which can predict program outcomes themselves, reveals the mechanisms that underlie program effects. Understanding how programs achieve their effects elucidates the process of behavior change, and this information can be used to refine later implementations of a program curriculum (Judd & Kenny, 1981; MacKinnon & Dwyer, 1993; West, Aiken, & Todd, 1993).
Estimates of the mediated effect are obtained from coefficients in the following two regression equations (Baron & Kenny, 1986; MacKinnon & Dwyer, 1993):
$$Y = i_1 + \tau' X + \beta M + e_1 \tag{1}$$
$$M = i_2 + \alpha X + e_2 \tag{2}$$
where Y is the dependent or outcome variable, X is the independent variable, M is the mediating variable, τ′ is the direct effect of the independent variable on the outcome controlling for the effect of the mediating variable, β is the path relating the mediator to the outcome controlling for the effect of the independent variable, α is the path relating the independent variable to the mediating variable, i1 and i2 are the intercepts in each equation, and e1 and e2 are the corresponding residuals in each equation. Regression equations for the single mediator model can be estimated in any conventional statistical software package, such as SAS or SPSS. The total effect of the independent variable on the dependent variable, τ, is the sum of the direct and indirect effects in the model and can be easily computed using parameter estimates from the output: τ = τ′ + αβ. The mediation regression equations can also be simultaneously estimated as a path model in structural equation modeling software packages, such as Mplus, in which estimates of overall fit for the model are available (L. K. Muthén & B. O. Muthén, 2007).
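As a minimal sketch of how Equations 1 and 2 might be estimated outside SAS or SPSS, the following Python code fits both regressions by ordinary least squares with NumPy and checks that the estimated total effect equals τ′ + αβ. The function name `fit_single_mediator`, the simulated data, and the chosen path values are illustrative assumptions and are not taken from the original analysis.

```python
import numpy as np

def fit_single_mediator(x, m, y):
    """OLS estimates for the single-mediator model (hypothetical helper)."""
    ones = np.ones_like(x)
    # Equation 1: Y = i1 + tau' X + beta M + e1
    _, tau_prime, beta = np.linalg.lstsq(
        np.column_stack([ones, x, m]), y, rcond=None)[0]
    # Equation 2: M = i2 + alpha X + e2
    _, alpha = np.linalg.lstsq(
        np.column_stack([ones, x]), m, rcond=None)[0]
    # Total effect: regression of Y on X alone
    _, tau = np.linalg.lstsq(
        np.column_stack([ones, x]), y, rcond=None)[0]
    return {"alpha": alpha, "beta": beta, "tau_prime": tau_prime,
            "tau": tau, "mediated_effect": alpha * beta}

# Illustrative data only: path values chosen arbitrarily for the sketch.
rng = np.random.default_rng(0)
x = rng.standard_normal(500)
m = 0.39 * x + rng.standard_normal(500)
y = 0.14 * x + 0.39 * m + rng.standard_normal(500)

est = fit_single_mediator(x, m, y)
# With OLS estimates from the same sample, tau = tau' + alpha * beta.
gap = est["tau"] - (est["tau_prime"] + est["mediated_effect"])
print(est, gap)
```

The three regressions share one sample, which is why the decomposition of the total effect holds up to floating-point error rather than only approximately.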
The assumptions underlying the statistical mediation model include those associated with OLS regression analysis (see Cohen, Cohen, West, & Aiken, 2003), temporal precedence of the treatment variable relative to the mediator and outcome variables, and no moderation of the mediation relation (MacKinnon, Lockwood, & Williams, 2004).
Although there are alternative ways to assess mediation (see, e.g., Collins, Graham, & Flaherty, 1998; James, Mulaik, & Brett, 2006; Kraemer, Kiernan, Essex, & Kupfer, 2004), the focus of the present article is the mediation relation investigated by computing the product of coefficients α and β (MacKinnon & Dwyer, 1993). The product of coefficients test of mediation was chosen for focus in the present article, since it has been shown to perform well in a variety of circumstances and is generalizable to more complicated mediation models (MacKinnon, Lockwood, Hoffman, West, & Sheets, 2002). Readers are directed to MacKinnon et al. (2002) for a detailed account of other statistical tests for mediation in which the performance of each was compared in a simulation study. Those interested in further information on experimental approaches to testing mediation are directed to Spencer, Zanna, and Fong (2005) and MacKinnon (2008), who described alternative approaches to testing mediation hypotheses that focus on research design.
Effect Size in Mediation Analysis
Several different effect-size measures for mediation may be calculated from the two regression equations presented in Equations 1 and 2, and some of these effect-size measures have been applied in the mediation literature. Effect-size measures in mediation analysis concentrate on comparing the magnitudes of different effects in the model (the indirect effect, the direct effect, and the total effect) in order to assess the relative contribution of each (Sobel, 1982). One frequently used effect-size measure for mediation is the proportion mediated. This measure indicates what proportion of the total effect is mediated by the intervening variable, and it has been cited in substantive research (e.g., Chassin, Pitts, DeLucia, & Todd, 1999; Ouimette, Finney, & Moos, 1999; Wolchik et al., 1993). The proportion mediated also provides a means to assess the relative contribution of single mediators in multiple mediator models by indicating what proportion of the total effect is attributable to individual mediational pathways. The measure is unstable in several parameter combinations, however, and has excess bias in small sample sizes (MacKinnon, Fairchild, Yoon, & Ryu, 2007; Taborga, 2000). Specifically, the proportion-mediated measure performs well only with samples larger than 500. This large sample-size requirement may limit the utility of the measure, given the prevalence of research with smaller sample sizes.
Other effect-size measures for mediation, such as the partial r2 and standardized regression coefficients, have been applied from multiple regression analysis and cited in substantive research (Taborga, 2000). These measures are qualitatively different from other mediation effect-size measures, such as the proportion mediated, in that they focus on the relation between two variables in the mediation model. The partial r2 provides information on the amount of variance in a criterion variable that can be uniquely explained by an independent variable once other variables in the model have been accounted for. In the mediation model, there are two possible partial r2 measures corresponding to variance explained in the β and τ′ paths of the model, respectively: (1) $r^2_{YM.X}$, or the variance in Y that is explained by M but not X; and (2) $r^2_{YX.M}$, or the variance in Y that is explained by X but not M. The squared correlation between X and M, $r^2_{MX}$, is the variance in M that is explained by X. This measure is not a partial correlation, since M is predicted by a single independent variable.
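The component measures described above can be sketched from zero-order correlations using the standard first-order partial correlation formula. The helper name `squared_partial_r` and the three correlation values below are hypothetical and serve only to illustrate the computation.

```python
import math

def squared_partial_r(r_ab, r_ac, r_bc):
    """Squared partial correlation of A and B, controlling for C."""
    numerator = r_ab - r_ac * r_bc
    denominator = math.sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))
    return (numerator / denominator) ** 2

# Hypothetical zero-order correlations among X, M, and Y.
r_xm, r_xy, r_my = 0.30, 0.25, 0.50

r2_mx = r_xm ** 2                               # variance in M explained by X
r2_ym_x = squared_partial_r(r_my, r_xy, r_xm)   # Y with M, controlling for X
r2_yx_m = squared_partial_r(r_xy, r_my, r_xm)   # Y with X, controlling for M
print(r2_mx, r2_ym_x, r2_yx_m)
```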
There are three possible standardized regression coefficients in the mediation model, each corresponding to one of the three unstandardized paths in Figure 1. Specifically, βstandardized represents the change in Y for every 1 standard deviation change in M, αstandardized represents the change in M for every 1 standard deviation change in X, and τ′standardized represents the change in Y for every 1 standard deviation change in X. The standardized regression coefficients provide information on the strength of individual paths in the mediation model in a standardized metric. When testing mediation in a structural equation modeling or path-analysis framework, the standardized regression coefficients are referred to as standardized structure coefficients, but the interpretation of the weights is the same. In both frameworks, the magnitude of the weights is relative to the variables involved in their computation. Both the standardized regression (or structure) coefficients and r2 measures provide information only on component parts of the mediation model. Neither of these measures is able to provide information on the mediated effect as a whole.
Given the weaknesses of the proportion-mediated effect-size measure and the limited application of those component effect-size measures from regression analysis, alternative measures of effect size in mediation analysis are still needed. Overall R2 measures that quantify variance explained in an outcome provide a useful tool for this goal. R2 measures have been proposed in commonality analysis, also known as elements analysis (or components analysis). Originally presented by Newton and Spurrell (1967) and later refined by others (e.g., Mood, 1969, 1971; Seibold & McPhee, 1979), commonality analysis partitions variance explained in a criterion variable into unique and nonunique parts using multiple squared partial correlation estimates from the model. In addition to assessing the unique contribution of each predictor with squared partial correlations and the total variance explained in a model with the overall R2 from regression analysis, commonality analysis also provides estimates of common effects (Seibold & McPhee, 1979). These quantities provide a promising R2 measure of effect size for mediation analysis, since they quantify the proportion of variance in an outcome variable that is common to a set of predictors but not to a predictor alone. Mood (1969) provided a general equation for determining the nth-order common effect for a set of n predictors, where the −1 that results from expanding any given product is dropped from computation:
$$-(1 - x_1)(1 - x_2)\cdots(1 - x_n) \tag{3}$$
All resulting terms following expansion of the product in Equation 3 become R2 quantities to represent the contribution of the variable(s) to variance explained in the outcome. We can apply this general framework to the mediation model to find the variance in Y that is common to both X and M but that can be attributed to neither alone. Applying Mood’s (1969) formula to the first mediation regression equation (Equation 1), where Y is predicted from X and M, we find that the variance common to both X and M in predicting Y is
$$-(1 - X)(1 - M) \tag{4}$$
Expanding Equation 4, rearranging terms, and introducing R2 values yields
$$R^2_{\text{med}} = r^2_{MY} - \left(R^2_{Y,MX} - r^2_{XY}\right) \tag{5}$$
where $r^2_{MY}$ is the portion of the variance in Y explained by M (i.e., the squared raw correlation between the dependent variable and the mediator), $r^2_{XY}$ is the portion of the variance in Y explained by X (i.e., the squared raw correlation between the dependent variable and the independent variable), and $R^2_{Y,MX}$ is the overall model R2 from Equation 1. Equation 5 shows that the second-order common effect for the mediation model partitions the observed variance in Y into several parts to isolate that part of the system that is uniquely attributable to the mediated effect. Specifically, the measure subtracts those portions of observed variance in Y that are explained uniquely by M or uniquely by X from the overall observed variance in Y, leaving the portion of variance that is explained by the predictors together. By illustrating the extent to which the combination of X and M together explain variance in Y, Equation 5 is able to combine component parts of the mediational chain into a single effect-size measure.
The meaning of this overlap or redundancy in prediction in the mediation model is a bit different from what it would mean in a single multiple regression equation, since the mediation model is defined by a second equation in which M is predicted by X (Equation 2). Because M takes on the unique quality of being both a predictor and an outcome in the mediation model, the second-order common effect is able to provide information about the magnitude of the mediated effect, or the extent to which X predicts variance in M, which subsequently predicts variance in Y. Thus, the second-order common effect for the mediation model, which we will call the $R^2_{\text{med}}$ measure, informs the researcher about the practical significance of the overall mediation relation.
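The step from Mood’s product for two predictors to the mediation measure can be written out as follows. This is a sketch of the expansion described in the text, under the reading that, after expansion, X is replaced by the squared correlation of X with Y, M by the squared correlation of M with Y, and the product XM by the overall R2 from the regression of Y on both predictors.

```latex
\begin{align*}
-(1 - X)(1 - M) &= -(1 - X - M + XM) \\
                &= -1 + X + M - XM \\
                &\rightarrow X + M - XM
                   && \text{(drop the } -1\text{)} \\
                &\rightarrow r^2_{XY} + r^2_{MY} - R^2_{Y,MX}
                   && \text{(introduce } R^2 \text{ quantities)} \\
                &= r^2_{MY} - \left(R^2_{Y,MX} - r^2_{XY}\right)
\end{align*}
```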
Decomposition of the component parts in the $R^2_{\text{med}}$ measure illustrates how the effect-size measure estimates the portion of variance uniquely associated with the mediated effect in a single mediator model, represented in Figure 2 by the region of Y that X and M share and denoted here $r^2_{\text{med}}$. Recasting the squared raw correlations and the overall R2 from Equation 5 in terms of pieces from the Venn diagram presented in Figure 2 helps illustrate this point. Specifically,
$$r^2_{MY} = r^2_{\text{med}} + r^2_{YM.X} \tag{6}$$
$$r^2_{XY} = r^2_{\text{med}} + r^2_{YX.M} \tag{7}$$
$$R^2_{Y,MX} = r^2_{\text{med}} + r^2_{YM.X} + r^2_{YX.M} \tag{8}$$
where all terms from Equation 5 retain their original meaning, $r^2_{\text{med}}$ is the portion of variance in Y explained by the mediated effect, $r^2_{YM.X}$ is the squared partial correlation of Y and M partialed for X, and $r^2_{YX.M}$ is the squared partial correlation of Y and X partialed for the influence of the mediator. Substituting terms from Equations 6–8 into Equation 5 further demonstrates how the measure isolates the portion of variance in Y explained by the mediated effect:
$$R^2_{\text{med}} = \left(r^2_{\text{med}} + r^2_{YM.X}\right) - \left[\left(r^2_{\text{med}} + r^2_{YM.X} + r^2_{YX.M}\right) - \left(r^2_{\text{med}} + r^2_{YX.M}\right)\right] = r^2_{\text{med}} \tag{9}$$
By accounting for variance in the dependent variable explained by the independent variable and the mediator variable together, the R2 effect-size measure for mediation estimates that shared region of Figure 2. Note that because the three variables in the diagram are standardized, they all have a variance of 1. Note also that because the measure does not square the resulting quantity from the difference computed in the equation, it is possible to have negative values of the estimate. This quality of second-order common effects is what sets them apart from primary effects in multiple regression models; they are not sums of squares and therefore may take on negative values under some circumstances (Newton & Spurrell, 1967). Seibold and McPhee (1979) noted that negative values of the estimates indicate that suppression effects may be present; consideration of a particular pair or group of variables in predicting variance in an outcome may lead to the reduction in prediction from either variable alone.
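The possibility of negative values can be illustrated with a short sketch that computes the measure directly from three zero-order correlations, using the standard formula for R2 in a regression with two predictors. The function name `r2_med_from_correlations` and the correlation values are hypothetical; they were chosen so that M predicts Y only when X is ignored as a predictor of Y, a classic suppression pattern.

```python
def r2_med_from_correlations(r_xy, r_my, r_xm):
    """R2 mediation measure: r2_MY - (R2_Y,MX - r2_XY)."""
    # Overall R2 for Y regressed on X and M, from zero-order correlations.
    r2_full = (r_xy ** 2 + r_my ** 2 - 2 * r_xy * r_my * r_xm) / (1 - r_xm ** 2)
    return r_my ** 2 - (r2_full - r_xy ** 2)

# Uncorrelated predictors: no shared variance, so the measure is zero.
no_overlap = r2_med_from_correlations(r_xy=0.30, r_my=0.40, r_xm=0.0)

# Hypothetical suppression pattern: X is unrelated to Y but related to M.
suppression = r2_med_from_correlations(r_xy=0.0, r_my=0.5, r_xm=0.5)
print(no_overlap, suppression)  # the second value is negative
```

In the second case the overall R2 exceeds the sum of the two squared zero-order correlations, so the common effect falls below zero.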
Pilot research on the bias of the $R^2_{\text{med}}$ measure has been conducted on a small set of parameter values and sample sizes (Taborga, 2000). Simulation results suggested that the measure was unbiased at varying sample size for the medium and large effect sizes studied in the simulation. The purpose of the present study was to further investigate R2 effect-size measures for mediation analysis. To that end, we evaluated the overall $R^2_{\text{med}}$ measure for mediation in a comprehensive set of parameter combinations to expand empirical evidence for its use and to examine component r2 measures for the mediation model. Investigating the measure in a wider set of parameters will demonstrate its performance in a variety of contexts. Simulation results for the accompanying component r2 measures in the mediation model (i.e., $r^2_{MX}$, $r^2_{YM.X}$, and $r^2_{YX.M}$) will complete the presentation of R2 measures for mediation. Although previous simulation work has been published on r2 measures (Algina & Olejnik, 2003; Wang & Thompson, 2007), these studies do not preclude further examination of the measures in the present article. The sample sizes, effect sizes, and simulation outcome measures in the present study differ from those in previous research, and no previous study has examined the special case of r2 for mediation models. Following the presentation of the simulation work, the R2 effect-size measures for the mediation model are applied to a real mediation example using data from a team-based program to improve the nutrition and exercise habits of firefighters.
METHOD
Simulation overview
Data from a single-mediator model were simulated using the SAS programming language, Version 9.1. Three population variables were specified on the basis of the population parameters α, β, and τ′ in Equations 1 and 2. Values of the X variable in the mediation model were generated using the SAS RANNOR function, which gives normally distributed random numbers with a mean of 0 and standard deviation of 1. Values of the M and Y variables were generated with Equations 1 and 2, using different values of α, β, and τ′, as will be described below. The residuals for these variables were also generated using SAS RANNOR. True variances, covariances, and correlations for the model parameters were derived with covariance algebra. Model conditions were generated so that all effect-size combinations of the α, β, and τ′ parameters (i.e., .00 = null effect, .14 = small effect, .39 = medium effect, .59 = large effect; Cohen, 1988) were examined in six sample sizes: N = 50, N = 100, N = 200, N = 300, N = 500, and N = 1,000. Effect sizes examined in the present study corresponded to those effect sizes that were defined by Cohen as small, medium, and large. Sample sizes examined in the present study corresponded to a range of possible sample sizes typically observed for studies in psychology. All possible combinations of the simulation conditions yielded a 4 × 4 × 4 × 6 factorial design with 384 conditions, each of which was replicated 1,000 times, yielding 384,000 total data sets. Overall sample means and the average bias were computed for the R2 measures from these data. Figure 3 provides a flowchart of the simulation work conducted in the present study.
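The design described above can be sketched in Python as a rough analogue of the SAS procedure. The sketch assumes zero intercepts and standard normal residuals; `simulate_single_mediator` and the seed are illustrative choices, and NumPy's `standard_normal` stands in for SAS RANNOR.

```python
import itertools
import numpy as np

EFFECT_SIZES = (0.00, 0.14, 0.39, 0.59)       # null, small, medium, large
SAMPLE_SIZES = (50, 100, 200, 300, 500, 1000)
REPLICATIONS = 1000

def simulate_single_mediator(n, alpha, beta, tau_prime, rng):
    """Generate one data set from Equations 1 and 2 (intercepts assumed 0)."""
    x = rng.standard_normal(n)                           # X ~ N(0, 1)
    m = alpha * x + rng.standard_normal(n)               # Equation 2
    y = tau_prime * x + beta * m + rng.standard_normal(n)  # Equation 1
    return x, m, y

# Full factorial design: alpha x beta x tau' x N.
conditions = list(itertools.product(
    EFFECT_SIZES, EFFECT_SIZES, EFFECT_SIZES, SAMPLE_SIZES))
total_data_sets = len(conditions) * REPLICATIONS

# One replication of the last condition, as a smoke test of the generator.
rng = np.random.default_rng(2009)
alpha, beta, tau_prime, n = conditions[-1]
x, m, y = simulate_single_mediator(n, alpha, beta, tau_prime, rng)
print(len(conditions), total_data_sets, x.shape)
```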
Population mediation models in the simulation were: (1) no mediation (i.e., either α or β = 0, or α and β = 0), (2) partial mediation (i.e., α ≠ 0, β ≠ 0, and τ′ ≠ 0), or (3) complete mediation (i.e., α ≠ 0, β ≠ 0, and τ′ = 0). The outcome variable for the study was the bias of the effect-size point estimates for the R2 measures, where bias was defined as the difference between the sample estimate and the population parameter value. Although relative bias, or bias divided by the population parameter value, is sometimes reported to provide a standardized metric of bias in simulation work, the measures considered in the present study already had a standardized metric for bias. Specifically, R2 values have an intrinsic metric that is naturally standardized, ranging from 0 to 1, and they also have a uniform interpretation: the percentage of variance explained in a criterion variable. Moreover, relative bias is undefined for simulation conditions in which the true value of the parameter is 0 (thus yielding division by 0), and it may exaggerate small differences when the true value of the parameter is small. R2 estimates were considered unbiased if the estimate of bias was .01 or less, indicating that the difference between the population value and the sample estimate of the R2 measure was no more than 1% of the total variance accounted for in the mediation model.
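A minimal sketch of the bias computation follows. It derives the population value of the overall mediation measure by covariance algebra, assuming X and both residuals have unit variance (consistent with standard normal generation), and defines bias as the mean sample estimate minus that population value. The function names and the sample estimates passed to `bias` are hypothetical.

```python
def population_r2_med(alpha, beta, tau_prime):
    """Population r2_MY - (R2_Y,MX - r2_XY), assuming Var(X) = Var(e1) = Var(e2) = 1."""
    var_m = alpha ** 2 + 1.0
    cov_xy = tau_prime + alpha * beta
    cov_my = tau_prime * alpha + beta * var_m
    var_y = tau_prime ** 2 + beta ** 2 * var_m + 2 * tau_prime * beta * alpha + 1.0
    r2_xy = cov_xy ** 2 / var_y              # Var(X) = 1
    r2_my = cov_my ** 2 / (var_m * var_y)
    r2_full = 1.0 - 1.0 / var_y              # residual variance of Y is 1
    return r2_my - (r2_full - r2_xy)

def bias(sample_estimates, population_value):
    """Mean sample estimate minus the population value."""
    return sum(sample_estimates) / len(sample_estimates) - population_value

def is_acceptable(bias_value, criterion=0.01):
    """Criterion from the text: absolute bias of .01 or less."""
    return abs(bias_value) <= criterion

pop_complete = population_r2_med(0.59, 0.59, 0.0)   # complete mediation
pop_partial = population_r2_med(0.59, 0.59, 0.59)   # partial mediation
pop_null = population_r2_med(0.0, 0.39, 0.0)        # no mediation (alpha = 0)

# Made-up estimates, only to show how the bias criterion would be applied.
example_bias = bias([0.080, 0.085, 0.083], pop_complete)
print(pop_complete, pop_partial, pop_null, is_acceptable(example_bias))
```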
RESULTS
Mean estimates for the component r2 measures for mediation ranged from .001 to .266 for $r^2_{MX}$, from .001 to .266 for $r^2_{YM.X}$, and from .001 to .246 for $r^2_{YX.M}$. Mean estimates of the overall $R^2_{\text{med}}$ measure ranged from .001 to .280. Although mean estimates of the measure increased as intended with the magnitude of the mediated effect, αβ, the estimates were also dependent on the magnitude of the direct effect, τ′. For example, the mean estimate of $R^2_{\text{med}}$ for the N = 50, α = .59, β = .59, τ′ = 0 condition was .082. However, the estimate for the same condition increased to .277 when τ′ = .59. This dependency is an artifact of the Venn diagram system used to formulate the measure. Specifically, the relation of all variables in the single mediator model (i.e., X, M, and Y) must be modeled to recover information on the α and β paths. Modeling the three variables necessitates inclusion of the relation between X and Y, however (see Figure 2).
Estimates for the component r2 measures, $r^2_{MX}$, $r^2_{YM.X}$, and $r^2_{YX.M}$, were dependent only on sample size and the parameter values involved in their computation. For example, the sample estimates and estimates of bias for $r^2_{YM.X}$ were unaffected by the magnitude of the α and τ′ parameters because the correlation between M and Y was partialed for X. Likewise, $r^2_{MX}$ was unaffected by either β or τ′, and $r^2_{YX.M}$ was unaffected by α and β. Because the measures reflect unique sources of variance, simulation results for the component r2 measures could be collapsed across several conditions for more effective presentation.
All three r2 measures for component paths in the mediation model had acceptable bias for N ≥ 100 across effect-size conditions (see Table 1). Specifically, no estimates of bias for $r^2_{MX}$, $r^2_{YM.X}$, or $r^2_{YX.M}$ exceeded .01 in any condition in which N ≥ 100, indicating that sample estimates of these effect-size measures were an adequate gauge of the magnitude of individual paths within 1% of the total variance observed in either M or Y. When N = 50, the three component r2 measures had acceptable bias only if the effect size of the parameter was large. For example, if the parameter value of α was .59, the sample estimate of $r^2_{MX}$ had bias less than .01. If the parameter value of β was .59, the sample estimate of $r^2_{YM.X}$ had acceptable bias, and so forth. In contrast, the overall $R^2_{\text{med}}$ measure for mediation had acceptable bias across all sample-size and effect-size combinations (see Table 2). Its bias never exceeded .006 in any parameter combination, indicating that the measure made errors of at most just over one half of a percent of variance accounted for in the mediation model. SAS and SPSS code to compute all of the R2 measures for mediation analysis described in the present article is presented in the Appendix.
Table 1.
Parameter | True Value | N = 50 | N = 100 | N = 200 | N = 300 | N = 500 | N = 1,000
---|---|---|---|---|---|---|---
$r^2_{MX}$ | α = 0 | .020 | .010 | .005 | .003 | .002 | .001
$r^2_{MX}$ | α = .14 | .019 | .009 | .005 | .003 | .002 | .001
$r^2_{MX}$ | α = .39 | .013 | .007 | .003 | .002 | .001 | .000
$r^2_{MX}$ | α = .59 | .008 | .004 | .001 | .002 | .001 | .001
$r^2_{YM.X}$ | β = 0 | .021 | .010 | .005 | .003 | .002 | .001
$r^2_{YM.X}$ | β = .14 | .020 | .001 | .005 | .003 | .002 | .001
$r^2_{YM.X}$ | β = .39 | .015 | .001 | .003 | .002 | .002 | .001
$r^2_{YM.X}$ | β = .59 | .008 | .004 | .002 | .001 | .000 | .000
$r^2_{YX.M}$ | τ′ = 0 | .020 | .010 | .005 | .003 | .002 | .001
$r^2_{YX.M}$ | τ′ = .14 | .020 | .010 | .005 | .003 | .002 | .001
$r^2_{YX.M}$ | τ′ = .39 | .014 | .007 | .004 | .002 | .001 | .001
$r^2_{YX.M}$ | τ′ = .59 | .009 | .005 | .002 | .001 | .001 | .000
Note—Estimates of bias for each component r2 measure for mediation are collapsed across simulation parameters by which bias was not affected.
Table 2.
Input Parameters | N = 50 | N = 100 | N = 200 | N = 300 | N = 500 | N = 1,000 |
---|---|---|---|---|---|---|
Null Direct Effect (τ′ = 0) | ||||||
α = 0, β = 0 | .000 | −.000 | −.000 | −.000 | −.000 | −.000 |
α = 0, β = .14 | .001 | .000 | .000 | .000 | .000 | .000 |
α = 0, β = .39 | .002 | .001 | .001 | .000 | .000 | .000 |
α = 0, β = .59 | .004 | .003 | .001 | .001 | .001 | .000 |
α = .14, β = 0 | −.000 | .000 | −.000 | −.000 | .000 | .000 |
α = .14, β = .14 | .001 | .000 | .000 | .000 | −.000 | .000 |
α = .14, β = .39 | .003 | .002 | .000 | .001 | .000 | .000 |
α = .14, β = .59 | .005 | .003 | .002 | .001 | .001 | .000 |
α = .39, β = 0 | .001 | .000 | −.000 | −.000 | −.000 | .000 |
α = .39, β = .14 | −.001 | .000 | −.000 | −.000 | −.000 | .000 |
α = .39, β = .39 | .003 | .000 | −.001 | .001 | .000 | −.000 |
α = .39, β = .59 | .005 | .003 | .002 | −.000 | .000 | .000 |
α = .59, β = 0 | .000 | .000 | −.000 | .000 | .000 | −.000 |
α = .59, β = .14 | −.000 | −.000 | .000 | −.000 | .000 | −.000 |
α = .59, β = .39 | .003 | .002 | .000 | .001 | −.001 | .001 |
α = .59, β = .59 | −.000 | .001 | .001 | .001 | .001 | .000 |
Small Direct Effect (τ′ = .14) | ||||||
α = 0, β = 0 | .000 | .000 | .000 | .000 | .000 | .000 |
α = 0, β = .14 | .001 | .001 | .000 | .000 | .000 | −.000 |
α = 0, β = .39 | .002 | .002 | .001 | .000 | .001 | .000 |
α = 0, β = .59 | .005 | .002 | .001 | .001 | .001 | .001 |
α = .14, β = 0 | −.000 | .000 | .000 | .000 | .000 | .000 |
α = .14, β = .14 | .000 | .000 | .000 | .000 | .000 | −.000 |
α = .14, β = .39 | .002 | .001 | .001 | .000 | .000 | −.000 |
α = .14, β = .59 | .003 | .002 | .002 | .000 | .001 | .000 |
α = .39, β = 0 | −.000 | −.000 | .000 | .000 | .000 | .000 |
α = .39, β = .14 | −.001 | −.000 | −.000 | −.000 | −.000 | .000 |
α = .39, β = .39 | .000 | −.001 | .000 | .001 | −.000 | .000 |
α = .39, β = .59 | .001 | .001 | .000 | .001 | .002 | −.000 |
α = .59, β = 0 | .001 | .000 | .000 | .000 | .000 | −.000 |
α = .59, β = .14 | .000 | −.001 | .000 | −.000 | −.000 | −.000 |
α = .59, β = .39 | −.003 | −.001 | −.000 | .000 | .001 | −.001 |
α = .59, β = .59 | −.001 | .001 | −.000 | .001 | −.001 | .001 |
Medium Direct Effect (τ′ = .39) | ||||||
α = 0, β = 0 | .003 | .001 | .001 | .001 | .000 | .000 |
α = 0, β = .14 | .003 | .002 | .001 | .000 | .000 | .000 |
α = 0, β = .39 | .005 | .001 | .001 | −.000 | −.000 | −.000 |
α = 0, β = .59 | .006 | .003 | .000 | .000 | .000 | .001 |
α = .14, β = 0 | .002 | .001 | .001 | .001 | .000 | .000 |
α = .14, β = .14 | .002 | −.000 | .001 | .000 | .000 | .000 |
α = .14, β = .39 | .004 | .001 | .000 | .001 | .000 | .000 |
α = .14, β = .59 | .001 | .002 | .001 | −.000 | .001 | −.000 |
α = .39, β = 0 | .002 | .001 | −.000 | −.000 | −.000 | −.000 |
α = .39, β = .14 | .000 | −.001 | −.000 | .000 | −.000 | −.000 |
α = .39, β = .39 | −.004 | .004 | .000 | .000 | −.000 | .000 |
α = .39, β = .59 | .002 | .001 | −.001 | .000 | −.001 | −.001 |
α = .59, β = 0 | −.000 | .000 | .001 | .001 | .001 | −.000 |
α = .59, β = .14 | .002 | −.001 | −.001 | −.001 | .001 | −.000 |
α = .59, β = .39 | −.002 | −.002 | .001 | .000 | −.001 | −.000 |
α = .59, β = .59 | −.003 | −.002 | −.001 | .001 | −.002 | −.000 |
Large Direct Effect (τ′ = .59) | ||||||
α = 0, β = 0 | .004 | .003 | .001 | .001 | .000 | .000 |
α = 0, β = .14 | .005 | .003 | .001 | .001 | .001 | .000 |
α = 0, β = .39 | .003 | .001 | .002 | −.001 | .000 | .000 |
α = 0, β = .59 | .006 | .001 | .002 | .002 | −.000 | .000 |
α = .14, β = 0 | .004 | .002 | .002 | .000 | .000 | .000 |
α = .14, β = .14 | .003 | .001 | .001 | .001 | .000 | −.000 |
α = .14, β = .39 | −.001 | .003 | .001 | .001 | .000 | −.000 |
α = .14, β = .59 | −.001 | −.003 | .001 | −.001 | −.001 | −.000 |
α = .39, β = 0 | .002 | .002 | .001 | −.000 | .000 | .000 |
α = .39, β = .14 | −.003 | .001 | −.000 | .001 | −.001 | −.000 |
α = .39, β = .39 | .001 | .001 | −.000 | −.000 | −.000 | −.000 |
α = .39, β = .59 | −.001 | −.000 | −.001 | −.001 | −.001 | −.000 |
α = .59, β = 0 | −.000 | .002 | .001 | .002 | −.002 | −.000 |
α = .59, β = .14 | −.002 | −.001 | −.002 | −.000 | .001 | .001 |
α = .59, β = .39 | −.004 | −.004 | −.001 | .001 | −.001 | −.000 |
α = .59, β = .59 | −.003 | −.004 | −.002 | .001 | −.002 | −.000 |
R2 Effect-Size Measures in a Real-Data Example
The PHLAME (“Promoting healthy lifestyles: Alternative models’ effects”; Moe et al., 2002) study prospectively compared the ability of two methods of behavior change, relative to a control condition, to promote healthy eating and exercise habits in Pacific Northwest firefighters. Although firefighters have risk factors for heart disease that are similar to those of other U.S. adults (Moe et al., 2002), extreme episodic physical activity in the workplace puts them at a higher risk for heart attacks. Accordingly, the PHLAME intervention sought to decrease these risk factors by improving overall nutrition and exercise habits of the cohort. Although several outcome measures were outlined in the PHLAME program, the example used in the present study considered the effect of PHLAME’s team intervention on the physical activity of study participants.
The PHLAME program recruited 657 total participants from 35 stations in five Oregon and Washington fire department districts. Participants ranged in age from 20 to 69, with a mean age of approximately 40 years. The sampled firefighters were predominantly men (96.2%), and a majority of the participants were also married (77.8%) and Caucasian (90.1%). Multiple minority groups represented the remaining 9.9% of the sample. Most participants worked over 50 hours a week (82.9%).
Firefighters were randomly assigned by station to a motivational interviewing (MI) curriculum, a team-based (TEAM) curriculum, or a control group. All participants were measured at three annual time points: (1) baseline, or the preintervention phase; (2) 1 year after the implementation of the intervention; and (3) 2 years after program implementation. One of the mechanisms through which the TEAM intervention was hypothesized to achieve its effects was through increasing the intentions of participants to improve their nutrition and exercise behavior. By implementing the TEAM curriculum in fire stations, PHLAME was able to create a culture of healthy eating and activity levels, thus promoting individual gains and intentions to improve diet and exercise habits. The mediation model in the present example considers the impact of PHLAME’s TEAM intervention (X) on participants’ intentions to exercise (M), and participants’ subsequent physical activity (Y). Participants’ intentions to exercise were measured on a 5-point Likert scale that ranged from “I do not regularly exercise and do not intend to start within the next 6 months” to “I regularly exercise and have been regularly exercising for more than 6 months.” Physical activity of the participants was measured on a 7-point Likert scale that quantified the number of days per week the subject engaged in exercise that worked up a sweat.
The PHLAME trial was a cluster-randomized design in which the intervention was randomly assigned to fire stations within a district. The average cluster size in the study was about 10 firefighters per station. This clustering of the sample means that the data are multilevel, so that the observations in a single cluster may be more highly related to one another than are observations across clusters (potentially creating dependence in the observations). However, intraclass correlation coefficients computed for the intent to exercise and level of physical activity variables were .002 and .000, yielding design effects of 1.018 and 1.000, respectively. Since the average cluster size in the data was 10, the obtained ICC values and design effects were well within the range in which dependency is small enough to be ignorable (B. [O.] Muthén, 1997; B. O. Muthén & Satorra, 1995). Because the variables used in this example were not significantly affected by the multilevel structure of the data, and also because the point of the present article was to describe and illustrate R2 effect-size measures, mediation analysis on the basis of conventional regression models was used for analysis.
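The reported design effects are consistent with the usual formula, design effect = 1 + (average cluster size − 1) × ICC. The text does not state which formula was used, so the following sketch rests on that assumption; `design_effect` is a hypothetical helper name.

```python
def design_effect(avg_cluster_size, icc):
    """Design effect under the common 1 + (m - 1) * ICC approximation."""
    return 1.0 + (avg_cluster_size - 1.0) * icc

# Values reported in the text: average cluster size of about 10.
deff_intent = design_effect(10, 0.002)     # intent to exercise
deff_activity = design_effect(10, 0.000)   # level of physical activity
print(round(deff_intent, 3), round(deff_activity, 3))
```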
Mediation analysis (MacKinnon et al., 2002) was conducted to explore the mediated relation of PHLAME’s TEAM intervention on participant physical activity through increased intentions to exercise. Regression models were run in SAS 9.1, where the Time 2 outcome measure of days per week engaged in exercise that works up a sweat was the dependent variable, and the Time 2 measure of participants’ intention to exercise was the mediating variable.
The point estimate of the mediated effect (αβ), called mediatedeffect in the program code, was .193. The PRODCLIN program (MacKinnon, Fritz, Williams, & Lockwood, 2007) was used to find an asymmetric confidence interval for the estimate ([.008, .382], p < .05). Using the overall R2 statistic (.376; called overallrsquared in the program code) and the raw correlations (rMY = .612, rXY = .086) from the mediation model defined by Equations 1–2 (called rmy and rxy in the code, respectively), the R2 mediated measure for the relation was $.612^2 - (.376 - .086^2) = .006$. This measure is referred to as rsquaredmediated in the code. The component r2 measures for the individual contribution of paths in the mediation model (the squared correlation between X and M, the squared partial correlation between M and Y controlling for X, and the squared partial correlation between X and Y controlling for M) are referred to as rxmsquared, partialrmy_xsquared, and partialrxy_msquared in the code, respectively. The overall R2 mediated value of .006 indicates that slightly less than 1% of the variance in participants’ physical activity is attributable to the indirect effect of the PHLAME TEAM intervention through increasing participants’ intentions to exercise. If we consider that approximately 38% of the total variance in participants’ physical activity is explained (overall R2 = .376), we can say that 1.6% (.006/.376) of the explained variance in the model was due to the mediated effect. Note that although the mediated relation for these data was statistically significant, the effect size of the point estimate suggests that this effect is small. Using the component r2 measures from the data, weak areas of the mediation model can be isolated, and this information can be used to improve understanding of the program. An examination of the individual contributions of component paths indicates that the weakest link in the model is the relation between X and M, and that the strongest link in the model is the relation between M and Y.
This finding suggests that intentions to exercise are strongly related to subsequent physical activity, but that there is room in the PHLAME program to improve the manipulation of participants’ intentions to exercise. Researchers can now use the results of this R2 effect-size analysis for mediation to inform later implementations of the PHLAME program.
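The arithmetic behind the overall measure in this example can be checked directly from the reported summary statistics. The sketch below is illustrative Python, not part of the article's SAS or SPSS programs; it uses the same formula as the rsquaredmediated line of those programs, and the function name `r_squared_mediated` is an assumption made here.

```python
def r_squared_mediated(r_my: float, r_xy: float, overall_r2: float) -> float:
    """Variance in Y explained by the mediated effect:
    r_MY^2 - (overall R^2 - r_XY^2)."""
    return r_my ** 2 - (overall_r2 - r_xy ** 2)


# Summary statistics reported for the PHLAME example
r2_med = r_squared_mediated(r_my=0.612, r_xy=0.086, overall_r2=0.376)

# Share of the explained variance attributable to the mediated effect,
# using the rounded value as in the text (.006 / .376)
share_of_explained = round(r2_med, 3) / 0.376

print(f"R2 mediated: {r2_med:.3f}")
print(f"share of explained variance: {share_of_explained:.1%}")
```

The rounded results agree with the values given in the example (.006 and 1.6%).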
DISCUSSION
The goal of the present article was to present and describe the statistical performance of R2 effect-size measures for the single mediator model. Simulation results showed that both the overall and the component r2 mediated measures are stable measures of effect size for N ≥ 100 across effect-size parameters of α, β, and τ′. The overall effect-size measure had acceptable bias in all effect-size combinations across sample sizes, and the component r2 mediated measures had acceptable bias for N ≥ 100 across effect-size parameters. If the effect size of the parameter for the r2 mediated measures was large (.59), the measures had acceptable bias when N ≥ 50.
Despite the favorable findings in the present article, the R2 measures for mediation have limitations. First, a correctly specified mediation model was assumed, and it is unclear how the R2 measures will perform for misspecified models. Second, the measures were not extended to more complicated models, such as structural equation models with more than one dependent variable, multilevel models, or categorical data models. At present, these measures are appropriate for data with continuous outcomes and no hierarchical structure. However, the measures may also be applied to multilevel data in which the dependency is not substantial, provided that ordinary least squares estimation is used. Recall that R2 measures are based on the ordinary least squares regression model, which uses a least squares minimization function as its fit criterion. Structural equation models, hierarchical linear models, and logistic regression models estimate parameters with maximum-likelihood procedures that iteratively arrive at an appropriate solution rather than minimize residual variance in a criterion variable. Thus, R2 measures as they are conventionally defined are not yet applicable to these models. However, several pseudo-R2 measures have been developed for use with these types of data, and it may be possible to extend the present methods in a similar way.
Although bias for the R2 mediated measure was acceptable, true values and sample estimates of the measure were small in most cases. Even when the effect sizes of α and β were large, true values and mean sample estimates of the measure never exceeded .280 in any condition. The small value of the effect-size measure reflects the size of effects of paths in the mediation model, however, and Abelson (1985) has argued that small effects in a variance-explained framework are often meaningful.
Small values of the R2 mediated measure occur because the measure isolates the variance explained in a dependent measure by squaring various elements involved in the product of two variables. Squaring or taking the product of any estimate reduces its size if values are less than one. Hence, the magnitude of the effect size for the mediated effect is bounded by the effect size of the α and β parameters. An examination of the component r2 measures for mediation complements results from the overall R2 mediated measure, offering additional information on effect size in the model. The mixture of several sources of information in this framework is what makes R2 mediated-effect measures a comprehensive resource for obtaining a clear picture of effect size in mediation analysis.
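The bounding argument above can be illustrated with a simple special case. This is a sketch under assumptions made here, not a result reported in the article. Suppose X, M, and Y are standardized and the direct effect τ′ is zero. Then rXM = α, rMY = β, rXY = αβ, and the overall R2 (written $R^2_{Y,MX}$ here) equals β2. The formula used for rsquaredmediated in the appendix code then reduces to the squared product of the two paths:

```latex
\begin{align*}
R^2_{\text{med}} &= r_{MY}^2 - \left(R^2_{Y,MX} - r_{XY}^2\right) \\
                 &= \beta^2 - \left(\beta^2 - \alpha^2\beta^2\right) \\
                 &= (\alpha\beta)^2 .
\end{align*}
```

With α = β = .59 in this special case, for instance, the measure is (.59 × .59)² ≈ .121, which is smaller than either squared path (.59² ≈ .348).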
In summary, the results of the present study supported the R2 effect-size measures for mediation. Estimates of bias for the measures were generally acceptable across parameter combinations. The effect-size measures aim to provide a cohesive look at effect size in the mediation model by describing the effect sizes of component paths, as well as by isolating the variance explained by a mediated effect in an outcome variable. By combining results from the overall R2 mediated measure with information drawn from the r2 measures of individual paths in the mediation model, researchers are not only able to gauge the practical significance of a mediated effect, but can also identify areas of the model that need improvement or require further study. The application of the measures to a real data example provides a guide for how to implement the measures in analysis.
Acknowledgments
This research was supported by the National Institute on Drug Abuse Grant DA09757. A portion of this article was presented at the 2005 Annual Meeting of the American Psychological Association in Washington, DC.
APPENDIX
The following SAS and SPSS programs compute partial r2 and R2 quantities for the mediated effect described in the present article. The programs specify three variables for analysis: X (the independent variable in the study), M (the mediating variable in the study), and Y (the outcome variable in the study). Both programs are automated macros that only require the user to assign a dataset and variable names for X, M, and Y in the program.
SPSS Macro to Compute R2 Effect-Size Measures for Mediation
* To use the program, scroll down to the line that says USING THE PROGRAM. *.
* Define macro *.
DEFINE rsquare (dataname = !TOKENS(1)
               / x = !TOKENS(1)
               / m = !TOKENS(1)
               / y = !TOKENS(1) )
* Get data and make new variables x, m, and y for easier reference *.
dataset activate !dataname window=asis.
compute x = !x.
compute m = !m.
compute y = !y.
exe.
* Get correlations among variables *.
dataset declare rawcorrs.
correlations variables = x m y
  /matrix out(rawcorrs).
exe.
dataset activate rawcorrs window=asis.
dataset copy corrs1 window=hidden.
dataset activate corrs1.
select if ROWTYPE_ = 'CORR' and VARNAME_ = 'x'.
compute rxm = m.
compute rxy = y.
exe.
dataset activate rawcorrs window=asis.
dataset copy corrs2 window=hidden.
dataset activate corrs2.
select if ROWTYPE_ = 'CORR' and VARNAME_ = 'm'.
compute rmy = y.
exe.
* Regress m on x *.
dataset declare regmonx.
dataset activate !dataname.
regression
  /dependent = m
  /method = enter x
  /outfile covb ('regmonx').
exe.
dataset activate regmonx window=asis.
select if ROWTYPE_ = 'EST'.
compute alpha = x.
exe.
* Regress y on x and m *.
dataset declare regyonxm.
dataset activate !dataname.
regression
  /dependent = y
  /method = enter x m
  /outfile covb ('regyonxm').
exe.
dataset activate regyonxm window=asis.
select if ROWTYPE_ = 'EST'.
compute tauprime = x.
compute beta = m.
exe.
* Combine results into a single dataset *.
match files file = corrs1
  /file = corrs2
  /file = regmonx
  /file = regyonxm
  /keep = rxm rxy rmy alpha beta tauprime.
exe.
* Compute r-squared quantities *.
dataset name combined.
* Find the mediated effect *.
compute mediatedeffect = alpha*beta.
* Find rxm squared *.
compute rxmsquared = rxm**2.
* Find the squared partial correlation for tau prime *.
compute partialrxy_msquared = ( (rxy-rmy*rxm) / (sqrt( (1-rmy**2) * (1-rxmsquared) )) )**2.
* Find the squared partial correlation for beta *.
compute partialrmy_xsquared = ( (rmy-rxy*rxm) / (sqrt( (1-rxy**2) * (1-rxmsquared) )) )**2.
* Find the squared multiple correlation of Y, M, and X *.
compute overallrsquared = ( ( (rxy**2) + (rmy**2) )-(2*rxy*rmy*rxm) ) / ( 1-(rxmsquared) ).
* Find the amount of variance in Y explained by the mediated effect *.
compute rsquaredmediated = (rmy**2) - (overallrsquared - (rxy**2) ).
formats mediatedeffect rxm rxmsquared rxy rmy partialrxy_msquared
  partialrmy_xsquared overallrsquared rsquaredmediated (F10.5).
exe.
dataset close rawcorrs.
dataset close corrs1.
dataset close corrs2.
dataset close regmonx.
dataset close regyonxm.
print records = 12
  / 'alpha = ' alpha
  / 'beta = ' beta
  / 'tauprime = ' tauprime
  / 'mediatedeffect = ' mediatedeffect
  / 'rxm = ' rxm
  / 'rxmsquared = ' rxmsquared
  / 'rxy = ' rxy
  / 'rmy = ' rmy
  / 'partialrxy_msquared = ' partialrxy_msquared
  / 'partialrmy_xsquared = ' partialrmy_xsquared
  / 'overallrsquared = ' overallrsquared
  / 'rsquaredmediated = ' rsquaredmediated.
exe.
!ENDDEFINE.

* USING THE PROGRAM *.
* To use the program, first select the section above (from DEFINE to !ENDDEFINE) and run it. *.
* This defines the macro for SPSS so that it will be recognized once it is invoked with the macro call. *.
* Then to run the example, simply highlight the code from *AN EXAMPLE* to the end of the program *.
* Alternatively, to run the macro for your own data, edit the line below that says rsquare "dataname = nameofdataset x = xvar m = mvar y = yvar" *.
* to direct the program to your dataset and variables. Replace the word 'nameofdataset' with the name of the SPSS dataset *.
* that has the data you want to analyze. If you only have one dataset open, you can just put an asterisk (*) here. *.
* Replace 'xvar' with the name of the predictor variable. Replace 'mvar' with the name of the mediating variable. *.
* Replace 'yvar' with the name of the outcome variable. Delete the asterisk from the beginning of the line. *.
* Then select that line of the program and run it. *.
* rsquare dataname = nameofdataset x = xvar m = mvar y = yvar.

* AN EXAMPLE *.
data list free /predictor mediator outcome.
begin data
1 2 3
2 1 2
3 3 1
4 5 4
5 4 8
6 9 9
7 8 7
8 7 6
9 6 5
end data.
dataset name example.
rsquare dataname = example x = predictor m = mediator y = outcome.
SAS Macro to Compute R2 Effect-Size Measures for Mediation
* This SAS program computes partial r-squared and R-squared quantities;
* for the mediated effect described in this paper;
* To use the program, scroll down to the line that says USING THE PROGRAM;
* Define macro;
%macro rsquare(dataset,x,m,y);
%* Get data and make new variables x, m, and y for easier reference;
data newdata;
  set &dataset;
  x = &x;
  m = &m;
  y = &y;
run;
%* Get correlations among the variables and save them as a new dataset;
proc corr data=newdata outp=rawcorrs noprint;
  var x m y;
run;
data corrs1;
  set rawcorrs;
  if _NAME_ = 'x';
  rxm = m;
  rxy = y;
run;
data corrs2;
  set rawcorrs;
  if _NAME_ = 'm';
  rmy = y;
run;
data corrs;
  merge corrs1 corrs2;
  keep rxm rxy rmy;
run;
%* Regress m on x;
proc reg data=newdata outest=out1;
  model m = x;
run;
data regmonx;
  set out1;
  if _TYPE_='PARMS';
  alpha = x;
  keep alpha;
run;
%* Regress y on x and m;
proc reg data=newdata outest=out2;
  model y = x m;
run;
data regyonxm;
  set out2;
  if _TYPE_='PARMS';
  tauprime = x;
  beta = m;
  keep tauprime beta;
run;
%* Combine results into a single dataset and compute r-squared quantities;
data combined;
  merge corrs regmonx regyonxm;
  %* Compute the mediated effect;
  mediatedeffect = alpha*beta;
  %* Compute rxm squared;
  rxmsquared = rxm**2;
  %* Find the squared partial corr. for tau prime;
  partialrxy_msquared = ( (rxy-rmy*rxm) / (sqrt( (1-rmy**2) * (1-rxmsquared) )) )**2;
  %* Find the squared partial correlation for beta;
  partialrmy_xsquared = ( (rmy-rxy*rxm) / (sqrt( (1-rxy**2) * (1-rxmsquared) )) )**2;
  %* Find the multiple correlation of Y, M, and X (the model R-squared);
  overallrsquared = ( ( (rxy**2) + (rmy**2) )-(2*rxy*rmy*rxm) ) / ( 1-(rxmsquared) );
  %* Find the amount of variance in Y explained by the mediated effect;
  rsquaredmediated = (rmy**2) - (overallrsquared - (rxy**2) );
run;
proc print data=combined;
run;
%mend rsquare;

* USING THE PROGRAM;
* To use the program, first select the section above (from %macro to %mend) and run it;
* This defines the macro for SAS so that it will be recognized once it is invoked with the macro call;
* Then to run the example, simply highlight the code from *AN EXAMPLE: to the end of the program;
* Alternatively, to run the macro for your own data, edit the line below that says "%rsquare(dataset,x,m,y)" to direct the program to your dataset and variables;
* Replace the word 'dataset' with the name of the SAS dataset that has the data you want to analyze;
* Replace 'x' with the name of the predictor variable;
* Replace 'm' with the name of the mediating variable;
* Replace 'y' with the name of the outcome variable;
* Delete the asterisk from the beginning of the line, select that line of the program and run it;
*%rsquare(dataset,x,m,y);

* AN EXAMPLE: ;
data example;
  input predictor mediator outcome;
  cards;
1 2 3
2 1 2
3 3 1
4 5 4
5 4 8
6 9 9
7 8 7
8 7 6
9 6 5
;
%rsquare(example,predictor,mediator,outcome);
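For readers without access to SAS or SPSS, the same quantities can be sketched in a few lines of Python. This is not one of the article's programs: it is an illustrative port that assumes complete data in three equal-length lists, uses only the standard library, and reuses the output names of the macros above. The function name `r2_mediation_measures` is hypothetical.

```python
import math


def r2_mediation_measures(x, m, y):
    """Mediated effect and R-squared effect-size measures for a
    single-mediator model (X -> M -> Y), following the macro formulas."""
    n = len(x)
    mean_x, mean_m, mean_y = sum(x) / n, sum(m) / n, sum(y) / n

    def cross(u, mean_u, v, mean_v):
        # Sum of cross-products of deviations from the means
        return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))

    sxx = cross(x, mean_x, x, mean_x)
    smm = cross(m, mean_m, m, mean_m)
    syy = cross(y, mean_y, y, mean_y)
    sxm = cross(x, mean_x, m, mean_m)
    sxy = cross(x, mean_x, y, mean_y)
    smy = cross(m, mean_m, y, mean_y)

    # Raw correlations among X, M, and Y
    rxm = sxm / math.sqrt(sxx * smm)
    rxy = sxy / math.sqrt(sxx * syy)
    rmy = smy / math.sqrt(smm * syy)

    # OLS slopes: alpha from M on X; tauprime and beta from Y on X and M
    alpha = sxm / sxx
    det = sxx * smm - sxm ** 2
    tauprime = (sxy * smm - smy * sxm) / det
    beta = (smy * sxx - sxy * sxm) / det

    rxmsquared = rxm ** 2
    partialrxy_msquared = (
        (rxy - rmy * rxm) / math.sqrt((1 - rmy ** 2) * (1 - rxmsquared))
    ) ** 2
    partialrmy_xsquared = (
        (rmy - rxy * rxm) / math.sqrt((1 - rxy ** 2) * (1 - rxmsquared))
    ) ** 2
    overallrsquared = (rxy ** 2 + rmy ** 2 - 2 * rxy * rmy * rxm) / (1 - rxmsquared)
    rsquaredmediated = rmy ** 2 - (overallrsquared - rxy ** 2)

    return {
        "alpha": alpha, "beta": beta, "tauprime": tauprime,
        "mediatedeffect": alpha * beta,
        "rxm": rxm, "rxmsquared": rxmsquared, "rxy": rxy, "rmy": rmy,
        "partialrxy_msquared": partialrxy_msquared,
        "partialrmy_xsquared": partialrmy_xsquared,
        "overallrsquared": overallrsquared,
        "rsquaredmediated": rsquaredmediated,
    }


# Example data from the SAS and SPSS programs above
predictor = [1, 2, 3, 4, 5, 6, 7, 8, 9]
mediator = [2, 1, 3, 5, 4, 9, 8, 7, 6]
outcome = [3, 2, 1, 4, 8, 9, 7, 6, 5]

res = r2_mediation_measures(predictor, mediator, outcome)
for name, value in res.items():
    print(f"{name} = {value:.5f}")
```

For the example data, this sketch gives rxm = .80000 and rsquaredmediated = .38000. Running either macro on the same data is a quick way to check that the port and the original programs agree.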
Contributor Information
Amanda J. Fairchild, Email: afairchi@mailbox.sc.edu, University of South Carolina, Columbia, South Carolina.
David P. MacKinnon, Arizona State University, Tempe, Arizona
Marcia P. Taborga, Children’s Hospital, Los Angeles, California
Aaron B. Taylor, Texas A&M University, College Station, Texas
References
- Abelson RP. A variance explanation paradox: When a little is a lot. Psychological Bulletin. 1985;97:129–133.
- Algina J, Olejnik S. Sample size tables for correlation analysis with applications in partial correlation and multiple regression analysis. Multivariate Behavioral Research. 2003;38:309–324. doi: 10.1207/S15327906MBR3803_02.
- Baron RM, Kenny DA. The moderator–mediator variable distinction in social psychological research: Conceptual, strategic, and statistical considerations. Journal of Personality & Social Psychology. 1986;51:1173–1182. doi: 10.1037//0022-3514.51.6.1173.
- Chassin L, Pitts SC, DeLucia C, Todd M. A longitudinal study of children of alcoholics: Predicting young adult substance use disorders, anxiety and depression. Journal of Abnormal Psychology. 1999;108:106–119. doi: 10.1037//0021-843x.108.1.106.
- Cohen J. Statistical power analysis for the behavioral sciences. Hillsdale, NJ: Erlbaum; 1988.
- Cohen J, Cohen P, West S, Aiken L. Applied multiple regression/correlation analysis for the behavioral sciences. Mahwah, NJ: Erlbaum; 2003.
- Collins LM, Graham JW, Flaherty BP. An alternative framework for defining mediation. Multivariate Behavioral Research. 1998;33:295–312. doi: 10.1207/s15327906mbr3302_5.
- James LR, Mulaik SA, Brett JM. A tale of two methods. Organizational Research Methods. 2006;9:233–244.
- Judd CM, Kenny DA. Process analysis: Estimating mediation in treatment evaluations. Evaluation Review. 1981;5:602–619.
- Kraemer HC, Kiernan M, Essex M, Kupfer D. Moderators and mediators in biomedical research: The MacArthur and the Baron and Kenny approaches. 2004. Unpublished manuscript. doi: 10.1037/0278-6133.27.2(Suppl.).S101.
- MacKinnon DP. Introduction to statistical mediation analysis. Mahwah, NJ: Erlbaum; 2008.
- MacKinnon DP, Dwyer JH. Estimating mediated effects in prevention studies. Evaluation Review. 1993;17:144–158.
- MacKinnon DP, Fairchild AJ, Yoon M, Ryu E. Evaluation of the proportion mediated effect size measure of mediation. 2007. Unpublished manuscript.
- MacKinnon DP, Fritz MS, Williams J, Lockwood CM. Distribution of the product confidence limits for the indirect effect: Program PRODCLIN. Behavior Research Methods. 2007;39:384–389. doi: 10.3758/bf03193007.
- MacKinnon DP, Lockwood CM, Hoffman JM, West SG, Sheets V. A comparison of methods to test mediation and other intervening variable effects. Psychological Methods. 2002;7:83–104. doi: 10.1037/1082-989x.7.1.83.
- MacKinnon DP, Lockwood CM, Williams J. Confidence limits for the indirect effect: Distribution of the product and resampling methods. Multivariate Behavioral Research. 2004;39:99–128. doi: 10.1207/s15327906mbr3901_4.
- Moe EL, Elliot DL, Goldberg L, Kuehl KS, Stevens VJ, Breger RKR, et al. Promoting healthy lifestyles: Alternative models’ effects (PHLAME). Health Education Research. 2002;17:586–596. doi: 10.1093/her/17.5.586.
- Mood AM. Macro-analysis of the American educational system. Operations Research. 1969;17:770–784.
- Mood AM. Partitioning variance in multiple regression as a tool for developing learning models. American Educational Research Journal. 1971;8:191–202.
- Muthén B[O]. Latent variable modeling with longitudinal and multilevel data. In: Raftery A, editor. Sociological methodology. Vol. 27. Boston: Blackwell; 1997. pp. 453–480.
- Muthén BO, Satorra A. Complex sample data in structural equation modeling. In: Marsden PV, editor. Sociological methodology. Vol. 25. Washington, DC: American Sociological Association; 1995. pp. 267–316.
- Muthén LK, Muthén BO. Mplus user’s guide. 5th ed. Los Angeles: Muthén & Muthén; 2007.
- Newton RG, Spurrell DJ. A development of multiple regression for the analysis of routine data. Applied Statistics. 1967;16:51–64.
- Ouimette PC, Finney JW, Moos RH. Two-year post-treatment functioning and coping of substance abuse patients with posttraumatic stress disorder. Psychology of Addictive Behaviors. 1999;13:105–114.
- Seibold DR, McPhee RD. Commonality analysis: A method for decomposing explained variance in multiple regression analyses. Human Communication Research. 1979;5:355–365.
- Sobel ME. Asymptotic confidence intervals for indirect effects in structural equation models. Sociological Methodology. 1982;13:290–312.
- Spencer SJ, Zanna MP, Fong GT. Establishing a causal chain: Why experiments are often more effective than mediational analyses in examining psychological processes. Journal of Personality & Social Psychology. 2005;89:845–851. doi: 10.1037/0022-3514.89.6.845.
- Taborga MP. Effect size in mediation models. Arizona State University; Tempe: 2000. Unpublished master’s thesis.
- Wang Z, Thompson B. Is the Pearson r2 biased, and if so, what is the best correction formula? Journal of Experimental Education. 2007;75:109–125.
- West SG, Aiken LS, Todd M. Probing the effects of individual components in multiple component prevention programs. American Journal of Community Psychology. 1993;21:571–605. doi: 10.1007/BF00942173.
- Wolchik SA, West SG, Westover S, Sandler IN, Martin A, Lustig J, et al. The children of divorce parenting intervention: Outcome evaluation of an empirically based program. American Journal of Community Psychology. 1993;21:293–331. doi: 10.1007/BF00941505.