Abstract
Within the context of moderated multiple regression, mean centering is recommended both to simplify the interpretation of the coefficients and to reduce the problem of multicollinearity. For almost 30 years, theoreticians and applied researchers have advocated for centering as an effective way to reduce the correlation between variables and thus produce more stable estimates of regression coefficients. By reviewing the theory on which this recommendation is based, this article presents three new findings. First, that the original assumption of expectation-independence among predictors on which this recommendation is based can be expanded to encompass many other joint distributions. Second, that for many jointly distributed random variables, even some that enjoy considerable symmetry, the correlation between the centered main effects and their respective interaction can increase when compared with the correlation of the uncentered effects. Third, that the higher order moments of the joint distribution play as much of a role as lower order moments such that the symmetry of lower dimensional marginals is a necessary but not sufficient condition for a decrease in correlation between centered main effects and their interaction. Theoretical and simulation results are presented to help conceptualize the issues.
Keywords: linear model, moderated regression, interaction, multicollinearity
Introduction
Within the social, behavioral, and health sciences, the product-interaction term in multiple linear regression is perhaps the most prevalent and popular approach for accounting for nonlinear influences in statistical models (Baron & Kenny, 1986; Dawson, 2014; Wu & Zumbo, 2008). Although its origins can be traced to the early 20th century in Court (1930), the use of product-interactions was popularized by Cohen (1968), who advocated for multiple regression as a “general” data-analytic strategy, and took its final form in the seminal work of Aiken and West (1991) on simple slope analysis. The term moderation became synonymous with multiplicative interaction and, today, it is an essential tool in the methodologist’s repertoire and a staple of any introductory course on linear regression and the general linear model (Chaplin, 1991; Irwin & McClelland, 2001).
Using Aiken and West’s (1991) notation, the classical product-interaction approach in linear regression follows a model of the form:
(1) $Y = b_1X + b_2Z + b_3XZ + b_0 + \varepsilon$
for continuous, nondegenerate random variables X, Z, and Y, and where all the assumptions of ordinary least squares linear regression are satisfied, as described in Cohen, Cohen, West, and Aiken (2002). A recommended preliminary step before fitting model (1) is to subtract the mean from X and Z so that the predictors in the equation are the centered variables $x = X - E(X)$ and $z = Z - E(Z)$.
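For readers who want to see the centering step concretely, here is a minimal R sketch with simulated data and arbitrary coefficient values of our own choosing (the variable names mirror the notation above):

```r
set.seed(123)

n <- 1000
X <- rnorm(n, mean = 5)                 # uncentered predictor
Z <- rnorm(n, mean = 3)                 # uncentered predictor
Y <- 1 + 0.5 * X + 0.5 * Z + 0.3 * X * Z + rnorm(n)

# Centering: subtract the sample means
x <- X - mean(X)
z <- Z - mean(Z)

# Model (1) fit on the raw and on the centered predictors
fit_raw      <- lm(Y ~ X + Z + I(X * Z))
fit_centered <- lm(Y ~ x + z + I(x * z))

# The interaction coefficient is identical across the two fits;
# only the first-order coefficients (and their interpretation) change
coef(fit_raw)
coef(fit_centered)
```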
This process is known as “centering” and is usually advocated both on conceptual and statistical grounds. The conceptual reasoning is mostly due to the interpretation of the coefficients in the presence of the multiplicative interaction term, and a thorough discussion of this issue can be found in Bedeian and Mossholder (1994), Iacobucci, Schneider, Popovich, and Bakamitsos (2016), and Kraemer and Blasey (2004). The statistical reasoning, however, is the main focus of this article, which we expect to clarify by revisiting some old assumptions made by Aiken and West (1991).
The main statistical concern behind centering in regression models is the issue of multicollinearity. Product-interaction terms are generally highly correlated with their main effects and subtracting the mean of the predictors has been recommended to alleviate the issue (Cohen et al., 2002). Indeed, Cohen et al. (2002, p. 264) advise,
“If X and Z are each completely symmetrical, as in our numerical example, then the covariance between X and XZ is $\mathrm{Cov}(X, XZ) = E(Z)\mathrm{Var}(X) + E(X)\mathrm{Cov}(X, Z)$. If X and Z are centered, then E(X) and E(Z) are both zero and the covariance between X and XZ is zero as well. The same holds for the covariance between Z and XZ. The amount of correlation that is produced between X and XZ by the nonzero means of X and Z is referred to as nonessential multicollinearity. This nonessential multicollinearity is due purely to scaling; when variables are centered, it disappears. The amount of correlation between X and XZ that is due to skew in X cannot be removed by centering. This source of correlation between X and XZ is termed essential multicollinearity.”
We will see, however, that there are three mistakes in this typical advice. First, we provide an explicit example where X and Z are both completely symmetrical, yet the covariance identity given by Cohen et al. (2002) does not hold (see Figure 3 and accompanying text). Second, what Cohen et al. (2002) refer to as “nonessential multicollinearity” does not automatically disappear when variables are centered. And third, the implication that centering always reduces multicollinearity (by reducing or removing “nonessential multicollinearity”) is incorrect; in fact, in many cases, centering will greatly increase the multicollinearity problem.
The covariance between the product term and any of its constituent terms was derived by Bohrnstedt and Goldberger (1969) and reproduced in Aiken and West (1991) as

$\mathrm{Cov}(X, XZ) = E(Z)\mathrm{Var}(X) + E(X)\mathrm{Cov}(X, Z) + E(x^2z),$

where $x = X - E(X)$ and $z = Z - E(Z)$. If the predictors in a multiple regression are centered, then the first two terms vanish and the covariance between the main effects and their interaction is exclusively a function of $E(x^2z)$, a central third-order moment of the joint distribution. Aiken and West (1991), Osborne and Waters (2002), Tabachnick and Fidell (2007), and others further assume that the variables in the model are jointly normally distributed so that $E(x^2z)$ is in fact equal to 0. Note that this fact takes advantage of special properties of the bivariate normal distribution; it has nothing to do with the idea that odd moments of even functions are always zero. Indeed, such a statement only makes sense in a purely univariate setting, as the concept of distributional symmetry is not uniquely defined in higher dimensions. We will explore this subtlety in greater detail in the section “Symmetry and Expectation-Independence.”
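The Bohrnstedt–Goldberger identity is easy to verify numerically. The following R sketch is an illustration of ours, using a correlated and deliberately skewed pair so that $E(x^2z) \neq 0$; it compares the sample covariance of X and XZ against the three-term decomposition above:

```r
set.seed(456)

n <- 1e6
# A correlated, skewed pair: the shared component W induces
# dependence, and exponentiation induces skew
W <- rnorm(n)
X <- exp(0.5 * W + 0.5 * rnorm(n))
Z <- exp(0.3 * W + 0.7 * rnorm(n))

x <- X - mean(X)
z <- Z - mean(Z)

lhs <- cov(X, X * Z)                                          # Cov(X, XZ)
rhs <- mean(Z) * var(X) + mean(X) * cov(X, Z) + mean(x^2 * z)

c(lhs = lhs, rhs = rhs)   # agree up to sampling error
```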
Within the social sciences, most of what is known with regard to the properties of the covariance of the interaction term and its main effects comes indirectly from reinterpretations of the work of Bohrnstedt and Goldberger (1969). As such, understanding the assumptions made in the original source, and how they were translated into data-analytic practice, is crucial to better characterize the instances where they hold, where they are suspect, and how the overall process of analyzing multiplicative interactions would change if different assumptions were set in place. The main purpose of this article is, therefore, to highlight the original framework of Bohrnstedt and Goldberger (1969), with particular emphasis on the assumption they label “expectation-independence.” By adjusting this assumption, we present a class of bivariate random variables that exhibit a rotational form of expectation-independence, expanding the class of bivariate distributions where centering decreases the covariance between the main effects and their interaction. Moreover, we identify some prototypical cases where the opposite holds; that is, where centering increases (rather than decreases) the covariance between the main effects and their interaction. We derive a set of general conditions for this increase in covariance to take place. We will define a type of symmetry relevant to bivariate distributions and connect it to the concept of expectation-independence to show that this symmetry implies expectation-independence in the sense of Bohrnstedt and Goldberger (1969), so that the vanishing correlation between main effects and their multiplicative interaction is a consequence of it. We also present a graphical representation of these conditions and discuss their impact on the analysis of three-way interaction terms.
Results
Centered Versus Uncentered Covariance Products
Consider the jointly distributed, nondegenerate, continuous random variables X and Z, with Z expectation-independent of X; that is, $E(Z \mid X) = E(Z)$. This condition states that the average value of Z does not depend on the value of X. Geometrically, then, the conditional mean of Z traces a flat, horizontal line across all values of X, as it would, for instance, if the joint distribution of X and Z enjoyed reflectional symmetry about the line $Z = E(Z)$. Bohrnstedt and Goldberger (1969) show that Z expectation-independent of X yields $E(x^2z) = 0$, which implies that

(2) $E(X^2Z) = E(Z)E(X^2) + 2E(X)\mathrm{Cov}(X, Z).$
Using the definition of the covariance between random variables and substituting the expression derived in (2), the Aiken and West (1991) identity A.14 can be obtained as

$\mathrm{Cov}(X, XZ) = E(X^2Z) - E(X)E(XZ) = E(Z)\mathrm{Var}(X) + E(X)\mathrm{Cov}(X, Z).$
As before, set $x = X - E(X)$ and $z = Z - E(Z)$. It then follows from the previous definitions that $\mathrm{Cov}(x, xz) = E(x^2z)$ and, by expectation-independence, if $E(Z \mid X) = E(Z)$ then $E(z \mid X) = 0$, in which case $E(x^2z) = E[x^2E(z \mid X)] = 0$ and

(3) $\mathrm{Cov}(x, xz) = 0,$
which is the standard result presented in introductory textbooks on regression analysis. Note that we could derive the analogous result

$\mathrm{Cov}(z, xz) = 0$

by simply assuming that X is expectation-independent of Z.
If no kind of expectation-independence is assumed, however, the following expression can be derived for the difference between the uncentered and the centered covariances. The proof of this identity can be found in the appendix:

(4) $\mathrm{Cov}(X, XZ) - \mathrm{Cov}(x, xz) = E(Z)\mathrm{Var}(X) + E(X)\mathrm{Cov}(X, Z).$
From this identity, one can obtain four different conditions that govern when the covariance of the centered terms is larger in absolute value than the covariance of the uncentered terms; that is, when $|\mathrm{Cov}(x, xz)| > |\mathrm{Cov}(X, XZ)|$:

(5) $E(Z)\mathrm{Var}(X) + E(X)\mathrm{Cov}(X, Z) < 0$

(6) $E(Z)\mathrm{Var}(X) + E(X)\mathrm{Cov}(X, Z) < -2E(x^2z)$

(7) $E(Z)\mathrm{Var}(X) + E(X)\mathrm{Cov}(X, Z) > 0$

(8) $E(Z)\mathrm{Var}(X) + E(X)\mathrm{Cov}(X, Z) > -2E(x^2z)$
Note that there are really only two distinct conditions here, as replacing X by −X and Z by −Z will transform Conditions (5) and (6) into Conditions (7) and (8), respectively.
To apply these conditions in practice, one must first estimate $\mathrm{Cov}(x, xz) = E(x^2z)$. If we have $\mathrm{Cov}(x, xz) > 0$, and if exactly one of Conditions (5) or (6) holds, then centering will not reduce the correlation between X and XZ. On the other hand, if both or neither of Conditions (5) and (6) hold, while $\mathrm{Cov}(x, xz) > 0$, then centering must reduce the correlation between X and XZ. If $\mathrm{Cov}(x, xz) < 0$, then, simply replacing X by −X and Z by −Z, one can apply the same diagnostics to determine whether centering will reduce the correlation.
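As an illustration of how these diagnostics could be automated, the sketch below implements them as a small R helper. The function name and return format are our own choices, and sample moments are plugged in for their population counterparts, so the verdict is only as reliable as those estimates:

```r
# Diagnostic: is centering expected to reduce the covariance between
# the main effect X and the interaction XZ? (Conditions (5)-(8))
centering_check <- function(X, Z) {
  x <- X - mean(X)
  z <- Z - mean(Z)
  a <- mean(x^2 * z)              # estimate of Cov(x, xz) = E(x^2 z)
  if (a < 0) {                    # flip signs so that Cov(x, xz) > 0
    X <- -X; Z <- -Z; x <- -x; z <- -z
    a <- -a
  }
  b <- mean(Z) * var(X) + mean(X) * cov(X, Z)
  cond5 <- b < 0
  cond6 <- b < -2 * a
  # Exactly one condition holding means centering will NOT reduce the
  # covariance; both or neither holding means it will
  list(condition5 = cond5, condition6 = cond6,
       centering_reduces = !xor(cond5, cond6))
}
```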
Symmetry and Expectation-Independence
We have already remarked that the concept of a symmetric distribution is not uniquely defined in a multivariate setting. In a single variable, we often use a density function’s evenness to facilitate analytical arguments; that is, the property $f(x) = f(-x)$ enjoyed, for example, by the normal density centered at zero. In a bivariate setting, one natural generalization of this property is to suppose that the joint density function $f$ of X and Z satisfies the following two conditions:

(9) $f(x, z) = f(-x, z)$ and $f(x, z) = f(x, -z)$, for all $(x, z) \in \mathbb{R}^2.$
Notice that these two conditions imply the final component-wise symmetry condition $f(x, z) = f(-x, -z)$, and that these conditions taken together force the density function to be symmetric in each quadrant of the plane with respect to the origin.
From this definition of an “even” bivariate density function, we have the following result:
Proposition 1: Suppose X and Z are jointly distributed such that their joint density function is symmetric in the sense of condition (9). Then X is expectation-independent of Z and Z is expectation-independent of X.
Proof. We first show that $E(X \mid Z = z) = 0$ for every value z, proceeding as follows:

$E(X \mid Z = z) = \int_{-\infty}^{\infty} x \, \frac{f(x, z)}{f_Z(z)} \, dx = 0,$

where the final step follows from the definition of “even” presented before: by (9), the integrand is an odd function of x, so the integral vanishes. The same argument shows that $E(Z \mid X = x) = 0$. Now applying double expectation (Casella & Berger, 2002), we find $E(X) = E[E(X \mid Z)] = 0$ and $E(Z) = E[E(Z \mid X)] = 0$, so that $E(X \mid Z) = E(X)$ and $E(Z \mid X) = E(Z)$. □
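As a quick numerical sanity check of Proposition 1, consider points drawn uniformly from the diamond $|x| + |z| \le 1$: the density is symmetric in the sense of (9) while the coordinates remain dependent. A short R sketch (the diamond is our own illustrative choice):

```r
set.seed(789)

# Rejection sampling from the diamond |x| + |z| <= 1: symmetric in the
# sense of (9), but the two coordinates are clearly dependent
m <- 5e5
x <- runif(m, -1, 1)
z <- runif(m, -1, 1)
keep <- abs(x) + abs(z) <= 1
x <- x[keep]; z <- z[keep]

mean(x^2 * z)                # ~0, as Proposition 1 predicts
tapply(z, cut(x, 10), mean)  # conditional means of Z given X: all ~0
```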
Because expectation-independence guarantees that centering will reduce the covariance between the interaction and the first-order terms in regression model (1), Proposition 1 shows that joint distributions symmetric in the sense of (9) also satisfy the covariance condition in (3). It turns out that we can characterize another class of bivariate distributions where (3) holds that are neither symmetric in the sense of (9) nor expectation-independent in terms of the original variables X and Z. For this class of distributions, we can find a new system of coordinates where the kind of symmetry given by (9), and thus expectation-independence, holds with respect to these transformed coordinates. As a special case, this class includes all bivariate normal random variables.
Proposition 2: Suppose that there exists an invertible linear transformation T of the plane such that, for $(u, v) = T(x, z)$, the following conditions hold: (i) u is expectation-independent of v, (ii) v is expectation-independent of u, (iii) $E(u^3) = E(v^3) = 0$. Then $E(x^2z) = E(xz^2) = 0$, and thus the covariance condition (3) must hold both for x and xz and for z and xz.
To prove Proposition 2, we will need the following technical lemma.
Lemma 1: If $E(x) = E(z) = 0$, and if $(u, v) = R(x, z)$ for some rotation of the plane R, then $E(u) = E(v) = 0$.
Proof. Let $P_i$ denote the ith coordinate projection operator; that is, $P_i: \mathbb{R}^2 \to \mathbb{R}$ via $P_i(a_1, a_2) = a_i$. Then,

$u = P_1(R(x, z)).$

Now, R is a matrix of real numbers, so $P_1(R(x, z))$ is a linear combination of x and z. Consequently, we may write,

$u = ax + bz,$

for some real numbers a and b. But since $E(x) = E(z) = 0$, we must have that $E(u) = aE(x) + bE(z) = 0$. A similar argument shows that $E(v) = 0$ as well. □
Proof of Proposition 2: We imitate the method in the proof of Lemma 1 and write,

$x = P_1(T^{-1}(u, v)), \quad z = P_2(T^{-1}(u, v)).$

Recall that $T^{-1}$ exists as a matrix of real numbers since T is assumed to be invertible. Now, each of x and z is a linear combination of u and v; thus, we may write,

$x = a_1u + a_2v, \quad z = a_3u + a_4v,$

for some real numbers $a_1, \ldots, a_4$. Expanding the product $x^2z = (a_1u + a_2v)^2(a_3u + a_4v)$ and distributing the expectation yields,

$E(x^2z) = c_1E(u^3) + c_2E(u^2v) + c_3E(uv^2) + c_4E(v^3),$

for some other real numbers $c_1, \ldots, c_4$. By conditions (i) and (ii) of our Proposition, the two cross terms must equal zero; note that $E(u) = E(v) = 0$ because the argument of Lemma 1 applies verbatim to any linear map, so that, for example, $E(u^2v) = E[u^2E(v \mid u)] = E(u^2)E(v) = 0$. Condition (iii) ensures that the third moments are zero. Thus, $E(x^2z) = 0$. A similar argument will show that $E(xz^2) = 0$. □
Recall that all invertible linear transformations of the plane are at most combinations of reflections, rotations, expansions or compressions, and shears. Geometrically, then, what Proposition 2 tells us is that centering will reduce the correlation between the main effects and XZ whenever there exists a pair of perpendicular lines through the joint distribution of X and Z, each marginally unskewed, such that the joint distribution has the same shape in each of the quadrants defined by those lines. In other words, whenever we can define a new set of coordinate axes so that the joint distribution is symmetric in the sense of (9) and unskewed with respect to the transformed variables u and v defined by the new coordinate axes. Figure 1 in the following section illustrates a typical case.
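Proposition 2 can be illustrated numerically as well. In the R sketch below (parameter values are our own), a correlated bivariate normal pair, for which neither variable is expectation-independent of the other, is rotated onto its principal axes; the rotated coordinates come out unskewed and empirically expectation-independent, and $E(x^2z)$ is correspondingly near zero:

```r
set.seed(101)

n <- 1e5
# Correlated bivariate normal: conditional means are tilted lines, so
# neither variable is expectation-independent of the other
x <- rnorm(n)
z <- 0.6 * x + 0.8 * rnorm(n)

# Rotate the centered cloud onto its principal axes
pc <- prcomp(cbind(x, z), center = TRUE)
u <- pc$x[, 1]
v <- pc$x[, 2]

c(mean(u^3), mean(v^3))  # ~0: rotated marginals are unskewed
mean(u^2 * v)            # ~0: consistent with expectation-independence
mean(x^2 * z)            # ~0, as Proposition 2 predicts
```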
Empirical Simulation
To illustrate the theory developed in the previous section, a series of simulated examples is presented here, exploring the cases where the correlation between the centered main effects and their interaction is larger than the correlation between the uncentered main effects and the interaction. Notice the differences in the scaling of the axes in the pairs of plots throughout this section.
Figure 1 presents the “benchmark” case alluded to in Aiken and West (1991) and Cohen et al. (2002), where X and Z are jointly normally distributed. For this example, X and Z have nonzero means and are positively correlated, so that the uncentered main effects are strongly correlated with their product (the exact parameter values used can be found in the supplementary R code).
Notice that neither X is expectation-independent of Z nor Z is expectation-independent of X in this example. However, their joint distribution does satisfy the criterion of Proposition 2; that is, there exists a line through the centered joint distribution of X and Z such that if we rotate this line atop the horizontal axis, say, then the transformed joint distribution is expectation-independent in the transformed variables u and v. Moreover, the corresponding marginal distributions will not be skewed.
The left panel presents the scatter plot of the marginal distributions, and the right panel shows the relationship between one of the marginal distributions, X, and the product term XZ. As is commonly known, the leftmost plot of one-dimensional marginals shows the classical elliptical shape that characterizes the multivariate normal distribution (among others). The rightmost one depicts the quadratic effect of the product-interaction term, so that a saddle-like point can be observed at the origin, which further expands as one moves along the horizontal axis. For this particular example, the correlation between X and XZ is .918, whereas the correlation between x and xz is .033 (see Table 1), showing the expected reduction in correlation when one centers the main effects.
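The qualitative pattern of Figure 1 is easy to reproduce. The following R sketch uses illustrative parameter values of our own choosing (not necessarily those behind Figure 1), so the resulting correlations will differ somewhat from the values reported above:

```r
set.seed(202)

n <- 1000
# Bivariate normal with nonzero means and positive correlation
e <- rnorm(n)
X <- 5 + e
Z <- 3 + 0.7 * e + sqrt(1 - 0.7^2) * rnorm(n)

x <- X - mean(X)
z <- Z - mean(Z)

cor(X, X * Z)  # large: nonessential multicollinearity at work
cor(x, x * z)  # near zero after centering
```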
Now consider a bivariate normal mixture of two components with unequal mixing proportions and well-separated component means (the exact specification used for Figure 2 can be found in the supplementary R code).
Figure 2 shows the scatter plot of the main effects and interaction of 1,000 units sampled from this distribution. Notice that, whether one looks at the centered or uncentered clouds of points, two distinct clusters can be observed as a function of the mixture proportions. Once X and Z are centered and the multiplicative interaction term is formed, the smaller cluster for the centered case rotates, creating a downward-sloping line that connects the origin to the mean of this smaller group of points, forcing a correlation between the main effect and its interaction. Although the smaller cluster is still present in the uncentered case, the points are too far apart to induce a slope connecting the center of the interaction term with the smaller cloud of points. For this particular example, the correlation between X and XZ is −.033, whereas the correlation between x and xz is −.585 (see Table 1).
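The cluster mechanism can be mimicked with a mixture of our own design. This is not the specification behind Figure 2, so the correlations will not match the values above, but the direction of the effect, centering increasing the correlation in absolute value, is preserved:

```r
set.seed(303)

n <- 1000
# Two-component bivariate normal mixture: ~85% of points in a main
# cluster, ~15% in a smaller, well-separated cluster
grp <- rbinom(n, 1, 0.15)
X <- rnorm(n, mean = ifelse(grp == 1, 6,  0))
Z <- rnorm(n, mean = ifelse(grp == 1, 3, -3))

x <- X - mean(X)
z <- Z - mean(Z)

cor(X, X * Z)  # moderate for the uncentered terms
cor(x, x * z)  # larger in absolute value after centering
```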
Figure 3 presents a third example, a bivariate distribution generated through the copula-based method outlined in Mair, Satorra, and Bentler (2012). For this particular case, normal distributions are chosen for the marginals and the dependence structure is induced through a Joe copula function (see Joe, 2014, p. 170); 1,000 data points were sampled (the marginal and copula parameter values used can be found in the supplementary R code). The scenario presented here highlights the fact that even though lower order (i.e., marginal) symmetry can be present (such as with the normal distribution), lack of symmetry in higher dimensions can still make centered interaction terms exhibit higher correlation than their uncentered counterparts. The left panel of Figure 3 suggests a jointly skewed distribution with its longer tail in the positive quadrant and the rest of the mass moving toward the negative quadrant, becoming more dispersed as it does. When the multiplicative interaction is formed, the points of the shorter tail among the centered distributions are mostly reflected upward, with a smaller group of them forming a subset close to the origin, so that an upward-sloping line can be traced from the origin to these points. For the uncentered case, it is possible to again see the horizontal reflection that induces a V-shape, but the spread of points below their mean centroid is not defined strongly enough to induce an upward-sloping line, as it is in the centered case. For this particular case, the correlation between X and XZ is .190, whereas the correlation between x and xz is .273 (see Table 1).
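A construction in the same spirit can be set up with the R copula package. The parameter values below are illustrative stand-ins of ours (not those behind Figure 3), so the two correlations should simply be compared against each other; marginal symmetry alone does not settle which will be larger:

```r
library(copula)  # install.packages("copula") if needed
set.seed(404)

# Joe copula: an asymmetric joint distribution with upper-tail
# dependence, paired here with symmetric normal marginals
jc <- joeCopula(param = 3)
u  <- rCopula(1000, jc)
X  <- qnorm(u[, 1], mean = -0.2)
Z  <- qnorm(u[, 2], mean = -0.2)

x <- X - mean(X)
z <- Z - mean(Z)

# Despite the symmetric marginals, the joint asymmetry decides the
# outcome: compare the two correlations directly
cor(X, X * Z)
cor(x, x * z)
```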
Table 1 offers a summary of the increase or decrease in correlation between X or Z and the XZ term, as well as which conditions from the section “Centered Versus Uncentered Covariance Products” hold for each of the three cases just described. The “symmetric” case (which encompasses the bivariate normal distribution) shows the classic reduction in correlation for the centered main effects. As expected, neither Condition (5) nor Condition (6) holds true. The mixture of normal distributions shows the greatest increase (in absolute value) of the correlation between the centered terms and their interaction. Condition (5) holds in this case, after replacing X by −X and Z by −Z. Finally, the Joe copula with normal marginals does not show an increase in the absolute value of the correlation as dramatic as the normal mixture case, but it still demonstrates the fact that, under nonsymmetric conditions, the correlation between uncentered terms can be lower than that for centered terms. Condition (5) is also true in this scenario.
Table 1. Uncentered and Centered Correlations Between the Main Effect and the Interaction, and Status of Conditions (5) and (6), for the Three Simulated Distributions.

| Distribution | Uncentered correlation | Centered correlation | Condition (5) | Condition (6) |
|---|---|---|---|---|
| Symmetric | 0.918 | 0.033 | False | False |
| Mixture | −0.033 | −0.585 | True* | False |
| Copula | 0.190 | 0.273 | True | False |

*Refers to the need to switch the signs of X and Z before testing the conditions.
Conclusion
As the theory and examples we have explored show, centering will not always reduce the multicollinearity problem of multiple regression, even when a pair of predictors X and Z have marginally symmetric distribution functions. This fact runs counter to some common wisdom on the centering issue, presented, for example, in Cohen et al. (2002). Making use of careful mathematical definitions like expectation-independence and bivariate symmetry, as in (9), can greatly help correct and clarify our understanding of these issues.
In practice, centering can still be considered a viable tool for the applied researcher when multicollinearity is a concern, but only in certain cases. This article describes what to watch for, but we can summarize our general advice as follows. If centering is being considered:
- Plot the X and Z data.
- Visually assess whether there could exist a pair of perpendicular lines one could place on the plane so that the distribution of points in the scatterplot is symmetric with respect to the transformed origin created by the intersection of these perpendicular lines. If yes, then centering will reduce the problem of multicollinearity.
- On the other hand, if the data appear in clusters, centering is unlikely to reduce the problem of multicollinearity.
We have included the R code used to generate our example figures in the Supplementary Material (available in the online version of the article) as a teaching resource for introductory regression courses.
Finally, we remark that our investigations in this article apply only to the bivariate setting of Model (1). In higher dimensional settings, the issue of multicollinearity becomes considerably more complicated. Indeed, centering will actually change the least squares estimators and standard errors of the pairwise interaction term coefficients when third-order (or higher) interactions are present in a linear regression model. Thus, a researcher’s ability to detect pairwise interaction effects can be greatly altered by choosing to center variables when three or more predictors (and their interactions) are under consideration.
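To see this last point concretely, the short R sketch below (data-generating values are arbitrary choices of ours) fits a three-way interaction model on raw and on centered predictors: the pairwise X:Z coefficient and its standard error change, while the three-way coefficient does not.

```r
set.seed(505)

n <- 1000
X <- rnorm(n, 4); Z <- rnorm(n, 4); W <- rnorm(n, 4)
Y <- X + Z + W + X * Z + X * W + Z * W + 0.5 * X * Z * W + rnorm(n)

x <- X - mean(X); z <- Z - mean(Z); w <- W - mean(W)

fit_raw      <- lm(Y ~ X * Z * W)
fit_centered <- lm(Y ~ x * z * w)

# Pairwise interaction: estimate and SE differ across the two fits
coef(summary(fit_raw))["X:Z", ]
coef(summary(fit_centered))["x:z", ]

# Three-way interaction: identical across the two fits
coef(fit_raw)["X:Z:W"]
coef(fit_centered)["x:z:w"]
```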
Supplemental Material
Supplemental material, EPM_817801_Appendix for Centering in Multiple Regression Does Not Always Reduce Multicollinearity: How to Tell When Your Estimates Will Not Benefit From Centering by Oscar L. Olvera Astivia and Edward Kroc in Educational and Psychological Measurement
Appendix
To show identity (4), one merely needs to rely on the definition of covariance and expand the terms as necessary. Please notice that, by definition, $\mathrm{Cov}(x, xz) = E(x^2z)$, since $E(x) = 0$. Writing $X = x + E(X)$ and $Z = z + E(Z)$,

$\mathrm{Cov}(X, XZ) = E(X^2Z) - E(X)E(XZ)$
$= E[(x + E(X))^2(z + E(Z))] - E(X)E[(x + E(X))(z + E(Z))]$
$= E(x^2z) + E(Z)E(x^2) + 2E(X)E(xz) + E(X)^2E(Z) - E(X)[E(xz) + E(X)E(Z)]$
$= E(x^2z) + E(Z)\mathrm{Var}(X) + E(X)\mathrm{Cov}(X, Z).$

Subtracting $\mathrm{Cov}(x, xz) = E(x^2z)$ from both sides yields identity (4).
Footnotes
Declaration of Conflicting Interests: The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding: The author(s) received no financial support for the research, authorship, and/or publication of this article.
Supplemental Material: Supplemental material for this article is available online.
References
- Aiken, L. S., & West, S. G. (1991). Multiple regression: Testing and interpreting interactions. Newbury Park, CA: Sage.
- Baron, R. M., & Kenny, D. A. (1986). The moderator-mediator variable distinction in social psychological research: Conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51, 1173-1182. doi: 10.1037/0022-3514.51.6.1173
- Bedeian, A. G., & Mossholder, K. W. (1994). Simple question, not so simple answer: Interpreting interaction terms in moderated multiple regression. Journal of Management, 20, 159-165. doi: 10.1177/014920639402000108
- Bohrnstedt, G. W., & Goldberger, A. S. (1969). On the exact covariance of products of random variables. Journal of the American Statistical Association, 64, 1439-1442. doi: 10.1080/01621459.1969.10501069
- Casella, G., & Berger, R. L. (2002). Statistical inference. Pacific Grove, CA: Duxbury.
- Chaplin, W. F. (1991). The next generation of moderator research in personality psychology. Journal of Personality, 59, 143-178. doi: 10.1111/j.1467-6494.1991.tb00772.x
- Cohen, J. (1968). Multiple regression as a general data-analytic system. Psychological Bulletin, 70, 426-443. doi: 10.1037/h0026714
- Cohen, J., Cohen, P., West, S. G., & Aiken, L. S. (2002). Applied multiple regression/correlation analysis for the behavioral sciences (3rd ed.). Hillsdale, NJ: Lawrence Erlbaum.
- Court, A. T. (1930). Measuring joint causation. Journal of the American Statistical Association, 25, 245-254. doi: 10.1080/01621459.1930.10503127
- Dawson, J. F. (2014). Moderation in management research: What, why, when, and how. Journal of Business and Psychology, 29, 1-19. doi: 10.1007/s10869-013-9308-7
- Iacobucci, D., Schneider, M. J., Popovich, D. L., & Bakamitsos, G. A. (2016). Mean centering helps alleviate “micro” but not “macro” multicollinearity. Behavior Research Methods, 48, 1308-1317. doi: 10.3758/s13428-015-0624-x
- Irwin, J. R., & McClelland, G. H. (2001). Misleading heuristics and moderated multiple regression models. Journal of Marketing Research, 38, 100-109. doi: 10.1509/jmkr.38.1.100.18835
- Joe, H. (2014). Dependence modeling with copulas. Boca Raton, FL: CRC Press.
- Kraemer, H. C., & Blasey, C. M. (2004). Centering in regression analyses: A strategy to prevent errors in statistical inference. International Journal of Methods in Psychiatric Research, 13, 141-151. doi: 10.1002/mpr.170
- Mair, P., Satorra, A., & Bentler, P. M. (2012). Generating nonnormal multivariate data using copulas: Applications to SEM. Multivariate Behavioral Research, 47, 547-565. doi: 10.1080/00273171.2012.692629
- Osborne, J. W., & Waters, E. (2002). Four assumptions of multiple regression that researchers should always test. Practical Assessment, Research & Evaluation, 8(2). Retrieved from http://pareonline.net/getvn.asp?v=8&n=2
- Tabachnick, B. G., & Fidell, L. S. (2007). Using multivariate statistics (5th ed.). Boston, MA: Allyn & Bacon/Pearson Education.
- Wu, A. D., & Zumbo, B. D. (2008). Understanding and using mediators and moderators. Social Indicators Research, 87, 367-392. doi: 10.1007/s11205-007-9143-1