Educational and Psychological Measurement
2021 Feb 12;81(5):817–846. doi: 10.1177/0013164421992407

Examining the Impact of and Sensitivity of Fit Indices to Omitting Covariates Interaction Effect in Multilevel Multiple-Indicator Multiple-Cause Models

Chunhua Cao,1 Eun Sook Kim,2 Yi-Hsin Chen,2 John Ferron2
PMCID: PMC8377341  PMID: 34565809

Abstract

This study examined the impact of omitting a covariates interaction effect on parameter estimates in multilevel multiple-indicator multiple-cause models, as well as the sensitivity of fit indices to model misspecification when the between-level, within-level, or cross-level interaction effect was left out of the models. The parameter estimates produced by the correct and the misspecified models were compared under varying conditions of cluster number, cluster size, intraclass correlation, and the magnitude of the interaction effect in the population model. Results showed that the two main effects were overestimated by approximately half of the size of the interaction effect, and that the between-level factor mean was underestimated. None of the comparative fit index, Tucker–Lewis index, root mean square error of approximation, or standardized root mean square residual was sensitive to the omission of the interaction effect. The sensitivity of information criteria depended primarily on the magnitude of the omitted interaction, as well as its location (i.e., at the between level, within level, or cross level). Implications and recommendations based on the findings are discussed.

Keywords: multilevel MIMIC, model misspecification, fit indices


It is common practice to estimate interaction effects in multiple regression and factorial analysis of variance to probe whether the effect of one predictor or factor on the dependent variable varies across levels of another predictor or factor. When the data structure is hierarchical, for example, students nested in schools, testing cross-level interaction effects in addition to Level 1 and Level 2 interaction effects is common in applications of multilevel multiple regression (Chiu, 2010; Raudenbush & Bryk, 2002). However, in multilevel structural equation modeling (ML SEM), it is less common to test interaction effects, partly because of the complexity of the model.

Multiple-indicator multiple-cause (MIMIC) models under the framework of SEM have been gaining popularity in educational and psychological research because of their inclusion of causal indicators that predict the latent factor as well as effect indicators (Jöreskog & Goldberger, 1975; Kaplan, 2009). The multilevel MIMIC (ML MIMIC) model is used to account for the dependence of individuals clustered in the same unit, and it possesses the flexibility of simultaneously modeling covariates at the between level and at the within level, as well as interaction effects of the covariates (Cao et al., 2019). That study showed adequate performance of ML MIMIC models in detecting within-level and cross-level covariates interaction effects in most conditions, but the power to detect a between-level interaction was lower, depending primarily on the number of clusters (CN) and the intraclass correlation (ICC). However, when applied researchers have employed ML MIMIC to examine the effects of Level 1 and Level 2 covariates on the latent factor, they have tended not to test covariates interaction effects in the models (e.g., Tsai et al., 2017). Excluding a covariates interaction effect in ML MIMIC may bias the other parameters in the model because an important effect is omitted (Yuan et al., 2003). In the context of SEM, researchers mainly focus on parameter estimates and model fit indices. If a significant interaction effect is omitted in ML MIMIC, it is not yet known whether model fit indices can detect the misspecification. The focus of this study is the effect of omitting a covariates interaction on parameter estimates in ML MIMIC and the sensitivity of fit indices to the omission.

The impact of model misspecification in SEM on parameter estimates has been studied extensively in educational and psychological research (Farley & Reddy, 1987; Gallini, 1983; Kaplan, 1988; Sörbom, 1975), and so has the sensitivity of model fit indices to model misspecification (e.g., Fan et al., 1999; Fan & Wang, 1998; Gerbing & Anderson, 1993; Hu & Bentler, 1998, 1999; Marsh et al., 1996). However, there has been much less research on the effect of model misspecification in multilevel modeling, particularly ML SEM. Bias in parameter estimates was reported in a few previous studies of the impact of model misspecification on multilevel modeling. For example, in a study that assumed constant Level 1 residual variances when they in fact varied across subpopulations in growth mixture models (Enders & Tofighi, 2008), all parameter estimates exhibited bias to some degree. In an investigation of the impact of misspecifying the within-level error structure in a two-level growth model, the misspecification produced biased estimates of variance parameters but unbiased estimates of fixed effects (Ferron et al., 2002). Studies of the impact of omitting the interaction between crossed factors in cross-classified random effects modeling indicated that coefficient estimates and their associated standard errors were not biased for fixed effects, and Level 1 random effects were not affected, but Level 2 random effects were (Lee & Hong, 2019; Shi et al., 2010).

The findings discussed above provide some hints about the impact of model misspecification, but they are not sufficient to understand the impact of omitting covariates interaction effects in ML MIMIC. The extent to which such misspecifications would bias parameter estimates in the ML MIMIC model has not been examined in the literature. Moreover, the sensitivity of fit indices in SEM to the omission of interaction effects has not been investigated.

Many fit indices are descriptive and cannot be used to endorse the correctness of the model (Hu & Bentler, 1999). Some researchers have recommended equivalence testing to overcome the limitations of null hypothesis testing in SEM (Yuan et al., 2016). Equivalence testing allows researchers to control the degree of model misspecification (Wellek, 2010). With a predetermined alpha level, equivalence testing allows us to draw a conclusion about the upper bound on the size of the misspecification (Yuan et al., 2016). For instance, a t-size root mean square error of approximation (RMSEAt) of .08 obtained in equivalence testing implies that the size of the misspecification is no more than .08 as measured by RMSEA. Recently, Marcoulides and Yuan (2020) extended equivalence testing to ML SEM to evaluate the fit of the between-level and within-level models separately. However, the sensitivity of level-specific fit indices based on equivalence testing to the omission of interaction effects in ML MIMIC has not yet been investigated.

Accordingly, the purpose of this study is to investigate the impact of omitting covariates interaction effects on parameter estimates at both between and within levels in ML MIMIC models when the interaction is present in the population model. To be more specific, this study purports to examine the impact of omitting the between-level, within-level, and cross-level interactions on parameter estimates in ML MIMIC models, as well as the sensitivity of model fit indices to the omission of interaction effects under various conditions. Fit indices examined in this study include traditional SEM fit indices as well as fit indices obtained in equivalence testing in ML MIMIC.

Multilevel Multiple-Indicator Multiple-Cause Models

Unlike multiple-group confirmatory factor analysis (CFA), which includes only categorical grouping variables, the MIMIC model has the advantage of including multiple covariates (either categorical or continuous), as well as their interaction, simultaneously in the model without dividing the sample into subgroups (Cheng et al., 2016; Fleishman et al., 2002). The MIMIC model, which examines the linear relationship between observed variables and latent factors, consists of two parts: a measurement part and a structural part:

yi=ν+Ληi+εi, (1)
ηi=ΓXi+ζi, (2)
ε~N(0,Θε), (3)
ζ~N(0,Ψ), (4)

where, for individual i, y is a vector of observed scores on the indicators, ν is a vector of indicator intercepts, Λ is a matrix of factor loadings, η is a vector of common latent factors, and ε is a vector of residuals; Γ represents a matrix of pattern coefficients estimating the effects of the covariates (X) on the latent factors (η), and ζ is a vector of disturbances. Residuals (ε) and disturbances (ζ) are normally distributed with means of zero and variances of Θε and Ψ, respectively. Moreover, residuals (ε) and disturbances (ζ) are uncorrelated. Note that if multiple covariates have regression effects on the latent factor in MIMIC models, the coefficients linking the covariates and the latent variable can be estimated simultaneously. As in multiple regression, the interaction effect of two or more covariates can be included in the model by creating a product term of the covariates.
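As a concrete illustration of Equations 1 through 4 with an interaction product term, the following minimal numpy sketch generates single-level MIMIC data. The sample size, loadings, residual variances, and path coefficients are illustrative choices, not the study's prescription:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5_000                      # sample size (illustrative)
lam = np.full(6, 0.8)          # factor loadings (Lambda), six indicators
theta = np.full(6, 0.36)       # indicator residual variances (Theta_eps)

# Structural part (Equation 2): two dichotomous covariates plus product term
x1 = rng.integers(0, 2, n)
x2 = rng.integers(0, 2, n)
g1, g2, g3 = 0.3, 0.4, 0.6                                  # Gamma entries
eta = g1*x1 + g2*x2 + g3*x1*x2 + rng.normal(0, 1, n)        # zeta ~ N(0, 1)

# Measurement part (Equation 1): y_i = nu + Lambda*eta_i + eps_i, with nu = 0
y = eta[:, None] * lam + rng.normal(0, np.sqrt(theta), (n, 6))
```

The product term `x1*x2` enters the structural equation exactly like any other covariate column, which is how the interaction is specified in a MIMIC model.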

When data structure is hierarchical or nested, MIMIC models can be extended to ML MIMIC model. In ML MIMIC, a subscript j is used to indicate the cluster, implying that the parameter of interest can vary across clusters. Equation 1 is decomposed into the between-level and within-level models and rewritten in Equation 5:

yij=νB+ΛBηBj+ΛWηWij+εBj+εWij, (5)
εWij~N(0,ΘεW), (6)
εBj~N(0,ΘεB), (7)

where yij is the observed scores of y for individual i in cluster j; νB is the between-level intercept; ΛB is the between-level factor loading; ηBj is the between-level cluster factor; ΛW is the within-level factor loading, and it is not random; ηWij is the within-level latent factor; εBj and εWij are the between-level and the within-level residuals, respectively. Note that the within-level intercept (νW) is fixed at zero because individual scores are simply the sum of the group mean and individuals’ deviation from the group mean (Heck & Thomas, 2000).

For the structural part of ML MIMIC models, three types of covariates interaction effects were described by Cao et al. (2019). A between-level covariates interaction occurs when both covariates have an effect on the between-level latent factor and the effect of one covariate on the latent factor is moderated by the other, as shown in Equation 8:

ηBj=γ1BX1Bj+γ2BX2Bj+γ3BX1Bj*X2Bj+ζBj, (8)

where γ1B is the main effect of covariate X1Bj, γ2B is the main effect of covariate X2Bj, and γ3B is the interaction effect of the two covariates above. When the model is misspecified, the interaction effect, γ3B, is omitted in the model.
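The consequence of omitting γ3B can be previewed with a single-level regression analogue; this is a hedged sketch of the mechanism, not the full ML MIMIC model. With balanced dichotomous (0/1) covariates, the omitted product term is correlated with each main-effect column, so each main effect absorbs about half of the interaction:

```python
import numpy as np

# Balanced 2x2 design for two dichotomous (0/1) covariates; one observation
# per cell suffices because the relation here is deterministic (no disturbance).
x1 = np.array([0, 1, 0, 1])
x2 = np.array([0, 0, 1, 1])
g1, g2, g3 = 0.3, 0.4, 0.6                  # main effects and interaction
eta = g1*x1 + g2*x2 + g3*x1*x2              # Equation 8 without zeta

# Misspecified model: regress eta on the main effects only
X = np.column_stack([np.ones(4), x1, x2])
b0, b1, b2 = np.linalg.lstsq(X, eta, rcond=None)[0]
print(round(b1, 10), round(b2, 10), round(b0, 10))   # prints 0.6 0.7 -0.15
```

Each main effect is inflated by half the omitted interaction (.30 + .30 = .60 and .40 + .30 = .70), and the intercept shifts downward, mirroring the pattern of overestimated main effects and an underestimated factor mean reported in this study.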

Similarly, in ML MIMIC with within-level interaction, the effect of one covariate on the within-level factor depends on the other covariate as shown in Equation 9:

ηWij=γ1WX1Wij+γ2WX2Wij+γ3WX1Wij*X2Wij+ζWij, (9)

where γ1W is the main effect of covariate X1Wij, γ2W is the main effect of covariate X2Wij, and γ3W is the interaction effect of the two covariates. In the misspecified ML MIMIC models with a within-level interaction effect, γ3W is omitted from Equation 9.

In ML MIMIC with cross-level interaction, the effect of the within-level covariate on the within-level factor is a random effect (Equation 10) that is predicted by the between-level covariate (Equation 11), which also predicts the between-level factor as shown in Equation 12. For more details about model specification of the three types of covariates interaction in ML MIMIC models, refer to Cao et al. (2019).

ηWij=γWjXWij+ζWij, (10)
γWj=γ10+γCXBj+ζγWj, (11)
ηBj=γBXBj+ζBj, (12)

where γWj is the path coefficient of the within-level covariate XWij predicting the within-level latent variable, ηWij; ζWij refers to the residual of the latent factor, ηWij. The terms γ10 and γC represent the regression intercept and slope of the random slope, γWj, on XBj, respectively. ζγWj is the residual of the random slope, capturing the variance in γWj that cannot be explained by the between-level covariate, XBj. γB indicates the between-level regression slope of the between-level latent variable ηBj on the between-level covariate, XBj; ζBj refers to the residual of ηBj that cannot be explained by XBj. When γWj in Equation 10 is replaced by Equation 11, the combined model for the within-level latent variable, ηWij, is expressed in Equation 13.

ηWij=γ10XWij+γCXWij*XBj+ζγWj*XWij+ζWij, (13)

where γC is the cross-level covariates interaction effect, which is omitted in the misspecified ML MIMIC models with cross-level covariates interaction effect.
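Equations 10 through 13 can be sketched as a data-generating recipe for the cross-level conditions. This is a hypothetical numpy sketch; the slope residual variance of .10 and the coefficient values echo the simulation design described later, while CN and CS are illustrative:

```python
import numpy as np

rng = np.random.default_rng(7)
CN, CS = 120, 20                     # clusters and cluster size (illustrative)
g10, gC = 0.4, 0.4                   # within-level main effect, cross-level interaction
xB = rng.integers(0, 2, CN)          # between-level dichotomous covariate

# Equation 11: a random slope per cluster, with residual slope variance 0.10
slope = g10 + gC*xB + rng.normal(0, np.sqrt(0.10), CN)

# Equation 10: within-level latent factor for each individual in each cluster
xW = rng.integers(0, 2, (CN, CS))
etaW = slope[:, None]*xW + rng.normal(0, 1, (CN, CS))
```

Expanding `slope` inside the within-level equation reproduces Equation 13: the term `gC*xB*xW` is the cross-level product that the misspecified model drops.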

Model Misspecification in Structural Equation Modeling

The impact of misspecification in single-level SEM on parameter estimates has been studied extensively by educational and psychological researchers (e.g., Gallini, 1983; Kaplan, 1988; Yuan et al., 2003). Gallini (1983) used path analysis and CFA models to show that omitted variables caused severe parameter bias when the omitted variables were strongly related to the exogenous variable. Kaplan (1988) used CFA and full SEM to illustrate the impact of misspecification on parameter estimates, considering misspecification of the measurement model, misspecification of the structural component (e.g., the path between exogenous and endogenous factors, and the covariance among factors), and misspecification of both the measurement and structural components. The results of that simulation study showed that the combination of measurement and structural misspecifications produced the most biased parameter estimates (Kaplan, 1988). Bentler and Chou (1993) proposed examining the mean absolute parameter change for all free parameters, and their findings implied that not all parameters in the model were affected when a variable was omitted; only parameters closely related to the misspecification were affected. Most studies of misspecification in SEM employed CFA and path models to evaluate the impact of model misspecification on parameter bias; none used MIMIC models, in which the misspecification of interest concerns the path linking an observed covariate to the latent factor.

Compared with the research on the impact of model misspecification in traditional single-level SEM, there has been a paucity of research on model misspecification in the ML SEM context. Among the few studies of model misspecification in multilevel data, Lee and Hong (2019) examined the impact of omitting the interaction of crossed factors on parameter estimates in cross-classified random effects modeling and showed that the Level 2 random effects were affected by the omission. In an investigation of the impact of misspecifying the first-level error structure in a two-level growth model, the misspecification produced biased estimates of variance parameters but unbiased estimates of fixed effects (Ferron et al., 2002).

Among the studies of various types of model misspecification in SEM, most examined the CFA model, and the misspecification took the form of misspecified factor loadings, factor variances and covariances, or residual correlations (Fan & Sivo, 2007; Hu & Bentler, 1999). None of them examined misspecification of a covariates interaction. However, in empirical research using ML SEM, researchers tend to model only the main effects of the covariates, without testing the covariates interaction effect (e.g., Tsai et al., 2017). If a significant interaction effect is left out, the interpretation of the main effects of the covariates may be misleading. Moreover, the estimation of interaction effects between variables has important theoretical, substantive, and empirical implications in psychology and other social sciences (Marsh et al., 2004). Little is known about the impact of omitting an existing covariates interaction effect in SEM models, especially in ML SEM models. In B. O. Muthén and Asparouhov’s (2003) summary of all types of possible interactions in SEM models, they called for a Monte Carlo study to investigate the misestimation of parameters resulting from omitting an interaction effect that is present in the population model. The impact of omitting a covariates interaction at one level on the parameter estimates at another level, as well as at the same level, in ML MIMIC models has not been examined in the literature.

Sensitivity of Model Fit Indices in Structural Equation Modeling

Closely related to the research on model misspecification in SEM is the research on the sensitivity of model fit indices to model misspecification. To evaluate how well hypothesized models fit the data, several fit indices are commonly used by empirical researchers in the SEM framework, including the chi-square statistic, comparative fit index (CFI), Tucker–Lewis index (TLI), RMSEA, and standardized root mean square residual (SRMR) for the between level (SRMR-B) and the within level (SRMR-W). The sensitivity of information criteria (ICs), such as the Akaike information criterion (AIC; Akaike, 1974), Bayesian information criterion (BIC; Schwarz, 1978), and sample size–adjusted BIC (SaBIC; Sclove, 1987), to the omission of a covariates interaction in ML MIMIC models has not been examined. Readers interested in the formulas of these fit indices can refer to previous studies of model fit indices in SEM (e.g., Hu & Bentler, 1999; Vrieze, 2012).

In the past few decades, substantial research has been done in the sensitivity of model fit indices in detecting model misspecification in SEM (Fan & Sivo, 2007; Hsu et al., 2015; Hu & Bentler, 1998; Ryu & West, 2009) and in establishing cutoff criteria for fit indices produced in SEM models. The most widely known research of this type was conducted by Hu and Bentler (1998, 1999). The cutoff rules of model fit indices recommended by Hu and Bentler (1999; CFI ≥ .95; TLI ≥ .95; RMSEA ≤ .06; SRMR ≤ .08) were utilized frequently by applied researchers in their reporting of SEM model results. However, some researchers started to ask questions about whether the fit indices were equally sensitive to misspecification in different types of models (e.g., Fan & Sivo, 2007; Kenny & McCoach, 2003).

Most studies of the sensitivity of fit indices in detecting model misspecification were in the context of single-level SEM, and only a few methodological researchers have investigated the sensitivity of fit indices in ML SEM. Ryu and West (2009) examined the sensitivity of two fit indices (i.e., RMSEA and CFI) in detecting model misspecification in ML SEM. They showed that RMSEA and CFI failed to detect misspecification at the between level, although the two fit indices were able to detect misspecification at the within level. This was a valuable starting point for evaluating fit indices in ML SEM. However, Ryu and West (2009) used a single ICC level of .5, which is very high for educational and psychological data (Hsu et al., 2015). Also, they included only one type of model misspecification; that is, the correlation of the two factors was specified to be higher than the true correlation in the population. Last, they examined only two fit indices (i.e., RMSEA and CFI). Hsu et al. (2015) expanded Ryu and West’s (2009) simulation study by including more design factors, such as CN, cluster size (CS), ICC, and misspecification type. Their findings showed that CFI, TLI, and RMSEA could detect misspecification only at the within level, not at the between level. Moreover, CFI, TLI, and RMSEA were more sensitive to misspecification of factor loadings, whereas SRMR-W was more sensitive to misspecification of factor covariances, and SRMR-B was the only fit index sensitive to model misspecification at the between level.

Compared with the research on the sensitivity of CFI, RMSEA, and SRMR to model misspecification in SEM, the sensitivity of ICs to model misspecification has received little attention. In addition to examining CFI, TLI, RMSEA, and SRMR, this study also investigates the sensitivity of ICs to omitting the interaction effect in ML MIMIC. A smaller IC value indicates a better model. If the ICs are sensitive to the model misspecification, the ICs of the correct model are expected to be smaller than those of the misspecified model.
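The three ICs follow standard formulas based on the model's maximized log-likelihood, the number of free parameters k, and the sample size n. The sketch below uses hypothetical log-likelihood values; the SaBIC replaces n with (n + 2)/24 in the penalty, following Sclove (1987):

```python
import math

def information_criteria(loglik, k, n):
    """Return (AIC, BIC, SaBIC) from a model's maximized log-likelihood,
    number of free parameters k, and sample size n."""
    aic = -2*loglik + 2*k
    bic = -2*loglik + k*math.log(n)
    sabic = -2*loglik + k*math.log((n + 2) / 24)   # Sclove's (1987) adjustment
    return aic, bic, sabic

# Smaller is better: hypothetical values for a correct model (one extra
# parameter, the interaction) and a misspecified model that omits it
correct = information_criteria(-1200.0, 25, 800)
misspecified = information_criteria(-1215.0, 24, 800)
```

Here the fit gain from the interaction term outweighs the one-parameter penalty, so all three ICs favor the correct model; when the omitted interaction is small, the penalty can dominate and the ICs fail to select it.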

Level-Specific Fit Indices in Equivalence Testing of Multilevel Structural Equation Modeling

Previous research showed that commonly used global fit indices, for example, RMSEA and CFI, failed to detect model misspecification at the between level (Hsu et al., 2015; Ryu & West, 2009). These fit indices evaluate the entire ML SEM model without separating the goodness of fit into within-level and between-level components, so their values reflect the degree of model fit across all levels of the model (Marcoulides & Yuan, 2020). The fit indices in ML SEM are largely dominated by the large within-level sample size, and misspecification of the between-level model cannot be easily detected (Yuan & Bentler, 2007). Equivalence testing in ML SEM using a partially saturated model construction approach (Hox, 2002) has been proposed to evaluate the goodness of fit of the within-level and between-level models separately (Marcoulides & Yuan, 2020). First, RMSEA and CFI are obtained for each level using the partially saturated model construction approach (Hox, 2002). To be more specific, to obtain the fit indices of the within-level model, the between-level model is saturated, so the obtained fit indices are attributed to the within-level model. Similarly, to obtain the fit indices of the between-level model, the within-level model is saturated so that the calculated fit indices represent the between-level model. Interested readers can refer to Marcoulides and Yuan (2020) for step-by-step procedures for conducting equivalence testing in ML SEM. Second, the equivalence-testing t-size fit indices, RMSEAt and CFIt, are computed using the R program provided by Yuan et al. (2016) and by Marcoulides and Yuan (2020).
Third, cutoff values for RMSEAt and CFIt corresponding to the conventionally accepted cutoff values for RMSEA (.01, .05, .08, and .10 for excellent, close, fair, and poor fit, respectively) and CFI (.99, .95, .92, and .90 for excellent, close, fair, and poor fit, respectively) are computed, yielding RMSEAe and CFIe. RMSEAt and CFIt are then compared with the corresponding levels of RMSEAe and CFIe to assess the degree of model fit. Note that in this study only RMSEAt and CFIt from equivalence testing are examined for their sensitivity to model misspecification in ML MIMIC, in keeping with the approach presented in Yuan et al. (2016) and Marcoulides and Yuan (2020), although equivalence testing applies to other fit indices as well.

Research Purpose and Significance of the Study

As discussed above, previous simulation studies of model misspecification in SEM have mostly focused on the CFA model. The impact of omitting a covariates interaction in ML MIMIC has not been examined in the existing literature. Comparing parameter estimates from the correct model and the misspecified model (omitting the covariates interaction) under varying conditions would inform researchers about the extent of parameter bias resulting from the misspecification. Also, the degree of sensitivity of global fit indices to the omission of a covariates interaction could shed light on the appropriateness of using various fit indices to detect such omissions in ML MIMIC. Moreover, equivalence testing is conducted to compute the t-size RMSEA and CFI of each level in ML MIMIC, to investigate the sensitivity of RMSEAt and CFIt to the omission of between-level and within-level covariates interaction effects.

Method

In this study, for data generation, we adopted the ML MIMIC models that Cao et al. (2019) used to detect covariates interaction effects in correctly specified ML MIMIC models. To be more specific, the population (true) models included a within-level, a between-level, or a cross-level covariates interaction effect for the within-level, between-level, and cross-level interaction conditions, respectively. The focus of Cao et al. (2019) was the performance of ML MIMIC in detecting a significant covariates interaction effect using the true model. The current study used their true models to generate the data and examined the impact of excluding the interaction effect. Each generated data set was analyzed with the correctly specified model and with the misspecified model, in which the covariates interaction effect was omitted but the main effects of the covariates were retained.

Data Generation and the Population Models

PROC IML in SAS/IML 9.4 was used to generate the two-level ML MIMIC data with two dichotomous covariates in the population models. The measurement part was a single factor measured by six continuous indicators at both the between and within levels. Factor loadings were all set at .80 at the within level, and the residual variances were .36, so that the total variance of each indicator was one. For model identification purposes, the factor variance was set to one at the within level. At the between level, the six indicators were treated as latent variables (random intercepts) in ML SEM. Assuming measurement invariance across levels, the six factor loadings were set to be equal at .80 at the between level, and the between-level residual variances of all six indicators were set to be equal at .02, .01, and .004 for the large, medium, and small factor ICCs, respectively. Satisfying scalar measurement invariance in two-level SEM models requires between-level residual variances of zero, based on the research on measurement bias in multilevel data by Jak et al. (2014). In real-world empirical data, it is rare to have a scalar-invariant construct when there are many groups (represented by clusters in this study; Marsh et al., 2018; Rutkowski & Svetina, 2014), so values close to zero (.02, .01, and .004), rather than exactly zero, were selected as the between-level residual variances to mimic reality (Cao et al., 2019). To obtain different levels of ICC, the between-level factor variance was varied, as explained in the next section.

For the between-level and within-level covariates interaction conditions, the main path coefficients of the two covariates to the factor were held constant at .30 and .40, respectively, with .30 representing a moderate-sized relationship (Maas & Hox, 2005). These main effects were chosen to reflect values from empirical research in the applied psychology literature. Mathieu et al. (2012) reviewed studies published in the Journal of Applied Psychology involving tests of cross-level interactions and reported within-level slopes ranging from −.06 to .45 and between-level slopes ranging from −.23 to .35. For the cross-level covariates interaction conditions, the within-level main effect was set at .40 and the between-level main effect at .50.

Design Factors

The design factors in the simulation included the location of the interaction effect with three levels (at the between level, at the within level, and cross levels), CN, CS, ICC, and the magnitude of the covariates interaction effect.

Location of the Interaction Effect

There were three levels for the location of the interaction effect: a covariates interaction effect at the between level, at the within level, or across levels. For the cross-level interaction conditions, the random slope of the within-level covariate predicting the within-level factor was specified to be predicted by the between-level covariate (the cross-level covariates interaction effect), as mentioned above. However, the random slope was not fully explained by the between-level covariate: its residual variance was set at .10, a reasonable value given the magnitudes of the cross-level interaction effects.

Number of Clusters and Cluster Size

For the between-level covariates interaction conditions, CN was set at 40, 80, and 120 to reflect applied research scenarios and previous simulation research in ML SEM (Hox & Maas, 2001; Kim et al., 2012; Maas & Hox, 2005). Note that the two dichotomous covariates were at the between level, creating four groups. Four balanced groups were generated, with 10, 20, and 30 clusters per group for the three levels of CN, respectively. CS was set at 10 and 20, suggested as typical cluster sizes for designing multilevel research by Hox (1998). Multiplying CN by CS, the minimum and maximum total sample sizes were 400 (CN = 40 and CS = 10) and 2,400 (CN = 120 and CS = 20), respectively. When the two covariates were at the within level, CN was set at 20, 40, and 60, and CS was twice that of the between-level interaction conditions (20 and 40) to achieve identical total sample sizes. Therefore, the four balanced groups created by the two dichotomous covariates at the within level had sizes of 5 and 10 within each cluster, respectively. The major reason for adopting this method was that the grouping occurred at different levels. For the cross-level interaction conditions, CN and CS were set to be the same as in the between-level conditions, that is, 40, 80, and 120 clusters with a CS of 10 and 20.

Intraclass Correlation

ICC was varied at three levels in this study, as previous simulation research (e.g., Finch & French, 2011; Kim et al., 2012) showed that ICC affects the performance of ML SEM. Different levels of ICC, defined as the ratio of the between-level factor variance to the total factor variance (the sum of the between-level and within-level factor variances), were simulated by varying the between-level factor variance because the within-level factor variance was fixed at one. The between-level factor variance was set to .10, .25, and .50, creating ICCs of .09, .20, and .33 as the small, medium, and large ICC, respectively. These ICC levels are in the range of ICCs reported in empirical educational and psychological research and were employed in previous simulation studies (e.g., Hox & Maas, 2001; Kim et al., 2012; Maas & Hox, 2005). The corresponding item-level ICCs were .08, .15, and .25, respectively, for all six items.
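The factor-level ICCs follow directly from the stated variance components; a quick check with the within-level factor variance fixed at one:

```python
# Factor ICC = between-level factor variance / (between + within factor variance);
# the within-level factor variance is fixed at 1.0 in this design.
psi_w = 1.0
iccs = []
for psi_b in (0.10, 0.25, 0.50):
    iccs.append(round(psi_b / (psi_b + psi_w), 2))
print(iccs)   # [0.09, 0.2, 0.33]
```

Indicator-level ICCs additionally involve the squared loading and the level-specific residual variances (e.g., `(0.8**2*psi_b + theta_b) / (0.8**2*psi_b + theta_b + 0.8**2*psi_w + 0.36)`), which is how the reported item-level values arise.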

Magnitude of the Covariates Interaction

For the between-level and within-level interaction conditions, the covariates interaction effect was set at 0, .30, and .60, values informed by the range (−.06 to .45) reported in the literature review of applied research by Mathieu et al. (2012). The cross-level covariates interaction effect was set at 0, .20, and .40, with .20 representing a small and .40 a moderate cross-level interaction effect. The interaction effect of zero was used to examine the false positive rates of ICs incorrectly selecting the models with the interaction effect when there was no interaction in the population model.

In sum, there were a total of 162 conditions (3 locations × 3 CN × 2 CS × 3 ICCs × 3 effect sizes). For each condition, 1,000 replications of data were generated. Each data set was analyzed using ML MIMIC models with and without the interaction effect using Mplus 7.1 (L. K. Muthén & Muthén, 1998-2012).
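The fully crossed design can be enumerated to confirm the condition count. The lists below use the between-level values for CN, CS, and effect size; as noted above, the within-level and cross-level conditions substitute their own levels, but the count is the same:

```python
from itertools import product

locations = ["between", "within", "cross"]
cn_levels = [40, 80, 120]        # 20/40/60 in the within-level conditions
cs_levels = [10, 20]             # 20/40 in the within-level conditions
iccs = [0.09, 0.20, 0.33]
effects = [0.0, 0.3, 0.6]        # 0/.20/.40 in the cross-level conditions

conditions = list(product(locations, cn_levels, cs_levels, iccs, effects))
print(len(conditions))   # 162
```

Each of the 162 cells receives 1,000 replications, and each replication is fit twice (with and without the interaction term).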

Outcome Variables

To investigate the impact of omitting the covariates interaction on parameter estimates, the averages of all parameter estimates over replications from the correctly specified and misspecified ML MIMIC models were computed separately and compared. We compared the estimates from the two models rather than calculating the bias of the misspecified model's parameters because parameter estimates may be biased even in the correctly specified model. In addition, the design factors were examined to see whether they moderated the impact of misspecification on parameter estimates.

Second, model fit indices (CFI, TLI, RMSEA, SRMR-W, and SRMR-B) were compared between the correctly specified model and the misspecified model to see whether they were sensitive to the omission of the covariates interaction in ML MIMIC models. The averages of these SEM-based fit indices across replications and their standard deviations were calculated and reported. These fit indices were considered sensitive if they met the criteria proposed by Hu and Bentler (1999) in the correct model and failed to meet them in the misspecified model. For the ICs (i.e., AIC, BIC, and SaBIC), the correct model was selected whenever the IC was greater in the misspecified model than in the correctly specified model. When there was an interaction effect in the population model, the proportion of replications in which the correct model was selected was computed as the true positive (TP) rate; high TP rates indicate that these fit indices are sensitive to the omitted interaction effect. When there was zero interaction effect in the population model, the proportion of replications in which the model with the interaction effect was selected was computed as the false positive rate; the lower the false positive rates, the less oversensitive the fit indices.
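The TP rate computation can be sketched as follows (the replication-level IC values here are hypothetical, not the study's actual data); the correct model is counted as selected whenever the misspecified model's IC is larger, since lower IC values indicate a better model:

```python
# Count the replications in which the misspecified model's IC exceeds the
# correct model's IC (i.e., the comparison favors the correct model).
def true_positive_rate(ic_correct, ic_misspecified):
    selected = [m > c for c, m in zip(ic_correct, ic_misspecified)]
    return sum(selected) / len(selected)

ic_correct      = [1000.2, 998.7, 1003.1, 1001.5]   # hypothetical values
ic_misspecified = [1004.9, 997.9, 1008.3, 1006.0]
print(true_positive_rate(ic_correct, ic_misspecified))  # 0.75
```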

In ML SEM, some researchers have suggested using BIC(J), in which J represents the CN, instead of the regular BIC. BIC is defined as

BIC = −2logL(θ) + log(n)r,

where logL(θ) denotes the log likelihood, n represents the sample size, and r is the number of parameters in the model. Lukočienė and Vermunt (2010) discussed a specific issue when using BIC in multilevel analysis: it is not clear whether the sample size in the formula, n, should be replaced by the cluster number, J. Pauler (1998) suggested using J for decisions about between-level model features and n for within-level model features in the context of linear mixed models, and Kim et al. (2016) recommended BIC(J) over the regular BIC for detecting between-level measurement noninvariance. In Mplus output, BIC is calculated using the total sample size, n. To evaluate the sensitivity of BIC with the cluster number in place of n, BIC(J) was computed for the between-level covariates interaction conditions. Note that BIC(J) was calculated only for the between-level covariates interaction model because, in the within-level and cross-level covariates interaction models, the total sample size was used for decisions about model features.
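The two variants differ only in the sample size entering the penalty term; a minimal sketch (the fitted log likelihood and parameter count below are hypothetical):

```python
import math

def bic(log_likelihood, n_params, sample_size):
    """BIC = -2*logL + log(sample_size) * r; pass the cluster number J
    instead of the total sample size n to obtain BIC(J)."""
    return -2.0 * log_likelihood + math.log(sample_size) * n_params

logL, r = -5000.0, 30   # hypothetical fitted log likelihood and parameter count
n, J = 800, 40          # total sample size (e.g., CN = 40, CS = 20) and cluster number
print(bic(logL, r, n))  # regular BIC, using the total sample size as in Mplus output
print(bic(logL, r, J))  # BIC(J), using the number of clusters (smaller penalty)
```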

Last, in equivalence testing, the computed RMSEAt and CFIt of the within level (with the between-level model saturated) and of the between level (with the within-level model saturated) were compared with the adjusted cutoff values RMSEA_e05 and CFI_e95 to determine their sensitivity to the omission of the interaction effect in ML MIMIC. RMSEA_e05 and CFI_e95 correspond to the conventionally accepted close-fit values of RMSEA = .05 and CFI = .95, respectively. Under between-level interaction omission conditions, RMSEAt and CFIt were considered sensitive to the omission if the between-level RMSEAt was greater than the between-level RMSEA_e05 and CFIt was less than CFI_e95; the same rule applied at the within level under within-level interaction omission conditions. In this study, equivalence testing with level-specific fit indices was applied to the between-level and within-level interaction omission conditions but not to the cross-level conditions, because with a cross-level interaction a saturated model cannot be specified at either level; the interaction term involves variables at both the between and within levels.
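The decision rule just described can be expressed as a small helper; the example values are illustrative only (the .07/.91 cutoffs mirror the adjusted cutoffs reported for the smallest-sample conditions):

```python
# Equivalence-testing decision rule: the omitted interaction is flagged only
# when the t-size RMSEA exceeds its adjusted cutoff AND the t-size CFI falls
# below its adjusted cutoff for the level at which the interaction was omitted.
def flags_misspecification(rmsea_t, cfi_t, rmsea_e05, cfi_e95):
    return rmsea_t > rmsea_e05 and cfi_t < cfi_e95

print(flags_misspecification(0.09, 0.88, 0.07, 0.91))  # True  -> flagged
print(flags_misspecification(0.04, 0.97, 0.07, 0.91))  # False -> not flagged
```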

Results

The results are presented in the sequence of omitting the between-level, within-level, and cross-level covariates interaction.

Omitting the Between-Level Covariates Interaction

Results showed that the main effects of the two covariates from the misspecified models were consistently overestimated, as presented in Table 1. In the correct model, the main effects of the two covariates were around .30 and .40, respectively, close to the population values. In the misspecified model omitting the covariates interaction effect, the two main effects were overestimated at around .44 and .54, respectively, when the interaction effect was .30, and at around .59 and .69 when the interaction effect was .60, under all conditions. This indicates that the omitted interaction effect was channeled through the main effects, divided almost equally between the two. The between-level factor mean was underestimated in all conditions when the interaction effect was omitted: it was around zero in the correct model but was estimated at around −.07 and −.14 in the misspecified model for covariates interactions of .30 and .60, respectively. In addition to these three parameters, the between-level factor residual variance was slightly affected, as shown in Table 1; it was overestimated by approximately .01 and .02 for covariates interactions of .30 and .60, respectively. The estimates of all other parameters (between-level factor loadings, between-level intercepts, between-level item residual variances, and all parameters at the within level) were not noticeably affected by the misspecification.
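This channeling pattern can be reproduced with ordinary least squares on two balanced 0/1 covariates (a simplified sketch of the mechanism, not the ML MIMIC model itself): with population main effects of .30 and .40 and an interaction of .30, omitting the product term shifts each main effect up by half the interaction and the intercept down by a quarter of it, mirroring the overestimated main effects and the −.07 shift in the factor mean reported above.

```python
import numpy as np

# Four balanced cells of two 0/1 covariates; the outcome follows the population
# model y = .30*x1 + .40*x2 + .30*x1*x2 (no noise, so the projection is exact).
x1 = np.array([0, 0, 1, 1], dtype=float)
x2 = np.array([0, 1, 0, 1], dtype=float)
y = 0.30 * x1 + 0.40 * x2 + 0.30 * x1 * x2

# Misspecified model: regress y on x1 and x2 only (interaction omitted).
X = np.column_stack([np.ones(4), x1, x2])
intercept, b1, b2 = np.linalg.lstsq(X, y, rcond=None)[0]
print(round(b1, 3), round(b2, 3), round(intercept, 3))  # 0.45 0.55 -0.075
```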

Table 1.

The Estimated Values of Main Covariates Effects, Between-Level Factor Mean, and Factor Residual Variance of Correct and Misspecified Models With Between-Level Covariates.

Covariates interaction of .30 Covariates interaction of .60
Correct model Misspecified model Correct model Misspecified model
ICC CN/CS M1 M2 fb fbv M1 M2 fb fbv M1 M2 fb fbv M1 M2 fb fbv
Small 40/10 .30 .40 .00 .03 .44 .54 −.07 .04 .30 .39 .00 .03 .60 .69 −.14 .06
40/20 .29 .40 .01 .04 .44 .54 −.06 .05 .29 .39 .01 .04 .59 .68 −.14 .07
80/10 .29 .39 .01 .04 .44 .54 −.07 .04 .29 .39 .01 .04 .59 .69 −.14 .06
80/20 .29 .39 .01 .05 .44 .54 −.07 .06 .30 .40 .00 .05 .59 .69 −.14 .07
120/10 .29 .39 .01 .04 .44 .54 −.07 .05 .30 .40 .00 .04 .59 .69 −.14 .06
120/20 .30 .39 .01 .05 .44 .54 −.07 .06 .30 .40 .00 .05 .60 .69 −.14 .07
Medium 40/10 .29 .39 .01 .11 .43 .53 −.06 .12 .30 .40 .01 .12 .59 .69 −.14 .14
40/20 .30 .40 .00 .13 .45 .54 −.07 .14 .31 .41 .00 .13 .60 .70 −.14 .15
80/10 .30 .39 .01 .12 .44 .54 −.06 .13 .29 .40 .01 .13 .59 .69 −.14 .15
80/20 .30 .40 .00 .14 .45 .54 −.07 .14 .30 .40 .01 .14 .59 .69 −.14 .16
120/10 .29 .39 .01 .13 .44 .54 −.06 .14 .29 .39 .01 .13 .59 .69 −.14 .16
120/20 .30 .39 .00 .14 .45 .54 −.07 .15 .29 .39 .01 .14 .59 .69 −.14 .17
Large 40/10 .29 .38 .01 .25 .45 .53 −.06 .27 .28 .39 .01 .26 .58 .69 −.14 .29
40/20 .30 .40 .00 .27 .45 .56 −.07 .29 .29 .39 .01 .28 .59 .69 −.14 .31
80/10 .29 .40 .00 .27 .44 .54 −.07 .28 .29 .39 .01 .28 .59 .69 −.14 .31
80/20 .28 .38 .02 .29 .44 .54 −.06 .30 .30 .40 .00 .29 .60 .70 −.15 .32
120/10 .30 .39 .01 .28 .44 .54 −.07 .29 .29 .39 .01 .28 .59 .69 −.14 .31
120/20 .29 .39 .01 .29 .45 .54 −.07 .30 .30 .39 .00 .29 .60 .69 −.14 .32

Note. ICC = intraclass correlation; CN = number of clusters per group; CS = cluster size; M1 = main effect of Covariate 1 (.30); M2 = main effect of Covariate 2 (.40); fb = between-level factor mean; fbv = between-level factor residual variance.

When the between-level covariates interaction was incorrectly left out, none of the SEM-based fit indices examined in this study (CFI, TLI, RMSEA, SRMR-B, and SRMR-W) was sensitive to the omission, as presented in Table 2. The CFI values of the correctly specified and misspecified models were very close, at .98 or .99, and the same pattern was observed for TLI, which ranged from .96 to .99; both indicated excellent model fit for the correct and misspecified models according to the criteria proposed by Hu and Bentler (1999). Because TLI was very similar to CFI, it is not presented in Table 2. RMSEA for both the correct and misspecified models was around .05. SRMR-B values in the two models were very similar, ranging from .01 to .05, and SRMR-W was .05 in all conditions regardless of the model specified. CFI, TLI, RMSEA, SRMR-B, and SRMR-W were very similar for interaction effects of .30 and .60, and the standard deviations of these fit indices in the correct model were close to those in the misspecified model. Given how close the two sets of values were, these fit indices are reported only for the misspecified model with an interaction effect of .30; the complete data are available on request.

Table 2.

Detection Rates of Information Criteria for Between-Level Interaction and Values of Fit Indices Under Between-Level Interaction Omission.

ICs SEM fit indices (misspecified model)
Interaction effect of zero (false positive) Interaction effect of .30 Interaction effect of .60 Interaction effect of .30
ICC CN/CS AIC BIC BIC(J) SaBIC AIC BIC BIC(J) SaBIC AIC BIC BIC(J) SaBIC CFI (SD) RMSEA (SD) SRMR-B (SD) SRMR-W (SD)
S 40/10 .21 .03 .09 .15 .45 .14 .28 .36 .87 .57 .75 .81 .98 (.04) .07 (.04) .04 (.01) .05 (.01)
40/20 .20 .02 .08 .09 .54 .15 .35 .37 .96 .68 .88 .89 .98 (.02) .06 (.01) .03 (.01) .05 (.01)
80/10 .18 .01 .04 .08 .68 .24 .41 .50 .99 .86 .95 .97 .99 (.01) .05 (.01) .03 (.01) .05 (.01)
80/20 .17 .01 .04 .05 .77 .28 .54 .56 1.00 .92 .99 .99 .99 (.00) .05 (.01) .02 (.00) .05 (.01)
120/10 .17 .01 .02 .04 .81 .36 .52 .62 1.00 .96 .99 1.00 .99 (.00) .05 (.01) .03 (.01) .05 (.01)
120/20 .19 .01 .04 .04 .90 .42 .66 .68 1.00 .98 1.00 1.00 .99 (.00) .04 (.00) .02 (.00) .05 (.01)
M 40/10 .19 .02 .07 .12 .37 .09 .20 .27 .71 .33 .52 .60 .98 (.01) .06 (.02) .04 (.01) .05 (.01)
40/20 .18 .01 .07 .07 .38 .07 .22 .24 .74 .33 .59 .61 .98 (.01) .06 (.02) .04 (.01) .05 (.01)
80/10 .18 .01 .05 .08 .48 .11 .24 .31 .92 .58 .77 .83 .99 (.01) .05 (.01) .03 (.01) .05 (.01)
80/20 .16 .01 .04 .04 .53 .12 .27 .28 .94 .60 .80 .81 .99 (.00) .04 (.01) .02 (.00) .05 (.01)
120/10 .17 .02 .04 .06 .59 .18 .32 .39 .98 .77 .89 .92 .99 (.00) .04 (.01) .02 (.00) .05 (.01)
120/20 .19 .00 .03 .04 .68 .18 .38 .40 .99 .83 .95 .95 .99 (.00) .04 (.00) .01 (.00) .05 (.00)
L 40/10 .17 .03 .08 .11 .33 .03 .16 .23 .54 .19 .36 .45 .98 (.01) .05 (.01) .03 (.01) .05 (.01)
40/20 .20 .02 .09 .10 .30 .06 .16 .18 .58 .20 .40 .41 .99 (.00) .04 (.01) .02 (.01) .05 (.01)
80/10 .16 .01 .04 .07 .38 .07 .16 .20 .79 .34 .54 .62 .99 (.00) .04 (.01) .02 (.00) .05 (.01)
80/20 .17 .01 .04 .05 .43 .08 .20 .21 .81 .32 .58 .60 .99 (.00) .04 (.00) .02 (.00) .05 (.01)
120/10 .17 .01 .03 .06 .47 .10 .20 .26 .88 .50 .69 .75 .99 (.00) .04 (.01) .02 (.00) .05 (.01)
120/20 .17 .01 .03 .03 .53 .08 .24 .25 .91 .49 .72 .74 .99 (.00) .04 (.01) .01 (.01) .05 (.01)

Note. ICC = intraclass correlation; CN = number of clusters per group; CS = cluster size; ICs = information criteria; SEM = structural equation modeling; AIC = Akaike information criterion; BIC = Bayesian information criterion; BIC(J) = BIC with between-level sample size; SaBIC = sample size–adjusted BIC; CFI = comparative fit index; RMSEA = root mean square error of approximation; SRMR = standardized root mean square residual; SRMR-B = SRMR for the between level; SRMR-W = SRMR for the within level; S = small; M = medium; L = large.

The proportions of replications in which AIC, BIC, BIC(J), and SaBIC selected the model with the interaction effect were computed to indicate their sensitivity to model misspecification. The false positive rates ranged from .17 to .21 for AIC, from 0 to .03 for BIC, from .02 to .09 for BIC(J), and from .03 to .15 for SaBIC. When the omitted covariates interaction effect was .30, the TP rates ranged from .30 to .90, .03 to .42, .16 to .66, and .18 to .68 for AIC, BIC, BIC(J), and SaBIC, respectively. Sensitivity increased as sample size increased, especially as the cluster number increased. ICC had a negative impact on sensitivity: when ICC increased, sensitivity decreased, controlling for all other design factors. Therefore, AIC, BIC, BIC(J), and SaBIC performed best in the conditions with the largest sample size and small ICC. Although the patterns of the four indices were very similar, AIC was more sensitive to the misspecification than SaBIC, which was more sensitive than BIC(J), and BIC was the least sensitive; correspondingly, AIC showed the highest false positive rates and BIC the lowest. When the omitted covariates interaction was .60, all four indices were more sensitive to the model misspecification than when it was .30: the proportions of replications selecting the correct model ranged from .54 to 1.00, .19 to .98, .36 to 1.00, and .41 to 1.00 for AIC, BIC, BIC(J), and SaBIC, respectively. In some conditions with larger sample sizes and small ICC, AIC, BIC(J), and SaBIC selected the correct model in all replications.

The detection rates of RMSEAt and CFIt obtained in equivalence testing of the between level (with the within-level model saturated), as well as the adjusted cutoff values RMSEA_e05 and CFI_e95 corresponding to the conventional RMSEA of .05 and CFI of .95 at the between level, are presented in Table 3 for the between-level interaction omission conditions. The detection rate of the between-level RMSEAt ranged from 0 to .17 when the sample size was 800 and above. For the smallest sample size, with only 40 clusters, the detection rates of RMSEAt ranged from .19 to .57, increasing as ICC became smaller. The seemingly higher detection rate of RMSEAt at the smaller sample size arose partly because RMSEA is penalized for the larger number of estimated parameters due to the saturation of the within-level model combined with the small sample size (i.e., the number of free parameters exceeded the CN; Browne & Cudeck, 1993; Ryu & West, 2009; Steiger, 1990). The detection rate of CFIt was zero in all conditions except those with the smallest sample size and small ICC. These results imply that RMSEAt and CFIt in equivalence testing failed to detect the misspecification of the between-level model.1

Table 3.

Detection Rates of RMSEAt and CFIt Under Between-Level Interaction Omission and Adjusted Cutoff Values of RMSEA_e05 and CFI_e95.

Interaction effect of .30 Interaction effect of .60 Adjusted cutoff values
ICC CN/CS RMSEAt CFIt RMSEAt CFIt RMSEA_e05 CFI_e95
S 40/10 .57 .04 .53 .02 .07 .91
40/20 .17 .00 .13 .00 .07 .93
80/10 .08 .00 .06 .00 .07 .93
80/20 .00 .00 .00 .00 .06 .93
120/10 .00 .00 .00 .00 .06 .93
120/20 .00 .00 .00 .00 .06 .94
M 40/10 .37 .00 .34 .00 .07 .91
40/20 .04 .00 .04 .00 .07 .93
80/10 .02 .00 .01 .00 .07 .93
80/20 .00 .00 .00 .00 .06 .93
120/10 .00 .00 .00 .00 .06 .93
120/20 .00 .00 .00 .00 .06 .94
L 40/10 .22 .00 .19 .00 .07 .91
40/20 .01 .00 .01 .00 .07 .93
80/10 .01 .00 .00 .00 .07 .93
80/20 .00 .00 .00 .00 .06 .93
120/10 .00 .00 .00 .00 .06 .93
120/20 .00 .00 .00 .00 .06 .94

Note. ICC = intraclass correlation; CN = number of clusters per group; CS = cluster size; RMSEA = root mean square error of approximation; RMSEAt = t-size RMSEA; CFI = comparative fit index; CFIt = t-size CFI; RMSEA_e05 = the cutoff value of RMSEAt corresponding to conventional RMSEA of .05; CFI_e95 = the cutoff value of CFIt corresponding to conventional CFI of .95; S = small; M = medium; L = large.

Omitting the Within-Level Covariates Interaction

With respect to parameter estimates, the biased parameters included the main effects of the two within-level covariates and the between-level factor mean, as shown in Table 4. As in the scenario of omitting the between-level interaction effect, the omitted covariates interaction effect was channeled through the two main effects; here, each main effect received exactly half of the magnitude of the interaction effect. For instance, when the interaction effect was .30, the two main effects in the misspecified model were .15 higher than in the correct model; when the interaction effect was .60, they were .30 higher. The between-level factor mean was underestimated when the within-level covariates interaction effect was omitted: it was around −.20 in the correct model but was estimated at around −.27 and −.35 when the interaction effect was .30 and .60, respectively. The other parameters in the multilevel MIMIC model were not affected. Results were very similar across the three ICCs, so only the results for the medium ICC are reported in Table 4.

When the within-level covariates interaction effect was omitted from the model, the CFI, TLI, RMSEA, SRMR-B, and SRMR-W produced by the correct model were very close to those from the misspecified model; none of these fit indices detected the misspecification. Again, CFI and TLI were very similar, so TLI is not reported in Table 5. The four reported fit indices for the misspecified model with an interaction effect of .30 appear in Table 5. The proportions of replications in which AIC, BIC, and SaBIC selected the correct model increased as the total sample size increased; notably, cluster number by itself did not have an effect. ICC was not associated with the sensitivity of the three information criteria: sensitivity was very similar across small, medium, and large ICCs when the other design factors were constant, so only the results for the medium ICC conditions are reported in Table 5. When there was no interaction effect in the true model, the false positive rates fell around .16, .01, and .06 for AIC, BIC, and SaBIC, respectively. When the omitted covariates interaction effect was .30, the TP rates of AIC, BIC, and SaBIC ranged from .66 to 1.00, .24 to .96, and .54 to .99, respectively. When the omitted covariates interaction was .60, all three ICs selected the correct model in 100% of the replications except in the conditions with the smallest sample size.

Table 4.

The Estimated Values of Main Covariates Effects and Between-Level Factor Mean of Correct and Misspecified Models With Within-Level Covariates.

Covariates interaction of .30 Covariates interaction of .60
Correct model Misspecified model Correct model Misspecified model
ICC CN/CS M1 M2 fb M1 M2 fb M1 M2 fb M1 M2 fb
Medium 40/10 .30 .40 −.20 .45 .55 −.28 .30 .40 −.20 .60 .70 −.36
40/20 .30 .40 −.20 .45 .55 −.27 .30 .40 −.20 .60 .70 −.35
80/10 .30 .40 −.20 .45 .55 −.27 .30 .40 −.20 .60 .70 −.35
80/20 .30 .40 −.20 .45 .55 −.27 .30 .40 −.20 .60 .70 −.35
120/10 .30 .40 −.20 .45 .55 −.28 .30 .40 −.20 .60 .70 −.34
120/20 .30 .40 −.20 .45 .55 −.28 .30 .40 −.20 .60 .70 −.35

Note. ICC = intraclass correlation; CN = number of clusters per group; CS = cluster size; M1 = main effect of Covariate 1 (.30); M2 = main effect of Covariate 2 (.40); fb = between-level factor mean.

Table 5.

Detection Rates of Information Criteria for Within-Level Interaction and Values of Fit Indices Under Within-Level Interaction Omission.

ICs SEM fit indices of misspecified model
Interaction effect of zero (false positive) Interaction effect of .30 Interaction effect of .60 Interaction effect of .30
ICC CN/CS AIC BIC SaBIC AIC BIC SaBIC AIC BIC SaBIC CFI (SD) RMSEA (SD) SRMR-B (SD) SRMR-W (SD)
M 20/20 .14 .01 .09 .66 .27 .56 .99 .88 .97 .99 (.02) .03 (.02) .03 (.02) .02 (.00)
20/40 .16 .01 .06 .88 .46 .75 1.00 1.00 1.00 1.00 (.00) .02 (.01) .02 (.01) .01 (.00)
40/20 .15 .01 .07 .88 .45 .75 1.00 1.00 1.00 1.00 (.00) .01 (.01) .02 (.01) .01 (.00)
40/40 .17 .01 .05 .99 .80 .92 1.00 1.00 1.00 1.00 (.00) .01 (.01) .01 (.00) .01 (.00)
60/20 .17 .01 .05 .95 .66 .87 1.00 1.00 1.00 1.00 (.00) .01 (.01) .01 (.01) .01 (.00)
60/40 .15 .01 .04 1.00 .95 .98 1.00 1.00 1.00 1.00 (.00) .01 (.01) .01 (.00) .01 (.00)

Note. ICC = intraclass correlation; CN = number of clusters per group; CS = cluster size; ICs = information criteria; SEM = structural equation modeling; AIC = Akaike information criterion; BIC = Bayesian information criterion; SaBIC = sample size–adjusted BIC; CFI = comparative fit index; RMSEA = root mean square error of approximation; SRMR = standardized root mean square residual; SRMR-B = SRMR for the between level; SRMR-W = SRMR for the within level; M = medium.

The detection rates of RMSEAt and CFIt obtained in equivalence testing of the within level (with the between-level model saturated), as well as the adjusted cutoff values RMSEA_e05 and CFI_e95 corresponding to the conventional RMSEA of .05 and CFI of .95 at the within level, are presented in Table 6 for the within-level interaction omission conditions. The detection rate of the within-level RMSEAt ranged from 0 to .19 when the sample size was 800 and above. For the sample size of 400, the detection rate of RMSEAt ranged from .16 to .54, higher for the small ICC. These higher detection rates of RMSEAt might occur not because the model was misspecified but because the number of free parameters exceeded the CN, producing a larger penalty given the sample size. There was a negligible difference in the detection rates of RMSEAt and CFIt between the omitted within-level interactions of .30 and .60, and the detection rate of CFIt was around zero regardless of the conditions. In sum, the within-level RMSEAt and CFIt did not detect the misspecification of the within-level model of ML MIMIC.2

Table 6.

Detection Rates of RMSEAt and CFIt Under Within-Level Interaction Omission and Adjusted Cutoff Values of RMSEA_e05 and CFI_e95.

Interaction effect of .30 Interaction effect of .60 Adjusted cutoff values
ICC CN/CS RMSEAt CFIt RMSEAt CFIt RMSEA_e05 CFI_e95
S 20/20 .53 .05 .54 .05 .07 .91
20/40 .18 .01 .19 .02 .07 .93
40/20 .18 .01 .16 .01 .07 .93
40/40 .01 .00 .01 .00 .06 .93
60/20 .03 .00 .03 .00 .06 .93
60/40 .00 .00 .00 .00 .06 .94
M 20/20 .36 .03 .37 .02 .07 .91
20/40 .04 .00 .05 .00 .07 .93
40/20 .02 .00 .02 .00 .07 .93
40/40 .00 .00 .00 .00 .06 .93
60/20 .00 .00 .00 .00 .06 .93
60/40 .00 .00 .00 .00 .06 .94
L 20/20 .16 .01 .16 .01 .07 .91
20/40 .01 .00 .01 .00 .07 .93
40/20 .00 .00 .00 .00 .07 .93
40/40 .00 .00 .00 .00 .06 .93
60/20 .00 .00 .00 .00 .06 .93
60/40 .00 .00 .00 .00 .06 .94

Note. ICC = intraclass correlation; CN = number of clusters per group; CS = cluster size; RMSEA = root mean square error of approximation; CFI = comparative fit index; RMSEAt = t-size RMSEA; CFIt = t-size CFI; RMSEA_e05 = the cutoff value of RMSEAt corresponding to conventional RMSEA of .05; CFI_e95 = the cutoff value of CFIt corresponding to conventional CFI of .95; S = small; M = medium; L = large.

Omitting the Cross-Level Covariates Interaction

With respect to the bias in parameter estimates resulting from the omission of the cross-level covariates interaction effect, as shown in Table 7, the main effects of the within-level covariate and the between-level covariate were overestimated, each receiving approximately half of the size of the omitted interaction effect. Apart from the two main effects, the between-level factor mean was affected, and the between-level factor residual variance was slightly affected. The between-level factor mean was underestimated: it was estimated at around −.20 in the true model but at around −.25 and −.30 for covariates interaction effects of .20 and .40, respectively. The between-level factor residual variance was about .01 higher in the misspecified model than in the correct model. Note that the design factors in the simulation study, including cluster number, CS, and ICC, had little impact on the bias of parameter estimates in the misspecified models, as shown in Table 7.

Table 7.

The Estimated Values of Main Covariates Effects, Between-Level Factor Mean, and Between-Level Factor Residual Variance of Correct and Misspecified Models With Cross-Level Covariates.

Correct model Misspecified model
Effect ICC CN/CS MB MW fb fbv MB MW fb fbv
.20 S 40/10 .48 .39 −.19 .05 .60 .50 −.25 .06
40/20 .49 .40 −.20 .05 .60 .50 −.25 .06
80/10 .49 .40 −.20 .05 .60 .50 −.25 .06
80/20 .50 .40 −.20 .06 .60 .50 −.25 .06
120/10 .49 .40 −.20 .05 .60 .50 −.25 .06
120/20 .50 .40 −.20 .06 .60 .50 −.25 .06
M 40/10 .48 .39 −.18 .13 .59 .50 −.24 .15
40/20 .49 .40 −.19 .14 .60 .50 −.25 .15
80/10 .50 .40 −.20 .14 .60 .50 −.25 .16
80/20 .50 .40 −.20 .15 .60 .50 −.25 .16
120/10 .50 .40 −.20 .15 .60 .50 −.25 .16
120/20 .50 .40 −.20 .15 .60 .50 −.25 .16
L 40/10 .48 .39 −.19 .29 .59 .49 −.24 .30
40/20 .48 .40 −.19 .29 .59 .50 −.25 .30
80/10 .49 .40 −.19 .30 .59 .50 −.25 .32
80/20 .49 .40 −.20 .30 .59 .50 −.25 .31
120/10 .50 .40 −.20 .30 .60 .50 −.25 .32
120/20 .50 .40 −.20 .31 .60 .50 −.25 .32
.40 S 40/10 .48 .40 −.19 .05 .70 .60 −.30 .06
40/20 .50 .40 −.20 .05 .70 .60 −.30 .06
80/10 .49 .40 −.20 .05 .70 .60 −.30 .06
80/20 .50 .40 −.20 .05 .70 .60 −.30 .06
120/10 .49 .40 −.20 .05 .70 .60 −.30 .06
120/20 .49 .40 −.20 .06 .70 .60 −.30 .06
M 40/10 .48 .39 −.20 .14 .70 .60 −.30 .15
40/20 .49 .40 −.20 .14 .70 .60 −.30 .15
80/10 .50 .40 −.20 .14 .70 .60 −.30 .15
80/20 .50 .40 −.20 .15 .70 .60 −.30 .16
120/10 .50 .40 −.20 .14 .70 .60 −.30 .16
120/20 .50 .40 −.20 .15 .70 .60 −.30 .16
L 40/10 .47 .39 −.19 .29 .69 .60 −.29 .30
40/20 .50 .40 −.20 .30 .70 .60 −.31 .31
80/10 .49 .40 −.19 .30 .70 .60 −.29 .31
80/20 .50 .40 −.20 .31 .70 .60 −.30 .32
120/10 .50 .40 −.20 .30 .70 .60 −.30 .31
120/20 .50 .40 −.20 .31 .70 .60 −.30 .32

Note. ICC = intraclass correlation; CN = number of clusters per group; CS = cluster size; MB = between-level covariate main effect (.50); MW = within-level covariate main effect (.40); fb = between-level factor mean; fbv = between-level factor residual variance; S = small; M = medium; L = large.

The two-level random effect modeling did not produce the model fit indices commonly used in SEM (e.g., CFI, TLI, RMSEA, and SRMR); however, it did produce ICs (i.e., AIC, BIC, and SaBIC). Therefore, checking the sensitivity of fit indices to misspecification in the cross-level covariates interaction conditions applied only to AIC, BIC, and SaBIC. As shown in Table 8, the false positive rates under the zero-interaction conditions ranged from 0 to .11, 0 to .02, and 0 to .09 for AIC, BIC, and SaBIC, respectively. At the largest sample size (i.e., CN = 120 and CS = 20), the false positive rates were around zero for all ICs. When the omitted cross-level covariates interaction was .20, the TP proportions ranged from .08 to .31, .02 to .10, and .05 to .25 for AIC, BIC, and SaBIC, respectively, indicating low sensitivity to the model misspecification. However, when the omitted covariates interaction effect increased to .40, the sensitivity of the three ICs improved substantially: the proportions of replications selecting the correct model ranged from .54 to .90, .30 to .78, and .49 to .85, respectively. Furthermore, the sensitivity of the three ICs increased as the sample size increased, and it improved slightly as ICC increased.

Table 8.

Detection Rates of Information Criteria for Cross-Level Interaction.

Interaction effect of zero (false positive) Interaction effect of .20 Interaction effect of .40
ICC CN/CS AIC BIC SaBIC AIC BIC SaBIC AIC BIC SaBIC
S 40/10 .11 .02 .08 .23 .07 .19 .54 .30 .50
40/20 .04 .01 .03 .16 .05 .11 .57 .37 .49
80/10 .05 .01 .04 .20 .06 .14 .68 .46 .61
80/20 .01 .00 .01 .11 .03 .06 .72 .53 .65
120/10 .03 .01 .02 .20 .07 .13 .76 .58 .69
120/20 .00 .00 .00 .08 .02 .05 .78 .62 .71
M 40/10 .16 .03 .12 .28 .09 .23 .59 .35 .54
40/20 .07 .01 .04 .23 .07 .15 .65 .40 .58
80/10 .08 .01 .05 .26 .08 .17 .74 .50 .65
80/20 .02 .00 .01 .16 .06 .11 .80 .61 .73
120/10 .05 .01 .03 .24 .08 .16 .81 .61 .74
120/20 .01 .00 .00 .12 .04 .07 .86 .75 .81
L 40/10 .14 .02 .09 .31 .10 .25 .65 .40 .60
40/20 .08 .01 .05 .24 .08 .18 .71 .48 .63
80/10 .10 .01 .06 .32 .10 .26 .79 .55 .71
80/20 .02 .00 .01 .23 .08 .16 .86 .68 .79
120/10 .07 .01 .03 .30 .12 .21 .84 .64 .78
120/20 .02 .00 .01 .18 .07 .11 .90 .78 .85

Note. ICC = intraclass correlation; CN = number of clusters per group; CS = cluster size; AIC = Akaike information criterion; BIC = Bayesian information criterion; SaBIC = sample size–adjusted BIC; S = small; M = medium; L = large.

Discussion

The purpose of this study was to examine the impact of omitting a covariates interaction on parameter estimates and the sensitivity of fit indices (RMSEA, CFI, TLI, SRMR-W, SRMR-B, AIC, BIC, and SaBIC) to such misspecification in ML MIMIC models. The simulation results indicated that none of the SEM-based fit indices (RMSEA, CFI, TLI, SRMR-W, and SRMR-B) detected the misspecification of incorrectly excluding the covariates interaction effect. All of these fit indices produced by the misspecified model were almost identical to those of the correct model; based on them, the misspecified model would be judged to fit the data very well even though an existing covariates interaction effect was omitted. Thus, obtaining very good fit indices did not guarantee that the functional form was correctly specified. Moreover, the level-specific fit indices RMSEAt and CFIt from equivalence testing failed to detect the omission of between-level and within-level interactions. The seemingly higher detection rates of level-specific RMSEAt at small sample sizes should not be interpreted as sensitivity to model misspecification, because similar detection rates were observed when there was no misspecification (as reported in Notes 1 and 2). RMSEA is penalized for a large number of estimated parameters and a small sample size (Browne & Cudeck, 1993; Ryu & West, 2009; Steiger, 1990); saturating the within-level or between-level model increased the number of estimated parameters, and RMSEA was inflated by the coupling of this increase with small sample sizes. According to Ryu and West (2009), the CFI and RMSEA that most SEM software packages present in their output failed to detect model misspecification at the between level but succeeded at the within level; note that the misspecification in their study was a misspecified factor correlation. Our simulation results were consistent with theirs with respect to misspecification at the between level (failure to detect it) but conflicted with respect to misspecification at the within level: CFI and RMSEA were sensitive to within-level misspecification in their study, whereas they were not in ours. Hsu et al. (2015) conducted simulation studies examining the sensitivity of common fit indices to misspecified ML SEMs and found that CFI, RMSEA, and TLI could detect misspecification only in the within-level model, not in the between-level model, and that SRMR-B was the only fit index sensitive to between-level misspecification. In our study, however, SRMR-B was not sensitive to the omission of the covariates interaction in the between-level model.

The sensitivity of fit indices may vary with the type of misspecification. Previous research on the sensitivity of fit indices to model misspecification focused on misspecified factor loadings or factor covariances (e.g., Hsu et al., 2015; Hu & Bentler, 1998; Ryu & West, 2009). Misspecifications of these forms alter the model-implied mean and covariance matrices and therefore affect the fit indices. In contrast, omitting a covariates interaction effect in multilevel MIMIC had a much smaller impact on the model-implied mean and covariance matrices, and the fit indices were not sensitive enough to capture misspecification of this form. One explanation for this insensitivity in multilevel SEM is that misspecification in one parameter (here, constraining the covariates interaction effect to zero) may be absorbed as biased estimates of other related parameters without substantially changing the overall model-implied mean and covariance matrices (Gerbing & Anderson, 1993; Tomarken & Waller, 2003). When these matrices are largely unaffected by the misspecification, the χ2 statistic produced by the misspecified model differs little from that of the correct model. Consequently, fit indices such as CFI, RMSEA, and TLI, which are functions of the overall model χ2 test statistic, were not sensitive to the model misspecification. SEM fit indices therefore have limitations in detecting model misspecification of various forms, and good fit by the commonly used criteria (e.g., CFI > .95) cannot guarantee that the relations among the variables in the model have been correctly specified.
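The dependence of these indices on the overall χ2 statistic can be made concrete with their standard formulas (RMSEA from Steiger, 1990; CFI from Hu & Bentler, 1999). The sketch below uses purely hypothetical χ2 values, chosen only for illustration, to show how a small χ2 difference between a correct and a misspecified model yields nearly identical, "excellent" index values:

```python
import math

def rmsea(chi2, df, n):
    # Root mean square error of approximation: misfit per df, per observation
    return math.sqrt(max((chi2 - df) / (df * (n - 1)), 0.0))

def cfi(chi2_m, df_m, chi2_b, df_b):
    # Comparative fit index: improvement over the baseline (independence) model
    num = max(chi2_m - df_m, 0.0)
    den = max(chi2_b - df_b, chi2_m - df_m, 0.0)
    return 1.0 - num / den

# Hypothetical values: omitting the interaction barely moves the chi-square
n, chi2_b, df_b = 800, 1500.0, 66          # sample size and baseline model fit
correct = (rmsea(52.3, 48, n), cfi(52.3, 48, chi2_b, df_b))
misspec = (rmsea(55.1, 49, n), cfi(55.1, 49, chi2_b, df_b))
print(correct, misspec)  # both models easily pass RMSEA < .06 and CFI > .95
```

Under these assumed values, both models clear the conventional cutoffs by a wide margin, mirroring the pattern observed in the simulation.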

The measurement component in SEM carries much more weight in global fit indices than the structural component (McDonald, 2010; O’Boyle & Williams, 2011), and global fit indices fail to detect model misspecification when the misspecified part lies in the structural model. In this ML MIMIC model, the measurement part at both the between and within levels was correctly specified, and the omitted interaction effect belonged to the structural part. This may partly explain why all SEM-based fit indices were insensitive to the omission of the interaction effect.

The seemingly higher sensitivity of AIC compared with the other ICs was accompanied by a higher false positive rate when there was no covariates interaction in the population model. This tendency is consistent with the literature on AIC, which tends to select the more complex model (the model with the interaction effect in our study) because AIC imposes a less harsh penalty for model complexity than the other ICs (e.g., Kim et al., 2015; Vrieze, 2012). Despite its higher sensitivity, it is unwise to rely on AIC to detect the omission of an interaction effect because of its high false positive rates.
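The penalty difference that drives this behavior is visible in the standard IC formulas (Akaike, 1974; Schwarz, 1978; Sclove, 1987): AIC charges 2 per free parameter, BIC charges ln(N), and SaBIC charges ln((N + 2)/24). A small numeric sketch, with hypothetical log-likelihoods chosen only for illustration, shows AIC retaining an extra interaction parameter that BIC rejects:

```python
import math

def aic(ll, k):
    # Akaike (1974): penalty of 2 per free parameter
    return -2 * ll + 2 * k

def bic(ll, k, n):
    # Schwarz (1978): penalty of ln(N) per free parameter
    return -2 * ll + k * math.log(n)

def sabic(ll, k, n):
    # Sclove (1987): sample-size-adjusted BIC, penalty of ln((N + 2) / 24)
    return -2 * ll + k * math.log((n + 2) / 24)

# Hypothetical fit: adding the interaction (one extra parameter)
# improves the log-likelihood by only 2 units at N = 800.
n = 800
ll_no_int, k_no_int = -5000.0, 20   # model without the interaction
ll_int, k_int = -4998.0, 21         # model with the interaction

print(aic(ll_int, k_int) < aic(ll_no_int, k_no_int))        # AIC keeps it
print(bic(ll_int, k_int, n) < bic(ll_no_int, k_no_int, n))  # BIC drops it
```

Because ln(800) ≈ 6.7 far exceeds AIC's flat penalty of 2, BIC demands a larger likelihood gain before admitting the extra parameter, which is exactly the trade-off between sensitivity and false positives discussed above.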

When necessary parameters are omitted from a model, not all parameters in the model are affected, only those closely related to the misspecification (Yuan et al., 2003). The results of this study conform to this statement: only the parameters closely related to the omitted interaction effect were affected, namely the two main effects of the two covariates and the between-level factor mean. Based on the path analysis conducted by Yuan et al. (2003), these parameters were expected to be affected, but the degree of bias was not known before this study. Our results showed that each of the two main effects absorbed approximately half of the interaction effect in the same direction.
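This roughly-half pattern has a simple algebraic source: for two independent binary covariates coded 0/1 with .5 proportions, the least squares projection of the product x1·x2 onto x1 and x2 has slopes of .5, so each main effect absorbs half of the omitted interaction coefficient, and the intercept (analogous to the between-level factor mean) shifts down by a quarter of it. The sketch below is an illustrative single-level regression analogue, not the multilevel design of this study; all coefficient values are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200_000
x1 = rng.integers(0, 2, n).astype(float)  # binary covariate, p = .5
x2 = rng.integers(0, 2, n).astype(float)
b1, b2, b3 = 0.3, 0.3, 0.4                # true main effects and interaction
y = b1 * x1 + b2 * x2 + b3 * x1 * x2 + rng.normal(0.0, 1.0, n)

# Misspecified model: the x1 * x2 interaction term is omitted
X = np.column_stack([np.ones(n), x1, x2])
intercept, m1, m2 = np.linalg.lstsq(X, y, rcond=None)[0]

# Each main effect is inflated by about b3/2; the intercept drops by ~b3/4
print(round(m1, 2), round(m2, 2), round(intercept, 2))  # ≈ 0.5, 0.5, -0.1
```

With these assumed values, the estimated main effects land near 0.3 + 0.4/2 = 0.5, matching the direction and magnitude of bias reported above.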

Practical Implications

The insensitivity of fit indices to a missing covariates interaction effect poses serious issues for applied researchers who are trying to make inferences about the relationships among variables in their data. We recommend that applied researchers use theoretical frameworks in their substantive research areas to guide the specification of relationships among variables so that important effects are not omitted. If no established theory is in place, further exploration of the relationships among the variables is suggested to gain a more complete understanding of the data, and it is important to run models both with and without interaction effects. For such model comparisons, CFI, TLI, RMSEA, SRMR-B, and SRMR-W failed to detect the omission, producing comparable fit indices in the correct and misspecified models. When the omitted interaction effect was large, the information criteria (AIC, BIC, and SaBIC) were quite sensitive to the omission, especially in the within-level and cross-level conditions; for the between-level conditions, a sample size of 800 or more was needed to detect the omission. When the omitted interaction was moderate, the ICs failed to detect the misspecification in the between-level and cross-level conditions but could detect the omission of the within-level interaction with a sample size greater than 1,600. Thus, applied researchers should consider the location of the potential interaction, the available sample size, and the ICC when selecting between models with and without a covariates interaction. We recommend examining the combination of the three ICs (replacing BIC with BIC(J) for a between-level interaction) rather than relying on only one of them.

Notes

1. We also examined the detection rates of the within-level RMSEAt and CFIt when the between level was saturated. Because there was no model misspecification at the within level, the detection rates could be regarded as false positives and should be close to zero. The detection rates of CFIt ranged from 0 to .07, but those of RMSEAt ranged from 0 to .49, higher for the smallest sample size combined with a large ICC, showing inflated false positive rates with small samples, possibly for the same reason explained in the text.

2. The detection rates of the between-level RMSEAt and CFIt could also be regarded as false positives, as the within-level model was saturated and there was no model misspecification in the between-level model. The detection rate of the between-level RMSEAt was higher for the small ICC, around .50 for the sample size of 400, and decreased dramatically as sample size increased, especially for large ICC. The detection rate of the between-level CFIt was about zero for all conditions.

Footnotes

Declaration of Conflicting Interests: The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding: The author(s) received no financial support for the research, authorship, and/or publication of this article.

References

  1. Akaike H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19(6), 716-723. 10.1109/TAC.1974.1100705
  2. Bentler P. M., Chou C. P. (1993). Some new covariance structure model improvement statistics. In Bollen K. A., Long J. S. (Eds.), Testing structural equation models (pp. 235-255). Sage.
  3. Browne M. W., Cudeck R. (1993). Alternative ways of assessing model fit. In Bollen K. A., Long J. S. (Eds.), Testing structural equation models (Vol. 154, pp. 132-162). Sage.
  4. Cao C., Kim E. S., Chen Y.-H., Ferron J., Stark S. (2019). Exploring the test of covariate moderation effects in multilevel MIMIC models. Educational and Psychological Measurement, 79(3), 512-544. 10.1177/0013164418793490
  5. Cheng Y., Shao C., Lathrop Q. N. (2016). The mediated MIMIC model for understanding the underlying mechanism of DIF. Educational and Psychological Measurement, 76(1), 43-63. 10.1177/0013164415576187
  6. Chiu M. M. (2010). Effects of inequality, family, and school on mathematics achievement: Country and student difference. Social Forces, 88(4), 1645-1676. 10.1353/sof.2010.0019
  7. Enders C. K., Tofighi D. (2008). The impact of misspecifying class-specific residual variances in growth mixture models. Structural Equation Modeling, 15(1), 75-95. 10.1080/10705510701758281
  8. Fan X., Sivo S. A. (2007). Sensitivity of fit indices to model misspecification and model types. Multivariate Behavioral Research, 42, 509-529. 10.1080/00273170701382864
  9. Fan X., Thompson B., Wang L. (1999). The effects of sample size, estimation methods, and model specification on SEM fit indices. Structural Equation Modeling, 6(1), 56-83. 10.1080/10705519909540119
  10. Fan X., Wang L. (1998). Effects of potential confounding factors on fit indices and parameter estimates for true and misspecified SEM models. Educational and Psychological Measurement, 58(5), 699-733. 10.1177/0013164498058005001
  11. Farley J. U., Reddy S. K. (1987). A factorial evaluation of effects of model specification and error on parameter estimation in a structural equation model. Multivariate Behavioral Research, 22(1), 71-90. 10.1207/s15327906mbr2201_4
  12. Ferron J., Dailey R., Yi Q. (2002). Effects of misspecifying the first-level error structure in two-level models of change. Multivariate Behavioral Research, 37(3), 379-403. 10.1207/S15327906MBR3703_4
  13. Finch W. H., French B. F. (2011). Estimation of MIMIC model parameters with multilevel data. Structural Equation Modeling, 18(2), 229-252. 10.1080/10705511.2011.557338
  14. Fleishman J., Spector W., Altman B. (2002). Impact of differential item functioning on age and gender differences in functional disability. Psychological Sciences and Social Sciences, 57(5), 275-284. 10.1093/geronb/57.5.S275
  15. Gallini J. K. (1983). Misspecifications that can result in path analysis structures. Applied Psychological Measurement, 7(2), 125-137. 10.1177/014662168300700201
  16. Gerbing D. W., Anderson J. C. (1993). Monte Carlo evaluations of goodness-of-fit indices for structural equation models. In Bollen K. A., Long J. S. (Eds.), Testing structural equation models (pp. 40-59). Sage.
  17. Heck R., Thomas S. (2000). An introduction to multilevel modeling techniques. Lawrence Erlbaum. 10.4324/9781410604767
  18. Hox J. J. (1998). Multilevel modeling: When and why. In Balderjahn I., Mathar R., Schader M. (Eds.), Classification, data analysis, and data highways (pp. 147-154). Springer Verlag. 10.1007/978-3-642-72087-1_17
  19. Hox J. J. (2002). Multilevel analysis: Techniques and applications. Lawrence Erlbaum. 10.4324/9781410604118
  20. Hox J. J., Maas C. J. M. (2001). The accuracy of multilevel structural equation modeling with pseudobalanced groups and small samples. Structural Equation Modeling, 8(2), 157-174. 10.1207/S15328007SEM0802_1
  21. Hsu H., Kwok O., Lin J. H., Acosta S. (2015). Detecting misspecified multilevel structural equation models with common fit indices: A Monte Carlo study. Multivariate Behavioral Research, 50(2), 197-215. 10.1080/00273171.2014.977429
  22. Hu L., Bentler P. M. (1998). Fit indices in covariance structure modeling: Sensitivity to underparameterized model misspecification. Psychological Methods, 3(4), 424-453. 10.1037/1082-989X.3.4.424
  23. Hu L., Bentler P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6(1), 1-55. 10.1080/10705519909540118
  24. Jak S., Oort F. J., Dolan C. V. (2014). Measurement bias in multilevel data. Structural Equation Modeling, 21(1), 31-39. 10.1080/10705511.2014.856694
  25. Jöreskog K. G., Goldberger A. S. (1975). Estimation of a model with multiple indicators and multiple causes of a single latent variable. Journal of the American Statistical Association, 70(351a), 631-639. 10.1080/01621459.1975.10482485
  26. Kaplan D. (1988). The impact of specification error on the estimation, testing, and improvement of structural equation models. Multivariate Behavioral Research, 23(1), 69-86. 10.1207/s15327906mbr2301_4
  27. Kaplan D. (2009). Structural equation modeling: Foundations and extensions (2nd ed.). Sage. 10.4135/9781452226576
  28. Kenny D. A., McCoach D. B. (2003). Effect of the number of variables on measures of fit in structural equation modeling. Structural Equation Modeling, 10(3), 333-351. 10.1207/S15328007SEM1003_1
  29. Kim E. S., Joo S.-H., Lee P., Wang Y., Stark S. (2016). Measurement invariance testing across between-level latent class using multilevel factor mixture modeling. Structural Equation Modeling, 23(6), 870-887. 10.1080/10705511.2016.1196108
  30. Kim E. S., Kwok O., Yoon M. (2012). Testing factorial invariance in multilevel data: A Monte Carlo study. Structural Equation Modeling, 19(2), 250-267. 10.1080/10705511.2012.659623
  31. Kim E. S., Yoon M., Wen Y., Luo W., Kwok O. (2015). Within-level group factorial invariance with multilevel data: Multilevel factor mixture and multilevel MIMIC models. Structural Equation Modeling, 22(4), 603-616. 10.1080/10705511.2014.938217
  32. Lee Y. R., Hong S. (2019). The impact of omitting random interaction effects in cross-classified random effect modeling. Journal of Experimental Education, 87(4), 641-660. 10.1080/00220973.2018.1507985
  33. Lukočienė O., Vermunt J. K. (2010). Determining the number of components in mixture models for hierarchical data. In Berthold A. F. L., Seidel W., Ultsch A. (Eds.), Advances in data analysis, data handling and business intelligence (pp. 241-249). Springer. 10.1007/978-3-642-01044-6_22
  34. Maas C. J. M., Hox J. J. (2005). Sufficient sample size for multilevel modeling. Methodology, 1, 86-92. 10.1027/1614-2241.1.3.86
  35. Marcoulides K. M., Yuan K.-H. (2020). Using equivalence testing to evaluate goodness of fit in multilevel structural equation modeling. International Journal of Research & Method in Education, 43, 431-443. 10.1080/1743727X.2020.1795113
  36. Marsh H. W., Balla J. R., Hau K.-T. (1996). An evaluation of incremental fit indices: A clarification of mathematical and empirical properties. In Marcoulides G. A., Schumacker R. E. (Eds.), Advanced structural equation modeling: Issues and techniques (pp. 315-353). Lawrence Erlbaum.
  37. Marsh H. W., Guo J., Parker P. D., Nagengast B., Asparouhov T., Muthén B., Dicke T. (2018). What to do when scalar invariance fails: The extended alignment method for multi-group factor analysis comparison of latent means across many groups. Psychological Methods, 23(3), 524-545. 10.1037/met0000113
  38. Marsh H. W., Wen Z., Hau K. T. (2004). Structural equation models of latent interactions: Evaluation of alternative estimation strategies and indicator construction. Psychological Methods, 9(3), 275-300. 10.1037/1082-989X.9.3.275
  39. Mathieu J. E., Aguinis H., Culpepper S. A., Chen G. (2012). Understanding and estimating the power to detect cross-level interaction effects in multilevel modeling. Journal of Applied Psychology, 97(5), 951-966. 10.1037/a0028380
  40. McDonald R. P. (2010). Structural models and the art of approximation. Perspectives on Psychological Science, 5(6), 675-686. 10.1177/1745691610388766
  41. Muthén B. O., Asparouhov T. (2003). Modeling interactions between latent and observed continuous variables using maximum-likelihood estimation in Mplus (Mplus Web Notes: No. 6).
  42. Muthén L. K., Muthén B. O. (1998-2012). Mplus user’s guide (7th ed.). Muthén & Muthén.
  43. O’Boyle E. H., Jr., Williams L. J. (2011). Decomposing model fit: Measurement vs. theory in organizational research using latent variables. Journal of Applied Psychology, 96(1), 1-12. 10.1037/a0020539
  44. Pauler D. K. (1998). The Schwarz criterion and related methods for normal linear models. Biometrika, 85(1), 13-27. 10.1093/biomet/85.1.13
  45. Raudenbush S. W., Bryk A. S. (2002). Hierarchical linear models: Applications and data analysis methods (2nd ed.). Sage.
  46. Rutkowski L., Svetina D. (2014). Assessing the hypothesis of measurement invariance in the context of large scale international surveys. Educational and Psychological Measurement, 74(1), 31-57. 10.1177/0013164413498257
  47. Ryu E., West S. G. (2009). Level-specific evaluation of model fit in multilevel structural equation modeling. Structural Equation Modeling, 16(4), 583-601. 10.1080/10705510903203466
  48. Schwarz G. (1978). Estimating the dimension of a model. Annals of Statistics, 6(2), 461-464. 10.1214/aos/1176344136
  49. Sclove S. L. (1987). Application of model-selection criterion to some problems in multivariate analysis. Psychometrika, 52, 333-343. 10.1007/BF02294360
  50. Shi Y., Leite W., Algina J. (2010). The impact of omitting the interaction between crossed factors in cross-classified random effects modeling. British Journal of Mathematical and Statistical Psychology, 63(1), 1-15. 10.1348/000711008X398968
  51. Sörbom D. (1975). Detection of correlated errors in longitudinal data. British Journal of Mathematical and Statistical Psychology, 28(2), 138-151. 10.1111/j.2044-8317.1975.tb00558.x
  52. Steiger J. H. (1990). Structural model evaluation and modification: An interval estimation approach. Multivariate Behavioral Research, 25(2), 173-180. 10.1207/s15327906mbr2502_4
  53. Tomarken A. J., Waller N. G. (2003). Potential problems with well fitting models. Journal of Abnormal Psychology, 112(4), 578-598. 10.1037/0021-843X.112.4.578
  54. Tsai S.-L., Smith M. L., Hauser R. M. (2017). Families, schools, and student achievement inequality: A multilevel MIMIC model approach. Sociology of Education, 90(1), 64-88. 10.1177/0038040716683779
  55. Vrieze S. I. (2012). Model selection and psychological theory: A discussion of the differences between the Akaike information criterion (AIC) and the Bayesian information criterion (BIC). Psychological Methods, 17(2), 228-243. 10.1037/a0027127
  56. Wellek S. (2010). Testing statistical hypotheses of equivalence and noninferiority (2nd ed.). Chapman & Hall/CRC. 10.1201/EBK1439808184
  57. Yuan K.-H., Bentler P. M. (2007). Multilevel covariance structure analysis by fitting multiple single-level models. Sociological Methodology, 37(1), 53-82. 10.1111/j.1467-9531.2007.00182.x
  58. Yuan K.-H., Chan W., Marcoulides G. A., Bentler P. M. (2016). Assessing structural equation models by equivalence testing with adjusted fit indexes. Structural Equation Modeling, 23(3), 319-330. 10.1080/10705511.2015.1065414
  59. Yuan K.-H., Marshall L. L., Bentler P. M. (2003). Assessing the effect of model misspecifications on parameter estimates in structural equation models. Sociological Methodology, 33(1), 241-265. 10.1111/j.0081-1750.2003.00132.x
