Educational and Psychological Measurement. 2021 Mar 9;82(1):5–28. doi: 10.1177/0013164421997896

Robustness of Latent Profile Analysis to Measurement Noninvariance Between Profiles

Yan Wang, Eunsook Kim, Zhiyao Yi
PMCID: PMC8725055  PMID: 34992305

Abstract

Latent profile analysis (LPA) identifies heterogeneous subgroups based on continuous indicators that represent different dimensions. It is common practice to measure each dimension using items, create composite or factor scores for each dimension, and use these scores as indicators of profiles in LPA. In this case, measurement models for the dimensions are not included, and potential noninvariance across latent profiles is not modeled in LPA. This simulation study examined the robustness of LPA in terms of class enumeration and parameter recovery when noninvariance was left unmodeled by using composite or factor scores as profile indicators. Results showed that correct class enumeration rates of LPA were relatively high with a small degree of noninvariance, large class separation, large sample size, and equal proportions. Severe bias in the profile indicator mean difference was observed under both intercept and loading noninvariance. Implications for applied researchers are discussed.

Keywords: latent profile analysis, composite scores, factor scores, measurement noninvariance, factor mixture modeling

Introduction

Latent profile analysis (LPA) has been widely used in the social and behavioral sciences to identify substantively meaningful unobserved groups based on a set of continuous indicators (Vermunt & Magidson, 2002). The unobserved groups are often referred to as profiles or classes, and individuals with similar response patterns are classified into the same profile. For example, Lazarides et al. (2020) examined student profiles of motivational beliefs in math based on three dimensions: intrinsic value, importance value, and ability self-concept. Four profiles emerged: three showed low, medium, and high levels across all three dimensions, respectively, and a fourth was low in intrinsic value but high on the other two dimensions. Other examples of LPA include the identification of immigrant families' parental socialization profiles based on socialization and parenting dimensions (S. Y. Kim et al., 2019), subgroups of post-9/11 veterans based on their perceptions of quality of life in various domains (McCaslin et al., 2019), and profiles of early adolescents according to a set of psychosocial dimensions including perceived stress and distress tolerance (Warren et al., 2020).

Among those applications, LPA is commonly estimated using composite scores or factor scores saved from measurement models. Specifically, although dimensions are usually measured by a set of items, composite scores (e.g., sum or mean scores) are often created for each dimension and used as indicators of profiles in LPA. Alternatively, profile indicators can be factor scores for the dimensions saved from a measurement model (e.g., confirmatory factor analysis or CFA; exploratory structural equation modeling or ESEM; Kam et al., 2016; Meyer & Morin, 2016; Morin & Marsh, 2015; Morin et al., 2016). These two approaches are prevalent in LPA because composite scores and factor scores are straightforward to compute and use, leading to relatively simple model specifications and interpretations compared with LPA with measurement models (Luningham et al., 2017).

However, the simplicity of using composite scores or factor scores in LPA comes at a cost. First, as measurement models are not included in LPA, measurement errors associated with the items are not taken into account. While measurement errors are completely unaccounted for when composite scores are used, factor scores can partially control for measurement errors because items with higher loadings (i.e., lower levels of measurement error) are given more weight in the estimation of factor scores (Meyer & Morin, 2016; Morin et al., 2016). Second, measurement invariance (MI) across profiles is not tested, and thus potential noninvariance is not modeled in LPA. The consequence of not modeling measurement noninvariance (MNI) when using composite scores has received much attention and has been investigated across a variety of models, including structural equation models (Neale et al., 2005), latent growth modeling (E. S. Kim & Willson, 2014; Luningham et al., 2017; Wirth, 2008), growth mixture modeling (E. S. Kim & Wang, 2017), and longitudinal mediation models (Zhang & Yang, 2020). Overall, using composite scores was not recommended in those models, given biased parameter estimates and/or low coverage of true parameters.

When measurement models are included in LPA, LPA is extended to factor mixture modeling (FMM), which can take measurement errors and MNI into account. FMM allows for measurement errors because responses to each item are a function of factor scores and measurement errors. MNI is modeled by freely estimating noninvariant parameters across profiles (while invariant parameters are constrained to be equal across profiles) in FMM. On the other hand, because FMM takes into account MNI and measurement errors, the model becomes more complicated than LPA (Morin et al., 2016). For example, the size of the model grows considerably, especially when the numbers of factors and items are relatively large, and with a larger model size, the number of parameters to be estimated increases considerably. The model complexity might lead to nonconvergence (Luningham et al., 2017) and require a larger sample size. In addition, model interpretation might become more complicated because profiles might differ from each other in terms of both factor means and MNI, which is often unintended in FMM. To summarize, the benefit of FMM in accounting for measurement errors and MNI comes with increased model complexity, which might explain the prevalence of LPA with composite or factor scores.

However, despite the prevalence of LPA, it remains unknown how robust LPA is to MNI across latent profiles when, unlike in FMM, measurement models are omitted. The goal of this study was to shed light on this issue by investigating the impact of ignoring MNI on class enumeration and parameter estimates of LPA when MNI was left unmodeled by using composite or factor scores as profile indicators. A secondary aim was to examine the performance of FMM (LPA with measurement models) when MNI was modeled, which serves as a benchmark by which the robustness of LPA to MNI can be evaluated.

Latent Profile Analysis

LPA identifies a latent categorical variable based on responses to a set of continuous indicators (Vermunt & Magidson, 2002). Figure 1a represents an example of LPA where continuous responses to Y1, Y2, Y3, and Y4 define C, the latent categorical variable that represents distinct profiles or classes. The purpose of LPA is to identify the number of latent profiles and categorize individuals into profiles. The commonly applied LPA can be written as

Figure 1. (a) Latent profile analysis with four continuous indicators, Y1, Y2, Y3, and Y4. C represents the latent categorical or class variable. (b) Factor mixture modeling with four latent factors, F1, F2, F3, and F4, each measured by five continuous items. For simplicity, error terms associated with Y1 to Y4 are not shown in (a); error terms associated with X1 to X20 and measurement models for F2 and F3 are not shown in (b).

$$f(y_i \mid \theta) = \sum_{k=1}^{K} \pi_k \, f_k(y_i \mid \theta_k), \tag{1}$$

where $y_i$ is a vector of responses to the profile indicators for individual $i$, $\theta$ represents the model parameters, $k$ indexes profiles ($k = 1, 2, \ldots, K$), $\pi_k$ is the probability of belonging to profile $k$, and $f_k$ is a profile-specific normal density function with profile-specific mean vector and variance–covariance matrix, $\theta_k = (\mu_k, \Sigma_k)$. The distribution of $y_i$ conditional on the model parameters, $f(y_i \mid \theta)$, is thus a mixture of the $f_k$ weighted by the $\pi_k$. LPA generally makes two assumptions. First, $\Sigma_k = \Sigma$; that is, the variance–covariance matrices are assumed to be equal across profiles, so the mean vectors $\mu_k$ distinguish the profiles. Second, the variance–covariance matrices are diagonal; that is, the covariances among indicators are fixed to zero, so conditional on an individual's profile membership, the indicators within each profile are independent.

Given that the optimal number of profiles is often unknown, models with varying numbers of profiles (e.g., one-, two-, and three-profile models) are compared, and the final model is selected based on statistical fit and substantive interpretability. When comparing model fit, likelihood-based tests can be used, including the Lo–Mendell–Rubin test (LMR; Lo et al., 2001), the adjusted LMR (aLMR; Lo et al., 2001), and the bootstrap likelihood ratio test (McLachlan & Peel, 2000). In these tests, the model with K profiles fits better than the model with K − 1 profiles if the p value is statistically significant (i.e., p < .05). Information criteria (ICs) are also commonly used in model comparisons, including the Akaike information criterion (AIC; Akaike, 1974), consistent AIC (cAIC; Bozdogan, 1987), Bayesian information criterion (BIC; Schwarz, 1978), and sample size–adjusted BIC (saBIC; Sclove, 1987). The model with the smallest IC value fits best.
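To make the enumeration procedure concrete, the following minimal sketch (ours, for illustration; the study itself used Mplus) fits one- to three-profile models with scikit-learn's GaussianMixture and compares ICs on toy data. Note that scikit-learn reports AIC and BIC but implements neither LMR/aLMR nor the tied-diagonal covariance structure of classical LPA; covariance_type="diag" allows profile-specific diagonal covariances, so this only approximates the LPA assumptions above.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Toy data: two latent profiles, four continuous indicators,
# a mean difference of 1.0 on every indicator.
rng = np.random.default_rng(42)
y = np.vstack([rng.normal(0.0, 1.0, size=(250, 4)),
               rng.normal(1.0, 1.0, size=(250, 4))])

# Fit one- to three-profile models and collect information criteria.
for k in (1, 2, 3):
    gmm = GaussianMixture(n_components=k, covariance_type="diag",
                          n_init=20, random_state=0).fit(y)
    print(f"K={k}: AIC={gmm.aic(y):9.1f}  BIC={gmm.bic(y):9.1f}")
# The model with the smallest IC value is preferred (here, typically K=2).
```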

Factor Mixture Modeling

As an alternative to using composite or factor scores as profile indicators, a measurement model can be incorporated for each dimension in LPA, which yields factor mixture modeling or FMM (see Figure 1b, where Y1 to Y4 are replaced by the latent factors F1 to F4 and their corresponding measurement models). In this sense, FMM combines CFA and LPA.

For each factor, the CFA part of FMM can be represented as

$$Y_{ik} = \nu_k + \Lambda_k \eta_{ik} + \varepsilon_{ik}, \tag{2}$$

where $Y_{ik}$ is a $P \times 1$ vector of item responses for individual $i$ in profile $k$, with $P$ denoting the number of items; $\nu_k$ and $\varepsilon_{ik}$ are $P \times 1$ vectors of item intercepts and residuals, respectively; $\eta_{ik}$ is an $R \times 1$ vector of factor scores, with $R$ denoting the number of factors; and $\Lambda_k$ is a $P \times R$ matrix of factor loadings. The subscript $k$ on the matrix and vectors indicates that they are profile specific. Equation (2) states that an individual's response to an item is a function of the item intercept, the factor loadings, the individual's factor scores, and the residual. Residuals are assumed to be normally distributed within profile with a mean of zero, and $\Theta_k$ is the covariance matrix of the residuals. Residual covariances are often fixed to zero, so only the diagonal elements of $\Theta_k$, the residual variances, are estimated. A normal distribution is also assumed for the factor scores, with $\alpha_k$ representing the vector of factor means and $\Psi_k$ the covariance matrix of the factors.
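As a concrete illustration of Equation (2), the short numpy sketch below generates item responses for a single profile with one factor measured by five items. The loadings are illustrative values within the .5 to .8 range used later in this study, not the authors' exact generation code.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000
lam = np.array([0.5, 0.6, 0.7, 0.8, 0.6])  # Lambda_k: loadings, illustrative
nu = np.zeros(5)                            # nu_k: item intercepts
theta = 1.0 - lam**2                        # residual variances so total item variance = 1

eta = rng.normal(0.0, 1.0, size=n)                      # factor scores (alpha_k = 0, Psi_k = 1)
eps = rng.normal(0.0, np.sqrt(theta), size=(n, 5))      # residuals (Theta_k diagonal)
Y = nu + np.outer(eta, lam) + eps                       # Equation (2)
print(Y.mean(axis=0).round(2), Y.var(axis=0).round(2))  # item means ~ 0, variances ~ 1
```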

In FMM, equality constraints on the measurement parameters can be imposed across profiles, which indicates MI. There are three common levels of MI (Lubke & Muthén, 2005; Meredith, 1993). In a configural invariance model, the same factor structure applies across profiles, but intercepts and factor loadings are free to vary across profiles. The metric (or weak) invariance model requires that factor loadings be constrained equal across profiles while intercepts are freely estimated (i.e., $\Lambda_k = \Lambda$). The scalar (or strong) invariance model additionally constrains the intercepts to be equal across profiles (i.e., $\Lambda_k = \Lambda$ and $\nu_k = \nu$), which is the prerequisite for valid factor mean comparisons across profiles. The factor means are often freely estimated, given that the distinction of profiles based on factor means is of focal interest in most FMM applications (e.g., Allan et al., 2014; Lazarides et al., 2020; Warren et al., 2020).

To account for potential MNI across profiles in FMM specifications, MI needs to be tested by constructing and comparing the fit of configural, metric, and scalar invariance models. Because the number of profiles is often unknown while testing MI, the level of MI and the number of profiles can be tested simultaneously. For example, a series of models is compared based on the ICs: one-profile, two-profile configural, two-profile metric, two-profile scalar, three-profile configural, three-profile metric, three-profile scalar, and so on. The model with the smallest IC value is selected, which identifies both the number of profiles and the level of MI. If scalar invariance is established, factor means can be meaningfully interpreted and compared across profiles, which is often the focus of substantive research using FMM.
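This simultaneous search amounts to a simple grid comparison, as the sketch below shows. It assumes a hypothetical fit_fmm(data, n_profiles, invariance) helper that fits one FMM (in practice, a wrapper around an Mplus run) and returns its BIC; no such function exists in a standard Python library.

```python
# Hypothetical helper assumed: fit_fmm(data, n_profiles, invariance) -> BIC,
# where each call would in practice run one Mplus model.
def enumerate_fmm(data, fit_fmm, max_profiles=3):
    candidates = {("1-profile", "n/a"): fit_fmm(data, 1, "scalar")}
    for k in range(2, max_profiles + 1):
        for level in ("configural", "metric", "scalar"):
            candidates[(f"{k}-profile", level)] = fit_fmm(data, k, level)
    # Smallest BIC wins, identifying K and the MI level in one comparison.
    return min(candidates, key=candidates.get)
```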

The Impact of Measurement Noninvariance in Mixture Models

Although FMM includes the measurement model and thus takes into account MNI and measurement error, LPA has been widely adopted in the applied literature, with composite scores or factor scores created for the dimensions and used as profile indicators. In LPA, MNI across profiles is not addressed. This section reviews relevant literature on the impact of ignoring MNI in mixture models. Nylund-Gibson and Masyn (2016) examined the effect of ignoring MNI related to observed covariates in LCA on class enumeration, from the standpoint of misspecifying the covariate effect. That is, intercept MNI was generated through the covariate effect on the latent class indicator(s) and was ignored by not specifying that covariate effect. They observed overextraction of latent classes when the covariate was included in LCA but its effect on the latent class indicators was not specified (i.e., MNI was not taken into account); when the covariate was not included in LCA, however, class enumeration was accurate. E. S. Kim and Wang (2017) examined the impact of using mean composite scores in growth mixture modeling via Monte Carlo simulation. They observed that although the correct number of latent classes was still identified, the magnitude and direction of bias in the intercept and slope factor means were strongly associated with the magnitude and direction of MNI: when the size of MNI doubled, the size of bias also doubled, and positive bias was observed when positive MNI was ignored. In addition, MNI in intercepts was associated with bias in the intercept factor mean, and MNI in factor loadings with bias in the slope factor mean.

Previous simulation studies have examined the impact of ignoring MNI within the context of LPA (Cole, 2017; Olivera-Aguilar & Rikoon, 2018). However, the MNI was defined as different LPA model parameters across observed groups, not latent profiles. For instance, Cole (2017) simulated direct covariate effects on profile indicators so that the covariate was the source of MNI. Parameter estimates and profile membership were biased when this MNI was not taken into account (i.e., when only the covariate effect on the latent profile membership was included). Olivera-Aguilar and Rikoon (2018) examined MNI in the multiple-group LPA setting, where noninvariance was simulated by generating unequal indicator means across observed groups. They observed large relative bias in indicator means with medium and large magnitudes of noninvariance and a large percentage of noninvariant items (4 out of 4). More severe bias was found when the sample sizes across observed groups were unbalanced.

Purposes of the Study

With MNI in LPA operationalized as noninvariant parameters across observed groups in previous studies, this study shifted the focus to noninvariance across latent profiles and aimed to investigate the impact of such noninvariance on class enumeration and parameter estimates in LPA when the noninvariance was unmodeled by using composite scores and factor scores as profile indicators. Note that the performance of FMM (LPA with measurement models) when noninvariance was correctly modeled would serve as a point of reference by which the robustness of LPA can be evaluated.

Method

Data Generation and Simulation Factors

Data were generated based on FMM with four factors, each measured by five continuous and normally distributed items. The number of profiles was fixed at two. For Profile 1, factor loadings ranged from .5 to .8, intercepts were all fixed to zero, and item residual variances ranged from .36 to .75 so that the total item variances were all 1 (residual variance + squared factor loading); factor means were all fixed to 0 with variances of 1. For Profile 2, factor variances were all set to 1, while factor means and measurement parameters varied depending on the simulation factors discussed in detail below.

Factor Mean Difference (.50, 1.00)

The four factors of Profile 2 had a mean of .50 or 1.00, thus creating a standardized factor mean difference of .50 or 1.00 across profiles, which is consistent with simulation studies on LPA and FMM (e.g., Collier & Leite, 2017; Morgan et al., 2017; Peugh & Fan, 2013; Tein et al., 2013). In addition, the selected standardized factor mean differences were commonly observed in applied research using LPA or FMM (e.g., de Oliveira Corrêa et al., 2020; Grove et al., 2015; Kramer et al., 2016).

Mixing Proportions (50/50, 80/20)

Proportions of the two profiles were manipulated to be equal (50/50) or unequal (80/20). The rationale for including both proportions is twofold: (1) to represent applied research using LPA (e.g., Akkerman et al., 2020) and (2) to examine whether the unequal proportions were associated with worse performance of LPA when composite scores and factor scores were used under MNI.

Location of MNI (No, Loading, Intercept)

We simulated data with MI, MNI in loadings, and MNI in intercepts. Under MI, measurement parameters of Profile 2 were exactly the same as those of Profile 1, and factor means alone distinguished the profiles. When loading MNI was generated, intercepts were all equal across profiles; when intercept MNI was generated, loadings were all equal across profiles. MNI in loadings and MNI in intercepts were generated separately in order to disentangle the impact of ignoring each location of MNI on the performance of LPA.

Size of MNI (Small, Large)

For small MNI conditions, loadings of the noninvariant items decreased by .15 and intercepts increased by .25 in Profile 2. For large MNI conditions, difference in loadings and intercepts increased to .40 and .50, respectively, for noninvariant items. These sizes of MNI were consistent with previous methodological studies (e.g., E. S. Kim et al., 2017; Maij-de Meij et al., 2010; Stark et al., 2006; Wang et al., 2020).

Percent of MNI (20, 40)

For 20% MNI conditions, there were a total of four noninvariant items out of 20 items. Specifically, one item from each measurement model (the 5th, 10th, 15th, and 20th items) had MNI. When the amount of MNI was 40%, two items from each measurement model (the 4th, 5th, 9th, 10th, 14th, 15th, 19th, and 20th items) had MNI, which resulted in a total of eight noninvariant items. These two percentages, corresponding to a small and large number of noninvariant items, are aligned with both applied and methodological studies on MI testing or detection of noninvariant items (e.g., Cho & Cohen, 2010; Davidov et al., 2012; Jak et al., 2013; Maij-de Meij et al., 2010; Wang et al., 2020).

Sample Size (250, 500, or 1,000)

Levels of sample size were chosen to represent a relatively wide range of sample sizes encountered in applied research using LPA and FMM (e.g., Akkerman et al., 2020; Lazarides et al., 2020; McCaslin et al., 2019).

With those six simulation factors, there were 108 conditions in total. For each condition, 500 replications were generated using Mplus 8.3 (Muthén & Muthén, 1998-2017). Note that in data generation, factor covariances were fixed to zero, consistent with the underlying LPA assumption that indicators are independent conditional on profile membership. The impact of ignoring MNI in LPA could therefore be examined free of the potential confounding effect of ignoring factor covariances when composite scores and factor scores are used.
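For readers who wish to reproduce the setup outside Mplus, the following numpy sketch generates one replication under one example condition (N = 500, 80/20 proportions, factor mean difference of 1.00, large intercept MNI on items 5, 10, 15, and 20). The specific loading values are illustrative choices within the stated .5 to .8 range, not the authors' exact population values.

```python
import numpy as np

rng = np.random.default_rng(1)
n, prop2 = 500, 0.20                           # sample size; Profile 2 proportion (80/20)

lam = np.tile([0.5, 0.6, 0.7, 0.8, 0.6], 4)    # loadings for 20 items, illustrative
Lambda = np.zeros((20, 4))
for f in range(4):                             # simple structure: 5 items per factor
    Lambda[f * 5:(f + 1) * 5, f] = lam[f * 5:(f + 1) * 5]
theta = 1.0 - lam**2                           # residual variances so total variance = 1

nu1 = np.zeros(20)                             # Profile 1 intercepts
nu2 = nu1.copy()
nu2[[4, 9, 14, 19]] += 0.50                    # large intercept MNI on items 5, 10, 15, 20

g = rng.random(n) < prop2                      # profile membership (True = Profile 2)
alpha = np.where(g[:, None], 1.0, 0.0)         # factor means: 1.00 vs 0 on all four factors
eta = alpha + rng.normal(size=(n, 4))          # factor scores; covariances fixed at zero
eps = rng.normal(0.0, np.sqrt(theta), size=(n, 20))
Y = np.where(g[:, None], nu2, nu1) + eta @ Lambda.T + eps   # Equation (2), per profile
```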

Analytical Models

Three models were fitted to each replication using Mplus 8.3: (1) LPA that used composite scores as profile indicators (LPA-composite), (2) LPA that used factor scores as profile indicators (LPA-fscore), and (3) FMM that matched the data generation model, with measurement parameters freely estimated across profiles for the noninvariant items only (FMM-correct). For the LPA-composite model, the composite score for each factor was created by averaging the responses across all items associated with that factor. For the LPA-fscore model, a CFA was fitted to each replication, and factor scores were saved and used as profile indicators. The performance of FMM-correct served as a baseline against which the performance of LPA-composite and LPA-fscore can be evaluated when MNI was ignored. For each of the three analysis models, one- to three-profile solutions were specified.
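To make the two indicator constructions concrete, the sketch below (ours; the study computed factor scores via CFA in Mplus) builds composites as item means per dimension and factor scores from a one-factor maximum-likelihood model fitted to each dimension's five items using scikit-learn's FactorAnalysis, a close stand-in for a one-factor CFA. Y is assumed to be the item-response matrix from the generation sketch above.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

def composite_scores(Y):
    # Mean of the five items measuring each of the four dimensions.
    return np.column_stack([Y[:, f * 5:(f + 1) * 5].mean(axis=1) for f in range(4)])

def factor_scores(Y):
    # One-factor ML model per dimension; fit_transform returns the factor scores.
    return np.column_stack([
        FactorAnalysis(n_components=1).fit_transform(Y[:, f * 5:(f + 1) * 5]).ravel()
        for f in range(4)
    ])

# Either set of four scores then serves as the profile indicators in LPA.
```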

Simulation Outcomes

The primary outcome of interest was the correct class enumeration rate, defined as the proportion of replications that selected two profiles as the best fitting model. Specifically, the fit of the one-, two-, and three-profile models was compared based on AIC, BIC, saBIC, the LMR test, and the aLMR test. As discussed previously, lower IC values indicate better model fit; LMR and aLMR compare the fit of K and K − 1 profiles, and a p value less than .05 indicates that the model with K profiles fits better. Entropy was not used to aid model selection because its unreliable performance has been documented in the literature (e.g., E. S. Kim et al., 2016; Tein et al., 2013). The bootstrap likelihood ratio test was excluded because its long execution time would have prevented completion of the simulations within a reasonable timeframe (Nylund et al., 2007). In addition to correct enumeration rates, we report underextraction or overextraction when two profiles were not recovered. Nonconvergence was checked prior to class enumeration and barely occurred across conditions and models (only one or two replications out of 500 failed to converge for a few conditions); nonconverged replications were discarded in class enumeration.

Secondary simulation outcomes were relative bias and mean square error (MSE) in the mean difference of profile indicators, computed for the two-profile model among replications in which the correct number of profiles was extracted. Relative bias for each indicator was calculated as (|Profile 1 indicator mean across replications − Profile 2 indicator mean across replications| − population factor mean difference)/population factor mean difference.1 Note that the indicator here refers to the composite score for LPA-composite, the factor score for LPA-fscore, and the factor for FMM-correct. Relative bias and MSE were computed for each indicator and then averaged across the four indicators. To ensure that the examination of relative bias and MSE rested on a relatively large number of replications, they were examined for LPA-composite, LPA-fscore, and FMM-correct only under conditions in which correct class enumeration rates were relatively high, as discussed in the Results section. In addition, inadmissible solutions (e.g., negative factor variances, zero class proportions, negative item residual variances) were checked, and relative bias and MSE were computed only for converged replications with admissible solutions. Relative bias greater than .05 was considered severe (Hoogland & Boomsma, 1998).
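As a minimal sketch of this computation (our illustration, not the authors' code), assume diff_hat is a hypothetical replications × 4 array of estimated Profile 2 − Profile 1 indicator mean differences from converged replications with admissible solutions:

```python
import numpy as np

def relative_bias_and_mse(diff_hat, true_diff=1.0):
    """diff_hat: (replications x 4) signed estimates of the Profile 2 - Profile 1
    indicator mean difference; true_diff: population factor mean difference."""
    # Relative bias per indicator, per the formula above.
    rebias = (np.abs(diff_hat.mean(axis=0)) - true_diff) / true_diff
    # MSE per indicator: mean squared deviation from the true difference.
    mse = ((diff_hat - true_diff) ** 2).mean(axis=0)
    return rebias.mean(), mse.mean()  # averaged across the four indicators
```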

Results

Class Enumeration

MI Conditions

Under MI conditions, only factor means distinguished the two simulated latent profiles. Table 1 presents the correct class enumeration rates by simulation condition for each analysis model based on AIC, BIC, saBIC, LMR, and aLMR. Correct class enumeration rates were similar across LPA-composite, LPA-fscore, and FMM-correct, with rates slightly higher for FMM-correct (e.g., .94, .94, and .96 for the three models, respectively, for saBIC under sample size 500, unequal proportions, and a factor mean difference of 1.00). Correct enumeration rates depended on the magnitude of the factor mean difference (indicative of class separation), mixing proportions, and sample size: larger factor mean difference, equal proportions, and larger sample size were associated with higher correct enumeration rates.

Table 1.

Correct Class Enumeration Rates Under Measurement Invariance Conditions.

N      Prop   ES    | LPA-composite                | LPA-fscore                   | FMM-correct
                    | AIC   BIC   saBIC LMR  aLMR  | AIC   BIC   saBIC LMR  aLMR  | AIC   BIC   saBIC LMR  aLMR
250    50/50  1.00  | .65   .94   .80   .82  .82   | .65   .94   .79   .83  .84   | .73   .94   .84   .82  .75
              .50   | .34   .01   .31   .16  .15   | .34   .01   .31   .16  .15   | .35   .01   .28   .02  .00
       80/20  1.00  | .61   .54   .76   .60  .59   | .59   .57   .74   .64  .63   | .78   .57   .86   .46  .36
              .50   | .30   .00   .25   .13  .11   | .29   .00   .25   .10  .10   | .28   .00   .21   .01  .01
500    50/50  1.00  | .66   1.00  .94   .88  .89   | .62   1.00  .94   .88  .89   | .76   1.00  .96   .97  .98
              .50   | .50   .01   .32   .23  .21   | .53   .01   .32   .24  .22   | .57   .01   .32   .12  .07
       80/20  1.00  | .66   .94   .93   .87  .87   | .66   .95   .95   .87  .87   | .82   .95   .96   .90  .88
              .50   | .36   .00   .17   .16  .14   | .33   .00   .15   .16  .15   | .36   .00   .15   .05  .03
1,000  50/50  1.00  | .62   1.00  .99   .87  .87   | .64   1.00  .99   .89  .89   | .77   1.00  .99   .95  .96
              .50   | .59   .05   .52   .39  .38   | .59   .06   .54   .38  .37   | .71   .06   .53   .31  .25
       80/20  1.00  | .64   1.00  .99   .87  .88   | .64   1.00  .98   .87  .87   | .77   1.00  .99   .94  .95
              .50   | .45   .00   .20   .23  .21   | .47   .00   .19   .23  .22   | .52   .00   .18   .14  .10

Note. N = sample size; Prop = mixing proportions; ES = effect size (factor mean difference); AIC = Akaike information criterion; BIC = Bayesian information criterion; saBIC = sample size–adjusted BIC; LMR = Lo–Mendell–Rubin Test; aLMR = adjusted LMR.

saBIC showed superior performance in identifying the two profiles with unequal proportions and/or small factor mean difference (i.e., .50), followed by LMR/aLMR and BIC. However, when equal proportions were paired with large factor mean difference, BIC outperformed saBIC, LMR, and aLMR. The discrepancy between BIC and saBIC became smaller as sample size increased such that the two ICs were comparable with sample size 1,000. Correct enumeration rates of LMR and aLMR were almost identical. AIC did not perform as well as other model selection criteria with one exception: when factor mean difference was small. When the two profiles were not detected, the one-profile model was selected across conditions, analysis models, and model selection criteria.

MNI Conditions

Similar to the findings under MI conditions, saBIC generally outperformed AIC, BIC, LMR, and aLMR across conditions. Therefore, class enumeration rates based on saBIC were reported for MNI conditions, whereas results for other model selection criteria are presented in Supplemental Material (see Supplementary Tables S1-S3; available online). Specifically, Figures 2 and 3 present the correct enumeration rates by condition for each analysis model when there was loading MNI and intercept MNI, respectively.

Figure 2. Correct class enumeration rates (saBIC) under factor loading noninvariance. saBIC = sample size–adjusted Bayesian information criterion.

Figure 3. Correct class enumeration rates (saBIC) under intercept noninvariance. saBIC = sample size–adjusted Bayesian information criterion.

Some common patterns were observed across loading and intercept MNI conditions. LPA-composite and LPA-fscore had comparable correct enumeration rates across conditions, although LPA-fscore had slightly higher rates (i.e., .02 higher on average across all MNI conditions). Overall, neither LPA model performed as well as FMM-correct. However, the discrepancy in correct enumeration rates between the models depended on the simulation factors: FMM-correct showed substantially higher correct enumeration rates than the LPA models when the size of MNI was large rather than small. For example, under large factor mean difference, equal proportions, and 40% loading MNI, the correct enumeration rates of FMM-correct were .30 and .02 higher than those of LPA with large and small MNI, respectively. The discrepancy between FMM-correct and the LPA models was particularly large with unequal proportions and small factor mean difference.

Similar to the finding under MI conditions, larger factor mean difference, equal proportions, and larger sample size were associated with higher correct enumeration rates across analysis models. When the correct number of profiles was not identified, saBIC supported the one-profile model with small factor mean difference and the three-profile model with large factor mean difference. Such overextraction occurred more frequently for LPA models than FMM-correct.

Two major differences between loading and intercept MNI conditions were noticed. First, loading MNI conditions were associated with larger discrepancies in enumeration rates between the LPA models and FMM than intercept MNI conditions, especially when the magnitude of MNI was large. Second, the performance of FMM-correct was consistently better than the LPA models across intercept MNI conditions, whereas the LPA models outperformed FMM-correct under some loading MNI conditions, that is, when a small factor mean difference was paired with a small size of MNI. Higher correct enumeration rates for the LPA models than FMM-correct were particularly notable with 40% small MNI and sample sizes of 500 and 1,000; under these conditions, FMM-correct tended to overextract the number of profiles.

Relative Bias and MSE of Indicator Mean Difference

Relative bias and MSE of the indicator mean difference were examined and reported for LPA-composite, LPA-fscore, and FMM-correct for the two-profile model under conditions with a large factor mean difference (1.00), given the relatively high correct enumeration rates under these conditions. Only replications in which the correct number of profiles was identified with admissible solutions were taken into account when computing relative bias and MSE. Inadmissible solutions were rare, occurring only for FMM-correct and in up to 16 replications across conditions.

Table 2 presents relative bias and MSE across analysis models. Under MI conditions (see the bottom panel of Table 2), both LPA-composite and LPA-fscore yielded larger relative bias than FMM-correct, although the magnitude of bias was relatively small across all three models. Specifically, the LPA models tended to underestimate the indicator mean difference by up to 7%, whereas FMM-correct overestimated the factor mean difference by only 5% at the small sample size (250) and recovered it accurately as sample size increased to 500 and 1,000. MSE was comparable between LPA-composite and FMM-correct, both smaller than under LPA-fscore; smaller sample size and unequal proportions were associated with slightly greater MSE, although the differences in MSE across conditions were negligible.

Table 2.

Relative Bias and MSE of Indicator Mean Difference for Two-Profile Models Under Large Factor Mean Difference.

%MNI  Size  N      Prop   | Loading MNI                           | Intercept MNI
                          | Rebias            MSE                 | Rebias            MSE
                          | LPA-c LPA-f FMM   LPA-c LPA-f FMM     | LPA-c LPA-f FMM   LPA-c LPA-f FMM
20    LG    250    50/50  | −.09  −.07  .03   .15   .19   .11     | .07   .12   .02   .05   .10   .11
                   80/20  | −.16  −.15  .02   .23   .27   .13     | .07   .12   .02   .08   .11   .12
            500    50/50  | −.12  −.10  .01   .14   .18   .10     | .06   .11   .01   .04   .09   .10
                   80/20  | −.21  −.20  .00   .19   .25   .11     | .06   .10   .01   .05   .10   .10
            1,000  50/50  | −.13  −.11  .01   .13   .18   .09     | .05   .10   .00   .04   .09   .09
                   80/20  | −.23  −.21  .00   .18   .24   .10     | .05   .10   .00   .04   .09   .10
      SM    250    50/50  | −.06  −.06  .04   .12   .18   .12     | .01   .04   .03   .08   .13   .11
                   80/20  | −.09  −.09  .03   .17   .23   .16     | .01   .04   .03   .10   .15   .13
            500    50/50  | −.09  −.08  .02   .11   .17   .10     | .00   .03   .01   .06   .12   .10
                   80/20  | −.13  −.13  .00   .13   .20   .12     | .01   .02   .01   .08   .13   .11
            1,000  50/50  | −.10  −.09  .01   .10   .16   .09     | .01   .02   .01   .06   .12   .09
                   80/20  | −.13  −.13  .00   .12   .19   .10     | .01   .02   .00   .06   .12   .10
40    LG    250    50/50  | −.10  −.11  .01   .19   .23   .11     | .19   .30   .01   .02   .06   .11
                   80/20  | −.23  −.23  .02   .31   .37   .13     | .19   .30   .01   .04   .06   .12
            500    50/50  | −.15  −.15  .01   .18   .22   .10     | .19   .30   .01   .01   .05   .10
                   80/20  | −.31  −.32  .01   .30   .37   .11     | .18   .29   .00   .02   .05   .10
            1,000  50/50  | −.16  −.16  .00   .17   .21   .10     | .18   .29   .00   .01   .05   .09
                   80/20  | −.36  −.37  .00   .29   .37   .10     | .18   .28   .00   .01   .04   .10
      SM    250    50/50  | −.08  −.08  .04   .14   .20   .12     | .07   .12   .01   .05   .10   .11
                   80/20  | −.13  −.14  .02   .21   .27   .18     | .07   .12   .01   .08   .11   .13
            500    50/50  | −.11  −.11  .02   .12   .19   .10     | .06   .11   .01   .04   .09   .10
                   80/20  | −.18  −.19  .00   .17   .25   .13     | .06   .11   .01   .05   .10   .11
            1,000  50/50  | −.12  −.12  .01   .12   .18   .10     | .05   .11   .00   .04   .09   .10
                   80/20  | −.19  −.20  .01   .15   .24   .11     | .05   .10   .00   .04   .09   .10
MI    NO    250    50/50  | .04   .04   .05   .10   .16   .11     |
                   80/20  | .04   .03   .05   .14   .19   .14     |
            500    50/50  | −.06  −.06  .02   .09   .15   .10     |
                   80/20  | −.07  −.06  .02   .10   .17   .11     |
            1,000  50/50  | −.07  −.07  .01   .08   .15   .09     |
                   80/20  | −.07  −.07  .01   .09   .16   .10     |

Note. Relative bias over 5% is in boldface. MNI = measurement noninvariance; N = sample size; Prop = mixing proportions; Rebias = relative bias; MSE = mean square error; LPA-c = LPA-composite; LPA-f = LPA-fscore; FMM = FMM-correct; LG = large size of MNI (.40 for loading MNI and .50 for intercept MNI); SM = small size of MNI (.15 for loading MNI and .25 for intercept MNI); NO = no MNI. Under MI (no MNI) conditions, a single set of relative bias and MSE values is reported because the loading/intercept distinction does not apply.

For loading MNI conditions (see the top left panel of Table 2), LPA-composite and LPA-fscore had almost identical relative bias; both severely underestimated the indicator mean difference. More severe bias was observed with more noninvariant items, a larger size of MNI, and unequal proportions. By contrast, minimal bias (less than 5%) was found for FMM-correct across loading MNI conditions. MSE was largest under LPA-fscore, followed by LPA-composite and FMM-correct. In addition to smaller sample size and unequal proportions, more noninvariant items and a larger size of MNI yielded larger MSE for the LPA models.

For intercept MNI conditions (see the top right panel of Table 2), LPA-fscore had more severe relative bias than LPA-composite: LPA-fscore and LPA-composite overestimated the indicator mean difference by 14% and 8% on average, respectively. Relative bias was most severe with eight noninvariant items and a large size of MNI, and was minimal (less than 5%) when four noninvariant items were coupled with a small size of MNI. No severe bias was found for FMM-correct. LPA-composite had the smallest MSE compared with LPA-fscore and FMM-correct. For each model, there was less variation in MSE across conditions than under loading MNI conditions.

Discussion

Although constructs are often measured by a set of items, it is common practice to create composite scores or factor scores for the constructs and use them as profile indicators in LPA; in other words, measurement models are not included. To the best of our knowledge, this simulation study is the first to examine the impact of measurement noninvariance (MNI) across latent profiles on the performance of LPA when composite or factor scores are used. The primary outcome was correct class enumeration rates; secondary outcomes were relative bias and MSE of indicator mean differences. Major findings are summarized and discussed as follows.

Class enumeration of LPA-composite and LPA-fscore depended on the amount of MNI that was ignored. Specifically, the LPA models had correct enumeration rates comparable to FMM-correct with a smaller magnitude of MNI (.15 and .25 for loadings and intercepts, respectively). By contrast, under larger MNI (.40 and .50 for loadings and intercepts, respectively), correct enumeration rates were lower than those of FMM-correct, consistent with our expectation given the larger degree of model misspecification when MNI was ignored in the LPA models. Another explanation is that larger MNI enhanced the performance of FMM-correct by increasing class separation, especially with a small factor mean difference, leading to larger discrepancies between FMM-correct and the LPA models. The magnitude of MNI appeared to have a greater impact on the performance of the LPA models than the percentage of noninvariant items.

Larger factor mean difference, larger sample size, and equal proportions can help mitigate the negative impact of ignoring MNI when composite or factor scores are used in LPA. This can be explained by the fact that these conditions generally lead to substantial improvement in class enumeration across models, including LPA that ignores MNI across profiles (Dias, 2004; E. S. Kim et al., 2016; Lubke & Neale, 2006, 2008; Wang et al., 2020).

saBIC was more reliable in identifying the correct number of profiles than the other ICs and likelihood-based tests examined in this study. Although BIC, LMR, and aLMR performed well in class enumeration, saBIC yielded higher correct enumeration rates with unequal proportions and small factor mean difference, which is consistent with the findings of previous simulation studies (e.g., Henson et al., 2007; E. S. Kim et al., 2016; Wang et al., 2020). When saBIC did not identify the correct number of profiles in LPA, profiles were overextracted when the factor mean difference was large and underextracted when it was small. AIC tended to overextract the number of profiles, which has been consistently documented in methodological studies (Cho & Cohen, 2010; Henson et al., 2007; Nylund et al., 2007; Wang et al., 2020).

Overall, severe bias in the indicator mean difference was observed for the LPA models when MNI was ignored, and the direction and size of the bias were related to the direction and size of MNI (E. S. Kim & Wang, 2017). In this study, negative MNI was generated for loadings and positive MNI for intercepts (i.e., the profile with the higher factor mean had lower loadings or higher intercepts for noninvariant items). Accordingly, the indicator mean difference tended to be underestimated with loading MNI but overestimated with intercept MNI. For both locations of MNI, larger relative bias was observed with more noninvariant items and a larger size of MNI, as expected given that a larger amount of MNI was ignored (E. S. Kim & Wang, 2017; Olivera-Aguilar & Rikoon, 2018).

It is worth noting that the bias in indicator mean difference under the LPA models came from two sources: ignoring measurement errors and ignoring MNI. The bias under MI conditions can serve as a baseline (in which only measurement errors were ignored) for evaluating the additional impact of ignoring MNI in LPA. Slight bias was observed for the LPA models under MI conditions because they did not, or only partially, take measurement errors into account, which attenuates the indicator mean difference. On top of this small negative bias, ignoring loading MNI introduced additional negative bias, resulting in the severe negative bias in indicator mean difference for the LPA models. By contrast, ignoring intercept MNI led to positive bias, which cancelled out the small negative bias due to ignoring measurement errors. When the positive bias was also small (i.e., when four noninvariant items were coupled with a small size of MNI), the two opposite directions of bias almost perfectly counterbalanced each other, which could explain the minimal relative bias (less than 5%) under these conditions. With more noninvariant items and/or a larger size of MNI, the positive bias became dominant.

In addition, the direction of bias in indicator mean difference may explain the differential impact of ignoring loading versus intercept MNI on class enumeration. With severe negative bias in indicator mean difference under loading MNI, the separation between profiles was attenuated, making it more challenging to correctly identify two profiles. On the contrary, with positive intercept MNI and thus positive bias in indicator mean difference, class enumeration was not affected as negatively as under loading MNI. To confirm this explanation, a few additional conditions were run with intercept MNI generated in the negative direction in the population. As expected, we observed negative bias in indicator mean difference, and correct enumeration rates were much lower than in the original positive MNI conditions.

We noticed both consistency and inconsistency in the performance of LPA-composite and LPA-fscore. LPA-fscore performed slightly better than LPA-composite in class enumeration, which is expected given that measurement errors were partially taken into account in LPA-fscore but completely ignored in LPA-composite (Meyer & Morin, 2016; Morin & Marsh, 2015; Morin et al., 2016). However, the improvement in correct enumeration rates for LPA-fscore was minimal, because measurement error was not severe in this study (loadings ranged from .5 to .8 in data generation). LPA-composite and LPA-fscore had comparable relative bias in indicator mean differences under MI and loading MNI conditions, but bias was more severe under LPA-fscore for intercept MNI conditions. LPA-fscore also had larger MSE across conditions, indicating lower precision in the estimation of indicator mean differences.

Limitations and Future Directions

First, this study assumed zero factor covariances to remove the potential confounding effect of ignoring factor covariances when examining the impact of ignoring MNI on LPA. However, in applied research, where a decision to adopt LPA or FMM has to be made, it is possible that factor covariances are present in FMM or that indicators are correlated conditional on profile membership in LPA. Thus, it would be worthwhile to investigate the performance of LPA with composite or factor scores when factor covariances are ignored. Note that in this case FMM allows for the estimation of factor covariances, which might lead to better performance than LPA.

Second, a caveat in conducting FMM or LPA with composite or factor scores is potential model misspecification. This study assumed that composite or factor scores were created for each dimension under a correctly specified model, except that MNI across profiles was ignored. However, misspecification in creating composite or factor scores might occur in applied research; examples include the omission of cross-loadings or correlated residuals and the misspecification of the factor structure (e.g., a two-factor model should be used instead of the hypothesized one-factor model). Future research is needed to investigate the impact of those misspecifications on the performance of LPA. Relatedly, although this study showed the benefit of a correct FMM specification that matched the data generation model over LPA with composite or factor scores, future research is warranted to investigate the impact of potential model misspecifications on the performance of FMM.

Practical Implications

Despite the limitations discussed above, this study provides several implications for applied researchers. First, we recommend FMM over LPA with composite or factor scores given that FMM, when correctly specified, had higher correct enumeration rates and more accurate estimates of indicator mean difference than LPA models across conditions. Thus, efforts should be directed to the FMM model specification by carefully evaluating the factor structure based on substantive knowledge and psychometric analyses and identifying noninvariant parameters across profiles or classes by fitting and comparing different specifications (Clark et al., 2013; Wang et al., 2020).

Second, LPA with composite or factor scores can be considered as an alternative to FMM if it is impossible to run FMM (e.g., model nonconvergence or a very small sample size), but it should be kept in mind that the LPA models only work well under certain conditions. In particular, applied researchers can consider LPA models when the size of MNI is expected to be small and/or class separation is expected to be large. Given the unobserved nature of profiles, researchers might not be able to hypothesize the size of MNI or class separation before conducting LPA or FMM analyses. However, efforts can be made to gain a better understanding of these factors that are important for the reliable performance of LPA models; for example, researchers can rely on substantive theories or the extant literature to get a better sense of what profiles would emerge and how distinct they would be. Note also that even when small MNI and/or large class separation is expected, LPA should be considered as an alternative to FMM only when the research focus is on classification rather than mean comparisons, because indicator mean differences could be biased.

Third, applied researchers should be cognizant of the choice between composite scores and factor scores in LPA: LPA with factor scores had slightly higher class enumeration rates than LPA with composite scores, but the latter had smaller bias and MSE in indicator mean difference. Finally, we recommend a large sample size (at least 500), which can help mitigate the negative impact of ignoring MNI on class enumeration when LPA is used, and the use of saBIC over AIC, BIC, LMR, and aLMR for class enumeration, given its reliable performance across simulation conditions.

Supplemental Material

sj-docx-1-epm-10.1177_0013164421997896 – Supplemental material for Robustness of Latent Profile Analysis to Measurement Noninvariance Between Profiles


1. The value of the population factor mean difference was 1 across analysis models. We are aware of the option to rescale the population factor mean difference for LPA-composite to help disentangle the two sources of bias, ignoring measurement errors and ignoring MNI. However, these two sources cannot be easily disentangled in LPA-fscore, and therefore, to make sure results for LPA-composite and LPA-fscore are comparable, we kept the same value of the population factor mean difference across analysis models.

Footnotes

Declaration of Conflicting Interests: The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding: The authors received no financial support for the research, authorship, and/or publication of this article.

Supplemental Material: Supplemental material for this article is available online.

References

1. Akaike H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19(6), 716-723. 10.1109/TAC.1974.1100705
2. Akkerman D. M., Vulperhorst J. P., Akkerman S. F. (2020). A developmental extension to the multidimensional structure of interests. Journal of Educational Psychology, 112(1), 183-203. 10.1037/edu0000361
3. Allan N. P., Raines A. M., Capron D. W., Norr A. M., Zvolensky M. J., Schmidt N. B. (2014). Identification of anxiety sensitivity classes and clinical cut-scores in a sample of adult smokers: Results from a factor mixture model. Journal of Anxiety Disorders, 28(7), 696-703. 10.1016/j.janxdis.2014.07.006
4. Bozdogan H. (1987). Model selection and Akaike's information criterion (AIC): The general theory and its analytical extensions. Psychometrika, 52(3), 345-370. 10.1007/BF02294361
5. Cho S., Cohen A. S. (2010). A multilevel mixture IRT model with an application to DIF. Journal of Educational and Behavioral Statistics, 35(3), 336-370. 10.3102/1076998609353111
6. Clark S. L., Muthén B., Kaprio J., D'Onofrio B. M., Viken R., Rose R. J. (2013). Models and strategies for factor mixture analysis: An example concerning the structure underlying psychological disorders. Structural Equation Modeling, 20(4), 681-703. 10.1080/10705511.2013.824786
7. Cole V. T. (2017). Adapting latent profile analysis to take into account measurement noninvariance. Multivariate Behavioral Research, 52(1), 115. 10.1080/00273171.2016.1264288
8. Collier Z. K., Leite W. L. (2017). A comparison of three-step approaches for auxiliary variables in latent class and latent profile analysis. Structural Equation Modeling, 24(6), 819-830. 10.1080/10705511.2017.1365304
9. Davidov E., Dülmer H., Schlüter E., Schmidt P., Meuleman B. (2012). Using a multilevel structural equation modeling approach to explain cross-cultural measurement noninvariance. Journal of Cross-Cultural Psychology, 43(4), 558-575. 10.1177/0022022112438397
10. de Oliveira Corrêa A., Brown E. C., Lee T. K., Mejía-Trujillo J., Pérez-Gómez A., Eisenberg N. (2020). Assessing community readiness for preventing youth substance use in Colombia: A latent profile analysis. International Journal of Mental Health and Addiction, 18(2), 368-381. 10.1007/s11469-019-00191-1
11. Dias J. G. (2004). Finite mixture models: Review, applications, and computer-intensive methods [Unpublished doctoral dissertation]. Research School Systems, Organization and Management (SOM), University of Groningen, Netherlands.
12. Grove R., Baillie A., Allison C., Baron-Cohen S., Hoekstra R. A. (2015). Exploring the quantitative nature of empathy, systemising and autistic traits using factor mixture modeling. British Journal of Psychiatry, 207(5), 400-406. 10.1192/bjp.bp.114.155101
13. Henson J. M., Reise S. P., Kim K. H. (2007). Detecting mixtures from structural model differences using latent variable mixture modeling: A comparison of relative model fit statistics. Structural Equation Modeling, 14(2), 202-226. 10.1080/10705510709336744
14. Hoogland J. J., Boomsma A. (1998). Robustness studies in covariance structure modeling: An overview and a meta-analysis. Sociological Methods & Research, 26(3), 329-367. 10.1177/0049124198026003003
15. Jak S., Oort F. J., Dolan C. V. (2013). A test for cluster bias: Detecting violations of measurement invariance across clusters in multilevel data. Structural Equation Modeling, 20(2), 265-282. 10.1080/10705511.2013.769392
16. Kam C., Morin A. J. S., Meyer J. P., Topolnytsky L. (2016). Are commitment profiles stable and predictable? A latent transition analysis. Journal of Management, 42(6), 1462-1490. 10.1177/0149206313503010
17. Kim E. S., Cao C., Wang Y., Nguyen D. T. (2017). Measurement invariance testing with many groups: A comparison of five approaches. Structural Equation Modeling, 24(4), 524-544. 10.1080/10705511.2017.1304822
18. Kim E. S., Joo S.-H., Lee P., Wang Y., Stark S. (2016). Measurement invariance testing across between-level latent classes using multilevel factor mixture modeling. Structural Equation Modeling, 23(6), 870-887. 10.1080/10705511.2016.1196108
19. Kim E. S., Wang Y. (2017). Class enumeration and parameter recovery of growth mixture modeling and second-order growth mixture modeling in the presence of measurement noninvariance between latent classes. Frontiers in Psychology, 8, Article 1499. 10.3389/fpsyg.2017.01499
20. Kim E. S., Willson V. L. (2014). Measurement invariance across groups in latent growth modeling. Structural Equation Modeling, 21(3), 408-424. 10.1080/10705511.2014.915374
21. Kim S. Y., Chen S., Hou Y., Zeiders K. H., Calzada E. J. (2019). Parental socialization profiles in Mexican-origin families: Considering cultural socialization and general parenting practices. Cultural Diversity and Ethnic Minority Psychology, 25(3), 439-450. 10.1037/cdp0000234
22. Kramer M. D., Arbisi P. A., Thuras P. D., Krueger R. F., Erbes C. R., Polusny M. A. (2016). The class-dimensional structure of PTSD before and after deployment to Iraq: Evidence from direct comparisons of dimensional, categorical, and hybrid models. Journal of Anxiety Disorders, 36, 1-9. 10.1016/j.janxdis.2016.02.004
23. Lazarides R., Dicke A.-L., Rubach C., Eccles J. S. (2020). Profiles of motivational beliefs in math: Exploring their development, relations to student-perceived classroom characteristics, and impact on future career aspirations and choices. Journal of Educational Psychology, 112(1), 70-92. 10.1037/edu0000368
24. Lo Y., Mendell N. R., Rubin D. B. (2001). Testing the number of components in a normal mixture. Biometrika, 88(3), 767-778. 10.1093/biomet/88.3.767
25. Lubke G., Neale M. (2006). Distinguishing between latent classes and continuous factors: Resolution by maximum likelihood? Multivariate Behavioral Research, 41(4), 499-532. 10.1207/s15327906mbr4104_4
26. Lubke G., Neale M. (2008). Distinguishing between latent classes and continuous factors with categorical outcomes: Class invariance of parameters of factor mixture models. Multivariate Behavioral Research, 43(4), 592-620. 10.1080/00273170802490673
27. Lubke G. H., Muthén B. (2005). Investigating population heterogeneity with factor mixture models. Psychological Methods, 10(1), 21-39. 10.1037/1082-989X.10.1.21
28. Luningham J. M., McArtor D. B., Bartels M., Boomsma D. I., Lubke G. H. (2017). Sum scores in twin growth curve models: Practicality versus bias. Behavior Genetics, 47(5), 516-536. 10.1007/s10519-017-9864-0
29. Maij-de Meij A. M., Kelderman H., Van der Flier H. (2010). Improvement in detection of differential item functioning using a mixture item response theory model. Multivariate Behavioral Research, 45(6), 975-999. 10.1080/00273171.2010.533047
30. McCaslin S. E., Cloitre M., Neylan T. C., Garvert D. W., Herbst E., Marmar C. (2019). Factors associated with high functioning despite distress in post-9/11 veterans. Rehabilitation Psychology, 64(3), 377-382. 10.1037/rep0000271
31. McLachlan G., Peel D. (2000). Finite mixture models. Wiley. 10.1002/0471721182
32. Meredith W. (1993). Measurement invariance, factor analysis and factorial invariance. Psychometrika, 58(4), 525-543. 10.1007/BF02294825
33. Meyer J. P., Morin A. J. S. (2016). A person-centered approach to commitment research: Theory, research and methodology. Journal of Organizational Behavior, 37(4), 584-612. 10.1002/job.2085
34. Morgan G. B., Hodge K. J., Baggett A. R. (2017). Latent profile analysis with nonnormal mixtures: A Monte Carlo examination of model selection using fit indices. Computational Statistics and Data Analysis, 93, 146-161. 10.1016/j.csda.2015.02.019
35. Morin A. J., Marsh H. W. (2015). Disentangling shape from level effects in person-centered analyses: An illustration based on university teachers' multidimensional profiles of effectiveness. Structural Equation Modeling, 22(1), 39-59. 10.1080/10705511.2014.919825
36. Morin A. J., Meyer J. P., Creusier J., Biétry F. (2016). Multiple-group analysis of similarity in latent profile solutions. Organizational Research Methods, 19(2), 231-254. 10.1177/1094428115621148
37. Muthén L. K., Muthén B. O. (1998-2017). Mplus user's guide (8th ed.). Muthén & Muthén.
38. Neale M. C., Lubke G., Aggen S. H., Dolan C. V. (2005). Problems with using sum scores for estimating variance components: Contamination and measurement noninvariance. Twin Research and Human Genetics, 8(6), 553-568. 10.1375/twin.8.6.553
39. Nylund K. L., Asparouhov T., Muthén B. O. (2007). Deciding on the number of classes in latent class analysis and growth mixture modeling: A Monte Carlo simulation study. Structural Equation Modeling, 14(4), 535-569. 10.1080/10705510701575396
40. Nylund-Gibson K., Masyn K. E. (2016). Covariates and mixture modeling: Results of a simulation study exploring the impact of misspecified effects on class enumeration. Structural Equation Modeling, 23(6), 782-797. 10.1080/10705511.2016.1221313
41. Olivera-Aguilar M., Rikoon S. H. (2018). Assessing measurement invariance in multiple-group latent profile analysis. Structural Equation Modeling, 25(3), 439-452. 10.1080/10705511.2017.1408015
42. Peugh J., Fan X. (2013). Modeling unobserved heterogeneity using latent profile analysis: A Monte Carlo simulation. Structural Equation Modeling, 20(4), 616-639. 10.1080/10705511.2013.824780
43. Schwarz G. (1978). Estimating the dimension of a model. Annals of Statistics, 6(2), 461-464. 10.1214/aos/1176344136
44. Sclove S. L. (1987). Application of model-selection criteria to some problems in multivariate analysis. Psychometrika, 52(3), 333-343. 10.1007/BF02294360
45. Stark S., Chernyshenko O. S., Drasgow F. (2006). Detecting differential item functioning with confirmatory factor analysis and item response theory: Toward a unified strategy. Journal of Applied Psychology, 91(6), 1292-1306. 10.1037/0021-9010.91.6.1292
46. Tein J.-Y., Coxe S., Cham H. (2013). Statistical power to detect the correct number of classes in latent profile analysis. Structural Equation Modeling, 20(4), 640-657. 10.1080/10705511.2013.824781
47. Vermunt J. K., Magidson J. (2002). Latent class cluster analysis. In Hagenaars J. A., McCutcheon A. L. (Eds.), Applied latent class analysis (pp. 89-106). Cambridge University Press. 10.1017/CBO9780511499531.004
48. Wang Y., Kim E., Ferron J. M., Dedrick R. F., Tan T. X., Stark S. (2020). Testing measurement invariance across unobserved groups: The role of covariates in factor mixture modeling. Educational and Psychological Measurement. Advance online publication. 10.1177/0013164420925122
49. Warren C. M., Kechter A., Christodoulou G., Cappelli C., Pentz M. A. (2020). Psychosocial factors and multiple health risk behaviors among early adolescents: A latent profile analysis. Journal of Behavioral Medicine. Advance online publication. 10.1007/s10865-020-00154-1
50. Wirth R. J. (2008). The effects of measurement non-invariance on parameter estimation in latent growth models (Publication No. AAI 3331053) [Doctoral dissertation, University of North Carolina at Chapel Hill]. ProQuest Dissertations and Theses.
51. Zhang Q., Yang Y. (2020). Autoregressive mediation models using composite scores and latent variables: Comparisons and recommendations. Psychological Methods. Advance online publication. 10.1037/met0000251
