Journal of Applied Statistics
2024 Apr 3;51(15):3005–3038. doi: 10.1080/02664763.2024.2335568

Bayesian factor selection in a hybrid approach to confirmatory factor analysis

Junyu Nie 1, Jihnhee Yu 1
PMCID: PMC11536624  PMID: 39507209

Abstract

To investigate latent structures of measured variables, various factor structures are used for confirmatory factor analysis, including higher-order models and more flexible bifactor models. In practice, measured variables may also have relatively small or moderate non-zero loadings on multiple group factors, which form cross loadings. The selection of correct and ‘identifiable’ latent structures is important to evaluate the impact of constructs of interest in the confirmatory factor analysis model. Herein, we first discuss the identifiability condition that allows several cross loadings in models with underlying bifactor structures. Then, we implement Bayesian variable selection allowing cross loadings on bifactor structures using the spike and slab prior. Our approaches evaluate the inclusion probability for all group factor loadings and utilize known underlying structural information, making our approaches not entirely exploratory.

Through a Monte Carlo study, we demonstrate that our methods can provide more accurately identified results than other available methods. For the application, the SF-12 version 2 scale, a self-report health-related quality of life survey, is used. The model selected by our proposed methods is more parsimonious and has a better fit index compared to other models, including the ridge prior selection and the strict bifactor model.

KEYWORDS: Bayesian factor selection, bifactor model, cross loadings, model identifiability, spike and slab prior

1. Introduction

Confirmatory factor analysis (CFA) is a type of structural equation modeling that deals specifically with measurement models, that is, the relationships between observed measures/indicators (e.g. test items, test scores, behavioral observation ratings) and latent constructs [8]. CFA differs from exploratory factor analysis (EFA) in that it starts with a robust theoretical foundation, whereby many factor loadings are fixed to zero. This approach results in a reduction of parameters to estimate in CFA, leading to smaller standard errors and the potential for more precise estimations compared to EFA [31]. However, prior knowledge may not be available, and specifying oversimplified factor loading structures may lead to biased estimates of the parameters in factor variance [3]. For instance, forcing non-zero cross loadings on minor group factors to zero substantially inflates the estimates of the factor correlations in CFA models [3]. Thus, identification of correct latent structures is important to evaluate the impact of constructs of interest and to interpret the CFA model. To overcome excessively restrictive CFAs, hybrid strategies that combine confirmatory and exploratory factor analysis, such as exploratory structural equation modeling (ESEM) and Bayesian structural equation modeling, may be adopted [54].

In this paper, we adopt a Bayesian variable selection technique, the spike and slab shrinkage method, in the context of a hybrid approach to CFA and compare our approach with other existing procedures such as different prior choices and post-processing methods using factor rotations [41]. The spike and slab prior is suitable when the researcher has strong prior knowledge that some variables have exactly zero coefficients, which makes it well suited to CFA modeling. Commonly, the spike and slab method is based on the calculation of a certain probability as a function of existing data. In application to factor analyses, a major difficulty of using this technique is that latent factors are not observed, unlike in more traditional regressions. Our method is a hybrid in the sense that some pre-identified bifactor structures are incorporated into factor loading identification. We note that Lu, Chow and Loken [31] explore the spike and slab technique on candidate cross loadings based on a distribution for each item in the factor analysis. Their method shows an improvement in factor variable selection over other Bayesian variable selections and traditional variable selections such as forward and backward selections. In this paper, we derive a different formulation of inclusion probabilities in which the inclusion of each factor loading is evaluated based on a marginal conditional distribution. Additionally, our proposed approach is designed to handle cases in which the majority of group factor loadings are not fixed, rather than only a few cross loadings in the model. Our approach not only simplifies the calculation of inclusion probabilities but also improves variable selection performance compared to existing methods, as we show.

Various factor structures are used for CFA, including the independent clusters model of CFA (ICM-CFA), higher-order models, and hierarchical factor models. As a more flexible alternative to these models, a bifactor measurement model [22] can be used to explain hierarchical latent structures of multidimensional variables. The bifactor measurement model is a hybrid model that includes a general factor on which all items (indicators) load and one or more uncorrelated group factors on which subsets of items load. In a typical higher-order model, each group-level factor loads onto the general factor (second-order factor). Thus, the higher-order model restricts factor structures, where the shared variance between the group-level factors is caused by a general factor [52]. For this reason, the second-order factor model is nested in the bifactor model, and the models are equivalent when a condition called proportionality is met [4,34,38]. When a bifactor model is used for CFA, the cross loadings are typically fixed at zero values, which may prevent nonidentification (i.e. a nonunique model) due to approximate linear dependencies between the general factor and the group factors [55]. A simplified model may prevent nonconvergence, but CFA models may include large cross loadings [32]. To this end, it is important to discuss the model identifiability, or uniqueness of the model, when we alter basic bifactor models and include multiple cross loadings.

The outline of this paper is as follows. In Section 2, we review the existing literature on variable selection methods, including Bayesian variable selection with different priors. In Section 3, we introduce the Bayesian factor model and present a set of results regarding the identifiability of models with cross loadings. These findings provide us with ample flexibility to explore various cross loadings and justify the final model in terms of its identifiability. In addition, we propose a novel approach using the spike and slab shrinkage method to assess the inclusion probability of each factor loading based on the marginal conditional distribution. In Section 4, we conduct a Monte Carlo study to evaluate the performance of the proposed methods using various scenarios, including the number of items, factors, and the number of cross loadings. We investigate the percentage of correctly identified models and compare our results with those obtained using other existing methods. In Section 5, we apply the developed methods to analyze the Short Form-12 version 2 (SF-12v2), a self-report quality of life questionnaire. Finally, Section 6 presents the concluding remarks summarizing the key findings of our research.

2. Existing methods of latent variable selection

2.1. Classical variable selection

Variable selection has been a major topic in the statistical literature on regression. Classical variable selection procedures, such as best subset selection and sequential searches including forward selection, backward elimination, and stepwise methods, can be used for the factor analysis [17]. Stepwise methods [13] gained popularity in epidemiology and other fields for their computational simplicity [45]. However, the stepwise algorithm is generally ineffective at excluding anomalous variables from the factor model, and ‘the poor selection accuracy of the stepwise approach suggests that it should be avoided’ [21].

Model comparison criteria such as the Akaike information criterion (AIC) [1] and the Bayesian information criterion (BIC) [46] can be used for the factor analysis. The AIC only provides information about the quality of a model relative to other models and does not provide information on the absolute quality of a model. With a small sample size relative to a large number of parameters/variables, the AIC often selects models with too many variables. The BIC often chooses models that are more parsimonious than the AIC, as the BIC penalizes larger models through the larger penalty term inherent in its formula [40]. These methods assess models estimated by the maximum likelihood procedure and are unable to provide suitable regularization parameters to regulate overfitting [19].
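As a concrete reminder of the two criteria, their standard formulas can be written as small helper functions. This is an illustrative sketch (the function names are ours); smaller values indicate a preferred model, and the comparison of penalty terms mirrors the discussion above.

```python
import math

def aic(loglik, k):
    """Akaike information criterion: 2k - 2*log-likelihood,
    where k is the number of free parameters."""
    return 2 * k - 2 * loglik

def bic(loglik, k, n):
    """Bayesian information criterion: k*log(n) - 2*log-likelihood.
    Its per-parameter penalty log(n) exceeds the AIC's penalty of 2
    whenever n > e^2 (about 7.39), so the BIC favors smaller models
    at all practical sample sizes."""
    return k * math.log(n) - 2 * loglik
```

Comparing two fitted factor models then amounts to evaluating each function at the models' maximized log-likelihoods and choosing the smaller value.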

Commonly, factor structures are evaluated using fit indexes. Evaluating factor structures with fit indexes is an iterative process that involves exploring different model combinations to select the final candidate models. The model chi-square index directly uses the ratio of the maximum likelihood functions, which is assumed to have a chi-square distribution. In practice, the null hypothesis is often rejected due to the chi-square’s heightened sensitivity under large samples. To resolve this problem, approximate fit indexes that are not based on accepting or rejecting the null hypothesis have been developed. Approximate fit indexes can be further classified into absolute and incremental (or relative) fit indexes. An example of an incremental fit index is the CFI (comparative fit index) [6]. An example of an absolute fit index is the root-mean-square error of approximation (RMSEA). The RMSEA [48] measures the discrepancy due to the approximation per degree of freedom. The RMSEA tends to reward complex models with high degrees of freedom and large samples. It also tends to penalize simpler structural equation models when estimated with fewer variables analyzed at smaller sample sizes [29,37,53]. In contrast, the CFI is an index of ‘good fit’ ranging from 0 to 1, which quantifies the proportional improvement in structural equation model fit over a ‘null’ model [6,7,29]. It is less influenced by sample size and penalizes non-parsimonious models. However, the ‘null’ comparison model for the CFI may not be supported by the data, and it may produce an unrealistic extreme value for comparison that gives rise to excessively generous assessments of model fit [43]. Researchers using structural equation models often provide multiple fit index values, such as the chi-square test statistic, the CFI, the RMSEA, and others. However, these fit indexes may leave readers to assess the strength of the claims based on subjective preference. Papers often report fit indexes, and authors characterize their findings based on personal opinions of whether recommended cut-point criteria are met [43]. Absent any additional definitive criteria, adequate cut-point criteria for fit indexes (e.g. [24]) can become matters more of semantic subjectivity than absolute validity [5,43].
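For reference, the two approximate fit indexes discussed above can be computed from the model and baseline chi-square statistics. The sketch below uses one common form of each formula (function names are ours); other parameterizations exist in the literature.

```python
import math

def rmsea(chi2, df, n):
    """RMSEA: discrepancy due to approximation per degree of freedom,
    computed as sqrt(max(chi2 - df, 0) / (df * (n - 1)))."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

def cfi(chi2_m, df_m, chi2_0, df_0):
    """CFI: proportional improvement of the target model (subscript m)
    over the 'null' baseline model (subscript 0), bounded in [0, 1]."""
    d_m = max(chi2_m - df_m, 0.0)          # noncentrality of the target model
    d_0 = max(chi2_0 - df_0, d_m)          # noncentrality of the baseline
    return 1.0 - d_m / d_0 if d_0 > 0 else 1.0
```

A model whose chi-square equals its degrees of freedom yields RMSEA = 0 and CFI = 1, the best possible values under these formulas.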

2.2. Bayesian factor analysis with different priors

Regularization methods, such as Bayesian model selection, offer automated and data-driven model selection, which reduces subjectivity and is relatively free of personal preference. These methods can assist proper model selection in CFA by carefully regulating the loss functions/log-likelihood functions to achieve sparse coefficient estimates. By defining the penalty functions as prior distributions, it is feasible to combine established regularization techniques with Bayesian models. Consequently, these Bayesian regularization methods can determine the distributions of unknown parameter values and exclude irrelevant parameters from the model structure. The Bayesian statistical approach has gained widespread recognition due to its ability to quantify uncertainty transparently and its generality across different kinds of simple and complex analyses. Moreover, the availability of computationally sophisticated Markov chain Monte Carlo (MCMC) sampling algorithms in the statistics and machine-learning communities has made Bayesian analysis highly feasible.

The ridge [20] and Lasso [49] penalties are two well-known regularization techniques that employ the L2-norm and L1-norm, respectively. For Bayesian model selection, the prior choice analogous to ridge regression is the Gaussian distribution, p(β | σ²) ∝ exp(−(1/(2σ²)) Σ_{j=1}^p β_j²), where β is the coefficient vector of a model and σ² is a variance parameter with prior distribution p(σ²). The prior choice of the Laplace distribution, p(β | σ²) ∝ exp(−(1/(2σ²)) Σ_{j=1}^p |β_j|), regulates variable selection similarly to the Lasso regression. Powerful Bayesian computational techniques can be employed for a wide range of models, including parameter constraints and penalty functions that often lead to optimization difficulties in the frequentist framework but may be handled with relative ease in the Bayesian framework. For example, sampling from the posterior distribution provides more information for estimation and inference that is difficult to carry out for the Lasso regression [9].
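To make the correspondence between penalties and priors concrete, the two log priors (up to additive normalizing constants) can be sketched as follows. This is an illustration of the penalty forms only, with hypothetical function names; it is not code from the paper.

```python
import numpy as np

def log_ridge_prior(beta, sigma2):
    """Gaussian (ridge-type) log prior, up to an additive constant:
    -(1/(2*sigma2)) * sum(beta_j^2). Quadratic in each coefficient."""
    beta = np.asarray(beta, dtype=float)
    return -np.sum(beta ** 2) / (2.0 * sigma2)

def log_laplace_prior(beta, sigma2):
    """Laplace (Lasso-type) log prior, up to an additive constant:
    -(1/(2*sigma2)) * sum(|beta_j|). Its kink at zero is what pushes
    posterior modes of small coefficients to exactly zero."""
    beta = np.asarray(beta, dtype=float)
    return -np.sum(np.abs(beta)) / (2.0 * sigma2)
```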

For the latent variable selections in the factor analysis framework, Muthén and Asparouhov [39] use Bayesian variable selection in which the ridge prior is used to shrink the cross loading elements in factor loadings toward zero. In their approach, they replace parameter specifications of exact zeros with approximate zeros based on the ridge prior. The proposed Bayesian approach is particularly beneficial in applications in which parameters are added to a conventional model leading to a non-identified model when using maximum likelihood methods. Additionally, Jacobucci, Brandmaier and Kievit [26] show that regularized structural equation modeling with Lasso prior outperforms non-Bayesian traditional structural equation modeling in situations with a large number of predictors and a small sample size.

Mixture priors such as spike and slab components have been used extensively for variable selection (e.g. [16,25,35]). The spike component, which concentrates its mass at values close to zero, allows shrinkage of small effects to zero, whereas the slab component has its mass spread over a wide range of plausible values for the regression coefficients [33]. For latent variable selections, Lu, Chow and Loken [31] explore variable selection problems in Bayesian factor analysis, where the spike and slab prior (SSP) is used to model the prior probability of nonzero factor loadings. In their approach, the ridge prior and SSP are used together, where the SSP is used to select a few cross loadings that are strongly associated with each factor. We note that Lu, Chow and Loken [31] focus the SSP on a few suspicious factor loadings and have not evaluated its performance in the setting of the bifactor model. Let R = [r_{jk}] indicate a p×q matrix (p: the number of items, q: the number of factors), where the r_{jk}'s are latent binary variables that take the value 1 or 0 to indicate selecting or not selecting factor loadings. In Lu, Chow and Loken's [31] approach, promising factor loadings to be estimated are predicted by high posterior inclusion probabilities that are obtained based on a multinomial distribution of (r_{j1}, …, r_{jq}).

A recent study by Papastamoulis and Ntzoufras [41] introduces a post-processing approach to identify correct latent factor loading structures addressing the issue of invariance in Bayesian factor analytic models caused by orthogonal transformations. The proposed method involves a two-stage process that uses a varimax rotation to achieve a simple structure of factor loadings, followed by a correction for sign and permutation invariance. While the method can be used for identifying the factor loading structures, it primarily identifies orthogonal factor structures without a general factor. Therefore, the method may not be suitable for the bifactor model identification.

Most of these existing techniques target the identification of cross loadings. In our paper, we investigate a new Bayesian approach in the SSP framework to identify general confirmatory model structures, as shown in the next section. We also compare our approach to the previously mentioned Bayesian approaches in terms of the correctness of the model identification, as shown in Section 4.

3. Method

In this section, we begin by discussing the general setting of Bayesian factor models. Then, we discuss one crucial issue in model searching, i.e. model identifiability. We provide some results related to this issue for models with cross loadings as variations of the bifactor model. These results offer us flexibility in model searching and help justify the final model's identifiability. Then, we explore a spike and slab approach to the inclusion probability, which evaluates each factor loading's inclusion using a marginal conditional distribution. This method simplifies calculations and improves variable selection performance while incorporating the known bifactor structures of some items.

3.1. Bayesian factor model

For Bayesian models considering latent variables, Lee [30] provides a thorough explanation for obtaining posterior distributions of latent variable quantities in Bayesian models, and we rely on Lee's work as our primary source for notation and methodology. Let X denote a p-variate random vector, Y = X − E(X), and V(Y) = Σ. Let N denote the sample size, p the number of manifest variables (or items) in the data, and q the number of latent variables or factors. A factor model is described as

Y_i = Λf_i + ϵ_i,  i = 1, 2, …, N, (1)

where Y_i is a p×1 observed random vector, Λ is a p×q factor loading matrix, f_i is a q×1 vector of factor scores, and ϵ_i is a p×1 random vector of error measurements which is independent of f_i. The error vector ϵ_i is normally distributed as N(0, Ψ), where Ψ is a diagonal matrix, and f_i is distributed as N(0, Φ) with some positive definite covariance matrix Φ. The quantities f_i and ϵ_i are independent. In the frequentist approach, Φ is often taken to be the identity matrix, with variances of 1. We note that some constraints are necessary to identify the model in Equation (1), as discussed in Section 3.2. In standard Bayesian factor analysis, conjugate priors are used for the model parameters. Let ψ_j and Λ_j be the j-th diagonal element of Ψ and the vector consisting of the elements of the j-th row of Λ, respectively. We assume

f_i ∼ N(0, Φ), Φ ∼ IW_q(R_0, ρ_0), Λ_j | ψ_j ∼ N(Λ_{0j}, ψ_j H_{0j}), ψ_j^{−1} ∼ Gamma(α_{0j}, β_{0j}),

where Gamma(α, β) represents the gamma distribution with shape parameter α and inverse scale parameter β, and IW_q denotes a q-dimensional inverse Wishart distribution. Scalars such as α_{0j}, β_{0j}, Λ_{0j}, ρ_0 and the positive definite matrices H_{0j} and R_0 are hyperparameters whose values are frequently considered non-informative or presumed to be derived from prior knowledge obtained through previous studies. For noninformative priors, we use the following commonly used hyperparameters [30,31,44],

α_{0j} = 2, β_{0j} = 1, H_{0j} = I_q, R_0^{−1} = I_q(p − q − 1), ρ_0 = p, Λ_{0j} = 0.

The conditional posterior distribution of fi given (yi,Λ,Ψ,Φ) is equal to

(f_i | y_i, Λ, Ψ, Φ) ∼ N[(Φ^{−1} + Λ^T Ψ^{−1} Λ)^{−1} Λ^T Ψ^{−1} y_i, (Φ^{−1} + Λ^T Ψ^{−1} Λ)^{−1}]. (2)

Let Y = (y_1, …, y_N) be a p×N data matrix and Y_j^T be the j-th row of Y (j = 1, …, p). Let A_j = (H_{0j}^{−1} + FF^T)^{−1}, a_j = A_j(H_{0j}^{−1} Λ_{0j} + F Y_j), and β_j = β_{0j} + (1/2)(Y_j^T Y_j − a_j^T A_j^{−1} a_j + Λ_{0j}^T H_{0j}^{−1} Λ_{0j}), where F = (f_1, f_2, …, f_N). The conditional posterior distributions of (Λ_j, ψ_j^{−1}) and Φ given Y and F are expressed as

(ψ_j^{−1} | Y, F) ∼ Gamma(N/2 + α_{0j}, β_j), (Λ_j | Y, F, ψ_j^{−1}) ∼ N(a_j, ψ_j A_j), (Φ | Y, F) ∼ IW_q[(FF^T + R_0^{−1}), N + ρ_0]. (3)

In each iteration of the Gibbs sampler, we consecutively draw values from the conditional posterior distributions of the parameters. We repeat this process until sufficient values have been drawn and the samples have converged to the posterior distribution.
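The sampling scheme above can be sketched as follows. This is a minimal illustration under the noninformative hyperparameters listed earlier, not the authors' implementation; in particular, it imposes none of the identifiability constraints of Section 3.2, so the drawn loadings are only determined up to rotation, and the function name is ours.

```python
import numpy as np

def gibbs_factor_model(Y, q, n_iter=500, seed=0):
    """Gibbs sampler for the factor model (1), cycling through the
    conditional posteriors (2) and (3). Y is the p x N data matrix."""
    rng = np.random.default_rng(seed)
    p, N = Y.shape
    # noninformative hyperparameters from the text
    alpha0, beta0 = 2.0, 1.0
    H0_inv = np.eye(q)                    # H_0j = I_q
    Lam0 = np.zeros(q)                    # Lambda_0j = 0
    R0_inv = np.eye(q) * (p - q - 1)      # R_0^{-1} = I_q (p - q - 1)
    rho0 = p
    Lam, Psi, Phi = 0.1 * rng.normal(size=(p, q)), np.ones(p), np.eye(q)
    for _ in range(n_iter):
        # Equation (2): draw factor scores F | Y, Lam, Psi, Phi
        prec = np.linalg.inv(Phi) + Lam.T @ (Lam / Psi[:, None])
        cov = np.linalg.inv(prec)
        mean = cov @ Lam.T @ (Y / Psi[:, None])            # q x N
        F = mean + np.linalg.cholesky(cov) @ rng.normal(size=(q, N))
        # Equation (3): draw (Lambda_j, psi_j^{-1}) | Y, F for each item j
        A = np.linalg.inv(H0_inv + F @ F.T)                # same A_j for all j here
        for j in range(p):
            a_j = A @ (H0_inv @ Lam0 + F @ Y[j])
            b_j = beta0 + 0.5 * (Y[j] @ Y[j] - a_j @ np.linalg.solve(A, a_j)
                                 + Lam0 @ H0_inv @ Lam0)
            Psi[j] = 1.0 / rng.gamma(N / 2 + alpha0, 1.0 / b_j)
            Lam[j] = rng.multivariate_normal(a_j, Psi[j] * A)
        # Equation (3): draw Phi | F from IW_q(F F^T + R_0^{-1}, N + rho0),
        # using Phi^{-1} ~ Wishart(scale^{-1}, df) with an integer df
        S = np.linalg.inv(F @ F.T + R0_inv)
        X = rng.multivariate_normal(np.zeros(q), S, size=N + rho0)
        Phi = np.linalg.inv(X.T @ X)
    return Lam, Psi, Phi, F
```

In practice one would discard a burn-in portion of the chain and base posterior summaries on the retained draws.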

3.2. Identifiability of Bayesian confirmatory factor model

In CFA model building, model identifiability needs to be satisfied for consistent parameter estimation and a valid model description [14]. If the model is not identifiable, different sets of parameters may produce the same model-implied covariance matrix [14]. In a frequentist analysis, parameters are often set to zero (i.e. a hard constraint) to produce an analysis that better reflects the researcher's theories and prior beliefs. Freeing these parameters would result in a non-identified model in practical applications. In the Bayesian approach, Muthén and Asparouhov [39] replace the hard constraint on cross loadings with small-variance informative priors (i.e. a soft constraint) to represent a prior belief that the cross loadings are close to 0 [31]. To this end, it is important to know how many cross loadings can be incorporated when we search for the model so that the final model is identifiable. In this section, we discuss identifiable cross loading structures, altering basic bifactor models. The results described below provide ample flexibility in the model identification procedure, and the final model structures can be justified by these results.

For the general factor model, a few conditions are required to ensure identification without label-switching and sign-reversal problems. Anderson and Rubin [2] suggest that model identification can be achieved by imposing at least q² restrictions on (Λ, Φ). Howe [23] and, most recently, Peeters [42] provide detailed conditions to ensure rotational invariance of Λ. A common method to achieve identifiability is to set some elements in Λ to fixed known values, e.g. one main loading of each latent factor fixed at 1 as well as q(q − 1) additional cross loadings fixed at 0 at appropriate places [2]. Alternatively, for a CFA model with at least three indicators (items), we can either set the variance of each factor to 1 (the variance standardization method) or set the first loading of each factor to 1 (the marker method [47]). From a Bayesian point of view, this is equivalent to assigning the fixed values to the corresponding parameters with probability one [30].

In our CFA model, the models under consideration have bifactor structures allowing several cross loadings. Fang, Xu, Guo, Ying and Zhang [14] state that, under the standard bifactor model with same-sign factor loadings, the model parameters are identifiable if and only if one of the following conditions is met:

  1. If we have three or more group factors, each group factor needs at least three indicators.

  2. When we have two group factors, each group factor should have a minimum of three indicators, and the set of group factors should be partitioned into two disjoint subsets of items, where each submatrix of factor loadings (consisting of general and group factor loadings) has full column rank.

Same-sign factor loadings are commonly accomplished in constructing a questionnaire by aligning the responses of question items in the same direction, e.g. larger positive responses indicate a stronger relationship with the constructs, while smaller positive values indicate a weaker relationship. In the spirit of the above identifiability conditions, we consider some identifiability conditions for bifactor models modified with a few cross loadings. In the following discussion, we largely adopt the notation of [14].

First, consider the standard bifactor model. Let B = {1, …, p} indicate the set of all item indexes; in this case, the cardinal number of B is |B| = p. Let Λ* = [λ*_0, λ*_1, …, λ*_G] indicate the factor loading matrix, where λ*_i = (λ*_{i1}, …, λ*_{ip})^T, i = 0, 1, …, G. The set of all group factors is indicated by H_1 = {1, …, G}. The first column λ*_0 is the general factor loading, and the other columns indicate the group factor loadings; thus the total number of factors is q = G + 1. For an item index set S ⊂ B and a loading matrix Λ, we define Λ[S] = [λ_0[S], λ_1[S], …, λ_G[S]], where λ_i[S] is the part of λ_i corresponding to the items in S. We define:

  • B_g = {j | λ*_{gj} ≠ 0}, i.e. the items in B = {1, …, p} corresponding to the group factor g (g = 1, …, G). Thus, B_h ∩ B_g = ∅ (h ≠ g) and ∪_{g=1}^G B_g = B.

When the strict bifactor structure is altered so that there are cross loadings on group factors, we express the factor loading matrix as Λ = [λ_0, …, λ_G] with λ_i = (λ_{i1}, …, λ_{ip})^T, without asterisks. The specific variance matrix is expressed as Φ = diag(ϕ_1, …, ϕ_p). We also have the following additional definitions:

  • B_{g,c}: the set of item indexes in B_g that now have cross loadings, i.e. B_{g,c} = {j ∈ B_g | λ_{hj} ≠ 0 for some h ≠ g}, where B_g is defined above based on the standard bifactor model Λ*.

  • B̄_{g,c} = B_g ∖ B_{g,c}, i.e. the items in B_g that retain a strict bifactor structure.

  • H_2 is a set of group factors defined as H_2 = {g | B̄_{g,c} has a partition S_1 and S_2 such that rank(Λ[S_1]) ≥ 2 and rank(Λ[S_2]) ≥ 2, g = 1, …, G}.

Consider a model with two group factors (G = 2 or q = 3). The covariance matrix is expressed as Σ = λ_0λ_0^T + λ_1λ_1^T + λ_2λ_2^T + Φ, and non-identifiability means that λ_0λ_0^T + λ_1λ_1^T + λ_2λ_2^T + Φ = λ̃_0λ̃_0^T + λ̃_1λ̃_1^T + λ̃_2λ̃_2^T + Φ̃ for some λ̃_i ≠ λ_i (i = 0, 1, 2) and Φ̃ ≠ Φ. The following result states the condition for identifiability of two group factor models.

Theorem 1.

Suppose G = 2, |H_1| = 2 (H_1 = {1, 2}), and the model satisfies the following conditions.

(C1) |B_g| ≥ 3, g ∈ {1, 2}.

(C2) |B̄_{g,c}| ≥ 1, g ∈ {1, 2}.

(C3) All elements in Λ have the same sign.

(C4) |H_2| ≥ 1.

(C5) At least one g (∈ {1, 2}) satisfies |B̄_{g,c}| ≥ 3.

(C6) For g in (C5), rank(Λ[B̄_{g,c}]) ≥ 2.

With conditions (C1)–(C6), Λ is identifiable.

The proofs of Theorem 1 and all other results are provided in the Appendix. Theorem 1 states the minimum requirements for identifying a bifactor model altered by cross loadings and comprising two group factors. In addition, if there is a fixed value in the general factor, the conditions can be relaxed further, as shown below.

Corollary 1.

Suppose |H_1| = 2 and one element in λ_0[B̄_{g,c}] is a fixed value for at least one g (∈ H_1). In addition, all conditions except (C4) in Theorem 1 are satisfied. Then, Λ is identifiable.

According to Theorem 1 and Corollary 1, we have a good amount of flexibility in choosing a model that includes cross loadings. Each group factor in the model must correspond to at least three items (C1), and all factor loadings must have the same sign (C3). A strict bifactor structure (i.e. an item is related to only one group factor besides the general factor) is necessary for only 3 items of one group factor (C5) and 1 item of the other factor (C2).
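As an illustration of how conditions (C1)–(C6) might be checked mechanically for a candidate loading pattern, consider the following sketch. It is our own hypothetical helper, not part of the paper; it brute-forces the partition in the definition of H_2, so it is practical only for small item sets, and it treats loadings below a tolerance as structural zeros.

```python
from itertools import combinations
import numpy as np

def satisfies_theorem1(Lam, groups, tol=1e-10):
    """Check conditions (C1)-(C6) of Theorem 1 for a two-group-factor model.
    Lam: p x 3 loading matrix with columns [general, group 1, group 2];
    groups: dict mapping g in {1, 2} to the item index set B_g of the
    underlying standard bifactor pattern."""
    H1 = (1, 2)
    Bc, Bbar = {}, {}
    for g in H1:
        other = 2 if g == 1 else 1
        Bc[g] = {j for j in groups[g] if abs(Lam[j, other]) > tol}  # B_{g,c}
        Bbar[g] = groups[g] - Bc[g]       # strict-bifactor items within B_g
    c1 = all(len(groups[g]) >= 3 for g in H1)
    c2 = all(len(Bbar[g]) >= 1 for g in H1)
    c3 = bool(np.all(Lam >= -tol) or np.all(Lam <= tol))  # same-sign loadings
    def in_H2(g):
        # some partition S1, S2 of Bbar[g] with rank(Lam[S1]), rank(Lam[S2]) >= 2
        items = sorted(Bbar[g])
        for r in range(1, len(items)):
            for S1 in combinations(items, r):
                S2 = [j for j in items if j not in S1]
                if (np.linalg.matrix_rank(Lam[list(S1)]) >= 2
                        and np.linalg.matrix_rank(Lam[S2]) >= 2):
                    return True
        return False
    c4 = any(in_H2(g) for g in H1)                        # |H_2| >= 1
    c5c6 = any(len(Bbar[g]) >= 3
               and np.linalg.matrix_rank(Lam[sorted(Bbar[g])]) >= 2
               for g in H1)                               # (C5) with (C6)
    return c1 and c2 and c3 and c4 and c5c6
```

Such a check could be run on each candidate structure visited during model searching before any estimation is attempted.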

In the following, we state the identifiability conditions for a model with one general factor and more than two group factors, i.e. G ≥ 3.

Theorem 2.

Suppose |H_1| ≥ 3, and the model satisfies the following conditions.

(C1) |B_g| ≥ 3, g ∈ H_1.

(C2) |B̄_{g,c}| ≥ 2 (g ∈ H_1) for at least G − 1 elements in H_1.

(C3) All elements in Λ have the same sign.

(C4) |H_2| ≥ 1.

(C5) At least one g (∈ H_1) satisfies |B̄_{g,c}| ≥ 3.

(C6) For g in (C5), rank(Λ[B̄_{g,c}]) ≥ 2.

With conditions (C1)–(C6), Λ is identifiable.

In addition, if there is a fixed value in the general factor, the conditions shown in Theorem 2 can be relaxed further as follows.

Corollary 2.

Suppose |H_1| ≥ 3 and one element in λ_0[B̄_{g,c}] is a fixed value for at least one g (∈ H_1). In addition, all conditions except (C4) in Theorem 2 are satisfied. Then Λ is identifiable.

Theorem 2 and Corollary 2 again give much flexibility to include cross loadings when there are more than two group factors. The factors of the stated model must each correspond to at least three items (C1), as in the two-factor case, and all factor loadings must have the same sign (C3). Then, according to Corollary 2, only 3 items in one group factor need to have a strict bifactor structure (C5), items in another factor do not need to have a bifactor structure, and the remaining factors need only 2 items each with a strict bifactor structure (C2).

3.3. Implementation of the spike and slab prior

A priori, there are a few predetermined factor loading structures of the model. For example, researchers may consider that some question items have clear bifactor structures, i.e. the items correspond to only one group factor besides the general factor. This prior belief or information needs to be incorporated into the posterior distributions. Applying the spike and slab prior, each factor loading has a discrete mass at point 0 or a continuous distribution based on the posterior distribution. We put the following prior on the cross loadings λ_{jk} (j = 1, …, p, k = 2, …, q),

λ_{jk} ∼ {(1 − r_{jk})δ_0 + r_{jk} N(λ_{0jk}, ψ_j σ²_{λ0jk})} I_{[0,∞)}(λ_{jk}),  r_{jk} ∼ Bernoulli(π_{jk}),

where r_{jk} ∈ {0, 1} is a mixture weight, ψ_j is the j-th diagonal element of the error variance, δ_0 is the point mass function at 0 (the spike), σ²_{λ0jk} is the variance of the slab, and π_{jk} is a hyperparameter between 0 and 1 reflecting a prior probability regarding the selection of the factor loading. The function I_S(λ_{jk}) is the indicator function whose value is 1 if λ_{jk} ∈ S and 0 otherwise. The quantity r_{jk} is a binary latent variable commonly used in the formulation of mixture models, which indicates whether λ_{jk} should be set to zero or freed to deviate from zero. Without strong prior knowledge of the model, values such as λ_{0jk} = 0, σ²_{λ0jk} = 1, and π_{jk} = 0.5 are used. The hyperparameter σ²_{λ0jk} = 1 signifies that the variance is solely controlled by the error variance of the data Y_i, without incorporating any prior knowledge. The value π_{jk} = 0.5 aligns with the common practice of not having a precise understanding of the factor structure concerning cross loadings during model building.
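For intuition, a single draw from this mixture prior can be simulated as below. This is a sketch under the default hyperparameters λ_{0jk} = 0, σ²_{λ0jk} = 1, π_{jk} = 0.5 just described; the truncation to [0, ∞) is handled by simple rejection, and the function name is ours.

```python
import numpy as np

def draw_ssp_loading(pi_jk=0.5, lam0=0.0, psi_j=1.0, sigma2=1.0, rng=None):
    """One draw of a cross loading lambda_jk from the spike and slab prior:
    a point mass at 0 with probability 1 - pi_jk (the spike), otherwise
    N(lam0, psi_j * sigma2) truncated to [0, inf) (the slab)."""
    rng = rng or np.random.default_rng()
    if rng.random() >= pi_jk:           # r_jk = 0: the spike
        return 0.0
    while True:                         # r_jk = 1: rejection-sample the slab
        lam = rng.normal(lam0, np.sqrt(psi_j * sigma2))
        if lam >= 0:
            return lam
```

With π_{jk} = 0.5, about half of the prior draws are exactly zero, which is precisely the mechanism that lets the posterior shrink small cross loadings to zero.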

We define the matrix R = [r_{jk}] = [R_1, …, R_q] = [r_1^T, …, r_p^T]^T, where R_k is the k-th column (k = 1, …, q) of R and r_j is the j-th row (j = 1, …, p) of R. We also define the matrix Π = [π_{jk}] = [Π_1, …, Π_q], where Π_k is the k-th column of Π. We note that the values composing R_1 and Π_1 are always 1, indicating that the general factor is always estimated.

In CFA, we handle known factor loading structures as follows. If a factor loading is determined to be always nonzero a priori, the factor loading is generated based on the prior distribution

λ_{jk} ∼ {(1 − r_{jk})δ_0 + r_{jk} N(λ_{0jk}, ψ_j σ²_{λ0jk})} I_{(0,∞)}(λ_{jk}).

When a factor loading is determined to be always zero a priori, the factor loading is generated based on the prior distribution

λ_{jk} ∼ {(1 − r_{jk})δ_0 + r_{jk} N(λ_{0jk}, ψ_j σ²_{λ0jk})} I_{{0}}(λ_{jk}).

In the following notation, p(·) indicates the density or probability mass function of a random variable. We obtain samples from p(F, Λ, R, Φ, Ψ | Y) using the Gibbs sampler [15]. Starting with initial values {F^(0), Λ^(0), R^(0), Φ^(0), Ψ^(0)}, the sample at the t-th iteration, {F^(t), Λ^(t), R^(t), Φ^(t), Ψ^(t)}, is obtained in the following manner using the full conditional distributions:

Step 1: Generate F^(t) from p(F | Y, Λ^(t−1), Φ^(t−1), Ψ^(t−1)).

Step 2: Generate Λ^(t) and Ψ^(t) from p(Λ, Ψ | Y, F^(t), R^(t−1)).

Step 3: Generate R^(t) from p(R | F^(t), Λ^(t), Ψ^(t), Φ^(t−1)).

Step 4: Generate Φ^(t) from p(Φ | F^(t)).

The conditional distributions used in Steps 1, 2, and 4 are given in Equations (2) and (3). For step 3, we consider the joint conditional distribution of parameters p(Λ,R,Ψ|Y,F),

p(Λ, R, Ψ | Y, F) ∝ p(Y | Λ, Ψ, F) p(Λ | R, Ψ) p(R | Π) p(Ψ) ∝ |2πΨ|^{−N/2} exp[−(1/2) Σ_{i=1}^N (Y_i − Λf_i)^T Ψ^{−1} (Y_i − Λf_i)] × Π_{j=1}^p ψ_j^{−(α_{1j}+1)} exp(−α_{2j}/ψ_j) Π_{j=1}^p Π_{k=1}^q {π_{jk}^{r_{jk}} (1 − π_{jk})^{1−r_{jk}}} × Π_{j=1}^p Π_{k=1}^q (ψ_j σ²_{λ0jk})^{−1/2} exp{−(λ_{jk} − λ_{0jk})²/(2ψ_j σ²_{λ0jk})} I_S(Λ), (4)

where I_S(Λ) indicates predetermined constraints on Λ. Let Ỹ_i = Y_i − (Λ ∘ M_{Ik}) f_i, where M_{Ik} is a matrix that has 0 for the predetermined structure on f_i and for the element (j, k), and 1 for the rest of the elements, and ∘ indicates the Hadamard product. If r_{jk} = 0, based on Equation (4), we have the conditional distribution:

p(Λ_k, R, Ψ | Y, F, r_{jk} = 0) ∝ exp{−(1/2) Σ_{i=1}^N (Ỹ_i − Λ_{kc} f_{ki})^T Ψ^{−1} (Ỹ_i − Λ_{kc} f_{ki})} × Π_{j=1}^p ψ_j^{−(α_{1j}+1)} exp(−α_{2j}/ψ_j) Π_{j=1}^p Π_{k=1}^q {π_{jk}^{r_{jk}} (1 − π_{jk})^{1−r_{jk}}} × Π_{j=1}^p Π_{k=1}^q (ψ_j σ²_{λ0jk})^{−1/2} exp{−(λ_{jk} − λ_{0jk})²/(2ψ_j σ²_{λ0jk})}, (5)

where Λ_{kc} is the factor loading corresponding to f_i complementing the k-th column of M_{Ik}. Also, if r_{jk} = 1, we have

p(Λ_k, R, Ψ | Y, F, r_{jk} = 1) ∝ exp{−(1/2) Σ_{i=1}^N (Ỹ_i − Λ_k* f_{ki})^T Ψ^{−1} (Ỹ_i − Λ_k* f_{ki})} × Π_{j=1}^p ψ_j^{−(α_{1j}+1)} exp(−α_{2j}/ψ_j) Π_{j=1}^p Π_{k=1}^q {π_{jk}^{r_{jk}} (1 − π_{jk})^{1−r_{jk}}} × Π_{j=1}^p Π_{k=1}^q (ψ_j σ²_{λ0jk})^{−1/2} exp{−(λ_{jk} − λ_{0jk})²/(2ψ_j σ²_{λ0jk})}, (6)

where Λ_k* is the same as Λ_{kc} except that λ_{jk} is the value estimated in the previous iteration. After integrating out Λ in Equations (5) and (6), we have the posterior distribution of r_{jk} as a Bernoulli random variable with the conditional probability

\[
\begin{aligned}
&\frac{\int p(\Lambda_k,R,\Psi\mid Y,F,r_{jk}=1)\,d\Lambda_k}{\int p(\Lambda_k,R,\Psi\mid Y,F,r_{jk}=1)\,d\Lambda_k+\int p(\Lambda_k,R,\Psi\mid Y,F,r_{jk}=0)\,d\Lambda_k}\\
&\quad=\frac{K_j^{-\frac12}\sqrt{2\pi}\exp\Big[\dfrac{L_j^{2}}{2K_j}-\dfrac12\Big(\dfrac{\lambda_{0jk}^{2}}{\psi_j\sigma_{\lambda_{0jk}}^{2}}+\sum_{i=1}^{N}\dfrac{y_{i,j}^{*2}}{\psi_j}\Big)\Big]\pi_{jk}}{K_j^{-\frac12}\sqrt{2\pi}\exp\Big[\dfrac{L_j^{2}}{2K_j}-\dfrac12\Big(\dfrac{\lambda_{0jk}^{2}}{\psi_j\sigma_{\lambda_{0jk}}^{2}}+\sum_{i=1}^{N}\dfrac{y_{i,j}^{*2}}{\psi_j}\Big)\Big]\pi_{jk}+\exp\Big[-\dfrac12\Big(\sum_{i=1}^{N}\dfrac{y_{ij}^{*2}}{\psi_j}\Big)\Big](1-\pi_{jk})}, \tag{7}
\end{aligned}
\]

where $y_{i,j}^{*}$ is the $j$-th element of $Y_i^{*}$, $K_j=\sum_{i=1}^{N}\frac{f_{ki}^{2}}{\psi_j}+\frac{1}{\psi_j\sigma_{\lambda_{0jk}}^{2}}$, and $L_j=\frac{\lambda_{0jk}}{\psi_j\sigma_{\lambda_{0jk}}^{2}}+\sum_{i=1}^{N}\frac{y_{ij}^{*}f_{ki}}{\psi_j}$.
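As an illustration, the slab probability in (7) can be evaluated directly from the residuals $y_{ij}^{*}$ and the group factor scores. The code below is our translation of the formula (variable names are ours); the factor $\exp(-\sum_i y_{ij}^{*2}/2\psi_j)$ shared by the numerator and denominator is cancelled, which leaves the probability unchanged but avoids numerical underflow for large $N$.

```python
import numpy as np

def slab_prob_fm(y_star_j, f_k, psi_j, lam0, sigma2, pi_jk):
    """P(r_jk = 1 | ...) from Equation (7) for item j and group factor k.
    y_star_j : residuals y*_{ij} (length N)
    f_k      : group factor scores f_{ki} (length N)
    The common factor exp(-sum(y*^2)/(2 psi_j)) is cancelled from the
    numerator and denominator before evaluation."""
    K_j = np.sum(f_k ** 2) / psi_j + 1.0 / (psi_j * sigma2)          # K_j
    L_j = lam0 / (psi_j * sigma2) + np.sum(y_star_j * f_k) / psi_j   # L_j
    slab = K_j ** -0.5 * np.sqrt(2 * np.pi) \
        * np.exp(L_j ** 2 / (2 * K_j) - lam0 ** 2 / (2 * psi_j * sigma2)) * pi_jk
    spike = 1.0 - pi_jk
    return slab / (slab + spike)
```

When the residuals align strongly with the factor scores, the probability is pushed toward 1; when they carry no signal, it stays below the prior slab probability.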

Since the derivation of probability (7) is based on the whole group factor, we call the factor loading selection based on (7) the spike and slab method with factor membership (BSS-FM). Alternatively, we can obtain the probability for an individual factor loading as follows. We consider the joint conditional distribution of parameters for each item [30]:

\[
\begin{aligned}
p(\Lambda_j,r_j,\psi_j\mid Y_j,F) &\propto \psi_j^{-\frac{N}{2}}\exp\Big\{-\frac{1}{2\psi_j}\sum_{i=1}^{N}(y_{ij}-\Lambda_j f_i)^{2}\Big\}\,\psi_j^{-(\alpha_{1j}+1)}\exp\Big(-\frac{\alpha_{2j}}{\psi_j}\Big)\\
&\quad\times\prod_{k=1}^{q}\big\{\pi_{jk}^{r_{jk}}(1-\pi_{jk})^{1-r_{jk}}\big\}\prod_{k=1}^{q}(\psi_j\sigma_{\lambda_{0jk}}^{2})^{-\frac12}\exp\Big\{-\frac{(\lambda_{jk}-\lambda_{0jk})^{2}}{2\psi_j\sigma_{\lambda_{0jk}}^{2}}\Big\}, \tag{8}
\end{aligned}
\]

where $\Lambda_j$ is the $j$-th row of $\Lambda$. Based on Equation (8), if $r_{jk}=0$, we have

\[
\begin{aligned}
p(\Lambda_{jk},r_j,\psi_j\mid Y_j,F,r_{jk}=0) &\propto \psi_j^{-\frac{N}{2}}\exp\Big\{-\frac{1}{2\psi_j}\sum_{i=1}^{N}z_{ij}^{2}\Big\}\,\psi_j^{-(\alpha_{1j}+1)}\exp\Big(-\frac{\alpha_{2j}}{\psi_j}\Big)\\
&\quad\times\prod_{k=1}^{q}\big\{\pi_{jk}^{r_{jk}}(1-\pi_{jk})^{1-r_{jk}}\big\}\prod_{k=1}^{q}(\psi_j\sigma_{\lambda_{0jk}}^{2})^{-\frac12}\exp\Big\{-\frac{(\lambda_{jk}-\lambda_{0jk})^{2}}{2\psi_j\sigma_{\lambda_{0jk}}^{2}}\Big\},
\end{aligned}
\]

where $z_{ij}=y_{ij}-\Lambda_{j,-k}f_i$ and $\Lambda_{j,-k}$ is $\Lambda_j$ with 0 for the $k$-th element. Also, if $r_{jk}=1$,

\[
\begin{aligned}
p(\Lambda_{jk},r_j,\psi_j\mid Y_j,F,r_{jk}=1) &\propto \psi_j^{-\frac{N}{2}}\exp\Big\{-\frac{1}{2\psi_j}\sum_{i=1}^{N}(z_{ij}-\lambda_{jk}f_{ik})^{2}\Big\}\,\psi_j^{-(\alpha_{1j}+1)}\exp\Big(-\frac{\alpha_{2j}}{\psi_j}\Big)\\
&\quad\times\prod_{k=1}^{q}\big\{\pi_{jk}^{r_{jk}}(1-\pi_{jk})^{1-r_{jk}}\big\}\prod_{k=1}^{q}(\psi_j\sigma_{\lambda_{0jk}}^{2})^{-\frac12}\exp\Big\{-\frac{(\lambda_{jk}-\lambda_{0jk})^{2}}{2\psi_j\sigma_{\lambda_{0jk}}^{2}}\Big\},
\end{aligned}
\]

where $f_{ik}$ is the $k$-th element of $f_i$. Then, the posterior distribution of $r_{jk}$ is that of a Bernoulli random variable with the conditional probability

\[
\begin{aligned}
&\frac{\int p(\Lambda_{jk},r_j,\psi_j\mid Y_j,F,r_{jk}=1)\,d\Lambda_{jk}}{\int p(\Lambda_{jk},r_j,\psi_j\mid Y_j,F,r_{jk}=1)\,d\Lambda_{jk}+\int p(\Lambda_{jk},r_j,\psi_j\mid Y_j,F,r_{jk}=0)\,d\Lambda_{jk}}\\
&\quad=\frac{\exp\Big[\dfrac12\Big(\dfrac{L^{2}}{K}-\dfrac{\lambda_{0jk}^{2}}{\psi_j\sigma_{\lambda_{0jk}}^{2}}\Big)\Big]K^{-\frac12}\sqrt{2\pi}\,\pi_{jk}}{\exp\Big[\dfrac12\Big(\dfrac{L^{2}}{K}-\dfrac{\lambda_{0jk}^{2}}{\psi_j\sigma_{\lambda_{0jk}}^{2}}\Big)\Big]K^{-\frac12}\sqrt{2\pi}\,\pi_{jk}+\exp\Big\{-\dfrac{\lambda_{0jk}^{2}}{2\psi_j\sigma_{\lambda_{0jk}}^{2}}\Big\}(1-\pi_{jk})}, \tag{9}
\end{aligned}
\]

where $K=\frac{1}{\psi_j}\sum_{i=1}^{N}f_{ik}^{2}+\frac{1}{\psi_j\sigma_{\lambda_{0jk}}^{2}}$ and $L=\frac{1}{\psi_j}\sum_{i=1}^{N}f_{ik}z_{ij}+\frac{\lambda_{0jk}}{\psi_j\sigma_{\lambda_{0jk}}^{2}}$. In contrast to the probability derivation in (7), we call the factor loading selection based on (9) the spike and slab method for an individual loading (BSS-IL).
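Analogously to the factor membership case, the individual-loading probability in (9) can be computed from the partial residuals $z_{ij}$ and the factor scores $f_{ik}$. This is our illustrative translation of the formula, with variable names chosen by us.

```python
import numpy as np

def slab_prob_il(z_j, f_k, psi_j, lam0, sigma2, pi_jk):
    """P(r_jk = 1 | ...) from Equation (9) for a single loading lambda_jk.
    z_j : partial residuals z_{ij}, with the k-th loading's contribution removed
    f_k : factor scores f_{ik} (length N)"""
    K = np.sum(f_k ** 2) / psi_j + 1.0 / (psi_j * sigma2)
    L = np.sum(f_k * z_j) / psi_j + lam0 / (psi_j * sigma2)
    slab = np.exp(0.5 * (L ** 2 / K - lam0 ** 2 / (psi_j * sigma2))) \
        * K ** -0.5 * np.sqrt(2 * np.pi) * pi_jk
    spike = np.exp(-lam0 ** 2 / (2 * psi_j * sigma2)) * (1.0 - pi_jk)
    return slab / (slab + spike)
```

As with (7), residuals that track the factor scores drive the inclusion probability toward 1, while uninformative residuals leave it below the prior slab probability.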

We remark that the derivations of the slab probability for BSS-FM and BSS-IL are based on the marginal conditional distributions of whole group factors and individual factor loadings, respectively. This differs from the approach of Lu et al. [31], which considers each variate's distribution separately. Also, in applying the spike and slab prior, we use distinct priors for unknown loadings, known nonzero loadings, and known zero loadings. This allows us to incorporate a priori knowledge about the model structure in the model selection process.

4. Simulation study

We conduct a Monte Carlo study using various scenarios in terms of the number of items, factors, and cross loadings. We primarily assess the proposed methods based on their rate of correctly identifying the underlying factor structure. We consider four factor models: (1) an 8-item model ( $p=8$) with one general factor and two group factors, each group factor corresponding to four non-overlapping items; (2) a 12-item model ( $p=12$) with one general factor and three group factors, each group factor corresponding to four non-overlapping items; (3) an 8-item model ( $p=8$) with one general factor and two group factors, each group factor having two cross loadings; and (4) a 12-item model ( $p=12$) with one general factor and three group factors, each group factor having two cross loadings. Models (1) and (2) have strict bifactor structures, and models (3) and (4) allow some cross loadings. We compare the performance of our methods with the method by Lu et al. (BSEM-SSP) [31], the ridge prior [39], and a post-processing approach [41]. We include the post-processing approach to show the performance of factor rotations with MCMC samples. We do not include the ESEM approach since it is not designed for factor loading selection. It is worth mentioning that the varimax method, often used for factor rotation, has limited success with bifactor models [27]. In this comparison, all non-zero factor loadings of the factor models are 1, and $\Psi$ and $\Phi$ are the identity matrices in the factor model described in Section 2. In the simulation, we initially assume that the number of factors is known, but we also consider cases with an unknown number of factors, indicated by 'with extra group factor' in the results of Tables 1–4. For each scenario, we generate 200 data sets with sample sizes N = 200 and 500 observations. For each data set, MCMC samples are obtained using 1000 iterations with a burn-in of 300 draws to assess the performance of the variable selections.
We employ a semblance of hypothesis testing with a factor loading of 0 as the null hypothesis. We utilize symmetric confidence intervals with a 95% confidence level based on the estimated factor loadings, where a non-zero factor loading is indicated by a confidence interval that does not include 0. We do not consider the highest density interval [50] since its lower boundary does not include 0 when we have the positive value constraint. Alternatively, similar to the approach of Lu, Chow and Loken [31], we also use the 50% threshold criterion, by which a factor loading is considered 0 if 50% of the MCMC samples assign the factor loading to the 'spike'.
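Both decision rules can be applied to the retained MCMC draws of a single loading. This sketch assumes post-burn-in arrays of loading samples and slab indicators; it is our illustration, not the authors' code.

```python
import numpy as np

def select_loading(lambda_draws, r_draws, level=0.95):
    """Classify one factor loading as non-zero under the two criteria.
    lambda_draws : MCMC samples of lambda_jk (post burn-in)
    r_draws      : MCMC samples of the slab indicator r_jk (0/1)
    Returns (CI criterion non-zero, 50% rule non-zero)."""
    alpha = 1 - level
    lo, hi = np.quantile(lambda_draws, [alpha / 2, 1 - alpha / 2])
    ci_nonzero = not (lo <= 0 <= hi)           # symmetric CI excludes 0
    rule50_nonzero = np.mean(r_draws) >= 0.5   # majority of draws in the slab
    return ci_nonzero, rule50_nonzero
```

A loading whose draws sit firmly away from 0 is flagged non-zero by both rules, while draws centered at 0 with mostly spike indicators are classified as zero.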

Table 1.

Percentage of correctly identified factor loadings in group factors when p=8 and N=500. In each cell, 1 indicates that the factor loading is correctly identified as zero or non-zero 100% of the time.

    p = 8
    1st group factor 2nd group factor Extra factor
Bi-factor model BSS-IL (95% CI) 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00  
    1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00  
  BSS-IL (50% rule) 1.00 1.00 1.00 1.00 1.00 0.98 0.98 0.97  
    1.00 0.98 0.97 0.99 1.00 1.00 1.00 1.00  
  BSS-FM (95% CI) 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00  
    1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00  
  BSS-FM (50% rule) 1.00 1.00 1.00 1.00 1.00 0.95 0.97 0.98  
    1.00 0.98 0.99 0.99 1.00 1.00 1.00 1.00  
  Ridge prior (95% CI) 0.93 0.76 0.78 0.77 1.00 0.93 0.89 0.93  
    1.00 0.84 0.85 0.88 0.97 0.98 0.96 0.97  
  BSEM-SSP (95% CI) 1.00 0.98 0.97 0.97 1.00 0.91 0.92 0.95  
    0.95 0.92 0.94 0.95 0.65 0.42 0.47 0.46  
  BSEM-SSP (50% rule) 1.00 0.67 0.66 0.67 1.00 0.74 0.78 0.75  
    1.00 0.76 0.77 0.81 1.00 0.44 0.44 0.43  
  Post processing (95% CI) 0.73 0.96 0.95 0.96 0.00 0.00 0.00 0.00  
    0.22 0.12 0.12 0.12 1.00 1.00 1.00 1.00  
Bifactor model with extra group factor BSS-IL (95% CI) 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
    1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
  BSS-IL (50% rule) 1.00 1.00 1.00 1.00 1.00 0.98 0.98 0.99 1.00 0.90 0.91 0.91
    1.00 0.98 0.97 0.98 1.00 1.00 1.00 1.00 1.00 0.76 0.78 0.79
  BSS-FM (95% CI) 0.99 1.00 0.99 0.99 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
    1.00 1.00 1.00 1.00 0.99 0.99 0.99 1.00 1.00 0.99 0.99 1.00
  BSS-FM (50% rule) 1.00 1.00 1.00 1.00 1.00 0.97 0.99 0.99 1.00 0.89 0.86 0.90
    1.00 0.99 0.98 0.97 1.00 1.00 1.00 1.00 1.00 0.79 0.75 0.74
  Ridge prior (95% CI) 0.92 0.61 0.62 0.64 1.00 0.93 0.96 0.94 1.00 0.82 0.85 0.84
    1.00 0.92 0.89 0.85 0.99 0.97 0.97 0.96 1.00 0.89 0.89 0.84
  BSEM-SSP (95% CI) 0.89 0.50 0.49 0.43 1.00 0.94 0.93 0.89 1.00 0.93 0.91 0.94
    1.00 0.90 0.92 0.95 0.63 0.39 0.39 0.37 1.00 0.96 0.92 0.93
  BSEM-SSP (50% rule) 1.00 0.57 0.63 0.57 1.00 0.70 0.80 0.77 1.00 0.83 0.77 0.72
    1.00 0.77 0.74 0.82 1.00 0.50 0.49 0.50 1.00 0.77 0.82 0.77
  Post processing (95% CI) 0.78 0.93 0.92 0.92 0.00 0.00 0.00 0.00 0.98 0.92 0.90 0.90
    0.16 0.12 0.12 0.14 1.00 1.00 1.00 1.00 0.80 0.92 0.94 0.94
Bifactor model with cross loading BSS-IL (95% CI) 0.98 1.00 1.00 0.98 1.00 1.00 1.00 1.00  
    1.00 0.99 1.00 1.00 1.00 0.99 1.00 1.00  
  BSS-IL (50% rule) 1.00 1.00 1.00 1.00 1.00 1.00 0.95 0.96  
    1.00 1.00 0.97 0.97 1.00 1.00 1.00 1.00  
  BSS-FM (95% CI) 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00  
    1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00  
  BSS-FM (50% rule) 1.00 1.00 1.00 1.00 1.00 1.00 0.96 0.97  
    1.00 1.00 0.94 0.95 1.00 1.00 1.00 1.00  
  Ridge prior (95% CI) 0.88 0.81 0.77 0.78 1.00 0.88 0.96 0.95  
    1.00 0.77 0.96 0.97 0.99 0.88 0.98 0.98  
  BSEM-SSP (95% CI) 0.96 0.37 0.75 0.73 1.00 0.25 0.93 0.96  
    1.00 0.36 0.88 0.88 0.70 0.23 0.44 0.45  
  BSEM-SSP (50% rule) 1.00 0.42 0.81 0.83 1.00 0.39 0.76 0.80  
    1.00 0.42 0.70 0.68 1.00 0.39 0.65 0.61  
  Post processing (95% CI) 1.00 1.00 1.00 1.00 0.00 1.00 0.00 0.00  
    0.00 1.00 0.00 0.00 1.00 1.00 1.00 1.00  

Table 2.

Percentage of correctly identified factor loadings in group factors when p=12 and N=500. In each cell, 1 indicates that the factor loading is correctly identified as zero or non-zero 100% of the time.

    p = 12
    1st group factor 2nd group factor 3rd group factor Extra factor
Bifactor model BSS-IL (95% CI) 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00  
    1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00  
    1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 0.99 1.00 0.99 1.00  
  BSS-IL (50% rule) 1.00 1.00 1.00 1.00 1.00 0.98 0.98 1.00 1.00 1.00 0.99 0.97  
    1.00 0.99 0.99 0.99 1.00 1.00 1.00 1.00 1.00 0.99 0.99 0.98  
    1.00 0.99 0.99 0.98 1.00 0.98 1.00 0.99 1.00 1.00 1.00 1.00  
  BSS-FM (95% CI) 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00  
    1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00  
    1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 0.99 0.99 0.99 1.00  
  BSS-FM (50% rule) 1.00 1.00 1.00 1.00 1.00 0.99 0.99 1.00 1.00 1.00 0.99 0.99  
    1.00 0.99 0.99 0.99 1.00 1.00 1.00 1.00 1.00 0.98 0.97 0.98  
    1.00 0.99 1.00 0.99 1.00 0.98 0.98 0.99 1.00 1.00 1.00 1.00  
  Ridge prior (95% CI) 0.92 0.93 0.92 0.92 1.00 0.90 0.86 0.88 1.00 0.90 0.91 0.92  
    1.00 0.86 0.83 0.86 0.96 0.96 0.96 0.96 1.00 0.87 0.90 0.88  
    1.00 0.89 0.89 0.86 1.00 0.86 0.89 0.89 0.99 0.99 0.98 0.98  
  BSEM-SSP (95% CI) 0.94 0.74 0.75 0.71 1.00 0.98 0.97 0.95 1.00 0.95 0.97 0.96  
    1.00 0.93 0.96 0.94 0.81 0.60 0.56 0.58 1.00 0.94 0.94 0.96  
    1.00 0.96 0.96 0.92 1.00 0.96 0.94 0.94 0.76 0.56 0.56 0.54  
  BSEM-SSP (50% rule) 1.00 0.76 0.77 0.78 1.00 0.96 0.98 0.96 1.00 0.96 0.97 0.96  
    1.00 0.96 0.98 0.94 1.00 0.52 0.53 0.51 1.00 0.98 0.96 0.96  
    1.00 1.00 0.96 0.98 1.00 0.94 0.98 0.98 1.00 0.53 0.54 0.56  
  Post processing(95% CI) 0.90 0.76 0.80 0.78 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00  
    0.26 0.34 0.34 0.34 1.00 1.00 1.00 1.00 0.04 0.00 0.01 0.00  
    0.26 0.34 0.34 0.32 0.07 0.00 0.00 0.00 1.00 1.00 1.00 1.00  
Bifactor model with extra group factor BSS-IL (95% CI) 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
    1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
    1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
  BSS-IL (50% rule) 1.00 1.00 1.00 1.00 1.00 0.99 1.00 0.99 1.00 0.99 0.99 0.99 1.00 0.95 0.94 0.96
    1.00 1.00 1.00 0.99 1.00 1.00 1.00 1.00 1.00 0.99 0.99 0.98 1.00 0.92 0.90 0.91
    1.00 1.00 1.00 1.00 1.00 0.99 1.00 0.99 1.00 1.00 1.00 1.00 1.00 0.95 0.92 0.95
  BSS-FM (95% CI) 0.99 1.00 1.00 0.99 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
    1.00 1.00 1.00 1.00 0.99 0.99 0.99 0.99 1.00 1.00 1.00 1.00 1.00 0.99 0.99 1.00
    1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
  BSS-FM (50% rule) 1.00 1.00 1.00 1.00 1.00 1.00 0.99 1.00 1.00 1.00 1.00 1.00 1.00 0.99 0.97 0.95
    1.00 1.00 0.98 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 0.99 1.00 0.92 0.92 0.94
    1.00 1.00 1.00 1.00 1.00 0.99 1.00 1.00 1.00 1.00 1.00 1.00 1.00 0.94 0.92 0.91
  Ridge prior (95% CI) 0.98 0.91 0.93 0.96 1.00 0.88 0.90 0.89 1.00 0.91 0.88 0.92 1.00 0.83 0.79 0.80
    1.00 0.89 0.94 0.89 0.99 0.99 0.99 0.98 1.00 0.92 0.86 0.90 1.00 0.84 0.86 0.86
    1.00 0.91 0.90 0.91 1.00 0.92 0.92 0.93 1.00 0.99 0.97 0.98 1.00 0.84 0.89 0.90
  BSEM-SSP (95% CI) 0.94 0.64 0.66 0.66 1.00 0.96 0.97 0.98 1.00 0.97 0.96 0.96 1.00 0.94 0.96 0.96
    1.00 0.98 0.98 0.98 0.69 0.52 0.52 0.52 1.00 0.99 0.96 0.98 1.00 0.94 0.96 0.94
    1.00 0.97 0.99 0.98 1.00 0.98 0.98 0.97 0.64 0.50 0.48 0.50 1.00 0.97 0.97 0.98
  BSEM-SSP (50% rule) 1.00 0.82 0.78 0.78 1.00 0.97 1.00 0.98 1.00 0.96 0.96 0.95 1.00 0.90 0.90 0.89
    1.00 0.96 0.96 0.96 1.00 0.57 0.56 0.56 1.00 0.96 0.93 0.93 1.00 0.94 0.94 0.94
    1.00 0.96 0.98 0.98 1.00 0.96 0.96 0.96 1.00 0.48 0.50 0.48 1.00 0.92 0.94 0.92
  Post processing(95% CI) 1.00 1.00 1.00 1.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.95 0.96 0.96 0.95
    0.00 0.00 0.00 0.00 1.00 1.00 1.00 1.00 0.04 0.00 0.00 0.00 0.88 0.98 0.99 0.98
    0.00 0.00 0.00 0.00 0.10 0.00 0.00 0.00 1.00 1.00 1.00 1.00 0.86 0.96 0.98 0.96
Bifactor model with cross loading BSS-IL (95% CI) 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00  
    1.00 1.00 1.00 1.00 1.00 1.00 1.00 0.99 1.00 1.00 1.00 1.00  
    1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00  
  BSS-IL (50% rule) 1.00 1.00 1.00 1.00 1.00 0.98 0.99 1.00 1.00 0.93 0.93 0.98  
    1.00 0.92 0.93 0.97 1.00 1.00 1.00 1.00 1.00 0.96 0.94 1.00  
    1.00 0.95 0.96 0.97 1.00 0.99 0.97 0.98 1.00 1.00 1.00 1.00  
  BSS-FM (95% CI) 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00  
    1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00  
    1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 0.99  
  BSS-FM (50% rule) 1.00 1.00 1.00 1.00 1.00 0.97 0.97 1.00 1.00 0.96 0.96 0.96  
    1.00 0.95 0.96 0.98 1.00 1.00 1.00 1.00 1.00 0.96 0.93 1.00  
    1.00 0.98 0.99 0.97 1.00 0.96 0.98 0.97 1.00 1.00 1.00 1.00  
  Ridge prior (95% CI) 0.98 0.91 0.92 0.74 1.00 0.94 0.97 0.95 1.00 0.82 0.85 0.92  
    1.00 0.80 0.84 0.96 0.98 0.97 0.97 0.99 1.00 0.85 0.80 0.97  
    1.00 0.72 0.67 0.74 1.00 0.93 0.84 0.85 0.97 0.95 0.95 0.95  
  BSEM-SSP (95% CI) 0.98 0.91 0.92 0.74 1.00 0.94 0.97 0.95 1.00 0.82 0.85 0.92  
    1.00 0.80 0.84 0.96 0.98 0.97 0.97 0.99 1.00 0.85 0.80 0.97  
    1.00 0.72 0.67 0.74 1.00 0.93 0.84 0.85 0.97 0.95 0.95 0.95  
  BSEM-SSP (50% rule) 0.98 0.88 0.86 0.82 1.00 0.99 0.99 0.60 1.00 0.96 0.96 0.94  
    1.00 1.00 1.00 0.98 0.76 0.65 0.66 0.62 1.00 0.98 0.96 0.40  
    1.00 0.98 0.98 0.98 1.00 0.96 0.96 0.96 0.53 0.38 0.38 0.38  
  Post processing (95% CI) 1.00 0.89 0.89 0.84 1.00 0.98 0.98 0.64 1.00 0.96 0.96 0.84  
    1.00 0.97 0.96 0.95 1.00 0.72 0.71 0.66 1.00 0.92 0.93 0.47  
    1.00 0.98 0.94 0.98 1.00 0.98 0.98 0.96 1.00 0.42 0.42 0.41  

Table 3.

Percentage of correctly identified factor loadings in group factors when p=8 and N=200. In each cell, 1 indicates that the factor loading is correctly identified as zero or non-zero 100% of the time.

    p = 8
    1st group factor 2nd group factor Extra factor
Bi-factor model BSS-IL (95% CI) 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00  
    1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00  
  BSS-IL (50% rule) 1.00 1.00 1.00 1.00 1.00 0.94 0.96 0.93  
    1.00 0.97 0.95 0.90 1.00 1.00 1.00 1.00  
  BSS-FM (95% CI) 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00  
    1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00  
  BSS-FM (50% rule) 1.00 1.00 1.00 1.00 1.00 0.95 0.93 0.93  
    1.00 0.97 0.94 0.95 1.00 1.00 1.00 1.00  
  Ridge prior (95% CI) 0.93 0.76 0.71 0.79 1.00 0.94 0.95 0.94  
    1.00 0.92 0.87 0.88 0.93 0.93 0.93 0.93  
  BSEM-SSP (95% CI) 0.86 0.39 0.45 0.48 1.00 0.82 0.84 0.85  
    1.00 0.88 0.89 0.85 0.55 0.36 0.34 0.35  
  BSEM-SSP (50% rule) 1.00 0.62 0.61 0.59 1.00 0.59 0.61 0.62  
    1.00 0.68 0.72 0.71 1.00 0.53 0.50 0.47  
  Post processing(95% CI) 0.76 0.96 0.94 0.96 0.02 0.02 0.03 0.03  
    0.22 0.12 0.12 0.12 0.99 0.99 0.99 0.99  
Bifactor model with extra group factor BSS-IL (95% CI) 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 0.99 1.00 1.00
    1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 0.99
  BSS-IL (50% rule) 1.00 1.00 1.00 1.00 1.00 0.95 0.96 0.95 1.00 0.69 0.65 0.71
    1.00 0.97 0.96 0.97 1.00 1.00 1.00 1.00 1.00 0.63 0.65 0.58
  BSS-FM (95% CI) 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 0.99 0.99 1.00
    1.00 1.00 1.00 1.00 0.99 0.99 0.99 1.00 1.00 1.00 1.00 1.00
  BSS-FM (50% rule) 1.00 1.00 1.00 1.00 1.00 0.94 0.95 0.97 1.00 0.73 0.73 0.74
    1.00 0.97 0.96 0.95 1.00 1.00 1.00 1.00 1.00 0.63 0.59 0.58
  Ridge prior (95% CI) 0.94 0.73 0.79 0.76 1.00 0.98 0.99 0.99 1.00 0.90 0.89 0.87
    1.00 0.94 0.97 0.96 0.92 0.91 0.88 0.90 1.00 0.91 0.91 0.92
  BSEM-SSP (95% CI) 0.85 0.32 0.27 0.26 1.00 0.91 0.90 0.94 1.00 0.90 0.93 0.93
    1.00 0.93 0.88 0.95 0.55 0.20 0.24 0.22 1.00 0.92 0.93 0.94
  BSEM-SSP (50% rule) 1.00 0.54 0.57 0.56 1.00 0.46 0.52 0.51 1.00 0.52 0.55 0.55
    1.00 0.63 0.59 0.59 1.00 0.39 0.42 0.40 1.00 0.63 0.61 0.57
  Post processing (95% CI) 0.86 0.96 0.97 0.96 0.01 0.04 0.02 0.03 0.95 0.80 0.81 0.83
    0.11 0.13 0.13 0.16 0.99 0.99 0.99 0.99 0.86 0.87 0.88 0.85
Bifactor model with cross loading BSS-IL (95% CI) 1.00 1.00 1.00 1.00 1.00 0.99 1.00 1.00  
    1.00 1.00 0.99 1.00 1.00 1.00 1.00 1.00  
  BSS-IL (50% rule) 1.00 1.00 1.00 1.00 1.00 1.00 0.94 0.93  
    1.00 1.00 0.91 0.94 1.00 1.00 1.00 1.00  
  BSS-FM (95% CI) 0.99 0.99 0.99 1.00 1.00 0.99 0.99 1.00  
    1.00 1.00 0.99 1.00 1.00 0.99 1.00 1.00  
  BSS-FM (50% rule) 1.00 1.00 1.00 1.00 1.00 1.00 0.93 0.91  
    1.00 1.00 0.90 0.95 1.00 1.00 1.00 1.00  
  Ridge prior (95% CI) 0.94 0.73 0.88 0.91 1.00 0.86 0.96 0.94  
    1.00 0.69 0.99 0.95 0.98 0.91 0.95 0.98  
  BSEM-SSP (95% CI) 0.97 0.30 0.56 0.57 1.00 0.30 0.85 0.86  
    1.00 0.28 0.92 0.93 0.61 0.27 0.35 0.39  
  BSEM-SSP (50% rule) 1.00 0.42 0.67 0.67 1.00 0.45 0.61 0.55  
    1.00 0.41 0.65 0.63 1.00 0.44 0.54 0.54  
  Post processing(95% CI) 0.99 1.00 1.00 1.00 0.02 1.00 0.02 0.01  
    0.03 1.00 0.02 0.05 1.00 1.00 1.00 1.00  

Table 4.

Percentage of correctly identified factor loadings in group factors when p=12 and N=200. In each cell, 1 indicates that the factor loading is correctly identified as zero or non-zero 100% of the time.

    p = 12  
    1st group factor 2nd group factor 3rd group factor Extra factor
Bifactor model BSS-IL (95% CI) 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00  
    1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00  
    1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00  
  BSS-IL (50% rule) 1.00 1.00 1.00 1.00 1.00 0.96 0.96 0.94 1.00 0.96 0.96 0.94  
    1.00 0.96 0.98 0.98 1.00 1.00 1.00 1.00 1.00 0.98 0.97 0.98  
    1.00 0.94 0.98 0.92 1.00 0.98 0.94 0.96 1.00 1.00 1.00 1.00  
  BSS-FM (95% CI) 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 0.99 1.00 1.00  
    1.00 1.00 1.00 1.00 0.99 0.99 0.99 0.99 1.00 1.00 1.00 0.99  
    1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00  
  BSS-FM (50% rule) 1.00 1.00 1.00 1.00 1.00 0.95 0.95 0.97 1.00 0.96 0.97 0.97  
    1.00 0.94 0.96 0.94 1.00 1.00 1.00 1.00 1.00 0.96 0.93 0.96  
    1.00 0.97 0.99 0.97 1.00 0.96 0.96 0.96 1.00 1.00 1.00 1.00  
  Ridge prior (95% CI) 0.94 0.96 0.94 0.96 1.00 0.88 0.89 0.90 1.00 0.87 0.88 0.90  
    1.00 0.90 0.89 0.91 0.97 0.96 0.98 0.96 1.00 0.91 0.92 0.92  
    1.00 0.88 0.92 0.92 1.00 0.90 0.87 0.92 0.97 0.98 0.98 0.97  
  BSEM-SSP (95% CI) 0.85 0.50 0.50 0.48 1.00 0.96 0.92 0.94 1.00 0.96 0.93 0.96  
    1.00 0.92 0.93 0.94 0.71 0.44 0.43 0.45 1.00 0.95 0.95 0.94  
    1.00 0.94 0.94 0.97 1.00 0.93 0.92 0.94 0.73 0.54 0.53 0.54  
  BSEM-SSP (50% rule) 1.00 0.66 0.64 0.66 1.00 0.90 0.91 0.90 1.00 0.88 0.90 0.90  
    1.00 0.90 0.92 0.94 1.00 0.62 0.63 0.66 1.00 0.88 0.91 0.90  
    1.00 0.92 0.88 0.93 1.00 0.88 0.91 0.92 1.00 0.56 0.56 0.52  
  Post processing (95% CI) 0.83 0.89 0.90 0.90 0.03 0.03 0.03 0.03 0.03 0.04 0.04 0.04  
    0.18 0.16 0.15 0.17 0.98 0.98 0.98 0.98 0.20 0.14 0.09 0.14  
    0.18 0.16 0.18 0.16 0.46 0.04 0.04 0.04 0.98 0.98 0.99 0.98  
Bifactor model with extra group factor BSS-IL (95% CI) 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
    1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
    1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
  BSS-IL (50% rule) 1.00 1.00 1.00 1.00 1.00 0.95 0.98 0.96 1.00 0.97 0.97 0.98 1.00 0.90 0.86 0.84
    1.00 0.98 0.98 0.98 1.00 1.00 1.00 1.00 1.00 0.97 1.00 0.98 1.00 0.86 0.84 0.84
    1.00 0.98 0.97 0.96 1.00 0.96 0.96 0.98 1.00 1.00 1.00 1.00 1.00 0.84 0.84 0.84
  BSS-FM (95% CI) 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
    1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
    1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
  BSS-FM (50% rule) 1.00 1.00 1.00 1.00 1.00 0.97 0.96 0.97 1.00 0.97 0.95 0.96 1.00 0.92 0.88 0.88
    1.00 0.96 0.99 0.98 1.00 1.00 1.00 1.00 1.00 0.98 0.98 0.98 1.00 0.78 0.87 0.80
    1.00 0.98 0.98 0.98 1.00 0.98 0.97 0.95 1.00 1.00 1.00 1.00 1.00 0.86 0.81 0.82
  Ridge prior (95% CI) 1.00 0.98 0.97 0.94 1.00 0.94 0.93 0.94 1.00 0.92 0.92 0.93 1.00 0.88 0.94 0.92
    1.00 0.94 0.91 0.92 0.98 0.98 0.97 0.98 1.00 0.93 0.92 0.90 1.00 0.90 0.90 0.89
    1.00 0.92 0.92 0.91 1.00 0.94 0.90 0.88 0.96 0.96 0.95 0.97 1.00 0.91 0.91 0.92
  BSEM-SSP (95% CI) 0.89 0.46 0.47 0.48 1.00 0.99 0.98 1.00 1.00 0.98 0.98 0.99 1.00 0.96 0.98 0.96
    1.00 0.99 0.99 0.99 0.64 0.35 0.32 0.34 1.00 0.98 0.98 0.98 1.00 0.98 0.98 0.99
    1.00 1.00 1.00 1.00 1.00 1.00 0.98 1.00 0.66 0.37 0.39 0.37 1.00 0.98 0.96 0.97
  BSEM-SSP (50% rule) 1.00 0.66 0.64 0.64 1.00 0.91 0.88 0.86 1.00 0.92 0.94 0.92 1.00 0.80 0.82 0.82
    1.00 0.94 0.92 0.92 1.00 0.57 0.56 0.56 1.00 0.92 0.92 0.94 1.00 0.86 0.90 0.82
    1.00 0.92 0.94 0.93 1.00 0.86 0.90 0.89 1.00 0.59 0.60 0.60 1.00 0.87 0.89 0.89
  Post processing (95% CI) 0.99 0.99 1.00 0.98 0.01 0.00 0.01 0.01 0.01 0.03 0.02 0.01 0.96 0.98 0.96 0.95
    0.03 0.03 0.01 0.01 1.00 1.00 1.00 1.00 0.30 0.10 0.09 0.12 0.94 0.98 0.99 0.99
    0.03 0.02 0.03 0.00 0.42 0.02 0.01 0.04 1.00 0.99 1.00 1.00 0.92 0.98 0.96 0.99
Bifactor model with cross loading BSS-IL (95% CI) 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00  
    1.00 0.99 1.00 1.00 1.00 1.00 1.00 1.00 1.00 0.99 0.98 1.00  
    1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00  
  BSS-IL (50% rule) 1.00 1.00 1.00 1.00 1.00 0.94 0.96 1.00 1.00 0.88 0.88 0.98  
    1.00 0.92 0.89 0.97 1.00 1.00 1.00 1.00 1.00 0.84 0.90 1.00  
    1.00 0.92 0.92 0.92 1.00 0.96 0.95 0.98 1.00 1.00 1.00 1.00  
  BSS-FM (95% CI) 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00  
    1.00 0.99 0.99 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00  
    1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00  
  BSS-FM (50% rule) 1.00 1.00 1.00 1.00 1.00 0.96 0.97 1.00 1.00 0.96 0.97 1.00  
    1.00 0.82 0.92 0.98 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00  
    1.00 0.93 0.90 0.94 1.00 0.96 0.95 0.96 1.00 0.96 0.95 0.96  
  Ridge prior (95% CI) 0.97 0.95 0.95 0.90 1.00 0.94 0.95 0.97 1.00 0.85 0.90 0.97  
    1.00 0.88 0.85 0.97 1.00 1.00 1.00 0.99 1.00 0.85 0.87 0.98  
    1.00 0.75 0.83 0.84 1.00 0.85 0.89 0.92 0.96 0.97 0.97 0.98  
  BSEM-SSP (95% CI) 0.94 0.76 0.74 0.66 1.00 1.00 0.99 0.38 1.00 0.97 0.96 0.90  
    1.00 1.00 0.99 0.96 0.71 0.56 0.52 0.44 1.00 1.00 0.99 0.42  
    1.00 0.98 0.98 0.98 1.00 0.96 0.96 0.96 0.60 0.40 0.37 0.41  
  BSEM-SSP (50% rule) 1.00 0.77 0.75 0.72 1.00 0.95 0.92 0.60 1.00 0.92 0.89 0.80  
    1.00 0.95 0.94 0.88 1.00 0.64 0.66 0.68 1.00 0.86 0.87 0.60  
    1.00 0.90 0.94 0.92 1.00 0.92 0.90 0.92 1.00 0.54 0.56 0.54  
  Post processing (95% CI) 0.92 0.96 0.96 0.96 0.04 0.03 0.03 0.98 0.04 0.04 0.06 0.57  
    0.08 0.07 0.09 0.32 0.98 0.98 0.98 0.98 0.20 0.14 0.10 0.96  
    0.09 0.08 0.07 0.07 0.51 0.07 0.04 0.06 0.98 0.96 0.96 0.96  

Tables 1 (p=8) and 2 (p=12) show the factor loading selections of group factors with N=500, and Tables 3 (p=8) and 4 (p=12) are for N=200. All tables include strict bifactor models as well as models with cross loadings. Additionally, 'extra factor' indicates simulations assuming one more group factor when the number of group factors is unknown. As shown in the simulation study, regardless of sample size, our BSS-IL and BSS-FM with the symmetric confidence interval criterion demonstrate 98–100% correctness in identifying zero and non-zero variables for each latent factor (Tables 1–4) throughout the scenarios of strict bifactor models, model identification with an extra group factor, and bifactor models with cross loadings. Bayesian SEM with the ridge prior demonstrates an average of 92% correctness when identifying the factor structure in the bifactor models and an average of 89% correctness when identifying the extra factor (i.e. correctly identifying it as zero). For models with cross loadings, it identifies an average of 91% of cross loading structures. BSEM-SSP shows better performance when identifying zero factor loadings than non-zero factor loadings in each latent factor for each scenario. The post-processing method shows reasonable performance in identifying non-zero factor loadings while failing to distinguish the zero factor loadings in the bifactor model. This suggests that while factor rotation may help identify non-zero factor loadings, it cannot effectively separate the zero factor loadings across group latent factors and scenarios.

If we use the 50% threshold criterion, the correctness is in general inferior to that of the symmetric confidence interval criterion, but BSS-IL and BSS-FM still perform better than the other approaches. Overall, our methods provide quantifiable measures of the uncertainty associated with cross loading inclusion and model choices, leading to more correctly identified results than other available methods.

It has been discussed that the parameters can be estimated based on the MCMC samples with various prior settings including SSP [31]. We observe that MCMC samples often do not provide satisfactory estimates and relevant inferences when the majority of group factor loading structures are not determined a priori. Instead, we recommend that MCMC samples using SSP be used primarily for model selection, followed by checking the final model for identifiability. If identifiability is satisfied, the parameters can be estimated using the maximum likelihood approach, in which the factor loadings identified as zero in the MCMC step remain at zero. This strategy works well, as shown in Table 5, where we present simulation results for two versions of the model: one with 12 items and two group factors, and the other with 12 items and three group factors. Each model includes one or two cross loadings. Due to a high chance of correctly identifying the factor structure, the results show that the coverage rate of the 95% confidence intervals of all factor loadings is close to the target confidence level.

Table 5.

Coverage rates of nonzero factor loading estimates using the 95% confidence interval. There are 12 general factor loadings and 6 (two group factor model) or 4 (three group factor model) group factor loadings per factor to be estimated in addition to cross loadings (indicated by bold numbers).

    General factor 1st group factor 2nd group factor 3rd group factor
2 factor model Bifactor model with 1 cross loading (bold) 0.94 0.94 0.96 0.95 0.94 0.94 0.95 0.95 0.94 0.95 0.93 0.92  
    0.93 0.96 0.94 0.94 0.94 0.95 0.94 0.96 0.94  
    0.95 0.94 0.96 0.97      
  Bifactor model with 2 cross loadings (bold) 0.93 0.95 0.96 0.92 0.90 0.91 0.90 0.95 0.98 0.95 0.95 0.93  
    0.94 0.95 0.94 0.94 0.90 0.97 0.94 0.96 0.94 0.96  
    0.95 0.94 0.93 0.93      
3 factor model Bifactor model with 1 cross loading 0.94 0.93 0.95 0.95 0.93 0.90 0.95 0.95 0.97 0.94 0.94 0.91 0.95 0.96 0.94 0.95
    0.95 0.96 0.94 0.93   0.94  
    0.96 0.93 0.97 0.94      
  Bifactor model with 2 cross loadings 0.94 0.92 0.95 0.93 0.97 0.96 0.96 0.96 0.92 0.94 0.93 0.95 0.94 0.93 0.95 0.91
    0.94 0.93 0.96 0.93   0.96 0.94
    0.93 0.93 0.97 0.95      

5. Application

For the application of the proposed methods, we use the SF-12 version 2 (SF-12v2) data [36] from the Household Component of the Medical Expenditure Panel Survey (MEPS-HC) carried out between 2015 and 2016 (N = 17,017). The SF-12v2 is a widely used health survey consisting of 12 questions designed to assess self-reported health-related quality of life. The SF-12v2 is divided into two areas, physical health and mental health [51], with corresponding scales for each area. The area on physical health focuses on participants' general health, limitations in mobility, work, and other physical activities as well as limitations due to pain. The corresponding scales include general health (one item), physical functioning (two items), role physical (two items), and bodily pain (one item). The area on mental health encompasses limitations in social activity, emotional state, and level of distraction. The corresponding scales include a physical health and vitality summary (one item), social functioning (one item), role emotional (two items), and mental health (two items). In the data analysis, following common practice [18,28], each of the coded scores in the SF-12v2 is converted to a value ranging from 0 to 100, and inverse-coded scores are corrected so that 100 indicates the best health condition.

Figure 1 compares the bifactor model, the Bayesian factor selection with ridge prior, and our proposed Bayesian factor selection (BSS-IL and BSS-FM). We note that, when employing the Lasso prior [11], the bifactor model is chosen. We use ‘limited in moderate activity’ and ‘feel blue’ as the items with the known bifactor structure for BSS-IL and BSS-FM as well as the ridge prior. That is, we have a prior belief that ‘limited in moderate activity’ strictly corresponds to only physical health, while ‘feel blue’ only corresponds to mental health.

Figure 1.

Factor models and factor loading estimates from the bifactor model, the model selected by the ridge prior, and the model selected by our BSS-IL and BSS-FM. The numbers under the items (e.g. ADDAYA2) are estimated general factor loadings. The numbers in the columns under F1 and F2 are the factor loadings for the respective group factors. If a factor loading is a small negative value, it is fixed at 0.

The model selected by the ridge prior has seven cross loadings, while BSS-IL and BSS-FM select the same model with six cross loadings. In the model selected by BSS-IL and BSS-FM, items belonging to the physical health area, such as 'health in general' and 'pain limited in normal work', load on the mental health area too. A majority of items belonging to the mental health area, such as 'social time', 'energy', 'work limited due to mental problem' and 'calm/peaceful', load on physical health. This may indicate that mental health status is highly affected by physical health status. Model selection using the ridge prior chooses 'accomplished less by physical' as a cross loading, while our proposed model does not. The estimated magnitude of this extra factor loading is relatively small compared to other factor loadings. In addition, we compare the fit indexes of these models as shown in Table 6. The ridge prior selection and our model have the same CFI value. The RMSEA shows that BSS-IL and BSS-FM have a slight improvement in fit over the ridge prior selection. The ridge prior selection has lower AIC and BIC values, but the difference is small. Since BSS-IL and BSS-FM provide a more parsimonious model, we consider the model selected by BSS-IL and BSS-FM preferable to the model from the ridge prior selection, and thus it may be reasonable to ignore this cross loading in the overall analysis.

Table 6.

Comparison of fit indexes.

  Bi-factor model Ridge prior Our model
Comparative Fit Index (CFI) 0.939 0.965 0.965
Akaike information criterion (AIC) 1,235,608.197 1,233,286.928 1,233,301.043
Bayesian information criterion (BIC) 1,235,865.841 1,233,603.462 1,233,610.215
RMSEA 0.105 0.089 0.088

It may be of interest whether the model selection changes when different items are used for the known bifactor structures. When we change the known structure from ‘limited in moderate activity’ to ‘limits in climbing’, we obtain the same model, unaffected by the different choice of item.

We note that the CFI value is larger than that of the bifactor model by 0.026, as shown in Table 6. The magnitude of the difference in CFI values is typically interpreted using guidelines proposed by several SEM researchers, such as Cheung and Rensvold [12]. They state that a difference in CFI of 0.01 or more may indicate that the proposed model is a better fit to the data than the alternative model. It is also suggested that a criterion of a 0.01 change in CFI is paired with a change in RMSEA of 0.015 [10]. The higher value of the CFI and lower value of the RMSEA indicate that including cross loadings produces a better model fit compared to the bifactor model. In our models, these cross loadings enhance the validity and reliability of the factor model by capturing the complexity and multidimensionality of the construct being measured.
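As a concrete illustration (our own sketch, not code from the paper), the two cut-offs discussed above can be applied directly to the Table 6 values; the function name `favors_alternative` and the threshold defaults are hypothetical, standing in for the Cheung–Rensvold ΔCFI criterion [12] and the paired ΔRMSEA criterion [10].

```python
# Fit indices from Table 6.
fit = {
    "bifactor":    {"CFI": 0.939, "RMSEA": 0.105},
    "ridge prior": {"CFI": 0.965, "RMSEA": 0.089},
    "our model":   {"CFI": 0.965, "RMSEA": 0.088},
}

def favors_alternative(base, alt, d_cfi=0.01, d_rmsea=0.015):
    """True if the alternative improves CFI by at least d_cfi and lowers
    RMSEA by at least d_rmsea relative to the base model."""
    return (alt["CFI"] - base["CFI"] >= d_cfi
            and base["RMSEA"] - alt["RMSEA"] >= d_rmsea)

# Our model vs the strict bifactor model: dCFI = 0.026, dRMSEA = 0.017.
print(favors_alternative(fit["bifactor"], fit["our model"]))  # True
```

Under these guidelines, both cross-loading models clear the thresholds against the strict bifactor model, while neither clears them against the other.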

In addition, we carry out a sensitivity analysis of the parameters for the spike and slab prior. In the first scenario, we set the mean of the spike and slab prior to 0 as a common practice (e.g. [31]), while the variance values ( σλ0jk2) are varied between 0.5 and 10 (0.5, 1, 2, 5, 10). The value 1 is the hyperparameter that we originally use as a non-informative prior choice. In this scenario, the prior probability of the slab is fixed at 0.5. Overall, our proposed methods produce consistent results regardless of the different prior parameters, with only two items’ loadings showing changes. In particular, for BSS-FM, there are no model structure changes for variances ( σλ0jk2) between 1 and 5. More changes are observed for BSS-IL, but the loading changes are limited to two items. For BSS-FM with variances of 0.5 or 10 and for BSS-IL with a variance other than 1, changes are identified in the loadings of two items: ADMALS2 (accomplished less because of emotional problems) and ADCAPE2 (felt calm/peaceful). ADMALS2 changed from a mental loading to both mental and physical loadings, while ADCAPE2 changed from both loadings to a mental loading only. As its label suggests, ADMALS2 may be associated with both physical and emotional constructs. ADCAPE2 corresponds to the mental construct by the original intention of the scale development. Thus, these changes have reasonable explanations. The altered model exhibits similar CFI and RMSEA values (0.964 and 0.088, respectively) for model fitting.

In the second scenario for the sensitivity analysis, if researchers have a strong understanding of potential cross loadings, they may use a slab probability ( πjk) higher than 0.5. In this context, we conduct a sensitivity analysis based on changes of the slab probability while keeping the mean and variance for the spike and slab prior noninformative (0 and 1, respectively). We vary the slab probability between 0.2 and 0.8 (0.2, 0.4, 0.5, 0.6, 0.8). The value 0.5 indicates a noninformative prior, which we use for the original data analysis. Once again, the changes are limited to the two items ADMALS2 and ADCAPE2. For BSS-FM with πjk of 0.2, 0.4, and 0.6 and for BSS-IL with πjk of 0.2 and 0.4, the same model as in our original analysis is selected. For BSS-IL with πjk = 0.6, ADMALS2 and ADCAPE2 have the same loadings as in the model described above with the variance changes. For BSS-FM and BSS-IL with πjk = 0.8, which indicates a strong belief in the existence of cross loadings, ADMALS2 and ADCAPE2 have loadings on both constructs. This model also exhibits similar CFI and RMSEA values (0.965 and 0.089, respectively).

Overall, we can suggest that our proposed methods produce relatively robust results, while our observations indicate that BSS-FM is more robust to hyperparameter changes than BSS-IL. Some different choices of the hyperparameters may provide further interpretations of certain items with changed structures; items whose structures change may have relatively less solid interpretations. As a rule of thumb, for the final model choice, unless a strong prior belief exists regarding the hyperparameters, we recommend choosing the model that shows consistent results by both BSS-FM and BSS-IL, using common non-informative prior choices such as σλ0jk2=1 and πjk = 0.5.
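The two sensitivity grids described above can be enumerated explicitly; the following is a minimal sketch of the settings (our own illustration, not the authors' code), where each method would be refit at every (variance, slab probability) pair.

```python
# Scenario 1: vary the slab variance with the slab probability fixed at 0.5.
# Scenario 2: vary the slab probability with prior mean 0 and variance 1.
slab_variances = [0.5, 1, 2, 5, 10]
slab_probs = [0.2, 0.4, 0.5, 0.6, 0.8]

scenario1 = [(v, 0.5) for v in slab_variances]  # (variance, slab probability)
scenario2 = [(1, p) for p in slab_probs]

# The noninformative baseline (variance 1, slab probability 0.5) appears in
# both grids, so it is counted once.
baseline = (1, 0.5)
settings = scenario1 + [s for s in scenario2 if s != baseline]
print(len(settings))  # 9 distinct settings
```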

6. Conclusion

In this paper, we presented a novel Bayesian variable-selection approach for factor loading selection in CFA. Our methods employed SSP for variable selection, and we discussed how SSP distributions can be used in model selection. In the proposed methods, the probability of including each factor loading was assessed through a marginal conditional distribution and variables were selected using the binomial distribution, where the assessment involved unobserved latent factor scores. This approach enabled us to assess the level of uncertainties related to cross loading inclusion and model selection. Our methods simplified the calculation of inclusion probabilities and showed an improvement in variable selection performance compared to other existing methods. In addition, to improve the model identification procedures and the numerical stability of the factor model, we proposed model identification constraints for a bifactor model altered by certain cross loadings, which offered ample flexibility in including those loadings. We conducted a Monte Carlo simulation study under various scenarios, including different numbers of items, factors, and cross loadings, where our primary focus was to examine the rate of correct model identification. Our proposed methods led to more correctly identified results when compared to other available methods. For real data analysis, we applied our proposed methods to the SF-12V2 dataset from MEPS. The BSS-IL and BSS-FM models showed that items in the physical health area were loaded on the mental health area and that the majority of items in the mental health area were also loaded on the physical health area. This suggested that the statuses of one’s physical health and mental health may be highly linked. Our models demonstrated that incorporating cross loadings can improve the validity and reliability of the factor model by accounting for the intricacy and multi-dimensionality of the constructs under investigation. 
In summary, our BSS-IL and BSS-FM methods outperformed other methods in factor model selection, and we demonstrated their effectiveness through simulation studies and real-world applications.

Appendices.

Appendix 1: Proofs

In this appendix, we provide the proofs of the theorems and corollaries in Section 3. We first state the following lemmas to simplify the proofs of the theorems.

Lemma 1

Suppose a cyclic system of equations xy = x′y′, xz = x′z′, yz = y′z′, where x, y, z, x′, y′, z′ are real numbers. Then x′ = ±x, y′ = ±y, z′ = ±z, where the signs of x′, y′, z′ are the same.

Proof: If x′ = cx for a real constant c, then y′ = (1/c)y and z′ = (1/c)z. This leads to y′z′ = (1/c²)yz = yz. Thus, c = ±1.
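A small numeric illustration of Lemma 1 (our own check, not part of the proof): preserving the three pairwise products only permits flipping all three signs at once.

```python
# Pick arbitrary nonzero dyadic values so all float products are exact.
x, y, z = 1.5, -2.0, 0.75

for c in (1.0, -1.0):
    xp, yp, zp = c * x, y / c, z / c   # x' = c*x forces y' = y/c, z' = z/c
    assert xp * yp == x * y and xp * zp == x * z
    # the remaining equation y'z' = yz pins c to +/-1:
    assert yp * zp == (1 / c**2) * y * z

# Any scaling other than +/-1 breaks the third equation.
c = 2.0
assert (y / c) * (z / c) != y * z
print("Lemma 1 sign constraint verified")
```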

Lemma 2

Suppose that a_i (i = 1, …, n < ∞) are arbitrary real numbers, and x_i and x_i′ are real constants. Then, ∑_{i=1}^n a_i x_i = ∑_{i=1}^n a_i x_i′ ⟹ x_i = x_i′, i = 1, …, n.

Proof: ∑_{i=1}^n a_i x_i = ∑_{i=1}^n a_i x_i′ ⟹ ∑_{i=1}^n a_i (x_i − x_i′) = 0. We can let x_1 − x_1′ = c_i (x_i − x_i′), i = 1, …, n, for some constants c_i. Then, ∑_{i=1}^n a_i (x_i − x_i′) = 0 ⟹ (x_1 − x_1′) ∑_{i=1}^n (a_i/c_i) = 0. Thus, x_1 = x_1′. In turn, x_i = x_i′, i = 1, …, n.

Proof of Theorem 1 —

The elements in Σ are expressed as Σ = λ_0 λ_0^T + λ_1 λ_1^T + λ_2 λ_2^T + Φ, where Φ affects only the diagonal elements of Σ.

(i) Without loss of generality, let us assume |B_{1,c}| ≥ 3 and 2 ∈ H_2 by conditions (C5) and (C4), respectively. The elements in Σ corresponding to B_{1,c} and B_{2,c} exist by condition (C2) and form off-diagonal elements. Thus, to satisfy non-identifiability, we have λ_0′[B_{1,c}] λ_0′[B_{2,c}]^T = λ_0[B_{1,c}] λ_0[B_{2,c}]^T since λ_1[B_{2,c}] = 0 and λ_2[B_{1,c}] = 0. This leads to

λ_0′[B_{1,c}] = k λ_0[B_{1,c}] and λ_0′[B_{2,c}] = (1/k) λ_0[B_{2,c}]. (A.1)

Since 2 ∈ H_2 by (C4), we have a disjoint union of S_1 and S_2 satisfying B_{2,c} = S_1 ∪ S_2, and each subset has rank at least 2. Then, for the off-diagonal elements in Σ corresponding to S_1 and S_2, non-identifiability results in

λ_0′[S_1] λ_0′[S_2]^T + λ_2′[S_1] λ_2′[S_2]^T = λ_0[S_1] λ_0[S_2]^T + λ_2[S_1] λ_2[S_2]^T ⟹ (1 − 1/k²) λ_0[S_1] λ_0[S_2]^T + λ_2[S_1] λ_2[S_2]^T = λ_2′[S_1] λ_2′[S_2]^T, (A.2)

by (A.1). To match the ranks of the left-hand side and the right-hand side of equation (A.2), k = ±1. Let k = 1 by (C3). Thus, by (A.1), we have

λ_0′[B_{1,c}] = λ_0[B_{1,c}] and λ_0′[B_{2,c}] = λ_0[B_{2,c}]. (A.3)

Now, for the off-diagonal elements in Σ corresponding to B_{1,c},

(λ_0′[B_{1,c}] λ_0′[B_{1,c}]^T + λ_1′[B_{1,c}] λ_1′[B_{1,c}]^T) ∘ J = (λ_0[B_{1,c}] λ_0[B_{1,c}]^T + λ_1[B_{1,c}] λ_1[B_{1,c}]^T) ∘ J ⟹ (λ_1′[B_{1,c}] λ_1′[B_{1,c}]^T) ∘ J = (λ_1[B_{1,c}] λ_1[B_{1,c}]^T) ∘ J,

by (A.3), where J is a conformable matrix consisting of 0 for the multiplication of the same indexes and 1 for the multiplication of different indexes in the operation of [B_{1,c}][B_{1,c}]^T, and ∘ indicates the Hadamard product. This satisfies a cyclic relationship in Lemma 1 since |B_{1,c}| ≥ 3 by (C5). Thus,

λ_1′[B_{1,c}] = λ_1[B_{1,c}]. (A.4)

For the non-identifiability of the off-diagonal elements in Σ corresponding to B_{1,c} and B ∖ (B_{1,c} ∪ B_{2,c}), write D = B ∖ (B_{1,c} ∪ B_{2,c}); we have

λ_0′[B_{1,c}] λ_0′[D]^T + λ_1′[B_{1,c}] λ_1′[D]^T = λ_0[B_{1,c}] λ_0[D]^T + λ_1[B_{1,c}] λ_1[D]^T ⟹ λ_0[B_{1,c}] λ_0′[D]^T + λ_1[B_{1,c}] λ_1′[D]^T = λ_0[B_{1,c}] λ_0[D]^T + λ_1[B_{1,c}] λ_1[D]^T, (A.5)

by (A.3) and (A.4). But, by Lemma 2 and rank(Λ[B_{1,c}]) ≥ 2 by (C6), (A.5) leads to

λ_0′[D] = λ_0[D] and λ_1′[D] = λ_1[D], where D = B ∖ (B_{1,c} ∪ B_{2,c}). (A.6)

(A.3), (A.4) and (A.6) indicate that all elements in λ_0 and λ_1 are identifiable. Finally, for the non-identifiability of the off-diagonal elements in Σ corresponding to B_{1,c} or B_2, we have

(λ_2′[B_2 ∪ B_{1,c}] λ_2′[B_2 ∪ B_{1,c}]^T) ∘ J = (λ_2[B_2 ∪ B_{1,c}] λ_2[B_2 ∪ B_{1,c}]^T) ∘ J,

where J is a conformable matrix consisting of 0 for the multiplication of the same indexes and 1 for the multiplication of different indexes in the operation of [B_2 ∪ B_{1,c}][B_2 ∪ B_{1,c}]^T. This satisfies a cyclic relationship in Lemma 1 if |B_2| ≥ 3 by (C1), regardless of the existence of B_{1,c}. Thus, all elements in λ_2 are identifiable.

(ii) Without loss of generality, let us assume |B_{1,c}| ≥ 3 and 1 ∈ H_2 by (C5) and (C4), respectively. The elements in Σ corresponding to B_{1,c} and B_{2,c} are off-diagonal elements; thus we have the relationship (A.1) as in the proof of part (i). Since 1 ∈ H_2 by (C4), we have a disjoint union of S_1 and S_2 satisfying B_{1,c} = S_1 ∪ S_2, and each subset has rank at least 2. Then, for the off-diagonal elements in Σ corresponding to S_1 and S_2, non-identifiability results in

λ_0′[S_1] λ_0′[S_2]^T + λ_1′[S_1] λ_1′[S_2]^T = λ_0[S_1] λ_0[S_2]^T + λ_1[S_1] λ_1[S_2]^T ⟹ (1 − k²) λ_0[S_1] λ_0[S_2]^T + λ_1[S_1] λ_1[S_2]^T = λ_1′[S_1] λ_1′[S_2]^T.

To match the ranks of the left-hand side and the right-hand side of the above equation, as with (A.2), k = ±1. The rest of the proof is the same as that of part (i).

Proof of Corollary 1 —

Without loss of generality, let us assume |B_{1,c}| ≥ 3 and |B_{2,c}| ≥ 1 by (C5) and (C2) in Theorem 1. The elements in Σ corresponding to B_{1,c} and B_{2,c} are off-diagonal elements and thus satisfy non-identifiability if λ_0′[B_{1,c}] λ_0′[B_{2,c}]^T = λ_0[B_{1,c}] λ_0[B_{2,c}]^T since λ_1[B_{2,c}] = 0 and λ_2[B_{1,c}] = 0. Thus, we have

λ_0′[B_{1,c}] = k λ_0[B_{1,c}] and λ_0′[B_{2,c}] = (1/k) λ_0[B_{2,c}].

However, since one element in λ_0[B_{1,c}] is fixed, k = 1. Thus, we have

λ_0′[B_{1,c}] = λ_0[B_{1,c}] and λ_0′[B_{2,c}] = λ_0[B_{2,c}].

This is the same conclusion as in (A.3). Now, similar arguments to those used in the proof of Theorem 1 lead to an identifiable Λ.

Proof of Theorem 2 —

Without loss of generality, let |B_{1,c}| ≥ 3 satisfying (C5) and 2 ∈ H_2 by (C4). The elements in Σ corresponding to B_{1,c} and B_{2,c} exist by (C2) and form off-diagonal elements. Based on (C3), the same arguments as those using (C1), (C2) and (C3) in Theorem 1 lead to

λ_0′[B_{1,c}] = λ_0[B_{1,c}] and λ_0′[B_{2,c}] = λ_0[B_{2,c}]. (A.7)

Similar to part (ii) of the proof of Theorem 1, the same conclusion holds for |B_{1,c}| ≥ 3 satisfying (C5) and 1 ∈ H_2 by (C4). Now, for the off-diagonal elements in Σ corresponding to B_{1,c}, the same argument used for (C4) establishes λ_1′[B_{1,c}] = λ_1[B_{1,c}].

For the off-diagonal elements in Σ corresponding to B_{1,c} and B ∖ (B_{1,c} ∪ B_2 ∪ ⋯ ∪ B_G), the same argument as in Theorem 1 establishes

λ_0′[B ∖ (B_{1,c} ∪ B_2 ∪ ⋯ ∪ B_G)] = λ_0[B ∖ (B_{1,c} ∪ B_2 ∪ ⋯ ∪ B_G)] and λ_1′[B ∖ (B_{1,c} ∪ B_2 ∪ ⋯ ∪ B_G)] = λ_1[B ∖ (B_{1,c} ∪ B_2 ∪ ⋯ ∪ B_G)],

where the argument requires (C6). So far, we have identified all elements in λ_0 and λ_1.

Now, for B_{2,c}, |B_{2,c}| ≥ 2 by (C2) results in a cyclic system of equations. For example, assume B_{2,c} = {a, b}, where a and b indicate the item indexes corresponding to the factor g = 2. Since all elements in λ_0 and λ_1 are identified and λ_{ka} = λ_{kb} = 0 for all k ≥ 3, this leads to the non-identifiability relationship

λ_{0a} λ_{0b} + λ_{1a} λ_{1b} + λ_{2a}′ λ_{2b}′ + ⋯ = λ_{0a} λ_{0b} + λ_{1a} λ_{1b} + λ_{2a} λ_{2b} + ⋯ ⟹ λ_{2a}′ λ_{2b}′ = λ_{2a} λ_{2b}.

Likewise, we have λ_{2a}′ λ_{2i}′ = λ_{2a} λ_{2i} and λ_{2b}′ λ_{2i}′ = λ_{2b} λ_{2i} for any i ∈ B_2, which form a cyclic system of equations. This identifies all of λ_2. Similar arguments can be made for B_{g,c} with |B_{g,c}| ≥ 2, leading to the identification of all λ_g, g ∈ H_1, satisfying (C2).

Now, without loss of generality, assume g = G does not satisfy (C2). Since all of λ_0, …, λ_{G−1} are identified, for the off-diagonal elements in Σ corresponding to B_G, the non-identifiability leads to

λ_{Gi}′ λ_{Gj}′ = λ_{Gi} λ_{Gj}, i ≠ j, i, j ∈ B_G.

If |B_G| ≥ 3 by (C1), we have a cyclic system of equations, leading to the identification of λ_G[B_G]. This leads to the identification of all of λ_G by observing the relationship λ_{Gi}′ λ_{Gj}′ = λ_{Gi} λ_{Gj}, i ∈ B ∖ B_G, j ∈ B_G. We note that, if there are cross loadings in B ∖ B_G, |B_G| can be less than 3 by forming the cyclic system of equations λ_{gi}′ λ_{Gj}′ = λ_{gi} λ_{Gj}, g ≠ G, j ∈ B_G; however, |B_G| needs to be at least 3 if no cross loading exists in B ∖ B_G.

Proof of Corollary 2 —

Without loss of generality, let us assume |B_{1,c}| ≥ 3 and |B_{2,c}| ≥ 2 by (C5) and (C2) in Theorem 2. The elements in Σ corresponding to B_{1,c} and B_{2,c} are off-diagonal elements and thus satisfy non-identifiability if λ_0′[B_{1,c}] λ_0′[B_{2,c}]^T = λ_0[B_{1,c}] λ_0[B_{2,c}]^T, similar to the argument in the proof of Theorem 2. Thus, we have

λ_0′[B_{1,c}] = k λ_0[B_{1,c}] and λ_0′[B_{2,c}] = (1/k) λ_0[B_{2,c}].

However, since one element in λ_0[B_{1,c}] is fixed, k = 1. Thus, we have

λ_0′[B_{1,c}] = λ_0[B_{1,c}] and λ_0′[B_{2,c}] = λ_0[B_{2,c}].

This is the same conclusion as in (A.7). Now, similar arguments to those used in the proof of Theorem 1 lead to an identifiable Λ.

Appendix 2: Derivation of (7) and (9)

For the r_{jk} = 0 case, on the right-hand side of Equation (7), we can show

exp{−(1/2) ∑_{i=1}^N (Y_i − Λ_k^c f_{ki})^T Ψ^{−1} (Y_i − Λ_k^c f_{ki})} = exp{−(1/2) ∑_{i=1}^N (Y_i^T Ψ^{−1} Y_i − 2 Y_i^T Ψ^{−1} Λ_k^c f_{ki} + f_{ki}^T Λ_k^{cT} Ψ^{−1} Λ_k^c f_{ki})}. (B.1)

Without loss of generality, let us assume that λ_{1k}, …, λ_{j−1,k} have the predetermined structure. Then,

Y_i^T Ψ^{−1} Y_i = (y_{i1}, …, y_{ip}) diag(ψ_1^{−1}, …, ψ_p^{−1}) (y_{i1}, …, y_{ip})^T = y_{i1}²/ψ_1 + ⋯ + y_{ip}²/ψ_p,

Y_i^T Ψ^{−1} Λ_k^c f_{ki} = (y_{i1}, …, y_{ip}) diag(ψ_1^{−1}, …, ψ_p^{−1}) (λ_{1k}, …, λ_{j−1,k}, λ_{jk}(= 0), 0, …, 0)^T f_{ki} = y_{i1} λ_{1k} f_{ki}/ψ_1 + ⋯ + y_{i,j−1} λ_{j−1,k} f_{ki}/ψ_{j−1} + 0 + ⋯ + 0,

f_{ki}^T Λ_k^{cT} Ψ^{−1} Λ_k^c f_{ki} = λ_{1k}² f_{ki}²/ψ_1 + ⋯ + λ_{j−1,k}² f_{ki}²/ψ_{j−1} + 0 + ⋯ + 0.

Thus, the original Equation (5) is expressed as

p(Λ_k, R, Ψ | Y, F, r_{jk} = 0) ∝ exp{−(1/2)[∑_{i=1}^N (y_{i1}²/ψ_1 − 2 y_{i1} λ_{1k} f_{ki}/ψ_1 + λ_{1k}² f_{ki}²/ψ_1) + (λ_{1k} − λ_{01k})²/(ψ_1 σ_{λ01k}²) + ⋯ + ∑_{i=1}^N (y_{i,j−1}²/ψ_{j−1} − 2 y_{i,j−1} λ_{j−1,k} f_{ki}/ψ_{j−1} + λ_{j−1,k}² f_{ki}²/ψ_{j−1}) + (λ_{j−1,k} − λ_{0,j−1,k})²/(ψ_{j−1} σ_{λ0,j−1,k}²) + ∑_{i=1}^N y_{ij}²/ψ_j + ⋯ + ∑_{i=1}^N y_{ip}²/ψ_p]}. (B.2)

For the first term of the exponent in (B.2),

exp{−(1/2)[∑_{i=1}^N (y_{i1}²/ψ_1 − 2 y_{i1} λ_{1k} f_{ki}/ψ_1 + λ_{1k}² f_{ki}²/ψ_1) + (λ_{1k} − λ_{01k})²/(ψ_1 σ_{λ01k}²)]}
= exp{−(1/2)[λ_{1k}² (∑_{i=1}^N f_{ki}²/ψ_1 + 1/(ψ_1 σ_{λ01k}²)) − 2 λ_{1k} (λ_{01k}/(ψ_1 σ_{λ01k}²) + ∑_{i=1}^N y_{i1} f_{ki}/ψ_1) + λ_{01k}²/(ψ_1 σ_{λ01k}²) + ∑_{i=1}^N y_{i1}²/ψ_1]}
= exp[−(K_1/2)(λ_{1k}² − 2 (L_1/K_1) λ_{1k}) − (1/2)(λ_{01k}²/(ψ_1 σ_{λ01k}²) + ∑_{i=1}^N y_{i1}²/ψ_1)]
= exp[−(K_1/2)(λ_{1k} − L_1/K_1)²] exp[L_1²/(2K_1) − (1/2)(λ_{01k}²/(ψ_1 σ_{λ01k}²) + ∑_{i=1}^N y_{i1}²/ψ_1)]
= (K_1/(2π))^{1/2} exp[−(K_1/2)(λ_{1k} − L_1/K_1)²] × (2π/K_1)^{1/2} exp[L_1²/(2K_1) − (1/2)(λ_{01k}²/(ψ_1 σ_{λ01k}²) + ∑_{i=1}^N y_{i1}²/ψ_1)], (B.3)

where K_1 = ∑_{i=1}^N f_{ki}²/ψ_1 + 1/(ψ_1 σ_{λ01k}²) and L_1 = λ_{01k}/(ψ_1 σ_{λ01k}²) + ∑_{i=1}^N y_{i1} f_{ki}/ψ_1. In (B.3), (K_1/(2π))^{1/2} exp[−(K_1/2)(λ_{1k} − L_1/K_1)²] is a normal density and can be integrated out as ∫ dF(λ_{1k}) = 1.
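The completing-the-square step above can be spot-checked numerically; the following is our own verification sketch (not from the paper) of the identity K_1 λ² − 2 L_1 λ = K_1 (λ − L_1/K_1)² − L_1²/K_1 underlying (B.3).

```python
import numpy as np

rng = np.random.default_rng(0)

# Random positive K1, arbitrary L1, and a grid of candidate lambda_{1k} values.
K1 = float(rng.uniform(0.1, 5.0))
L1 = float(rng.normal())
lam = rng.normal(size=100)

# Quadratic form before and after completing the square.
lhs = K1 * lam**2 - 2.0 * L1 * lam
rhs = K1 * (lam - L1 / K1) ** 2 - L1**2 / K1
assert np.allclose(lhs, rhs)
```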

In this way, from (B.2), we obtain

p_0 = (2π/K_1)^{1/2} exp[L_1²/(2K_1) − (1/2)(λ_{01k}²/(ψ_1 σ_{λ01k}²) + ∑_{i=1}^N y_{i1}²/ψ_1)] × ⋯ × (2π/K_{j−1})^{1/2} exp[L_{j−1}²/(2K_{j−1}) − (1/2)(λ_{0,j−1,k}²/(ψ_{j−1} σ_{λ0,j−1,k}²) + ∑_{i=1}^N y_{i,j−1}²/ψ_{j−1})] × exp[−(1/2)(∑_{i=1}^N y_{ij}²/ψ_j + ⋯ + ∑_{i=1}^N y_{ip}²/ψ_p)] π_{1k} ⋯ π_{j−1,k} (1 − π_{jk}),

where K_j = ∑_{i=1}^N f_{ki}²/ψ_j + 1/(ψ_j σ_{λ0jk}²) and L_j = λ_{0jk}/(ψ_j σ_{λ0jk}²) + ∑_{i=1}^N y_{ij} f_{ki}/ψ_j.

For Equation (6) (the r_{jk} = 1 case), a similar derivation gives rise to

p_1 = (2π/K_1)^{1/2} exp[L_1²/(2K_1) − (1/2)(λ_{01k}²/(ψ_1 σ_{λ01k}²) + ∑_{i=1}^N y_{i1}²/ψ_1)] × ⋯ × (2π/K_j)^{1/2} exp[L_j²/(2K_j) − (1/2)(λ_{0jk}²/(ψ_j σ_{λ0jk}²) + ∑_{i=1}^N y_{ij}²/ψ_j)] × exp[−(1/2)(∑_{i=1}^N y_{i,j+1}²/ψ_{j+1} + ⋯ + ∑_{i=1}^N y_{ip}²/ψ_p)] π_{1k} π_{2k} ⋯ π_{jk}.

Thus, we have the posterior distribution of r_{jk} as a Bernoulli random variable with the conditional probability

p_1/(p_1 + p_0) = (2π/K_j)^{1/2} exp[L_j²/(2K_j) − (1/2)(λ_{0jk}²/(ψ_j σ_{λ0jk}²) + ∑_{i=1}^N y_{ij}²/ψ_j)] π_{jk} / {(2π/K_j)^{1/2} exp[L_j²/(2K_j) − (1/2)(λ_{0jk}²/(ψ_j σ_{λ0jk}²) + ∑_{i=1}^N y_{ij}²/ψ_j)] π_{jk} + exp[−(1/2) ∑_{i=1}^N y_{ij}²/ψ_j] (1 − π_{jk})}.
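This conditional probability can be computed stably in log space, since the common factor exp[−(1/2)∑ y_{ij}²/ψ_j] cancels in the ratio. Below is a minimal sketch in our notation (not the authors' implementation; the function and argument names are hypothetical), followed by the Bernoulli draw of r_{jk} as used in a Gibbs step.

```python
import numpy as np

def inclusion_prob(y_j, f_k, psi_j, lam0, sigma2, pi_jk):
    """Conditional probability that r_jk = 1, following the formula above.
    y_j: observations for item j; f_k: factor-k scores; psi_j: unique
    variance; lam0, sigma2: slab prior mean and variance; pi_jk: prior
    slab probability."""
    K = np.sum(f_k**2) / psi_j + 1.0 / (psi_j * sigma2)
    L = lam0 / (psi_j * sigma2) + np.sum(y_j * f_k) / psi_j
    # Log weights of the slab (p1) and spike (p0) terms after cancelling
    # the shared exp(-sum(y^2)/(2*psi_j)) factor.
    log_p1 = (0.5 * np.log(2.0 * np.pi / K) + L**2 / (2.0 * K)
              - lam0**2 / (2.0 * psi_j * sigma2) + np.log(pi_jk))
    log_p0 = np.log(1.0 - pi_jk)
    m = max(log_p1, log_p0)          # subtract the max to avoid overflow
    p1, p0 = np.exp(log_p1 - m), np.exp(log_p0 - m)
    return p1 / (p1 + p0)

# Toy example: an item with a true loading of 0.6 on factor k.
rng = np.random.default_rng(1)
f = rng.normal(size=200)
y = 0.6 * f + rng.normal(scale=0.8, size=200)
p = inclusion_prob(y, f, psi_j=0.64, lam0=0.0, sigma2=1.0, pi_jk=0.5)
r_jk = rng.binomial(1, p)            # the Gibbs draw of r_jk
```

With a clearly nonzero loading and moderate sample size, the inclusion probability is driven close to 1 by the L²/(2K) term.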

For the derivation of (9), we have the following calculations. Let z_{ij} = y_{ij} − Λ_{j,−k} f_i, where Λ_{j,−k} denotes the jth row of Λ with its kth element set to 0. For the r_{jk} = 0 case,

p_0 = exp{−(1/(2ψ_j)) ∑_{i=1}^N z_{ij}²} ∏_{k=1}^q (ψ_j σ_{λ0jk}²)^{−1/2} exp{−λ_{0jk}²/(2 ψ_j σ_{λ0jk}²)} (1 − π_{jk}).

For the r_{jk} = 1 case,

p(Λ_{jk}, r_j, ψ_j | Y_j, F, r_{jk} = 1) ∝ exp[−(1/(2ψ_j)) ∑_{i=1}^N (z_{ij} − λ_{jk} f_{ik})²] ∏_{k=1}^q (ψ_j σ_{λ0jk}²)^{−1/2} exp[−(λ_{jk} − λ_{0jk})²/(2 ψ_j σ_{λ0jk}²)] π_{jk}
∝ exp[−(1/(2ψ_j)) ∑_{i=1}^N (z_{ij}² − 2 λ_{jk} f_{ik} z_{ij} + λ_{jk}² f_{ik}²)] ∏_{k=1}^q (ψ_j σ_{λ0jk}²)^{−1/2} exp[−(λ_{jk}² − 2 λ_{jk} λ_{0jk} + λ_{0jk}²)/(2 ψ_j σ_{λ0jk}²)] π_{jk}
∝ exp{−(1/2)[(∑_{i=1}^N f_{ik}²/ψ_j + 1/(ψ_j σ_{λ0jk}²)) λ_{jk}² − 2 (∑_{i=1}^N f_{ik} z_{ij}/ψ_j + λ_{0jk}/(ψ_j σ_{λ0jk}²)) λ_{jk} + λ_{0jk}²/(ψ_j σ_{λ0jk}²)]} × ∏_{k=1}^q (ψ_j σ_{λ0jk}²)^{−1/2} exp{−(1/(2ψ_j)) ∑_{i=1}^N z_{ij}²} π_{jk}. (B.4)

Define K = ∑_{i=1}^N f_{ik}²/ψ_j + 1/(ψ_j σ_{λ0jk}²) and L = ∑_{i=1}^N f_{ik} z_{ij}/ψ_j + λ_{0jk}/(ψ_j σ_{λ0jk}²). Then, (B.4) is

p(Λ_{jk}, r_j, ψ_j | Y_j, F, r_{jk} = 1) ∝ exp(−(1/(2ψ_j)) ∑_{i=1}^N z_{ij}²) exp[−(1/2)(K λ_{jk}² − 2 L λ_{jk} + λ_{0jk}²/(ψ_j σ_{λ0jk}²))] π_{jk} = exp(−(1/(2ψ_j)) ∑_{i=1}^N z_{ij}²) exp[−(K/2)(λ_{jk}² − 2 (L/K) λ_{jk} + L²/K² − L²/K² + λ_{0jk}²/(K ψ_j σ_{λ0jk}²))] π_{jk}. (B.5)

Now, integrating out with respect to λ_{jk}, (B.5) becomes

p_1 = exp(−(1/(2ψ_j)) ∑_{i=1}^N z_{ij}²) π_{jk} exp[(K/2)(L²/K² − λ_{0jk}²/(K ψ_j σ_{λ0jk}²))] (2π/K)^{1/2} = exp(−(1/(2ψ_j)) ∑_{i=1}^N z_{ij}²) π_{jk} exp[(1/2)(L²/K − λ_{0jk}²/(ψ_j σ_{λ0jk}²))] (2π/K)^{1/2}.

We have the posterior distribution of r_{jk} as a Bernoulli random variable with the conditional probability

p_1/(p_1 + p_0) = exp[(1/2)(L²/K − λ_{0jk}²/(ψ_j σ_{λ0jk}²))] (2π/K)^{1/2} π_{jk} / {exp[(1/2)(L²/K − λ_{0jk}²/(ψ_j σ_{λ0jk}²))] (2π/K)^{1/2} π_{jk} + exp[−λ_{0jk}²/(2 ψ_j σ_{λ0jk}²)] (1 − π_{jk})}.

Disclosure statement

No potential conflict of interest was reported by the author(s).

References

  • 1. Akaike H., Information theory and an extension of the maximum likelihood principle, in Selected Papers of Hirotugu Akaike, Springer, 1973, pp. 199–213.
  • 2. Anderson T., and Rubin H., Statistical inference in factor analysis, in Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability, University of California Press, 1956, pp. 111.
  • 3. Asparouhov T., and Muthén B., Exploratory structural equation modeling. Struct. Eq. Model.: Multidiscip. J. 16 (2009), pp. 397–438.
  • 4. Bader M., and Moshagen M., No probifactor model fit index bias, but a propensity toward selecting the best model, 2022.
  • 5. Barrett P., Structural equation modelling: adjudging model fit. Pers. Individ. Dif. 42 (2007), pp. 815–824.
  • 6. Bentler P.M., Comparative fit indexes in structural models. Psychol. Bull. 107 (1990), pp. 238.
  • 7. Bollen K.A., Structural Equations with Latent Variables, Vol. 210, John Wiley & Sons, New York, 1989.
  • 8. Brown T.A., and Moore M.T., Confirmatory factor analysis. Handbook Struct. Eq. Model. 361 (2012), pp. 379.
  • 9. Casella G., et al., Penalized regression, standard errors, and Bayesian lassos. Bayesian Anal. 5 (2010), pp. 369–411.
  • 10. Chen F.F., Sensitivity of goodness of fit indexes to lack of measurement invariance. Struct. Eq. Model.: Multidiscip. J. 14 (2007), pp. 464–504.
  • 11. Chen J., Partially confirmatory approach to factor analysis with Bayesian learning: a LAWBL tutorial. Struct. Eq. Model.: Multidiscip. J. 29 (2022), pp. 800–816.
  • 12. Cheung G.W., and Rensvold R.B., Evaluating goodness-of-fit indexes for testing measurement invariance. Struct. Equ. Modeling 9 (2002), pp. 233–255.
  • 13. Efroymson M.A., Multiple regression analysis, in Mathematical Methods for Digital Computers, Vol. 1, Ralston A., and Wilf H.S., eds., Wiley, New York, 1960, pp. 191–203.
  • 14. Fang G., et al., Identifiability of bifactor models, arXiv preprint arXiv:2012.12196, 2020.
  • 15. Geman S., and Geman D., Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Trans. Pattern Anal. Mach. Intell. 6 (1984), pp. 721–741.
  • 16. George E.I., and McCulloch R.E., Variable selection via Gibbs sampling. J. Am. Stat. Assoc. 88 (1993), pp. 881–889.
  • 17. Hadi A.S., and Chatterjee S., Regression Analysis by Example, John Wiley & Sons, New York, 2015.
  • 18. Hays R.D., Sherbourne C.D., and Mazel R.M., The RAND 36-item health survey 1.0. Health Econ. 2 (1993), pp. 217–227.
  • 19. Hirose K., and Konishi S., Variable selection via the grouped weighted lasso for factor analysis models, 2010.
  • 20. Hoerl A.E., and Kennard R.W., Ridge regression: biased estimation for nonorthogonal problems. Technometrics 12 (1970), pp. 55–67.
  • 21. Hogarty K.Y., et al., Selection of variables in exploratory factor analysis: an empirical comparison of a stepwise and traditional approach. Psychometrika 69 (2004), pp. 593–611.
  • 22. Holzinger K.J., and Swineford F., The bi-factor method. Psychometrika 2 (1937), pp. 41–54.
  • 23. Howe W.G., Some Contributions to Factor Analysis, Oak Ridge National Laboratory, Oak Ridge, TN, 1955.
  • 24. Hu L.T., and Bentler P.M., Cutoff criteria for fit indexes in covariance structure analysis: conventional criteria versus new alternatives. Struct. Eq. Model.: Multidiscip. J. 6 (1999), pp. 1–55.
  • 25. Ishwaran H., and Rao J.S., Spike and slab variable selection: frequentist and Bayesian strategies. Ann. Stat. 33 (2005), pp. 730–773.
  • 26. Jacobucci R., Brandmaier A.M., and Kievit R.A., A practical guide to variable selection in structural equation modeling by using regularized multiple-indicators, multiple-causes models. Adv. Methods Pract. Psychol. Sci. 2 (2019), pp. 55–76.
  • 27. Jennrich R.I., and Bentler P.M., Exploratory bi-factor analysis. Psychometrika 76 (2011), pp. 537–549.
  • 28. Kathe N., et al., Assessment of reliability and validity of SF-12v2 among a diabetic population. Value Health 21 (2018), pp. 432–440.
  • 29. Kline R.B., Principles and Practice of Structural Equation Modeling, Guilford Publications, New York, 2015.
  • 30. Lee S.-Y., Structural Equation Modeling: A Bayesian Approach, John Wiley & Sons, Chichester, 2007.
  • 31. Lu Z.-H., Chow S.-M., and Loken E., Bayesian factor analysis as a variable-selection problem: alternative priors and consequences. Multivariate Behav. Res. 51 (2016), pp. 519–539.
  • 32. Mai Y., Zhang Z., and Wen Z., Comparing exploratory structural equation modeling and existing approaches for multiple regression with latent variables. Struct. Eq. Model.: Multidiscip. J. 25 (2018), pp. 737–749.
  • 33. Malsiner-Walli G., and Wagner H., Comparing spike and slab priors for Bayesian variable selection, arXiv preprint arXiv:1812.07259, 2018.
  • 34. Mansolf M., and Reise S.P., When and why the second-order and bifactor models are distinguishable. Intelligence 61 (2017), pp. 120–129.
  • 35. Mitchell T.J., and Beauchamp J.J., Bayesian variable selection in linear regression. J. Am. Stat. Assoc. 83 (1988), pp. 1023–1032.
  • 36. Montazeri A., et al., The 12-item medical outcomes study short form health survey version 2.0 (SF-12v2): a population-based validation study from Tehran, Iran. Health Qual. Life Outcomes 9 (2011), pp. 1–8.
  • 37. Mulaik S.A., Linear Causal Modeling with Structural Equations, Chapman and Hall/CRC, Boca Raton, 2009.
  • 38. Mulaik S.A., and Quartetti D.A., First order or higher order general factor? Struct. Eq. Model.: Multidiscip. J. 4 (1997), pp. 193–211.
  • 39. Muthén B., and Asparouhov T., Bayesian structural equation modeling: a more flexible representation of substantive theory. Psychol. Methods 17 (2012), pp. 313.
  • 40. Neath A.A., and Cavanaugh J.E., The Bayesian information criterion: background, derivation, and applications. Wiley Interdiscip. Rev.: Comput. Stat. 4 (2012), pp. 199–203.
  • 41. Papastamoulis P., and Ntzoufras I., On the identifiability of Bayesian factor analytic models. Stat. Comput. 32 (2022), pp. 1–29.
  • 42. Peeters C.F., Rotational uniqueness conditions under oblique factor correlation metric. Psychometrika 77 (2012), pp. 288–292.
  • 43. Peugh J., and Feldon D.F., “How well does your structural equation model fit your data?”: is Marcoulides and Yuan’s equivalence test the answer? CBE—Life Sci. Educ. 19 (2020), pp. es5.
  • 44. Pfadt J.M., et al., Bayesian estimation of single-test reliability coefficients. Multivariate Behav. Res. 57 (2022), pp. 620–641.
  • 45. Rothman K.J., Greenland S., and Lash T.L., Modern Epidemiology, Vol. 3, Wolters Kluwer Health/Lippincott Williams & Wilkins, Philadelphia, 2008.
  • 46. Schwarz G., Estimating the dimension of a model. Ann. Stat. 6 (1978), pp. 461–464.
  • 47. Schweizer K., Troche S.J., and DiStefano C., Scaling the variance of a latent variable while assuring constancy of the model. Front. Psychol. 10 (2019), pp. 887.
  • 48. Steiger J.H., and Lind J.C., Statistically based tests for the number of common factors, in Annual Meeting of the Psychometric Society, Iowa City, IA, 1980.
  • 49. Tibshirani R., Regression shrinkage and selection via the lasso. J. R. Stat. Soc.: Ser. B (Methodological) 58 (1996), pp. 267–288.
  • 50. Turkkan N., and Pham-Gia T., Computation of the highest posterior density interval in Bayesian analysis. J. Stat. Comput. Simul. 44 (1993), pp. 243–250.
  • 51. Ware Jr J.E., Kosinski M., and Keller S.D., A 12-item short-form health survey: construction of scales and preliminary tests of reliability and validity. Med. Care 34 (1996), pp. 220–233.
  • 52. Weiss L.G., et al., WISC-IV and clinical validation of the four- and five-factor interpretative approaches. J. Psychoeduc. Assess. 31 (2013), pp. 114–131.
  • 53. West S.G., Taylor A.B., and Wu W., Model fit and model selection in structural equation modeling. Handbook Struct. Equ. Model. 1 (2012), pp. 209–231.
  • 54. Xiao Y., Liu H., and Hau K.-T., A comparison of CFA, ESEM, and BSEM in test structure analysis. Struct. Eq. Model.: Multidiscip. J. 26 (2019), pp. 665–677.
  • 55. Zhang B., et al., Small but nontrivial: a comparison of six strategies to handle cross-loadings in bifactor predictive models. Multivariate Behav. Res. 58 (2021), pp. 1–18.
