Entropy. 2021 Jan 24;23(2):140. doi: 10.3390/e23020140

An Entropy-Based Tool to Help the Interpretation of Common-Factor Spaces in Factor Analysis

Nobuoki Eshima 1,*, Claudio Giovanni Borroni 2, Minoru Tabata 3, Takeshi Kurosawa 4
PMCID: PMC7912700  PMID: 33498798

Abstract

This paper proposes a method for deriving interpretable common factors based on canonical correlation analysis applied to the vectors of common factors and manifest variables in the factor analysis model. First, an entropy-based method for measuring factor contributions is reviewed. Second, the entropy-based contribution measure of the common-factor vector is decomposed into those of canonical common factors, and it is also shown that the importance order of factors is that of their canonical correlation coefficients. Third, the method is applied to derive interpretable common factors. Numerical examples are provided to demonstrate the usefulness of the present approach.

Keywords: canonical factor analysis, canonical common factor, entropy, factor contribution

1. Introduction

In factor analysis, extracting interpretable factors is important for practical data analysis. To this end, methods for factor rotation have been studied, e.g., varimax [1] and orthomax [2] for orthogonal rotations and oblimin [3] and orthoblique [4] for oblique rotations. The basic idea of factor rotation derives from Thurstone's criteria for simple structures of factor analysis models [5], and rotation methods are constructed to maximize variation functions of the squared factor loadings in order to obtain such simple structures. Let $X_i$ be manifest variables, let $\xi_j$ be latent variables (common factors), let $\varepsilon_i$ be unique factors related to $X_i$, and finally, let $\lambda_{ij}$ be factor loadings, i.e., the weights of common factors $\xi_j$ in explaining $X_i$. Then, the factor analysis model is given as follows:

$$X_i=\sum_{j=1}^{m}\lambda_{ij}\xi_j+\varepsilon_i,\quad i=1,2,\ldots,p, \tag{1}$$

where

$$\begin{cases}E(X_i)=E(\varepsilon_i)=0, & i=1,2,\ldots,p;\\ E(\xi_j)=0,\ \mathrm{Var}(\xi_j)=1, & j=1,2,\ldots,m;\\ \mathrm{Cov}(\xi_k,\xi_l)=\phi_{kl};\\ \mathrm{Var}(\varepsilon_i)=\omega_i^2>0, & i=1,2,\ldots,p;\\ \mathrm{Cov}(\varepsilon_k,\varepsilon_l)=0, & k\neq l.\end{cases}$$

To derive a simple structure of the factor analysis model, the varimax method, for example, maximizes the following variation function of the squared factor loadings:

$$V=\sum_{i=1}^{p}\sum_{j=1}^{m}\left(\lambda_{ij}^2-\overline{\lambda^2}\right)^2, \tag{2}$$

where $\overline{\lambda^2}=\frac{1}{pm}\sum_{i=1}^{p}\sum_{j=1}^{m}\lambda_{ij}^2$. In this sense, the basic factor rotation methods can be viewed as tools for exploratively analyzing multidimensional common-factor spaces. Factors are interpreted according to the manifest variables that receive large weights on them. As far as we know, apart from rotation methods similar to the basic ones above, no novel approaches have been investigated. In real data analyses, manifest variables can usually be classified in advance into groups that carry common factors and concepts of their own. For example, suppose we have a test battery including the following five subjects: Japanese, English, Social Science, Mathematics, and Natural Science. It is then reasonable to classify the five subjects into two groups, {Japanese, English, Social Science} and {Mathematics, Natural Science}. In such cases, it is meaningful to determine common factors related to the two groups of manifest variables. For this objective, it is useful to develop a novel method that derives the common factors from a factor contribution measure. Conventional rotation criteria are not related to factor contribution; for example, variation function (2) of the varimax method is not.
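As a small numerical illustration of criterion (2), the following sketch (our own; NumPy is assumed, and the loading matrix lam is hypothetical) evaluates $V$ for a given loading matrix. A rotation method would search over rotations of the loading matrix to maximize this value.

```python
import numpy as np

def varimax_criterion(lam):
    """Variation function V of the squared loadings, Equation (2),
    using the grand mean of the squared loadings as in the text."""
    sq = lam ** 2
    return np.sum((sq - sq.mean()) ** 2)

# Hypothetical 3 x 2 loading matrix for illustration only.
lam = np.array([[0.60, 0.39],
                [0.75, 0.24],
                [0.65, 0.00]])
print(varimax_criterion(lam))
```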

An entropy-based method for measuring factor contributions was proposed in [6]; it measures the contributions of common factors to manifest variable vectors and decomposes these contributions into those related to manifest subvectors and individual manifest variables. By using this method, we can derive important common factors related to the manifest subvectors and the manifest variables. The aim of the present paper is to propose a new entropy-based method for deriving simple structures, that is, for extracting common factors that are easy to interpret. In Section 2, the entropy-based method for measuring factor contributions [6] is reviewed in order to apply its properties to deriving simple structures in factor analysis models. Section 3 discusses canonical correlation analysis between common factors and manifest variables, and the contributions of the common factors to the manifest variables are decomposed into components related to the extracted pairs of canonical variables; a numerical example is given to demonstrate the approach. In Section 4, canonical correlation analysis is applied to obtain interpretable common factors, and the contributions of the extracted factors are measured; numerical examples illustrate the approach. Finally, Section 5 provides a discussion and conclusions summarizing the present approach.

2. Entropy-Based Method for Measuring Factor Contributions

First, in order to derive factor contributions, factor analysis model (1) with normally distributed error terms $\varepsilon_i$, $i=1,2,\ldots,p$, can be discussed in the framework of generalized linear models (GLMs) [7]. A general path diagram of the manifest variables $X_i$, $i=1,2,\ldots,p$, and common factors $\xi_j$, $j=1,2,\ldots,m$, in the factor analysis model is illustrated in Figure 1. The conditional density functions of the manifest variables $X_i$, given the factors $\xi_j$, are expressed as follows:

$$f_i(x_i\mid\boldsymbol{\xi})=\frac{1}{\sqrt{2\pi\omega_i^2}}\exp\!\left(-\frac{\bigl(x_i-\sum_{j=1}^{m}\lambda_{ij}\xi_j\bigr)^2}{2\omega_i^2}\right)=\exp\!\left(\frac{x_i\sum_{j=1}^{m}\lambda_{ij}\xi_j-\frac{1}{2}\bigl(\sum_{j=1}^{m}\lambda_{ij}\xi_j\bigr)^2}{\omega_i^2}-\frac{x_i^2}{2\omega_i^2}-\log\sqrt{2\pi\omega_i^2}\right),\quad i=1,2,\ldots,p.$$

Let $\theta_i=\sum_{j=1}^{m}\lambda_{ij}\xi_j$ and $d(x_i,\omega_i^2)=-\frac{x_i^2}{2\omega_i^2}-\log\sqrt{2\pi\omega_i^2}$. Then, the above density function is described in a GLM framework as

$$f_i(x_i\mid\boldsymbol{\xi})=\exp\!\left(\frac{x_i\theta_i-\frac{1}{2}\theta_i^2}{\omega_i^2}+d(x_i,\omega_i^2)\right),\quad i=1,2,\ldots,p. \tag{3}$$

According to the local independence of the manifest variables in factor analysis model (1), the conditional density function of $\boldsymbol{X}=(X_1,X_2,\ldots,X_p)^{\mathrm{T}}$ given $\boldsymbol{\xi}=(\xi_1,\xi_2,\ldots,\xi_m)^{\mathrm{T}}$ is expressed as

$$f(\boldsymbol{x}\mid\boldsymbol{\xi})=\prod_{i=1}^{p}\exp\!\left(\frac{x_i\theta_i-\frac{1}{2}\theta_i^2}{\omega_i^2}+d(x_i,\omega_i^2)\right)=\exp\!\left(\sum_{i=1}^{p}\frac{x_i\theta_i-\frac{1}{2}\theta_i^2}{\omega_i^2}+\sum_{i=1}^{p}d(x_i,\omega_i^2)\right). \tag{4}$$

Figure 1. Path diagram of a general factor analysis model.

Let $g(\boldsymbol{\xi})$ be the joint density function of common-factor vector $\boldsymbol{\xi}=(\xi_1,\xi_2,\ldots,\xi_m)^{\mathrm{T}}$; let $f_i(x_i)$ be the marginal density functions of $X_i$, $i=1,2,\ldots,p$; and let us set

$$\mathrm{KL}(\boldsymbol{X},\boldsymbol{\xi})=\iint f(\boldsymbol{x}\mid\boldsymbol{\xi})g(\boldsymbol{\xi})\log\frac{f(\boldsymbol{x}\mid\boldsymbol{\xi})}{f(\boldsymbol{x})}\,d\boldsymbol{x}\,d\boldsymbol{\xi}+\iint f(\boldsymbol{x})g(\boldsymbol{\xi})\log\frac{f(\boldsymbol{x})}{f(\boldsymbol{x}\mid\boldsymbol{\xi})}\,d\boldsymbol{x}\,d\boldsymbol{\xi}, \tag{5}$$
$$\mathrm{KL}(X_i,\boldsymbol{\xi})=\iint f_i(x_i\mid\boldsymbol{\xi})g(\boldsymbol{\xi})\log\frac{f_i(x_i\mid\boldsymbol{\xi})}{f_i(x_i)}\,dx_i\,d\boldsymbol{\xi}+\iint f_i(x_i)g(\boldsymbol{\xi})\log\frac{f_i(x_i)}{f_i(x_i\mid\boldsymbol{\xi})}\,dx_i\,d\boldsymbol{\xi},\quad i=1,2,\ldots,p, \tag{6}$$

where “KL” stands for “Kullback–Leibler information” [8]. From (3) and (4), we have

$$\mathrm{KL}(X_i,\boldsymbol{\xi})=\frac{\mathrm{Cov}(X_i,\theta_i)}{\omega_i^2}=\frac{\sum_{j=1}^{m}\lambda_{ij}\mathrm{Cov}(X_i,\xi_j)}{\omega_i^2},\quad i=1,2,\ldots,p; \tag{7}$$
$$\mathrm{KL}(\boldsymbol{X},\boldsymbol{\xi})=\sum_{i=1}^{p}\frac{\mathrm{Cov}(X_i,\theta_i)}{\omega_i^2}=\sum_{i=1}^{p}\frac{\sum_{j=1}^{m}\lambda_{ij}\mathrm{Cov}(X_i,\xi_j)}{\omega_i^2}. \tag{8}$$

Quantities (7) and (8) are interpreted as the signal-to-noise ratio for dependent variable $X_i$ and predictor $\theta_i$, and as the signal-to-noise ratio for dependent-variable vector $\boldsymbol{X}$ and common-factor vector $\boldsymbol{\xi}$, respectively.
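As a concrete rendering of (7) and (8), the following minimal NumPy sketch (our own illustration; the function name and interface are not from [6]) computes the per-variable signal-to-noise ratios and their total, using $\mathrm{Cov}(X_i,\xi_j)=(\Lambda\Phi)_{ij}$ implied by model (1):

```python
import numpy as np

def kl_contributions(Lam, Phi, Omega):
    """KL(X_i, xi) of Equation (7) for each manifest variable and the
    total KL(X, xi) of Equation (8); Omega is the diagonal matrix of
    unique variances omega_i^2."""
    cov_x_xi = Lam @ Phi                     # Cov(X_i, xi_j) = (Lam Phi)_{ij}
    omega2 = np.diag(Omega)                  # unique variances omega_i^2
    kl_i = (Lam * cov_x_xi).sum(axis=1) / omega2
    return kl_i, kl_i.sum()                  # the total is the sum (Theorem 1 below)
```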

From (7) and (8), the following theorem can be derived [6]:

Theorem 1.

In factor analysis model (1), let $\boldsymbol{X}=(X_1,X_2,\ldots,X_p)^{\mathrm{T}}$ and $\boldsymbol{\xi}=(\xi_1,\xi_2,\ldots,\xi_m)^{\mathrm{T}}$. Then,

$$\mathrm{KL}(\boldsymbol{X},\boldsymbol{\xi})=\sum_{i=1}^{p}\mathrm{KL}(X_i,\boldsymbol{\xi}).$$

Consistently, the following theorem, which is actually an extended version of Corollary 1 in [6], can also be obtained:

Theorem 2.

Let manifest variable subvectors $\boldsymbol{X}^{(a)}$, $a=1,2,\ldots,A$, be any decomposition of manifest variable vector $\boldsymbol{X}=(X_1,X_2,\ldots,X_p)^{\mathrm{T}}$. Then,

$$\mathrm{KL}(\boldsymbol{X},\boldsymbol{\xi})=\sum_{a=1}^{A}\mathrm{KL}(\boldsymbol{X}^{(a)},\boldsymbol{\xi}). \tag{9}$$

Following Eshima et al. [6], the contribution of factor vector $\boldsymbol{\xi}=(\xi_1,\xi_2,\ldots,\xi_m)^{\mathrm{T}}$ to manifest variable vector $\boldsymbol{X}=(X_1,X_2,\ldots,X_p)^{\mathrm{T}}$ is thus defined as

$$C(\boldsymbol{\xi}\to\boldsymbol{X})=\mathrm{KL}(\boldsymbol{X},\boldsymbol{\xi}),$$

so that, in Theorem 2, the contributions of factor vector $\boldsymbol{\xi}$ to manifest variable subvectors $\boldsymbol{X}^{(a)}$, $a=1,2,\ldots,A$, are defined by

$$C(\boldsymbol{\xi}\to\boldsymbol{X}^{(a)})=\mathrm{KL}(\boldsymbol{X}^{(a)},\boldsymbol{\xi}),\quad a=1,2,\ldots,A.$$

Let $\boldsymbol{\xi}_{\setminus j}$ be the subvector of all factors in $\boldsymbol{\xi}=(\xi_1,\xi_2,\ldots,\xi_m)^{\mathrm{T}}$ except $\xi_j$, i.e.,

$$\boldsymbol{\xi}_{\setminus j}=(\xi_1,\xi_2,\ldots,\xi_{j-1},\xi_{j+1},\ldots,\xi_m)^{\mathrm{T}},\quad j=1,2,\ldots,m;$$

and let $\mathrm{KL}(\boldsymbol{X},\boldsymbol{\xi}_{\setminus j}\mid\xi_j)$ and $\mathrm{KL}(\boldsymbol{X}^{(a)},\boldsymbol{\xi}_{\setminus j}\mid\xi_j)$ be the conditional Kullback–Leibler information defined as in (5) and (6). The contributions of the individual common factors $\xi_j$ are defined by

$$C(\xi_j\to\boldsymbol{X})=\mathrm{KL}(\boldsymbol{X},\boldsymbol{\xi})-\mathrm{KL}(\boldsymbol{X},\boldsymbol{\xi}_{\setminus j}\mid\xi_j),$$
$$C(\xi_j\to\boldsymbol{X}^{(a)})=\mathrm{KL}(\boldsymbol{X}^{(a)},\boldsymbol{\xi})-\mathrm{KL}(\boldsymbol{X}^{(a)},\boldsymbol{\xi}_{\setminus j}\mid\xi_j),\quad j=1,2,\ldots,m.$$

Remark 1.

The information $\mathrm{KL}(\boldsymbol{X},\boldsymbol{\xi}_{\setminus j}\mid\xi_j)$ and $\mathrm{KL}(\boldsymbol{X}^{(a)},\boldsymbol{\xi}_{\setminus j}\mid\xi_j)$ can be expressed by using the conditional covariances $\mathrm{Cov}(X_i,\theta_i\mid\xi_j)$. For example,

$$\mathrm{KL}(\boldsymbol{X},\boldsymbol{\xi}_{\setminus j}\mid\xi_j)=\sum_{i=1}^{p}\frac{\mathrm{Cov}(X_i,\theta_i\mid\xi_j)}{\omega_i^2}.$$

Finally, the following decomposition of $\mathrm{KL}(\boldsymbol{X},\boldsymbol{\xi})$ holds for orthogonal factors ([6], Theorem 3):

Theorem 3.

If the common factors are mutually independent, it follows that

$$C(\boldsymbol{\xi}\to\boldsymbol{X})=\sum_{j=1}^{m}\sum_{a=1}^{A}C(\xi_j\to\boldsymbol{X}^{(a)})=\sum_{j=1}^{m}\sum_{i=1}^{p}C(\xi_j\to X_i).$$

The entropy coefficient of determination (ECD) [9] between $\boldsymbol{\xi}$ and $\boldsymbol{X}$ is defined by

$$\mathrm{ECD}(\boldsymbol{\xi},\boldsymbol{X})=\frac{\mathrm{KL}(\boldsymbol{\xi},\boldsymbol{X})}{\mathrm{KL}(\boldsymbol{\xi},\boldsymbol{X})+1},$$

so that the total relative contribution of factor vector $\boldsymbol{\xi}$ to manifest variable vector $\boldsymbol{X}$ in entropy can be defined as

$$\widetilde{RC}(\boldsymbol{\xi}\to\boldsymbol{X})=\mathrm{ECD}(\boldsymbol{\xi},\boldsymbol{X})=\frac{C(\boldsymbol{\xi}\to\boldsymbol{X})}{C(\boldsymbol{\xi}\to\boldsymbol{X})+1},$$

while, for a single factor $\xi_j$, two relative contribution ratios can be defined:

$$RC(\xi_j\to\boldsymbol{X})=\frac{C(\xi_j\to\boldsymbol{X})}{C(\boldsymbol{\xi}\to\boldsymbol{X})}=\frac{\mathrm{KL}(\boldsymbol{X},\boldsymbol{\xi})-\mathrm{KL}(\boldsymbol{X},\boldsymbol{\xi}_{\setminus j}\mid\xi_j)}{\mathrm{KL}(\boldsymbol{\xi},\boldsymbol{X})},\qquad \widetilde{RC}(\xi_j\to\boldsymbol{X})=\frac{C(\xi_j\to\boldsymbol{X})}{\mathrm{KL}(\boldsymbol{\xi},\boldsymbol{X})+1}=\frac{\mathrm{KL}(\boldsymbol{X},\boldsymbol{\xi})-\mathrm{KL}(\boldsymbol{X},\boldsymbol{\xi}_{\setminus j}\mid\xi_j)}{\mathrm{KL}(\boldsymbol{\xi},\boldsymbol{X})+1}$$

(see [6] for details).
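For mutually independent factors ($\Phi=I$), the conditional covariances simplify, and Theorem 3 together with (7) gives $C(\xi_j\to\boldsymbol{X})=\sum_{i=1}^{p}\lambda_{ij}^2/\omega_i^2$. The following sketch (our own, valid only in this orthogonal case) computes the two relative contribution ratios:

```python
import numpy as np

def relative_contributions(Lam, Omega):
    """RC and RC-tilde for mutually independent factors (Phi = I), where
    C(xi_j -> X) = sum_i lambda_ij^2 / omega_i^2."""
    omega2 = np.diag(Omega)
    C_j = (Lam ** 2 / omega2[:, None]).sum(axis=0)  # per-factor contributions
    C_total = C_j.sum()                             # C(xi -> X) = KL(X, xi)
    return C_j / C_total, C_j / (C_total + 1.0)     # RC, RC-tilde
```

For correlated factors, the conditional information $\mathrm{KL}(\boldsymbol{X},\boldsymbol{\xi}_{\setminus j}\mid\xi_j)$ must be evaluated as in Remark 1.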

Second, factor analysis model (1) is discussed in a general case. Let $\Sigma$ be the variance–covariance matrix of manifest variable vector $\boldsymbol{X}=(X_1,X_2,\ldots,X_p)^{\mathrm{T}}$; let $\Omega$ be the $p\times p$ variance–covariance matrix of unique factor vector $\boldsymbol{\varepsilon}=(\varepsilon_1,\varepsilon_2,\ldots,\varepsilon_p)^{\mathrm{T}}$; let $\Lambda$ be the $p\times m$ matrix of factor loadings $\lambda_{ij}$; and let $\Phi$ be the correlation matrix of common-factor vector $\boldsymbol{\xi}=(\xi_1,\xi_2,\ldots,\xi_m)^{\mathrm{T}}$. Then, model (1) can be expressed as

$$\boldsymbol{X}=\Lambda\boldsymbol{\xi}+\boldsymbol{\varepsilon},$$

and we have

$$\Sigma=\Lambda\Phi\Lambda^{\mathrm{T}}+\Omega.$$

Now, the above discussion is extended to the general factor analysis model (1) with the following joint variance–covariance matrix of $\boldsymbol{X}$ and $\boldsymbol{\xi}$:

$$\begin{pmatrix}\Lambda\Phi\Lambda^{\mathrm{T}}+\Omega & \Lambda\Phi\\ \Phi\Lambda^{\mathrm{T}} & \Phi\end{pmatrix}. \tag{10}$$

Let $\boldsymbol{\theta}=\Lambda\boldsymbol{\xi}$ be the predictor vector of manifest variable vector $\boldsymbol{X}=(X_1,X_2,\ldots,X_p)^{\mathrm{T}}$. Then, the contribution of common-factor vector $\boldsymbol{\xi}$ to manifest variable vector $\boldsymbol{X}$ is defined by the following generalized signal-to-noise ratio:

$$E\bigl(\boldsymbol{X}^{\mathrm{T}}\Omega^{-1}\boldsymbol{\theta}\bigr)=\frac{E\bigl(\boldsymbol{X}^{\mathrm{T}}\widetilde{\Omega}\Lambda\boldsymbol{\xi}\bigr)}{|\Omega|}=\frac{\mathrm{tr}\,\widetilde{\Omega}\Lambda\Phi\Lambda^{\mathrm{T}}}{|\Omega|}, \tag{11}$$

where $\widetilde{\Omega}$ is the cofactor matrix of $\Omega$. The signal is $\mathrm{tr}\,\widetilde{\Omega}\Lambda\Phi\Lambda^{\mathrm{T}}$, the noise is $|\Omega|$, and both are positive. Hence, the above quantity is defined as the entropy explained by the factor analysis model, and the same notation $\mathrm{KL}(\boldsymbol{X},\boldsymbol{\xi})$ as above is used, in accordance with the Kullback–Leibler information for the factor analysis model with normally distributed errors (4). Similarly, in the general model, signal-to-noise ratio (11) is decomposed, as in (8), into

$$\frac{\mathrm{tr}\,\widetilde{\Omega}\Lambda\Phi\Lambda^{\mathrm{T}}}{|\Omega|}=\sum_{i=1}^{p}\frac{\mathrm{Cov}(X_i,\theta_i)}{\omega_i^2}=\sum_{i=1}^{p}\frac{\sum_{j=1}^{m}\lambda_{ij}\mathrm{Cov}(X_i,\xi_j)}{\omega_i^2},$$

so the above theorems hold true as well. Thus, the results mentioned above are applicable to factor analysis models with non-normally distributed error terms.
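The reduction can be checked numerically: for a nonsingular symmetric $\Omega$, the cofactor matrix satisfies $\widetilde{\Omega}=|\Omega|\,\Omega^{-1}$, so (11) equals $\mathrm{tr}\,\Omega^{-1}\Lambda\Phi\Lambda^{\mathrm{T}}$. The sketch below (with hypothetical model matrices of our own choosing) verifies this against kl_contributions from the earlier sketch:

```python
import numpy as np

# Hypothetical model matrices for illustration only.
Lam = np.array([[0.8, 0.1],
                [0.3, 0.6],
                [0.2, 0.7]])
Phi = np.eye(2)
Omega = np.diag([0.35, 0.51, 0.47])

# tr(Omega_tilde Lam Phi Lam^T) / |Omega| = tr(Omega^{-1} Lam Phi Lam^T)
snr = np.trace(np.linalg.inv(Omega) @ Lam @ Phi @ Lam.T)
_, total = kl_contributions(Lam, Phi, Omega)   # sum_i Cov(X_i, theta_i) / omega_i^2
assert np.isclose(snr, total)
```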

3. Canonical Factor Analysis

In order to derive interpretable factors from the common-factor space, we propose taking advantage of the results of canonical correlation analysis applied to the manifest variables and the common factors. This approach can be referred to as “canonical factor analysis” [10]. In factor analysis model (1), the variance–covariance matrix of $\boldsymbol{X}=(X_1,X_2,\ldots,X_p)^{\mathrm{T}}$ and $\boldsymbol{\xi}=(\xi_1,\xi_2,\ldots,\xi_m)^{\mathrm{T}}$ is given by (10). Then, we have the following theorem:

Theorem 4.

For the canonical correlation coefficients $\rho_k$, $k=1,2,\ldots,m$, between $\boldsymbol{X}$ and $\boldsymbol{\xi}$ in factor analysis model (1) with (10), it follows that

$$\mathrm{KL}(\boldsymbol{X},\boldsymbol{\xi})=\sum_{j=1}^{m}\frac{\rho_j^2}{1-\rho_j^2}.$$

Proof. 

Let $B^{(1)}$, $B^{(2)}$, and $F$ be $m\times p$, $(p-m)\times p$, and $m\times m$ matrices, respectively; let $\boldsymbol{V}^{(1)}=(V_1,V_2,\ldots,V_m)^{\mathrm{T}}=B^{(1)}\boldsymbol{X}$, $\boldsymbol{V}^{(2)}=B^{(2)}\boldsymbol{X}$, and $\boldsymbol{\eta}=(\eta_1,\eta_2,\ldots,\eta_m)^{\mathrm{T}}=F\boldsymbol{\xi}$. It is assumed that $(V_j,\eta_j)$ are the pairs of canonical variables with correlation coefficients $\rho_j$, $j=1,2,\ldots,m$; that matrices $\begin{pmatrix}B^{(1)}\\B^{(2)}\end{pmatrix}$ and $F$ are nonsingular; and that $\boldsymbol{V}^{(1)}$ and $\boldsymbol{V}^{(2)}$ are statistically independent. Since all pairs of canonical variables $(V_j,\eta_j)$ and $\boldsymbol{V}^{(2)}$ are mutually independent, we have

$$\mathrm{KL}(\boldsymbol{V}^{(2)},\boldsymbol{\eta})=0,\qquad \mathrm{KL}(\boldsymbol{V}^{(1)},\eta_j)=\mathrm{KL}(V_j,\eta_j),\quad j=1,2,\ldots,m.$$

From Theorem 2, it follows that

$$\mathrm{KL}(\boldsymbol{X},\boldsymbol{\xi})=\mathrm{KL}\!\left(\begin{pmatrix}\boldsymbol{V}^{(1)}\\ \boldsymbol{V}^{(2)}\end{pmatrix},\boldsymbol{\eta}\right)=\mathrm{KL}(\boldsymbol{V}^{(1)},\boldsymbol{\eta})+\mathrm{KL}(\boldsymbol{V}^{(2)},\boldsymbol{\eta})=\mathrm{KL}(\boldsymbol{V}^{(1)},\boldsymbol{\eta})=\sum_{j=1}^{m}\mathrm{KL}(\boldsymbol{V}^{(1)},\eta_j)=\sum_{j=1}^{m}\mathrm{KL}(V_j,\eta_j)=\sum_{j=1}^{m}\frac{\rho_j^2}{1-\rho_j^2}.$$

This completes the proof. □

In the proof of the above theorem, we have

$$\mathrm{KL}(\boldsymbol{X},\eta_j)=\mathrm{KL}(V_j,\eta_j)=\frac{\rho_j^2}{1-\rho_j^2},\quad j=1,2,\ldots,m. \tag{12}$$

This implies that

$$C(\eta_j\to\boldsymbol{X})=C(\eta_j\to V_j)=\frac{\rho_j^2}{1-\rho_j^2};$$
$$\widetilde{RC}(\eta_j\to\boldsymbol{X})=\frac{\mathrm{KL}(\boldsymbol{X},\eta_j)}{\mathrm{KL}(\boldsymbol{X},\boldsymbol{\xi})+1}=\frac{\mathrm{KL}(\boldsymbol{X},\eta_j)}{\mathrm{KL}(\boldsymbol{\eta},\boldsymbol{V})+1}=\frac{\rho_j^2/(1-\rho_j^2)}{\sum_{a=1}^{m}\rho_a^2/(1-\rho_a^2)+1};$$
$$RC(\eta_j\to\boldsymbol{X})=\frac{\mathrm{KL}(\boldsymbol{X},\eta_j)}{\mathrm{KL}(\boldsymbol{\xi},\boldsymbol{X})}=\frac{\mathrm{KL}(V_j,\eta_j)}{\mathrm{KL}(\boldsymbol{V},\boldsymbol{\eta})}=\frac{\rho_j^2/(1-\rho_j^2)}{\sum_{a=1}^{m}\rho_a^2/(1-\rho_a^2)},\quad j=1,2,\ldots,m.$$

Theorem 4 shows that the contribution of common-factor vector $\boldsymbol{\xi}$ to manifest variable vector $\boldsymbol{X}$ is decomposed into those of the canonical common factors $\eta_j$, i.e.,

$$\mathrm{KL}(\boldsymbol{X},\boldsymbol{\xi})=\sum_{j=1}^{m}\mathrm{KL}(\boldsymbol{X},\eta_j)=\sum_{j=1}^{m}\mathrm{KL}(V_j,\eta_j).$$

Let us assume

$$1>\rho_1^2\ge\rho_2^2\ge\cdots\ge\rho_m^2\ge 0. \tag{13}$$

According to the entropy-based criterion in Theorem 4, the order of importance of the canonical common factors is that of their canonical correlation coefficients. The interpretation of factors $\eta_j$ can be made through the corresponding manifest canonical variables $V_j$ and the factor loading matrix of the canonical common factors $\boldsymbol{\eta}=F\boldsymbol{\xi}$; this loading matrix is obtained as $\Lambda'=\Lambda F^{-1}$. We refer to the canonical correlation analysis in Theorem 4 as canonical factor analysis [10].
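A minimal NumPy sketch of the computation behind Theorem 4 follows (our own illustration, not the authors' software). It obtains the canonical correlations by a singular value decomposition of the whitened cross-covariance implied by (10); the signs of the canonical weights are determined only up to ±1 and so may differ from those reported in the examples below.

```python
import numpy as np

def canonical_factor_analysis(Lam, Phi, Omega):
    """Canonical correlation analysis between X and xi in model (1) with
    joint covariance (10): returns the canonical correlations rho_j (in
    decreasing order, as in (13)), the weight matrices B(1) and F of
    Theorem 4, and KL(X, xi)."""
    def inv_sqrt(M):                       # symmetric inverse square root
        w, U = np.linalg.eigh(M)
        return U @ np.diag(w ** -0.5) @ U.T
    Sigma = Lam @ Phi @ Lam.T + Omega      # Var(X)
    S_xk = Lam @ Phi                       # Cov(X, xi)
    K = inv_sqrt(Sigma) @ S_xk @ inv_sqrt(Phi)
    U, rho, Vt = np.linalg.svd(K, full_matrices=False)
    B1 = U.T @ inv_sqrt(Sigma)             # rows b_j: V_j = b_j^T X, unit variance
    F = Vt @ inv_sqrt(Phi)                 # rows f_j: eta_j = f_j^T xi, unit variance
    KL = np.sum(rho ** 2 / (1 - rho ** 2)) # Theorem 4
    return rho, B1, F, KL
```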

Theorem 5.

In factor analysis model (1), for any $p\times p$ and $m\times m$ nonsingular matrices $P$ and $Q$, the canonical factor analysis between manifest variable vector $P\boldsymbol{X}$ and common-factor vector $Q\boldsymbol{\xi}$ is invariant.

Proof. 

Since the variance–covariance matrix of PX and Qξ is given by

$$\begin{pmatrix}P&0\\0&Q\end{pmatrix}\begin{pmatrix}\Sigma&\Lambda\\ \Lambda^{\mathrm{T}}&I_m\end{pmatrix}\begin{pmatrix}P&0\\0&Q\end{pmatrix}^{\mathrm{T}},$$

the theorem follows. □

Notice that we also have

$$\mathrm{KL}(P\boldsymbol{X},Q\boldsymbol{\xi})=\mathrm{KL}(\boldsymbol{X},\boldsymbol{\xi}).$$

From the above theorem, the results of the canonical factor analysis do not depend on the initial common factors ξj in factor analysis model (1). For factor analysis model (1), it follows that

$$\mathrm{KL}(\boldsymbol{X},\boldsymbol{\xi})=\sum_{j=1}^{m}\mathrm{KL}(V_j,\eta_j)=\sum_{i=1}^{p}\mathrm{KL}(X_i,\boldsymbol{\xi}),$$

implying that

$$\frac{\mathrm{tr}\,\widetilde{\Omega}\Lambda\Phi\Lambda^{\mathrm{T}}}{|\Omega|}=\sum_{j=1}^{m}\frac{\rho_j^2}{1-\rho_j^2}=\sum_{i=1}^{p}\frac{R_i^2}{1-R_i^2},$$

where $R_i$ are the multiple correlation coefficients between the manifest variables $X_i$ and factor vector $\boldsymbol{\xi}=(\xi_1,\xi_2,\ldots,\xi_m)^{\mathrm{T}}$, $i=1,2,\ldots,p$.
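This identity is easy to verify numerically, reusing the hypothetical Lam, Phi, and Omega of the Section 2 sketch and canonical_factor_analysis from above, with $R_i^2=(\Lambda\Phi\Lambda^{\mathrm{T}})_{ii}/\Sigma_{ii}$:

```python
import numpy as np

Sigma = Lam @ Phi @ Lam.T + Omega
R2 = np.diag(Lam @ Phi @ Lam.T) / np.diag(Sigma)   # squared multiple correlations
rho, _, _, KL = canonical_factor_analysis(Lam, Phi, Omega)
assert np.isclose(np.sum(R2 / (1 - R2)), KL)
```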

Numerical Example 1

Table 1 shows the results of orthogonal factor analysis (varimax method by S-PLUS ver. 8.2) as reported in [6]; the same example is used here to demonstrate the canonical factor analysis mentioned above. In Table 1, manifest variables X1, X2, and X3 are scores in some subjects in the liberal arts, while variables X4 and X5 are those in the sciences. We refer to the factors as the initial common factors. In this example, from Table 1, the variance–covariance matrices in (10) are given as follows:

$$\Sigma=\begin{pmatrix}1&0.54&0.39&0.42&0.36\\0.54&1&0.49&0.38&0.22\\0.39&0.49&1&0.21&0\\0.42&0.38&0.21&1&0.54\\0.36&0.22&0&0.54&1\end{pmatrix},\qquad \Phi=\begin{pmatrix}1&0\\0&1\end{pmatrix},$$

where the factor loading matrix $\Lambda$ is given, in transposed form $\Lambda^{\mathrm{T}}$, in Table 1.

Table 1.

Factor loadings of orthogonal (varimax) factor analysis.

X1 X2 X3 X4 X5
ξ1 0.60 0.75 0.65 0.32 0.00
ξ2 0.39 0.24 0.00 0.59 0.92
uniqueness 0.50 0.38 0.58 0.55 0.16

Uniqueness is the proportion of the variance of manifest variable $X_i$ accounted for by the unique factor $\varepsilon_i$, i.e., $\omega_i^2$.

From the above matrices, the linear transformation matrices $B^{(1)}$ and $F$ in Theorem 4 that yield the pairs of canonical variables are as follows:

$$B^{(1)}=\begin{pmatrix}0.19&0.20&0.06&0.20&0.94\\0.32&0.58&0.37&0.07&-0.65\end{pmatrix},\qquad F=\begin{pmatrix}0.32&0.95\\0.95&-0.32\end{pmatrix}.$$

By the above matrices, we have the following pairs of canonical variables $(V_i,\eta_i)$ and their squared canonical correlation coefficients $\rho_i^2$:

$$\begin{cases}V_1=0.19X_1+0.20X_2+0.06X_3+0.20X_4+0.94X_5,\\ \eta_1=0.32\xi_1+0.95\xi_2,\end{cases}\quad \rho_1^2=0.88;$$
$$\begin{cases}V_2=0.32X_1+0.58X_2+0.37X_3+0.07X_4-0.65X_5,\\ \eta_2=0.95\xi_1-0.32\xi_2,\end{cases}\quad \rho_2^2=0.73.$$

According to the above canonical variables, the factor loadings for canonical factors $\eta_i$, $i=1,2$, are calculated from the initial loading matrix $\Lambda$ and the rotation matrix $F$, and we have

$$\Lambda'^{\mathrm{T}}=\bigl(\Lambda F^{-1}\bigr)^{\mathrm{T}}=\begin{pmatrix}0.32&0.95\\0.95&-0.32\end{pmatrix}^{-1}\begin{pmatrix}0.60&0.75&0.65&0.32&0.00\\0.39&0.24&0.00&0.59&0.92\end{pmatrix}=\begin{pmatrix}0.56&0.47&0.21&0.66&0.87\\0.45&0.64&0.62&0.12&-0.29\end{pmatrix}.$$

From the above results, the first canonical factor $\eta_1$ can be viewed as a general ability (factor) common to all five subjects. The second factor $\eta_2$ can be regarded as a factor related to the subjects in the liberal arts, which is independent of the first canonical factor. In the canonical correlation analysis, the contributions of the canonical factors are calculated as follows. Since the squared multiple correlation coefficient between $\eta_1$ and $\boldsymbol{X}=(X_1,X_2,\ldots,X_5)^{\mathrm{T}}$ is $\rho_1^2=0.88$ and that between $\eta_2$ and $\boldsymbol{X}$ is $\rho_2^2=0.73$, we have

$$C(\eta_1\to\boldsymbol{X})=\frac{\rho_1^2}{1-\rho_1^2}=7.06,\qquad C(\eta_2\to\boldsymbol{X})=\frac{\rho_2^2}{1-\rho_2^2}=2.70.$$

Let $\boldsymbol{\xi}=(\xi_1,\xi_2)^{\mathrm{T}}$. From the above results, we have

$$C(\boldsymbol{\xi}\to\boldsymbol{X})=\mathrm{KL}(\boldsymbol{\xi},\boldsymbol{X})=C(\eta_1\to\boldsymbol{X})+C(\eta_2\to\boldsymbol{X})=9.76,\qquad \widetilde{RC}(\boldsymbol{\xi}\to\boldsymbol{X})=\frac{\mathrm{KL}(\boldsymbol{\xi},\boldsymbol{X})}{\mathrm{KL}(\boldsymbol{\xi},\boldsymbol{X})+1}=0.91\ (=\mathrm{ECD}(\boldsymbol{\xi},\boldsymbol{X})).$$

From this, 91% of the variation of manifest random vector $\boldsymbol{X}$ in entropy is explained by the common latent factors $\boldsymbol{\xi}$. The relative contribution ratios of the canonical common factors are calculated as follows:

$$RC(\eta_1\to\boldsymbol{X})=\frac{7.06}{7.06+2.70}=0.72,\qquad RC(\eta_2\to\boldsymbol{X})=\frac{2.70}{7.06+2.70}=0.28.$$

The contribution of the first canonical factor is about 2.6 times greater than that of the second one.
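These results can be reproduced (up to the rounding of Table 1 and the sign ambiguity of the canonical weights) with the canonical_factor_analysis sketch of Section 3:

```python
import numpy as np

Lam1 = np.array([[0.60, 0.39],     # Table 1 loadings, variables by factors
                 [0.75, 0.24],
                 [0.65, 0.00],
                 [0.32, 0.59],
                 [0.00, 0.92]])
Phi1 = np.eye(2)                   # orthogonal initial factors
Omega1 = np.diag([0.50, 0.38, 0.58, 0.55, 0.16])

rho, B1, F, KL = canonical_factor_analysis(Lam1, Phi1, Omega1)
print(rho ** 2)                    # close to (0.88, 0.73) of the text
print(rho ** 2 / (1 - rho ** 2))   # close to (7.06, 2.70)
print(Lam1 @ np.linalg.inv(F))     # loadings Lam' = Lam F^{-1} for eta
```

Small deviations from the reported values arise because Table 1 shows rounded estimates.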

4. Deriving Important Common Factors Based on Decomposition of Manifest Variables into Subsets

From (9) in Theorem 2, $\mathrm{KL}(\boldsymbol{X},\boldsymbol{\xi})$ is decomposed into the components $\mathrm{KL}(\boldsymbol{X}^{(a)},\boldsymbol{\xi})$ for manifest variable subvectors $\boldsymbol{X}^{(a)}$, $a=1,2,\ldots,A$. Thus, we have the following theorem:

Theorem 6.

Let manifest variable vector $\boldsymbol{X}$ be decomposed into subvectors $\boldsymbol{X}^{(a)}$, $a=1,2,\ldots,A$, and let $\rho_{(a)j}$, $j=1,2,\ldots,m(a)$, be the canonical correlation coefficients between manifest variable subvector $\boldsymbol{X}^{(a)}$ and common-factor vector $\boldsymbol{\xi}$, $a=1,2,\ldots,A$, in factor analysis model (1), where $m(a)\le\min\{\dim\boldsymbol{X}^{(a)},m\}$. Then, $\mathrm{KL}(\boldsymbol{X},\boldsymbol{\xi})$ is decomposed into canonical components as follows:

$$\mathrm{KL}(\boldsymbol{X},\boldsymbol{\xi})=\sum_{a=1}^{A}\sum_{j=1}^{m(a)}\frac{\rho_{(a)j}^2}{1-\rho_{(a)j}^2}.$$

Proof. 

For manifest variable subvector $\boldsymbol{X}^{(a)}$ and common-factor vector $\boldsymbol{\xi}$, applying canonical correlation analysis, we have $m(a)$ pairs of canonical variables $(V_j^{(a)},\eta_j^{(a)})$ with squared canonical correlation coefficients $\rho_{(a)j}^2$, $j=1,2,\ldots,m(a)$. Then, applying Theorem 4 to $\mathrm{KL}(\boldsymbol{X}^{(a)},\boldsymbol{\xi})$, it follows that

$$\mathrm{KL}(\boldsymbol{X}^{(a)},\boldsymbol{\xi})=\sum_{j=1}^{m(a)}\mathrm{KL}(V_j^{(a)},\eta_j^{(a)})=\sum_{j=1}^{m(a)}\frac{\rho_{(a)j}^2}{1-\rho_{(a)j}^2},\quad a=1,2,\ldots,A.$$

From Theorem 2, the theorem follows. □

Remark 2.

As shown in the above theorem, the following relations hold:

$$\mathrm{KL}(\boldsymbol{X}^{(a)},\eta_j^{(a)})=\mathrm{KL}(V_j^{(a)},\eta_j^{(a)})=\frac{\rho_{(a)j}^2}{1-\rho_{(a)j}^2},\quad j=1,2,\ldots,m(a);\ a=1,2,\ldots,A.$$

In this sense,

$$C(\eta_j^{(a)}\to\boldsymbol{X}^{(a)})=\frac{\rho_{(a)j}^2}{1-\rho_{(a)j}^2},\quad j=1,2,\ldots,m(a);\ a=1,2,\ldots,A.$$

The above theorem can be used to derive important common factors. In many factor analysis data sets, the manifest variables can be classified into subsets that share common concepts (factors) to be measured. For example, in the data used for Table 1, it is meaningful to classify the five variables into two subsets, $\boldsymbol{X}^{(1)}=(X_1,X_2,X_3)^{\mathrm{T}}$ and $\boldsymbol{X}^{(2)}=(X_4,X_5)^{\mathrm{T}}$, where the first subset is related to the liberal arts and the second one to the sciences. From $(\boldsymbol{X}^{(1)},\boldsymbol{\xi})$ and $(\boldsymbol{X}^{(2)},\boldsymbol{\xi})$, it is possible to derive the latent ability for the liberal arts and that for the sciences, respectively, as in the computational sketch below.
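Under the same assumptions as the earlier sketches, the group-wise decomposition of Theorem 6 amounts to running canonical_factor_analysis on each subvector, selecting the corresponding rows of Lam1 and block of Omega1 from the Example 1 sketch:

```python
import numpy as np

groups = [[0, 1, 2],   # X(1) = (X1, X2, X3): liberal arts
          [3, 4]]      # X(2) = (X4, X5): sciences
for idx in groups:
    rho_a, _, F_a, KL_a = canonical_factor_analysis(
        Lam1[idx], Phi1, Omega1[np.ix_(idx, idx)])
    print(rho_a ** 2, KL_a)   # close to (0.77, 0.12) and (0.86, 0.14)
```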

4.1. Numerical Example 1 (Continued)

For $(\boldsymbol{X}^{(1)},\boldsymbol{\xi})$ and $(\boldsymbol{X}^{(2)},\boldsymbol{\xi})$, two sets of canonical variables are obtained, respectively, as follows:

$$\begin{cases}\eta_1^{(1)}=0.95\xi_1+0.32\xi_2,& V_1^{(1)}=0.52X_1+0.76X_2+0.39X_3,&\rho_{(1)1}^2=0.77,\\ \eta_2^{(1)}=0.32\xi_1-0.95\xi_2,& V_2^{(1)}=0.71X_1-0.07X_2-0.71X_3,&\rho_{(1)2}^2=0.12;\end{cases}$$
$$\begin{cases}\eta_1^{(2)}=0.06\xi_1+1.00\xi_2,& V_1^{(2)}=0.18X_4+0.98X_5,&\rho_{(2)1}^2=0.86,\\ \eta_2^{(2)}=1.00\xi_1-0.06\xi_2,& V_2^{(2)}=0.83X_4-0.55X_5,&\rho_{(2)2}^2=0.14.\end{cases}$$

According to the above canonical variables, we have the following factor contributions:

$$\begin{cases}C(\eta_1^{(1)}\to\boldsymbol{X}^{(1)})=C(\eta_1^{(1)}\to V_1^{(1)})=\dfrac{0.77}{1-0.77}=3.27,& RC(\eta_1^{(1)}\to\boldsymbol{X}^{(1)})=0.96,\\[4pt] C(\eta_2^{(1)}\to\boldsymbol{X}^{(1)})=\dfrac{0.12}{1-0.12}=0.14,& RC(\eta_2^{(1)}\to\boldsymbol{X}^{(1)})=0.04;\end{cases}$$
$$\begin{cases}C(\eta_1^{(2)}\to\boldsymbol{X}^{(2)})=6.14,& RC(\eta_1^{(2)}\to\boldsymbol{X}^{(2)})=0.97,\\ C(\eta_2^{(2)}\to\boldsymbol{X}^{(2)})=0.17,& RC(\eta_2^{(2)}\to\boldsymbol{X}^{(2)})=0.03.\end{cases}$$

From the above results, canonical factors $\eta_1^{(1)}$ and $\eta_1^{(2)}$ can be interpreted as general common factors for the liberal arts and for the sciences, respectively. Using these factors, the factor loadings are given in Table 2. Table 2 is similar to Table 1; however, the factor analysis model is oblique, and the correlation coefficient between $\eta_1^{(1)}$ and $\eta_1^{(2)}$ is 0.374. The contributions of the factors to manifest variable vector $\boldsymbol{X}=(X_1,X_2,X_3,X_4,X_5)^{\mathrm{T}}=(\boldsymbol{X}^{(1)\mathrm{T}},\boldsymbol{X}^{(2)\mathrm{T}})^{\mathrm{T}}$ are calculated as follows:

$$\begin{cases}C(\eta_1^{(1)}\to\boldsymbol{X})=6.563,& RC(\eta_1^{(1)}\to\boldsymbol{X})=0.687,&\widetilde{RC}(\eta_1^{(1)}\to\boldsymbol{X})=0.60,\\ C(\eta_1^{(2)}\to\boldsymbol{X})=4.223,& RC(\eta_1^{(2)}\to\boldsymbol{X})=0.442,&\widetilde{RC}(\eta_1^{(2)}\to\boldsymbol{X})=0.39.\end{cases}$$

In this case, factors $\eta_1^{(1)}$ and $\eta_1^{(2)}$ are correlated, so it follows that

$$RC(\eta_1^{(1)}\to\boldsymbol{X})+RC(\eta_1^{(2)}\to\boldsymbol{X})=1.129>1.$$

Table 2.

Factor loadings by using canonical common factors η1(1) and η1(2).

X1 X2 X3 X4 X5
η1(1) 0.62 0.80 0.70 0.31 0.06
η1(2) 0.19 0.02 0.22 0.49 0.94
uniqueness 0.50 0.38 0.58 0.55 0.16

4.2. Numerical Example 2

Table 3 shows the results of the maximum likelihood factor analysis (orthogonal) of six scores $X_i$, $i=1,2,\ldots,6$ ([11], pp. 61–65); these results are treated as the initial estimates in the present analysis. In this example, the variables are classified into the following three groups: variable $X_1$ is related to Spearman's g factor; variables $X_2$, $X_3$, and $X_4$ account for problem-solving ability; and variables $X_5$ and $X_6$ are associated with verbal ability [11]. However, it is difficult to explain the three factors by using Table 3, so the present approach is employed to derive them. From (10) and Table 3, the correlation matrix of the manifest variables is given as follows:

$$\hat{\Sigma}=\begin{pmatrix}1&0.417&0.576&0.312&0.576&0.514\\0.417&1&0.567&0.306&0.265&0.263\\0.576&0.567&1&0.427&0.355&0.354\\0.312&0.306&0.427&1&0.193&0.193\\0.576&0.265&0.355&0.193&1&0.799\\0.514&0.263&0.354&0.193&0.799&1\end{pmatrix}.$$

Let $\boldsymbol{X}^{(2)}=(X_2,X_3,X_4)^{\mathrm{T}}$, let $\boldsymbol{X}^{(3)}=(X_5,X_6)^{\mathrm{T}}$, and let $\boldsymbol{\xi}=(\xi_1,\xi_2)^{\mathrm{T}}$. Canonical correlation analysis is carried out for $(X_1,\boldsymbol{\xi})$, $(\boldsymbol{X}^{(2)},\boldsymbol{\xi})$, and $(\boldsymbol{X}^{(3)},\boldsymbol{\xi})$, and we have the following canonical variables, respectively:

$$\eta_1^{(1)}=\frac{1}{\sqrt{0.64^2+0.37^2}}\left(0.64\xi_1+0.37\xi_2\right)=0.87\xi_1+0.50\xi_2,\quad V_1^{(1)}=X_1,\quad \rho_{(1)1}^2=0.55;$$
$$\begin{cases}\eta_1^{(2)}=0.52\xi_1+0.85\xi_2,& V_1^{(2)}=0.24X_2+0.96X_3+0.14X_4,&\rho_{(2)1}^2=0.83,\\ \eta_2^{(2)}=0.85\xi_1-0.52\xi_2,& V_2^{(2)}=0.81X_2-0.59X_3-0.02X_4,&\rho_{(2)2}^2=0.00;\end{cases}$$
$$\begin{cases}\eta_1^{(3)}=0.99\xi_1-0.12\xi_2,& V_1^{(3)}=0.99X_5+0.11X_6,&\rho_{(3)1}^2=0.96,\\ \eta_2^{(3)}=0.12\xi_1+0.99\xi_2,& V_2^{(3)}=0.64X_5-0.77X_6,&\rho_{(3)2}^2=0.01.\end{cases}$$

The contributions of the canonical factors $\eta_i^{(k)}$, $i=1,2$; $k=2,3$, are calculated as follows:

$$\begin{cases}C(\eta_1^{(2)}\to\boldsymbol{X}^{(2)})=C(\eta_1^{(2)}\to V_1^{(2)})=\dfrac{0.83}{1-0.83}=4.88,& RC(\eta_1^{(2)}\to\boldsymbol{X}^{(2)})=1.00,\\[4pt] C(\eta_2^{(2)}\to\boldsymbol{X}^{(2)})=\dfrac{0.00}{1-0.00}=0.00,& RC(\eta_2^{(2)}\to\boldsymbol{X}^{(2)})=0.00;\end{cases}$$
$$\begin{cases}C(\eta_1^{(3)}\to\boldsymbol{X}^{(3)})=\dfrac{0.96}{1-0.96}=24.00,& RC(\eta_1^{(3)}\to\boldsymbol{X}^{(3)})=0.99,\\[4pt] C(\eta_2^{(3)}\to\boldsymbol{X}^{(3)})=0.01,& RC(\eta_2^{(3)}\to\boldsymbol{X}^{(3)})=0.01.\end{cases}$$

The common factor $\eta_1^{(1)}\,(=g)$ can be interpreted as Spearman's g factor (general intelligence), and canonical common factors $\eta_1^{(2)}$ and $\eta_1^{(3)}$ can be interpreted as problem-solving ability and verbal ability, respectively. The correlation coefficients between the three factors are given by

$$\mathrm{Corr}(g,\eta_1^{(2)})=0.88,\qquad \mathrm{Corr}(g,\eta_1^{(3)})=0.80,\qquad \mathrm{Corr}(\eta_1^{(2)},\eta_1^{(3)})=0.42.$$

The contributions of the above three factors to manifest variable vector $\boldsymbol{X}=(X_1,X_2,\ldots,X_6)^{\mathrm{T}}$ are computed as follows:

$$\begin{cases}C(g\to\boldsymbol{X})=19.93,& RC(g\to\boldsymbol{X})=0.68,&\widetilde{RC}(g\to\boldsymbol{X})=0.66,\\ C(\eta_1^{(2)}\to\boldsymbol{X})=9.77,& RC(\eta_1^{(2)}\to\boldsymbol{X})=0.33,&\widetilde{RC}(\eta_1^{(2)}\to\boldsymbol{X})=0.32,\\ C(\eta_1^{(3)}\to\boldsymbol{X})=25.02,& RC(\eta_1^{(3)}\to\boldsymbol{X})=0.85,&\widetilde{RC}(\eta_1^{(3)}\to\boldsymbol{X})=0.82.\end{cases}$$

The common-factor space is two-dimensional, and the factor loadings with common factors $\eta_1^{(2)}$ and $\eta_1^{(3)}$ are calculated as in Table 4. The table gives a clear interpretation of the common factors. Thus, the present method is effective for deriving interpretable factors in situations such as that of this example. The factor analysis model can also be expressed in terms of the factor vectors $(g,\eta_1^{(2)})$ and $(g,\eta_1^{(3)})$, respectively. The present method is applicable to any subsets of manifest variables.
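The factor correlations reported above can be recovered directly from the canonical coefficient vectors: with orthogonal initial factors ($\Phi=I$), the correlation between two canonical factors is the inner product of their coefficient vectors in $(\xi_1,\xi_2)$, which here have unit length up to rounding:

```python
import numpy as np

g     = np.array([0.87, 0.50])    # eta_1^(1): Spearman's g
eta_2 = np.array([0.52, 0.85])    # eta_1^(2): problem-solving ability
eta_3 = np.array([0.99, -0.12])   # eta_1^(3): verbal ability
# Inner products approximate 0.88, 0.80, and 0.41 (the text reports 0.42
# for the last, computed from unrounded coefficients).
print(g @ eta_2, g @ eta_3, eta_2 @ eta_3)
```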

Table 3.

The initial maximum likelihood estimates of factor loadings (varimax).

X1 X2 X3 X4 X5 X6
ξ1 0.64 0.34 0.46 0.25 0.97 0.82
ξ2 0.37 0.54 0.76 0.41 0.12 0.03
uniqueness 0.45 0.59 0.21 0.77 0.04 0.33

Table 4.

The factor loadings with common factors η1(2) and η1(3).

X1 X2 X3 X4 X5 X6
η1(2) 0.49 0.63 0.89 0.48 0.01 0.07
η1(3) 0.39 0.01 0.00 0.00 0.98 0.79
uniqueness 0.45 0.59 0.21 0.77 0.04 0.33

5. Discussion

In order to find interpretable common factors in factor analysis models, methods of factor rotation are often used. These methods maximize variation functions of the squared factor loadings and produce orthogonal or oblique factors. The factors derived by the conventional methods may be interpretable; however, it is more useful to detect interpretable common factors on the basis of a factor contribution measure, i.e., the importance of the common factors. The entropy-based method for measuring factor contributions [6] measures the contribution of the common-factor vector to the manifest variable vector, and this contribution can be decomposed into those of the single manifest variables (Theorem 1) as well as those of manifest variable subvectors (Theorem 2). A characterization in the case of orthogonal factors can also be given (Theorem 3).

This paper shows that the most important common factor with respect to entropy can be identified by canonical correlation analysis between the factor vector and the manifest variable vector (Theorem 4). Theorem 4 shows that the contribution of the common-factor vector to the manifest variable vector can be decomposed into those of the canonical factors and that the order of the canonical correlation coefficients is that of the factor contributions. In most multivariate data, manifest variables can be naturally classified into subsets according to common concepts, as in Examples 1 and 2. By using Theorems 2 and 5, canonical correlation analysis can also be applied to derive canonical common factors from subsets of manifest variables and the initial common-factor vector (Theorem 6). With this analysis, interpretable common factors can be obtained easily, as demonstrated in Examples 1 and 2. In Example 1, Table 1 and Table 2 have similar factor patterns; however, the derived factors in Table 1 are orthogonal, and those in Table 2 are oblique. In Example 2, it may be difficult to interpret the factors in Table 3 produced by the varimax method; on the other hand, Table 4, obtained with the present method, can be interpreted clearly.

Finally, according to Theorem 5, the present method produces results that are invariant with respect to linear transformations of the common factors, so the method is independent of the initial common factors. The present method is the first to derive interpretable factors based on a factor contribution measure, and the interpretable factors can be obtained easily through canonical correlation analysis between manifest variable subvectors and the factor vectors.

Acknowledgments

The authors would like to thank the three referees for their useful comments and suggestions for improving the first version of this paper.

Author Contributions

Conceptualization, N.E.; methodology, N.E., C.G.B., M.T., T.K.; formal analysis, N.E.; writing—original draft preparation, N.E.; writing—review and editing, C.G.B.; funding acquisition, T.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by Grant-in-Aid for Scientific Research 18K11200 from the Ministry of Education, Culture, Sports, Science and Technology of Japan.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data sharing is not applicable to this article.

Conflicts of Interest

The authors declare no conflict of interest.

Footnotes

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

1. Kaiser H.F. The varimax criterion for analytic rotation in factor analysis. Psychometrika. 1958;23:187–200. doi: 10.1007/BF02289233.
2. Ten Berge J.M.F. A joint treatment of VARIMAX rotation and the problem of diagonalizing symmetric matrices simultaneously in the least-squares sense. Psychometrika. 1984;49:347–358. doi: 10.1007/BF02306025.
3. Jennrich R.I., Sampson P.F. Rotation for simple loadings. Psychometrika. 1966;31:313–323. doi: 10.1007/BF02289465.
4. Harris C.W., Kaiser H.F. Oblique factor analytic solutions by orthogonal transformation. Psychometrika. 1964;29:347–362. doi: 10.1007/BF02289601.
5. Thurstone L.L. The Vectors of Mind: Multiple-Factor Analysis for the Isolation of Primary Traits. University of Chicago Press; Chicago, IL, USA: 1935.
6. Eshima N., Tabata M., Borroni C.G. An entropy-based approach for measuring factor contributions in factor analysis models. Entropy. 2018;20:634. doi: 10.3390/e20090634.
7. Nelder J.A., Wedderburn R.W.M. Generalized linear models. J. R. Stat. Soc. A. 1972;135:370–384. doi: 10.2307/2344614.
8. Kullback S., Leibler R.A. On information and sufficiency. Ann. Math. Stat. 1951;22:79–86. doi: 10.1214/aoms/1177729694.
9. Eshima N., Tabata M. Entropy coefficient of determination for generalized linear models. Comput. Stat. Data Anal. 2010;54:1381–1389. doi: 10.1016/j.csda.2009.12.003.
10. Rao C.R. Estimation and tests of significance in factor analysis. Psychometrika. 1955;20:93–111. doi: 10.1007/BF02288983.
11. Bartholomew D.J. Latent Variable Models and Factor Analysis. Oxford University Press; New York, NY, USA: 1987.
