Abstract
Molenaar (2003, 2011) showed that a common factor model could be transformed into an equivalent model without factors, involving only observed variables and residual errors. He called this invertible transformation the Houdini transformation. His derivation involved concepts from time series and state space theory. This paper verifies the Houdini transformation on a general latent variable model using algebraic methods. The results show that the Houdini transformation is illusory, in the sense that the Houdini transformed model remains a latent variable model. Contrary to common knowledge, a model that is a path model with only observed variables and residual errors may, in fact, be a latent variable model.
It is usually thought that structural equation models with latent variables and path models with only observed variables are completely different model types and hence that they cannot be mapped into each other. We show in this paper that models that contain only observed and residual variables may in fact be equivalent to latent variable models. To motivate the development, consider the interdependent system
$$x_1 = W_{12}x_2 + \delta_1, \qquad x_2 = W_{21}x_1 + \delta_2,$$
where the residuals are further structured via δ1 = ε1 − W12ε2 and δ2 = ε2 − W21ε1. For concreteness, let x1 be (2×1) and x2 be (6×1), with and , , , and . There is no hint in that this system might be equivalent to a latent variable model, although it does have an odd error structure. However, the Houdini transformation that is developed below can be used to show that this model is equivalent to the standard orthogonal factor model x = Λζ + ε, where , , E(ζζ′) = I, and .
It is instructive to focus on the distinction between latent versus observed variable models using the most well-known latent variable model, namely, factor analysis. A continuing theme in its 100+ year history (e.g., Cudeck & MacCallum, 2007) has been the key distinction made between this method and principal components analysis. There is consensus that factor analysis is a true latent variable model while components analysis, although in some ways appearing similar to the factor model, simply involves transformations of observed variables. Although it has recently proven possible to relate these methods in a precise way (Bentler & de Leeuw, 2011), and although limit theorems exist on when they become identical (e.g., Bentler & Kano, 1990), in any given empirical application with a relatively small number of variables they are not interchangeable. It is then surprising to discover, as Molenaar (2003, 2011) has done, that any common factor model can be transformed in a 1:1 way so that the resulting model contains no common factors, only observed variables and residuals. Molenaar's proof was lengthy and invoked time series theorems of Granger and Morris (1976). Here we provide a simple direct proof based on manipulations of standard structural equations. We obtain the desired result, but, extending Molenaar's (2003, 2011) conclusions, we show that the resulting model remains a latent variable model.
We need to have some clarity on what a latent variable model is, and is not. While there are several defining characteristics of latent variable models as reviewed by Bollen (2002), we use the definition provided by Bentler (1982) based on the Bentler-Weeks (1980) model. In this approach, all variables in a model are either dependent or independent. A variable is a dependent variable if it is expressed in a model as a function of one or more other variables (i.e., it appears on the left side of an equation), while all other variables are considered independent variables (i.e., they never appear on the left side of any equation). Then a model for p observed variables is a latent variable model if, and only if, the dimensionality of the independent variables is greater than the dimensionality of the observed variables. Specifically, with p observed variables, the covariance matrix of the independent variables must have rank p+k for some k>0. Factor analytic models with p variables and k factors meet this requirement, while principal components with k components do not. The model resulting from a Houdini transformation involves equations in observed variables and errors, as is typical of path models with residual errors. In general, such models are not latent variable models, which would suggest that the Houdini transformed model is not a latent variable model either. But we will see that the observed variable model with residual errors obtained from a Houdini transformation remains a latent variable model even though the common factors have disappeared.
MODEL SETUP
For some model generality, let us consider a standard factor analytic measurement model for p variables
$$x = \mu + \Lambda\xi + \varepsilon$$
with latent factors ξ and residual errors or unique variates ε, and latent variable regressions
$$\xi = B\xi + \zeta.$$
Thus this is a factor analytic simultaneous equation model. As usual, we assume that ζ, ε are mutually uncorrelated, and that (I − B) is full rank, so that ξ = (I − B)⁻¹ζ. It follows that x = μ + Λ(I − B)⁻¹ζ + ε, where now the ζ play the role of common factors.
With the covariance matrix of the ζ given as Φ, and that of ε given as Ψ (taken here as diagonal), both assumed full rank, the covariance structure of the model is given by
$$\Sigma = \Lambda(I-B)^{-1}\Phi(I-B)^{-1\prime}\Lambda' + \Psi. \tag{1}$$
In this model, the independent variables ζ and ε are of order k and p, respectively. With p observed variables we have p<(p+k), and hence this is a latent variable model.
HOUDINI MANIPULATIONS
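To simplify model (1) prior to our manipulations, we rewrite $\Lambda(I-B)^{-1}\Phi^{.5}$ as $\Lambda$, where $\Phi^{.5}\Phi^{.5\prime} = \Phi$, so that the correspondingly rescaled ζ have an identity covariance matrix. Then it follows that model (1) can be written in a version that looks like an exploratory factor analysis model, namely,
$$\Sigma = \Lambda\Lambda' + \Psi. \tag{2}$$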
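The step from covariance structure (1) to the exploratory-factor-analysis form (2) can be checked numerically. The following Python sketch assumes that (1) has the usual form Σ = Λ(I − B)⁻¹Φ(I − B)⁻¹′Λ′ + Ψ and takes the square root of Φ to be its Cholesky factor (any factor whose product with its own transpose equals Φ would serve equally well). The function names and all numerical values are hypothetical and are not taken from the paper.

```python
import numpy as np

def implied_covariance(Lam, B, Phi, Psi):
    """Covariance structure (1): Lam (I-B)^-1 Phi (I-B)^-1' Lam' + Psi."""
    k = B.shape[0]
    A = Lam @ np.linalg.inv(np.eye(k) - B)   # loadings of x on zeta
    return A @ Phi @ A.T + Psi

def rescaled_loadings(Lam, B, Phi):
    """Absorb (I-B)^-1 and a square root of Phi into a single loading matrix."""
    k = B.shape[0]
    Phi_half = np.linalg.cholesky(Phi)       # Phi_half @ Phi_half.T == Phi
    return Lam @ np.linalg.inv(np.eye(k) - B) @ Phi_half

# Hypothetical values: p = 4 observed variables, k = 2 factors.
Lam = np.array([[1.0, 0.0],
                [0.8, 0.0],
                [0.0, 1.0],
                [0.0, 0.7]])
B = np.array([[0.0, 0.0],
              [0.5, 0.0]])                   # second factor regressed on the first
Phi = np.array([[1.0, 0.0],
                [0.0, 0.6]])
Psi = np.diag([0.3, 0.4, 0.5, 0.2])

Sigma = implied_covariance(Lam, B, Phi, Psi)
Lam_star = rescaled_loadings(Lam, B, Phi)
Sigma_efa = Lam_star @ Lam_star.T + Psi      # form (2) with the rescaled loadings

print(np.allclose(Sigma, Sigma_efa))
```

Under these assumptions the two covariance matrices agree, so nothing is lost by working with the simpler form (2) in the manipulations that follow.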
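Without loss of generality, if necessary, we permute the rows and columns of Σ so that when we write
$$\Lambda = \begin{bmatrix}\Lambda_1\\ \Lambda_2\end{bmatrix} \tag{3}$$
the k×k part Λ1 has full rank. Thus, with x partitioned conformably into x1 and x2 and expressed in deviation-from-mean form, the structural equations can be written as
$$x_1 = \Lambda_1\zeta + \varepsilon_1, \qquad x_2 = \Lambda_2\zeta + \varepsilon_2. \tag{4}$$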
It follows that x1 − ε1 = Λ1ζ, and hence
$$\zeta = \Lambda_1^{-1}(x_1 - \varepsilon_1). \tag{5}$$
Since (5) provides a way of expressing the factors in terms of observed and residual error variables, it can be used to eliminate the factors completely from the model. To accomplish this, we substitute (5) into the second set of equations in (4), obtaining
$$x_2 = \Lambda_2\Lambda_1^{-1}(x_1 - \varepsilon_1) + \varepsilon_2. \tag{6}$$
We also do some manipulation of (6) to obtain an expression for x1. This is
$$x_1 = \Lambda_1(\Lambda_2'\Lambda_2)^{-1}\Lambda_2'(x_2 - \varepsilon_2) + \varepsilon_1. \tag{7}$$
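One way to carry out the manipulation leading to an expression for x1 is sketched below. It assumes that (6) has the form $x_2 = \Lambda_2\Lambda_1^{-1}(x_1 - \varepsilon_1) + \varepsilon_2$ and that $\Lambda_2$ has full column rank k, so that $\Lambda_2'\Lambda_2$ is invertible. The use of this particular left inverse of $\Lambda_2$ is an assumption of the sketch.

$$
\begin{aligned}
x_2 - \varepsilon_2 &= \Lambda_2\Lambda_1^{-1}(x_1 - \varepsilon_1) \\
(\Lambda_2'\Lambda_2)^{-1}\Lambda_2'(x_2 - \varepsilon_2) &= \Lambda_1^{-1}(x_1 - \varepsilon_1) \\
\Lambda_1(\Lambda_2'\Lambda_2)^{-1}\Lambda_2'(x_2 - \varepsilon_2) &= x_1 - \varepsilon_1 \\
x_1 &= \Lambda_1(\Lambda_2'\Lambda_2)^{-1}\Lambda_2'(x_2 - \varepsilon_2) + \varepsilon_1
\end{aligned}
$$

The second line premultiplies both sides by $(\Lambda_2'\Lambda_2)^{-1}\Lambda_2'$, and the third premultiplies by $\Lambda_1$.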
These two equations, (6) and (7), are key results. They express the observed variables as an interdependent system relating observed variables and residual errors only. The common factors are gone. This is the Houdini transformation, considered in a more general SEM context as compared to Molenaar (2003, 2011).
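The elimination of the factors can be illustrated numerically. The Python sketch below generates a few draws from a factor model with hypothetical loadings and then rebuilds x2 from (x1, ε1, ε2) and x1 from (x2, ε1, ε2) without using ζ. It assumes the forms x2 = Λ2Λ1⁻¹(x1 − ε1) + ε2 and x1 = Λ1(Λ2′Λ2)⁻¹Λ2′(x2 − ε2) + ε1 for the two key equations; the sizes and numbers are illustrative only.

```python
import numpy as np

# Hypothetical loadings: p = 6 observed variables, k = 2 factors.
Lam = np.array([[1.0, 0.2],
                [0.3, 1.0],
                [0.8, 0.1],
                [0.7, 0.4],
                [0.2, 0.9],
                [0.5, 0.6]])
p, k = Lam.shape
Lam1, Lam2 = Lam[:k], Lam[k:]            # Lam1 is k x k, Lam2 is (p-k) x k

rng = np.random.default_rng(0)
n = 5                                    # number of simulated draws (columns)
zeta = rng.normal(size=(k, n))           # common factors
eps = 0.5 * rng.normal(size=(p, n))      # unique variates
x = Lam @ zeta + eps                     # factor model in deviation form

x1, x2 = x[:k], x[k:]
eps1, eps2 = eps[:k], eps[k:]

# Rebuild x2 from x1 and the errors: the factors zeta are never used here.
x2_rebuilt = Lam2 @ np.linalg.solve(Lam1, x1 - eps1) + eps2

# Rebuild x1 from x2 and the errors via the left inverse of Lam2.
x1_rebuilt = Lam1 @ np.linalg.solve(Lam2.T @ Lam2, Lam2.T @ (x2 - eps2)) + eps1

print(np.allclose(x2_rebuilt, x2), np.allclose(x1_rebuilt, x1))
```

Both reconstructions are exact algebraic identities rather than approximations, which is why they hold draw by draw and not merely on average.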
THE ILLUSORY ABSENT FACTORS
The fundamental question in this paper is whether the Houdini transformation has resulted in a model (equations (6) and (7)) that is a latent variable model or not. To answer this question, it should be noted that each of the above steps is reversible. That is, starting out with a model of the form (6) and (7) with no common factors, a model with common factors can be obtained. Hence it would seem that if one model is a latent variable model, so must be the other one.
We look at this by investigating the dimensionality of the independent variables. Clearly, the dependent variables in the model are x1 and x2, and the independent variables are ε1 and ε2. By the original assumptions, they are uncorrelated, and hence they span a p-dimensional space. At first glance, it seems that we do not have a latent variable model. But what about the variables (x1 − ε1) and (x2 − ε2)? They are not dependent variables, but are they independent variables? To investigate this, from (4) we note that (x1 − ε1) = Λ1ζ and (x2 − ε2) = Λ2ζ are linear transformations of the k common factors ζ and hence both of these variables are k-dimensional latent variables. Furthermore, since they are two linear transformations of a single vector variate ζ that is independent of the unique variates ε1 and ε2, the dimensionality of the space of (6) and (7) is p + k. Hence the Houdini transformation did not reduce the space spanned by the independent variables.
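The dimension count in this argument can be illustrated with a rank computation. The Python sketch below stacks the variables (x − ε) = Λζ and ε, forms their joint covariance matrix under the assumptions E(ζζ′) = I and a diagonal, full-rank Ψ, and reports its rank. The loadings and unique variances are hypothetical, and an explicit tolerance is passed to the rank routine as a precaution against rounding error.

```python
import numpy as np

# Hypothetical loadings and unique variances: p = 6, k = 2.
Lam = np.array([[1.0, 0.2],
                [0.3, 1.0],
                [0.8, 0.1],
                [0.7, 0.4],
                [0.2, 0.9],
                [0.5, 0.6]])
p, k = Lam.shape
Psi = np.diag([0.3, 0.4, 0.5, 0.2, 0.6, 0.35])

# Covariance of (x - eps) = Lam zeta, given E(zeta zeta') = I.
common = Lam @ Lam.T

# Joint covariance of the stacked vector [(x - eps); eps]; the off-diagonal
# blocks are zero because zeta and eps are uncorrelated.
zeros = np.zeros((p, p))
cov_stacked = np.block([[common, zeros],
                        [zeros, Psi]])

rank_errors = np.linalg.matrix_rank(Psi, tol=1e-10)          # errors alone
rank_common = np.linalg.matrix_rank(common, tol=1e-10)       # (x - eps) alone
rank_total = np.linalg.matrix_rank(cov_stacked, tol=1e-10)   # both together

print(rank_errors, rank_common, rank_total)
```

With these values the errors alone span p dimensions, the variables (x − ε) span k dimensions, and together they span p + k, matching the count given in the text.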
We complete this section by relating equations (6) and (7) to our opening example. The abstract algebra can be mapped into the numbers of the example with the calculations W21 = Λ2Λ1⁻¹ and W12 = Λ1(Λ2′Λ2)⁻¹Λ2′.
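The mapping to the opening example can be sketched in Python as follows. The sketch assumes that the interdependent system has the form x1 = W12x2 + δ1 and x2 = W21x1 + δ2, with δ1 = ε1 − W12ε2 and δ2 = ε2 − W21ε1, and that the weights are obtained as W21 = Λ2Λ1⁻¹ and W12 = Λ1(Λ2′Λ2)⁻¹Λ2′. The loadings below are hypothetical; they only match the dimensions of the example (x1 is 2×1, x2 is 6×1) and are not the paper's numbers.

```python
import numpy as np

# Hypothetical loadings sized like the opening example.
Lam1 = np.array([[1.0, 0.2],
                 [0.3, 1.0]])                          # 2 x 2, full rank
Lam2 = np.array([[0.8, 0.1],
                 [0.7, 0.4],
                 [0.2, 0.9],
                 [0.5, 0.6],
                 [0.9, 0.3],
                 [0.1, 0.7]])                          # 6 x 2

W21 = Lam2 @ np.linalg.inv(Lam1)                       # 6 x 2
W12 = Lam1 @ np.linalg.solve(Lam2.T @ Lam2, Lam2.T)    # 2 x 6

# A few draws from the orthogonal factor model x = Lam zeta + eps.
rng = np.random.default_rng(1)
zeta = rng.normal(size=(2, 4))
eps1 = rng.normal(size=(2, 4))
eps2 = rng.normal(size=(6, 4))
x1 = Lam1 @ zeta + eps1
x2 = Lam2 @ zeta + eps2

# Structured residuals of the interdependent system.
delta1 = eps1 - W12 @ eps2
delta2 = eps2 - W21 @ eps1

# Right-hand sides of the system: observed variables and residuals only.
rhs1 = W12 @ x2 + delta1
rhs2 = W21 @ x1 + delta2

print(np.allclose(rhs1, x1), np.allclose(rhs2, x2))
```

Under these assumptions, data generated by the factor model satisfy the interdependent system exactly, which is the sense in which the two representations are equivalent.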
DISCUSSION
A universally accepted distinction in structural modeling is between observed variable path models, or simultaneous equation models, and latent variable models. What the Houdini transformation shows is that a model that contains only observed variables and residual errors actually can be a latent variable model. To our knowledge, this observation is new.
While the results in this paper provide one view on the Houdini transformation, we want to recognize that we have dealt with a specialized approach to eliminating latent factors from a model. Molenaar actually provided a broader view of this transformation which emphasizes the following additional features. The Houdini transformation involves a principled approach to derive a family of equivalent structural equation models that have not been considered before in the published literature. The equivalence relationship concerned is reminiscent of the relationship between state space models with latent state processes and transfer function models directly linking input and output processes (cf. Heij et al., 2007). The way in which the set of equivalent models is obtained implies that all models with latent factors are nested. This has important consequences for model selection. Moreover, the Houdini transformation of a model with q latent factors is obtained in a sequence of steps, where in each step the dimension of the latent space is reduced. This raises new questions about the proper definition of the dimension of spaces spanned by common latent variables.
ACKNOWLEDGEMENTS
This research was supported in part by grants 5K05DA000017-35 and 5P01DA001070-38 from the National Institute on Drug Abuse to P. M. Bentler and grant 0852147 from the National Science Foundation to P. C. M. Molenaar. Bentler acknowledges a financial interest in EQS and its distributor, Multivariate Software.
REFERENCES
- Bentler PM. Linear systems with multiple levels and types of latent variables. In: Jöreskog KG, Wold H, editors. Systems under indirect observation: Causality, structure, prediction. Amsterdam: North-Holland; 1982. pp. 101–130.
- Bentler PM, de Leeuw J. Factor analysis via components analysis. Psychometrika. 2011;76(3):461–470. doi: 10.1007/s11336-011-9217-5.
- Bentler PM, Kano Y. On the equivalence of factors and components. Multivariate Behavioral Research. 1990;25:67–74. doi: 10.1207/s15327906mbr2501_8.
- Bentler PM, Weeks DG. Linear structural equations with latent variables. Psychometrika. 1980;45:289–308.
- Bollen KA. Latent variables in psychology and the social sciences. Annual Review of Psychology. 2002;53:605–634. doi: 10.1146/annurev.psych.53.100901.135239.
- Cudeck R, MacCallum RC, editors. Factor analysis at 100: Historical developments and future directions. Mahwah, NJ: Erlbaum; 2007.
- Granger CWJ, Morris MJ. Time series modelling and interpretation. Journal of the Royal Statistical Society, A. 1976;139:246–257.
- Heij C, Ran A, van Schagen F. Introduction to mathematical systems theory: Linear systems, identification and control. Basel: Birkhäuser Verlag; 2007.
- Molenaar PCM. State space techniques in structural equation models. 2003, 2011 UCLA Statistics Preprint No. 635. Available from preprints.stat.ucla.edu.