Table 1.
Glossary of terms and their description (alphabetically ordered)
Term | Description |
---|---|
Akaike Information Criterion (AIC) | An index of model fit that balances goodness of fit against model complexity. The AIC is calculated using the maximum likelihood estimate. The AIC penalizes models as the number of parameters increases; the size of this penalty per parameter is constant and does not depend on sample size. |
Bayesian Information Criterion (BIC) | An index of model fit that balances goodness of fit against model complexity while taking sample size into account. The BIC is calculated using the maximum likelihood estimate (variants include the sample-size-adjusted BIC). The BIC also penalizes models as the number of parameters increases; however, this penalty grows as the sample size increases. |
Entropy | A measure of separation between latent classes. Higher entropy denotes better class separation. It is calculated from the sample size, the number of classes, and the posterior probabilities. Note that an over-fitted model can also have high entropy, so entropy should not be used for model selection. |
Factor Analysis / Complete Factor Analysis | A variant of principal component analysis in which the 1's on the diagonal of the correlation matrix are replaced by an estimate of each variable's communality. |
Full information maximum likelihood (FIML) | A method of finding the maximum likelihood solution in the presence of missing data. |
Growth mixture modeling | A form of mixture modeling used to identify latent trajectories in longitudinal data. Growth, in the sense of monotonic increase, is not a requirement. |
Hidden Markov Model / Latent Transition Analysis | A form of statistical modeling used to model changes in categories over time where the groups or categories are not directly observed. |
Indicators | The observed variables used in a finite mixture model, which are assumed to have been generated by the underlying latent classes. |
Latent Class Analysis | A form of mixture modeling where all indicator variables are categorical or, in our usage, a mix of categorical and continuous. |
Latent Profile Analysis | A form of mixture modeling where all indicator variables are continuous. |
Latent Variable | A variable that cannot be directly observed, such as membership in a class. |
Local Independence | The concept that variables are independent of each other within a latent class. |
Local Maxima | A likelihood value that is a maximum only locally, not the global (true) maximum; analogous to mistaking the top of a foothill for the top of the mountain. |
Maximum Likelihood | Maximum likelihood estimation is the process of choosing model parameter values under which the observed data are most probable. |
Mixture Modeling | A form of statistical modeling that can be used to identify latent groupings within a dataset. |
Model Parameter | The internal components of a model that define its composition. Parameters are estimated from the training data and, once estimated, are held constant. The simplest example is the set of coefficients in a regression equation. |
Monte-Carlo Estimation | Computational algorithms that use repeated random sampling to estimate expected values where direct calculation is not feasible. In LCA, Monte-Carlo simulation studies can be used to determine power or to estimate model performance under varied conditions. |
Multiple Imputation | An approach to handling missing data in which missing values are replaced using algorithms that account for the variance of the data, and multiple such datasets are imputed. Results from each dataset are combined for the final analysis. |
Posterior Probability | The probability of class membership for each observation after the model has been fit. |
Principal Components Analysis | A mathematical method of data reduction where N variables are replaced by a smaller set of components. |
Salsa Effect | Forcing a single population into separate latent classes that are merely spread along a single spectrum or variable. |
Vuong-Lo-Mendell-Rubin (VLMR) test | A test of whether a k-class model fits the data better than a (k-1)-class model. |
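The AIC, BIC, and entropy entries above describe their calculations only verbally. As a minimal sketch, the standard formulas (AIC = 2k - 2 ln L; BIC = k ln(n) - 2 ln L; and the relative-entropy statistic commonly reported by mixture-modeling software, an assumption rather than something stated in the table) can be computed as follows; the numeric inputs are hypothetical values used purely for illustration:

```python
import math

def aic(log_likelihood: float, n_params: int) -> float:
    """AIC = 2k - 2 ln(L): the penalty is 2 per parameter, regardless of sample size."""
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """BIC = k ln(n) - 2 ln(L): the per-parameter penalty grows with sample size."""
    return n_params * math.log(n_obs) - 2 * log_likelihood

def relative_entropy(posteriors: list[list[float]]) -> float:
    """Normalized entropy in [0, 1]; values near 1 indicate well-separated classes.

    `posteriors` is an n-by-K matrix of posterior class-membership probabilities,
    one row per observation and one column per latent class.
    """
    n, K = len(posteriors), len(posteriors[0])
    ent = -sum(p * math.log(p) for row in posteriors for p in row if p > 0)
    return 1 - ent / (n * math.log(K))

# Hypothetical values from a fitted two-class model, for illustration only
ll, k, n = -1234.5, 8, 500
print(aic(ll, k))        # 2485.0
print(round(bic(ll, k, n), 2))
print(relative_entropy([[0.95, 0.05], [0.10, 0.90]]))
```

When comparing candidate models with different numbers of classes, lower AIC and BIC values indicate a preferred model, while relative entropy describes classification quality rather than fit, consistent with the caution in the entropy entry above.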