Author manuscript; available in PMC: 2022 Jan 1.
Published in final edited form as: Alzheimer Dis Assoc Disord. 2021 Oct-Dec;35(4):306–314. doi: 10.1097/WAD.0000000000000462

A comparison of methods for predicting future cognitive status: mixture modeling, latent class analysis, and competitors

Frank Appiah 1, Richard J Charnigo 2
PMCID: PMC8605986  NIHMSID: NIHMS1707199  PMID: 34224419

Abstract

Purpose:

The present work compares various methods for using baseline cognitive performance data to predict eventual cognitive status of longitudinal study participants at the University of Kentucky’s Alzheimer’s Disease Center.

Methods:

Cox proportional hazards models examined time to cognitive transition as predicted by risk strata derived from normal mixture modeling, latent class analysis, and a one-standard-deviation thresholding approach. An additional comparator involved prediction directly from a numeric value for baseline cognitive performance.

Results:

A normal mixture model suggested three risk strata based on CERAD T scores: high, intermediate, and low risk. Cox modeling of time to cognitive decline based on posterior probabilities for risk stratum membership yielded an estimated hazard ratio (HR) of 4.00 with 95% confidence interval (CI) 1.53 to 10.44 in comparing high risk membership to low risk; for intermediate risk membership versus low risk, the modeling yielded HR = 2.29 and 95% CI = 0.98 to 5.33. Latent class analysis produced three groups, which did not have a clear ordering in terms of risk; however, one group exhibited appreciably greater hazard of cognitive decline. All methods for generating predictors of cognitive transition yielded statistically significant likelihood ratio statistics but modest concordance statistics.

Conclusion:

Posterior probabilities from mixture modeling allow for risk stratification that is data-driven and, in the case of CERAD T scores, modestly predictive of later cognitive decline. Incorporating other covariates may enhance predictions.


In 2018, the prevalence of Alzheimer’s disease (AD) in the United States was about 5 million [1,2], and this number is projected to top 14 million by 2050, with a mortality rate of 43% [3]. As a result of the burden of AD, it is imperative to strive toward finding a cure. Equally important is the need to correctly identify and provide helpful routines to people who have preclinical disease or even mild cognitive impairment (MCI) before they develop clinical AD [4].

Assessment of cognitive function of older adults can form the basis of or aid in diagnosis [5–7] and identify candidates for potential intervention when the disease is in its early stages [8,9]. Accurate assessment also supports future prediction of cognitive status [9,10] and can be used to screen potential study participants in clinical trials [11,12]. However, accurately assessing cognitive status is difficult due in part to strong associations between cognition and demographic characteristics such as age, gender, race, and education [13,14] as well as the paucity of adequate normative data for most neurocognitive tests [15]. Additionally, practice effects may limit accurate assessment of cognitive status when tests are administered more than once [16], and floor and ceiling effects also complicate measurement [17].

The Consortium to Establish a Registry for Alzheimer’s Disease (CERAD) neuropsychological battery measures global cognition in older adults, and the CERAD T score mitigates floor and ceiling effects by providing a wide range of possible scores. The CERAD T score has been used to identify MCI and dementia patients with diverse etiologies [18], examine memory decline [19], and discriminate normal cognition from dementia [20,21]. The CERAD tests have also been used to predict dementia [22,23] and to understand the dynamics of cognitive decline over time [24–26].

In this study, we examined how CERAD T scores measured at study baseline were associated with eventual clinical status among highly educated participants who were nominally cognitively intact at baseline, followed longitudinally, and came to autopsy. We applied normal mixture models to baseline T scores to identify subgroups of participants. We then examined these subgroups with regard to future clinical and neuropathological outcomes. Three alternatives to mixture modeling were also considered: use of subgroups derived from latent class analysis, use of subgroups based on a one-standard-deviation thresholding approach, and direct entry of the T scores into Cox models.

Methods

Subjects

The participants in this study were drawn from the University of Kentucky Alzheimer’s Disease Center (UK ADC) longitudinal cohort. All participants in the current study were recruited between 1989 and 2004. The details of the Institutional Review Board approval, recruitment procedures (inclusion and exclusion criteria), and autopsy protocols are discussed elsewhere [27,28]. Briefly, participants had to be at least 60 years old, cognitively intact, and free of major neurological and psychiatric disorders, and they had to provide written informed consent to annual cognitive assessments and brain donation [27]. Participants in the current study enrolled as nominally cognitively intact and were followed to autopsy. Autopsy was required for the current study so that final clinical diagnosis was available and so that we could evaluate autopsy results.

Cognitive assessment

A neurocognitive battery including the CERAD measures was administered to participants annually by trained research assistants, and results were reviewed and interpreted by expert neuropsychologists and neurologists. Clinical diagnoses of MCI and dementia were determined by consensus based on established criteria and were not limited to data derived from the CERAD battery [29]. T score derivations and details of the various measures in the CERAD battery are presented by Chandler et al. [30]. In brief, a total score on the CERAD battery can range from 0 to 100 points. The total score is corrected for demographics, with an amount between 2 and 24 points added based on the subject’s gender, age, and education. The corrected total score – let us call it X – is converted to a T score using a table that is approximately summarized by the linear equation T = (5X - 250) / 4. This conversion is a type of standardization because the T score has a mean of 50 and a standard deviation of 10 for historical controls (cognitively normal people). The CERAD battery was administered annually from 1989 until 2005, when the UK ADC adopted the National Alzheimer’s Coordinating Center’s Uniform Data Set neuropsychological battery, which was mandated for all National Institute on Aging-funded ADCs [31].
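The linear approximation to the T score conversion can be sketched as follows. This is only an illustration of the approximate equation stated above (the actual conversion uses a lookup table); the function names are hypothetical.

```python
def corrected_total_to_t(x):
    """Approximate T score from the demographic-corrected CERAD total X."""
    return (5 * x - 250) / 4


def t_to_corrected_total(t):
    """Inverse of the approximation: X = (4T + 250) / 5."""
    return (4 * t + 250) / 5


# A corrected total of 90 maps to the historical-control mean of T = 50.
for x in (82, 90, 98):
    print(x, corrected_total_to_t(x))
```

Under this approximation, one standard deviation on the T scale (10 points) corresponds to 8 points on the corrected total score.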

Neuropathological assessment

Neuropathological counting metrics and methods were as described previously [32]. The level of AD changes observed in the brain was measured by the “ABC” score [33], which is a semi-quantitative measure of the aggregated burden of AD-type pathology: amyloid (A), neurofibrillary tangles (B), and neuritic plaques (C). A composite of these ABC scores is the summary: No AD changes, Low level of AD changes, Intermediate level of AD changes, and High level of AD changes. Moreover, a variable called NPNEUR describes the presence of plaques in the brain. Possible scores are 1 through 4, with a lower score indicating greater pathology; this is a reverse coding of what has been promulgated in some data dictionaries.

Normal mixture modeling

We implemented a normal mixture model [34] to identify subgroups of participants according to baseline T scores. Subgroups are identified in an unsupervised manner [34], which is to say that they are determined by the distribution of the data and not by thresholds which are prescribed a priori. There is interest in seeing whether these subgroups have different experiences regarding transition to cognitive impairment. In addition, these subgroups may differ on other observable characteristics. The assumptions for fitting a normal mixture model are: (i) baseline T scores for different subjects are independent; (ii) there exist a finite number of subpopulations – sometimes called “components” – to which subjects may be assigned, although we do not know a priori what this number is (and, hence, do not know a priori to which subpopulation a subject should be assigned); and, (iii) the T scores in each subpopulation follow a normal distribution specific to that subpopulation.

The mixture modeling was carried out as follows:

  1. A histogram of the baseline T scores is shown in Figure 1a. Because we have the approximate relationship T = (5X - 250) / 4, where X is the demographic-corrected total CERAD score, a plot of the demographic-corrected total CERAD scores would look similar, except that the labels of “35”, “40”, “45”, etc., on the horizontal axis of Figure 1a would be replaced by “78”, “82”, “86”, etc. The multimodality of Figure 1a suggests that mixture models may be useful in identifying components underlying the data. The number of components in the mixture was selected using the Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC), and an approximation to the singular Bayesian Information Criterion (sBIC) [35]. Our Supplemental Material provides a non-technical overview of the sBIC.

  2. Once the model selection criteria in point 1 were used to choose the number of components, estimates of model parameters – including means and variances as in assumption (iii) above – were determined using the normalmixEM function from the mixtools package [36] in R (version 3.6.1, 2019-07-05). These estimates were used to obtain posterior probabilities for each participant (that is, the probability of belonging to each subgroup, given a participant’s T score).

  3. Posterior probabilities were used to create risk strata for the time-to-event outcome of transitioning away from normal cognition. The risk strata, also known as hard classifications, accord with the most probable subgroup membership of each individual. (By “hard” we mean that the numerical values between 0 and 1 for posterior probabilities were coerced to 0’s and 1’s, so that each person received a “1” corresponding to the subgroup to which he/she most probably belonged.) In addition, we noted ranges of T scores which characterized each subgroup based on hard classifications.
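Steps 2 and 3 above can be sketched as follows: given mixture parameters, Bayes' rule yields posterior probabilities (soft classifications), and taking the most probable component yields the hard classification. The analysis itself used normalmixEM in R; this Python sketch only illustrates the calculation, and the weights, means, and standard deviations below are hypothetical values chosen for illustration, not the fitted estimates.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and SD."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def posterior_probabilities(t, weights, means, sds):
    """Soft classification: P(component k | T score t) by Bayes' rule."""
    joint = [w * normal_pdf(t, m, s) for w, m, s in zip(weights, means, sds)]
    total = sum(joint)
    return [j / total for j in joint]


def hard_classification(posteriors):
    """Index of the most probable component (argmax of the posteriors)."""
    return max(range(len(posteriors)), key=lambda k: posteriors[k])


# Hypothetical parameters for illustration only (not the fitted estimates).
WEIGHTS = [0.2, 0.6, 0.2]
MEANS = [40.0, 52.0, 64.0]
SDS = [1.5, 5.0, 1.5]
LABELS = ["high risk", "intermediate risk", "low risk"]

for t in (39, 41, 52, 64):
    post = posterior_probabilities(t, WEIGHTS, MEANS, SDS)
    print(t, [round(p, 3) for p in post], LABELS[hard_classification(post)])
```

The soft classifications retain how close a score lies to a boundary between components, which the hard classification discards.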

Figure 1a:

Histogram of baseline T scores overlaid with the fitted two-component and three-component mixture models, with hard classifications from the latter identified by coloration.

Cox modeling with mixture-derived risk strata

Posterior probabilities from the mixture model, also called soft classifications, were examined in a Cox proportional hazards model with and without adjusting for participants’ age, sex, years of education, and mini mental state exam at baseline. (By “soft” we mean that the numerical values between 0 and 1 for posterior probabilities were retained, reflecting uncertainty in group membership, instead of coerced to 0 or 1.) The time scale was in years, and survival time in the Cox model was the length of time in years from baseline until the participant transitioned to any diagnosed cognitive impairment (i.e., MCI or dementia). Participants who did not transition were effectively censored.

Cox regressions based on competitors to mixture modeling

Additional Cox models were considered, using other quantities as predictors in place of the posterior probabilities from the mixture model; the other quantities are described after this paragraph. The performances of the Cox models using posterior probabilities from the mixture model were compared to those of the Cox models using other predictors. All Cox models were evaluated using concordance (or “C”) statistics [37] and likelihood ratio test statistics. Data analyses were conducted in R version 3.6.1 (2019-07-05).

  1. The baseline T scores themselves. This approach assumes that the log hazard for cognitive transition is a linear function of baseline T score. If that assumption is correct, this direct use of T scores in a Cox model will yield better predictions than indirect uses of T scores (as in our other Cox models). A disadvantage of this approach is that it does not define clear risk strata. In contrast, mixture modeling yields a finite number of groups in a data-driven manner, such that people in different groups may have different levels of risk for cognitive transition.

  2. Indicators based on thresholds one standard deviation from the mean for historical controls. Three subsets of patients were formed according to whether their T scores were below 40, between 40 and 59, or above 59. Thus, 40 and 60 (one standard deviation below and above the mean for historical controls) defined thresholds. This approach defines clear risk strata but does not account for the shape of the distribution of T scores.

  3. Indicators from a latent class analysis [38]. After defining categorical versions of the T score, education, age, and mini mental state exam variables (see Supplemental Material for details), we performed a latent class analysis. We thereby identified three subsets of patients, and indicators for two of the subsets were used in Cox modeling, with the remaining subset regarded as a “reference group”. An advantage of this approach is taking into account multiple variables – not only T scores – to define risk strata. A disadvantage is that continuous variables must first be reduced to categorical versions.
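The one-standard-deviation thresholding in item 2 above can be sketched as a simple mapping from a T score to a stratum and to the two indicator variables entered in the Cox model. The function names are hypothetical; the cut points and the use of the highest stratum as the reference group follow the description of the indicators in the text.

```python
def threshold_stratum(t_score):
    """Assign a stratum using cut points at 40 and 60 (one SD from the mean of 50)."""
    if t_score < 40:
        return "below 40"
    if t_score <= 59:
        return "40 to 59"
    return "above 59"


def threshold_indicators(t_score):
    """Dummy variables (below 40, 40 to 59); 'above 59' is the reference group."""
    stratum = threshold_stratum(t_score)
    return (int(stratum == "below 40"), int(stratum == "40 to 59"))


for t in (38, 40, 59, 60):
    print(t, threshold_stratum(t), threshold_indicators(t))
```

Unlike the mixture-based strata, these cut points are fixed in advance and do not depend on the observed distribution of T scores.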

Posterior probabilities, mild impairment versus dementia, and neuropathological assessments

With the subset of 96 patients who experienced cognitive impairment, we fit a Cox model for time to dementia from onset of any cognitive impairment, using the previously obtained posterior probabilities as predictors. The intent was to see whether the posterior probabilities could distinguish mild cognitive impairment from dementia.

With all 284 subjects, we fit ordinal logistic regression models for ABC score and NPNEUR, with the mixture model-based posterior probabilities as predictors. The intent was to see whether the posterior probabilities corresponded to neuropathological assessments.

Results

Participant characteristics

There were 305 UK ADC participants who initially met inclusion criteria. Nineteen participants, who were nominally cognitively intact at study baseline, had baseline T scores ≤ 35, which is 1.5 standard deviations below the mean for normal cognition and is evidence for cognitive impairment. Two participants had two APOE-e4 alleles. These 21 participants were excluded, leaving 284, of whom all were white and approximately 60% were female. Mean educational attainment was about 16 years, and 27% carried an APOE-e4 allele. The median mini mental state exam and CERAD T scores were approximately 29 (out of 30) and 51 at baseline. Shortly before autopsy, 188 (out of 284) participants were cognitively normal (CN), 32 had MCI, and 64 had dementia. Participants’ ABC scores at autopsy showed that most had no to low AD changes, while 21% had intermediate, and 20% had high AD changes. Regarding participants’ NPNEUR scores at autopsy, 22% had the worst rating, 49% had intermediate ratings, and 29% had the best rating. The median follow-up was 9.7 years.

Findings from mixture modeling

Figure 1a shows the distribution of baseline T scores for the 284 subjects. Estimated two-component and three-component mixture models are superimposed. The three-component model appears to be a better fit visually, and all three model selection criteria (AIC, BIC and approximate sBIC) favored three components over two. For purposes of interpretation, the three components were labeled “low risk,” “intermediate risk,” and “high risk.” The estimated component means ± standard deviations were 39.7 ± 1.0 (high risk), 51.9 ± 5.7 (intermediate risk), and 64.3 ± 1.0 (low risk).
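As a rough illustration of how AIC and BIC compare mixtures with different numbers of components, the sketch below applies the standard definitions to a univariate normal mixture. The maximized log-likelihood values are hypothetical placeholders, not values from this analysis, and the sBIC approximation is not covered here.

```python
import math


def n_free_parameters(n_components):
    """g means + g variances + (g - 1) free mixing weights."""
    return 3 * n_components - 1


def aic(log_lik, k):
    """Akaike Information Criterion; smaller is better."""
    return 2 * k - 2 * log_lik


def bic(log_lik, k, n):
    """Bayesian Information Criterion; smaller is better."""
    return k * math.log(n) - 2 * log_lik


N = 284  # number of participants
# Hypothetical maximized log-likelihoods for illustration only.
hypothetical_log_lik = {2: -1000.0, 3: -980.0}

for g, ll in hypothetical_log_lik.items():
    k = n_free_parameters(g)
    print(g, k, round(aic(ll, k), 1), round(bic(ll, k, N), 1))
```

Both criteria trade goodness of fit against the number of parameters, with BIC penalizing each additional parameter more heavily at this sample size.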

Of the 284 participants (all cognitively normal at baseline), 52 (18.3%) were classified by maximum posterior probability as being in the low risk component, 184 were classified as intermediate risk (64.8%), and 48 were classified as high risk (16.9%); see Figure 1b. The intermediate risk participants had T score values between 42 and 62, while the high risk group had T score values between 38 and 41. Low risk scores were between 63 and 65.

Figure 1b:

An estimated posterior probability plot based on hard classifications. Probability of 1 indicates presumed membership in the mixture component and 0 indicates presumed non-membership.

Over the observed range of T scores, the only T scores not generating posterior probabilities above 80% were T = 41, 42, 62, and 63. For these T scores, the largest posterior probabilities were between 70% and 75%, corresponding to high, intermediate, intermediate, and low risk respectively.

Table 1 shows that many of the subjects who remained cognitively normal until death were of intermediate risk. This occurs, in part, because the intermediate risk group is the largest of the three. Moreover, the labels for mixture components (low, intermediate, and high risk) are in relation to each other rather than absolute. In particular, a person classified as intermediate is thought to have greater risk of transitioning cognitively than a person classified as low.

Table 1:

Descriptive statistics. A cognitive transition entails eventual mild cognitive impairment and/or dementia. Time to event is time to cognitive transition for those who experienced that and time to death for those who remained cognitively normal. Education is in years. Interval variables are represented as mean ± standard deviation. Female gender is represented as number (percent).

| Mixture-based classification | Cognitive transition? | Number | Female gender | Time to event | Age | Education | Mini mental state exam |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Low risk | No | 29 | 17 (59%) | 8.4 ± 5.2 | 74.1 ± 5.8 | 15.8 ± 3.2 | 29.6 ± 0.7 |
| Low risk | Yes | 9 | 5 (56%) | 13.4 ± 4.1 | 71.8 ± 6.2 | 16.1 ± 2.3 | 29.3 ± 0.7 |
| Intermediate risk | No | 137 | 74 (54%) | 8.4 ± 4.6 | 76.7 ± 8.1 | 16.1 ± 2.4 | 29.0 ± 1.1 |
| Intermediate risk | Yes | 68 | 49 (72%) | 9.4 ± 4.5 | 77.3 ± 5.7 | 15.9 ± 2.6 | 28.8 ± 1.2 |
| High risk | No | 22 | 13 (59%) | 8.8 ± 5.8 | 79.6 ± 7.6 | 16.0 ± 2.3 | 28.8 ± 1.0 |
| High risk | Yes | 19 | 14 (74%) | 7.1 ± 3.9 | 78.4 ± 6.8 | 15.3 ± 1.7 | 28.2 ± 1.5 |

If those classified as high risk are separated from those classified as low or intermediate risk, then sensitivity for prediction of cognitive transition is 19/96 = 20%, specificity is 166/188 = 88%, positive predictive value is 19/41 = 46%, and negative predictive value is 166/243 = 68%. If those classified as intermediate or high risk are separated from those classified as low risk, then sensitivity is 87/96 = 91%, specificity is 29/188 = 15%, positive predictive value is 87/246 = 35%, and negative predictive value is 29/38 = 76%.
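The sensitivity, specificity, and predictive values above follow directly from the counts in Table 1. A minimal sketch of the arithmetic, with a hypothetical function name and the Table 1 counts entered by hand:

```python
# Counts from Table 1: (did not transition, transitioned) per stratum.
TABLE_1 = {"low": (29, 9), "intermediate": (137, 68), "high": (22, 19)}


def diagnostic_metrics(positive_strata):
    """Treat the given strata as a 'positive' prediction of cognitive transition."""
    tp = sum(TABLE_1[s][1] for s in positive_strata)
    fp = sum(TABLE_1[s][0] for s in positive_strata)
    fn = sum(v[1] for s, v in TABLE_1.items() if s not in positive_strata)
    tn = sum(v[0] for s, v in TABLE_1.items() if s not in positive_strata)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


for strata in ({"high"}, {"intermediate", "high"}):
    metrics = diagnostic_metrics(strata)
    print(sorted(strata), {name: round(100 * value) for name, value in metrics.items()})
```

Moving the cut from "high risk only" to "intermediate or high risk" trades specificity for sensitivity, as the two sets of figures show.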

The Kaplan-Meier plot (Figure 2a) for time to cognitive impairment (either MCI or dementia), based on hard classification from posterior probabilities, showed that membership in the low risk group may indicate lower probability of (or greater time until) cognitive decline, while being in the high risk group may indicate higher probability of (or less time until) cognitive decline. Thus, the labeling of mixture components in order of their estimated means appears consistent with the later cognitive states of the subjects. The low risk group was used as the reference group for comparison in the Cox proportional hazards models.

Figure 2a:

The Kaplan-Meier plot associated with the hard classifications. Participants in the high risk group were most likely to transition and the low risk group participants were the least likely to transition.

The validity of the mixture modeling was supported by the longitudinal trajectories of the average T scores for the three groups based on hard classification (Figure 2b). Group averages were quite stable over 12 years of follow up. However, averages for the low risk group appeared to decline initially, possibly a form of regression to the mean. Depletion in sample size over time is reflected by widening of the error bars. There are no obviously implausible patterns in Figure 2b; however, the group averages at later time points are conditioned on survival of the subjects to those time points (and on the subjects’ remaining in the study).

Figure 2b:

Longitudinal mean T-score plot over 12 years of follow up. The sample sizes at each visit for the low risk group are n = (52, 46, 38, 29, 16), for the intermediate risk group are n = (184, 167, 132, 84, 36), and for the high risk group are n = (48, 41, 32, 24, 9). The error bars are standard errors.

Findings from Cox modeling with mixture-derived risk strata

Results from unadjusted and adjusted Cox models for time to cognitive transition appear in Tables 2 and 3. Models #1a/#1b use the soft classifications (posterior probabilities) from the three-component normal mixture model. Models #2a/#2b stem from latent class analysis, models #3a/#3b rely on groupings from thresholds one standard deviation above and below the mean for historical controls, and models #4a/#4b employ the numerical T scores themselves.

Table 2:

Unadjusted Cox proportional hazards models. For each model, the concordance statistic and (negative twice) the (log) likelihood ratio test statistic are shown. The concordance statistic generalizes the concept of area under a receiver operating characteristic curve, so that a higher value is better. The likelihood ratio test statistic is evaluated against a chi-square distribution on degrees of freedom equal to the number of hazard ratios being estimated. High risk and intermediate risk probabilities in Model #1a are based on the three-component normal mixture model. Indicators for latent classes 1 and 3 in Model #2a are based on the latent class analysis. Indicators for T score less than 40 and T score between 40 and 59 in Model #3a correspond to groupings based on cutoffs one standard deviation below and one standard deviation above the mean for T scores. The label “Numerical T score” in Model #4a means that the T score itself appears directly in the Cox model.

Entries for predictors: estimated hazard ratio (95% confidence interval), p-value.

| Statistic or predictor | Model #1a | Model #2a | Model #3a | Model #4a |
| --- | --- | --- | --- | --- |
| Concordance | 0.64 | 0.60 | 0.60 | 0.66 |
| Likelihood ratio | 8.8 | 12.3 | 8.3 | 12.4 |
| High risk probability | 4.00 (1.53–10.44), p = 0.005 | | | |
| Intermediate risk probability | 2.29 (0.98–5.33), p = 0.055 | | | |
| Indicator for latent class 1 | | 0.41 (0.24–0.70), p = 0.001 | | |
| Indicator for latent class 3 | | 0.44 (0.27–0.72), p = 0.001 | | |
| Indicator for T score less than 40 | | | 2.96 (1.25–7.02), p = 0.014 | |
| Indicator for T score between 40 and 59 | | | 1.95 (1.12–3.41), p = 0.019 | |
| Numerical T score | | | | 0.95 (0.93–0.98), p < 0.001 |

Table 3:

Adjusted Cox proportional hazards models. Please see the notes for Table 2. In addition, all models here are adjusted for gender, age, education, and mini mental state exam. The corresponding hazard ratios compare men to women (gender) or are based on one-unit increases (age, education, mini mental state exam).

Entries for predictors: estimated hazard ratio (95% confidence interval), p-value.

| Statistic or predictor | Model #1b | Model #2b | Model #3b | Model #4b |
| --- | --- | --- | --- | --- |
| Concordance | 0.70 | 0.68 | 0.70 | 0.71 |
| Likelihood ratio | 40.4 | 37.4 | 40.0 | 43.1 |
| High risk probability | 2.52 (0.93–6.77), p = 0.068 | | | |
| Intermediate risk probability | 1.73 (0.73–4.14), p = 0.216 | | | |
| Indicator for latent class 1 | | 2.05 (0.25–17.13), p = 0.506 | | |
| Indicator for latent class 3 | | 1.06 (0.56–1.99), p = 0.864 | | |
| Indicator for T score less than 40 | | | 1.59 (0.65–3.89), p = 0.313 | |
| Indicator for T score between 40 and 59 | | | 1.63 (0.92–2.88), p = 0.091 | |
| Numerical T score | | | | 0.97 (0.94–0.99), p = 0.012 |
| Male gender | 0.76 (0.48–1.20), p = 0.237 | 0.39 (0.05–3.02), p = 0.368 | 0.75 (0.47–1.20), p = 0.231 | 0.77 (0.49–1.22), p = 0.268 |
| Age | 1.06 (1.03–1.10), p < 0.001 | 1.07 (1.03–1.12), p < 0.001 | 1.06 (1.03–1.10), p < 0.001 | 1.06 (1.03–1.10), p = 0.001 |
| Education | 0.95 (0.86–1.05), p = 0.318 | 0.95 (0.86–1.05), p = 0.309 | 0.96 (0.87–1.05), p = 0.359 | 0.94 (0.85–1.04), p = 0.223 |
| Mini mental state exam | 0.77 (0.63–0.94), p = 0.009 | 0.76 (0.62–0.93), p = 0.008 | 0.76 (0.62–0.92), p = 0.006 | 0.77 (0.63–0.93), p = 0.007 |

For model #1a, membership in the high risk component of the mixture model is estimated to quadruple the hazard of cognitive transition, versus membership in the low risk component (reference group), with estimated hazard ratio (HR) 4.00 and 95% confidence interval (CI) 1.53 to 10.44. Membership in the intermediate risk component is estimated to more than double the hazard (HR 2.29, 95% CI 0.98 to 5.33). Posterior probabilities also permit comparisons between people with uncertain classifications. For instance, considering someone having 80% versus 20% posterior probability of high versus intermediate risk, compared to someone almost certainly of low risk, the hazard ratio estimate would be $3.99^{0.80} \times 2.12^{0.20} = 3.52$.
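The comparison between people with uncertain classifications works because the posterior probabilities enter the Cox model as covariates, so the log hazard ratio against a person almost certainly of low risk is a probability-weighted sum of the component log hazard ratios. A minimal sketch, with a hypothetical function name, using the two hazard ratio values exactly as they appear in the worked example above:

```python
import math


def blended_hazard_ratio(posteriors, hazard_ratios):
    """Hazard ratio versus the reference component for given posterior probabilities.

    Computes exp(sum_k p_k * log(HR_k)), i.e. the product of HR_k ** p_k.
    """
    log_hr = sum(p * math.log(hr) for p, hr in zip(posteriors, hazard_ratios))
    return math.exp(log_hr)


# 80% high risk, 20% intermediate risk, compared to (almost) certainly low risk.
print(round(blended_hazard_ratio([0.80, 0.20], [3.99, 2.12]), 2))  # 3.52
```

A person classified with certainty (posterior probability 1 for one component) recovers that component's hazard ratio unchanged.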

The likelihood ratio statistic for model #1a is 8.8. This is greater than 6.0, the 95th percentile of the reference chi-square distribution, providing support for risk stratification based on mixture modeling of baseline T scores. The concordance statistic of 0.64 indicates that risk stratification based on mixture modeling has merit but is limited in its predictive capabilities.
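To give a sense of what a concordance statistic measures, the sketch below implements one common formulation for censored data, often called Harrell's C. Whether this matches the exact variant computed in the analysis is an assumption, tied event times are ignored for simplicity, and the toy data are hypothetical.

```python
def harrell_c(times, events, risk_scores):
    """Fraction of comparable pairs in which the higher-risk subject fails first.

    A pair (i, j) is comparable when subject i had an observed event strictly
    before subject j's event or censoring time. Ties in risk score count 1/2.
    Pairs with tied times are skipped in this simplified version.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier failure
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / comparable


# Hypothetical toy data: years to transition or censoring, event flag, risk score.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.6, 0.7, 0.1]
print(harrell_c(toy_times, toy_events, toy_risks))  # 0.8
```

A value of 0.5 corresponds to predictions no better than chance and 1.0 to perfect ranking, which is why values in the 0.60 to 0.71 range are described as modest.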

For model #1b, there is a noticeable decrease in the estimated hazard ratios compared to model #1a. However, the ordering of mixture components with regard to adjusted risk of cognitive transition is consistent with the ordering in the unadjusted model and with our labeling of mixture components. Model #1b also shows that increased age elevates the hazard of cognitive transition; although an estimated hazard ratio of 1.06 per year of age may seem modest, a 7-year age difference is enough to raise the estimated hazard by more than 50%. On the other hand, a higher score on the mini mental state exam is protective, lowering the hazard of cognitive transition.

The likelihood ratio statistic for model #1b is 40.4, which is much greater than 12.6, the 95th percentile of the reference chi-square distribution. The concordance statistic is 0.70, which is larger than that of model #1a but still indicative of limited predictive capabilities. Because gender and education each had p-value > 0.20 in model #1b, one may wonder about the impact of removing them. Doing so causes the concordance and likelihood ratio statistics to drop slightly, from 0.70 to 0.69 and from 40.4 to 37.2 respectively; however, there is little effect on the estimated hazard ratios for the other predictors.
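The reference values of 6.0 and 12.6 are 95th percentiles of chi-square distributions with 2 and 6 degrees of freedom, matching the number of hazard ratios estimated in models #1a and #1b. The sketch below checks them using the closed-form chi-square distribution function for even degrees of freedom and a bisection search; the function names are hypothetical.

```python
import math


def chi2_cdf_even_df(x, df):
    """Chi-square CDF for even df: 1 - exp(-x/2) * sum_{i < df/2} (x/2)^i / i!"""
    half = x / 2
    partial_sum = sum(half ** i / math.factorial(i) for i in range(df // 2))
    return 1 - math.exp(-half) * partial_sum


def chi2_quantile_even_df(p, df, lo=0.0, hi=200.0):
    """Invert the CDF by bisection on the interval [lo, hi]."""
    for _ in range(100):
        mid = (lo + hi) / 2
        if chi2_cdf_even_df(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


print(round(chi2_quantile_even_df(0.95, 2), 2))  # 5.99
print(round(chi2_quantile_even_df(0.95, 6), 2))  # 12.59
```

Likelihood ratio statistics of 8.8 and 40.4 exceed these thresholds, consistent with the statements in the text.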

Findings from Cox regressions based on competitors to mixture modeling

Referring to Table 2, Cox regression using predictors derived from latent class analysis (model #2a) had a likelihood ratio statistic of 12.3 and a concordance statistic of 0.60. However, while latent classes 1 and 3 had lower hazards than latent class 2, latent classes 1 and 3 were not distinguishable from each other on hazard of transition to cognitive impairment. With predictors based on thresholds one standard deviation away from the mean, Cox regression (model #3a) yielded a likelihood ratio statistic of 8.3 and a concordance statistic of 0.60. Using the numerical T scores themselves as predictors (model #4a) generated a likelihood ratio statistic of 12.4 and a concordance statistic of 0.66.

Referring to Table 3, the concordance and likelihood ratio statistics of models #2b, #3b, and #4b are not far from those of model #1b. The estimated hazard ratios for age, education, and mini mental state exam are similar across these four models. The indicators for latent classes in model #2b have estimated hazard ratios that are accompanied by very wide confidence intervals, which was not true of model #2a. Apparently, the fact that the latent classes were derived from other variables obviates the latent classes themselves when those other variables appear directly in a Cox model. Regarding risk of cognitive transition, the clear ordering of groups in model #3a is obscured in model #3b.

Other findings involving posterior probabilities from mixture modeling

For the Cox model in which time to dementia is predicted for the 96 subjects who experienced any cognitive transition, the p-values are 0.63 and 0.65, respectively, for comparing high risk to low risk and intermediate risk to low risk.

For the ordinal logistic regression model in which NPNEUR is predicted for all 284 subjects, the p-values are 0.10 and 0.21, respectively, for comparing high risk to low risk and intermediate risk to low risk. As for the ABC score, the p-values are 0.48 and 0.83, respectively, for comparing high risk to low risk and intermediate risk to low risk.

Discussion

Contribution and strengths of this work

This study used multiple methods to derive predictors for Cox modeling of time to cognitive transition, among 284 research volunteers who had normal CERAD T scores at baseline. Our main focus was on normal mixture modeling, for which we used both traditional criteria (AIC and BIC) and a recently introduced criterion (sBIC) [35] to confirm that a three-component mixture model provided better fit to T scores than a two-component model. Other methods of deriving predictors were based on latent class analysis, placing thresholds one standard deviation away from the mean, and directly using the numerical T scores.

The three components of the normal mixture model were labeled as low, intermediate, and high risk. The low and intermediate risk groups were less likely to transition to cognitive impairment (Figure 2a) and had higher baseline T scores than the high risk group; the groups also maintained their relative ordering on T scores longitudinally (Figure 2b). This indicates that, even among scores that are considered in the normal range, mixture modeling may identify latent groups at baseline. In our study, individuals in the low risk group had T scores close to 65 at baseline, while individuals in the high risk group had T scores close to 40. Although 40 is only one standard deviation below the mean for historical controls, a score of 40 may raise concerns about future cognitive impairment, particularly for those who are highly educated.

A strength of the current study was the careful longitudinal monitoring of participants’ cognition, which decreases the possibility that participants had cognitive impairment but were undiagnosed as well as the possibility that they were wrongly classified with cognitive impairment. Limiting the study to participants with known cognitive status before death is also a strength.

Direct use of T scores versus mixture modeling

Using T scores directly, as in Model #4a/#4b, may be appealing on the grounds of simplicity. On the other hand, one may wish to define groups of subjects with possibly different levels of risk for cognitive transition. This idea of defining groups is prevalent in medical care and public health. For example, people are categorized as “normal”, “overweight”, or “obese” based on body mass index, even though the risks of cardiovascular disease or other health problems may vary continuously with body mass index. However, the groupings are useful because they provide an objective criterion by which clinicians and patients may recognize that heightened concern and action are warranted.

If we place people into groups, then how do we determine those groups? Mixture modeling yields a finite number of groups in a data-driven manner. However, we want to see whether such groupings also relate to a clinical outcome. Table 2 showed little difference between concordance statistics for Model #1a and Model #4a, although Model #4a exhibited a better likelihood ratio statistic. Much of the inherent predictive utility of the T scores is preserved through mixture modeling, while we also gain a definition of groups and posterior probabilities for group membership.

By working with posterior probabilities in the Cox modeling, we retain a little resolution from the underlying T scores. For example, a person with a T score of 38, 39, or 40 has (estimated) probability greater than 80% of belonging to the high risk group, while a person with a T score of 41 has 72% probability. Thus, a person with a T score of 39 and a person with a T score of 41 are regarded differently in the Cox modeling, even though they are both classified as high risk.
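As a minimal sketch of how such posterior probabilities arise, the snippet below applies Bayes' rule to a three-component normal mixture. The mixing weights, means, and standard deviations are hypothetical values chosen only for illustration; they are not the estimates from this study, so the output will not reproduce the probabilities quoted above. The names `COMPONENTS`, `normal_pdf`, and `posterior_probabilities` are likewise illustrative.

```python
import math

# HYPOTHETICAL parameters for illustration only (NOT the fitted values from
# this study). Each entry is (mixing weight, mean T score, standard deviation).
COMPONENTS = {
    "high": (0.25, 40.0, 4.0),
    "intermediate": (0.50, 52.0, 5.0),
    "low": (0.25, 65.0, 5.0),
}


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def posterior_probabilities(t_score, components=COMPONENTS):
    """Bayes' rule: weight * density for each component, then normalize."""
    weighted = {
        name: weight * normal_pdf(t_score, mean, sd)
        for name, (weight, mean, sd) in components.items()
    }
    total = sum(weighted.values())
    return {name: value / total for name, value in weighted.items()}


# Two nearby T scores get the same most probable group but different
# posterior probabilities, which is the resolution retained in Cox modeling.
for t in (39, 41):
    print(t, posterior_probabilities(t))
```

Under these assumed parameters, both T scores are assigned most probably to the high risk component, yet their posterior probabilities differ, so they enter a Cox model as different predictor values.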

Potential clinical utility

Suppose there is a diagnostic test on which a patient receives a score and on which historical data have been analyzed by mixture modeling. The T scores for cognitive assessments are an example, but the concept can be more general. One way to proceed is to use the mixture modeling to identify groups, as we did from the T scores in this study, and then ascertain the most probable group for each patient. A patient in a group believed to be at higher risk could be treated, counseled, and/or monitored differently.

However, as we see by comparing Model #1b to Model #1a, predictions can be improved by considering other demographic or clinical variables. Therefore, another way to proceed is to develop Framingham-type risk scores. These risk scores could be derived from Cox regressions, like Model #1b, which feature both posterior probabilities (from mixture modeling) and other variables (demographic and clinical). The mixture modeling and Cox regressions could be based on historical data, but posterior probabilities and then a risk score could be calculated for each current patient. The risk scores could then inform treatment, counseling, and/or monitoring of patients.
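One way such a score might be computed is sketched below, assuming the score is simply the linear predictor of a Cox model. The two coefficients for posterior probabilities are the logarithms of the hazard ratios quoted in the abstract (4.00 for high versus low risk, 2.29 for intermediate versus low risk). Any additional covariates and their coefficients are left to the caller, because those estimates are not given here; the function names are hypothetical.

```python
import math

# Log hazard ratios implied by the estimates quoted in the abstract:
# high vs. low risk HR = 4.00, intermediate vs. low risk HR = 2.29.
BETA_HIGH = math.log(4.00)
BETA_INTERMEDIATE = math.log(2.29)


def risk_score(p_high, p_intermediate, covariates=None, coefficients=None):
    """Cox linear predictor: sum of coefficient * predictor value.

    p_high and p_intermediate are posterior probabilities from the mixture
    model; low risk is the reference. `covariates` and `coefficients` are
    optional dicts keyed by variable name, supplied by the caller
    (hypothetical here, since no covariate estimates are reproduced).
    """
    score = BETA_HIGH * p_high + BETA_INTERMEDIATE * p_intermediate
    if covariates:
        if coefficients is None:
            raise ValueError("coefficients are required when covariates are given")
        for name, value in covariates.items():
            score += coefficients[name] * value
    return score


def relative_hazard(score):
    """Hazard relative to a person whose risk score is zero."""
    return math.exp(score)


# A patient with posterior probabilities 0.72 (high) and 0.28 (intermediate).
print(relative_hazard(risk_score(0.72, 0.28)))
```

A score of zero corresponds to certain membership in the low risk group with any covariates at their reference values. Converting a relative hazard into an absolute probability of transition by a given time would additionally require an estimate of the baseline survival function, which this sketch omits.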

Another possibility, though beyond the scope of the present work, is to allow that the relationship between the outcome and predictors may vary from one mixture component to another. In essence, mixture component membership may be not only a predictor of the outcome but also an effect modifier of one or more other predictors. A special case of this idea was explored by Gage for perinatal data; the relationship between birthweight and infant mortality was proposed to depend on the mixture component to which an infant belongs39. In the context of Alzheimer’s disease, there could be three formulas for Framingham-type risk scores, one for low risk, one for intermediate risk, and one for high risk. The mixture component to which a patient most probably belongs could be used to select one of these three formulas.
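One possible formalization of this idea is written out below, as an assumption rather than a model fitted in this work. Here $Z$ denotes the latent mixture component, $T$ the baseline T score, $x$ the vector of other predictors, $h_0$ a baseline hazard, and $\beta_k$ a coefficient vector specific to component $k$; all of these symbols are introduced only for illustration.

```latex
% Component-specific Cox model: the coefficient vector depends on the component k.
h(t \mid x, Z = k) = h_{0}(t)\,\exp\!\left(x^{\top}\beta_{k}\right),
\qquad k \in \{\text{low}, \text{intermediate}, \text{high}\}

% The most probable component selects which of the three formulas to apply.
\hat{k} = \arg\max_{k}\; \Pr(Z = k \mid T),
\qquad \text{risk score} = x^{\top}\beta_{\hat{k}}
```

Effect modification corresponds to the $\beta_k$ differing across components. If they were all equal, component membership could still enter as a predictor, but it would no longer change how the other predictors relate to the outcome.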

Further connections to existing literature

Group-based trajectory modeling40 is related to mixture modeling but requires longitudinal data to define groups. However, we wish to be able to define groups at baseline, because the potential clinical utility lies with using groups (or associated posterior probabilities) to treat or advise patients. Given that AD has no cure, only approaches that help patients cope with the disease1, it is imperative to equip potential future patients with information while they are still cognitively intact. Assignment to a high risk group, as established in a mixture model, may lead to more intensive testing and monitoring.

Hedden and colleagues41 applied mixture modeling to describe the distribution of amyloid burden for subjects in the Harvard Aging Brain Study. The measure of amyloid burden was, after logarithmic transformation, analyzed with a two-component normal mixture model. The posterior probabilities from that mixture model were used in path regressions to see whether amyloid burden and other brain markers mediated relationships between age and various forms of cognition.

Nyberg and Pudas42 proposed a conceptual model for memory aging consisting of four groups: pathological memory decline, ordinary memory decline, successful memory aging due to compensation (despite age-related brain changes), and successful memory aging due to brain maintenance. Thus, there may be multiple routes for memory preservation. Also, modifiable lifestyle factors may be able to help some people pursue these routes. Cabeza and colleagues43 sought to clarify distinctions between compensation, maintenance, and reserve. In so doing, they contrasted two groups: maintainers and decliners, so named based on longitudinal changes in assessments of episodic memory.

Study limitations

The 284 participants in our study were highly educated and racially and geographically homogeneous, which limits generalizability. Moreover, there may be competing biases in opposing directions. Some patients may enroll in an Alzheimer’s study because of family history, and susceptibility to Alzheimer’s has a genetic component. Thus, risk of Alzheimer’s may be overstated in our study. On the other hand, a person who is highly educated may have more cognitive reserve43 and may function effectively with early symptoms of Alzheimer’s. If this caused diagnoses to be delayed for some subjects, then risk of Alzheimer’s may be understated in our study. We do not know the net effect of these biases on the relationship between T scores (or mixture model-based posterior probabilities derived from T scores) and time to cognitive transition.

In addition, we limited participants in the study to those with autopsy, which may introduce further selection bias. Out of 706 subjects who were recruited by UK ADC before 2005 (and who had at least one follow-up visit), 298 had died and been autopsied as of 2015, when our data set was produced. There were 185 subjects still alive and in the study (as of 2015), 186 withdrawn from the study, and 37 who had died without autopsy. Mean ± standard deviation for age at baseline (in years) was 76.3 ± 7.2 for autopsied, 75.0 ± 6.6 for died without autopsy, 70.1 ± 8.3 for withdrawn, and 68.4 ± 6.2 for still alive. Thus, our subjects were (on average) older than contemporaneous UK ADC subjects who were still alive or who had withdrawn by the time the data set was curated. Differences on educational attainment and gender were less pronounced.

The longitudinal trajectories for the three groups derived from mixture modeling were limited to twelve years of follow-up, with later years excluded due to replacement of the CERAD battery with the UDS neuropsychological battery.

Conclusion

Mixture modeling can be used in conjunction with any diagnostic procedure whose result is numerical. Thus, the ideas in this article remain potentially applicable with other diagnostic procedures that may be in use now or that may come into use later. Future research and practice may employ these ideas to obtain normative data for neuropsychological tests or to identify people at heightened risk for cognitive transition.

Supplementary Material

Supplemental Data File (.doc, .tif, pdf, etc.)

Acknowledgement

The CERAD data was provided by the Sanders Brown Center on Aging at the University of Kentucky in association with the NIH-funded ADC grant under award number [P30 AG028383]. We also thank Erin Abner, PhD and Richard Kryscio, PhD from the Sanders Brown Center for information and/or assistance. We thank two anonymous referees for their comments, which guided revision of the paper.

Contributor Information

Frank Appiah, Program, Management, Analytics and Technology, 5200 S Ulster Street, Greenwood Village, CO 80111.

Richard J Charnigo, Multidisciplinary Science Building 325a, 725 Rose Street, University of Kentucky, Lexington, Kentucky 40536-0082.

References

  • 1. Alzheimer’s Association. Alzheimer’s disease facts and figures. https://www.alz.org/media/Documents/facts-and-figures-2018-r.pdf. 2018.
  • 2. Ostbye T, Hill G, Steenhuis R. Mortality in elderly Canadians with and without dementia: A 5-year follow-up. Neurology. 1999;53(3):521–521. doi: 10.1212/wnl.53.3.521.
  • 3. Albert M, DeKosky S, Dickson D et al. The diagnosis of mild cognitive impairment due to Alzheimer’s disease: Recommendations from the National Institute on Aging-Alzheimer’s Association workgroups on diagnostic guidelines for Alzheimer’s disease. Alzheimer’s & Dementia. 2011;7(3):270–279. doi: 10.1016/j.jalz.2011.03.008.
  • 4. Bonney K, Almeider O, Flicker L et al. Inspection time in non-demented older adults with mild cognitive impairment. Neuropsychologia. 2006;44(8):1452–1456. doi: 10.1016/j.neuropsychologia.2005.12.002.
  • 5. Ott B, Grace J, Frakey L, Kelley P et al. Prediction of functional decline and conversion from mild cognitive impairment with the telephone-administered Minnesota Cognitive Acuity Screen. Alzheimer’s & Dementia. 2013;9(4):P449. doi: 10.1016/j.jalz.2013.05.896.
  • 6. Chang Y, Yen Y, Chen T et al. Clinical Dementia Rating Scale Detects White Matter Changes in Older Adults at Risk for Alzheimer’s Disease. Journal of Alzheimer’s Disease. 2015;50(2):411–423. doi: 10.3233/jad-150599.
  • 7. Fitzpatrick-Lewis D, Warren R, Ali M et al. Treatment for mild cognitive impairment: a systematic review and meta-analysis. CMAJ Open. 2015;3(4):E419–E427. doi: 10.9778/cmajo.20150057.
  • 8. Bogdanova Y, Yee M, Ho V et al. Computerized Cognitive Rehabilitation of Attention and Executive Function in Acquired Brain Injury. Journal of Head Trauma Rehabilitation. 2015:1. doi: 10.1097/htr.0000000000000203.
  • 9. Green C, Zhang S. Predicting the progression of Alzheimer’s disease dementia: A multidomain health policy model. Alzheimer’s & Dementia. 2016. doi: 10.1016/j.jalz.2016.01.011.
  • 10. Rieckmann A, Van Dijk K, Sperling R et al. Accelerated decline in white matter integrity in clinically normal individuals at risk for Alzheimer’s disease. Neurobiology of Aging. 2016;42:177–188. doi: 10.1016/j.neurobiolaging.2016.03.016.
  • 11. Annweiler C. Vitamin D in dementia prevention. Ann NY Acad Sci. 2016;1367(1):57–63. doi: 10.1111/nyas.13058.
  • 12. Van de Rest O, Wang Y, Barnes L et al. APOE 4 and the associations of seafood and long-chain omega-3 fatty acids with cognitive decline. Neurology. 2016. doi: 10.1212/wnl.0000000000002719.
  • 13. Marshall G, Dekhtyar M, Bruno J et al. A New Performance-Based Activities of Daily Living Instrument for Early Alzheimer’s Disease. Alzheimer’s & Dementia. 2014;10(4):P365. doi: 10.1016/j.jalz.2014.05.418.
  • 14. Ganguli M, Snitz B, Lee C et al. Age and education effects and norms on a cognitive test battery from a population-based cohort: The Monongahela–Youghiogheny Healthy Aging Team. Aging & Mental Health. 2010;14(1):100–107. doi: 10.1080/13607860903071014.
  • 15. Keefe R, Harvey P, Goldberg T et al. Norms and standardization of the Brief Assessment of Cognition in Schizophrenia (BACS). Schizophrenia Research. 2008;102(1–3):108–115. doi: 10.1016/j.schres.2008.03.024.
  • 16. Dozois D, Covin R, Brinker J. Normative data on cognitive measures of depression. Journal of Consulting and Clinical Psychology. 2003;71(1):71–80. doi: 10.1037/0022-006x.71.1.71.
  • 17. Abner E, Dennis B, Mathews M et al. Practice effects in a longitudinal, multi-center Alzheimer’s disease prevention clinical trial. Trials. 2012;13(1):217. doi: 10.1186/1745-6215-13-217.
  • 18. Welsh K, Butters N, Hughes J, Mohs R, Heyman A. Detection of Abnormal Memory Decline in Mild Cases of Alzheimer’s Disease Using CERAD Neuropsychological Measures. Archives of Neurology. 1991;48(3):278–281. doi: 10.1001/archneur.1991.00530150046016.
  • 19. Patterson M, Mack J, Mackell J et al. A Longitudinal Study of Behavioral Pathology Across Five Levels of Dementia Severity in Alzheimerʼs Disease. Alzheimer Disease & Associated Disorders. 1997;11:40–44. doi: 10.1097/00002093-199700112-00006.
  • 20. Welsh K, Butters N, Mohs R et al. The Consortium to Establish a Registry for Alzheimer’s Disease (CERAD). Part V. A normative study of the neuropsychological battery. Neurology. 1994;44(4):609–609. doi: 10.1212/wnl.44.4.609.
  • 21. Wolfsgruber S, Jessen F, Wiese B et al. The CERAD Neuropsychological Assessment Battery Total Score Detects and Predicts Alzheimer Disease Dementia with High Diagnostic Accuracy. The American Journal of Geriatric Psychiatry. 2014;22(10):1017–1028. doi: 10.1016/j.jagp.2012.08.021.
  • 22. Tierney M, Szalai J, Snow W et al. Prediction of probable Alzheimer’s disease in memory-impaired patients: A prospective longitudinal study. Neurology. 1996;46(3):661–665. doi: 10.1212/wnl.46.3.661.
  • 23. Fillenbaum G, van Belle G, Morris J et al. Consortium to Establish a Registry for Alzheimer’s Disease (CERAD): The first twenty years. Alzheimer’s & Dementia. 2008;4(2):96–109. doi: 10.1016/j.jalz.2007.08.005.
  • 24. Heyman A, Peterson B, Fillenbaum G, Pieper C. The Consortium to Establish a Registry for Alzheimer’s Disease (CERAD). Part XIV: Demographic and clinical predictors of survival in patients with Alzheimer’s disease. Neurology. 1996;46(3):656–660. doi: 10.1212/wnl.46.3.656.
  • 25. Mohs R, Morris J, Heyman A, Belle G. Longitudinal assessment of Alzheimer’s disease: Follow-up of over 250 patients from the consortium to establish a registry for AD (CERAD). Biological Psychiatry. 1990;27(9):42. doi: 10.1016/0006-3223(90)90080-l.
  • 26. Markesbery W, Schmitt F, Kryscio R, Davis D, Smith C, Wekstein D. Neuropathologic Substrate of Mild Cognitive Impairment. Arch Neurol. 2006;63(1):38. doi: 10.1001/archneur.63.1.38.
  • 27. Schmitt FA, Nelson PT, Abner E, et al. University of Kentucky Sanders-Brown healthy brain aging volunteers: donor characteristics, procedures and neuropathology. Curr Alzheimer Res. 2012;9(6):724–733.
  • 28. Riley K. Prediction of preclinical Alzheimer’s disease: longitudinal rates of change in cognition. J Alzheimers Dis. 2011;25(4):707–17.
  • 29. Winblad B, Palmer K, Kivipelto M et al. Mild cognitive impairment - beyond controversies, towards a consensus: report of the International Working Group on Mild Cognitive Impairment. J Intern Med. 2004;256(3):240–246. doi: 10.1111/j.1365-2796.2004.01380.x.
  • 30. Chandler M, Lacritz L, Hynan L et al. A total score for the CERAD neuropsychological battery. Neurology. 2005;65(1):102–106. doi: 10.1212/01.wnl.0000167607.63000.38.
  • 31. Weintraub S, Salmon D, Mercaldo N et al. The Alzheimerʼs Disease Centersʼ Uniform Data Set (UDS). Alzheimer Disease & Associated Disorders. 2009;23(2):91–101. doi: 10.1097/wad.0b013e318191c7dd.
  • 32. Nelson P, Jicha G, Schmitt F et al. Clinicopathologic Correlations in a Large Alzheimer Disease Center Autopsy Cohort. Journal of Neuropathology and Experimental Neurology. 2007;66(12):1136–1146. doi: 10.1097/nen.0b013e31815c5efb.
  • 33. Montine T, Phelps C, Beach T et al. National Institute on Aging–Alzheimer’s Association guidelines for the neuropathologic assessment of Alzheimer’s disease: a practical approach. Acta Neuropathologica. 2011;123(1):1–11. doi: 10.1007/s00401-011-0910-3.
  • 34. Chen H, Chen J. The likelihood ratio test for homogeneity in finite mixture models. Canadian Journal of Statistics. 2001;29(2):201–215. doi: 10.2307/3316073.
  • 35. Drton M, Plummer M. A Bayesian Information Criterion for Singular Models. J. R. Statist. Soc. B (2017) 79, Part 2, pp. 323–380.
  • 36. Benaglia T, Chauveau D, Hunter DR, and Young DS (2009). “mixtools: An R Package for Analyzing Mixture Models.” Journal of Statistical Software, 32(6), 1–29.
  • 37. Uno H, Cai T, Pencina M, D’Agostino R, Wei L. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Statist Med. 2011:n/a–n/a. doi: 10.1002/sim.4154.
  • 38. Linzer Drew A., Lewis Jeffrey B. (2011). poLCA: An R Package for Polytomous Variable Latent Class Analysis. Journal of Statistical Software, 42(10), 1–29. URL http://www.jstatsoft.org/v42/i10/.
  • 39. Gage TB. Birth-weight-specific infant and neonatal mortality: effects of heterogeneity in the birth cohort. Hum Biol. 2002. April;74(2):165–84. doi: 10.1353/hub.2002.0020.
  • 40. Nagin Daniel S, Odgers Candice L. Group-Based Trajectory modeling (Nearly) Two Decades Later. J Quant Criminol (2010) 26:445–453. doi: 10.1007/s10940-010-9113-7.
  • 41. Hedden Trey, Schultz Aaron P., Rieckmann Anna, Mormino Elizabeth C., Johnson Keith A., Sperling Reisa A., Buckner Randy L., Multiple Brain Markers are Linked to Age-Related Variation in Cognition, Cerebral Cortex, Volume 26, Issue 4, April 2016, Pages 1388–1400.
  • 42. Nyberg L, Pudas S. Successful Memory Aging. Annual Review of Psychology. 2019;70:219–243. doi: 10.1146/annurev-psych-010418-103052.
  • 43. Cabeza R, Albert M, Belleville S et al. Maintenance, reserve and compensation: the cognitive neuroscience of healthy ageing. Nat Rev Neurosci 19, 701–710 (2018). doi: 10.1038/s41583-018-0068-2.
