Author manuscript; available in PMC: 2022 Mar 19.
Published in final edited form as: J Alzheimers Dis. 2021;84(3):1267–1278. doi: 10.3233/JAD-210621

Investigating predictors of preserved cognitive function in older women using machine learning: Women’s Health Initiative Memory Study

Ramon Casanova 1, Sarah A Gaussoin 2, Robert Wallace 3,4, Laura Baker 5, Jiu-Chiuan Chen 6, JoAnn E Manson 7, Victor W Henderson 8, Bonnie C Sachs 9,10, Jamie Justice 11, Eric A Whitsel 12, Kathleen M Hayden 13, Stephen R Rapp 14,15
PMCID: PMC8934040  NIHMSID: NIHMS1783031  PMID: 34633318

Abstract

BACKGROUND:

Identification of factors that may help to preserve cognitive function in late life could elucidate mechanisms and facilitate interventions to improve the lives of millions of people. However, the large number of potential factors associated with cognitive function poses an analytical challenge.

OBJECTIVE:

We used data from the longitudinal Women’s Health Initiative Memory Study and machine learning to investigate 50 demographic, biomedical, behavioral, social, and psychological predictors of preserved cognitive function in later life.

METHODS:

Participants who were at least 80 years old and had at least one cognitive assessment following their 80th birthday were classified as either cognitively preserved or impaired. Preserved cognitive function was defined as having a score ≥39 on the most recent administration of the modified Telephone Interview for Cognitive Status (TICSm) and a mean score across all assessments ≥39. Cognitively impaired participants were those adjudicated to have probable dementia or at least two adjudications of mild cognitive impairment within the 14 years of follow-up and a last TICSm score <31. Random Forests was used to rank the predictors of preserved cognitive function.

RESULTS:

Discrimination between groups based on area under the curve was 0.80 (95% CI 0.76–0.85). Women with preserved cognitive function were younger, better educated, less forgetful, less depressed and more optimistic at baseline. They also reported better physical function, less sleep disturbance and had lower systolic blood pressure, hemoglobin, and blood glucose levels.

CONCLUSIONS:

The predictors of preserved cognitive function are diverse and include demographic, psychological, physical, metabolic and vascular factors, suggesting a complex mix of potential contributors.

Keywords: Cognitive preservation, women, machine learning, Random Forests, WHIMS

Introduction

Identifying factors that may protect and preserve cognitive functioning in late life can have substantial public health benefits, as the number of people worldwide surviving to old age grows. Such research could inform public policy, identify targets for intervention and help elucidate underlying mechanisms. However, the potential contributors to preserved cognitive aging are numerous and interrelated. Biological, behavioral, psychological, social, economic, and environmental variables all may play important roles in protecting the brain and cognitive functions, and this fact poses an analytical challenge. Conventional approaches emphasize hypothesis testing that requires a priori assumptions about causation and limits the number of predictors. Machine learning (ML) approaches, on the other hand, are especially useful in population studies with large phenotypic data sets, allowing studies to go beyond well-known predictors [1–3]. Some machine learning methods have embedded variable selection mechanisms, which can provide valuable information about the comparative relevance or importance of predictors in the data set, and thus can capture subtle multivariate relationships, nonlinearities and even interactions that are otherwise difficult to detect.

Recently, machine learning methods have been used to study cognitive resilience in the presence of Alzheimer’s disease neuropathology [4], genetic risk [5, 6] and hippocampal atrophy [7]. They have also been used to predict preserved cognitive function, cognitive trajectories and cognitive impairment [8–11].

The Women’s Health Initiative (WHI) has been collecting diverse types of information from a large cohort of women over the past 25 years. The WHI Memory Study (WHIMS) and its several extensions collected cognitive performance data annually on a sub-cohort of WHI participants and adjudicated mild cognitive impairment (MCI) and dementia. The richly phenotyped WHIMS cohort provides an excellent opportunity to evaluate the comparative relevance of a diverse set of demographic, lifestyle, social, medical and psychological predictors of well-preserved cognitive functioning into late life.

Methods

Participants

The Women’s Health Initiative Hormone Therapy trials assessed the effects of two hormone therapy regimens, conjugated equine estrogens (CEE) with or without medroxyprogesterone acetate (MPA) versus placebo, on coronary heart disease, fractures and breast cancer among postmenopausal women[12]. WHIMS was an ancillary study to the WHI Hormone Therapy trials designed to assess the effect of postmenopausal hormone therapy (CEE with or without MPA versus placebo) on the incidence of all-cause probable dementia and global cognitive functioning, through annual in-person cognitive assessments, in 7479 women between 65 and 79 years of age at enrollment into WHI [13]. Following the early termination of the WHI CEE+MPA trial (July, 2002) and WHI CEE-Alone trial (March 2004), WHIMS participants continued their annual assessments during the WHIMS-Extension Study which lasted until June 2008 when they were enrolled in the WHIMS-Epidemiology of Cognitive Health Outcomes (WHIMS-ECHO, 2008–2021) study. At that point cognitive assessments transitioned from in-person to telephone-based. In addition to WHIMS, the WHI study collected a wide range of information from participants throughout all phases of the study which continues today. The present study cohort includes WHIMS and WHIMS-ECHO participants who met criteria, as described below. All women provided written informed consent and all protocols were approved by local Institutional Review Boards.

Cognitive assessments

In WHIMS and WHIMS-Extension, the Modified Mini-Mental State Examination (3MSE) [14] was administered annually to all participants. The 3MSE is a measure of global cognitive function and includes items assessing orientation, memory, attention, language, visuo-construction, and executive function. Women scoring below a pre-set cut-point were administered additional cognitive tests, a neuropsychiatric evaluation by a physician specialist, optional computerized tomography (CT) of the brain and blood tests. Additionally, a knowledgeable friend or family member was interviewed about observed cognitive and behavioral changes [13]. In WHIMS-ECHO, participants were administered annually a validated telephone cognitive test battery and questionnaires [15] by trained and certified staff blinded to treatment assignment. Included in the battery was the modified Telephone Interview for Cognitive Status (TICSm) [16], a measure of global cognitive function similar to the 3MSE with items assessing orientation, memory, attention, language, and executive function. Continuously across WHIMS, WHIMS-Extension and WHIMS-ECHO, incident MCI and probable dementia were adjudicated as described below.

Adjudication of cognitive impairment

Throughout all phases of WHIMS, a central, multidisciplinary (geropsychiatry, neurology, geropsychology, geriatrics) adjudication panel of experts in the diagnosis of dementia and related syndromes identified cases as either probable dementia, mild cognitive impairment or no cognitive impairment. Adjudication was triggered by a prespecified threshold on the global cognitive function measure (3MSE, TICSm). Each case was assigned to two adjudicators blinded to hormone treatment assignment who independently reviewed all available data from cognitive tests, clinical evaluations, laboratory tests and informant interviews before making their classification following standardized criteria. If the adjudicators agreed, their classification was final. If they disagreed, the case was discussed on regularly scheduled conference calls until consensus was reached.

Study Groups

The study cohort included women who met the following criteria: (a) previously enrolled in the WHIMS and WHIMS-ECHO studies; (b) at least 80 years of age as of January 1, 2018; and (c) had at least one WHIMS-ECHO annual cognitive assessment following their 80th birthday. The cognitively preserved group (N=205) included participants who: (a) were not previously adjudicated to have mild cognitive impairment (MCI) or probable dementia; (b) had a score ≥39 (high normal) on the most recent administration of the TICSm; and (c) had an average TICSm score ≥39 across all previous WHIMS-ECHO visits. The cognitively impaired group (N=176) included participants: (a) adjudicated with probable dementia within the first 14 years of follow-up; or (b) who had at least two classifications of MCI within the first 14 years of follow-up; and (c) whose latest TICSm score was <31.
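The group definitions above can be written as a small decision rule. The following is a minimal sketch, not the study's actual code: the function name and input layout are hypothetical, the mean is assumed to be taken over all available WHIMS-ECHO TICSm scores, and the impaired criteria are read as "(dementia or at least two MCI classifications) and last TICSm <31", following the wording of the abstract.

```python
from statistics import mean

PRESERVED_CUTOFF = 39  # TICSm score described as "high normal" in the text
IMPAIRED_CUTOFF = 31   # the last TICSm score must fall below this value


def classify_participant(ticsm_scores, n_mci, probable_dementia):
    """Return 'preserved', 'impaired', or None (in neither analysis group).

    ticsm_scores      -- TICSm scores in chronological order (hypothetical input)
    n_mci             -- number of MCI adjudications within 14 years of follow-up
    probable_dementia -- True if adjudicated with probable dementia within 14 years
    """
    if not ticsm_scores:
        return None
    last_score = ticsm_scores[-1]

    # Preserved: never adjudicated impaired, last score and mean score both >= 39.
    if (n_mci == 0 and not probable_dementia
            and last_score >= PRESERVED_CUTOFF
            and mean(ticsm_scores) >= PRESERVED_CUTOFF):
        return "preserved"

    # Impaired (assumed reading): dementia or >= 2 MCI, and last score < 31.
    if (probable_dementia or n_mci >= 2) and last_score < IMPAIRED_CUTOFF:
        return "impaired"

    return None
```

Participants who meet neither definition, for example those with intermediate TICSm scores, return `None` and would not enter the two-group comparison.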

Random Forests

We selected Random Forests (RF), a state-of-the-art machine learning approach [17]. RF has several desirable properties: it 1) is non-linear; 2) is easy to use, often needing little tuning; 3) generates several metrics of variable importance; 4) allows data imputation; 5) deals effectively with problems associated with large predictor-to-sample-size ratios; and 6) has a built-in mechanism to evaluate performance. These characteristics make RF an appealing tool for identifying the relative importance of varied risk and protective factors. RF belongs to the class of ensemble learning algorithms. Ensemble learning refers to methods that combine multiple classifiers (or ‘trees’ in the case of RF) to make a final prediction. Usually each individual classifier is “weak” (not a good performer in terms of prediction), but the combination of them can lead to powerful algorithms. Once the forest is built, predicting class membership of a new sample is accomplished by combining the trees, using a majority vote of all classifiers. Because each tree in the forest is generated by sampling the training data with (or without) replacement, some samples are omitted when building each tree. These out-of-bag (OOB) samples can be used to assess the performance of the classifier and to build measures of variable importance. This OOB mechanism is similar to a built-in cross-validation procedure that allows evaluation of performance.
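The bootstrap, out-of-bag and majority-vote ideas described above can be illustrated with a short standard-library sketch. It is an illustration only and not the randomForestSRC implementation: the helper names are hypothetical, and each "tree" is represented by a stand-in prediction function rather than a fitted model.

```python
import random
from collections import Counter


def bootstrap_split(n_samples, rng):
    """Draw a bootstrap sample with replacement; return (in-bag, out-of-bag) indices."""
    inbag = [rng.randrange(n_samples) for _ in range(n_samples)]
    inbag_set = set(inbag)
    oob = [i for i in range(n_samples) if i not in inbag_set]
    return inbag, oob


def majority_vote(votes):
    """Combine per-tree class votes into a single prediction."""
    return Counter(votes).most_common(1)[0][0]


def oob_prediction(sample_index, trees):
    """Predict a training sample using only trees for which it was out-of-bag.

    trees -- list of (oob_indices, predict_fn) pairs; predict_fn is a
             hypothetical stand-in for a fitted tree.
    """
    votes = [predict(sample_index) for oob, predict in trees if sample_index in oob]
    return majority_vote(votes) if votes else None


rng = random.Random(0)
inbag, oob = bootstrap_split(381, rng)  # 381 = size of the analysis cohort
oob_fraction = len(oob) / 381           # roughly one third of samples per tree
```

Because every sample is out-of-bag for a subset of the trees, each sample can be scored by trees that never saw it, which is why the OOB estimate behaves like a built-in cross-validation.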

RF classification analyses were used to evaluate the discrimination between cognitively preserved vs. cognitively impaired groups and to investigate the relative importance of each predictor of preserved cognitive function. RF contains built-in metrics of variable importance (e.g. Gini, permutation and minimal depth indexes) which allow the evaluation of the relative relevance for prediction of each variable in an RF model. The two main hyperparameters are the number of trees to grow (ntree) in the ensemble and the number of variables selected at random (mtry) to define the best split at each node.
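As a small illustration of the mtry hyperparameter, the sketch below draws the random subset of candidate variables considered at a single node. The function names and the placeholder predictor names are hypothetical, and rounding the square root up to an integer is an assumption made here for illustration.

```python
import math
import random


def default_mtry(n_predictors):
    """Square-root rule for classification; rounding up is an assumption."""
    return math.ceil(math.sqrt(n_predictors))


def draw_candidates(predictors, mtry, rng):
    """Pick mtry distinct predictors to evaluate as split candidates at one node."""
    return rng.sample(predictors, mtry)


predictors = [f"x{i}" for i in range(1, 51)]  # 50 placeholder predictor names
mtry = default_mtry(len(predictors))          # sqrt(50) is about 7.07, rounded up
candidates = draw_candidates(predictors, mtry, random.Random(1))
```

A fresh subset is drawn at every node of every tree, which decorrelates the trees and lets weaker predictors compete for splits.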

Candidate Predictors

We employed a diverse set of 50 demographic, biomedical, psychosocial, sensory, behavioral, and clinical predictors collected at the WHI baseline (1993–98) as input to our machine learning procedure. Table 3 lists each predictor and how it is scaled. Many of these predictors have been associated with cognitive function in previous research.

Table 3.

All predictors included in the RF model, ordered according to the minimal depth index (MDI).

Rank Variable Scaling MDI
1 Age years 2.29
2 Forgetfulness 1=mild, 3=severe 3.97
3 Physical functioning 0=low, 100=high 4.13
4 Optimism Life Orientation Test-Revised, range: 6–30 5.20
5 Hemoglobin g/dL 5.28
6 Glucose mg/dL 5.36
7 Sleep disturbance 0=no, 4=frequent 5.44
8 Systolic blood pressure mmHg 5.53
9 Education 1=didn’t attend, 11=doctoral degree 5.55
10 Depressive symptom severity 0 – 1. Higher scores indicate greater depressive symptom severity 5.77
11 Emotional well-being Rand 36 Health Survey, Emotional wellbeing subscale (0–100). Higher scores indicate better emotional well-being 5.91
12 White blood count 10^3/ul 5.92
13 LDL cholesterol mg/dL 5.95
14 Healthy Eating Index 0–100. Higher indicates healthier diet 5.98
15 Body Mass Index Kg/m2 5.99
16 Stressful Life Events 0=none 33=high 6.00
17 C-reactive protein mg/dL 6.05
18 Social support 9=low 45=high 6.07
19 Insulin pmil/L 6.07
20 Creatinine mg/dL 6.16
21 Triglycerides, total mg/dL 6.16
22 Glycemic Index (total carbs) 0–>0 carbohydrates/day 6.19
23 Cynical Hostility Cynicism subscale of Cook-Medley Questionnaire, range: 0–13. Higher scores indicate greater hostility 6.26
24 WHI Hormone Trial treatment arm Estrogen, estrogen + progesterone, placebo 6.29
25 HDL cholesterol mg/dL 6.39
26 Social strain 4=none, 20=all 6.68
27 Total energy expended in recreational activity per wk. Kcal/week per kg 6.70
28 Hearing loss 1=mild, 3=severe 6.71
29 US Region of country 1=Northeast 2=South 3=Midwest 4=West 6.85
30 Life satisfaction 0=dissatisfied, 10=satisfied 7.00
31 Occupation Managerial/professional, technical/sales, homemaker only, service/labor 7.03
32 Walk > 10 min 0=rarely, 5=7 or more times each week 7.15
33 Alcohol intake 1=none, 6=7+servings/week 7.30
34 1 yr change in health status 1=much better, 5= much worse 7.82
35 General health status 1=Excellent, 5=poor 7.90
36 Difficulty concentrating 1=mild, 3=severe 8.30
37 Social functioning 0=poor, 100= high 8.40
38 Moderate-strenuous activity >/=20 min/wk 1=no activity, 4=moderate or strenuous activity 8.75
39 Major money problems 1=not too much, 3=very much 8.92
40 Race/Ethnicity 1=Amer Indian/Alaskan Native, 2=Asian or Pacific Islander, 3=Black or African American, 4=Hispanic/Latino, 5= White 9.12
41 Trouble with vision 1=mild, 3=severe 9.22
42 Marital status 1=never married, 2=divorced/separated, 3=widowed, 4=married, 5=marriage-like relationship 9.28
43 Smoking 0=never, 1=past, 2=current 9.59
44 Hypertension 0=No, 1=Yes 9.69
45 Live alone 0=No, 1=Yes 9.75
46 Contraceptive use, ever 0=No, 1=Yes 9.86
47 Hypercholesterolemia 0=No, 1=Yes 10.40
48 Cardiovascular disease, ever 0=No, 1=Yes 10.49
49 Diabetes, ever 0=No, 1=Yes 10.91
50 Cancer, ever 0=No, 1=Yes 11.12

Analyses

We focused our analyses on identifying baseline predictors that best discriminated participants with preserved cognitive function from those with cognitive impairment, agnostic to specific hypotheses. RF classification analyses were performed to investigate discrimination between cognitively preserved and impaired participants with 14 years of follow-up because this period bridged the WHIMS, WHIMS-Extension and WHIMS-ECHO timelines and yielded a suitably large and balanced dataset of women aged 80+ for classification analyses. Levels of missing values were restricted to less than 5% and imputation was carried out using RF-based methods [18]. APOE ε4 carrier status, a risk factor for Alzheimer’s disease, was not included because of large levels of missing data (>15%) and because it was only available for non-Hispanic White women.

To evaluate RF classification performance, the matrix of predictors and labels were provided to the randomForestSRC software package available in R [19]. Computations were repeated 100 times. The default values of the algorithm (mtry = sqrt(#predictors)) were used with the exception of the number of trees, which was fixed at 1000. Missing values were imputed using the “on the fly” (OTF) approach described in Tang et al. [18]. Briefly, the basic steps of the OTF approach are: 1) only non-missing data are used to calculate the split-statistic for splitting a tree node; 2) when assigning left and right daughter node membership (samples are partitioned at any given node), if the variable used to split the node has missing data, missing data for that variable are “imputed” by drawing a random value from the inbag (samples selected to build the tree) non-missing data; 3) following a node split, imputed data are reset to missing and the process is repeated until terminal nodes are reached, after which imputed data are again reset to missing; and 4) missing data in terminal nodes are then imputed using OOB non-missing terminal node data from all the trees.
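Step 2 of the OTF procedure, the temporary random draw used only to route a sample at a node, can be sketched as follows. This is a simplified illustration under stated assumptions, not the randomForestSRC code: the function name is hypothetical, missing values are represented by `None`, and the split is assumed to be a simple threshold on a numeric variable.

```python
import random


def assign_daughter(value, split_point, inbag_values, rng):
    """Route one sample to the 'left' or 'right' daughter node.

    value        -- the sample's value on the split variable (None if missing)
    split_point  -- threshold chosen using non-missing data only (step 1)
    inbag_values -- in-bag values of the split variable at this node
    """
    if value is None:
        observed = [v for v in inbag_values if v is not None]
        if not observed:
            raise ValueError("no non-missing in-bag data to draw from")
        # Temporary stand-in value; the caller keeps the original missing
        # value, so the data are effectively reset to missing after the split.
        value = rng.choice(observed)
    return "left" if value <= split_point else "right"
```

The drawn value is never stored, so the same participant may be routed using different draws at different nodes; only the final terminal-node imputation (step 4) produces a value that is kept.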

To assess the relevance of each variable for overall prediction, the RF minimal depth index (MDI) was used. The MDI was derived based on the observation that in forests, variables that split closer to the root node (upper) have a strong effect on prediction accuracy [20]. The MDI measures how close to the root node a variable first splits in a tree (a lower MDI indicates higher relevance). The closer the variable is located to the root of the tree (on average across the trees in the forest), the more influence it has on the final prediction of the full model. MDI has several advantages [20]. It is independent of the way prediction error is measured (e.g. AUC, classification error, etc.). It can be expressed in closed form, and from this a rigorous threshold value for selecting variables can be computed efficiently in high-dimensional settings.
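To make the minimal depth idea concrete, the sketch below computes, for toy trees stored as nested dictionaries, the depth of the shallowest node that splits on each variable and averages it over a forest. The tree representation and function names are hypothetical, and assigning the tree's depth to a variable that never splits in a tree is an assumption made here for illustration.

```python
from collections import deque
from statistics import mean


def minimal_depths(tree):
    """Return ({variable: shallowest split depth}, tree depth); the root is depth 0."""
    depths = {}
    tree_depth = 0
    queue = deque([(tree, 0)])
    while queue:  # breadth-first, so the first visit to a variable is its shallowest
        node, depth = queue.popleft()
        tree_depth = max(tree_depth, depth)
        if "var" not in node:  # terminal node
            continue
        depths.setdefault(node["var"], depth)
        queue.append((node["left"], depth + 1))
        queue.append((node["right"], depth + 1))
    return depths, tree_depth


def forest_mdi(forest, variables):
    """Average minimal depth per variable across all trees in the forest."""
    per_variable = {v: [] for v in variables}
    for tree in forest:
        depths, tree_depth = minimal_depths(tree)
        for v in variables:
            # Assumption: a variable unused in a tree is assigned that tree's depth.
            per_variable[v].append(depths.get(v, tree_depth))
    return {v: mean(d) for v, d in per_variable.items()}


leaf = {"label": "terminal"}
tree_a = {"var": "age", "left": {"var": "glucose", "left": leaf, "right": leaf}, "right": leaf}
tree_b = {"var": "glucose", "left": leaf, "right": {"var": "age", "left": leaf, "right": leaf}}
mdi = forest_mdi([tree_a, tree_b], ["age", "glucose", "optimism"])
```

In this toy forest, age and glucose each split at the root of one tree and at depth 1 of the other, so they share the same low average, while the unused variable receives the larger value and would rank last.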

The MDI of each variable was averaged across all repetitions of the computations. To further elucidate predictor importance, we used variable selection capabilities available in the randomForestSRC software package. The subset of more relevant predictors was selected using the more conservative threshold available in the software, which is based on a probabilistic distribution of non-informative variables in RF previously derived by Ishwaran and colleagues [21]. Those predictors that survived the threshold 95% of the time were deemed most relevant. Average values of the area under the curve (AUC) and its confidence interval across repetitions of the computations were reported as measures of model performance. AUC and its confidence intervals were estimated using the pROC R library [22] based on the OOB mechanism.
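The stability criterion, keeping predictors that survive the selection threshold in 95% of the repeated runs, can be sketched as a simple frequency count. The function name and input format are hypothetical, and "95% of the time" is interpreted here as "in at least 95% of repetitions".

```python
from collections import Counter


def stable_predictors(selected_per_run, min_fraction=0.95):
    """Return predictors selected in at least min_fraction of the runs.

    selected_per_run -- one collection of selected predictor names per repetition
    """
    n_runs = len(selected_per_run)
    counts = Counter(name for run in selected_per_run for name in set(run))
    return sorted(name for name, c in counts.items() if c / n_runs >= min_fraction)


# Toy example with 100 repetitions (illustrative values, not study results).
runs = ([{"age", "glucose", "hdl"}] * 90
        + [{"age", "glucose"}] * 6
        + [{"age"}] * 4)
kept = stable_predictors(runs)
```

Requiring stability across repetitions guards against predictors that pass the threshold only because of the randomness in a single forest.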

Results

Table 1 lists baseline descriptive characteristics of the full cohort. The mean age was 69.9 (3.6) years, 89% were White, and 76% had more than a high school education. Model performance discriminating between the cognitively preserved and impaired groups based on AUC was 0.80 (95% CI 0.76–0.85). In Table 2 the raw baseline values of the top ten predictors identified by RF are presented for each group to provide insight into the univariate direction of their effects. Compared to their cognitively impaired counterparts, women with preserved cognitive function were younger, better educated, less forgetful, less depressed and more optimistic at baseline. They also reported better physical function, less sleep disturbance (insomnia) and had lower systolic blood pressure, hemoglobin, and blood glucose levels. The six predictors which survived the 95% threshold were, in order, age, self-reported forgetfulness, self-reported physical function, optimism, hemoglobin and blood glucose levels. Table 3 lists all 50 predictors in order of their MDI from lowest (most important) to highest (least important).
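For readers less familiar with the AUC, it equals the probability that a randomly chosen member of one group receives a higher predicted score than a randomly chosen member of the other group, with ties counted as one half. The sketch below computes it by direct pairwise comparison; it illustrates the definition only and is not the pROC implementation, and the example scores are made up.

```python
def auc(scores_positive, scores_negative):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve."""
    wins = 0.0
    for p in scores_positive:
        for n in scores_negative:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(scores_positive) * len(scores_negative))


# Hypothetical OOB predicted probabilities of "preserved" (not study data).
preserved_scores = [0.9, 0.4]
impaired_scores = [0.5, 0.1]
example_auc = auc(preserved_scores, impaired_scores)  # 3 of 4 pairs correctly ordered
```

Under this reading, an AUC of 0.80 means that in about four out of five preserved-impaired pairs, the model assigns the higher score to the cognitively preserved participant.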

Table 1.

Baseline demographics and values of several variables of the selected cohort (n=381).

Variables Mean (SD) or N (%)
Age 69.9 (3.6)
Race/Ethnicity
 American Indian or Alaskan Native 2 (0.5%)
 Asian or Pacific Islander 1 (0.3%)
 Black or African-American 28 (7.4%)
 Hispanic/Latino 8 (2.1%)
 White (not of Hispanic origin) 338 (88.7%)
 Other 4 (1.1%)
Education
 Grade school (1–4 years) 1 (0.3%)
 Grade school (5–8 years) 5 (1.3%)
 Some high school (9–11 years) 11 (2.9%)
 High school diploma or GED 71 (18.7%)
 Vocational or training school 43 (11.4%)
 Some college or Associate Degree 112 (29.6%)
 College graduate or Baccalaureate Degree 28 (7.4%)
 Some post-graduate or professional 30 (7.9%)
 Master’s Degree 70 (18.5%)
 Doctoral Degree 8 (2.1%)
Forgetfulness
 Symptom did not occur 152 (40.1%)
 Symptom was mild 181 (47.8%)
 Symptom was moderate 41 (10.8%)
 Symptom was severe 5 (1.3%)
Physical Function 80.0 (19.5)
Systolic blood pressure 131.6 (16.9)
Glucose 99.9 (20.2)
Optimism 23.5 (3.2)
Sleep Disturbance 6.6 (4.6)
Depression 0.03 (0.1)

Table 2.

Comparison of baseline characteristics of the top-ten most relevant predictors for the cognitively-preserved versus cognitively-impaired groups.

Variables Preserved (n=205) Impaired (n=176)
Age 68.3 (2.9) 71.7 (3.5)
Education
 Grade school (1–4 years) 0 (0%) 1 (0.6%)
 Grade school (5–8 years) 0 (0%) 5 (2.9%)
 Some high school (9–11 years) 2 (1.0%) 9 (5.1%)
 High school diploma or GED 33 (16.2%) 38 (21.7%)
 Vocational or training school 21 (10.3%) 22 (12.6%)
 Some college or Associate Degree 69 (33.8%) 43 (24.6%)
 College graduate or Baccalaureate Degree 14 (6.9%) 14 (8.0%)
 Some post-graduate or professional 14 (6.9%) 16 (9.1%)
 Master’s Degree 47 (23.0%) 23 (13.1%)
 Doctoral Degree 4 (2.0%) 4 (2.3%)
Depression 0.02 (0.09) 0.03 (0.09)
Optimism 24.1 (3.0) 22.9 (3.2)
Forgetfulness
 Symptom did not occur 113 (55.4%) 39 (22.3%)
 Symptom was mild 83 (40.7%) 98 (56.0%)
 Symptom was moderate 7 (3.4%) 34 (19.4%)
 Symptom was severe 1 (0.5%) 4 (2.3%)
Sleep Disturbance 5.9 (4.6) 7.4 (4.4)
Physical Function 84.9 (16.8) 74.2 (20.8)
Systolic blood pressure 129.6 (16.5) 133.9 (17.2)
Blood Glucose 98.3 (19.4) 101.7 (21.0)
Blood Hemoglobin 13.6 (0.8) 13.8 (1.1)

Discussion

In this study we aggregated a diverse set of potential predictors of well-preserved cognitive function in women over 80 years of age. We capitalized on the richly phenotyped datasets from the longitudinal WHI and WHIMS studies which collected demographic, functional, biomedical, behavioral, psychological and social variables in addition to annual assessments of cognitive function with centrally adjudicated cognitive impairment outcomes. Applying a machine learning approach we identified a set of diverse, mostly modifiable baseline predictors over 14 years of follow-up. Compared to women with adjudicated cognitive impairment, women with preserved cognitive function were younger, reported less forgetfulness and better physical function, were dispositionally more optimistic, had lower hemoglobin and glucose concentrations in their blood, reported less sleep disturbance, had lower systolic blood pressure and higher educational attainment, and reported fewer depressive symptoms. Among the weakest predictors were race/ethnicity, vision problems, marital status, smoking status, living alone, prior contraception use, and diagnosed hypertension, hypercholesterolemia, cardiovascular disease, diabetes or cancer.

RF is a strongly nonlinear and multivariate method which could potentially identify complex relationships between variables that differ from associations detected in linear settings. On the other hand, when interpreting these findings it should be kept in mind that there is a body of literature indicating that RF measures of variable importance like the Gini and permutation indices could be affected by correlations in the data and/or could favor continuous or categorical variables with many values [23]. The impact of those factors on MDI, used in this work, has been less studied. A recent simulation study suggested that the MDI performs well in the presence of correlations but could favor continuous variables or categorical variables with many values [24]. However, these reports are often based on simple simulation settings with a small number of variables, and it is unclear how they generalize to other problems of much higher dimensions.

Our results were similar to several other studies employing machine learning strategies to predict better cognitive function. McFall et al. found age, BMI and depressive symptoms to be the best discriminators between older adults with a stable episodic memory aging trajectory and declining memory trajectory[9]. In a sample of men and women aged 65–90 years, Casanova et al. found that education, age, gender, history of stroke, diabetes, and socioeconomic status were the most important discriminators of individuals with higher cognitive functioning from those with lower cognitive functioning based on a shortened TICSm [8]. Aschwanden et al. reported that among 52 diverse predictors, “higher order factors” including emotional distress, poorer subjective health, lower education, increasing BMI were strong predictors of cognitive impairment also defined by the TICSm and that behavioral risk factors like smoking and lower physical activity and polygenic scores were relatively less important predictors [10]. While results of multivariable machine learning studies will be dependent on which populations and variables are selected for analysis, there are several consistencies across these studies. First, the top predictors represent different variable classes and include demographic, behavioral, medical and psychological factors. Second, most of the predictors are potentially modifiable. Third, some well-known risk factors like diagnosed diabetes, cardiovascular disease, hypertension were not among the best predictors. Lastly, age, education, depression or emotional distress and BMI appear repeatedly across studies.

In a prior study of optimal cognitive functioning using WHIMS data, Goveas et al. used stepwise logistic regression to identify variables that distinguished WHIMS participants who maintained high TICSm scores from women whose scores were lower but still in the normal range and were free of cognitive impairment. They found that younger age, higher education, greater family income, being non-Hispanic White, better emotional well-being, fewer depressive symptoms, absence of diabetes, not carrying the apolipoprotein E ε4 genotype and slightly more sleep disturbance distinguished the higher functioning women from the low normal group [25]. Our work extended their study by including more predictors from diverse classes, requiring a higher level of cognitive performance on the TICSm for the preserved cognitive function group, and using adjudicated cognitive impairment outcomes. Despite these methodological differences, we too found younger age, more education, fewer depressive symptoms and better emotional well-being to be among the strongest discriminators between cognitively well-preserved and impaired women.

Hemoglobin and glucose levels were also among the best predictors of preserved cognitive function in our study. While poorer cognitive performance has been associated with both decreases and increases in hemoglobin [26–28], high blood glucose levels are risk factors for insulin resistance, diabetes mellitus (DM) and dementia [29]. A higher glucose level at baseline was a more highly ranked predictor than a diagnosis of DM, which could reflect that a greater number of women have elevated glucose than are diagnosed with DM. Disrupted glucose metabolism and insulin resistance also are associated with sleep disturbance [30], another predictor of cognitive function in our study. This may suggest that close monitoring of blood glucose levels and sleep quality could protect cognitive function.

Lower systolic blood pressure (SBP) predicted better cognitive functioning (and lower likelihood of cognitive impairment) in our modeling while diagnosed hypertension was not a consistent predictor, perhaps because there may be more women with elevated SBP than there are women who have been diagnosed with hypertension. These results appear consistent with the findings from a recent SBP-lowering treatment trial. In the Systolic Blood Pressure Intervention Trial (SPRINT), intensive SBP treatment to a goal of <120 mmHg, compared with a goal of <140 mmHg, resulted in significantly fewer cases of MCI and of a composite of probable dementia or MCI [31].

Optimism, the dispositional tendency to form positive expectations, has been studied in relation to health for many years. It has been associated with a variety of favorable health outcomes that could affect cognitive function, including cardiovascular disease, coronary artery disease, stroke, congestive heart failure and mortality [32]. Using data from the WHI, Tindle et al. showed that optimists were less likely than pessimists to develop CHD and less likely to die from CHD-related causes [33]. Kim et al. reported that optimism, assessed in the same way as in the present study, predicted disease-specific and all-cause mortality in older women from the Nurses’ Health Study [34]. While the mechanisms by which optimism protects health are not known, there exist data suggesting both biological and behavioral pathways [32]. Our results hint that this psychological factor may also benefit cognitive functioning, though we must be cautious not to infer causal relationships from this study.

After age, forgetfulness reported by participants at WHI study entry was the most reliable predictor of cognitive function over 14 years of follow-up. Perceived cognitive changes are associated with structural brain changes [35–37], functional neural connectivity [38], and cerebral metabolic dysfunction [39, 40]. Subjective cognitive decline (SCD) may be a precursor to MCI and dementia [41], given that the risk of progression to Alzheimer’s dementia (AD) in older adults with SCD is two times greater than in individuals without SCD and the annual conversion rates to MCI and AD are 6.6% and 2.3%, respectively [41]. Our data support the significance of perceived cognitive changes as an early marker of late life cognitive function.

This study is not without limitations. The transition from face-to-face cognitive assessments to telephone assessments between WHIMS and WHIMS-ECHO may have influenced ascertainment of cognitive impairment. However, Espeland et al. demonstrated similar cognitive trajectories across the different WHIMS assessment methodologies [42]. The use of only baseline predictors may have reduced our model’s precision. The WHI data collection protocol does not collect all variables at each assessment, so modeling time-varying variables comparably was not possible. However, this limitation should make finding associations between some predictors and subsequent cognitive function more difficult. While the RF indices of variable importance are widely used in studies like ours, they can be affected by correlated predictors and can be biased towards continuous variables and categorical variables with many levels [23], though the impact of correlations on the minimal depth index has been less reported. Finally, the analyses are based on a subset of women who may not fully represent the women in the WHIMS inception cohort.

Strengths of the study include the large number and variety of candidate predictors available in WHI, the relatively large sample size, the length of follow-up, the longitudinal administration of standardized cognitive assessments, the use of adjudicated cognitive impairment outcomes and the use of state-of-the-art high-dimensional approaches for predictive modeling.

Conclusion

By employing machine learning techniques we were able to identify a diverse set of largely modifiable predictors that distinguished cognitively resilient older women from those who developed cognitive impairment. Identifying predictors of cognitive resilience can indicate potential targets for intervention research aimed at enhancing and protecting this vital function and reducing impairment and disability in later life.


Acknowledgments

Wake Forest Alzheimer’s Disease Core Center (P30AG049638-01A1). The WHI program is funded by the National Heart, Lung, and Blood Institute, National Institutes of Health, US Department of Health and Human Services through contracts HHSN268201100046C, HHSN268201100001C, HHSN268201100002C, HHSN268201100003C, HHSN268201100004C and HHSN271201100004C. For a list of all the investigators who have contributed to WHI science, please visit: https://www.whi.org/researchers/Documents%20%20Write%20a%20Paper/WHI%20Investigator%20Long%20List.pdf. The Women’s Health Initiative Memory Study was funded as an ancillary study to the WHI by Wyeth Pharmaceuticals, Inc., Wake Forest University; and the National Heart, Lung, and Blood Institute, National Institutes of Health; and the National Institute of Aging, National Institutes of Health (contract number HHSN271-2011-00004C).

Funding

The Women’s Health Initiative Memory Study was funded as an ancillary study to the WHI by Wyeth Pharmaceuticals, Inc.; Wake Forest University; the National Heart, Lung, and Blood Institute, National Institutes of Health; and the National Institute on Aging, National Institutes of Health (HHSN271-2011-00004C, HHSN271-2017-00002C). RC, SRR, SAG, KMH, BCS and LB receive funding from the Wake Forest Alzheimer’s Disease Core Center (P30AG049638-01A1). JNJ receives funding from the Wake Forest Claude D. Pepper OAIC (P30 AG021332) and NIA (K01 AG059837). VWH receives funding from the Stanford Alzheimer’s Disease Research Center (P30 AG066515).

Footnotes

Conflict of Interests

No conflicts to report.

Contributor Information

Ramon Casanova, Department of Biostatistics and Data Science, Wake Forest School of Medicine, Winston-Salem, NC.

Sarah A. Gaussoin, Department of Biostatistics and Data Science, Wake Forest School of Medicine, Winston-Salem, NC

Robert Wallace, College of Public Health, University of Iowa Epidemiology and Internal Medicine, College of Public Health Building.

Laura Baker, Department of Gerontology and Geriatrics, Wake Forest School of Medicine, Winston-Salem, NC

Jiu-Chiuan Chen, Departments of Preventive Medicine and Neurology, University of Southern California, Los Angeles, CA

JoAnn E. Manson, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA

Victor W. Henderson, Departments of Epidemiology and Population Health and of Neurology and Neurological Sciences, Stanford University, Stanford, CA

Bonnie C. Sachs, Department of Social Sciences & Health Policy, Wake Forest School of Medicine, Winston-Salem, NC; Department of Neurology, Wake Forest School of Medicine, Winston-Salem, NC.

Jamie Justice, Department of Gerontology and Geriatrics, Wake Forest School of Medicine, Winston-Salem, NC

Eric A. Whitsel, Department of Epidemiology, Gillings School of Global Public Health and Department of Medicine, School of Medicine, University of North Carolina, Chapel Hill, NC.

Kathleen M. Hayden, Department of Social Sciences & Health Policy, Wake Forest School of Medicine, Winston-Salem, NC.

Stephen R. Rapp, Department of Psychiatry and Behavioral Medicine, Wake Forest School of Medicine, Winston-Salem, NC; Department of Social Sciences & Health Policy, Wake Forest School of Medicine, Winston-Salem, NC.

References

  • [1].Ambale-Venkatesh B, Yang X, Wu CO, Liu K, Hundley WG, McClelland R, Gomes AS, Folsom AR, Shea S, Guallar E, Bluemke DA, Lima JAC (2017) Cardiovascular Event Prediction by Machine Learning: The Multi-Ethnic Study of Atherosclerosis. Circ Res 121, 1092–1101.
  • [2].Casanova R, Saldana S, Simpson SL, Lacy ME, Subauste AR, Blackshear C, Wagenknecht L, Bertoni AG (2016) Prediction of Incident Diabetes in the Jackson Heart Study Using High-Dimensional Machine Learning. PLoS One 11, e0163942.
  • [3].La Joie R, Visani AV, Baker SL, Brown JA, Bourakova V, Cha J, Chaudhary K, Edwards L, Iaccarino L, Janabi M, Lesman-Segev OH, Miller ZA, Perry DC, O’Neil JP, Pham J, Rojas JC, Rosen HJ, Seeley WW, Tsai RM, Miller BL, Jagust WJ, Rabinovici GD (2020) Prospective longitudinal atrophy in Alzheimer’s disease correlates with the intensity and topography of baseline tau-PET. Sci Transl Med 12.
  • [4].Aiello Bowles EJ, Crane PK, Walker RL, Chubak J, LaCroix AZ, Anderson ML, Rosenberg D, Keene CD, Larson EB (2019) Cognitive Resilience to Alzheimer’s Disease Pathology in the Human Brain. J Alzheimers Dis 68, 1071–1083.
  • [5].McDermott KL, McFall GP, Andrews SJ, Anstey KJ, Dixon RA (2017) Memory Resilience to Alzheimer’s Genetic Risk: Sex Effects in Predictor Profiles. J Gerontol B Psychol Sci Soc Sci 72, 937–946.
  • [6].Kaup AR, Nettiksimmons J, Harris TB, Sink KM, Satterfield S, Metti AL, Ayonayon HN, Yaffe K, Health A, Body Composition S (2015) Cognitive resilience to apolipoprotein E epsilon4: contributing factors in black and white older adults. JAMA Neurol 72, 340–348.
  • [7].Topiwala A, Suri S, Allan C, Valkanova V, Filippini N, Sexton CE, Heise V, Zsoldos E, Mahmood A, Singh-Manoux A, Mackay CE, Kivimaki M, Ebmeier KP (2019) Predicting cognitive resilience from midlife lifestyle and multi-modal MRI: A 30-year prospective cohort study. PLoS One 14, e0211273.
  • [8].Casanova R, Saldana S, Lutz MW, Plassman BL, Kuchibhatla M, Hayden KM (2018) Investigating predictors of cognitive decline using machine learning. J Gerontol B Psychol Sci Soc Sci.
  • [9].McFall GP, McDermott KL, Dixon RA (2019) Modifiable Risk Factors Discriminate Memory Trajectories in Non-Demented Aging: Precision Factors and Targets for Promoting Healthier Brain Aging and Preventing Dementia. J Alzheimers Dis 70, S101–S118.
  • [10].Aschwanden D, Aichele S, Ghisletta P, Terracciano A, Kliegel M, Sutin AR, Brown J, Allemand M (2020) Predicting Cognitive Impairment and Dementia: A Machine Learning Approach. J Alzheimers Dis 75, 717–728.
  • [11].Gupta A, Kahali B (2020) Machine learning-based cognitive impairment classification with optimal combination of neuropsychological tests. Alzheimers Dement (N Y) 6, e12049.
  • [12].Stefanick M CB, Hsia J, Barad D, Liu J, Johnson S. (2003) The WHI Postmenopausal Hormone Trials. Ann Epidemiol 13, S78–S86.
  • [13].Shumaker SA, Reboussin BA, Espeland MA, Rapp SR, McBee WL, Dailey M, Bowen D, Terrell T, Jones BN (1998) The Women’s Health Initiative Memory Study (WHIMS): a trial of the effect of estrogen therapy in preventing and slowing the progression of dementia. Control Clin Trials 19, 604–621.
  • [14].Teng EL, Chui HC (1987) The Modified Mini-Mental State (3MS) examination. Journal of Clinical Psychiatry 48, 314–318.
  • [15].Rapp SR, Legault C, Espeland MA, Resnick SM, Hogan PE, Coker LH, Dailey M, Shumaker SA, Group CATS (2012) Validation of a cognitive assessment battery administered over the telephone. J Am Geriatr Soc 60, 1616–1623.
  • [16].Welsh KA, Breitner J, Magruder-Habib KM (1993) Detection of dementia in the elderly using the telephone interview for cognitive status. Neuropsychiatry, Neuropsychology, & Behavioral Neurology 6, 103–110.
  • [17].Breiman L (2001) Random Forests. Machine Learning 45, 5–32.
  • [18].Tang F, Ishwaran H (2017) Random forest missing data algorithms. Statistical Analysis and Data Mining 10, 363–377.
  • [19].Ishwaran H, Kogalur UB (2014) Random Forests for Survival, Regression and Classification. http://cran.r-project.org/package=randomForestSRC.
  • [20].Chen X, Ishwaran H (2012) Random forests for genomic data analysis. Genomics 99, 323–329.
  • [21].Ishwaran H, Kogalur UB, Gorodeski EZ, Minn AJ, Lauer MS (2010) High-Dimensional Variable Selection for Survival Data. Journal of the American Statistical Association 105.
  • [22].Robin X, Turck N, Hainard A, Tiberti N, Lisacek F, Sanchez JC, Muller M (2011) pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinformatics 12, 77.
  • [23].Strobl C, Boulesteix AL, Zeileis A, Hothorn T (2007) Bias in random forest variable importance measures: illustrations, sources and a solution. BMC Bioinformatics 8, 25.
  • [24].Epifanio I (2017) Intervention in prediction measure: a new approach to assessing variable importance for random forests. BMC Bioinformatics 18, 230.
  • [25].Goveas JS, Rapp SR, Hogan PE, Driscoll I, Tindle HA, Smith JC, Kesler SR, Zaslavsky O, Rossom RC, Ockene JK, Yaffe K, Manson JE, Resnick SM, Espeland MA (2016) Predictors of Optimal Cognitive Aging in 80+ Women: The Women’s Health Initiative Memory Study. J Gerontol A Biol Sci Med Sci 71 Suppl 1, S62–71.
  • [26].Shah RC, Wilson RS, Tang Y, Dong X, Murray A, Bennett DA (2009) Relation of hemoglobin to level of cognitive function in older persons. Neuroepidemiology 32, 40–46.
  • [27].Schneider AL, Jonassaint C, Sharrett AR, Mosley TH, Astor BC, Selvin E, Coresh J, Gottesman RF (2016) Hemoglobin, Anemia, and Cognitive Function: The Atherosclerosis Risk in Communities Study. J Gerontol A Biol Sci Med Sci 71, 772–779.
  • [28].Wolters FJ, Zonneveld HI, Licher S, Cremers LGM, Heart Brain Connection Collaborative Research G, MK Ikram, Koudstaal PJ, Vernooij MW, Ikram MA (2019) Hemoglobin and anemia in relation to dementia risk and accompanying changes on brain MRI. Neurology 93, e917–e926.
  • [29].Xu WL, von Strauss E, Qiu CX, Winblad B, Fratiglioni L (2009) Uncontrolled diabetes increases the risk of Alzheimer’s disease: a population-based cohort study. Diabetologia 52, 1031–1039.
  • [30].Kothari V, Cardona Z, Chirakalwasan N, Anothaisintawee T, Reutrakul S (2021) Sleep interventions and glucose metabolism: systematic review and meta-analysis. Sleep Med 78, 24–35.
  • [31].Group SMIftSR, Williamson JD, Pajewski NM, Auchus AP, Bryan RN, Chelune G, Cheung AK, Cleveland ML, Coker LH, Crowe MG, Cushman WC, Cutler JA, Davatzikos C, Desiderio L, Erus G, Fine LJ, Gaussoin SA, Harris D, Hsieh MK, Johnson KC, Kimmel PL, Tamura MK, Launer LJ, Lerner AJ, Lewis CE, Martindale-Adams J, Moy CS, Nasrallah IM, Nichols LO, Oparil S, Ogrocki PK, Rahman M, Rapp SR, Reboussin DM, Rocco MV, Sachs BC, Sink KM, Still CH, Supiano MA, Snyder JK, Wadley VG, Walker J, Weiner DE, Whelton PK, Wilson VM, Woolard N, Wright JT Jr., Wright CB (2019) Effect of Intensive vs Standard Blood Pressure Control on Probable Dementia: A Randomized Clinical Trial. JAMA 321, 553–561.
  • [32].Scheier MF, Carver CS (2018) Dispositional optimism and physical health: A long look back, a quick look forward. Am Psychol 73, 1082–1094.
  • [33].Tindle HA, Chang Y, Kuller LH, Manson JE, Robinson JG, Rosal MC, Siegle GJ, Matthews KA (2009) Optimism, cynical hostility, and incident coronary heart disease and mortality in the Women’s Health Initiative. Circulation 120, 656–662.
  • [34].Kim ES, Hagan KA, Grodstein F, DeMeo DL, De Vivo I, Kubzansky LD (2017) Optimism and Cause-Specific Mortality: A Prospective Cohort Study. Am J Epidemiol 185, 21–29.
  • [35].Saykin AJ, Wishart HA, Rabin LA, Santulli RB, Flashman LA, West JD, McHugh TL, Mamourian AC (2006) Older adults with cognitive complaints show brain atrophy similar to that of amnestic MCI. Neurology 67, 834–842.
  • [36].Peter J, Scheef L, Abdulkadir A, Boecker H, Heneka M, Wagner M, Koppara A, Kloppel S, Jessen F, Alzheimer’s Disease Neuroimaging I (2014) Gray matter atrophy pattern in elderly with subjective memory impairment. Alzheimers Dement 10, 99–108.
  • [37].Perrotin A, de Flores R, Lamberton F, Poisnel G, La Joie R, de la Sayette V, Mezenge F, Tomadesso C, Landeau B, Desgranges B, Chetelat G (2015) Hippocampal Subfield Volumetry and 3D Surface Mapping in Subjective Cognitive Decline. J Alzheimers Dis 48 Suppl 1, S141–150.
  • [38].Haiqing S, Haixia L, Xiumei Z, Chunshui Y, Bing L, W Z (2015) APOE effects on default mode network in Chinese cognitive normal elderly: relationship with clinical cognitive performance. PloS ONE.
  • [39].Jessen F, Amariglio RE, van Boxtel M, Breteler M, Ceccaldi M, Chetelat G, Dubois B, Dufouil C, Ellis KA, van der Flier WM, Glodzik L, van Harten AC, de Leon MJ, McHugh P, Mielke MM, Molinuevo JL, Mosconi L, Osorio RS, Perrotin A, Petersen RC, Rabin LA, Rami L, Reisberg B, Rentz DM, Sachdev PS, de la Sayette V, Saykin AJ, Scheltens P, Shulman MB, Slavin MJ, Sperling RA, Stewart R, Uspenskaya O, Vellas B, Visser PJ, Wagner M, Subjective Cognitive Decline Initiative Working G (2014) A conceptual framework for research on subjective cognitive decline in preclinical Alzheimer’s disease. Alzheimers Dement 10, 844–852.
  • [40].Scheef L, Spottke A, Daerr M, Joe A, Striepens N, Kolsch H, Popp J, Daamen M, Gorris D, Heneka MT, Boecker H, Biersack HJ, Maier W, Schild HH, Wagner M, Jessen F (2012) Glucose metabolism, gray matter structure, and memory decline in subjective memory impairment. Neurology 79, 1332–1339.
  • [41].Mitchell AJ, Beaumont H, Ferguson D, Yadegarfar M, Stubbs B (2014) Risk of dementia and mild cognitive impairment in older people with subjective memory complaints: meta-analysis. Acta Psychiatr Scand 130, 439–451.
  • [42].Espeland MA, Chen JC, Weitlauf J, Hayden KM, Rapp SR, Resnick SM, Garcia L, Cannell B, Baker LD, Sachs BC, Tindle HA, Wallace R, Casanova R, Women’s Health Initiative Memory Study Magnetic Resonance Imaging Study G (2018) Trajectories of Relative Performance with 2 Measures of Global Cognitive Function. J Am Geriatr Soc 66, 1575–1580.
