Abstract
Introduction
Hippocampal atrophy is an established biomarker for conversion from the normal ageing process to developing cognitive impairment and dementia. This study used a novel hypothesis-free machine-learning approach to uncover potential risk factors for lower hippocampal volume using information from the world’s largest brain imaging study.
Methods
A combination of machine learning and conventional statistical methods was used to identify predictors of low hippocampal volume. We ran gradient boosting decision tree modelling including 2,891 input features measured before magnetic resonance imaging assessments (median 9.2 years, range 4.2–13.8 years) using data from 42,152 dementia-free UK Biobank participants. Logistic regression analyses were run on 87 factors identified as important for prediction based on Shapley values. A false discovery rate-adjusted p value <0.05 was used to declare statistical significance.
Results
Older age, male sex, greater height, and whole-body fat-free mass were the main predictors of low hippocampal volume with the model also identifying associations with lung function and lifestyle factors including smoking, physical activity, and coffee intake (corrected p < 0.05 for all). Red blood cell count and several red blood cell indices such as haemoglobin concentration, mean corpuscular haemoglobin, mean corpuscular volume, mean reticulocyte volume, mean sphered cell volume, and red blood cell distribution width were among many biomarkers associated with low hippocampal volume.
Conclusion
Lifestyles, physical measures, and biomarkers may affect hippocampal volume, with many of the characteristics potentially reflecting oxygen supply to the brain. Further studies are required to establish causality and clinical relevance of these findings.
Keywords: Hippocampal volume, Risk factors, Machine learning, Statistical methods, UK Biobank
Introduction
Brain magnetic resonance imaging (MRI) can detect alterations in brain structure and function, such as hippocampal atrophy, long before symptoms of cognitive impairment or dementia are apparent [1–3]. Hippocampal atrophy is an established biomarker for conversion from the normal ageing process to cognitive impairment and dementia [4, 5]. Modifiable risk factors such as smoking, physical inactivity, obesity, and hypertension have been associated with hippocampal atrophy and cognitive decline [6–9]. Compared to changes in other brain regions, hippocampal atrophy is linked to a faster deterioration of memory, and it provides better accuracy in diagnosing Alzheimer’s disease (AD) [10]. Therefore, identifying factors associated with hippocampal atrophy can help with revising actionable strategies for dementia prevention, at the stage when it may be possible to stop or delay the disease process. Dementia is the seventh leading cause of death worldwide and one of the leading causes of disability and dependency in older people globally [11]. The economic and human impact of dementia is enormous, estimated in 2016 to have contributed to the loss of 28.8 million disability-adjusted life years globally [12].
Several studies have examined risk factors affecting hippocampal atrophy [4, 13–15]. However, most of these studies have been hypothesis driven and addressed through traditional epidemiological approaches. This study uses a novel hypothesis-free machine-learning (ML) approach, including over 2,800 phenotypic variables available from the UK Biobank (UKB), to screen for potential risk factors associated with lower hippocampal volume (HV). Specifically, we used the gradient boosting decision tree (GBDT) approach, which is one of the leading ML algorithms for tasks related to predictive analytics [16]. GBDT allows us to overcome practical constraints that prevent this type of multi-trait analysis using standard epidemiological methods, including multicollinearity and sensitivity to missing information. It also allows for non-linearity and intricate interactions among the features, which, where present, may mask an association in standard models that typically assume linearity of association unless explicitly modelled [17]. Understanding which factors may influence the risk of low HV can support strategies for early dementia prediction and prevention and direct the development of clinical trials to prevent age-related cognitive decline and dementia [4]. Hence, this study was conducted to identify risk factors that are associated with low HV.
Materials and Methods
Participants
The UKB is an observational prospective epidemiological cohort including over 500,000 participants aged 37–73 years at the time of recruitment between 2006 and 2010 [18]. The baseline data collection was done in 22 assessment centres across the UK, with the participants undergoing physical measurements and providing blood and urine samples for biomedical assays, in addition to providing information through touchscreen questionnaire surveys and interviews [18]. An imaging sub-study was introduced in 2014, and so far over 60,000 participants (of a target of 100,000) have undergone brain MRIs [19]. This study is restricted to the 42,152 participants who took part in the imaging sub-study and for whom we had valid information on HVs (Fig. 1). As we were interested in identifying predictors of lower HV, analyses were conducted excluding all participants who had dementia at baseline (n = 19).
Outcome
Brain imaging data were obtained using a Siemens Skyra 3T running VD13A SP4 with a standard Siemens 32-channel RF receive head coil [20]. Centrally at the UKB, the MRI data were further processed to remove artefacts, to align images across modalities and individuals, and to generate the most useful image-derived phenotypes. HV was ascertained from the T1-weighted imaging modality, with further processing details provided in a previous publication [20]. We normalized HV for participant head size using the volumetric scaling factor from the T1 head image to standard space (normalized HV = HV × head size scaling factor) [21]. We excluded outlier HV measurements outside the range of ±3 standard deviations from the mean (N = 616).
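The normalization and outlier rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names and the toy values are hypothetical, and the use of the sample (rather than population) standard deviation is an assumption.

```python
from statistics import mean, stdev

def normalise_hv(raw_hv, scaling_factor):
    """Normalized HV = raw HV x head size scaling factor (as stated in the text)."""
    return raw_hv * scaling_factor

def drop_outliers(values, n_sd=3.0):
    """Keep values within mean +/- n_sd standard deviations.

    Assumption: the sample standard deviation is used.
    """
    mu = mean(values)
    sd = stdev(values)
    lower, upper = mu - n_sd * sd, mu + n_sd * sd
    return [v for v in values if lower <= v <= upper]

# Hypothetical toy data: 20 plausible raw volumes plus one extreme value.
raw = [3800.0 + 10 * i for i in range(20)] + [9000.0]
scale = [1.3] * len(raw)  # hypothetical T1 volumetric scaling factors

normalised = [normalise_hv(h, s) for h, s in zip(raw, scale)]
kept = drop_outliers(normalised)
```

In this toy example only the extreme value falls outside the ±3 standard deviation window and is removed.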
We classified participants in the lowest 20% of HV as cases with “low HV” and compared them to all others. This categorization is based on evidence from a previous study [22], which indicated that individuals in the lowest quintile, compared to other groups, had an approximately 4 times higher risk of AD. Similar estimates were found in our dataset (data not shown).
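As an illustration of the case definition, the sketch below labels the lowest fifth of normalized HV values as cases. The function name and toy values are hypothetical, and the rounding and tie-handling rules are assumptions; rounding up happens to reproduce the 8,431 cases reported for 42,152 participants.

```python
import math

def label_low_hv(normalised_hv, n_groups=5):
    """Return 1 for participants in the lowest 1/n_groups of HV, else 0.

    Assumptions: the number of cases is rounded up, and ties are broken
    by position in the input list.
    """
    n_cases = math.ceil(len(normalised_hv) / n_groups)
    order = sorted(range(len(normalised_hv)), key=lambda i: normalised_hv[i])
    cases = set(order[:n_cases])
    return [1 if i in cases else 0 for i in range(len(normalised_hv))]

# Hypothetical normalized volumes for ten participants.
hv = [5.2, 4.1, 6.0, 4.8, 5.5, 3.9, 5.9, 5.1, 4.5, 6.3]
labels = label_low_hv(hv)
```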
Potential Predictors
As potential predictors, we included information obtained at the baseline, including details from the touch screen questionnaires, interviews, clinical assessments, and blood and urine sampling. The data covered different domains such as sociodemographic, family history, early life, medical history, cognitive function, blood pressure, anthropometry, diet, mental health, biomarkers, and physical activity. Data were pre-processed using PHEnome Scan Analysis (PHESANT) software [23], which allowed the conversion of categorical variables to dummy variables and the exclusion of negative numerical values from particular categories denoting missingness [23]. After excluding closely correlated features (r ≥ 0.9) and restricting information to those that were available for at least 70% of the participants, we had 2,891 different phenotypic features for our analyses. Among the small sets of highly correlated features, the one with the least missing values was chosen for the GBDT models and the others from the set were dropped. For the purposes of reporting, the features were classified into six broad categories, including “baseline, personal, and sociodemographic characteristics,” “lifestyle and environment,” “physical measures,” “cognitive function and psychosocial factors,” “health and medical history,” and “biomarkers.”
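The feature filtering described above (a 70% availability threshold, then keeping the most complete member of each highly correlated set) might look roughly like the following. This is a sketch under stated assumptions: the helper names and toy data are hypothetical, missing values are represented as `None`, the correlation is Pearson's r on pairwise-complete observations, and the threshold is applied to the absolute value of r.

```python
from math import sqrt

def availability(col):
    """Fraction of non-missing (non-None) values in a column."""
    return sum(v is not None for v in col) / len(col)

def pearson(x, y):
    """Pearson correlation on pairwise-complete observations."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        return 0.0
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    syy = sum((b - my) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

def filter_features(data, min_avail=0.70, max_r=0.90):
    """Drop sparse features, then drop the less complete of correlated features."""
    kept = [name for name, col in data.items() if availability(col) >= min_avail]
    # Visit features from most to least complete so that, within a correlated
    # set, the feature with the fewest missing values is the one retained.
    kept.sort(key=lambda name: availability(data[name]), reverse=True)
    selected = []
    for name in kept:
        if all(abs(pearson(data[name], data[s])) < max_r for s in selected):
            selected.append(name)
    return selected

# Hypothetical toy dataset with ten participants.
data = {
    "weight": [70, 82, 65, 90, 77, 58, 85, 73, 68, 95],
    "weight_repeat": [70.5, 81, 66, None, 78, 57, 86, 72, None, 94],
    "pulse": [60, 88, 71, 65, 90, 62, 79, 84, 70, 66],
    "sparse": [1, None, None, None, 2, None, None, 3, None, None],
}
selected = filter_features(data)
```

Here `sparse` fails the availability threshold, and `weight_repeat` is dropped in favour of the more complete `weight`.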
Statistical Analyses
Identifying Potential Risk Factors Using GBDT-SHAP Pipeline
We used the GBDT with SHapley Additive exPlanations (SHAP) (version 0.39.0) ML pipeline to identify potential predictors of low HV [24]. Data were split into training, development, and test sets at a ratio of 60:20:20. The development set was used to avoid overfitting, a problem usually seen in high-dimensional datasets [25], through early stopping of training. The test set was used for reporting the performance of the ML models. Model development and feature importance calculation were repeated 100 times using different random splits of the data, and the feature importance was averaged to reduce split-specific variability. We calculated SHAP values for each feature and each participant individually. These values were aggregated by taking the mean of absolute SHAP values across all samples, resulting in a global feature importance that allowed us to estimate the relative contribution of each risk factor to the model prediction. Elimination of irrelevant predictors was carried out using a previously proposed SHAP threshold (top 3%) [26], resulting in a reduced set of “important” features for further epidemiological analyses (Fig. 2). CatBoost version 0.21, implemented in Python 3.10, was used for GBDT model development.
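The aggregation step can be illustrated without fitting a model. The sketch below assumes the per-participant SHAP values from each random split are already available as a matrix (rows are participants, columns are features); the function names and toy numbers are hypothetical, and rounding the 3% cut-off upwards is an assumption that happens to give 87 features from 2,891.

```python
import math

def global_importance(shap_matrix):
    """Mean absolute SHAP value per feature across participants."""
    n = len(shap_matrix)
    n_features = len(shap_matrix[0])
    return [sum(abs(row[j]) for row in shap_matrix) / n for j in range(n_features)]

def average_over_splits(importances_per_split):
    """Average each feature's global importance across random splits."""
    n_splits = len(importances_per_split)
    return [sum(vals) / n_splits for vals in zip(*importances_per_split)]

def top_fraction(names, importance, fraction=0.03):
    """Names of the features in the top `fraction` by importance (rounded up)."""
    n_keep = max(1, math.ceil(fraction * len(names)))
    ranked = sorted(zip(names, importance), key=lambda p: p[1], reverse=True)
    return [name for name, _ in ranked[:n_keep]]

# Hypothetical SHAP values: two splits, two participants, three features.
names = ["age", "height", "noise"]
split_a = [[0.5, -0.1, 0.0], [-0.3, 0.2, 0.0]]
split_b = [[0.1, -0.3, 0.0], [-0.1, 0.1, 0.0]]

per_split = [global_importance(split_a), global_importance(split_b)]
averaged = average_over_splits(per_split)
important = top_fraction(names, averaged)
```

Taking absolute values before averaging matters: positive and negative contributions from different participants would otherwise cancel and understate a feature's influence.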
Epidemiological Analyses
After identifying potential risk factors using the GBDT-SHAP pipeline, the associations between each factor and low HV were assessed using logistic regression, adjusting all analyses for age, sex, assessment centre, ethnicity, education, employment, Townsend deprivation index, and duration until imaging (in years). Because lung function and grip strength have previously been shown to depend on body size [27, 28], models for forced vital capacity were further adjusted for height, and models for left-hand grip strength were further adjusted for height and weight. Continuous measures (e.g., physical measures, blood and urine biomarkers) were divided into quintiles for presentation of the results. Quadratic terms were included in the models to check for nonlinearity. We also tested for effect modification by age (<65 years vs. ≥65 years) and sex by including relevant interaction terms in the models of the features and low HV. We used the false discovery rate (FDR) to account for multiple testing. Odds ratios (ORs) and their 95% confidence intervals (95% CI) were used to present the results. All epidemiological analyses were undertaken using STATA (version 17, StataCorp, College Station, TX, USA).
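Two of the quantities used in these analyses can be written out explicitly. The sketch below converts a logistic regression coefficient and its standard error into an OR with a Wald 95% CI, and adjusts a set of p values for multiple testing. The text does not name the FDR procedure, so the Benjamini-Hochberg step-up method shown here is an assumption; the function names and example numbers are hypothetical.

```python
import math

def odds_ratio_ci(beta, se, z=1.959964):
    """OR and Wald 95% CI from a log-odds coefficient and its standard error."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical coefficient and p values for four features.
or_, ci_low, ci_high = odds_ratio_ci(beta=0.24, se=0.055)
p_raw = [0.001, 0.04, 0.03, 0.20]
p_fdr = benjamini_hochberg(p_raw)
significant = [p < 0.05 for p in p_fdr]
```

In this toy example two features have raw p values below 0.05 that no longer pass the threshold after adjustment, which is the behaviour the correction is designed to produce.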
Results
Participant Characteristics
The analyses included 42,152 individuals, of whom 53.1% were female. The median follow-up time from baseline assessment to HV measurements was 9.22 years (interquartile range of 7.64–10.31), and we classified 8,431 individuals (20% of the study participants) as having low HV. Compared to others, the proportion of participants classified as low HV was higher in males, older people, those with no education, and hypertensives (Table 1).
Table 1.
Characteristics | Total, n (%) (N = 42,152) | Low HV, % (n) |
---|---|---|
Sex | ||
Female | 22,372 (53.07) | 12.19 (2,728) |
Male | 19,780 (46.93) | 28.83 (5,703) |
p value | 1E−307 | |
Age at initial assessment | ||
<65 years | 37,722 (89.49) | 17.29 (6,522) |
≥65 years | 4,430 (10.51) | 43.09 (1,909) |
p value | 1E−307 | |
Ethnic background | ||
White European | 40,770 (96.72) | 20.27 (8,266) |
Asian | 566 (1.34) | 9.72 (55) |
Black African | 283 (0.67) | 9.19 (26) |
Other/mixed/unknown | 533 (1.26) | 15.76 (84) |
p value | 1.34E−10 | |
Educational status | ||
None | 2,672 (6.34) | 24.48 (654) |
NVQ/CSE/A-levels | 13,005 (30.85) | 18.77 (2,441) |
Degree/professional | 26,339 (62.49) | 20.12 (5,300) |
Missing | 136 (0.32) | 26.47 (36) |
p value | 3.63E−33 | |
Townsend deprivation index | ||
Q1 | 10,033 (23.80) | 21.02 (2,109) |
Q2 | 9,626 (22.84) | 20.48 (1,971) |
Q3 | 8,735 (20.72) | 19.86 (1,735) |
Q4 | 7,879 (18.69) | 18.72 (1,475) |
Q5 | 5,840 (13.87) | 19.49 (1,138) |
Missing | 39 (0.09) | 7.69 (3) |
p value | 1.14E−05 | |
Alcohol intake frequency | ||
Daily or almost daily | 9,522 (22.60) | 25.16 (2,396) |
Three or four times a week | 11,809 (28.03) | 20.87 (2,465) |
Once or twice a week | 10,809 (25.65) | 17.68 (1,911) |
One to three times a month | 4,597 (10.91) | 16.10 (740) |
Special occasions only | 3,447 (8.18) | 16.13 (556) |
Never | 1,949 (4.63) | 18.42 (359) |
Missing | 19 (0.05) | 21.05 (4) |
p value | 1.58E−07 | |
Smoking status | ||
Never | 25,580 (60.69) | 17.94 (4,590) |
Previous | 13,869 (32.90) | 23.40 (3,246) |
Current | 2,610 (6.19) | 21.95 (573) |
Missing | 93 (0.22) | 23.66 (22) |
p value | 1.59E−28 | |
Hypertension | ||
No | 34,327 (81.45) | 18.56 (6,370) |
Yes | 7,818 (18.55) | 26.35 (2,060) |
Missing | 7 (0.02) | 14.29 (1) |
p value | 1.49E−04 | |
p values are from likelihood ratio test using logistic regression adjusted for age, sex, and assessment centre.
Identifying Potential Risk Factors Using GBDT-SHAP Pipeline
The average area under receiver operating curve for the GBDT model after 100 iterations was 0.73 both when including all 2,891 features and when restricting to the 3% of features with the highest SHAP values. The SHAP values of each feature in the GBDT model are shown in supplementary Table 1 (for all online suppl. material, see https://doi.org/10.1159/000538565). Of the total feature importance, 41% was captured by baseline, personal, and sociodemographic characteristics, 25% by physical measures, 15% by biomarkers, and 9% by lifestyle and environment, while medical and family history accounted for 7% and psychosocial and cognitive function for 4%.
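For readers less familiar with the metric, the area under the receiver operating curve can be read as the probability that a randomly chosen case receives a higher predicted score than a randomly chosen non-case. The sketch below computes it directly from that definition; the function name and toy data are hypothetical, and this pairwise form is meant for illustration rather than for a dataset of this size.

```python
def auroc(labels, scores):
    """AUROC as the fraction of case/non-case pairs ranked correctly.

    Ties in score count as half a correct ranking.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

# Hypothetical labels (1 = low HV) and model scores for five participants.
labels = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.3, 0.1]
auc = auroc(labels, scores)
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation.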
Of the 77 lifestyle and environment features included in the input data, 13 were identified as important by the GBDT-SHAP pipeline, while 15 of 24 physical features were picked up. From the five features of cognitive function included in the input, 3 were identified as potentially important (time to complete round, number of incorrect matches in the round, and mean time to correctly identify matches), while out of the 511 self-reported non-cancer illnesses that were considered, only hypertension was identified as potentially important for risk of low HV. Twenty-nine features were picked up as important from 55 biomarkers included in the input data (online suppl. Table 2).
Epidemiological Analyses
Results from the logistic regression analyses of the 87 potentially important features and low HV, adjusted for age, sex, assessment centre, ethnicity, education, employment, Townsend deprivation index, and duration until imaging, are shown in online supplementary Table 3. Of the 87 features included in the analyses, 47 showed evidence for an association with risk of low HV after FDR correction. There were no interactions by age or sex for any of the features in the association with low HV (p value after FDR correction >0.05 for all comparisons).
Males had a higher risk of low HV compared to females (OR 2.83, 95% CI = 2.68–2.99), as did participants who were ≥65 years compared to <65 years (OR 1.41, 95% CI = 1.29–1.53). Among lifestyle and environment-related features, a greater risk of low HV was seen in current smokers compared to non-smokers (OR 1.27, 95% CI = 1.14–1.41) and in participants who drank ≥5 cups of coffee/day compared to those who drank less than 1 cup of coffee/day (OR 1.21, 95% CI = 1.10–1.32). A higher level of moderate physical activity (PA) was associated with a lower risk of low HV, while a protective association was also seen for more time spent watching television and for the use of a car or motor vehicle for non-work transport (Fig. 3).
Regarding physical measures, several measures reflecting greater height, weight, or fat mass were associated with higher risk of low HV (Fig. 4). Greater hand grip strength (left) and forced vital capacity were associated with lower risk of low HV, while higher pulse rate was associated with a slightly higher risk.
Low HV risk was also associated with some cognitive function, psychosocial and other health, and medical history-related features (Fig. 5). The higher the mean time taken to correctly identify matches the greater the risk (Q5 vs. Q1 OR 1.11, 95% CI = 1.03–1.20). Having longstanding illness (OR 1.15, 95% CI = 1.08–1.22), other serious medical condition/disability diagnosed by doctor (OR 1.17, 95% CI = 1.10–1.26), and hypertension (OR 1.11, 95% CI = 1.05–1.18) were associated with a greater risk of low HV.
Many blood cell measures were among the biomarkers that associated with low HV. Having a higher red blood cell count or haemoglobin concentration was associated with lower risk of low HV, while higher mean corpuscular haemoglobin, mean corpuscular volume, mean reticulocyte volume, mean sphered cell volume, and greater red blood cell distribution width were all associated with more risk of low HV. Among other biomarkers, higher urate and cystatin C were associated with greater risk, while creatinine, insulin-like growth factor 1 (IGF1), total protein, albumin, and calcium were inversely associated with risk (Fig. 6).
Discussion
Hippocampal atrophy commonly precedes cognitive decline, and identifying factors affecting the risk of low HV can provide clues for dementia prevention at a stage where it is still possible to stop or delay the disease process. In this hypothesis-free analysis, we used information from the world’s largest brain imaging study and a novel ML pipeline to identify risk factors for low HV. Our analysis confirmed many of the characteristics that have previously been associated with HV, such as older age, greater weight, and hypertension, as well as the potentially important role of many factors relating to modifiable lifestyles. Interestingly, lung function and several red blood cell indices were associated with low HV, and indeed, many of the characteristics associated with low HV may be associated with low oxygen supply to the brain. Our analysis supports the view that by retaining a healthy weight and taking action on modifiable lifestyles it is possible to lower the risk of low HV, while the associations seen with several blood biomarkers suggest that even with relatively simple biomarker assessments, it may be possible to develop ways to identify people more likely to have low HV. This may help to predict dementia at the stage when cognitive changes are yet to occur, offering opportunities for pre-emptive prevention.
Differences in lifestyle are important for brain health, with this notion strongly supported by our study. In addition to many other adverse health effects, smoking increases the risk of AD [29], with our study and others confirming a role in hippocampal atrophy [30, 31]. While the biological mechanisms underlying this association are not well established, smoke-induced oxidative stress, caused by the reactive oxygen species found in tobacco smoke, may lead to cigarette smoking-related hippocampal injury, as demonstrated in some animal model studies [32, 33]. As demonstrated by our previous research [34], higher coffee consumption is associated with a greater risk of low HV. While this type of negative effect of coffee on HV appears plausible as clinical trials and mendelian randomization studies suggest that high caffeine/coffee intake can lower grey matter volume in the brain [35, 36], further research is required to establish the presence of a causal relationship and the underlying mechanisms. PA, another risk factor supported by this and other studies [37, 38], could lower the risk of low HV through upregulation of growth factors that promote hippocampal neurogenesis and angiogenesis [39]. There are many other benefits of physical exercise that may explain a link between PA and HV, including improved oxygen supply to the brain [40], dendritic arborization [41], neuronal plasticity [39], preservation of neurovascular structures, and a reduction in systemic inflammation [42]. Interestingly, the risk of low HV was lower in car/motor vehicle users than non-users in this study. A similar finding has been reported in a recent study conducted among Japanese older adults [43]. While we are unable to exclude a non-causal explanation, this association could be related to the effect of driving on brain activation and neuronal plasticity [44]. It is also possible that this type of mechanism might explain the unexpected finding suggesting a protective association between time spent watching television and HV, even if our observation contrasts with findings from some previous studies [45, 46].
The hippocampus is a brain area that is vulnerable to obesity-related atrophy, and the association between overweight/obesity and low HV has been reported in previous studies [47, 48]. Though the mechanisms by which obesity causes brain atrophy are complex and warrant further investigation, it could be that obesity is associated with vascular risk factors, such as hypertension, dyslipidaemia, and impaired blood vessel function [49]. It is interesting that despite many disease conditions being included as inputs to our models, only hypertension was identified as an important predictor for low HV. Hypertension and related factors can compromise blood flow to the brain, including the hippocampus, leading to reduced oxygen and hippocampal atrophy over time [9]. Hand grip strength was another factor inversely associated with low HV, a finding earlier reported in participants with diagnosed major depressive disorder [50]. This might be because hand grip strength is a predictive biomarker of whole-body muscle strength and function, nutritional status, and disease status [51], which could also have an effect on brain health.
The observed association between better lung function and a reduced risk of low HV may reflect beneficial influences from adequate oxygen provision to the brain. Chronic hypoxemia is one of the factors associated with greater risk of hippocampal atrophy and cognitive impairment [52], and associations with many of the features identified as potentially important could be explained by hypoxemia. This is supported by red blood cell (RBC) count and several other RBC indices, such as haemoglobin concentration, mean corpuscular haemoglobin, RBC distribution width, and mean volumes of reticulocytes, sphered cells, and red blood corpuscles, being identified as important features for low HV. Greater RBC count and haemoglobin concentration suggest greater oxygen carrying capacity in the blood, consistent with their association with lower risk of low HV. On the other hand, larger erythroid cells, more haemoglobin per cell, and greater RBC distribution width were associated with a higher risk of low HV. Considering these biomarkers together paints a picture reminiscent of megaloblastic macrocytic anaemia that results from folate or vitamin B12 deficiencies [53], interfering with the expansion of the erythroid lineage and leading to lower RBC production, larger RBC volume, and more haemoglobin per cell, while overall blood haemoglobin concentration is lower. Folate or vitamin B12 deficiencies are associated with increased serum levels of homocysteine [54], an amino acid that has been associated with hippocampal atrophy [55].
This study also showed the association between “higher mean time to correctly identify matches” (slower reaction time) and a greater risk of low HV, which might be explained through an association between anaemia and cognitive impairment [56], since anaemia is an established risk factor for cognitive decline that is known to impact reaction time [57]. It is associated with reduced oxygenation in cortical tissue, including the hippocampus, which is highly oxygen sensitive and particularly susceptible to the effects of inadequate oxygen supply [58].
Higher levels of both serum albumin and calcium were associated with lower risk of low HV. This could be because albumin, the most abundant serum protein, plays a key role in chaperoning calcium and other nutrients around the body [59], which are fundamental for normal neuronal function. Low calcium is also involved in pathophysiology of dementia [60] as it increases Aβ formation, hyperphosphorylation of TAU, and neuronal cell death [61]. Furthermore, low serum albumin concentration has also been associated with the accumulation of amyloid [62], a protein involved in the pathogenesis of AD [62].
We also found that higher cystatin C and higher urate levels were associated with greater risk of low HV, suggesting kidney dysfunction may be important in predicting hippocampal atrophy. Higher cystatin C concentration is a sensitive marker of chronic kidney disease, particularly in the elderly population [63], and since low HV is an independent predictor of cognitive performance [64], our findings are consistent with previous studies linking both higher cystatin C and chronic kidney disease to cognitive decline [65, 66]. In contrast to this, our study suggested a potentially protective association between higher creatinine levels and lower risk of low HV. Creatinine is the waste product of the metabolism of creatine for energy, and while it is typically used in the assessment of kidney function, it also acts as a buffer for energy supply, particularly for the brain and muscles. In the brain, creatinine has been shown to improve mitochondrial performance and to act as an anti-oxidant and neuroprotectant [67]. Indeed, a meta-analysis of randomized controlled trials suggests that oral creatine supplementation may help to improve short-term memory and intelligence/reasoning among healthy individuals [68].
Similar to the finding of another study conducted in older adults [69], a direct association between IGF-1 and HV was observed in this study. This could be related to the important role of IGF-1 in the growth and protection of neurons and in enhancing and maintaining myelination in the brain [70]. In light of our other findings, it may also be notable that IGF-1 promotes the activation of vitamin D [71], a key facilitator of dietary calcium absorption, and a relationship is emerging between IGF-1 levels and calcium [72].
To the best of our knowledge, this is the largest and the most comprehensive study to examine determinants of low HV. We used a novel ML pipeline that was able to pick important features among thousands of available participant characteristics, with analyses conducted in a way that would have allowed us to uncover risk factors even in the presence of interaction and non-linear associations. However, this study also has several potential limitations, and due to the observational nature of our analyses it is not possible to demonstrate causality. Despite covariate adjustments, we cannot discount the presence of residual confounding. We excluded closely correlated features, and in epidemiological analyses the association between each feature and low HV was examined individually; hence, these associations may not capture complex biological relationships. While many data items were collected through clinical examinations or blood sampling, by necessity some characteristics are self-reported and may be affected by reporting or recall bias. It is also not possible to discount reverse causality underlying some of the observations, even though we used a prospective study design. The UKB is affected by healthy volunteer bias, and participants have been reported to be less likely to be obese or to smoke and to report fewer health conditions compared to the general population [73]. That said, features related to smoking and obesity were still identified by the GBDT approach as important predictive features. As many ethnic groups are underrepresented, our findings may not be directly generalizable to the whole UK population. Despite healthy volunteer bias and ethnic underrepresentation in the UKB, comparisons between UKB and 18 nationally representative studies with conventional response rates suggest that analyses in the UKB show similar risk factor-outcome associations to those evident in the comparator studies [74].
Conclusions
In summary, lifestyles, physical measures, and biomarkers may affect low HV risk, with many of the characteristics associated with low HV also associated with low oxygen supply to the brain. The largest effect estimates were seen for factors including age, sex, coffee intake, smoking, and body size and composition, whereas features reflecting cognitive function, psychosocial function, and health/medical history were relatively weakly associated. Associations with many blood biomarkers suggest that relatively simple biomarker assessments could help in early prediction, providing potential for pre-emptive dementia prevention. Future research is needed to confirm causality and clinical relevance and to validate these findings in more diverse populations.
Statement of Ethics
Ethical approval for the UK Biobank was granted by the National Information Governance Board for Health and Social Care and Northwest Multicentre Research Ethics Committee (11/NW/0382). During data collection, electronic consent was given by each participant to use their anonymized data and access their medical records for health-related research [75]. The current study was conducted under project number 89630.
Conflict of Interest Statement
The authors report no competing interests.
Funding Sources
This work was in part supported by Research Training Program international scholarship from the Government of Australia and Medical Research Future Fund (MRF2007431), Australia. EH is funded by National Health and Medical Research Council (Australia) leadership award (GNT2025349). The funders had no involvement in the study’s design, conduct, data handling, analysis, manuscript writing, and decision to submit for publication.
Author Contributions
YY: data curation, formal analysis, investigation, software, methodology, visualization, validation, and writing – original draft. IM: data curation, formal analysis, methodology, supervision, validation, and writing – review and editing. AM: data curation, methodology, supervision, validation, and writing – review and editing. AL: investigation, supervision, validation, and writing – review and editing. EH: conceptualization, funding acquisition, methodology, project administration, resources, supervision, validation, and writing – review and editing.
Data Availability Statement
The data used for this research will be accessible to approved users of the UK Biobank upon application.
Supplementary Material.
References
- 1. Dubois B, Feldman HH, Jacova C, Dekosky ST, Barberger-Gateau P, Cummings J, et al. Research criteria for the diagnosis of Alzheimer’s disease: revising the NINCDS-ADRDA criteria. Lancet Neurol. 2007;6(8):734–46.
- 2. Apostolova LG, Mosconi L, Thompson PM, Green AE, Hwang KS, Ramirez A, et al. Subregional hippocampal atrophy predicts Alzheimer’s dementia in the cognitively normal. Neurobiol Aging. 2010;31(7):1077–88.
- 3. Beason-Held LL, Goh JO, An Y, Kraut MA, O'Brien RJ, Ferrucci L, et al. Changes in brain function occur years before the onset of cognitive impairment. J Neurosci. 2013;33(46):18008–14.
- 4. Fotuhi M, Do D, Jack C. Modifiable factors that alter the size of the hippocampus with ageing. Nat Rev Neurol. 2012;8(4):189–202.
- 5. Lobanova I, Qureshi AI. The association between cardiovascular risk factors and progressive hippocampus volume loss in persons with Alzheimer’s disease. J Vasc Interv Neurol. 2014;7(5):52–5.
- 6. Wiesmann M, Kiliaan AJ, Claassen JA. Vascular aspects of cognitive impairment and dementia. J Cereb Blood Flow Metab. 2013;33(11):1696–706.
- 7. Hashimoto M, Araki Y, Takashima Y, Nogami K, Uchino A, Yuzuriha T, et al. Hippocampal atrophy and memory dysfunction associated with physical inactivity in community-dwelling elderly subjects: the Sefuri study. Brain Behav. 2017;7(2):e00620.
- 8. Cox SR, Lyall DM, Ritchie SJ, Bastin ME, Harris MA, Buchanan CR, et al. Associations between vascular risk factors and brain MRI indices in UK Biobank. Eur Heart J. 2019;40(28):2290–300.
- 9. Yao H, Araki Y, Yamashita F, Sasaki M, Hashimoto M. Chapter 24: hippocampal atrophy associated with dementia risk factors and dementia. In: Martin CR, Preedy VR, editors. Genetics, neurology, behavior, and diet in dementia. Academic Press; 2020. p. 373–87.
- 10. Lombardi G, Crescioli G, Cavedo E, Lucenteforte E, Casazza G, Bellatorre AG, et al. Structural magnetic resonance imaging for the early diagnosis of dementia due to Alzheimer’s disease in people with mild cognitive impairment. Cochrane Database Syst Rev. 2020;3(3):CD009628.
- 11. WHO. Dementia. 2021.
- 12. GBD 2016 Dementia Collaborators. Global, regional, and national burden of Alzheimer’s disease and other dementias, 1990–2016: a systematic analysis for the Global Burden of Disease Study 2016. Lancet Neurol. 2019;18(1):88–106.
- 13. Firbank MJ, Narayan SK, Saxby BK, Ford GA, O'Brien JT. Homocysteine is associated with hippocampal and white matter atrophy in older subjects with mild hypertension. Int Psychogeriatr. 2010;22(5):804–11.
- 14. Ho AJ, Raji CA, Becker JT, Lopez OL, Kuller LH, Hua X, et al. The effects of physical activity, education, and body mass index on the aging brain. Hum Brain Mapp. 2011;32(9):1371–82.
- 15. Frangou S, Abbasi F, Watson K, Haas SS, Antoniades M, Modabbernia A, et al. Hippocampal volume reduction is associated with direct measure of insulin resistance in adults. Neurosci Res. 2022;174:19–24.
- 16. Olson RS, Cava W, Mustahsan Z, Varik A, Moore JH. Data-driven advice for applying machine learning to bioinformatics problems. Pac Symp Biocomput. 2018;23:192–203.
- 17. Zhang Z, Zhao Y, Canes A, Steinberg D, Lyashevska O; written on behalf of AME Big-Data Clinical Trial Collaborative Group. Predictive analytics with gradient boosting in clinical medicine. Ann Transl Med. 2019;7(7):152.
- 18. Bycroft C, Freeman C, Petkova D, Band G, Elliott LT, Sharp K, et al. The UK Biobank resource with deep phenotyping and genomic data. Nature. 2018;562(7726):203–9.
- 19. UK Biobank. Help us to complete the world's largest imaging study. 2023.
- 20. Miller KL, Alfaro-Almagro F, Bangerter NK, Thomas DL, Yacoub E, Xu J, et al. Multimodal population brain imaging in the UK Biobank prospective epidemiological study. Nat Neurosci. 2016;19(11):1523–36.
- 21. Smith SM, Alfaro-Almagro F, Miller KL. Brain imaging documentation. 2020.
- 22. Weinstein G, Beiser AS, Decarli C, Au R, Wolf PA, Seshadri S. Brain imaging and cognitive predictors of stroke and Alzheimer disease in the Framingham Heart Study. Stroke. 2013;44(10):2787–94.
- 23. Millard LAC, Davies NM, Gaunt TR, Davey Smith G, Tilling K. Software Application Profile: PHESANT: a tool for performing automated phenome scans in UK Biobank. Int J Epidemiol. 2018;47(1):29–35.
- 24. Madakkatel I, Zhou A, McDonnell MD, Hyppönen E. Combining machine learning and conventional statistical approaches for risk factor discovery in a large cohort study. Sci Rep. 2021;11(1):22997.
- 25. Maldonado S, Weber R, Famili F. Feature selection for high-dimensional class-imbalanced data sets using Support Vector Machines. Inf Sci. 2014;286:228–46.
- 26. Bolón-Canedo V, Sánchez-Maroño N, Alonso-Betanzos A. Feature selection for high-dimensional data. Prog Artif Intell. 2016;5(2):65–75.
- 27. Talaminos Barroso A, Márquez Martín E, Roa Romero LM, Ortega Ruiz F. Factors affecting lung function: a review of the literature. Arch Bronconeumol. 2018;54(6):327–32.
- 28. Pratt J, De Vito G, Narici M, Segurado R, Dolan J, Conroy J, et al. Grip strength performance from 9,431 participants of the GenoFit study: normative data and associated factors. Geroscience. 2021;43(5):2533–46.
- 29. Cataldo JK, Prochaska JJ, Glantz SA. Cigarette smoking is a risk factor for Alzheimer’s disease: an analysis controlling for tobacco industry affiliation. J Alzheimers Dis. 2010;19(2):465–80.
- 30. Durazzo TC, Meyerhoff DJ, Nixon SJ. Interactive effects of chronic cigarette smoking and age on hippocampal volumes. Drug Alcohol Depend. 2013;133(2):704–11.
- 31. Hawkins KA, Emadi N, Pearlson GD, Taylor B, Khadka S, King D, et al. The effect of age and smoking on the hippocampus and memory in late middle age. Hippocampus. 2018;28(11):846–9.
- 32. Ho YS, Yang X, Yeung SC, Chiu K, Lau CF, Tsang AW, et al. Cigarette smoking accelerated brain aging and induced pre-Alzheimer-like neuropathology in rats. PLoS One. 2012;7(5):e36752.
- 33. Khanna A, Guo M, Mehra M, Royal W 3rd. Inflammation and oxidative stress induced by cigarette smoke in Lewis rat brains. J Neuroimmunol. 2013;254(1–2):69–75.
- 34. Pham K, Mulugeta A, Zhou A, O’Brien JT, Llewellyn DJ, Hyppönen E. High coffee consumption, brain volume and risk of dementia and stroke. Nutr Neurosci. 2022;25(10):2111–22.
- 35. Lin YS, Weibel J, Landolt HP, Santini F, Meyer M, Brunmair J, et al. Daily caffeine intake induces concentration-dependent medial temporal plasticity in humans: a multimodal double-blind randomized controlled trial. Cereb Cortex. 2021;31(6):3096–106.
- 36. Zheng BK, Niu PP. Higher coffee consumption is associated with reduced cerebral gray matter volume: a mendelian randomization study. Front Nutr. 2022;9:850004.
- 37. Mueller K, Möller HE, Horstmann A, Busse F, Lepsien J, Blüher M, et al. Physical exercise in overweight to obese individuals induces metabolic- and neurotrophic-related structural brain plasticity. Front Hum Neurosci. 2015;9:372.
- 38. Kim YS, Shin SK, Hong SB, Kim HJ. The effects of strength exercise on hippocampus volume and functional fitness of older women. Exp Gerontol. 2017;97:22–8.
- 39. Kim JH, Liu QF, Urnuhsaikhan E, Jeong HJ, Jeon MY, Jeon S. Moderate-intensity exercise induces neurogenesis and improves cognition in old mice by upregulating hippocampal hippocalcin, Otub1, and spectrin-α. Mol Neurobiol. 2019;56(5):3069–78.
- 40. Steventon JJ, Foster C, Furby H, Helme D, Wise RG, Murphy K. Hippocampal blood flow is increased after 20 min of moderate-intensity exercise. Cereb Cortex. 2020;30(2):525–33.
- 41. Tsai SF, Ku NW, Wang TF, Yang YH, Shih YH, Wu SY, et al. Long-term moderate exercise rescues age-related decline in hippocampal neuronal complexity and memory. Gerontology. 2018;64(6):551–61.
- 42. Fuss J, Biedermann SV, Falfán-Melgoza C, Auer MK, Zheng L, Steinle J, et al. Exercise boosts hippocampal volume by preventing early age-related gray matter loss. Hippocampus. 2014;24(2):131–4.
- 43. Shimada H, Bae S, Harada K, Makino K, Chiba I, Katayama O, et al. Association between driving a car and retention of brain volume in Japanese older adults. Exp Gerontol. 2023;171:112010.
- 44. Spiers HJ, Maguire EA. Neural substrates of driving behaviour. Neuroimage. 2007;36(1):245–55.
- 45. Fancourt D, Steptoe A. Television viewing and cognitive decline in older age: findings from the English Longitudinal Study of Ageing. Sci Rep. 2019;9(1):2851.
- 46. Takeuchi H, Kawashima R. Effects of television viewing on brain structures and risk of dementia in the elderly: longitudinal analyses. Front Neurosci. 2023;17:984919.
- 47. Raji CA, Ho AJ, Parikshak NN, Becker JT, Lopez OL, Kuller LH, et al. Brain structure and obesity. Hum Brain Mapp. 2010;31(3):353–64.
- 48. Debette S, Seshadri S, Beiser A, Au R, Himali JJ, Palumbo C, et al. Midlife vascular risk factor exposure accelerates structural brain aging and cognitive decline. Neurology. 2011;77(5):461–8.
- 49. Dorrance AM, Matin N, Pires PW. The effects of obesity on the cerebral vasculature. Curr Vasc Pharmacol. 2014;12(3):462–72.
- 50. Firth JA, Smith L, Sarris J, Vancampfort D, Schuch F, Carvalho AF, et al. Handgrip strength is associated with hippocampal volume and white matter hyperintensities in major depression and healthy controls: a UK Biobank study. Psychosom Med. 2020;82(1):39–46.
- 51. Bohannon RW. Grip strength: an indispensable biomarker for older adults. Clin Interv Aging. 2019;14:1681–91.
- 52. Daulatzai MA. Quintessential risk factors: their role in promoting cognitive dysfunction and Alzheimer’s disease. Neurochem Res. 2012;37(12):2627–58.
- 53. WHO. Nutritional anaemias: tools for effective prevention and control. 2017.
- 54. Ni J, Zhang L, Zhou T, Xu WJ, Xue JL, Cao N, et al. Association between the MTHFR C677T polymorphism, blood folate and vitamin B12 deficiency, and elevated serum total homocysteine in healthy individuals in Yunnan Province, China. J Chin Med Assoc. 2017;80(3):147–53.
- 55. den Heijer T, Vermeer SE, Clarke R, Oudkerk M, Koudstaal PJ, Hofman A, et al. Homocysteine and brain atrophy on MRI of non-demented elderly. Brain. 2003;126(Pt 1):170–5.
- 56. Kung WM, Yuan SP, Lin MS, Wu CC, Islam MM, Atique S, et al. Anemia and the risk of cognitive impairment: an updated systematic review and meta-analysis. Brain Sci. 2021;11(6):777.
- 57. Winchester LM, Powell J, Lovestone S, Nevado-Holgado AJ. Red blood cell indices and anaemia as causative factors for cognitive function deficits and for Alzheimer’s disease. Genome Med. 2018;10(1):51.
- 58. Zhang H, Roman RJ, Fan F. Hippocampus is more susceptible to hypoxic injury: has the Rosetta Stone of regional variation in neurovascular coupling been deciphered? Geroscience. 2022;44(1):127–30.
- 59. Peacock M. Calcium metabolism in health and disease. Clin J Am Soc Nephrol. 2010;5(Suppl 1):S23–30.
- 60. Cascella R, Cecchi C. Calcium dyshomeostasis in Alzheimer’s disease pathogenesis. Int J Mol Sci. 2021;22(9):4914.
- 61. LaFerla FM. Calcium dyshomeostasis and intracellular signalling in Alzheimer’s disease. Nat Rev Neurosci. 2002;3(11):862–72.
- 62. Kim JW, Byun MS, Lee JH, Yi D, Jeon SY, Sohn BK, et al. Serum albumin and beta-amyloid deposition in the human brain. Neurology. 2020;95(7):e815–26.
- 63. Murty MS, Sharma UK, Pandey VB, Kankare SB. Serum cystatin C as a marker of renal function in detection of early acute kidney injury. Indian J Nephrol. 2013;23(3):180–3.
- 64. O'Sullivan M, Ngo E, Viswanathan A, Jouvent E, Gschwendtner A, Saemann PG, et al. Hippocampal volume is an independent predictor of cognitive performance in CADASIL. Neurobiol Aging. 2009;30(6):890–7.
- 65. Yaffe K, Lindquist K, Shlipak MG, Simonsick E, Fried L, Rosano C, et al. Cystatin C as a marker of cognitive function in elders: findings from the health ABC study. Ann Neurol. 2008;63(6):798–802.
- 66. Nair P, Misra S, Nath M, Vibha D, Srivastava AK, Prasad K, et al. Cystatin C and risk of mild cognitive impairment: a systematic review and meta-analysis. Dement Geriatr Cogn Disord. 2020;49(5):471–82.
- 67. Rae CD, Bröer S. Creatine as a booster for human brain function. How might it work? Neurochem Int. 2015;89:249–59.
- 68. Avgerinos KI, Spyrou N, Bougioukas KI, Kapogiannis D. Effects of creatine supplementation on cognitive function of healthy individuals: a systematic review of randomized controlled trials. Exp Gerontol. 2018;108:166–73.
- 69. Maass A, Düzel S, Brigadski T, Goerke M, Becke A, Sobieray U, et al. Relationships of peripheral IGF-1, VEGF and BDNF levels to exercise-related changes in memory, hippocampal perfusion and volumes in older adults. Neuroimage. 2016;131:142–54.
- 70. Bedse G, Di Domenico F, Serviddio G, Cassano T. Aberrant insulin signaling in Alzheimer's disease: current knowledge. Front Neurosci. 2015;9:204.
- 71. Ameri P, Giusti A, Boschetti M, Murialdo G, Minuto F, Ferone D. Interactions between vitamin D and IGF-I: from physiology to clinical practice. Clin Endocrinol. 2013;79(4):457–63.
- 72. Van Hemelrijck M, Shanmugalingam T, Bosco C, Wulaningsih W, Rohrmann S. The association between circulating IGF1, IGFBP3, and calcium: results from NHANES III. Endocr Connect. 2015;4(3):187–95.
- 73. Fry A, Littlejohns TJ, Sudlow C, Doherty N, Adamska L, Sprosen T, et al. Comparison of sociodemographic and health-related characteristics of UK Biobank participants with those of the general population. Am J Epidemiol. 2017;186(9):1026–34.
- 74. Batty GD, Gale CR, Kivimäki M, Deary IJ, Bell S. Comparison of risk factor associations in UK Biobank against representative, general population based studies with conventional response rates: prospective cohort study and individual participant meta-analysis. BMJ. 2020;368:m131.
- 75. Sudlow C, Gallacher J, Allen N, Beral V, Burton P, Danesh J, et al. UK Biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med. 2015;12(3):e1001779.