How reproducible are data-driven subtypes of Alzheimer's disease atrophy?

Emma Prevot; Cameron Shand; for Alzheimer’s Disease Neuroimaging Initiative; Neil Oxtoby

doi:10.1177/13872877251415019

2026 Jan 28;110(1):426–442.

PMCID: PMC12960741 PMID: 41603392

Abstract

Background

Alzheimer's disease (AD) exhibits substantial clinical and biological heterogeneity, complicating efforts in treatment and intervention development. While new computational methods offer insights into AD subtyping and disease staging, the reproducibility of these subtypes across datasets remains understudied, particularly concerning the robustness of subtype definitions when validated on diverse databases.

Objective

This study evaluates the robustness of the AD progression subtypes identified by the Subtype and Stage Inference (SuStaIn) algorithm on a larger and more diverse cohort.

Methods

We extracted T1-weighted MRI data for 5444 subjects from ANMerge, OASIS, and ADNI datasets, forming four independent cohorts. Each cohort was analyzed with SuStaIn under two conditions: one using the full cohort, including cognitively normal controls, and another excluding controls to test subtype robustness.

Results

Results confirm the three primary atrophy subtypes identified in earlier studies: Typical, Cortical, and Subcortical, as well as the emergence of rare and atypical AD variants such as posterior cortical atrophy. Notably, each subtype displayed varying robustness to the inclusion of controls, with certain subtypes, like the Subcortical subtype, more influenced by cohort composition.

Conclusions

This investigation underscores SuStaIn's reliability for defining stable AD subtypes and suggests its utility in clinical stratification for trials and diagnosis. However, our findings also highlight the need for greater demographic diversity in datasets, particularly in terms of ethnic representation, to enhance generalizability and support broader clinical application.

Keywords: Alzheimer's disease, brain, dementia, machine learning, neurodegenerative diseases, SuStaIn

Introduction

Alzheimer's disease (AD) stands as the predominant age-related neurodegenerative disorder and principal cause of dementia. Accounting for 60–80% of the estimated 55 million global dementia cases,^1,2 AD poses a significant burden, with healthcare costs expected to increase to more than $1 trillion by 2050 in light of an aging population.³ However, although AD was first described more than a century ago, disease-modifying therapies are only now beginning to emerge, and not without controversy. With over 200 clinical trials having failed in the last 15 years,⁴ it is clear that significant gaps remain in our understanding of AD pathophysiology and treatment development. A significant obstacle to understanding and treating AD is the vast heterogeneity in clinical, genetic and pathophysiological biomarkers. Recent work has revealed the importance of distinguishing between temporal heterogeneity (different disease stage/severity) and phenotypic heterogeneity (different pathophysiological cascades) to provide a comprehensive picture of the disease and enable accurate patient stratification and recruitment in clinical trials.^5–7

Within this complex landscape, the proliferation of publicly available datasets, including demographic, clinical, and biologic information of individuals, has catalyzed innovative insights into AD's pathophysiology, harnessing computational data-driven disease progression modeling and big data.⁸ We focus our attention on the Subtype and Stage Inference (SuStaIn) algorithm,^9,10 an unsupervised machine-learning technique which combines disease progression modeling and clustering to perform subtyping and staging of individuals. Unlike traditional data-driven models that either assume a universal temporal progression,^11–13 or subtype while neglecting temporal insights,^5,6,14–16 SuStaIn uniquely disentangles various heterogeneity levels, stratifying patients that are temporally and phenotypically heterogeneous, based on a wide range of disease biomarkers. In the context of our study, we deploy SuStaIn on volumetric data across several brain regions to infer multiple subtype atrophy patterns.

Previous analyses of AD atrophy subtypes using SuStaIn have revealed three distinct sequences of temporal progression,⁹ which will be referred to as the original subtypes throughout. These clusters exhibit phenotypic similarities to patterns of atrophy revealed by postmortem histology⁶ and retrospective analysis of MRI scans close to the time of death.^17–19 However, in the original Z-score study, SuStaIn was exclusively applied to either synthetic data or data from the Alzheimer's Disease Neuroimaging Initiative (ADNI),²⁰ which is a well-characterized research dataset with a significant imbalance in terms of its ethnic representation.^17,21 One exception is a follow-up study from Archetti et al.²² which focused on demonstrating the transferability of the SuStaIn disease progression models.⁹ They trained SuStaIn on ADNI and tested the algorithm on a heterogeneous and less structured cohort, using data from three additional independent and less well-phenotyped datasets. Their study successfully demonstrated reproducibility on lower-quality MRI data, highlighting SuStaIn's robustness. Building upon this work, our study extends the investigation by analyzing a larger cohort and ensuring consistency in biomarker selection, allowing for a more comprehensive assessment of subtype reproducibility. Additionally, we examine the impact of including or excluding controls during model fitting, a factor not addressed in Archetti et al., to further evaluate the stability of SuStaIn subtypes across different dataset compositions.

In this work, we extend the investigation of the transferability of the SuStaIn Alzheimer's disease progression model by training and validating the algorithm on a larger and more diverse cohort, ensuring consistency between biomarkers while examining how differences in dataset structure impact subtype reproducibility. Our primary aim is to assess the stability of the original SuStaIn subtypes and evaluate the reliability of the algorithm as a patient stratification tool, a necessary step before clinical implementation. Furthermore, to the best of our knowledge, this study is the first to explore the impact of excluding cognitively normal controls from SuStaIn modeling, providing insights into how subtype classifications may depend on control population definitions. Additionally, by applying SuStaIn across datasets with varying recruitment strategies, imaging protocols, and geographical origins, we highlight key factors that contribute to variability in subtype expression. Finally, we demonstrate SuStaIn's ability to characterize highly specific atrophy patterns, including those linked to atypical presentations of AD, such as posterior cortical atrophy (PCA), reinforcing its potential for detecting diverse disease trajectories.

Methods

We begin by providing a high-level experimental overview that comprises key elements of our approach. First, we obtained the data from the chosen datasets and divided participants into four cohorts. We then established a control group, which played a crucial role as the reference for z-scoring and aided in adjusting for key confounding factors. Subsequently, we trained atrophy subtype models for each cohort using SuStaIn and determined the optimal number of clusters per model using 10-fold cross-validation. This was repeated both with the control group integrated into the model fitting process and without. Finally, we qualitatively and quantitatively analyzed the resulting eight cross-validated models and their disease progression subtypes.

Participants and cohorts

SuStaIn model fitting was performed with cross-sectional volumetric MRI data from 3 different databases: the ADNI,²⁰ the Open Access Series of Imaging Studies (OASIS),²³ and the new version of the AddNeuroMed dataset (ANMerge).²⁴ We used all available data that passed quality control (QC) from the respective study. It is important to note that our study focuses on the clinical syndrome of probable AD, as diagnosed in research cohorts such as ADNI, OASIS, and ANMerge. While these cohorts are enriched for AD, definitive confirmation of underlying AD pathology would require biomarker evidence (amyloid/tau PET or cerebrospinal fluid (CSF) analysis) or postmortem examination. Our approach, therefore, reflects real-world clinical diagnosis, where AD is typically identified based on cognitive symptoms and neuroimaging findings, rather than direct pathological verification.

Extracted data included demographic and clinical assessments, apolipoprotein E (APOE) genotype, intracranial volume, and numerical regional brain volumes or average cortical thicknesses of 14 brain regions. These regions of interest (ROIs), which are used to construct the disease progression sequences, include the hippocampus, amygdala, nucleus accumbens, insula, cingulate, caudate, pallidum, putamen, thalamus, entorhinal cortex, and the frontal, temporal, occipital, and parietal lobes. We chose these regions following FreeSurfer lobe mapping as they were also used in the original SuStaIn study to allow direct comparison of findings. We acknowledge that this ROI set combines regions of varying anatomical granularity. For example, multiple distinct substructures are defined within the medial temporal lobe (MTL) and basal ganglia, while broader lobe-level ROIs are used for neocortical areas. This decision was made to ensure direct comparability with the original SuStaIn study, which used the same FreeSurfer-based lobe-level parcellations. However, future work may benefit from employing finer cortical parcellations to improve spatial specificity. For ADNI and OASIS datasets, we extracted volumes for all ROIs. In the case of ANMerge, due to the unavailability of volumetric data for some regions, we extracted cortical thickness measurements for the insula, cingulate, entorhinal cortex, frontal, temporal, occipital, and parietal lobes, and volumes for the remaining regions. To combine ADNI with ANMerge, we also extracted cortical thickness measurements for the same ROIs in ADNI. In the Supplemental Material we detail the experiments conducted on the ADNI dataset to confirm that the SuStaIn model fitted using cortical thickness is equivalent to the one based on volumes, ensuring that no significant bias was introduced (Supplemental Figure 4). 
However, being aware that cortical thinning is one of the earliest detectable signs of cognitive decline, we anticipate that the regions where thickness data were used might show earlier signs of atrophy.²⁵ We averaged the volumes and thicknesses of the left and right hemispheres for a unified representation. This was done after confirming that there were no significant asymmetries in the atrophy patterns between the two hemispheres in the averaged ROIs, for all cohorts, as shown in Supplemental Figures 5–8. Inclusion was based on data availability, and only participants with complete entries for the extracted data were included, using baseline covariate data to maintain cross-sectionality.

ADNI data was downloaded from LONI's Imaging Data Archive (IDA) and two independent data sets were constructed: one including 3T (field strength) MRI data ($n = 352$) and the other 1.5T MRI data ($n = 1153$), both pre-processed with FreeSurfer version 5.1 to obtain cross-sectional volumes. ANMerge data was downloaded from the Synapse data portal, and 1.5T MRI data pre-processed with FreeSurfer version 5.3 was extracted ($n = 931$). For both ADNI and ANMerge, the individuals were broadly diagnosed as either Cognitively Normal (CN), Mild Cognitive Impairment (MCI), or Alzheimer's Disease (AD). OASIS data was downloaded from XNAT Central, a publicly accessible data repository, and 3T MRI data pre-processed with FreeSurfer version 5.3 was extracted ($n = 1038$). However, for OASIS data, the diagnosis for each individual was not presented in a straightforward categorical manner as in the other datasets.

These three databases were used to construct four different cohorts to investigate atrophy subtype reproducibility across varying MRI field strengths, recruitment strategies, and geographic regions (USA/Europe): ANMerge ($n = 931$), OASIS ($n = 1038$), combined ADNI1.5T/ANMerge ($n = 2084$), and combined ADNI3T/OASIS ($n = 1391$). Both the ANMerge and OASIS MRI data acquisitions were designed to be compatible and comparable with ADNI, ensuring technical consistency in imaging-derived measures.

A primary goal of this study is to evaluate whether differences in field strength, which can affect estimated volumes and thicknesses of brain regions, influence the reproducibility of atrophy subtypes. However, dataset variability extends beyond imaging protocols. ADNI participants were recruited using a highly standardized research protocol, while OASIS and ANMerge represent more heterogeneous, real-world datasets, potentially introducing greater variability in clinical diagnoses, participant demographics, and imaging pipelines. Additionally, ANMerge includes participants recruited from multiple European countries, which may introduce heterogeneity in diagnostic practices and population characteristics. In contrast, ADNI and OASIS used a harmonized protocol across sites and recruited primarily from research-focused centers in North America, where participants tend to be more socioeconomically homogeneous, a factor that may influence generalizability beyond geographic considerations alone.

By examining SuStaIn subtypes across these structurally different datasets, we aim to understand which cohort-specific factors influence subtype expression and which aspects of subtype reproducibility are robust to differences in dataset composition. As for the different FreeSurfer versions, a review of release notes confirmed no significant differences affecting the regions used in modeling.

Data preparation

Control population. The definition of the control population was based on the Clinical Dementia Rating (CDR) scale. This metric evaluates cognitive and behavioral performance to stage the severity of dementia. Six different domains are assessed: memory, orientation, judgment and problem solving, community affairs, every-day life activities, and personal care.^26,27 A subject with a CDR score of zero can be considered cognitively normal, i.e., not affected by any form of dementia. Thus, the control population of each cohort was defined as the subjects with CDR = 0. Table 1 shows the size of each control population relative to that of the full cohort, together with two further measures for each group: the percentage of carriers of at least one APOE ε4 allele, which has long been associated with AD susceptibility,²⁸ and the Mini-Mental State Examination (MMSE) score, a widely used screening test for dementia.²⁹ A perfect MMSE score is 30, while a score of 24 is the recommended cut point for dementia.³⁰

Table 1.

Demographic, clinical, and genetic information of the four constructed cohorts, divided between controls (CDR = 0), and patients (CDR > 0).

Cohort	Group	n	APOE ε4 carriers (%)	Sex (% F)	Age (y, mean ± s.d.)	MMSE (mean ± s.d.)	CDR (mean ± s.d.)	Amyloid-β positivity (%)
ANMerge	Controls	273	30%	54%	74.3 ± 6.2	29.0 ± 1.2	-	-
	Patients	658	53%	60%	75.0 ± 6.5	23.2 ± 4.8	1.02 ± 0.69	-
OASIS	Controls	736	32%	60%	67.4 ± 8.2	29.1 ± 1.3	-	-
	Patients	302	51%	40%	72.6 ± 7.6	25.3 ± 4.6	0.72 ± 0.42	-
ADNI 1.5T	Controls	314	27%	53%	73.0 ± 6.0	29.0 ± 1.2	-	34%
	Patients	726	50%	43%	72.0 ± 7.8	27.2 ± 2.6	0.56 ± 0.17	60%
A1.5/A	Controls	610	29%	53%	73.6 ± 6.1	29.0 ± 1.2	-	-
	Patients	1474	51%	52%	73.4 ± 7.4	25.4 ± 4.3	0.70 ± 0.36	-
ADNI 3T	Controls	72	15%	69%	75.2 ± 3.1	29.2 ± 0.8	-	30%
	Patients	191	50%	42%	72.9 ± 8.0	25.9 ± 2.2	0.54 ± 0.13	75%
A3/O	Controls	849	34%	60%	67.9 ± 9.4	29.1 ± 1.2	-	-
	Patients	542	55%	44%	73.4 ± 7.8	25.6 ± 3.8	0.65 ± 0.33	-

Given that we report amyloid-β positivity data in the table even though it was not used to define control status, we only include ADNI participants for whom β-amyloid data was available. As a result, the numbers listed do not sum perfectly to the combined cohort totals. Specifically, amyloid-β data was missing for 113 participants in the ADNI 1.5T cohort and 90 participants in the ADNI 3T cohort.

Abbreviations: s.d., standard deviation; APOE ε4, percentage of carriers of at least one APOE ε4 allele; n, number of subjects in the cohort; M, male; F, female; MMSE, mini-mental state examination; CDR, clinical dementia rating.

Despite being a widely-used scale, CDR relies on cognitive tests, which are not as reliable as typical AD CSF biomarkers such as amyloid-β,³¹ especially for early diagnosis or individuals at preclinical stages. However, such biomarker information was not available across all databases used in this study. Clinical diagnoses were also unsuitable due to inconsistencies in classification between ADNI, ANMerge, and OASIS, with the latter providing extraneous information alongside diagnoses, hampering its reliability and comparability as a control separator. Therefore, to maintain consistency across cohorts, CDR was chosen to define the control population. As shown in Table 1, the CDR = 0 control groups align well with our expectations for cognitively normal individuals, characterized by near-perfect MMSE scores and a lower prevalence of APOE ε4 allele carriers. To further assess the impact of control definitions, we compared the mean and standard deviation (SD) of key biomarkers (hippocampus volume, amygdala volume, and entorhinal cortex thickness) across control groups (see Supplemental Table 1). Our findings indicate that means are generally comparable across cohorts.

Additionally, we performed sanity checks on the ADNI dataset, where we tested two control definitions: CDR = 0 alone and CDR = 0 with amyloid-negativity (Aβ₁₋₄₂ > 192 pg/ml), as used in the original SuStaIn study.⁹ Our analysis confirms that using CDR = 0 alone or combining it with amyloid-negativity does not significantly alter SuStaIn subtype identification (Supplemental Figures 1–3).

Detrending and Z-scoring. Covariate correction (detrending) was performed to regress out the effect of age, sex, and intracranial volume (ICV) on the brain region measurements,³² given the inter-cohort demographic variability. Residuals from linear regression of each ROI on age, sex, and ICV were used in place of raw ROI values prior to Z-scoring. Subsequently, data was z-scored relative to the control population; each regional brain volume was expressed as a z-score by subtracting the mean of the control population and dividing by the control standard deviation. By doing so, z-scored regional volumes and thicknesses indicate how many standard deviations away from the control mean (i.e., normality) each subject value is, which is indicative of abnormality. Given that brain volumes reduce with AD progression, leading to decreasing and eventually negative z-scores, we flip the sign of the calculated z-scores allowing them to increase with advancing atrophy, as expected by the SuStaIn algorithm. Importantly, the model only looks at abnormal z-scores, which are defined to be positive for definiteness, i.e., $z = 3$ is more severe atrophy than $z = 2$ , and so on.
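The detrending and z-scoring steps described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the study's actual pipeline: the function name and argument layout are hypothetical, and we assume here that the covariate regression is fitted on the control group only, which the text does not specify.

```python
import numpy as np

def detrend_and_zscore(roi, covariates, is_control):
    """Regress covariates out of each ROI, then z-score against controls.

    roi        : (n_subjects, n_rois) regional volumes or thicknesses
    covariates : (n_subjects, n_cov), e.g. age, sex coded 0/1, ICV
    is_control : (n_subjects,) boolean mask, True where CDR == 0
    """
    roi = np.asarray(roi, dtype=float)
    is_control = np.asarray(is_control, dtype=bool)
    # Design matrix with an intercept column.
    X = np.column_stack([np.ones(len(roi)), np.asarray(covariates, dtype=float)])
    # ASSUMPTION: regression coefficients are estimated on controls only.
    beta, *_ = np.linalg.lstsq(X[is_control], roi[is_control], rcond=None)
    residuals = roi - X @ beta
    # Z-score each ROI relative to the control mean and standard deviation.
    mu = residuals[is_control].mean(axis=0)
    sd = residuals[is_control].std(axis=0, ddof=1)
    z = (residuals - mu) / sd
    # Flip the sign so that more severe atrophy gives a larger positive z-score.
    return -z
```

With this convention, a region whose adjusted volume lies two control standard deviations below the control mean receives $z = 2$.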

Because control group selection can influence the reference distribution, we performed z-scoring within each cohort separately rather than applying a universal control group. This within-cohort normalization ensures that subtype comparisons remain valid within each dataset, as it accounts for differences in control distributions across studies. This approach prevents biases that might arise from pooling control groups with different demographic and clinical characteristics, making the analysis more robust to inter-cohort variability.

While we acknowledge that W-scores, which directly adjust for covariates, are a more commonly used approach in similar studies,^33–35 we opted to use Z-scores after performing separate covariate detrending. This decision was made to remain consistent with the original SuStaIn study methodology, allowing for more direct comparison of findings.

SuStaIn model

SuStaIn, or the Subtype and Stage Inference algorithm,⁹ is a data-driven disease progression model designed to identify unique disease progression subgroups within populations by unraveling their phenotypic and temporal heterogeneity. It can employ multiple event-based disease progression models including a linear z-score model. The Z-score model characterizes disease progression as a linear increase in ROI abnormalities, measured in z-scores, i.e., standard deviations from the mean of the healthy (control) subjects. SuStaIn can work with purely cross-sectional data, creating a common timeframe divided into stages for the evolution of regional atrophy. We direct the reader to the original reference for more details.⁹

SuStaIn performs hierarchical clustering to group subjects with similar disease progression patterns into subtypes. It starts with all data in one cluster, estimating the disease sequence, then progressively subdivides into more subtypes, each time recalculating the sequence. For each cluster, SuStaIn estimates the proportion of subjects, the most probable z-score sequence for each ROI, and assigns each subject to the most probable stage in the sequence. The relative likelihood of each sequence is approximated by evaluating the probability of a number of possible sequences sampled using Markov Chain Monte Carlo (MCMC),³⁶ which also gives a visual and quantitative measure of the uncertainty associated with each subtype z-score ordering.

Two models were constructed for each cohort, one incorporating and the other excluding the control population in the model fitting. This yields a total of eight experiments, numbered as follows: models 1 and 2 were constructed on ANMerge 1.5T MRI data, with and without controls respectively, and similarly models 3 and 4 for the combined ADNI and ANMerge 1.5T MRI data, 5 and 6 for OASIS 3T MRI data, and finally 7 and 8 for the combined ADNI and OASIS 3T MRI data. Removing control subjects from the model fitting is an element of novelty of our study. The assumption is that if the control group is accurately defined, its inclusion or exclusion should not significantly impact SuStaIn's subtyping and staging, as by model specification we would expect almost all the controls to be below $z = 2$ in any given ROI.

To account for the fact that removing controls also reduces the available sample size, we carried out an additional subsampling analysis on the OASIS cohort, where this reduction was most evident. Five random subsamples were drawn from the full dataset using different random seeds, each with a total sample size equivalent to the patients-only cohort and maintaining the original control-to-patient ratio. SuStaIn was rerun on each subsample using identical model parameters. Subtype structures and progression sequences were compared against both the model incorporating and the model excluding the control population to assess subtype robustness as a function of sample composition versus sample size.
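The subsampling scheme can be illustrated as follows. This is a sketch under stated assumptions: the function name, the subject identifiers, and the seeds 0 to 4 are hypothetical, and only the OASIS group sizes (736 controls, 302 patients) are taken from Table 1.

```python
import random

def stratified_subsample(control_ids, patient_ids, seed):
    """Draw a subsample the size of the patients-only cohort that keeps
    the full cohort's control-to-patient ratio."""
    n_target = len(patient_ids)  # total size matches the patients-only cohort
    frac_controls = len(control_ids) / (len(control_ids) + len(patient_ids))
    n_controls = round(n_target * frac_controls)
    n_patients = n_target - n_controls
    rng = random.Random(seed)  # one independent generator per seed
    return rng.sample(control_ids, n_controls), rng.sample(patient_ids, n_patients)

# Hypothetical identifiers; group sizes follow the OASIS row of Table 1.
controls = [f"C{i}" for i in range(736)]
patients = [f"P{i}" for i in range(302)]

# Five subsamples, one per (hypothetical) random seed.
subsamples = [stratified_subsample(controls, patients, seed) for seed in range(5)]
```

Each subsample then contains 302 subjects in total, so any difference from the full-cohort model that persists across subsamples is more plausibly attributable to sample size than to the presence of controls.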

pySuStaIn settings. The SuStaIn algorithm is publicly available through the pySuStaIn software package at https://github.com/ucl-pond.

To construct a SuStaIn model, several parameters are established, but this discussion focuses on those altered from the original Z-score study.⁹ The z-score event thresholds used, i.e., the number of standard deviations away from the mean of the healthy controls which can be considered an abnormal event, were only 2 and 3, omitting z-score 1 to avoid erroneous conclusions on the severity of the atrophy. A z-score 2 event is the linear accumulation from any z-score up to z-score 2, which corresponds to a brain region reaching a value that is 2 standard deviations away from the control mean. Similarly, the z-score 3 event occurs when a region z-score linearly accumulates from z-score 2 to 3.
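As a small illustration of this setting, the event thresholds can be written as one row of z-score values per ROI. The variable names below are hypothetical; as we understand it, pySuStaIn's z-score model is configured with a per-biomarker threshold array of this kind, but the actual model call is not shown here.

```python
# The 14 regions of interest used to build the progression sequences.
ROIS = [
    "hippocampus", "amygdala", "nucleus accumbens", "insula", "cingulate",
    "caudate", "pallidum", "putamen", "thalamus", "entorhinal cortex",
    "frontal lobe", "temporal lobe", "occipital lobe", "parietal lobe",
]

# Only the z = 2 and z = 3 events are modeled; z = 1 is omitted.
Z_EVENTS = (2, 3)

# One row of thresholds per ROI (a 14 x 2 table).
z_vals = [list(Z_EVENTS) for _ in ROIS]

# Each (ROI, threshold) pair is one event, hence one stage of the sequence.
n_stages = sum(len(row) for row in z_vals)
```

With two events for each of the 14 ROIs, every subtype sequence has 28 stages in addition to Stage 0.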

Cross-validation. As SuStaIn proceeds hierarchically, when running it with $N_{\max}$ maximum subtypes, it also constructs the models with fewer subtypes (up to $N_{\max} - 1$). For each SuStaIn model (for each number of subtypes up to the pre-defined maximum), 10-fold cross-validation is used to capture model uncertainty. The test set log-likelihood and the Cross Validation Information Criterion (CVIC)³⁷ are then used for model selection, including choosing the best number of clusters, aiming for consistent log-likelihood increases and lower CVIC for optimized model complexity and accuracy balance. Following the original SuStaIn study,⁹ we prioritized simpler models when the evidence for a more complex model was weak, defined as a CVIC difference $< 6$ or an out-of-sample log-likelihood difference $< 3$ from the minimum, to minimize overfitting.

The number of maximum clusters per model was initially set to 4, with adjustments made after inspecting output visualizations, test set log-likelihood and CVIC to choose the best subtype model.
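The CVIC part of this selection rule can be sketched as below. The function name is hypothetical and the CVIC values are invented purely for illustration; the sketch also covers only the CVIC criterion, whereas the study additionally inspected test set log-likelihoods and output visualizations.

```python
def select_n_subtypes(cvic, threshold=6.0):
    """Pick the simplest model with only weak evidence against it.

    cvic      : dict mapping number of subtypes -> CVIC (lower is better)
    threshold : CVIC difference below which evidence for the more
                complex model is treated as weak
    """
    best = min(cvic.values())
    # Smallest number of subtypes whose CVIC is within `threshold` of the minimum.
    return min(n for n, value in cvic.items() if value - best < threshold)

# Hypothetical CVIC values, not taken from the study.
cvic_example = {1: 1200.0, 2: 1100.0, 3: 1050.0, 4: 1047.0}
chosen = select_n_subtypes(cvic_example)
```

In this invented example the 4-subtype model has the lowest CVIC, but it improves on the 3-subtype model by only 3 units, so the 3-subtype model is preferred.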

Subtype and stage analysis

For each subtype in each model, SuStaIn outputs one Positional Variance Diagram (PVD) which summarizes the inferred sequence of disease progression and its uncertainty. Each sequence is divided into 28 stages (14 ROIs $\times$ 2 z-score events). Additionally, individuals with no abnormality (according to the model) in all 14 examined brain regions are allocated to Stage 0. In the original Z-score study,⁹ three neuroanatomical AD subtypes, namely the Typical, Cortical and Subcortical, were found when modeling data from the ADNI. SuStaIn revealed that for the Typical subtype, atrophy started in the amygdala and hippocampus; for the Cortical subtype, it began in the cingulate, insula, and nucleus accumbens; for the Subcortical subtype in the caudate, pallidum, putamen, and nucleus accumbens. These three clusters emerged from both 1.5T and 3T MRI data. Additionally, since we also consider the entorhinal cortex, we acknowledge its early involvement in the Typical subtype, in line with existing literature on AD progression.^38–40

We qualitatively assessed whether each subtype could be identified as Typical, Cortical, or Subcortical by examining the regions exhibiting early severe atrophy in the PVDs. We then also assessed late-stage atrophy to provide a richer characterization. When comparing subtypes between different models, the number of cross-validated subtypes and the MRI field strength (either 1.5T or 3T) were also considered, the latter specifically to account for potential variance in smaller brain regions.

Cognitive assessments analysis. For specific atrophy patterns, we explored cognitive assessment scores of the corresponding individuals. This was primarily performed for the ANMerge cohort, where a broad collection of neurocognitive and psychological assessments is available, including single question or single task scores, which we divided into different domains as shown in Supplemental Table 2. The following tests were analyzed: CDR, MMSE score; Alzheimer's Disease Assessment Scale-Cognitive Subscale (ADAS-Cog); Geriatric Depression Scale (GDS); and Alzheimer's Disease Cooperative Study Activities of Daily Living Scale (ADCS-ADL). In the OASIS database, only summative scores of CDR and MMSE were available. For combined cohorts this investigation was not possible due to inconsistencies in available data and protocols.

Statistical analysis

We statistically assessed the similarity and differences between subtypes across cohorts. Pairwise Kendall's tau distances of the ROI z-score sequences were used to quantify cross-cohort similarity of the Typical, Cortical, and Subcortical subtypes. Qualitatively, a Kendall rank correlation coefficient greater than 0.50 suggests strong similarity, while a coefficient below 0.30 suggests weak similarity.⁴¹ Additional model and subtype comparisons included ANOVA, two-proportions z-tests, and pairwise t-tests with Bonferroni correction (for the number of subtypes) across demographic, cognitive, genetic, and CSF features. ANOVA was used to compare means among subtypes, such as age or MMSE score. For a more targeted pairwise comparison, ensuring we account for multiple comparisons and control the family-wise error rate, we used pairwise t-tests coupled with a Bonferroni correction. Finally, when it came to evaluating proportional differences, especially when the data was categorical, the two-proportion z-test was used.
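Two of these statistics are simple enough to sketch with the Python standard library. The functions below are illustrative only: the names are hypothetical, the Kendall coefficient is the tie-free form appropriate for comparing two orderings of the same events, and the z-test uses the standard pooled-variance formulation with a two-sided p-value. The example sequences and counts are invented and do not come from the study.

```python
import math
from itertools import combinations

def kendall_tau(seq_a, seq_b):
    """Kendall rank correlation between two orderings of the same items (no ties)."""
    pos_a = {item: i for i, item in enumerate(seq_a)}
    pos_b = {item: i for i, item in enumerate(seq_b)}
    concordant = discordant = 0
    for x, y in combinations(seq_a, 2):
        # A pair is concordant when both sequences order it the same way.
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) > 0:
            concordant += 1
        else:
            discordant += 1
    return (concordant - discordant) / (concordant + discordant)

def two_proportion_ztest(x1, n1, x2, n2):
    """Two-sided z-test for the difference between two proportions."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p_value

# Invented event orderings for two cohorts.
order_a = ["hippocampus", "amygdala", "entorhinal", "temporal", "insula"]
order_b = ["amygdala", "hippocampus", "entorhinal", "insula", "temporal"]
tau = kendall_tau(order_a, order_b)  # 8 concordant, 2 discordant pairs -> 0.6

# Invented counts: 27 of 100 versus 6 of 100 subjects.
z_stat, p_val = two_proportion_ztest(27, 100, 6, 100)
```

Under the qualitative thresholds quoted above, the invented pair of orderings would count as strongly similar because its coefficient exceeds 0.50.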

Results

Table 2 provides an overview of all eight models, constructed from the four cohorts drawn from ADNI, ANMerge, and OASIS (1.5T or 3.0T), each fitted with and without controls. Supplemental Figures 9–16 show the PVDs¹¹ for the cross-validated subtype atrophy progression patterns of each experiment. Overall, the three original subtypes of SuStaIn discovered in ADNI data were replicated across most cohorts.⁹ Additionally, the cross-validated models also included mixed clusters with multiple patterns of atrophy, clusters made of outliers with no significant atrophy pattern, and a subtype exhibiting strong and early atrophy in the posterior cortices, i.e., occipital and parietal lobes, which resembled PCA both in its atrophy pattern and in its cognitive test results. The exclusion of control data in model fitting often resulted in the Subcortical and/or Cortical subtype disappearing, suggesting the need for longitudinal validation of the subtypes. In the subsequent sections, we elaborate on each of them individually. For clarity, we will use “clusters” to refer to subgroups identified by the algorithm, and “subtypes” when these groups have been assigned a clinical interpretation.

Table 2.

Summary of the SuStaIn subtypes in the constructed models.

		With Controls				Without Controls
Cohort	MFS	X	Subtypes	$N$	CVIC	X	Subtypes	$N$	CVIC
ANMerge	1.5T	1	Typical	3	42,869	2	Typical	3	30,288
			Cortical				Cortical
			PCA subtype				PCA subtype
ANMerge + ADNI 1.5T	1.5T	3	Typical	4	94,147	4	Typical	3	66,318
			Cortical				Cortical
			Subcortical				Mixed
			Mixed
OASIS	3.0T	5	Typical	3	46,413	6	Typical	2	12,990
			Cortical				Outliers
			Subcortical
OASIS + ADNI 3.0T	3.0T	7	Typical	3	63,778	8	Typical	2	25,043
			Cortical				Outliers
			Subcortical

The three original subtypes are Typical, Cortical, and Subcortical.⁹ Additionally, we refer to a Mixed subtype when several atrophy patterns were identified within the same disease progression sequence; Outliers instead indicates a subtype with no recognizable atrophy pattern. The CVIC column reports the optimum value found for the corresponding number of subtypes $N$ in the model.

Abbreviations: MFS, magnetic field strength; X, experiment number; $N$ , number of cross-validated subtypes; CVIC, cross validation information criterion.

SuStaIn atrophy models

1.5T atrophy models, ANMerge dataset. Figure 1 shows PVDs for the cross-validated 3-subtype atrophy progression pattern estimated from 1.5T MRI in CN+MCI+AD participants from the ANMerge study, along with the CVIC model comparison plot (lower right panel). We can recognize the Typical subtype with severe and early atrophy affecting the hippocampus and amygdala, as well as entorhinal cortex, and the Cortical subtype with atrophy starting in the insula and cingulate. The third subtype initially resembles typical AD atrophy, followed by notably earlier atrophy in the posterior cortices, i.e., occipital and parietal lobes. Given the prominent involvement of the occipital and parietal lobes, this group is anatomically consistent with a posterior-dominant atrophy pattern. We refer to it as the “suspected PCA” subtype to acknowledge its similarity to posterior cortical atrophy presentations, while also recognizing that the presence of early MTL involvement makes strict syndromic classification uncertain.

Statistical comparison of demographic and cognitive outcomes across these subtypes revealed that nearly 27% of the subjects in the suspected PCA subtype were 65 years old or younger, which is significantly higher than the 6% and 2% for the Typical and Cortical subtypes, respectively (two-proportions z-test, $p ≪ 0.05$). Additionally, pairwise t-tests with a Bonferroni correction on mean age suggest that subjects assigned to the suspected PCA subtype are younger than those assigned to the Typical and Cortical subtypes ($p \sim 0.03$). In agreement with clinical knowledge on parieto-occipital atrophy, measures of visual capabilities (extracted from MMSE and ADAS-Cog) are worse for these individuals (two-proportions z-test, $p < 0.05$): more than 80% of individuals in the suspected PCA subtype incorrectly reproduced a drawing shown to them, and more than 60% could name at most one of the objects shown to them.
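As an illustration of the two-proportions z-test used above, the following minimal Python sketch implements the pooled, two-sided version of the test. The function name and the counts are hypothetical: subtype sample sizes for this comparison are not given in this section, so the numbers only mirror the reported percentages and are not the study data. Whether the authors used the pooled variant is also an assumption.

```python
import math


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Pooled, two-sided two-proportion z-test. Returns (z, p_value)."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    # Pooled proportion under the null hypothesis that both groups share one rate.
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts (NOT from the paper): 27 of 100 versus 6 of 100
# subjects aged 65 or younger in two subtypes.
z, p = two_proportion_z_test(27, 100, 6, 100)
print(f"z = {z:.2f}, p = {p:.2g}")
```

With several pairwise comparisons, the resulting p-values would normally be corrected for multiple testing, as the text does for the age t-tests.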

The Subcortical subtype found in previous analyses of 3T MRI data was undetected in the 1.5T ANMerge data; however, after adjusting the model to include the z-score 1 event threshold, as per the original z-score study, nearly 30% of subjects exhibited early z-score 1 atrophy in subcortical regions.

Supplemental Figure 10 shows the three cross-validated clusters when controls were not included in model fitting. The same three atrophy subtypes were produced, including an even earlier atrophy in the occipital and parietal lobes for the suspected PCA subtype. Again, individuals assigned to this subtype were younger on average than those assigned to the typical and/or cortical subtypes (ANOVA test, $p ≪ 0.05$ ). The suspected PCA subtype also displayed exacerbated difficulties in calculation, copying drawings, and naming objects that were shown to them (two-proportions z-test, $p < 0.001$ versus Cortical subtype and $p ≪ 0.0001$ versus Typical subtype).

1.5T atrophy models, ADNI and ANMerge combined. Figure 2 shows the resulting PVD for the model trained on the combined 1.5T MRI data from ADNI and ANMerge, and we provide grid PVDs in Supplemental Figure 11. Four subtypes were cross-validated, three of which broadly replicate the original findings—Typical, Cortical, Subcortical—with addition of a mixed Typical/Cortical subtype.⁹ Beyond the expected atrophies from the original SuStaIn model, the Typical subtype exhibited frontal lobe abnormality, the Cortical subtype included fronto-temporal abnormalities, and the Subcortical subtype also involved early abnormality in the thalamus, hippocampus, and amygdala. The fourth subtype showed a typical/cortical pattern of atrophies, but no lobe was affected by severe shrinking ( $z = 3$ events).

Supplemental Figure 12 shows the three cross-validated subtypes without controls in the model fitting. The clusters are similar to the disease sequences constructed with controls, excluding the Subcortical subtype which was not found.

3.0T atrophy models, OASIS dataset. Supplemental Figure 13 shows the results of experiments on 3T MRI data from the OASIS study. Broadly speaking, the three subtypes replicated the original finding, but with high positional variance/uncertainty (blurriness).⁹ Notable differences with the original findings include earlier involvement of the frontal lobe in the Typical subtype, and early severe atrophy of the caudate in the Subcortical subtype. Without controls in the model fitting, cross-validation suggested only two subtypes but the difference was marginal and fitting a 3-subtype model without controls revealed again the Subcortical subtype with remarkably abnormal atrophies in caudate and putamen, as shown in Supplemental Figure 14.

3.0T atrophy models, ADNI and OASIS combined. Figure 3 shows the results of experiments on 3.0T MRI data from the ADNI and OASIS studies. Cross-validation supported a 3-subtype model, which broadly reproduced the three original subtypes.⁹ Notably, the Typical pattern additionally exhibited early and severe frontal atrophy, and the Subcortical pattern featured early involvement of the caudate. We provide grid PVDs in Supplemental Figure 15. Supplemental Figure 16 shows the resulting subtypes with controls excluded from model fitting. In this case, cross-validation supported a 2-subtype model having high positional variance/uncertainty (blurriness), with more than 95% of the cohort staged earlier than stage 9, leaving only data from a small number of individuals from which to estimate the later parts of each subtype sequence. Only the Typical subtype was clearly identified; the second subtype exhibited mild atrophy (z-score 2) in various regions and severe atrophy (z-score 3) in the caudate.

Similarity between subtypes across models

Table 2 shows that the subtypes from the original SuStaIn study were successfully replicated in distinct datasets. Supplemental Figures 23 and 24 summarize the pairwise Kendall's tau correlation coefficients for the atrophy sequences of the Typical, Cortical, and Subcortical subtypes identified in each experiment. We divided the groups by MRI field strength, with a 1.5T group including Experiments 1 to 4, and a 3T group comprising Experiments 5 to 8. Agreement between the atrophy sequences for the same subtype was generally very high, especially within individual databases and within cohorts (with/without controls). The Typical subtype emerged across all eight models and consistently manifested late atrophies in cortical regions, demonstrating the highest level of concordance in the progression sequence across all cohorts. The Cortical subtype appeared in six of the eight models and often progressed into a pattern similar to the Typical subtype, except for the OASIS cohorts and the ADNI 3T/OASIS cohorts, where the putamen was also involved at comparable stages. We also identified similarities between the Typical and Cortical subtypes, particularly their convergence on late-stage cortical atrophy, which suggests an overlap in their progression patterns. The lowest concordance was evident between models using entirely different datasets and contrasting approaches to the inclusion or exclusion of control subjects, such as between the combined OASIS and ADNI 3.0T model and the combined ANMerge and ADNI 1.5T model without controls. The Subcortical subtype emerged in only three of the eight models and was notably affected by the removal of control subjects, the magnetic field strength of the MRI scans, and the decision to omit the z-score 1 event. This subtype was markedly distinct from the Typical and Cortical subtypes, underscoring its unique progression pattern characterized by early subcortical involvement and a less pronounced cortical trajectory.
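To make the sequence comparison concrete, the sketch below computes Kendall's tau between two orderings of the same set of events by counting concordant and discordant pairs. It is a simplified illustration, not the authors' pipeline: it assumes each model's sequence is a single ordered list without ties, ignores SuStaIn's positional uncertainty, and uses hypothetical region names and orderings.

```python
from itertools import combinations


def kendall_tau(seq_a, seq_b):
    """Kendall's tau between two orderings of the same events (no ties)."""
    if len(seq_a) < 2 or len(set(seq_a)) != len(seq_a) or len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length >= 2 and unique events")
    if set(seq_a) != set(seq_b):
        raise ValueError("sequences must contain the same events")
    # Position of each event in each sequence.
    pos_a = {event: i for i, event in enumerate(seq_a)}
    pos_b = {event: i for i, event in enumerate(seq_b)}
    concordant = discordant = 0
    for x, y in combinations(seq_a, 2):
        # A pair is concordant when both sequences order x and y the same way.
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) > 0:
            concordant += 1
        else:
            discordant += 1
    n_pairs = len(seq_a) * (len(seq_a) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical sequences (illustrative only): one adjacent swap among four events.
model_1 = ["hippocampus", "amygdala", "entorhinal", "insula"]
model_2 = ["amygdala", "hippocampus", "entorhinal", "insula"]
print(round(kendall_tau(model_1, model_2), 3))  # 0.667
```

A value of 1 indicates identical orderings and a value of -1 a fully reversed ordering.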

Table 3 summarizes demographic, clinical, biological and genetic variables across subtypes in each cohort, highlighting statistical differences or similarities. In all cohorts, the proportion of control subjects participating in the disease progression sequence (i.e., stage > 0) is significantly higher in the Cortical and Subcortical subtypes, whereas the proportion of subjects genetically more susceptible to AD, i.e., carrying at least one APOE $ε 4$ allele, is significantly higher in the Typical subtype. We do not consistently observe significant differences in demographic variables such as age and sex. Importantly, despite the imbalance in the proportion of controls and in genetic AD susceptibility, we did not observe consistent significant differences in cognition as measured by the MMSE.

Table 3.

Descriptive statistics of demographic, genetic, and cognitive variables for individuals assigned to each subtype in each cohort when including controls in model fitting.

Cohort	Subtype	$n$	Control	APOE $ε 4$	Sex (% F)	Age (y, mean ± s.d.)	MMSE (mean ± s.d.)	Stage (mean ± s.d.)
ANMerge	Typical	319	4%^a	66%^a	69%^a	75.7 ± 6.3^a	22.9 ± 4.9	4.4 ± 3.9^a
	Cortical	123	22%^a	41%^a	54%^a	74.3 ± 5.4^a	23.9 ± 5.8	5.8 ± 5.2^a
	Subcortical	-	-	-	-	-	-	-
OASIS	Typical	151	34%^a,b	58%^a	52%	71.5 ± 3.4	25.1 ± 4.4^a,b	3.4 ± 3.4^a,b
	Cortical	133	58%^a,c	38%^a	57%	69.5 ± 1.8	27.1 ± 3.6^a,c	2.4 ± 1.9^a
	Subcortical	54	72%^b,c	50%	50%	69.2 ± 1.6	28.3 ± 2.4^b,c	1.9 ± 1.6^b
A1.5/A	Typical	613	4%^a,b	61%^a,b	55%	74.4 ± 6.9^a,b	24.2 ± 4.4	4.0 ± 3.4
	Cortical	185	22%^a	31%^a,c	58%	77.0 ± 5.5^a,c	24.0 ± 5.5	4.5 ± 4.5
	Subcortical	170	26%^b	47%^b,c	50%	71.0 ± 7.5^b,c	25.1 ± 4.6	4.1 ± 3.6
A3/O	Typical	301	21%^a,b	56%^a	53%^b	72.3 ± 8.7	25.9 ± 3.1	2.8 ± 2.6^b
	Cortical	134	58%^a	38%^a	57%^c	71.0 ± 8.1	28.1 ± 2.4	2.4 ± 1.9
	Subcortical	57	60%^b	47%	37%^b,c	72.2 ± 7.7	27.9 ± 2.7	1.9 ± 1.6^b

We removed the subjects assigned to Stage 0 as those are not participating in the disease progression sequence. It is worth noting that we did not include the posterior cortical atrophy subtype found in ANMerge data.

^a Indicates a significant difference ($p < 0.05$) between the Typical and Cortical subtypes.

^b Indicates a significant difference ($p < 0.05$) between the Typical and Subcortical subtypes.

^c Indicates a significant difference ($p < 0.05$) between the Cortical and Subcortical subtypes.

Abbreviations: n, number of subjects assigned to the subtype; Control, percentage of subjects assigned to the subtype with $\mathrm{CDR} = 0$; APOE $ε 4$, percentage of subjects assigned to the subtype with at least one $ε 4$ allele; M, male; F, female; MMSE, Mini-Mental State Examination.

Supplemental Figures 17 and 18 show the probability of subtype assignment for each subject, indicative of SuStaIn's confidence, across each cohort, separated into controls and non-controls. Subjects are generally strongly assigned to the Typical subtype, as expected. Additionally, subtype assignment was considerably less confident in the OASIS data, especially controls.

Supplemental Figures 21 and 22 present confusion matrices illustrating the consistency of subtype assignments across overlapping cohorts. These matrices provide insight into how reliably individuals are classified into the same subtype when analyzed under different models, particularly when control populations are included or excluded. The results indicate that the Typical subtype shows very high consistency, the Cortical subtype moderately high consistency, and the Subcortical subtype more variation. For the Subcortical subtype in particular, many individuals convert to Stage 0, in line with the lower assignment probabilities shown in Supplemental Figures 17 and 18.
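The kind of cross-model confusion matrix described above can be sketched as follows. This is an illustrative sketch only: the subject identifiers, labels, and function names are hypothetical, and "Stage 0" is treated here as just another label for subjects not placed on any progression sequence.

```python
from collections import Counter


def assignment_confusion(labels_model_a, labels_model_b):
    """Count subjects by (label in model A, label in model B).

    Only subjects present in both models are compared.
    """
    shared = labels_model_a.keys() & labels_model_b.keys()
    return Counter((labels_model_a[s], labels_model_b[s]) for s in shared)


def agreement_rate(confusion, label):
    """Fraction of subjects with `label` in model A that keep it in model B."""
    row_total = sum(count for (a, _), count in confusion.items() if a == label)
    return confusion[(label, label)] / row_total if row_total else float("nan")


# Hypothetical assignments (NOT study data).
with_controls = {"s1": "Typical", "s2": "Typical", "s3": "Subcortical",
                 "s4": "Subcortical", "s5": "Cortical"}
without_controls = {"s1": "Typical", "s2": "Typical", "s3": "Stage 0",
                    "s4": "Subcortical", "s5": "Cortical", "s6": "Typical"}

confusion = assignment_confusion(with_controls, without_controls)
print(agreement_rate(confusion, "Typical"))      # 1.0
print(agreement_rate(confusion, "Subcortical"))  # 0.5
```

Off-diagonal cells such as ("Subcortical", "Stage 0") correspond to the conversions to Stage 0 noted in the text.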

Effect of control inclusion on subtype identification

The inclusion or exclusion of cognitively normal controls during model fitting led to notable differences in the identification and stability of atrophy subtypes across cohorts. The Typical subtype remained consistently identified in all models, irrespective of the presence of controls, indicating its robustness as the dominant atrophy pattern in AD progression. In contrast, the Cortical subtype exhibited variability depending on the dataset and inclusion of controls. In the ANMerge and combined ADNI-ANMerge models, the Cortical subtype was identified in both conditions, suggesting relative stability. However, in the OASIS-based cohorts, the Cortical subtype was absent when controls were removed, and the model instead produced a more diffuse clustering of subjects. Many of those controls also had high cortical atrophy z-scores (Supplemental Figure 19), indicating detectable atrophy patterns. This was also true for the Subcortical subtype (Supplemental Figure 20), which did not emerge in any cohort when controls were excluded.

The observed differences were also reflected in the similarity between subtypes across conditions. Supplemental Figures 23 and 24 demonstrate that for OASIS-based cohorts, the similarity between a subtype identified in the full cohort and its counterpart in the control-excluded model was consistently lower than in ANMerge-based cohorts.

To evaluate whether the changes in subtype structure observed upon removal of controls were driven primarily by sample size or by clinical composition, we conducted a subsampling analysis using the OASIS cohort, which was most impacted by control exclusion. Five random subsamples were drawn using different seeds, each matched in total sample size to the patients-only cohort ( $n \approx 300$ ), while preserving the original 7:3 control-to-patient ratio. SuStaIn was rerun independently on each of these subsamples, and the resulting subtype progression patterns are shown in Supplemental Figure 25.
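The subsampling design described above can be sketched as a seeded, stratified draw. This is a minimal illustration under stated assumptions: the function name, the synthetic cohort of 700 controls and 300 patients, and the seeds are hypothetical, and the SuStaIn fit that would follow each draw is not shown.

```python
import random


def stratified_subsample(subject_ids, is_control, total_n, control_fraction, seed):
    """Draw `total_n` subjects while fixing the fraction of controls."""
    rng = random.Random(seed)  # separate generator per seed for reproducibility
    controls = [s for s in subject_ids if is_control[s]]
    patients = [s for s in subject_ids if not is_control[s]]
    n_controls = round(total_n * control_fraction)
    n_patients = total_n - n_controls
    # Sampling without replacement within each diagnostic group.
    return rng.sample(controls, n_controls) + rng.sample(patients, n_patients)


# Hypothetical cohort (NOT the OASIS data): 700 controls and 300 patients.
subject_ids = [f"sub-{i:04d}" for i in range(1000)]
is_control = {s: i < 700 for i, s in enumerate(subject_ids)}

# Five subsamples of 300 subjects with a 7:3 control-to-patient ratio.
subsamples = [
    stratified_subsample(subject_ids, is_control, 300, 0.7, seed)
    for seed in range(5)
]
for sample in subsamples:
    n_ctrl = sum(is_control[s] for s in sample)
    print(len(sample), n_ctrl, len(sample) - n_ctrl)  # 300 210 90
```

Holding the total size equal to the patients-only cohort while keeping the control fraction fixed is what allows sample size and diagnostic composition to be separated.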

Across all five subsamples, the Typical, Subcortical, and Cortical subtypes were consistently reproduced, with stable early atrophy profiles and comparable subtype fractions. Notably, the Cortical and Subcortical subtypes, absent in the patients-only models, re-emerged in all five subsampled models. When visually comparing the PVDs in Supplemental Figure 25 to those in Supplemental Figures 13 (full cohort with controls) and 14 (without controls), the subsampled models show a closer resemblance to the models that included controls. Specifically, we observe early and strong atrophy in the Cingulate and Insula for the Cortical subtype, and in the Pallidum, Putamen, and Caudate for the Subcortical subtype.

While there is increased uncertainty in event ordering, as shown in the greater blurriness in the PVDs, likely due to the reduced sample size, the core atrophy patterns and subtype assignment distributions remain consistent across subsamples. These results confirm that sample size plays a role in subtype robustness, but also demonstrate that diagnostic composition, particularly the presence of cognitively normal individuals with subtle or early atrophy, is a key factor in enabling SuStaIn to detect early-stage or atypical AD subtypes.

Discussion

This study was designed to investigate the reproducibility of AD atrophy subtypes identified in the original SuStaIn study, as a function of MRI scanner field strength, data source (study), and whether controls were included in the model fitting.⁹

Our results show that the three original atrophy subtypes discovered by SuStaIn—Typical, Cortical, and Subcortical—consistently emerged across datasets and scanner field strength, with some variation depending on whether controls were included in the model fitting.

The Typical subtype, characterized by early atrophy in the hippocampus and amygdala, aligns with the classic AD pattern often described in literature.^9–38 This was the largest subtype across all datasets (Table 3) and demonstrated the highest level of assignment consistency and confidence across cohorts (Supplemental Figures 17 and 18). Notably, the Typical subtype was always assigned the smallest fraction of control subjects and the highest fraction of APOE $ε 4$ allele carriers (Table 3), consistent with a canonical manifestation of AD. In terms of agreement across cohorts, both the presence of the Typical subtype and its internal consistency within cohorts were noteworthy. Kendall's tau calculations (Supplemental Figures 23 and 24) indicated a high degree of intra-subtype inter-cohort similarity, with almost perfect sequence concordance between the same cohort including or excluding the control population.

The Subcortical subtype presented a more complex picture. While it generally showed similarity with the corresponding cluster defined in the original SuStaIn work, it also paralleled the hippocampal-sparing variant of AD described in prior research,^6–43 partly explaining why it consistently included the highest representation of control subjects. Moreover, it was characterized by a higher prevalence of male subjects (Table 3), which is in line with previous research.^5–22 Nonetheless, an anomaly was observed whereby the ANMerge data did not support a Subcortical subtype under cross-validation (although this was a borderline result and model comparison using CVIC, or any other criterion, is not perfect). There are two possible explanations for this. First, the limitations of 1.5T MRI scans may have reduced the signal-to-noise ratio for the small volumetric variations in subcortical regions. Second, model hyperparameters have a significant impact: when adding a subtle atrophy event of $z = 1$ (as in the original SuStaIn analysis),⁹ the Subcortical subtype returned. This implies that subcortical atrophy in AD variants is more subtle than in other affected regions. Across cohorts, Kendall's tau (Supplemental Figures 23 and 24) suggests a high degree of intra-subtype inter-cohort similarity for the models supporting a Subcortical subtype.

Finally, the Cortical subtype was found in most models and always mimicked the disease pattern found in the original study,⁹ characterized by early insula and cingulate atrophy. The Cortical subtype consistently included the smallest fraction of APOE $ε 4$ allele carriers, and a higher proportion of controls, similar to the Subcortical subtype (Table 3). Notably in the OASIS models, the cortical pattern was not as evident (lower, albeit still moderately strong, intra-subtype inter-cohort similarity: Supplemental Figure 25) and seemed to be driven by data from controls. When the control population was removed, the Cortical subtype disappeared.

Our findings can be contextualized alongside Vogel et al. (2020),⁴⁴ who applied SuStaIn to tau-PET imaging and identified four tau deposition subtypes: limbic-predominant, medial temporal lobe (MTL)-sparing, posterior, and lateral temporal. Their limbic-predominant and posterior subtypes align with our Typical and PCA-like subtypes, while their MTL-sparing subtype shares features with our Cortical subtype. However, a key distinction is that Vogel et al. focused on tau pathology, whereas we used MRI-based atrophy, which may explain differences, particularly for the Subcortical subtype, which lacks a clear counterpart in their study.

Importantly, tau aggregation often precedes neurodegeneration, meaning that tau deposition patterns do not necessarily correlate directly with atrophy patterns at all disease stages.⁴⁵ Selective vulnerability of neuronal populations and regional differences in resistance to neurodegeneration likely contribute to these discrepancies. Additionally, studies have shown that the relationship between tau burden and cortical atrophy is nonlinear, with relatively weak correlations between tau-PET and cortical thickness until tau levels are high.⁴⁶ Notably, the strongest tau-thickness correlations occur only at high cortical tau burdens,⁴⁶ which tend to appear later in disease progression, when subtypes begin to converge and look more similar, as observed in Vogel et al.⁴⁴ This suggests that early-stage differences in tau deposition may not always translate into distinct atrophy subtypes, particularly in the Subcortical subtype, which is more sensitive to methodological and cohort variations. Future work integrating tau-PET and MRI-based SuStaIn models could provide a more comprehensive view of AD subtypes, refining how we interpret the progression from tau accumulation to structural neurodegeneration.

With these subtype patterns established, we next examined how the inclusion of cognitively normal controls influenced model stability and subtype reproducibility across datasets. The exclusion of control data from model fitting had varying impacts on the three original atrophy subtypes. Controls, by definition, should not exhibit AD-related atrophy, yet their inclusion in the model might provide insights into early-stage disease progression if they are in fact pre-symptomatic AD cases. The key concern, though, is whether the presence of controls could inadvertently incorporate the effects of normal aging into the AD subtypes, as it is well documented that brain volumes, particularly cortical volumes, decrease with age in cognitively unimpaired older adults.^47–48 Our results suggest that the effect of including data from controls in AD progression model fitting is more complex than merely introducing normal aging patterns. We highlight one key observation per subtype.

The Typical subtype was unaffected and appeared in all models. This is unsurprising as the subtype was driven mostly by patient data even when control data was included in model fitting, and all studies were enriched for the typical, memory-led clinical phenotype of AD. The Cortical subtype appeared consistently in 1.5T non-control data from ANMerge and ADNI, but not in the 3T non-control data that was predominantly from OASIS. This is plausibly explained by OASIS's relative enrichment for preclinical individuals,²³ with a substantial 71% of participants having a CDR score of 0, combined with the relatively stringent minimum atrophy z-score of 2 in our modeling. Moreover, Supplemental Figure 19 demonstrates that controls showed pronounced atrophies, underscoring the significance of their contribution to the sequence. Finally, the Subcortical subtype disappeared altogether when control data was excluded from model fitting. There are multiple layers to this. First, the controls ($\mathrm{CDR} = 0$) apparently driving the Subcortical subtype showed pronounced atrophies, with z-scores exceeding 2, akin to non-controls (Supplemental Figure 20). Second, the Subcortical subtype might be expected to have an atypical clinical phenotype (at least early on), leading to under-representation among the non-control samples in these memory-enriched studies. This could render clustering analyses such as ours underpowered.

Focusing on the OASIS models, which were significantly impacted by the removal of control subjects, reveals one more insight. Notably, 60% of controls carried the $ε 4$ allele of the APOE gene, linked to higher AD risk, significantly differing from the 20% in other databases ( $p ≪ 0.05$ ). These individuals could be preclinical AD cases, thereby influencing the model outcomes when removed prior to fitting.

Based on these observations, we argue that including control subjects in AD models does more than merely introduce normal aging patterns. This is particularly pertinent for subcortical and cortical atrophy, which may not immediately affect memory and cognitive scores like the CDR.⁴⁹ However, individuals in the Cortical and Subcortical subtypes would eventually manifest typical AD-related atrophy and subsequent cognitive decline. If precision in defining control subjects is limited, their inclusion might still be beneficial for early AD pathology detection, outweighing risks of normal aging contamination. This is vital for early detection and intervention in AD, where neuropathological changes often precede cognitive symptoms by years,⁵⁰ and early-stage interventions, like anti-amyloid therapies, have shown increased effectiveness.^51–52 However, while our findings highlight the potential of SuStaIn in detecting early AD-related atrophy patterns, our current models are not optimized for identifying preclinical AD cases, as the chosen $z = 2$ threshold likely prevents subtype assignment in asymptomatic individuals.

Additionally, we note that the impact of control exclusion on subtype stability may not solely reflect clinical diagnosis differences, but could also stem from underlying sample size effects, especially in the OASIS cohorts, where controls constitute the majority. To further disentangle the effects of clinical composition and sample size, we conducted a subsampling analysis. By generating matched-size subsamples that preserved the original control-to-patient ratio, we demonstrated that subtype stability, particularly for the Subcortical and Cortical subtypes, was largely recovered. This indicates that the loss of these subtypes in the patients-only models cannot be attributed solely to reduced sample size, but mostly reflects the exclusion of individuals with early or subclinical atrophy. These findings reinforce the value of diverse diagnostic composition in training robust disease progression models and support the potential of SuStaIn for detecting early and atypical AD-related atrophy patterns.

The analysis of the ANMerge cohort revealed a distinct subtype characterized by prominent atrophy in the occipital and parietal lobes, resonating with PCA, a variant of AD.^53–54 Consistent with the neurodegeneration, PCA manifests itself mainly with different forms of visual impairments, as well as difficulties in mathematical calculation and spelling. Moreover, PCA has a much earlier onset than normal AD as it commonly occurs between 50–65 years of age. Our findings align with these characteristics, as the suspected PCA cluster individuals performed significantly worse in tasks typically associated with PCA manifestations, with impairments becoming even more evident in non-control subjects. Furthermore, the mean age of the suspected PCA cluster was significantly lower than the others, and, in the model not including the control population, the majority of the subjects (55%) were 65 years old or younger. These findings support the hypothesis that the cluster found in ANMerge can be linked to PCA. SuStaIn's ability to discern this atypical subtype could prompt clinicians to conduct targeted clinical assessments alongside standard (memory-led) ones, which might fall short or misinterpret atypical AD and other neurodegenerative diseases.⁵⁵

By examining SuStaIn subtypes across structurally different datasets, we identified which cohort-specific factors influence subtype expression and which aspects remain robust despite dataset variability. Recruitment strategies and diagnostic frameworks impacted subtype composition, with ADNI's structured protocol yielding more consistent subtypes, while OASIS and ANMerge, with more heterogeneous diagnostic criteria, introduced greater variability. Geographic differences (e.g., ANMerge's European data versus North American cohorts) and imaging protocols also influenced subtype stability, particularly for Cortical and Subcortical subtypes. Despite these differences, the Typical subtype consistently emerged, suggesting core AD-related neurodegeneration is reproducible across datasets. However, precisely identifying which aspects of the recruitment protocols or geographical differences drive this variability remains challenging and could be explored in future studies. These findings highlight the importance of dataset composition in SuStaIn-based subtyping and the need for further validation in diverse, real-world populations.

We highlight some strengths and limitations of our study. We assessed the SuStaIn model and AD models under different scenarios, thereby evaluating their reproducibility and robustness. Moreover, our methodology was further enhanced by incorporating a diverse array of neuropsychological assessments, where possible, to improve interpretability of our models. However, our study is not without limitations, which in turn, guide the trajectory for future research. While our models were validated across multiple datasets spanning two continents, it is important to contextualize our findings within the limited diversity of ethnicity and socioeconomic status represented in the data. This limitation, inherent to the public datasets employed in our research, underscores a broader challenge faced by many researchers in the field. Existing research suggests that differences in brain morphology, genetic risk factors such as APOE ε4 allele prevalence, and socioeconomic disparities may contribute to variations in disease presentation across populations. Studies have shown that African American and Hispanic individuals are at higher risk of developing AD compared to non-Hispanic whites, with potential contributions from both biological and lifestyle factors. Additionally, neuroimaging studies have demonstrated that genetic ancestry can be predicted from brain structure, reinforcing the notion that structural variations may influence disease trajectories.⁵⁶ Addressing this gap in future studies by expanding the ethnic diversity of cohorts will be crucial to validate and extend the applicability of our results.⁵⁷

Moreover, we have highlighted the CDR's shortcomings in capturing non-typical disease manifestation and progression. Disease progression models like the Z-score model must be interpreted with respect to the reference control population. However, the complexity and variability of AD make defining an ideal control group challenging.¹⁸ Our cohorts are enriched for AD but the presence of mixed pathologies remains a possibility, particularly in older individuals, and neuropathological confirmation was available only for a small subset of participants. Future studies incorporating comprehensive biomarker assessments will be crucial for refining data-driven AD subtypes. Our basic definition of a control ($\mathrm{CDR} = 0$) could be augmented in future work with biomarker data on amyloid-β and/or phosphorylated-tau negativity,^58–60 and genetic risk factors like the APOE $ε 4$ allele.
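The dependence on the reference control population can be made explicit with a small sketch of how a regional volume might be turned into an atrophy z-score and compared against event thresholds. This is a simplified illustration and not the authors' preprocessing: it assumes a plain mean and standard deviation of control volumes with no covariate adjustment, a sign convention in which larger z means more atrophy, and hypothetical volumes. The default thresholds of 2 and 3 follow the z-score events mentioned in the text.

```python
import statistics


def atrophy_z_scores(volumes, control_volumes):
    """Z-score regional volumes against controls; larger z means more atrophy."""
    mu = statistics.mean(control_volumes)
    sd = statistics.stdev(control_volumes)
    # Sign flipped so that volume loss relative to controls gives positive z.
    return [(mu - v) / sd for v in volumes]


def events_reached(z, thresholds=(2, 3)):
    """Return the z-score event thresholds that a given z-score has reached."""
    return [t for t in thresholds if z >= t]


# Hypothetical volumes in arbitrary units (NOT study data).
control_volumes = [3.8, 4.0, 4.2]
subject_volumes = [3.5, 3.2, 4.0]

for z in atrophy_z_scores(subject_volumes, control_volumes):
    print(round(z, 2), events_reached(z))
```

Changing the composition of `control_volumes` shifts both the mean and the standard deviation, and therefore which events a given subject is counted as having reached.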

Beyond validating SuStaIn subtypes across diverse datasets, an important next step is evaluating their clinical translatability. While research datasets typically contain standardized, high-quality MRI scans, real-world clinical imaging often exhibits greater variability due to differences in acquisition protocols and scanner quality. Additionally, data drift, including shifts in patient demographics such as ethnicity imbalance, may affect the robustness of SuStaIn subtypes in clinical settings. Another key consideration is clinical adoption, as integrating machine-learning models into healthcare workflows requires not only technical validation but also clinician trust and usability. Previous studies have emphasized the importance of human factors in the implementation of AI-based diagnostic tools, highlighting the need for prospective validation in real-world clinical environments.⁶¹ Future research should focus on assessing how well SuStaIn subtypes generalize to routine healthcare imaging and how clinicians on the frontline perceive and integrate these models into their diagnostic decision-making processes.

In conclusion, this study reveals conditions under which the SuStaIn algorithm can be expected to subtype and stage AD patients robustly and reliably across distinct databases. The evident consistency in identifying three primary disease progression sequences, Typical, Cortical, and Subcortical, reinforces the notion that AD may in fact be a set of sub-diseases, or disease subtypes, rather than a single biological cascade.

Supplemental Material

Supplemental material, sj-docx-1-alz-10.1177_13872877251415019 for How reproducible are data-driven subtypes of Alzheimer's disease atrophy? by Emma Prevot, Cameron Shand, and Neil Oxtoby in Journal of Alzheimer's Disease

Acknowledgements

The authors acknowledge members of the UCL POND group (http://pond.cs.ucl.ac.uk) for valuable feedback received during group discussions.

EP is supported by and acknowledges the Oxford EPSRC Centre for Doctoral Training in Health Data Science (EP/S02428X/1), which is currently funding her DPhil at the University of Oxford.

Part of the data used in this project was obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). The ADNI was launched in 2003 as a public-private partnership, led by Principal Investigator Michael W. Weiner, MD. The primary goal of ADNI has been to test whether serial magnetic resonance imaging (MRI), positron emission tomography (PET), other biological markers, and clinical and neuropsychological assessment can be combined to measure the progression of mild cognitive impairment (MCI) and early Alzheimer's disease (AD). Data collection and sharing for the Alzheimer's Disease Neuroimaging Initiative (ADNI) is funded by the National Institute on Aging (National Institutes of Health Grant U19 AG024904), the Canadian Institutes of Health Research in Canada, and generous contributions from the following: AbbVie, Alzheimer's Association; Alzheimer's Drug Discovery Foundation; Araclon Biotech; BioClinica, Inc.; Biogen; Bristol-Myers Squibb Company; CereSpir, Inc.; Cogstate; Eisai Inc.; Elan Pharmaceuticals, Inc.; Eli Lilly and Company; EuroImmun; F. Hoffmann-La Roche Ltd and its affiliated company Genentech, Inc.; Fujirebio; GE Healthcare; IXICO Ltd; Janssen Alzheimer Immunotherapy Research & Development, LLC.; Johnson & Johnson Pharmaceutical Research & Development LLC.; Lumosity; Lundbeck; Merck & Co., Inc.; Meso Scale Diagnostics, LLC.; NeuroRx Research; Neurotrack Technologies; Novartis Pharmaceuticals Corporation; Pfizer Inc.; Piramal Imaging; Servier; Takeda Pharmaceutical Company; and Transition Therapeutics. The grantee organization is the Northern California Institute for Research and Education. Private sector contributions were made possible through the Foundation for the National Institutes of Health (http://www.fnih.org).

Another part of the data was provided by OASIS-1: Cross-Sectional. The authors acknowledge the Principal Investigators: D. Marcus, R. Buckner, J. Csernansky, J. Morris; grants P50 AG05681, P01 AG03991, P01 AG026276, R01 AG021910, P20 MH071616, U24 RR021382.

Finally, data was also provided by AddNeuroMed. The authors acknowledge the AddNeuroMed project participants and the AddNeuroMed project scientists—clinical leads responsible for data collection—Iwona Kloszewska (Lodz), Simon Lovestone (London), Patrizia Mecocci (Perugia), Hilkka Soininen (Kuopio), Magda Tsolaki (Thessaloniki), and Bruno Vellas (Toulouse), imaging leads—Andy Simmons (London), Lars-Olof Wahlund (Stockholm) and Christian Spenger (Zurich), and bioinformatics leads—Richard Dobson (London) and Stephen Newhouse (London).

Footnotes

ORCID iDs: Emma Prevot https://orcid.org/0009-0002-6729-9505

Cameron Shand https://orcid.org/0000-0002-1299-890X

Neil Oxtoby https://orcid.org/0000-0003-0203-3909

Author's Note: Emma Prevot is now affiliated with the Department of Statistics, University of Oxford, Oxford, UK.

Ethical considerations: This was a secondary analysis of deidentified human data generated by previous studies having appropriate informed consent. This analysis was approved by the UCL Research Ethics Committee under project 8019/005.

Consent to participate: Not applicable

Consent for publication: Not applicable

Author contribution(s): Emma Prevot: Data curation; Formal analysis; Investigation; Validation; Visualization; Writing – original draft; Writing – review & editing.

Cameron Shand: Conceptualization; Formal analysis; Methodology; Software; Supervision; Validation; Writing – review & editing.

Neil Oxtoby: Conceptualization; Data curation; Formal analysis; Funding acquisition; Methodology; Project administration; Software; Supervision; Validation; Writing – review & editing.

Funding: The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the UKRI Medical Research Council via both a Future Leaders Fellowship (MR/S03546X/1, MR/X024288/1) and the Joint Programme—Neurodegenerative Disease Research (E-DADS project: MR/T046422/1).

The authors declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: NPO is a consultant for Queen Square Analytics Limited (UK) on unrelated projects. All other authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Data availability statement: ADNI data can be obtained via the ADNI and LONI websites (adni.loni.usc.edu). For up-to-date information, see www.adni-info.org. The OASIS-1: Cross-Sectional data can be obtained via http://www.oasis-brains.org. AddNeuroMed data can be accessed through https://www.synapse.org. The SuStaIn algorithm is available at https://github.com/ucl-pond.

Supplemental material: Supplemental material for this article is available online.

References

1. World Health Organization (WHO). Dementia, https://www.who.int/news-room/fact-sheets/detail/dementia (2023, accessed March 15, 2023).
2. Alzheimer’s Disease International. Dementia statistics, https://www.alzint.org/about/dementia-facts-figures/dementia-statistics/ (2024).
3. Economic burden of Alzheimer disease and managed care considerations. Am J Manag Care 2020; 26: S177–S183.
4. Ferreira D, Wahlund L-O, Westman E. The heterogeneity within Alzheimer’s disease. Aging 2018; 10: 3058–3060.
5. Ferreira D, Nordberg A, Westman E. Biological subtypes of Alzheimer disease. Neurology 2020; 94: 436–448.
6. Murray ME, Graff-Radford NR, Ross OA, et al. Neuropathologically defined subtypes of Alzheimer’s disease with distinct clinical characteristics: a retrospective study. Lancet Neurol 2011; 10: 785–796.
7. Zheng C, Xu R. Molecular subtyping of Alzheimer’s disease with consensus non-negative matrix factorization. PLoS One 2021; 16: e0250278.
8. Oxtoby N, Alexander D. Imaging plus X: multimodal models of neurodegenerative disease. Curr Opin Neurol 2017; 30: 371–379.
9. Young AL, Marinescu RV, Oxtoby NP, et al. Uncovering the heterogeneity and temporal complexity of neurodegenerative diseases with Subtype and Stage Inference. Nat Commun 2018; 9: 4273.
10. Young AL, Vogel JW, Aksman LM, et al. Ordinal SuStaIn: subtype and stage inference for clinical scores, visual ratings, and other ordinal data. Front Artif Intell 2021; 4: 613261.
11. Fonteijn HM, Modat M, Clarkson MJ, et al. An event-based model for disease progression and its application in familial Alzheimer’s disease and Huntington’s disease. Neuroimage 2012; 60: 1880–1889.
12. Raket LL. Statistical disease progression modeling in Alzheimer disease. Front Big Data 2020; 3: 24.
13. Venkatraghavan V, Vinke EJ, Bron EE, et al. Progression along data-driven disease timelines is predictive of Alzheimer’s disease in a population-based cohort. Neuroimage 2021; 238: 118233.
14. Gomez-Nicola D, Boche D. Post-mortem analysis of neuroinflammatory changes in human Alzheimer’s disease. Alzheimers Res Ther 2015; 7: 42.
15. Mitelpunkt A, Galili T, Kozlovski T, et al. Novel Alzheimer’s disease subtypes identified using a data and knowledge driven strategy. Sci Rep 2020; 10: 1327.
16. Alexander N, Alexander DC, Barkhof F, et al. Identifying and evaluating clinical subtypes of Alzheimer’s disease in care electronic health records using unsupervised machine learning. BMC Med Inform Decis Mak 2021; 21: 43.
17. Whitwell JL, Dickson DW, Murray ME, et al. Neuroimaging correlates of pathologically defined subtypes of Alzheimer’s disease: a case-control study. Lancet Neurol 2012; 11: 868–877.
18. Noh Y, Jeon S, Lee JM, et al. Anatomical heterogeneity of Alzheimer disease: based on cortical thickness on MRIs. Neurology 2014; 83: 1936–1944.
19. Jellinger KA. Pathobiological subtypes of Alzheimer disease. Dement Geriatr Cogn Disord 2020; 49: 321–333.
20. Westman E, Simmons A, Muehlboeck J-S, et al. AddNeuroMed and ADNI: similar patterns of Alzheimer’s atrophy and automated MRI classification accuracy in Europe and North America. Neuroimage 2011; 58: 818–828.
21. Ferreira D, Hansson O, Barroso J, et al. The interactive effect of demographic and clinical factors on hippocampal volume: a multicohort study on 1958 cognitively normal individuals. Hippocampus 2017; 27: 653–667.
22. Archetti D, Young AL, Oxtoby NP, et al. Inter-cohort validation of SuStaIn model for Alzheimer’s disease. Front Big Data 2021; 4: 661110.
23. Marcus DS, Wang TH, Parker J, et al. Open Access Series of Imaging Studies (OASIS): cross-sectional MRI data in young, middle aged, nondemented, and demented older adults. J Cogn Neurosci 2007; 19: 1498–1507.
24. Birkenbihl C, Westwood S, Shi L, et al. ANMerge: a comprehensive and accessible Alzheimer’s disease patient-level dataset. J Alzheimers Dis 2021; 79: 423–431.
25. Pacheco J, Goh JO, Kraut MA, et al. Greater cortical thinning in normal older adults predicts later cognitive impairment. Neurobiol Aging 2015; 36: 903–908.
26. Khan TK. Clinical diagnosis of Alzheimer’s disease. In: Khan TK. (ed.) Biomarkers in Alzheimer’s disease. New York: Elsevier, 2016, pp.27–48.
27. Manning CA, Ducharme JK. Dementia syndromes in the older adult. In: Lichtenberg PA. (ed.) Handbook of assessment in clinical gerontology. London: Elsevier, 2010, pp.155–178.
28. Husain MA, Laurent B, Plourde M. APOE and Alzheimer’s disease: from lipid transport to physiopathology and therapeutics. Front Neurosci 2021; 15: 630502.
29. Galasko D, Klauber MR, Hofstetter CR, et al. The mini-mental state examination in the early diagnosis of Alzheimer’s disease. Arch Neurol 1990; 47: 49–52.
30. Gluhm S, Goldstein J, Loc K, et al. Cognitive performance on the mini-mental state examination and the Montreal cognitive assessment across the healthy adult lifespan. Cogn Behav Neurol 2013; 26: 1–5.
31. Roe CM, Fagan AM, Grant EA, et al. Amyloid imaging and CSF biomarkers in predicting cognitive impairment up to 7.5 years later. Neurology 2013; 80: 1784–1791.
32. Kahan BC, Jairath V, Doré CJ, et al. The risks and rewards of covariate adjustment in randomized trials: an assessment of 12 outcomes from 8 studies. Trials 2014; 15: 39.
33. Iaccarino L, La Joie R, Edwards L, et al. Spatial relationships between molecular pathology and neurodegeneration in the Alzheimer’s disease continuum. Cereb Cortex 2021; 31: 1–14.
34. Jack CR, Jr, Petersen RC, Xu YC, et al. Medial temporal atrophy on MRI in normal aging and very mild Alzheimer’s disease. Neurology 1997; 49: 786–794.
35. Katsumi Y, Quimby M, Hochberg D, et al. Association of regional cortical network atrophy with progression to dementia in patients with primary progressive aphasia. Neurology 2023; 100: e286–e296.
36. Shivam Agrahari, Towards Data Science. Monte Carlo Markov Chain (MCMC), https://altoida.com/blog/the-neuropathological-hallmarks-of-alzheimers-disease/ (2021).
37. Smyth P. Model selection for probabilistic clustering using cross-validated likelihood. Stat Comput 2000; 10: 63–72.
38. Braak H, Braak E. Neuropathological stageing of Alzheimer-related changes. Acta Neuropathol (Berl) 1991; 82: 239–259.
39. Hyman BT, Van Hoesen GW, Damasio AR, et al. Alzheimer’s disease: cell-specific pathology isolates the hippocampal formation. Science 1984; 225: 1168–1170.
40. Gómez-Isla T, Price JL, McKeel DW, et al. Profound loss of layer II entorhinal cortex neurons occurs in very mild Alzheimer’s disease. J Neurosci 1996; 16: 4491–4500.
41. Schober P, Boer C, Schwarte LA. Correlation coefficients: appropriate use and interpretation. Anesth Analg 2018; 126: 1763–1768.
42. Marinescu R, Eshaghi A, Alexander D, et al. BrainPainter: a software for the visualisation of brain structures, biomarkers and associated pathological processes. Multimodal Brain Image Anal Math Found Comput Anat 2019; 11846: 112–120.
43. Ferreira D, Pereira JB, Volpe G, et al. Subtypes of Alzheimer’s disease display distinct network abnormalities extending beyond their pattern of brain atrophy. Front Neurol 2019; 10: 24.
44. the Alzheimer’s Disease Neuroimaging Initiative, Vogel JW, Young AL, et al. Four distinct trajectories of tau deposition identified in Alzheimer’s disease. Nat Med 2021; 27: 871–881.
45. Seeley WW, Crawford RK, Zhou J, et al. Neurodegenerative diseases target large-scale human brain networks. Neuron 2009; 62: 42–52.
46. La Joie R, Bejanin A, Fagan AM, et al. Associations between [¹⁸F]AV1451 tau PET and CSF measures of tau pathology in a clinical sample. Neurology 2018; 90: e282–e290.
47. Freeman SH, Kandel R, Cruz L, et al. Preservation of neuronal number despite age-related cortical brain atrophy in elderly subjects without Alzheimer disease. J Neuropathol Exp Neurol 2008; 67: 1205–1212.
48. Cook AH, Sridhar J, Ohm D, et al. Rates of cortical atrophy in adults 80 years and older with superior vs average episodic memory. JAMA 2017; 317: 1373–1375.
49. Gupta Y, Lee KH, Choi KY, et al. Alzheimer’s disease diagnosis based on cortical and subcortical features. J Healthc Eng 2019; 2019: 2492719.
50. Beason-Held LL, Goh JO, An Y, et al. Changes in brain function occur years before the onset of cognitive impairment. J Neurosci 2013; 33: 18008–18014.
51. Aisen PS, Jimenez-Maggiora GA, Rafii MS, et al. Early-stage Alzheimer disease: getting trial-ready. Nat Rev Neurol 2022; 18: 389–399.
52. Scharre DW. Preclinical, prodromal, and dementia stages of Alzheimer’s disease, https://practicalneurology.com/articles/2019-june/preclinical-prodromal-and-dementia-stages-ofalzheimers-disease (2019).
53. Alagiakrishnan K, Khan K, Saqqur M. Posterior cortical atrophy and dementia. In: Martin CR, Preedy VR. (eds) Diet and nutrition in dementia and cognitive decline. London: Elsevier, 2014, pp.139–145.
54. Warren JD, Fletcher PD, Golden HL. The paradox of syndromic diversity in Alzheimer disease. Nat Rev Neurol 2012; 8: 451–464.
55. Reul S, Lohmann H, Wiendl H, et al. Can cognitive assessment really discriminate early stages of Alzheimer’s and behavioural variant frontotemporal dementia at initial clinical presentation? Alzheimers Res Ther 2017; 9: 61.
56. Altmann A, Mourao-Miranda J. Evidence for bias of genetic ancestry in resting state functional MRI. 2019 IEEE 16th International Symposium on Biomedical Imaging (ISBI 2019), Venice, Italy, 2019, pp. 275–279.
57. Weiner MW, Veitch DP, Miller MJ, et al. Increasing participant diversity in AD research: plans for digital screening, blood testing, and a community-engaged approach in the Alzheimer’s Disease Neuroimaging Initiative 4. Alzheimers Dement 2023; 19: 307–317.
58. Prvulovic D, Hampel H. Amyloid β (Aβ) and phospho-tau (p-tau) as diagnostic biomarkers in Alzheimer’s disease. Clin Chem Lab Med 2011; 49: 367–374.
59. Ashton NJ, Pascoal TA, Karikari TK, et al. Plasma p-tau231: a new biomarker for incipient Alzheimer’s disease pathology. Acta Neuropathol (Berl) 2021; 141: 709–724.
60. Janelidze S, Bali D, Ashton NJ, et al. Head-to-head comparison of 10 plasma phospho-tau assays in prodromal Alzheimer’s disease. Brain 2023; 146: 1592–1601.
61. Bellio M, Furniss D, Oxtoby NP, et al. Opportunities and barriers for adoption of a decision-support tool for Alzheimer’s disease. ACM Trans Comput Healthcare 2021; 2: 32.

Supplementary Materials

sj-docx-1-alz-10.1177_13872877251415019 - Supplemental material for How reproducible are data-driven subtypes of Alzheimer's disease atrophy?

sj-docx-1-alz-10.1177_13872877251415019.docx (7.9 MB, docx)

