Alzheimer's & Dementia. 2023 Oct 26;20(2):1102–1111. doi: 10.1002/alz.13431

Precision medicine analysis of heterogeneity in individual‐level treatment response to amyloid beta removal in early Alzheimer's disease

Menglan Pang 1,2, Audrey Gabelle 1,2, Paramita Saha‐Chaudhuri 1,2, Willem Huijbers 1,2, Arie Gafson 1,2, Paul M Matthews 3,4, Lu Tian 5, Ivana Rubino 2, Richard Hughes 1,2, Carl de Moor 1,2, Shibeshih Belachew 1,2, Changyu Shen 1,2
PMCID: PMC10917030  NIHMSID: NIHMS1944210  PMID: 37882364

Abstract

INTRODUCTION

Alzheimer's disease (AD) is a neurological disorder with variability in pathology and clinical progression. AD patients may differ in individual‐level benefit from amyloid beta removal therapy.

METHODS

Random forest models were applied to the EMERGE trial to create an individual‐level treatment response (ITR) score which represents individual‐level benefit of high‐dose aducanumab relative to the placebo. This ITR score was used to test the existence of heterogeneity in treatment effect (HTE).

RESULTS

We found statistical evidence of HTE in the Clinical Dementia Rating–Sum of Boxes (CDR‐SB; P = 0.034). The observed CDR‐SB benefit was 0.79 points greater in the group with the top 25% of ITR scores compared to the remaining 75% (P = 0.020). Of note, the highest treatment responders had lower hippocampal volume, higher plasma phosphorylated tau 181, and a shorter duration of clinical AD at baseline.

DISCUSSION

This ITR analysis provides a proof of concept for precision medicine in future AD research and drug development.

Highlights

  • Emerging trials have shown a population‐level benefit from amyloid beta (Aβ) removal in slowing cognitive decline in early Alzheimer's disease (AD).

  • This work demonstrates significant heterogeneity of individual‐level treatment effect of aducanumab in early AD.

  • The greatest clinical responders to Aβ removal therapy show a pattern consistent with a more severe neurodegenerative process.

Keywords: Alzheimer's disease, amyloid, clinical trials, personalized medicine, response to treatment

1. BACKGROUND

Alzheimer's disease (AD) is a progressive disorder with heterogeneous symptomatology, pathology, and individual disease courses. 1 , 2 Recent clinical trials demonstrate that early AD modification using anti‐amyloid therapies can have clinical benefit. This includes the positive phase 3 trials of aducanumab 3 (EMERGE ClinicalTrials.gov Identifier: NCT02484547) and lecanemab 4 (CLARITY AD ClinicalTrials.gov Identifier: NCT03887455), both meeting their primary endpoint and all key secondary endpoints. The evidence for a causal effect of amyloid depletion in delaying cognitive decline was consolidated in a recent instrumental‐variable meta‐analysis (integrating data from 16 clinical trials) providing further support for this class of therapy in AD. 5

The heterogeneity of AD is likely to influence the clinical benefit that individual patients derive from disease‐modifying treatment. Nonetheless, the conventional “responder analysis,” which examines what type of patients have a change in clinical outcome exceeding a pre‐specified threshold under a new treatment, is fundamentally flawed for parallel randomized controlled trials (RCTs) of AD. The key problem is that the longitudinal change of the clinical outcome of a given patient under a treatment could be attributed to factors other than the treatment. 6 To isolate the treatment effect at the individual patient level, we would need to know the change of the outcome had the patient been assigned to the control arm (e.g., placebo), which is not observable in a parallel RCT. Therefore, the “response” in a typical responder analysis does not accurately capture the individual‐level treatment benefit and cannot be used to study heterogeneous treatment effects (HTE). Fortunately, there have been significant developments in statistical methodologies to assess HTE from both randomized and observational studies, which serve as a foundation for the precision medicine paradigm (see Kosorok and Laber 7 and Kent et al. 8 for a comprehensive review). One statistical framework to investigate HTE is the estimation of an individual‐level treatment response (ITR) score, 9 which essentially is the predicted value of individual patient–level treatment effects. The methodology builds prediction models for an outcome of interest, separately in the treatment and placebo groups, using baseline patient characteristics as predictors. The ITR “score” is then derived at the patient level as the difference between the predicted outcomes under the treatment and placebo settings. The ITR framework permits the incorporation of various prediction modeling strategies, including machine learning, and provides interpretability through the identification of patient characteristics that are associated with a stronger treatment benefit. The approach has been used to examine HTE in multiple sclerosis, 10 cancer, 11 cardiovascular diseases, 12 and other conditions. 13

In this study, we applied the ITR methodology to patients enrolled in the EMERGE clinical trial (randomized, double‐blind, placebo‐controlled, phase 3 study of aducanumab [NCT02484547]). We evaluated whether there is heterogeneity in the reduction of cognitive decline in response to amyloid‐reducing therapy and identified patient characteristics driving the observed treatment effect heterogeneity.

2. METHODS

2.1. Study population

The study design of EMERGE (ClinicalTrials.gov Identifier: NCT02484547) has been previously described. 3 Briefly, EMERGE (n = 1638) was a randomized, placebo‐controlled, double‐blind, global, phase 3 study of aducanumab in patients with confirmed amyloid beta (Aβ) pathology, aged 50 to 85 years, who met clinical criteria for mild cognitive impairment (MCI) due to AD or mild AD dementia. Participants were randomly assigned 1:1:1 to receive aducanumab low dose (3 or 6 mg/kg target dose), high‐dose (10 mg/kg target dose), or placebo via intravenous (IV) infusion once every 4 weeks over 76 weeks. The primary clinical endpoint was the change from baseline to week 78 on the Clinical Dementia Rating–Sum of Boxes (CDR‐SB). Secondary clinical outcome measures included the Mini‐Mental State Examination (MMSE), the Alzheimer's Disease Assessment Scale‐Cognitive Subscale 13‐item scale (ADAS‐Cog13) and the Alzheimer's Disease Cooperative Study Activities of Daily Living Inventory‐Mild Cognitive Impairment (ADCS‐ADL‐MCI). Tertiary clinical outcome measures included the Neuropsychiatric Inventory‐10 (NPI‐10).

2.2. Outcome and baseline variables included in ITR score model development

The primary endpoint for the HTE analysis was CDR‐SB change at week 78. We included patients from EMERGE who were randomized to receive high‐dose aducanumab or placebo and who had CDR‐SB measured at both baseline and week 78.

The following baseline characteristics were pre‐specified for development of separate prediction models in the high‐dose aducanumab and placebo groups: age, sex, years of formal education, clinical stage (MCI due to AD, or mild AD), body mass index (BMI), apolipoprotein E (APOE) ε4 status (carrier or non‐carrier), years since first AD symptoms, years since AD diagnosis, AD symptomatic medication use (yes or no), regional brain magnetic resonance imaging (MRI) volume (frontal cortex, parietal cortex, lateral temporal cortex, medial temporal cortex, left hippocampus, right hippocampus, anterior cingulate cortex, posterior cingulate cortex, dorsal medial prefrontal cortex [PFC] default‐mode network [DMN], medial temporal cortex DMN) normalized by the total intracranial volume (TIV), plasma phosphorylated tau (p‐tau) 181, and medical history (defined as yes or no for categories of vascular, cardiac, or psychiatric disorders and microhemorrhage). Data preprocessing with respect to missing values, skewed distributions, and extreme values is described in Appendix A in supporting information.

2.3. Statistical methods

2.3.1. ITR score

We used random forest models 14 to predict the change in CDR‐SB from baseline to week 78 in the high‐dose aducanumab and placebo groups using the baseline characteristics described above. A random forest model was used because of its ability to handle a large number of predictors and to take into account non‐linear relationships and interactions. Once the prediction models were built in each group, a patient's ITR score was calculated as the difference in the predicted change in CDR‐SB at week 78 between the high‐dose aducanumab and placebo prediction models (i.e., how an individual patient's clinical score would have differed had they received treatment vs. placebo). Hence, a lower ITR score value corresponds to a greater predicted individual treatment benefit, that is, less cognitive worsening associated with high‐dose aducanumab compared to placebo.
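The two‐model construction described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: a k‐nearest‐neighbour average stands in for the random forest so that the example needs only the Python standard library, and the names `fit_knn`, `itr_scores`, and the toy data are hypothetical.

```python
import math

def fit_knn(X, y, k=3):
    """Stand-in learner (NOT a random forest): predicts the mean outcome
    of the k nearest training patients in covariate space."""
    def predict(x):
        order = sorted(range(len(X)), key=lambda i: math.dist(X[i], x))
        nearest = order[:k]
        return sum(y[i] for i in nearest) / len(nearest)
    return predict

def itr_scores(X, treated, y, X_new, k=3):
    """ITR score = predicted change under treatment minus predicted change
    under placebo. Lower (more negative) = greater predicted benefit."""
    X_t = [x for x, t in zip(X, treated) if t]
    y_t = [v for v, t in zip(y, treated) if t]
    X_p = [x for x, t in zip(X, treated) if not t]
    y_p = [v for v, t in zip(y, treated) if not t]
    model_t = fit_knn(X_t, y_t, k)   # fitted on the treated arm only
    model_p = fit_knn(X_p, y_p, k)   # fitted on the placebo arm only
    return [model_t(x) - model_p(x) for x in X_new]

# Toy data (hypothetical): one baseline covariate; outcome is the change
# in a clinical score where higher means more worsening.
X = [[0.0], [0.1], [0.2], [1.0], [1.1], [1.2],
     [0.0], [0.1], [0.2], [1.0], [1.1], [1.2]]
treated = [1] * 6 + [0] * 6
y = [1.0, 1.0, 1.0, 1.5, 1.5, 1.5,   # treated arm
     2.0, 2.0, 2.0, 1.5, 1.5, 1.5]   # placebo arm
scores = itr_scores(X, treated, y, [[0.1], [1.1]])
```

In this toy example the first new patient receives a negative score (predicted benefit) and the second a score of zero (no predicted benefit). In practice the scores would be computed on held‐out patients rather than on the patients used to fit the models.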

2.3.2. Measure of the HTE

The HTE is graphically depicted by plotting the average treatment difference (ATD) curve. 9 The ATD reflects the average observed treatment effect for the subgroup with an ITR score (i.e., the predicted individual‐level treatment benefit) below the qth percentile. This subgroup corresponds to the q percent of the population with the highest predicted treatment benefit. Details about the ATD curve construction are described in Appendix A. In general, the ATD curve provides a visual overall assessment of the magnitude of HTE. The area between the horizontal line representing the overall treatment effect and the ATD curve, denoted as the area between the curves (ABC), can provide a quantitative summary statistic for the overall magnitude of the HTE. 9 In addition, to aid in interpreting the magnitude of the HTE, we a priori defined highest responders as the subgroup with the highest 25% ITR score–predicted treatment response and standard responders as the remaining 75%. The average of the estimated treatment effect within both subgroups, as well as the reported ATD curve and the ABC statistics, were evaluated in a 5‐fold cross‐validation repeated a total of 200 times (see Figure S1 in supporting information for demonstration).
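As an illustration of the ATD curve and the ABC statistic, the following sketch computes both from a set of ITR scores. It rests on assumptions not confirmed by the text: the exact construction is given in Appendix A, the ABC is taken here as a signed trapezoid‐rule area on a coarse grid of q values, and the function names and toy data are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)

def treatment_effect(y, treated, idx):
    """Observed mean change (treated) minus mean change (placebo) within idx."""
    y_t = [y[i] for i in idx if treated[i]]
    y_p = [y[i] for i in idx if not treated[i]]
    return mean(y_t) - mean(y_p)

def atd_curve(scores, y, treated, qs):
    """ATD(q): observed effect among the fraction q of patients with the
    lowest ITR scores, i.e., the highest predicted benefit."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    curve = []
    for q in qs:
        n_top = max(1, round(q * len(order)))
        curve.append(treatment_effect(y, treated, order[:n_top]))
    return curve

def abc_statistic(qs, curve, overall):
    """Signed area between the ATD curve and the flat overall-effect line
    (trapezoid rule). Negative when selected subgroups do better than average."""
    area = 0.0
    for k in range(1, len(qs)):
        left = curve[k - 1] - overall
        right = curve[k] - overall
        area += 0.5 * (left + right) * (qs[k] - qs[k - 1])
    return area

# Toy data (hypothetical): the four lowest-score patients benefit, the rest do not.
scores = [-1.0, -0.9, -0.8, -0.7, 0.0, 0.1, 0.2, 0.3]
treated = [1, 0, 1, 0, 1, 0, 1, 0]
y = [1.0, 2.0, 1.0, 2.0, 1.5, 1.5, 1.5, 1.5]
qs = [0.5, 1.0]
overall = treatment_effect(y, treated, range(len(y)))
curve = atd_curve(scores, y, treated, qs)
abc = abc_statistic(qs, curve, overall)
```

With a single flat ATD curve (no heterogeneity) the area is zero; the further the curve falls below the overall‐effect line at small q, the more negative the statistic becomes.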

The results of sensitivity analyses of other thresholds are presented in Appendix B (Table S2) in supporting information. We then performed a permutation test for the existence of HTE using the ABC statistics and for the treatment effect differences between the highest and standard responders. The use of the permutation testing procedure protected against a potentially inflated Type I error rate associated with over‐fitting of the prediction models. 15 More details about the permutation test are provided in Appendix A.
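The logic of a permutation test for a difference in treatment effect between responder groups can be sketched as follows. The authors' actual procedure is described in Appendix A and is not reproduced here; this sketch makes the simplifying assumption that ITR scores are shuffled among patients within each treatment arm while the scores themselves are held fixed, whereas a complete procedure would typically repeat the model building on each permuted data set. All names and the toy data are hypothetical.

```python
import random

def effect(y, treated, idx):
    """Observed treatment effect in a subgroup: mean (treated) - mean (placebo)."""
    y_t = [y[i] for i in idx if treated[i]]
    y_p = [y[i] for i in idx if not treated[i]]
    return sum(y_t) / len(y_t) - sum(y_p) / len(y_p)

def effect_difference(scores, y, treated, frac=0.25):
    """Effect in the lowest-score fraction minus effect in the remainder."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    n_top = max(1, round(frac * len(order)))
    return effect(y, treated, order[:n_top]) - effect(y, treated, order[n_top:])

def permutation_p_value(scores, y, treated, n_perm=999, seed=0):
    """One-sided P value: share of permutations whose statistic is at least
    as extreme (as negative) as the observed one."""
    rng = random.Random(seed)
    observed = effect_difference(scores, y, treated)
    arms = [[i for i, t in enumerate(treated) if t],
            [i for i, t in enumerate(treated) if not t]]
    hits = 0
    for _ in range(n_perm):
        permuted = list(scores)
        for arm in arms:  # shuffle scores within each arm (assumption)
            values = [scores[i] for i in arm]
            rng.shuffle(values)
            for i, v in zip(arm, values):
                permuted[i] = v
        if effect_difference(permuted, y, treated) <= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Toy data (hypothetical): 20 treated, 20 placebo; the 10 lowest-score
# patients (5 per arm) carry all of the treatment benefit.
treated = [1] * 20 + [0] * 20
scores, y = [], []
for i in range(40):
    pos = i % 20
    strong = pos < 5
    scores.append(pos + (0.5 if i >= 20 else 0.0) - (100.0 if strong else 0.0))
    y.append((0.0 if treated[i] else 2.0) if strong else 1.0)
p_value = permutation_p_value(scores, y, treated)
```

Shuffling within arm keeps the number of treated and placebo patients in each responder group constant across permutations (given distinct scores), so every permuted statistic is well defined.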

RESEARCH IN CONTEXT
  1. Systematic review: Recent success in clinical trials of pharmacologic treatment of early Alzheimer's disease (AD), including EMERGE for aducanumab, CLARITY‐AD for lecanemab, and TRAILBLAZER‐ALZ2 for donanemab, demonstrated population‐level benefit in slowing clinical progression. As AD is a disease with heterogeneous pathology and clinical manifestation, individual‐level heterogeneity in treatment effect may also exist in the target population.

  2. Interpretation: There is statistically significant evidence that patients in the EMERGE trial derived heterogeneous benefit from aducanumab in slowing the worsening of the Clinical Dementia Rating Scale–Sum of Boxes. A further exploratory analysis suggested that several routinely collected patient characteristics are associated with the level of benefit.

  3. Future directions: These findings provide a proof of concept for the application of precision medicine in AD. More external data testing is warranted to determine the generalizability and enable implementation of such precision medicine models in future drug development through population enrichment and optimization of the use of pharmacologic interventions in AD clinical practice.

2.3.3. Association between treatment effect and baseline variables

To classify each individual in the analysis study population as a highest or standard responder, we assigned the ITR score for each individual by averaging the scores from the 200 cross‐validation repetitions, taking in each repetition the score obtained when the individual was in the hold‐out test sample. Patients were classified into the highest responder versus standard responder category based on the 25% threshold of the assigned ITR scores. To identify the patients’ baseline variables that contributed to the ITR score, we used a multipronged approach. First, we compared baseline characteristics between the highest and standard responders using summary statistics, standardized mean differences, and P values calculated from two‐sample t tests or chi‐squared tests as appropriate. Second, a variable importance analysis was conducted based on an application of conditional random forests 16 to the ITR scores. Finally, a regression tree 17 was fitted to the ITR scores to depict the potentially complex relationship between the most important baseline characteristics and the ITR score. The depth of the regression tree was limited to two levels to achieve easily interpretable results that focus on the most important variables.
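The score assignment and the 25% classification rule can be illustrated with a short sketch. The data layout (a dictionary of hold‐out scores per patient), the function names, and the use of rounding to determine the group size are assumptions made for illustration; the text does not state how the cutoff was computed.

```python
def average_holdout_scores(holdout_scores):
    """holdout_scores maps a patient ID to the list of ITR scores obtained
    in the repetitions where that patient was in the hold-out fold."""
    return {pid: sum(v) / len(v) for pid, v in holdout_scores.items()}

def classify_responders(avg_scores, frac=0.25):
    """Label the fraction `frac` with the lowest average ITR score as
    'highest' responders and everyone else as 'standard'."""
    ranked = sorted(avg_scores, key=avg_scores.get)
    n_top = round(frac * len(ranked))
    top = set(ranked[:n_top])
    return {pid: ("highest" if pid in top else "standard") for pid in ranked}

# Toy data (hypothetical): four patients, two repetitions each.
holdout_scores = {
    "p1": [-1.0, -0.8],
    "p2": [0.1, 0.3],
    "p3": [-0.2, 0.0],
    "p4": [0.4, 0.2],
}
labels = classify_responders(average_holdout_scores(holdout_scores))

# Group size implied by a 25% cutoff in a population of 587 patients.
n_highest = round(0.25 * 587)
```

Under the rounding assumption, a 25% cutoff applied to 587 patients yields 147 highest responders and 440 standard responders, which agrees with the group sizes reported in Table 2.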

FIGURE 1.


ATD curve (blue) constructed from the 200 repeated 5‐fold cross validation. The x axis represents the q percentage threshold of the ITR score used to select a subgroup with high predicted treatment benefit, and the y axis represents the observed treatment effect on change in CDR‐SB (high‐dose aducanumab – placebo) for the corresponding q percentage subgroup. The flat dashed line in red represents the ATD curve in the absence of heterogeneity of treatment effect, with y coordinate indicating the average treatment effect (reduction in CDR‐SB decline under high‐dose aducanumab vs. placebo) in the entire sample. Among the subgroup of patients with the ITR score–predicted highest 25% treatment response (q = 25%, dashed arrow line), the observed treatment effect of high‐dose aducanumab versus placebo was −0.97 CDR‐SB points. ABC: area between curves (i.e., green area between the blue and red curves). ATD, average treatment difference; CDR‐SB, Clinical Dementia Rating Sum of Boxes; ITR, individual‐level treatment response

2.3.4. Treatment benefits for other cognitive and functional measurements using the ITR scores

We compared the observed longitudinal change from baseline in the high‐dose aducanumab versus placebo patients across all study time points (weeks 26, 50, 78) for CDR‐SB (internal validation) and for each of the secondary and tertiary cognitive and behavioral outcome measures (ADAS‐Cog13, MMSE, ADCS‐ADL‐MCI, and NPI‐10) as well as between the stratified highest and standard responders. Although these clinical outcome measures are correlated with CDR‐SB, there are important non‐overlapping domains among these scales measuring different aspects of cognitive and behavioral functions, 18 , 19 thus enabling additional validation of the model. We used the same mixed model for repeated measures (MMRM) as specified in the EMERGE primary analysis to analyze the change from the baseline score across all study timepoints. As in the primary RCT analysis, the fixed effects of the MMRM comprised treatment group, visit, treatment group by visit interaction, baseline clinical measure score, baseline clinical measure score by visit interaction, baseline MMSE, AD symptomatic medication use, region (United States, Europe/Canada/Australia, and Asia) and APOE ε4 status. An unstructured covariance matrix was used to account for the correlation within a patient.

3. RESULTS

3.1. Study population

The analysis study population consisted of 587 patients, with 299 patients from the high‐dose aducanumab group and 288 from the placebo group. Patients’ characteristics were highly comparable between the two treatment arms with respect to all baseline variables. Summary statistics by treatment arm are presented in Table S1 in supporting information.

3.2. Assessment of heterogeneity of treatment effect

The ATD curve is shown in Figure 1. The red horizontal dashed line represents the overall average treatment effect (i.e., −0.37 [95% confidence interval (CI): −0.71, −0.04]) in the analysis study population. This line offers a benchmark for no HTE, that is, when every patient had the same reduction in CDR‐SB decline associated with high‐dose aducanumab versus placebo. The blue curve shows the ATD curve derived from the ITR score based on random forest prediction models. For example, q = 50 on the x axis represents the subgroup of patients with the highest 50% ITR score–predicted treatment response, for which the average treatment benefit estimate of high‐dose aducanumab relative to placebo in CDR‐SB change at week 78 was −0.6 (on the y axis). The ABC statistic between this blue line and the dashed red reference line was −0.246, which was statistically significant (permutation P value = 0.034), indicating that the ATD curve deviated significantly from the horizontal reference line expected under the null hypothesis of no HTE.

The ITR score–predicted highest 25% responder group had an average observed reduction of 0.97 points of the CDR‐SB scale worsening (corresponding to a 44% relative reduction) between baseline and week 78 comparing the high‐dose aducanumab group to placebo (Figure 1), whereas the corresponding estimate for treatment benefit in the standard responder group was −0.18 (Table 1). Thus, the observed CDR‐SB benefit from high‐dose aducanumab versus placebo was 0.79 points greater in the ITR score–predicted highest 25% responder group compared to the standard responder group (permutation P = 0.020). Results based on q = 12.5% and q = 50% thresholds for the definition of highest responder group were consistent with the primary analysis and are presented in Table S2 in supporting information.

TABLE 1.

Cross‐validated observed treatment effect in the ITR score–predicted highest (25% lowest ITR score) and standard (all of the others) responder groups.

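| Responder group | Change in CDR‐SB: Placebo | Change in CDR‐SB: High‐dose | Change in CDR‐SB: Treatment effect | Relative change from baseline compared to placebo |
| --- | --- | --- | --- | --- |
| 25% Highest responders | 2.20 | 1.23 | −0.97 | 44.1% |
| 75% Standard responders | 1.57 | 1.39 | −0.18 | 11.5% |
| Treatment effect difference (P value†) | | | −0.79 (0.020) | |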

Note: Results were obtained as the average across the 1000 hold‐out sets (200 repetitions of 5‐fold cross‐validation).

Abbreviations: CDR‐SB, Clinical Dementia Rating Sum of Boxes; ITR, individual‐level treatment response.

† P value is derived from the permutation test.
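The derived columns of Table 1 follow from the arm‐level changes by simple arithmetic, as the short check below shows. The variable names are illustrative, and because the tabulated values are themselves rounded averages across hold‐out sets, the recomputation is only expected to agree to the reported precision.

```python
# Mean change in CDR-SB at week 78, as reported in Table 1.
rows = {
    "highest": {"placebo": 2.20, "high_dose": 1.23},
    "standard": {"placebo": 1.57, "high_dose": 1.39},
}

effects = {}
for name, r in rows.items():
    effect = r["high_dose"] - r["placebo"]       # treatment effect (negative = benefit)
    relative = -effect / r["placebo"] * 100.0    # % reduction in worsening vs. placebo
    effects[name] = (round(effect, 2), round(relative, 1))

# Difference in treatment effect between the two responder groups.
difference = round(effects["highest"][0] - effects["standard"][0], 2)
```

The recomputed effects are −0.97 (44.1%) and −0.18 (11.5%), and their difference is −0.79, matching the tabulated values.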

3.3. Uni‐ and multivariate analysis of the association between treatment effect and patients’ baseline characteristics

Baseline characteristics distinguishing the ITR score–predicted highest 25% responders and standard responders were ranked by the magnitude of the group‐wise standardized mean difference (SMD; Table 2 and Figure 2). The largest SMDs were found for right hippocampal volume (SMD = 0.75, P < 0.001) and left hippocampal volume (SMD = 0.67, P < 0.001). In this univariate analysis, eight variables were significantly different between the two responder groups (Table 2) with an SMD that exceeded 0.20 in absolute scale (Figure 2). Overall, patients in the ITR score–predicted highest 25% responder group had lower hippocampal and medial temporal cortical volumes, were older, had higher baseline plasma p‐tau181, had a shorter time since AD symptom onset and diagnosis, and were more likely to have baseline microhemorrhages (Table 2 and Figure 2). In the sensitivity analysis based on alternative thresholds (q = 12.5% and q = 50%), the significantly different variables between ITR score–predicted highest versus standard responders, as well as their ranking by the SMD, were generally consistent with the primary analysis.

TABLE 2.

Baseline characteristics for the ITR score–predicted highest versus standard responders (q = 25 percentile of ITR score distribution as the threshold).

| Covariate | Highest responders (N = 147) | Standard responders (N = 440) | P value |
| --- | --- | --- | --- |
| Right hippocampal volume (%TIV) | 0.0014 (0.00024) | 0.0016 (0.00025) | <0.001 * |
| Left hippocampal volume (%TIV) | 0.0014 (0.00024) | 0.0016 (0.00024) | <0.001 * |
| Age | 73.29 (8.31) | 69.38 (6.76) | <0.001 * |
| Baseline p‐tau181 | 3.65 (2.06) | 3.09 (1.17) | <0.001 * |
| Time since first AD symptom (years) | 3.28 (2.88) | 4.12 (2.72) | 0.001 * |
| Medial temporal cortex volume (%TIV) | 0.0043 (0.00051) | 0.0045 (0.00059) | 0.01 * |
| Time since diagnosis of AD (years) | 1.07 (1.26) | 1.45 (1.69) | 0.01 * |
| Microhemorrhage a | 29 (19.73%) | 49 (11.14%) | 0.01 * |
| Cardiac disorder a | 43 (29.25%) | 97 (22.05%) | 0.10 |
| Sex (male) a | 80 (54.42%) | 207 (47.05%) | 0.15 |
| Posterior cingulate cortex volume (%TIV) | 0.0037 (0.00040) | 0.0037 (0.00048) | 0.14 |
| Frontal cortex volume (%TIV) | 0.098 (0.0062) | 0.097 (0.0063) | 0.17 |
| Clinical stage (mild AD) a | 25 (17.01%) | 55 (12.5%) | 0.21 |
| AD symptomatic medication used a | 73 (49.66%) | 244 (55.45%) | 0.26 |
| Anterior cingulate cortex volume (%TIV) | 0.0040 (0.00046) | 0.0040 (0.00050) | 0.30 |
| Baseline body mass index (kg/m^2) | 25.25 (4.62) | 25.7 (4.38) | 0.29 |
| Psychiatric disorder a | 66 (44.9%) | 219 (49.77%) | 0.35 |
| Vascular disorder a | 77 (52.38%) | 210 (47.73%) | 0.38 |
| Lateral temporal cortex volume (%TIV) | 0.058 (0.0044) | 0.058 (0.0051) | 0.47 |
| Parietal cortex volume (%TIV) | 0.063 (0.0057) | 0.064 (0.0054) | 0.52 |
| APOE ε4 status (carrier) a | 102 (69.39%) | 294 (66.82%) | 0.64 |
| Medial temporal cortex DMN volume (%TIV) | 0.020 (0.0020) | 0.020 (0.0020) | 0.70 |
| Dorsal medial PFC DMN volume (%TIV) | 0.037 (0.0026) | 0.037 (0.0033) | 0.76 |
| Years of education | 14.69 (3.9) | 14.78 (3.43) | 0.80 |

Note: Variables in the table were ordered by decreasing values of standardized mean difference (top to bottom).

Abbreviations: AD, Alzheimer's disease; APOE, apolipoprotein E; ITR, individual‐level treatment response; DMN, default‐mode network; PFC, prefrontal cortex; p‐tau, phosphorylated tau; TIV, total intracranial volume.

a Summary statistics are presented as N (%) for the categorical variables and as mean (standard deviation) for the continuous variables.

*P value < 0.05.
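For readers who wish to relate the tabulated means and standard deviations to a standardized mean difference, the following worked example uses the age row of Table 2. The pooling convention (square root of the average of the two group variances) is an assumption, as the text does not state which SMD formula was used, and the function name is illustrative.

```python
import math

def standardized_mean_difference(mean_a, sd_a, mean_b, sd_b):
    """SMD with an equal-weight pooled standard deviation (assumed convention)."""
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2.0)
    return (mean_a - mean_b) / pooled_sd

# Age: highest responders 73.29 (8.31) vs. standard responders 69.38 (6.76).
age_smd = standardized_mean_difference(73.29, 8.31, 69.38, 6.76)
```

Under this convention the SMD for age is approximately 0.52, which is above the 0.20 level mentioned in the text. Values recomputed from the rounded table entries may differ slightly from those obtained on the patient‐level data.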

FIGURE 2.


Absolute standardized mean difference in baseline characteristics for ITR score–predicted highest versus standard responders (q = 25 percentile of ITR score distribution as the threshold). Variables were ordered by decreasing values of standardized mean difference (top to bottom). AD, Alzheimer's disease; APOE, apolipoprotein E; ITR, individual‐level treatment response; p‐tau, phosphorylated tau

Figure 3 shows the baseline characteristics ranked by the importance score obtained from the conditional random forest model fitted to the ITR score. The top four most important baseline characteristics were left hippocampal volume, right hippocampal volume, frontal cortex volume, and time since first AD symptom. The variable importance scores for the left and right hippocampal volumes, and to a lesser degree the frontal cortex volume, were far larger than those of the remaining baseline characteristics. In addition, 8 of the 10 most important variables in this multivariate analysis were related to regional cortical atrophy measures. Finally, the directionality of the effect of frontal cortical volume is opposite to that of hippocampal volume, that is, the ITR score–predicted highest responders have a lower hippocampal volume and a higher frontal cortex volume relative to the rest of the population (standard responders).

FIGURE 3.


Variable importance of the baseline characteristics obtained from the conditional random forest model fitted to the ITR score. Variables were ordered by decreasing values of the importance score that represents the mean decrease in accuracy when each covariate was permuted in a conditional random forest model with the ITR score as the outcome (top to bottom). AD, Alzheimer's disease; APOE, apolipoprotein E; ITR, individual‐level treatment response; p‐tau, phosphorylated tau

To further assess how individual baseline characteristics are related to HTE, and how they interact with each other in predicting the ITR score, we fitted a single regression tree with the ITR score as the outcome. The final tree included three baseline characteristics and yielded four patient groups with distinct predicted treatment benefit compared to placebo (Figure 4). The baseline characteristics included left hippocampal volume, frontal cortex volume, and time since first AD symptom, with the first split occurring in left hippocampal volume.

FIGURE 4.


Regression tree fitted to the ITR score. The regression tree includes labels for the baseline characteristics and the thresholds used to split the patients into non‐overlapping groups. Boxes include the predicted magnitude of the average treatment effect on change in CDR‐SB from baseline to week 78 (high‐dose aducanumab – placebo) and the proportion (%) of patients in each branch of the tree. For example, patients with left hippocampal volume (normalized using volume to total intracranial volume fraction) < 0.0013, comprising 22% of the study population, had a predicted reduction of −0.71 in CDR‐SB worsening compared to placebo at week 78. AD, Alzheimer's disease; CDR‐SB, Clinical Dementia Rating Sum of Boxes; ITR, individual‐level treatment response
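To illustrate how a depth‐limited regression tree partitions patients by their ITR score, the sketch below grows a tree of at most two levels by repeatedly choosing the split that minimizes the within‐group sum of squared errors. This is a generic CART‐style procedure assumed for illustration; the software and settings used by the authors are not described in the text, and the function names and toy data are hypothetical.

```python
def sse(values):
    """Sum of squared deviations from the mean (0 for an empty group)."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)

def best_split(X, y):
    """Return (feature index, threshold, total SSE) of the best binary
    split, or None if no split is possible."""
    best = None
    for j in range(len(X[0])):
        levels = sorted(set(row[j] for row in X))
        for lo, hi in zip(levels, levels[1:]):
            thr = (lo + hi) / 2.0
            left = [v for row, v in zip(X, y) if row[j] < thr]
            right = [v for row, v in zip(X, y) if row[j] >= thr]
            total = sse(left) + sse(right)
            if best is None or total < best[2]:
                best = (j, thr, total)
    return best

def grow(X, y, depth=2):
    """Grow a tree of at most `depth` levels; leaves store the mean ITR score."""
    split = best_split(X, y) if depth > 0 and len(y) > 1 else None
    if split is None:
        return {"prediction": sum(y) / len(y), "n": len(y)}
    j, thr, _ = split
    left = [(r, v) for r, v in zip(X, y) if r[j] < thr]
    right = [(r, v) for r, v in zip(X, y) if r[j] >= thr]
    return {
        "feature": j,
        "threshold": thr,
        "left": grow([r for r, _ in left], [v for _, v in left], depth - 1),
        "right": grow([r for r, _ in right], [v for _, v in right], depth - 1),
    }

# Toy data (hypothetical): columns are [hippocampal volume fraction,
# years since first symptom]; y is the ITR score.
X = [[0.0012, 2.0], [0.0012, 5.0], [0.0015, 2.0],
     [0.0015, 5.0], [0.0016, 3.0], [0.0011, 4.0]]
y = [-0.8, -0.6, -0.2, -0.1, -0.15, -0.7]
tree = grow(X, y, depth=2)
```

In the toy data the first split falls on the hippocampal volume column because it separates the strongly negative scores from the rest, and limiting the depth to two yields at most four terminal groups, mirroring the four patient groups described for Figure 4.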

3.4. Assessment of the ITR score model prediction on longitudinal clinical and functional outcome measures

In the ITR score–predicted highest 25% responder subgroup, the adjusted mean differences between high‐dose aducanumab and placebo in the worsening of the cognitive and functional clinical endpoints from baseline to week 78 were −1.07 (P = 0.002), −2.32 (P = 0.029), and 1.23 (P = 0.040) for CDR‐SB, ADAS‐Cog13, and MMSE, respectively, in contrast to −0.395 (P = 0.020), −1.126 (P = 0.044), and 0.665 (P = 0.034) in the overall study population (Figure 5). A larger separation between the high‐dose aducanumab and placebo groups was observed in the ITR score–predicted highest 25% responders compared to the standard responders at each study visit, suggesting a stronger treatment benefit across both cognitive and functional performance over time. Similar patterns were observed for ADCS‐ADL‐MCI and NPI‐10, shown in Figure S4 in supporting information.

FIGURE 5.


Adjusted mean change from baseline using MMRM stratified by ITR score–predicted highest 25% responder and standard responder subgroups as well as for the overall analysis study population, for (A) CDR‐SB, (B) ADAS‐Cog13, and (C) MMSE endpoints. *P < 0.05; **P < 0.01; ***P < 0.001. ADAS‐Cog, Alzheimer's Disease Assessment Scale‐Cognitive Subscale 13‐item scale; CDR‐SB, Clinical Dementia Rating Sum of Boxes; ITR, individual‐level treatment response; MMRM, mixed model for repeated measures; MMSE, Mini‐Mental State Examination

4. DISCUSSION

Using an ITR score model approach, 9 we demonstrated evidence of heterogeneity in treatment effect of high‐dose aducanumab compared to placebo on reducing CDR‐SB decline in EMERGE. Exploratory analyses showed that the highest benefit from aducanumab was associated with the following baseline characteristics: lower hippocampal and medial temporal cortical volumes, older age, higher plasma p‐tau181, a shorter time since AD symptom onset and diagnosis, and higher prevalence of microhemorrhages. This group also consistently demonstrated a higher treatment effect from aducanumab for other clinical outcome measures that were not used in model training, including global cognitive domain (MMSE), cognitive status (ADAS‐Cog13), functional domain (ADCS‐ADL‐MCI), and neuropsychiatric symptoms (NPI‐10). Given the lack of independent validation, this work should be viewed as hypothesis generating in nature, but it constitutes the first demonstration of the existence of a heterogeneous response to anti‐amyloid therapy, which lays the foundation for further investigation of the potential of precision medicine models to inform personalized decision making when initiating treatment in patients with early AD.

The analyses of baseline characteristics of the ITR score–predicted highest 25% responders revealed a clear pattern consistent with a more severe neurodegenerative process, as measured by more severe hippocampal atrophy, associated with higher plasma p‐tau181 and a shorter duration of AD since symptom onset and diagnosis. Interestingly, the most highly ranked baseline characteristics as per their importance in the conditional random forest model were regional brain atrophy features, indicating a potential influence of topographical markers in the response to Aβ removal therapy. This is consistent with machine learning models that predict disease progression in AD with high accuracy and that also rely on brain MRI atrophy variables as input features. 20 It should be noted that the two strongest characteristics associated with highest treatment benefit in the multivariate analysis, a lower bilateral hippocampal volume and higher frontal cortex volume, are also observed in the limbic‐predominant MRI atrophy subtype of AD. 21 , 22 Importantly, APOE ε4 allele carrier status did not come out as a variable of significant importance despite its association with an increased risk of dementia. 23 , 24 , 25 A sensitivity analysis further refining our ITR score model to evaluate the effect of homo‐ versus heterozygous APOE ε4 genotypes was consistent with the primary results in this regard.

The statistical approach used in our analysis has several advantages. First, the ITR methodology offers a flexible multivariate framework and the opportunity to apply diverse analytic methodologies, including machine learning. Second, the application of an ITR score random forest model focused on HTE estimates, obtained from cross‐validation hold‐out sets and tested for statistical significance using permutation statistics, mitigated against overfitting and false positive findings. Third, the ITR methodology naturally generates summary statistics including the ATD curve that depicts a continuous spectrum of HTE estimates, clearly demonstrating how the treatment effect increases for different subsets of patients and allowing for in‐depth study and selection of relevant thresholds. Fourth, our analysis enabled the identification of several prognostic factors that drive HTE from a large list of variables and shed light on potential underlying physiological processes that lead to HTE. Although the prognostic property of these factors has been well established, the demonstration of their involvement in HTE is novel, with implications for future research.

There are limitations to this study. First, the analysis was based on a single trial (EMERGE) 3 and therefore lacked independent validation of the ITR score. As the parallel ENGAGE trial (NCT02477800) did not meet its primary or secondary endpoints, 3 we did not derive the ITR score using the data from this trial. Second, there are 653 opportunity‐to‐complete (OTC) subjects in the placebo and high‐dose arms who had the potential to contribute 78‐week outcomes. Our analysis focused on the 587 OTC subjects whose 78‐week data on CDR‐SB were actually collected. The remaining 66 subjects did not have CDR‐SB measured at week 78 for various reasons; among them, 19 subjects discontinued the study due to adverse events and 6 due to death. On the other hand, the non‐OTC subjects were enrolled into EMERGE relatively late, and as such their 78‐week data were not collected due to early trial termination. In general, there is no systematic difference between the OTC and non‐OTC subjects for essentially all the relevant baseline variables, as they are simply separated by the time of enrollment (summary statistics for patients in the OTC and non‐OTC population are presented in Table S3 in supporting information). Thus, the OTC subjects should be representative of the population targeted by EMERGE. Although the missingness of CDR‐SB data at week 78 for the 66 subjects could be informative, they represent only 10% of the OTC subjects and their impact is likely to be minimal. Third, our exploratory comparison of patient characteristics and longitudinal clinical outcome measures between the highest and standard responders is based on data‐driven subgroups rather than subgroups defined a priori. As such, numerical values associated with statistical inference (e.g., P values) should be considered with caution. Finally, these analyses were conducted post hoc. Therefore, the results are hypothesis generating, and their generalizability will have to be explored on independent external validation data sets. Future work may also warrant the refinement of this ITR score random forest model, potentially with additional clinical variables, spatial imaging covariates (e.g., regional Aβ positron emission tomography), and genetic information, as well as further validation over extended treatment duration with longer term follow‐up and in real‐world populations.

With the accelerated approval of aducanumab and conventional approval of lecanemab granted by the US Food and Drug Administration, and the recent positive results of TRAILBLAZER‐ALZ 2 for donanemab 26, patient stratification and personalized medicine will become increasingly important for better understanding how disease trajectory and response to treatment may vary from one individual to another. In this study, our ITR approach demonstrated that HTE exists for aducanumab and that a substantial proportion of patients could derive larger, clinically meaningful benefits from treatment. In this context, our precision medicine analysis provides a proof of concept that personalized approaches to treatment in AD may be feasible and may one day allow clinical decisions to be made with evidence‐based assessment of individual benefit–risk.

CONFLICT OF INTEREST STATEMENT

M.P., A.G., P.S.C., W.H., A.G., I.R., R.H., S.B., and C.S. are Biogen employees. P.M.M. received consultancy fees from Novartis, Biogen (unconnected with the present research and report), Nodthera, and Rejuveron Therapeutics; honoraria or speakers’ fees from Novartis, Biogen, and Redburn Investing; research or educational funds from Biogen (unconnected with the present research and report), Novartis, Merck, and Bristol Myers Squibb; and institutional grants from UK Dementia Research Institute, NERC, and NIHR Biomedical Research Centre at Imperial College London. L.T. has a consulting agreement with Biogen. C.d.M. holds Biogen stocks. All other authors declare no competing interests. Author disclosures are available in the supporting information.

CONSENT STATEMENT

Consent was not necessary for this research.

Supporting information

Supplementary Appendix

ALZ-20-1102-s002.docx (361.2KB, docx)

Supporting Information

ALZ-20-1102-s001.pdf (1.1MB, pdf)

ACKNOWLEDGMENTS

The authors would like to thank Ying Tian, Shuang Wu, John O'Gorman, Yuval Zabar, and Holly Brothers from Biogen for helpful discussions and comments. This research was supported by Biogen.

Pang M, Gabelle A, Saha‐Chaudhuri P, et al. Precision medicine analysis of heterogeneity in individual‐level treatment response to amyloid beta removal in early Alzheimer's disease. Alzheimer's Dement. 2024;20:1102–1111. doi: 10.1002/alz.13431

REFERENCES

  • 1. Masters CL, Bateman R, Blennow K, Rowe CC, Sperling RA, Cummings JL. Alzheimer's disease (Primer). Nat Rev Dis Primers. 2015;1(1):15056.
  • 2. Habes M, Grothe MJ, Tunc B, McMillan C, Wolk DA, Davatzikos C. Disentangling heterogeneity in Alzheimer's disease and related dementias using data‐driven methods. Biol Psychiatry. 2020;88(1):70‐82.
  • 3. Budd Haeberlein S, Aisen PS, Barkhof F, et al. Two randomized phase 3 studies of aducanumab in early Alzheimer's disease. J Prev Alzheimers Dis. 2022;9(2):197‐210.
  • 4. van Dyck CH, Swanson CJ, Aisen P, et al. Lecanemab in early Alzheimer's disease. N Engl J Med. 2023;388(1):9‐21.
  • 5. Pang M, Zhu L, Gabelle A, et al. Effect of reduction in brain amyloid levels on change in cognitive and functional decline in randomized clinical trials: an instrumental variable meta‐analysis. Alzheimer's Dement. 2023;19(4):1292‐1299.
  • 6. Ferreira GE, McLachlan AJ, Lin CW, et al. Efficacy and safety of antidepressants for the treatment of back pain and osteoarthritis: systematic review and meta‐analysis. BMJ. 2021;372:m4825.
  • 7. Kosorok MR, Laber EB. Precision medicine. Annu Rev Stat Appl. 2019;6:263‐286.
  • 8. Kent DM, Steyerberg E, van Klaveren D. Personalized evidence based medicine: predictive approaches to heterogeneous treatment effects. BMJ. 2018;363:k4245.
  • 9. Zhao L, Tian L, Cai T, Claggett B, Wei LJ. Effectively selecting a target population for a future comparative study. J Am Stat Assoc. 2013;108(502):527‐539.
  • 10. Pellegrini F, Copetti M, Bovis F, et al. A proof‐of‐concept application of a novel scoring approach for personalized medicine in multiple sclerosis. Mult Scler. 2020;26(9):1064‐1073.
  • 11. Ballarini NM, Rosenkranz GK, Jaki T, König F, Posch M. Subgroup identification in clinical trials via the predicted individual treatment effect. PLoS One. 2018;13(10):e0205971.
  • 12. Claggett B, Tian L, Castagno D, Wei LJ. Treatment selections using risk–benefit profiles based on data from comparative randomized clinical trials with multiple endpoints. Biostatistics. 2015;16(1):60‐72.
  • 13. Chen KS, Xie J, Tang W, Zhao J, Jeppesen PB, Signorovitch JE. Identifying a subpopulation with higher likelihoods of early response to treatment in a heterogeneous rare disease: a post hoc study of response to teduglutide for short bowel syndrome. Ther Clin Risk Manag. 2018;14:1267.
  • 14. Wright MN, Ziegler A. ranger: a fast implementation of random forests for high dimensional data in C++ and R. J Stat Softw. 2017;77(1):1‐17.
  • 15. Wang R, Schoenfeld DA, Hoeppner B, Evins AE. Detecting treatment‐covariate interactions using permutation methods. Stat Med. 2015;34(12):2035‐2047.
  • 16. Strobl C, Boulesteix AL, Kneib T, Augustin T, Zeileis A. Conditional variable importance for random forests. BMC Bioinformat. 2008;9(1):1‐1.
  • 17. Therneau TM, Atkinson EJ. An introduction to recursive partitioning using the rpart routines. Div Biostat. 1997;61. Mayo Clinic.
  • 18. Cedarbaum JM, Jaros M, Hernandez C, et al; Alzheimer's Disease Neuroimaging Initiative. Rationale for use of the Clinical Dementia Rating Sum of Boxes as a primary outcome measure for Alzheimer's disease clinical trials. Alzheimer's Dement. 2013;9(1):S45‐55.
  • 19. McDougall F, Edgar C, Mertes M, et al. Psychometric properties of the Clinical Dementia Rating—Sum of Boxes and other cognitive and functional outcomes in a prodromal Alzheimer's disease population. J Prev Alzheimers Dis. 2021;8(2):151‐160.
  • 20. Marinescu RV, Bron EE, Oxtoby NP, et al. Predicting Alzheimer's disease progression: results from the TADPOLE challenge: neuroimaging: neuroimaging predictors of cognitive decline. Alzheimer's Dement. 2020;16:e039538.
  • 21. Ferreira D, Nordberg A, Westman E. Biological subtypes of Alzheimer disease: a systematic review and meta‐analysis. Neurology. 2020;94(10):436‐448.
  • 22. Mohanty R, Ferreira D, Frerich S, Muehlboeck JS, Grothe MJ, Westman E; Alzheimer's Disease Neuroimaging Initiative. Neuropathologic features of antemortem atrophy‐based subtypes of Alzheimer disease. Neurology. 2022;99(4):e323‐33.
  • 23. Vermunt L, Sikkes SA, Van Den Hout A, et al. Duration of preclinical, prodromal, and dementia stages of Alzheimer's disease in relation to age, sex, and APOE genotype. Alzheimer's Dement. 2019;15(7):888‐898.
  • 24. Tsai MS, Tangalos EG, Petersen RC, et al. Apolipoprotein E: risk factor for Alzheimer disease. Am J Hum Genet. 1994;54(4):643.
  • 25. Riedel BC, Thompson PM, Brinton RD. Age, APOE and sex: triad of risk of Alzheimer's disease. J Steroid Biochem Mol Biol. 2016;160:134‐147.
  • 26. Sims JR, Zimmer JA, Evans CD, et al. Donanemab in early symptomatic Alzheimer disease: the TRAILBLAZER‐ALZ 2 randomized clinical trial. JAMA. 2023;330(6):512‐527.

Articles from Alzheimer's & Dementia are provided here courtesy of Wiley
