Abstract
INTRODUCTION
We aimed to develop and externally validate a mapping to translate Mini‐Mental State Examination (MMSE) scores to Clinical Dementia Rating scale Sum of Boxes (CDR‐SB) scores in patients with Alzheimer's disease (AD).
METHODS
We extended a previously published CDR‐SB to MMSE mapping to include later stages of AD and validated it using the US National Alzheimer's Coordinating Centers (NACC) database. Calibration and discrimination metrics were evaluated, and the mapping was validated in clinically relevant subgroups of the AD population.
RESULTS
A linear MMSE to CDR‐SB mapping demonstrated high goodness of fit (adjusted R² = 0.99) and satisfied all validation metrics. The extended mapping performed well across subgroups, though calibration was less accurate for patients with severe dementia.
DISCUSSION
This MMSE to CDR‐SB mapping performed generally well, though caution is needed when applying it to people with severe AD dementia. It could be used to help contextualize outcomes in AD clinical trials and real‐world health research studies.
Highlights
We developed a mapping between the CDR‐SB and MMSE instruments, by obtaining the optimal functional form of a mapping previously published using data from the Alzheimer's Disease Neuroimaging Initiative dataset and extending it to the full range of clinical severities for Alzheimer's disease.
The resulting mapping was externally validated using 26,729 MMSE and CDR‐SB pairs from participants recruited to the US National Alzheimer's Coordinating Centers database.
The selected model satisfied the pre‐specified selection criteria and the extended mapping demonstrated good performance, especially in the earlier clinical stages. However, caution is advised when inferring mapped scores in more severe dementia.
The extension and subsequent validation of the mapping can support the translation of AD severity assessments between datasets using CDR‐SB and MMSE, bridging a common information gap between clinical trials, real‐world evidence studies, and clinical practice.
Keywords: Clinical Dementia Rating scale Sum of Boxes, cognitive assessment mapping, external validation, Mini‐Mental State Examination
1. BACKGROUND
Accurate assessment of cognitive function is a key part of the diagnosis, staging and management of Alzheimer's disease (AD), the most common cause of dementia. 1 The Clinical Dementia Rating scale Sum of Boxes (CDR‐SB) 2 , 3 , 4 is widely used in clinical trials (such as for potential disease‐modifying therapies 5 ) for assessing cognition and function and for staging AD. However, its use in routine clinical practice is limited, primarily due to the lengthy administration time, which ranges from 30 to 90 min, 6 along with the need for specific training. Consequently, alternative shorter cognitive assessment tools such as the Mini‐Mental State Examination (MMSE) are often preferred in routine clinical practice and are more widely included in registries and other types of real‐world health data.
This divergence in assessment tools poses challenges in comparing evidence across different data sources and in contextualizing results from clinical trials. Previous studies have mapped the relationship between CDR‐SB and MMSE for the mild to moderate section of the AD clinical severity spectrum 7 and across ranges of CDR scores, 8 but a complete mapping across all symptomatic stages for these instruments has not yet been developed. Furthermore, external validation (i.e., evaluating the accuracy of the mapping in a different data source to the one which was used to develop the mapping) is currently lacking but would add confidence that a mapping would be generalizable to different settings and patient populations. 8 , 9 , 10 , 11 , 12 , 13
We therefore aimed to draw on the previous published but incomplete mappings to develop and externally validate a complete mapping of the AD clinical severity spectrum between CDR‐SB and MMSE which could be used to translate measurements across data sources, and be a useful tool to support future epidemiological research and clinical trials in AD.
2. METHODS
This study consisted of two phases: firstly, extending a previously published CDR‐SB and MMSE mapping 7 to the complete clinical severity spectrum for AD, and secondly, externally validating the extended mapping in a large real‐world cohort recruited at AD research centers in the United States.
2.1. Extending the current mapping to the full range of the AD clinical severity spectrum
Following a targeted literature review, the most suitable existing mapping was identified on the basis of being defined at the level of individual CDR‐SB and MMSE scores (i.e., rather than ranges of these). This mapping had previously been developed using data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database. Its limitation is that it covered only a limited range of the AD severity spectrum (i.e., CDR‐SB scores ≤6.5), and we therefore investigated the most suitable approach for extending it to the full range of the AD clinical severity spectrum using regression models. Models were fitted to the published CDR‐SB to MMSE mapping data provided by Balsis and colleagues 7 with CDR‐SB as the dependent variable and MMSE as the independent variable. Both linear and non‐linear models were considered, arranged in order of increasing functional complexity: (1) linear (no transformation): CDR ∼ a * MMSE + c; (2) power or log transformation: f(CDR) ∼ a * MMSE + c; and (3) logit transformation: CDR ∼ a * logit(MMSE) + c, where a is the coefficient for (transformed) MMSE, f is either a power or log function, and c is the intercept.
The appropriateness of these approaches was assessed using the following goodness of fit criteria: (1) adjusted R² of the regression being higher than 0.7; (2) standardized residuals plot showing a random distribution of residuals; and (3) Q–Q plot showing a normal distribution of residuals. The most parsimonious model that met these pre‐specified criteria was selected for all subsequent components of the analysis. The selected model was extrapolated using the final regression equation to include the more severe clinical AD stages across the maximum attainable MMSE range.
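As a sketch of this model-selection step, the linear fit and the adjusted R² criterion can be reproduced in a few lines. The study used R; the Python sketch below runs on hypothetical (MMSE, CDR‐SB) pairs lying near a linear relation, not the published Balsis et al. values.

```python
import numpy as np

def fit_linear_mapping(mmse, cdr_sb):
    """OLS fit of CDR-SB ~ a*MMSE + c; returns (slope, intercept, adjusted R^2)."""
    mmse = np.asarray(mmse, dtype=float)
    cdr_sb = np.asarray(cdr_sb, dtype=float)
    slope, intercept = np.polyfit(mmse, cdr_sb, 1)
    fitted = slope * mmse + intercept
    ss_res = np.sum((cdr_sb - fitted) ** 2)
    ss_tot = np.sum((cdr_sb - cdr_sb.mean()) ** 2)
    n, k = mmse.size, 1                       # n observations, k predictors
    r2 = 1.0 - ss_res / ss_tot
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)
    return slope, intercept, adj_r2

# Illustrative near-linear pairs (not the published mapping data)
rng = np.random.default_rng(0)
mmse = np.arange(4.0, 30.0)
cdr_sb = 20.2 - 0.68 * mmse + rng.normal(0.0, 0.25, mmse.size)
a, c, adj_r2 = fit_linear_mapping(mmse, cdr_sb)
print(adj_r2 > 0.7)  # criterion (1): adjusted R^2 above 0.7
```

The residual-based criteria (2) and (3) would then be checked graphically from `cdr_sb - (a * mmse + c)`.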
2.2. External validation
This was a cross‐sectional validation study to assess the accuracy of the extended mapping across cognitive stages from unimpaired cognition to severe dementia. Validation methods followed the best practices outlined by TRIPOD, 14 reporting both calibration and discrimination metrics (see the accompanying TRIPOD checklist).
Data from the US National Alzheimer's Coordinating Centers (NACC) were used for validation. 15 The database has been widely used for AD research and includes ∼41,000 participants who visited any of 35 AD centers across the United States between 2005 and 2022. The main NACC dataset (Uniform Data Set v3) was extracted in December 2022 for this study, and included at least 2000 participants in each of the cognitive stages at baseline: from unimpaired cognition to severe dementia.
The study population consisted of participants in the NACC with at least one pair of valid MMSE and CDR‐SB scores during a single assessment visit. Although assessment visits occur approximately annually, cognitive assessment scores might be missing or invalid. Full definitions of a valid cognitive test score are detailed in the Supplementary Methods. Briefly, participants and/or visits were excluded if they (1) had a date of death in the same or earlier month than their most recent visit, (2) had a date of discontinuation from follow up in the same or earlier month than their most recent visit, (3) had missing age or sex, and (4) had clinical staging and CDR‐global staging that were discordant in that visit.
2.3. Analysis
Data from all visits with a valid pair of MMSE and CDR‐SB scores were captured initially, with exploration undertaken to identify the most appropriate timepoint to address the aim of the validation study. The number of participants captured in each clinical stage was reported for each calendar year, and the timepoint which maximized the number of participants with later stages of dementia was selected. Sociodemographic characteristics, medication status, and cognitive scores were reported for the whole study population and subpopulations based on the timepoint selected, with the data presented for the overall population and stratified according to CDR‐global staging classification: unimpaired cognition (CDR‐global = 0), MCI (CDR‐global = 0.5), mild dementia (CDR‐global = 1), moderate dementia (CDR‐global = 2), and severe dementia (CDR‐global = 3).
The mapping generated in the initial phase was applied to the NACC dataset to estimate predicted CDR‐SB scores for individuals from their measured MMSE scores.
The mapping was validated by comparing the predicted and observed CDR‐SB scores for each participant, using the validation metrics outlined in Table 1 and in accordance with best practice reporting guidelines outlined by TRIPOD. 14 When applying the validation metrics, the predicted and observed CDR‐SB scores were treated as ordinal measures for the purpose of statistical testing.
TABLE 1.
Validation criteria for each of the validation metrics.
| Validation metric | Discrimination/calibration metric | Description | Validation threshold criteria | ||
|---|---|---|---|---|---|
| Bootstrapped Wilcoxon–Mann–Whitney | Discrimination | Non‐parametric two‐sample statistical test to assess similarity of the distribution of estimated and measured CDR‐SB scores; equivalent to area under the receiver‐operator characteristic (ROC) curve on ordinal data. 16 | Above the critical value of the within‐sample test statistic at the 95% level, for at least 95% of the bootstrapped samples (Supplementary methods) | ||
| Somers’ Delta | Discrimination | Non‐parametric two‐sample statistical test. | >0.7 | ||
| | Discrimination | Similar to Somers’ Delta. | >0.7 | ||
| Calibration plot | Calibration | Estimated scores (on the x‐axis) were plotted against the means and standard deviations of the observed scores (on the y‐axis), by tenths of the estimated score. 15 | The points on the plot are close to the 45° line | ||
| Area under the receiver operating characteristic curve (AUC): No/mild dementia versus moderate dementia | Discrimination | A graphical representation of a binary classifier's performance in terms of true positive rate and false positive rate, in this case differentiating no/mild dementia versus moderate dementia. | >0.7 | ||
| AUC: Moderate dementia versus severe dementia | Discrimination | A graphical representation of a binary classifier's performance in terms of true positive rate and false positive rate, in this case differentiating moderate dementia versus severe dementia. | >0.7 |
Abbreviation: CDR‐SB, Clinical Dementia Rating scale Sum of Boxes.
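Two of the discrimination metrics above can be sketched on illustrative data: the Wilcoxon–Mann–Whitney U statistic rescaled by the product of the sample sizes (which, as noted in Table 1, equals the AUC on ordinal data), and Somers' delta via SciPy. The study itself computed these metrics on NACC data in R; the scores below are simulated.

```python
import numpy as np
from scipy.stats import mannwhitneyu, somersd

# Illustrative CDR-SB scores only, not NACC data
rng = np.random.default_rng(1)
observed = rng.integers(0, 19, 200).astype(float)              # observed CDR-SB (0-18)
predicted = np.clip(observed + rng.normal(0, 2, 200), 0, 18)   # noisy mapped scores

# Mann-Whitney U between the two score distributions; dividing by n1*n2
# rescales U to the AUC (the equivalence noted in Table 1)
u = mannwhitneyu(predicted, observed).statistic
auc_equivalent = u / (len(predicted) * len(observed))

# Somers' delta: rank-based concordance between the paired scores
delta = somersd(predicted, observed).statistic

print(f"WMW-as-AUC: {auc_equivalent:.2f}  Somers' delta: {delta:.2f}")
```

The bootstrapped variant in Table 1 repeats the Wilcoxon–Mann–Whitney test over resampled score pairs and checks how often the within-sample criterion is met.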
Subgroup analysis was carried out for clinically relevant subgroups, including age tertiles (18–70 years, 70–80 years, and 80+ years), sex, any AD medication use (AD medications listed in Table S1) and education tertiles (0–13 years, 13–16 years, 16+ years).
RESEARCH IN CONTEXT
Systematic review: PubMed was searched for articles on “Clinical Dementia Rating” (CDR), “Mini‐Mental State Examination” (MMSE), and “mapping” to September 4, 2024. From 387 articles, seven assessed the relationship between CDR and other cognitive instruments, with no external validation. Two studies developed mappings between CDR or CDR Sum of Boxes and MMSE using European data. One focused on mild to moderate Alzheimer's disease (AD); the other covered all severities but had less granular CDR ranges.
Interpretation: A mapping between CDR‐SB and MMSE was developed, which bridges an information gap between clinical trials, real‐world studies, and practice, supporting translation of AD severity assessments across different datasets.
Future directions: Opportunities to apply the mapping include: (1) support randomized controlled trials in terms of both feasibility assessment and extensions that utilize routinely collected data to assess long‐term outcomes; (2) support the standardization of evidence submitted for health technology assessment across different markets.
R version 4.1.1 and R package pROC were used for analysis.
3. RESULTS
3.1. Extending the current mapping to the full range of the AD clinical severity spectrum
The first functional form investigated, a linear regression model (Table 2), met the pre‐specified selection criteria with an adjusted R² of 0.99 and randomly and normally distributed residuals, as displayed in the standardized residuals plot and Q–Q plot (Figures S1 and S2). A Box–Cox test was used as a sensitivity analysis to identify the optimal power transformation to use in the mapping, with the outcome confirming that a linear approach was optimal (Figure S3). Therefore, no alternative functional forms were investigated, as the linear functional form met the a priori selection criteria.
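The Box–Cox sensitivity check can be sketched by profiling the transformation parameter λ over candidate values and keeping the one that minimizes the residual sum of squares of the (geometric-mean-scaled) transformed response; a selected λ near 1 indicates that no power transformation is needed. This is an illustrative reimplementation on hypothetical data, not the study's R analysis.

```python
import numpy as np

def boxcox_profile_lambda(x, y, lambdas):
    """Return the lambda whose Box-Cox transform of y minimizes the OLS RSS.

    Uses geometric-mean scaling so RSS values are comparable across lambdas;
    y must be strictly positive.
    """
    x = np.asarray(x, float)
    y = np.asarray(y, float)
    gm = np.exp(np.mean(np.log(y)))          # geometric mean of the response
    X = np.column_stack([x, np.ones_like(x)])
    best_lam, best_rss = None, np.inf
    for lam in lambdas:
        if abs(lam) < 1e-12:
            z = gm * np.log(y)               # limiting case lambda = 0
        else:
            z = (y ** lam - 1) / (lam * gm ** (lam - 1))
        beta, *_ = np.linalg.lstsq(X, z, rcond=None)
        rss = float(np.sum((z - X @ beta) ** 2))
        if rss < best_rss:
            best_lam, best_rss = lam, rss
    return best_lam

# Illustrative exactly-linear data, strictly positive over this MMSE range
mmse = np.arange(1.0, 30.0)
cdr = 20.2 - 0.68 * mmse
lam = boxcox_profile_lambda(mmse, cdr, np.arange(0.0, 2.01, 0.25))
print(lam)  # 1.0 selected -> linear form adequate
```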
TABLE 2.
Results from the linear regression model.
| Term | Coefficient (standard error) | t‐Statistic | p‐Value | 95% Confidence interval |
|---|---|---|---|---|
| Intercept | 20.1982 (0.2988) | 67.6089 | <0.001 | (19.5923, 20.8041) |
| MMSE | −0.6809 (0.0120) | −56.9465 | <0.001 | (−0.7051, −0.6566) |
| F‐statistic | 3242.903 (degrees of freedom = 1; 36) | | | |
| Adjusted R² | 0.989 (degrees of freedom = 36) | | | |
| Residual standard error | 0.2527 | | | |
Abbreviation: MMSE, Mini‐Mental State Examination.
The final mapping equation between CDR‐SB (denoted as CDRSB) and MMSE was: CDRSB = 20.1982 − 0.6809 × MMSE.
The CDR‐MMSE mapping fitted using the linear model matched closely to the published Balsis et al. 7 study for the early stages of AD (Figure 1). However, predicted CDR‐SB scores for MMSE scores 0 to 3 exceeded the upper limit of the CDR‐SB range; therefore, scores in this range were assigned the ceiling CDR‐SB score of 18. The full lookup table of the synthesized (combined and extrapolated) CDR‐MMSE mapping is provided in Table S2. The fitted and extrapolated mapping in this study corresponds with the MMSE score ranges approximated for each CDR‐global stage proposed by Perneczky et al. 8 (Figure 1).
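Applying the mapping with the ceiling rule can be sketched as below. The intercept of 20.1982 is reconstructed from Table 2 (consistent with the reported standard error 0.2988 times the t‐statistic 67.6089, ≈20.20); the slope −0.6809 is as reported.

```python
import numpy as np

# Coefficients of the fitted linear model (slope from Table 2; intercept
# reconstructed as t-statistic x standard error, 67.6089 x 0.2988 ≈ 20.20)
INTERCEPT, SLOPE = 20.1982, -0.6809

def mmse_to_cdr_sb(mmse):
    """Predict CDR-SB from MMSE, clipped to the attainable range 0-18."""
    pred = INTERCEPT + SLOPE * np.asarray(mmse, dtype=float)
    return np.clip(pred, 0.0, 18.0)

# MMSE 0-3 hit the CDR-SB ceiling of 18, as described in the text
print(mmse_to_cdr_sb([0, 3, 4, 30]))
```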
FIGURE 1.

Fitted and published CDR‐SB–MMSE mappings, with the approximate CDR stage ranges. CDR‐SB, Clinical Dementia Rating scale Sum of Boxes; MMSE, Mini‐Mental State Examination.
3.2. External validation of the mapping
Among the 41,157 participants registered in the NACC database at the time of analysis, 26,729 had valid MMSE and CDR‐SB scores. The 25,768 participants and 74,785 visits that met the inclusion criteria (Figure 2) were considered for this validation phase. The timepoint selected for analysis was participants’ most recent visit, as it provided a larger sample of participants in later stages of AD dementia (Table S3).
FIGURE 2.

Attrition diagram of the NACC population by inclusion and exclusion criteria. NACC, National Alzheimer's Coordinating Centers.
Characteristics for the validation sample at their most recent visit are reported in Table 3. The mean CDR‐SB score recorded for the overall study population was 4.05 (across the CDR‐SB range of 0–18), while the mean fitted CDR‐SB score was 4.36. Compared to the ADNI database used in the Balsis et al. 7 study (Table S4), the NACC population contained a lower proportion of individuals with either no cognitive impairment, MCI, or mild dementia and a greater proportion of Black and minority ethnic groups, but was otherwise similar.
TABLE 3.
Characteristics of the study population at their most recent visit.
| Characteristic | Overall (all CDR‐global stages) N = 25,768 | 0.0 No impairment N = 8986 | 0.5 MCI N = 7346 | 1.0 Mild dementia N = 4739 | 2.0 Moderate dementia N = 3262 | 3.0 Severe dementia N = 1435 |
|---|---|---|---|---|---|---|
| Median age in years (IQR) | 75.00 (68.00, 82.00) | 73.00 (66.00, 80.00) | 76.00 (68.00, 82.00) | 77.00 (69.00, 83.00) | 78.00 (69.00, 84.00) | 77.00 (68.00, 84.00) |
| Sex | ||||||
| Female | 14,564 (56.52%) | 5864 (65.26%) | 3773 (51.36%) | 2391 (50.45%) | 1749 (53.62%) | 787 (54.84%) |
| Male | 11,204 (43.48%) | 3122 (34.74%) | 3573 (48.64%) | 2348 (49.55%) | 1513 (46.38%) | 648 (45.16%) |
| Mean years of education (SD) | 14.88 (3.55) | 15.68 (3.02) | 14.74 (3.63) | 14.43 (3.71) | 14.09 (3.82) | 13.87 (4.06) |
| Unknown (N) | 167 | 43 | 48 | 39 | 18 | 19 |
| AD medication use | ||||||
| Did not report use | 16,366 (63.51%) | 8880 (98.82%) | 4906 (66.78%) | 1460 (30.81%) | 687 (21.06%) | 433 (30.17%) |
| Reported use | 9248 (35.89%) | 97 (1.08%) | 2394 (32.59%) | 3253 (68.64%) | 2553 (78.26%) | 951 (66.27%) |
| Unknown | 154 (0.60%) | 9 (0.10%) | 46 (0.63%) | 26 (0.55%) | 22 (0.67%) | 51 (3.55%) |
| Mean CDR‐SB score (SD) | 4.05 (5.00) | 0.02 (0.10) | 1.93 (1.22) | 6.02 (1.44) | 11.42 (1.63) | 16.90 (1.34) |
| Mean MMSE score (SD) | 23.25 (7.67) | 28.89 (1.50) | 25.84 (3.69) | 20.28 (5.36) | 13.80 (5.79) | 5.91 (5.85) |
| Mean fitted CDR‐SB score (SD) | 4.36 (5.05) | 0.63 (0.95) | 2.63 (2.48) | 6.39 (3.63) | 10.75 (3.84) | 15.45 (3.36) |
| Ethnicity a | ||||||
| American Indian or Alaska Native | 240 (0.93%) | 48 (0.53%) | 105 (1.43%) | 47 (0.99%) | 31 (0.95%) | 9 (0.63%) |
| Asian | 301 (1.17%) | 131 (1.46%) | 76 (1.03%) | 47 (0.99%) | 38 (1.16%) | 9 (0.63%) |
| Black or African American | 3186 (12.36%) | 1241 (13.81%) | 1010 (13.75%) | 433 (9.14%) | 362 (11.10%) | 140 (9.76%) |
| Native Hawaiian or other Pacific Islander | 16 (0.06%) | 6 (0.07%) | <5 (0.05%) | <5 (0.04%) | <5 (0.09%) | <5 (0.07%) |
| Other | 526 (2.04%) | 79 (0.88%) | 166 (2.26%) | 130 (2.74%) | 86 (2.64%) | 65 (4.53%) |
| Unknown | 113 (0.44%) | 28 (0.31%) | 40 (0.54%) | 22 (0.46%) | 17 (0.52%) | 6 (0.42%) |
| White | 21,386 (82.99%) | 7453 (82.94%) | 5945 (80.93%) | 4058 (85.63%) | 2725 (83.54%) | 1205 (83.97%) |
| Hispanic or Latino | 2072 (8.04%) | 546 (6.07%) | 660 (8.98%) | 401 (8.46%) | 288 (8.83%) | 177 (12.33%) |
Abbreviations: AD, Alzheimer's disease; CDR‐SB, Clinical Dementia Rating scale Sum of Boxes; IQR, interquartile range; MMSE, Mini‐Mental State Examination.
a These ethnicity categories may overlap if patients reported belonging to more than one category.
3.3. Validation metrics
When externally validated on the NACC population, the mapping satisfied all validation metrics (Table 4; Figure S4), with a Wilcoxon–Mann–Whitney test statistic of 0.98 and concordance metrics of 0.71 (Somers’ Delta) and 0.74. The calibration plot showed that the means of each decile of fitted CDR‐SB scores were very close to the observed CDR‐SB scores, particularly at earlier AD stages where CDR‐SB scores were less than 10 (Figure 3). For the top two deciles, where CDR‐SB scores were above 10 (i.e., above the midpoint of the moderate dementia stage), the mapping slightly underestimated CDR‐SB scores in the NACC dataset.
TABLE 4.
Validation metrics from the mapping on the overall NACC population.
| Validation metric | Validation criteria | Test statistic | |
|---|---|---|---|
| Bootstrapped Wilcoxon–Mann–Whitney | Above the critical value of the within‐sample test statistic at the 95% level, for at least 95% of the bootstrapped samples | 0.98, p ≥ 0.05 (Figure S4 displays the distribution of the within‐sample p‐values) | |
| Somers’ delta | Above the rule of thumb threshold of 0.7 (strong concordance) | 0.70 | |
| | Above the rule of thumb threshold of 0.7 (strong concordance) | 0.74 | |
| Calibration plot | The points on the plot are close to the 45° line | See Figure 3 | |
| Area under the receiver operating characteristic curve (AUC): No/mild dementia versus moderate dementia | Above the rule of thumb of 0.7 | 0.89 (95%CI: 0.88–0.89) | |
| AUC: Moderate dementia versus severe dementia | Above the rule of thumb of 0.7 | 0.76 (95%CI: 0.74–0.77) |
Abbreviation: CI, confidence interval; NACC, National Alzheimer's Coordinating Centers.
FIGURE 3.

Calibration plot of deciles of the estimated CDR‐SB scores (means and standard deviations) versus observed CDR‐SB scores. CDR‐SB, Clinical Dementia Rating scale Sum of Boxes.
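The decile calibration check behind Figure 3 can be sketched as follows: predictions are sorted and split into tenths, and the mean observed score in each tenth is compared with the mean predicted score (points near the 45° line indicate good calibration). The scores below are simulated, not NACC data.

```python
import numpy as np

def calibration_by_decile(predicted, observed):
    """Return (mean predicted, mean observed) for each decile of the predicted score."""
    predicted = np.asarray(predicted, float)
    observed = np.asarray(observed, float)
    order = np.argsort(predicted)
    bins = np.array_split(order, 10)          # ten roughly equal-sized groups
    return [(predicted[b].mean(), observed[b].mean()) for b in bins]

# Illustrative scores: predictions with modest noise around the observed values
rng = np.random.default_rng(2)
pred = rng.uniform(0, 18, 500)
obs = np.clip(pred + rng.normal(0, 1.5, 500), 0, 18)
for p_mean, o_mean in calibration_by_decile(pred, obs):
    print(f"predicted {p_mean:5.2f}  observed {o_mean:5.2f}")
```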
We assessed the distribution of the NACC data that was used for validation against the validated CDR‐MMSE mapping. The mean values of MMSE and CDR‐SB scores for each clinical stage largely fitted within the CDR and MMSE ranges for each clinical stage apart from the “No impairment” stage (Figure S5), and individual scores were observed to aggregate around the mapping with notable dispersion (Figure S6). In addition, the mapping discriminated well between no/mild dementia (consisting of individuals with either no cognitive impairment, MCI, or mild dementia) versus moderate dementia (area under the receiver operating characteristic curve [AUC] = 0.89; 95%CI: 0.88–0.89) and acceptably between moderate dementia versus severe dementia patients (AUC = 0.76; 95% CI: 0.74–0.77) (Figure S7).
3.4. Sensitivity analysis in NACC subgroups
The validation testing applied to the overall NACC population was repeated on the subgroups (Table S5, Figures S8–S11). The validation metrics were largely met although a small number of assessments did fall below the validation thresholds. In summary, the mapping accurately predicted clinical severity stage when stratified by subgroups, apart from when stratified by AD medication use.
4. DISCUSSION
This study has developed a mapping between the CDR‐SB and MMSE instruments, by obtaining the optimal functional form of a mapping previously developed in the ADNI dataset and extending this to the full range of clinical severities, and validating the resulting mapping in a large cohort of participants recruited to the NACC database. In external validation, the resulting mapping demonstrated acceptable accuracy both overall and in clinically relevant subgroups, across a comprehensive range of validation metrics. The resulting mapping could be used in research settings to translate clinical staging assessments between CDR‐SB and MMSE scores and help to contextualize clinical trial findings to routine clinical practice.
There are several findings from this study. Firstly, a linear CDR‐SB–MMSE mapping was found to meet the pre‐specified goodness of fit criteria for model selection, an observation confirmed by the Box–Cox sensitivity analysis. The simplicity of a linear mapping compared to more complicated models facilitates easier application and interpretation in clinical practice and contextualization of research results. 16 , 17 Furthermore, the mapping can be used effectively even in settings with limited resources or expertise in complex statistical methods, increasing equity of access. 18 Indeed, the mapping can be calculated manually if statistical software is not available. Secondly, the extended mapping showed good performance across the validation cohort overall and the majority of subgroups, demonstrating its robustness and applicability in diverse patient populations. This indicates that the mapping can reliably be used in a wide range of clinical and demographic settings. Performance was especially strong in less severe AD stages, although caution is needed when applying the mapping to severe AD dementia, where accuracy was lower.
There are various potential use cases for the mapping. Firstly, the mapping facilitates outcome harmonization through standardization of MMSE to CDR‐SB scores and vice versa, enabling direct and meaningful comparisons between studies. This is important for demonstrating the consistency of treatment effects across diverse patient populations, varying doses, and different treatment durations. Such comparisons are essential in building a comprehensive understanding of treatment efficacy and safety, leading to more reliable conclusions about the generalizability of study results. 19 Furthermore, given CDR‐SB is a commonly used measure of efficacy in randomized controlled trials, 5 the mapping provides a solution for contextualizing trial results. By translating CDR‐SB scores into more universally recognized and utilized metrics like MMSE, the mapping enables a clearer comparison and interpretation of trial outcomes in the context of routinely collected data. Taken together, the mapping bridges potential methodological gaps between types of studies, especially between randomized controlled trials and real‐world evidence studies.
Secondly, the tool may help support the standardization of evidence submitted for health technology assessment across different markets. The tool can be utilized to transform the outcome to the preferred option in the selected geography. Alternatively, the tool can be used to inform estimates for the purposes of transportability analysis. 20 Evidence generation using transportability analysis is gaining traction and refers to the ability to generalize inferences from a study sample in one country to a target population in another country where the study was not conducted. 21 , 22 Indeed, updates to real‐world data guidelines, such as the National Institute for Health and Care Excellence (NICE) real‐world evidence framework, are anticipated soon.
Finally, the mapping has the potential to support randomized controlled trials in terms of both feasibility assessment and extensions that utilize routinely collected data to assess long‐term efficacy and safety. Regarding feasibility assessment, the mapping can facilitate more accurate counts for assessing the overall feasibility of trials and/or decisions about opening specific sites. The ability to map MMSE to CDR‐SB may provide more reliable insight into how many individuals meet the trial inclusion criteria, helping to improve trial efficiency. 23 , 24 Regarding trial extension, routine healthcare data are increasingly being used to extend follow‐up beyond the formal study period of randomized controlled trials, 23 which may facilitate a longer time horizon for the assessment of efficacy and safety outcomes.
External validation in the large NACC dataset adds evidence that the mapping can be generalized to different data sources and populations. It followed best practice clinical prediction tool validation guidelines by assessing both discrimination (via several validation metrics) and calibration, and it investigated the performance of the mapping in multiple subgroups of the NACC. This study aligned with approaches recommended in previous literature on the optimal design of, and need for, an external validation study. 25
The utilization of the NACC database allowed the analysis to be validated on a well‐researched, large data resource that is one of the very few that includes measures of both CDR‐SB and MMSE. Moreover, the NACC database had multiple other advantages: recent cognitive test records and detailed, well‐structured database documentation. 15 The minimal exclusion criteria minimized selection bias: the recording of age and sex in this database was complete, few participants had a date of death or date of discontinuation (withdrawal from the NACC cohort) recorded on or before their most recent visit, and any records that were excluded for these reasons would likely be erroneous.
The main limitation of the resultant mapping was the poorer performance observed in those with more severe dementia. This is a greater limitation when using continuous values given lack of precision, especially for predicting CDR‐SB for MMSE scores 0 to 3 which exceeded the upper limit of the CDR‐SB range, but less of an issue if the outcomes are categorized. The extrapolation to more extreme values is based only on the reported mapping values given that person‐level data to re‐estimate these were not available from the paper by Balsis and colleagues. Therefore, the validation study in NACC was undertaken to address this lack of data. Next steps to potentially improve calibration at more extreme values include additional validation in other data sources, especially beyond the United States, and recalibration of the model as appropriate when more data become available.
We also observed that the AD medication use subgroup fell below the pre‐specified validation threshold in the subgroup analysis. Furthermore, only ∼40 data points were provided for the CDR‐SB to MMSE mapping published by Balsis et al., 7 covering the less severe stages; therefore, candidate functional forms of the mapping derived via regression had to be limited to formats with fewer degrees of freedom, to avoid overspecification of the regression.
Finally, we note that CDR is a global instrument designed to assess both cognition and functioning, whereas the MMSE is a short cognitive screening test that has been shown to have poor internal construct validity. Furthermore, although we have demonstrated high goodness of fit in the proposed MMSE to CDR‐SB mapping overall, it may not translate well for all individual applications.
The extended mapping demonstrated good performance, especially in the earlier clinical stages. The findings can support the translation of clinical severity assessments in AD between datasets using CDR‐SB and MMSE, bridging a common information gap between clinical trials and real‐world evidence studies.
AUTHOR CONTRIBUTIONS
Dominic Trepel: Conceptualization; interpretation of data; writing—review and editing. Niels Juul Brogaard and Jens Gundgaard: Conceptualization; methodology; writing—review and editing. Mei Sum Chan: Conceptualization; formal analysis; methodology; writing—review and editing. Benjamin D. Bray: Conceptualization; methodology; data verification; writing—review and editing. Julie Hviid Hahn‐Pedersen and Pepa Polavieja: Methodology; validation; writing—review and editing. Men Hoang: Methodology; writing—review and editing. Mei Sum Chan and Benjamin D. Bray had full access to all the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis. All authors critically reviewed and approved the final version of the manuscript and accept responsibility to submit for publication.
CONFLICT OF INTEREST STATEMENT
D.T. is an associate editor for Alzheimer's & Dementia. M.S.C. and B.D.B. are employed by, and B.D.B. is a partner in, Health Analytics, Lane Clark & Peacock LLP. N.J.B., J.H.H.P., J.G., and P.P. are employed by Novo Nordisk, which has two ongoing phase 3 trials in early AD, and are shareholders of Novo Nordisk A/S. M.H. has nothing to disclose. Author disclosures are available in the supporting information.
CONSENT STATEMENT
Direct consent was not necessary. The NACC team obtained written informed consent from all participants and co‐participants.
Supporting information
Supporting Information
ACKNOWLEDGMENTS
F. Ma and A. Thompson from LCP participated in writing and technical editing of the manuscript. This study was funded by Novo Nordisk A/S, which participated in study design; analysis and interpretation of data; writing of the report; and the decision to submit the article for publication.
Brogaard NJ, Hahn‐Pedersen JH, Gundgaard J, et al. Extension and external validation of mapping between the Mini‐Mental State Examination and the Clinical Dementia Rating scale in patients with Alzheimer's disease. Alzheimer's Dement. 2025;21:e70163. 10.1002/alz.70163
DATA AVAILABILITY STATEMENT
The NACC data that support the findings of this study are available on approved request via the NACC data query request process (National Alzheimer's Coordinating Center).
REFERENCES
- 1. Prince M, Wimo A, Guerchet M, Ali G, Wu Y, Prina M. Alzheimer's Disease International: World Alzheimer Report. Alzheimer's Disease International; 2015.
- 2. Morris JC. The Clinical Dementia Rating (CDR): current version and scoring rules. Neurology. 1993;43(11):2412.
- 3. Morris JC, Ernesto C, Schafer K, et al. Clinical dementia rating training and reliability in multicenter studies: the Alzheimer's disease cooperative study experience. Neurology. 1997;48(6):1508‐1510.
- 4. O'Bryant SE, Waring SC, Cullum CM, et al. Staging dementia using clinical dementia rating scale sum of boxes scores: a Texas Alzheimer's research consortium study. Arch Neurol. 2008;65(8):1091‐1095.
- 5. Cedarbaum JM, Jaros M, Hernandez C, et al. Rationale for use of the clinical dementia rating sum of boxes as a primary outcome measure for Alzheimer's disease clinical trials. Alzheimers Dement. 2013;9(1):S45‐S55.
- 6. Schmidt K. Clinical Dementia Rating Scale. In: Encyclopedia of Quality of Life and Well‐Being Research. Springer; 2021:1‐5.
- 7. Balsis S, Benge JF, Lowe DA, Geraci L, Doody RS. How do scores on the ADAS‐Cog, MMSE, and CDR‐SOB correspond? Clin Neuropsychol. 2015;29(7):1002‐1009.
- 8. Perneczky R, Wagenpfeil S, Komossa K, Grimmer T, Diehl J, Kurz A. Mapping scores onto stages: Mini‐Mental State Examination and clinical dementia rating. Am J Geriatr Psychiatry. 2006;14(2):139‐144.
- 9. Andrews JS, Desai U, Kirson NY, Zichlin ML, Ball DE, Matthews BR. Disease severity and minimal clinically important differences in clinical outcome assessments for Alzheimer's disease clinical trials. Alzheimers Dement. 2019;5(1):354‐363.
- 10. Borland E, Edgar C, Stomrud E, Cullen N, Hansson O, Palmqvist S. Clinically relevant changes for cognitive outcomes in preclinical and prodromal cognitive stages: implications for clinical Alzheimer trials. Neurology. 2022;99(11):e1142‐e1153.
- 11. Srishyla D, Adam H, Shah K, Wu T, King T, Summers RL. The association of Mini‐Mental State Examination (MMSE) with Clinical Dementia Rating (CDR). Public Health Rev. 2020;3(2).
- 12. Tahami Monfared AA, Houghton K, Zhang Q, Mauskopf J; Alzheimer's Disease Neuroimaging Initiative. Staging disease severity using the Alzheimer's disease composite score (ADCOMS): a retrospective data analysis. Neurol Ther. 2022;11(1):413‐434.
- 13. Wessels AM, Rentz DM, Case M, Lauzon S, Sims JR. Integrated Alzheimer's disease rating scale: clinically meaningful change estimates. Alzheimers Dement. 2022;8(1):e12312.
- 14. Collins GS, Reitsma JB, Altman DG, Moons KG. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD statement. Circulation. 2015;131(2):211‐219.
- 15. Beekly DL, Ramos EM, Lee WW, et al. The National Alzheimer's Coordinating Center (NACC) database: the Uniform Data Set. Alzheimer Dis Assoc Disord. 2007;21(3):249‐258.
- 16. Narayanan M, Chen E, He J, Kim B, Gershman S, Doshi‐Velez F. How do humans understand explanations from machine learning systems? An evaluation of the human‐interpretability of explanation. 2018. arXiv preprint arXiv:1802.00682.
- 17. Lage I, Chen E, He J, et al. An evaluation of the human‐interpretability of explanation. 2019. arXiv preprint arXiv:1902.00006.
- 18. Cabitza F, Rasoini R, Gensini GF. Unintended consequences of machine learning in medicine. JAMA. 2017;318(6):517‐518.
- 19. Moulton V, McElroy E, Richards M, et al. A guide to the cognitive measures in five British birth cohort studies. CLOSER; 2020.
- 20. Dahabreh IJ, Matthews A, Steingrimsson JA, Scharfstein DO, Stuart EA. Using trial and observational data to assess effectiveness: trial emulation, transportability, benchmarking, and joint analysis. Epidemiol Rev. 2024;46(1):1‐16.
- 21. Degtiar I, Rose S. A review of generalizability and transportability. Annu Rev Stat Appl. 2023;10(1):501‐524.
- 22. Ramagopalan SV, Popat S, Gupta A, et al. Transportability of overall survival estimates from US to Canadian patients with advanced non–small cell lung cancer with implications for regulatory and health technology assessment. JAMA Netw Open. 2022;5(11):e2239874.
- 23. Kwakkenbos L, Imran M, McCall SJ, et al. CONSORT extension for the reporting of randomised controlled trials conducted using cohorts and routinely collected data (CONSORT‐ROUTINE): checklist with explanation and elaboration. BMJ. 2021;373:n857.
- 24. van den Bogert CA, Souverein PC, Brekelmans CT, et al. Recruitment failure and futility were the most common reasons for discontinuation of clinical drug trials: results of a nationwide inception cohort study in the Netherlands. J Clin Epidemiol. 2017;88:140‐147.
- 25. Steyerberg EW, Harrell FE Jr. Prediction models need appropriate internal, internal‐external, and external validation. J Clin Epidemiol. 2015;69:245.
