Abstract
Aims
This study aimed to clarify the different topographical distribution of tau pathology between progressive supranuclear palsy (PSP) and corticobasal degeneration (CBD) and establish a machine learning‐based decision tree classifier.
Methods
Paraffin‐embedded sections of the temporal cortex, motor cortex, caudate nucleus, globus pallidus, subthalamic nucleus, substantia nigra, red nucleus, and midbrain tectum from 1020 PSP and 199 CBD cases were assessed by phospho‐tau immunohistochemistry. The severity of tau lesions (i.e., neurofibrillary tangle, coiled body, tufted astrocyte or astrocytic plaque, and tau threads) was semi‐quantitatively scored in each region. Hierarchical cluster analysis was performed using tau pathology scores. A decision tree classifier was made with tau pathology scores using 914 cases. Cross‐validation was done using 305 cases. An additional ten cases were used for a validation study.
Results
Cluster analysis displayed two distinct clusters; the first cluster included only CBD, and the other cluster included all PSP and six CBD cases. We built a decision tree, which used only seven decision nodes. The scores of tau threads in the caudate nucleus were the most decisive factor for predicting CBD. In a cross‐validation, 302 out of 305 cases were correctly diagnosed. In the pilot validation study, three investigators made a correct diagnosis in all cases using the decision tree.
Conclusion
Regardless of the morphology of astrocytic tau lesions, semi‐quantitative tau pathology scores in select brain regions are sufficient to distinguish PSP and CBD. The decision tree simplifies neuropathologic differential diagnosis of PSP and CBD.
Keywords: corticobasal degeneration, corticobasal syndrome, decision tree classifier, hierarchical cluster analysis, machine learning, progressive supranuclear palsy
Decision tree classifier for tauopathies.
INTRODUCTION
Progressive supranuclear palsy (PSP) and corticobasal degeneration (CBD) are sporadic, progressive neurodegenerative diseases, collectively termed tauopathies.[1, 2] PSP typically presents with levodopa‐unresponsive parkinsonism, postural instability, frequent falls, vertical supranuclear gaze palsy and cognitive impairment, with the most common presentation referred to as Richardson syndrome (RS).[3] Prototype clinical features of CBD include asymmetric rigidity and apraxia, parkinsonism, dystonia, myoclonus, cortical sensory loss and cognitive impairment, referred to as corticobasal syndrome (CBS).[4] In addition to typical presentations, both PSP and CBD can present with RS, CBS, behavioural variant frontotemporal dementia and progressive non‐fluent aphasia.[1, 2] This clinical overlap makes a clinical diagnosis of tauopathy challenging; thus, autopsy is indispensable to confirm a diagnosis.
PSP and CBD show similar tau pathology characterised by numerous neuronal and glial lesions composed of pathological aggregates of insoluble tau protein in the grey and white matter of the neocortex, basal ganglia, diencephalon and brainstem.[5, 6, 7] Neuronal loss and atrophy in the subthalamic nucleus, red nucleus and cerebellar dentate nucleus are more frequent and severe in PSP compared to CBD.[5] While tau pathology occurs predominantly in hindbrain structures in PSP, tau pathology in CBD occurs predominantly in forebrain structures.[8, 9] The distinct morphology of astrocytic lesions is also helpful in distinguishing the two diseases.[5, 10] The tufted astrocyte is characteristic of PSP: a radial arrangement of thin, long, branching accumulations of tau in the proximal processes of astrocytes.[6, 7] The astrocytic plaque is a pathognomonic lesion in CBD and is an annular cluster of short and stubby processes of astrocytes.[5] Even though the two diseases have different distributions and morphologic features of tau pathology, the neuropathologic diagnosis of PSP and CBD is sometimes challenging because the distribution patterns of neurodegeneration and tau lesions overlap.[7, 9, 11, 12]
With advances in machine learning, the application of computer‐aided diagnosis is a promising technology to assist diagnostic decision‐making.[13, 14] Several machine learning methods based on deep learning, such as image classification and object detection, have been applied in the fields of radiology, pathology and other specialties [15, 16, 17, 18, 19, 20]; however, due to the “black box” nature of deep learning, it is difficult to interpret the results from deep learning, which may limit their use in decision‐making in clinical practice.[21] A decision tree is a promising method to overcome this problem.[22] A decision tree is a machine learning method that separates outcomes by recursively splitting cases on the most discriminating variables, and the result can be displayed as a tree. A recent study has shown that a machine learning‐based decision tree using CSF biomarkers showed a higher diagnostic accuracy of Alzheimer's disease (AD) compared with a traditional cut‐off.[23] The advantage of this technique is the “white box” nature; clinicians are able to interpret the output of the machine learning algorithm and use the results as a flowchart.
The present study aimed to demonstrate that the topographical distribution and severity of tau pathology, rather than the morphology of astrocytic tau lesions or other pathological features, are sufficient to distinguish between PSP and CBD. To achieve this, we performed hierarchical cluster analysis using semi‐quantitative scores of tau lesions in select brain regions from PSP and CBD. In addition, we constructed machine learning‐based decision tree classifiers to identify the most decisive predictive factor and provide a simple flowchart for diagnosis.
MATERIALS AND METHODS
Case selection and ethical approval
All brain tissues used in this study were from the Mayo Clinic brain bank collected between 2000 and January 2020. During this period, 1411 and 261 cases were given a neuropathologic diagnosis of PSP and CBD, respectively. Cases with known MAPT mutations were excluded. Cases with missing data for at least one neuroanatomical region for a given tau pathology score were also excluded from the study. The remaining 1219 cases, consisting of 1020 PSP and 199 CBD, were included in the study. For the validation study, 10 consecutive cases of either PSP or CBD between June and August 2020 were selected. Demographic information and clinical diagnoses were extracted from medical records and a questionnaire filled out by a family member. Clinical diagnoses of all cases were divided into three categories: RS, CBS and others (i.e., PSP‐parkinsonism, frontotemporal dementia, AD, dementia with Lewy bodies, aphasia and PSP‐pure akinesia with gait freezing) based on available clinical information.[2, 4] Brain autopsies were performed after consent of the legal next‐of‐kin or individuals with power‐of‐attorney to grant consent. De‐identified studies of autopsy samples are considered exempt from human subject research by the Mayo Clinic Institutional Review Board.
Neuropathologic assessment
Formalin‐fixed brains underwent systematic and standardised sampling with neuropathologic evaluation by a single, experienced neuropathologist (DWD). Regions sampled on all cases included six regions of neocortex, two levels of hippocampus, a basal forebrain section that includes amygdala, lentiform nucleus and hypothalamus, anterior corpus striatum, thalamus at the level of the subthalamic nucleus, midbrain, pons, medulla and two sections of cerebellum, one including the dentate nucleus. Paraffin‐embedded 5‐μm thick sections mounted on glass slides were stained with haematoxylin and eosin and thioflavin S. Braak neurofibrillary tangle (NFT) stage and Thal amyloid phase were assigned based upon lesion density and distribution with thioflavin S fluorescent microscopy according to published criteria.[24, 25] For Braak NFT stage, sections from the entorhinal cortex (stage II), the pyramidal layer of the CA1 subsector of the hippocampus (stage III), temporal cortex (stage IV), frontal cortex (stage V) and primary visual cortex (stage VI) were used. For Thal amyloid phase, sections from the frontal cortex (phase 1), the pyramidal layer of the CA1 subsector of the hippocampus (phase 2), putamen (phase 3), CA4 subsector of the hippocampus (phase 4) and the molecular layer of the cerebellum (phase 5) were used. The neuropathological diagnosis of AD was based on the consensus criteria for the neuropathologic diagnosis of AD.[26]
Immunohistochemistry for phospho‐tau (CP13, Ser202, mouse monoclonal, 1:1000, from the late Dr Peter Davies, Feinstein Institute, North Shore Hospital, NY) was performed using a DAKO Autostainer (Universal Staining System, Carpinteria, CA) to establish a neuropathological diagnosis of PSP and CBD.[5, 6, 7] The severity of tau pathology, which included NFTs (including pretangles), coiled bodies, astrocytic lesions (including tufted astrocytes and astrocytic plaques) and tau threads, was graded semi‐quantitatively on a four‐point scale (0 = absent, 1 = mild, 2 = moderate, 3 = severe) by an experienced neuropathologist (DWD) in eight brain regions: temporal cortex, motor cortex, caudate nucleus, globus pallidus, subthalamic nucleus, red nucleus, substantia nigra and midbrain tectum. Representative images of tau pathology scores in each lesion type are shown in Figure 1.
Immunohistochemistry for phospho‐tau (AT8, Ser202/Thr205, mouse monoclonal, 1:1000, Invitrogen) was also performed in select cases using the sections of caudate nucleus to show similarity with CP13 (Figure S1).
Hierarchical cluster analysis
Hierarchical cluster analysis using Euclidean distance and average linkage clustering was performed on patients and region‐specific variables reflecting the tau pathology scores in eight brain regions in 1219 cases. A heatmap was generated to visualise hierarchical clustering using the “pheatmap package” in R 3.4.3 (The R Foundation for Statistical Computing, Vienna, Austria).
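The clustering step can be illustrated with a minimal sketch. The study used the pheatmap package in R; the example below swaps in SciPy as a Python stand-in, and the score matrix is synthetic (the 32 columns assume four lesion types in each of eight regions, and the group sizes are arbitrary).

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)

# Synthetic stand-in for the score matrix (NOT the study data):
# rows are cases, columns are lesion-type-by-region scores on a 0-3 scale.
n_features = 32  # assumption: 4 lesion types x 8 regions
group_a = rng.integers(0, 2, size=(30, n_features))  # scores of 0 or 1
group_b = rng.integers(2, 4, size=(10, n_features))  # scores of 2 or 3
scores = np.vstack([group_a, group_b]).astype(float)

# Average-linkage clustering on Euclidean distances between cases.
tree = linkage(scores, method="average", metric="euclidean")

# Cut the dendrogram into two clusters.
labels = fcluster(tree, t=2, criterion="maxclust")
print(labels)
```

Cutting the same dendrogram at a larger `t` would yield finer sub-clusters, comparable in spirit to subdividing one cluster for further analysis.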
Machine learning‐based decision tree classifier
A decision tree classifier was created using the “scikit‐learn” Python module.[27] The classification and regression tree (CART) algorithm and the Gini impurity measure were used to construct decision trees. A total of 1219 cases were randomly divided into a training set (914 cases; 75%) and a testing set (305 cases; 25%). The target variable was the pathological diagnosis (i.e., PSP or CBD). The predictor variables were the tau pathology scores in eight brain regions.
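A minimal sketch of this workflow with scikit‐learn is given below. The data are synthetic and the labelling rule is hypothetical; the random seed, the number of cases and the feature count are assumptions made only so that the example runs, and they do not reproduce the study's split.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n_cases, n_features = 400, 32  # assumed sizes for the toy example

# Synthetic 0-3 scores; NOT the study data.
X = rng.integers(0, 4, size=(n_cases, n_features))
# Hypothetical rule so that the toy labels are learnable:
# class 1 when feature 0 has the maximum score of 3.
y = (X[:, 0] == 3).astype(int)

# Random 75% / 25% split into training and testing sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# scikit-learn's tree implements an optimised version of CART;
# criterion="gini" selects the Gini impurity measure.
clf = DecisionTreeClassifier(criterion="gini", max_depth=3, random_state=0)
clf.fit(X_train, y_train)

train_score = clf.score(X_train, y_train)  # accuracy on the training set
test_score = clf.score(X_test, y_test)     # accuracy on the held-out set
print(train_score, test_score)
```

The `max_depth` argument caps the number of layers, which is the parameter varied when trading accuracy against the simplicity of the resulting flowchart.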
Validation study
For further validation of the decision tree, three investigators (SK, XZ and DWD), who have different levels of experience in neuropathologic research in tauopathy, blindly assessed tau pathology scores of select brain regions in ten most recent cases of either PSP or CBD (validation set). For scoring, each investigator separately reviewed glass slides under a microscope, rather than using digital images. The diagnosis of PSP or CBD was made based only on the tau pathology scores and the decision tree classifier.
Statistical analysis
All statistical analyses were performed using R 3.4.3. Fisher's exact test was performed for group comparisons of categorical data. The Mann‐Whitney rank sum test and Student's t‐test were used for analyses of continuous variables, as appropriate. p‐values < 0.05 were considered statistically significant.
RESULTS
Cohort summary
The demographic and clinicopathologic features of 1020 PSP and 199 CBD cases are summarised in Table 1. Patients with PSP were significantly older than those with CBD (75 ± 8 vs. 70 ± 8 years; p < 0.001). Although the frequency of concurrent neuropathologic diagnosis of AD was not significantly different in PSP and CBD (10% vs. 6%; p = 0.076), Braak NFT stage (median II [interquartile range II–III] vs. II [I–III]; p = 0.002) and Thal amyloid phase (median 1 [0–3] vs. 0 [0–2]; p < 0.001) were significantly higher in PSP than in CBD. The symptomatic duration of PSP was longer than that of CBD (7 vs. 6 years; p < 0.001). The majority of PSP patients (82%) were given a clinical diagnosis of RS, followed by CBS (8%). In contrast, the clinical diagnosis of CBD was more heterogeneous: CBS in 37%, RS in 36% and others in 28%.
TABLE 1.
Variable | PSP (N = 1020) | CBD (N = 199) | p value |
---|---|---|---|
Male, No. (%) | 530 (52%) | 103 (52%) | 0.936 |
Age at death, years | 75 ± 8 | 70 ± 8 | <0.001 |
Brain weight, g | 1140 ± 150 | 1110 ± 140 | 0.006 |
Concurrent AD | 97 (10%) | 11 (6%) | 0.076 |
Braak neurofibrillary tangles stage | II (II, III) | II (I, III) | 0.002 |
0 | 116 (11%) | 27 (14%) | |
I | 111 (11%) | 27 (14%) | |
II | 336 (33%) | 80 (40%) | |
III | 282 (28%) | 44 (22%) | |
IV | 154 (15%) | 18 (9%) | |
V | 12 (1%) | 1 (1%) | |
VI | 9 (1%) | 2 (1%) | |
Thal amyloid phase | 1 (0, 3) | 0 (0, 2) | <0.001 |
0 | 452 (44%) | 111 (56%) | |
1 | 154 (15%) | 35 (18%) | |
2 | 99 (10%) | 23 (12%) | |
3 | 220 (22%) | 25 (13%) | |
4 | 56 (6%) | 2 (1%) | |
5 | 39 (4%) | 3 (2%) | |
Disease duration, years | 7 ± 3 | 6 ± 2 | <0.001 |
Clinical diagnosis | | | <0.001 |
RS | 836 (82%) | 71 (36%) | |
CBS | 77 (8%) | 73 (37%) | |
Other | 107 (11%) | 55 (28%) |
Data are displayed as n (%), mean ± SD or median (25th, 75th percentile).
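Because Table 1 reports the number of cases at each Braak NFT stage, the ordinal data can be rebuilt from the counts and the group comparison re‐examined. The sketch below uses SciPy rather than R, so tie handling and the exact p‐value may differ slightly from the reported analysis; it is an illustration, not the authors' script.

```python
import numpy as np
from scipy.stats import mannwhitneyu

# Cases per Braak NFT stage (0, I, II, III, IV, V, VI), copied from Table 1.
psp_counts = [116, 111, 336, 282, 154, 12, 9]
cbd_counts = [27, 27, 80, 44, 18, 1, 2]

stages = np.arange(7)
psp = np.repeat(stages, psp_counts)  # one entry per PSP case
cbd = np.repeat(stages, cbd_counts)  # one entry per CBD case

# Two-sided Mann-Whitney rank sum test on the rebuilt ordinal data.
result = mannwhitneyu(psp, cbd, alternative="two-sided")

# 25th percentile, median and 75th percentile of each group.
psp_quartiles = np.percentile(psp, [25, 50, 75])
cbd_quartiles = np.percentile(cbd, [25, 50, 75])
print(result.pvalue, psp_quartiles, cbd_quartiles)
```

The same reconstruction applies to Thal amyloid phase, since its per‐phase counts are also listed in the table.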
Hierarchical cluster analysis
Hierarchical cluster analysis based on regional semi‐quantitative tau pathology scores in 1020 PSP and 199 CBD cases was performed. The results are shown as a heatmap in Figure 2. This heatmap indicates tau pathology scores from white (score = 0) to red (score = 3). Each row represents a lesion type in a given brain region, and each column represents an individual case. The first cluster contained only CBD cases (Cluster 1), while the second one contained all PSP cases and six CBD cases (Cluster 2). This indicates that PSP and CBD were clearly separated based on the severity and distribution of tau pathology. For further analysis, Cluster 2 was subdivided into four clusters (2–1, 2–2, 2–3 and 2–4).
Some striking differences in the distribution of tau pathology were observed between the clusters. Astrocytic tau lesions in the midbrain tectum and coiled bodies in the globus pallidus, subthalamic nucleus, red nucleus and midbrain tectum were less severe in Cluster 1 than Cluster 2. NFT and tau threads in the caudate nucleus were more severe in Cluster 1 than Cluster 2. Coiled bodies, tau threads, NFT and astrocytic tau lesions in the temporal cortex were more severe in Cluster 1 than Cluster 2, except Cluster 2–2. Astrocytic tau lesions in the globus pallidus, substantia nigra, subthalamic nucleus and red nucleus were much less severe in Cluster 1 than Cluster 2, except Cluster 2–4. Cluster 2–2 had a higher burden of tau pathology in the temporal cortex than other clusters (i.e., 2–1, 2–3 and 2–4). Cluster 2–4 had less severe astrocytic pathology in the globus pallidus, substantia nigra, subthalamic nucleus and red nucleus, compared with other clusters (i.e., 2–1, 2–2 and 2–3).
Table 2 compares demographic and clinicopathologic features among these sub‐clusters of Cluster 2. Cluster 2–2 differed from the other clusters in several respects: age at death (81 ± 7 years) was significantly older, the frequency of AD (53%) was significantly higher and the frequency of clinical diagnosis of RS (68%) was significantly lower than in other clusters. The breakdown of other clinical diagnoses is given in Table S1.
TABLE 2.
Variable | Cluster 2–1 | Cluster 2–2 | Cluster 2–3 | Cluster 2–4 | p value |
---|---|---|---|---|---|
Number of cases | 568 | 105 | 204 | 149 | |
Male, No. (%) | 306 (54%) | 49 (47%) | 102 (50%) | 76 (51%) | 0.666 |
Age at death, years* | 73 ± 7 | 81 ± 7 | 76 ± 7 | 77 ± 8 | <0.001 |
Pathologic diagnosis of PSP | 566 (99%) | 103 (98%) | 204 (100%) | 147 (99%) | |
Concurrent AD** | 21 (4%) | 56 (53%) | 9 (4%) | 11 (7%) | <0.001 |
Clinical diagnosis*** | | | | | <0.001 |
RS | 483 (85%) | 71 (68%) | 179 (88%) | 105 (70%) | |
CBS | 40 (7%) | 15 (14%) | 8 (4%) | 17 (11%) | |
Other | 45 (8%) | 19 (18%) | 17 (8%) | 27 (18%) |
Data are displayed as n (%), mean ± SD or median (25th, 75th percentile).
*Age is significantly different between all clusters except Clusters 2–3 and 2–4.
**The frequency of concurrent AD is significantly different between Cluster 2–2 and all other clusters.
***The frequency of clinical diagnosis is significantly different between Clusters 2–1 and 2–2; and Clusters 2–2 and 2–3. Pairwise comparison is done using Bonferroni correction.
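The pairwise comparison with Bonferroni correction can be sketched from the concurrent AD counts in Table 2. The choice of Fisher's exact test for each pair is an assumption (the text does not name the pairwise test), and the cluster labels are written as plain strings for convenience.

```python
from itertools import combinations
from scipy.stats import fisher_exact

# Concurrent AD in each sub-cluster, from Table 2: (cases with AD, total cases).
ad_counts = {
    "2-1": (21, 568),
    "2-2": (56, 105),
    "2-3": (9, 204),
    "2-4": (11, 149),
}

pairs = list(combinations(ad_counts, 2))  # six pairwise comparisons
adjusted = {}
for a, b in pairs:
    (ad_a, n_a), (ad_b, n_b) = ad_counts[a], ad_counts[b]
    # 2 x 2 table: AD present / absent in each of the two clusters.
    _, p = fisher_exact([[ad_a, n_a - ad_a], [ad_b, n_b - ad_b]])
    # Bonferroni: multiply by the number of comparisons, capped at 1.
    adjusted[(a, b)] = min(1.0, p * len(pairs))

for pair, p_adj in adjusted.items():
    print(pair, f"{p_adj:.3g}")
```

With these counts, only the comparisons involving Cluster 2–2 remain below 0.05 after correction, in line with the footnote on concurrent AD.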
Six autopsy‐confirmed CBD cases were included in Cluster 2: two cases in Cluster 2–1, two cases in Cluster 2–2, and two cases in Cluster 2–4. Clinicopathologic features are shown in Table S2. Although they were included in the PSP‐predominant cluster, all cases had astrocytic plaques, confirming the neuropathologic diagnosis of CBD (Figure S2). Cases in Cluster 2–1 and Cluster 2–4 were characterised by less severe tau pathologies in the temporal cortex, while cases in Cluster 2–2 were characterised by more severe tau pathologies in the midbrain tectum (Figure S2). Four of the six cases were clinically diagnosed with CBS, and the other two were diagnosed with PSP‐RS.
Decision tree classifiers
The cluster analysis showed that several tau pathology scores, such as the tau threads in the caudate nucleus, can help distinguish PSP and CBD. To determine the minimum combination of parameters that can distinguish the two diseases, we next built machine learning‐based decision tree classifiers.
When only one decision node was used to construct a tree (depth of the tree = 1), the score of tau threads in the caudate nucleus showed the highest accuracy. As shown in Figure 3A, of 914 cases in the training set (152 CBD and 762 PSP), 743 cases (7 CBD and 736 PSP) had a score <3 and 171 cases (145 CBD and 26 PSP) had a score of 3. This indicates that 736 PSP and 145 CBD cases were correctly categorised; thus, the accuracy on the training set was 96.3% (881/914). In cross‐validation, this decision tree correctly classified 298 out of 305 cases in the testing set (97.7% accuracy).
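The arithmetic behind the single‐node tree can be traced from the reported counts. The sketch below recomputes the training accuracy and the Gini impurity decrease achieved by the split; the helper name `gini` is ours and is used only for illustration.

```python
# Training-set counts reported for the single-node tree (Figure 3A).
left = {"CBD": 7, "PSP": 736}    # caudate tau-thread score < 3
right = {"CBD": 145, "PSP": 26}  # caudate tau-thread score = 3


def gini(counts):
    """Gini impurity of a node, given its class counts."""
    total = sum(counts.values())
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


n_total = sum(left.values()) + sum(right.values())
# Each leaf predicts its majority class, so the majority counts are the hits.
n_correct = max(left.values()) + max(right.values())
accuracy = n_correct / n_total

# Impurity before the split, and the size-weighted impurity after it.
root = {k: left[k] + right[k] for k in left}
weighted_child_gini = (
    sum(left.values()) * gini(left) + sum(right.values()) * gini(right)
) / n_total
gini_decrease = gini(root) - weighted_child_gini

print(n_correct, n_total, accuracy, gini_decrease)
```

CART selects, at each node, the variable and threshold that maximise this impurity decrease, which is why a single highly discriminating score can already classify most cases.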
To improve the diagnostic accuracy of decision tree classifiers, we increased the number of decision nodes by increasing the maximum depth of decision trees. Figure 3B shows a decision tree with a depth of 3, which contained 7 decision nodes. The root node used the score of tau threads in the caudate nucleus.
Most PSP cases had a score of <3, so they followed the “True” path. In the second layer, the decision node asked whether the coiled body score in the subthalamic nucleus was 0. The majority of PSP cases had coiled bodies in this region, so they followed the “False” path. In the third layer, the decision node asked whether the coiled body score in the red nucleus was 0. Almost all PSP cases had coiled bodies in this region, so they followed the “False” path and arrived at a leaf node classified as PSP (719 PSP and 1 CBD). This is the main pathway for most PSP cases. When the coiled body score in the subthalamic nucleus was 0 in the second layer, the node in the third layer asked about the score of tau threads in the globus pallidus. The diagnosis of PSP was made if the score was lower than 2, and the diagnosis of CBD was made if the score was 2 or 3.
The majority of CBD cases (and a few PSP cases) followed the “False” path at the root node because almost all CBD cases had a score of 3 for tau threads in the caudate nucleus. In the second layer, the decision node asked whether the score for astrocytic tau lesions in the red nucleus was 0, and the majority of CBD cases followed the “True” path. The next node then asked whether the coiled body score in the midbrain tectum was <3. Most CBD cases followed the “True” path and arrived at a leaf node classified as CBD. If the score of astrocytic tau lesions in the red nucleus was higher than 0 in the second layer, the next node asked about the NFT score in the caudate nucleus. The diagnosis of PSP was made if the score was lower than 3, and the diagnosis of CBD was made if the score was 3.
As shown in Figure 3B, 903 out of 914 cases in a training set were correctly categorised as PSP or CBD (98.8% accuracy). In cross‐validation, this decision tree correctly classified 302 out of 305 cases in a testing set (99.0% accuracy).
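The depth‐3 tree can be written as a short function that follows the nodes in the order described in the text. This is a sketch, not the fitted model: the dictionary keys are hypothetical names, and the two leaves whose diagnosis is not spelled out in the text return `None` rather than a guessed class.

```python
def classify(scores):
    """Follow the depth-3 tree as described in the text.

    `scores` maps hypothetical keys to 0-3 scores. Only the scores on the
    path actually taken are looked up.
    """
    if scores["caudate_threads"] < 3:        # root node, "True" path
        if scores["stn_cb"] == 0:            # coiled bodies, subthalamic nucleus
            # Tau threads in the globus pallidus decide the leaf.
            return "PSP" if scores["gp_threads"] < 2 else "CBD"
        if scores["rn_cb"] == 0:             # coiled bodies, red nucleus
            return None                      # leaf not stated in the text
        return "PSP"                         # main pathway for most PSP cases
    if scores["rn_astro"] == 0:              # astrocytic lesions, red nucleus
        if scores["mbt_cb"] < 3:             # coiled bodies, midbrain tectum
            return "CBD"
        return None                          # leaf not stated in the text
    # Astrocytic lesions present in the red nucleus: caudate NFT decides.
    return "PSP" if scores["caudate_nft"] < 3 else "CBD"


# Illustrative score sets (not taken from individual study cases).
print(classify({"caudate_threads": 1, "stn_cb": 3, "rn_cb": 3}))
print(classify({"caudate_threads": 3, "rn_astro": 0, "mbt_cb": 1}))
print(classify({"caudate_threads": 2, "stn_cb": 0, "gp_threads": 3}))
```

The second and third examples reach the same diagnosis through different branches, which shows how a one‐point difference at the root node need not change the final class.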
As the maximum depth increased, both training and testing scores increased. The decision tree achieved the best testing score of 0.997 when the maximum depth was 6 (data not shown). Table 3 summarises the training and testing scores of the decision tree classifier at each maximum depth.
TABLE 3.
Maximum depth | Training score | Testing score |
---|---|---|
1 | 0.963 | 0.977 |
2 | 0.978 | 0.987 |
3 | 0.988 | 0.990 |
4 | 0.997 | 0.993 |
5 | 0.998 | 0.993 |
6 | 1.000 | 0.997 |
7 | 1.000 | 0.997 |
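A table of this kind can be produced by refitting the classifier at each maximum depth and recording both scores. The sketch below uses synthetic data with a hypothetical noisy labelling rule, so its numbers are unrelated to those in Table 3.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)
X = rng.integers(0, 4, size=(600, 8))  # synthetic 0-3 scores, NOT study data
# Hypothetical noisy label, so that deeper trees still have structure to fit.
y = ((X[:, 0] + X[:, 1] + rng.integers(0, 2, size=600)) >= 4).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=1)

results = []
for depth in range(1, 8):
    clf = DecisionTreeClassifier(criterion="gini", max_depth=depth, random_state=1)
    clf.fit(X_tr, y_tr)
    # Record accuracy on the training and testing sets for this depth.
    results.append((depth, clf.score(X_tr, y_tr), clf.score(X_te, y_te)))

print("Maximum depth | Training score | Testing score")
for depth, train_score, test_score in results:
    print(f"{depth} | {train_score:.3f} | {test_score:.3f}")
```

Comparing the two columns across depths is a simple way to see where additional layers stop improving performance on unseen cases.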
Validation study
For further validation of the decision tree, three investigators blindly assessed the tau pathology scores and made a diagnosis in 10 cases using a flowchart created by the decision tree classifier (Figure 4). All the scores in six brain regions and final diagnoses are shown in Table 4. All three reviewers made a correct diagnosis in all cases, although the diagnostic path differed among reviewers in two cases. In Case 5, for instance, Reviewers 1 and 3 scored the caudate tau threads as 3, but Reviewer 2 gave a score of 2. After the root node, Reviewers 1 and 3 assessed astrocytic lesions in the red nucleus and coiled bodies in the midbrain tectum, while Reviewer 2 assessed coiled bodies in the subthalamic nucleus and tau threads in the globus pallidus. Nevertheless, all reviewers reached a final diagnosis of CBD. This result indicates that, despite some inter‐rater differences, the combination of multiple nodes preserved the final diagnosis. The same pattern was observed in Case 9.
TABLE 4.
Case ID | Reviewer | Caudate: Tau threads | STN: CB | RN: CB | RN: Astrocytes | MBT: CB | GP: Tau threads | Dx |
---|---|---|---|---|---|---|---|---|
1 | 1 | 3 | 0 | 1 | CBD | |||
2 | 3 | 0 | 0 | CBD | ||||
3 | 3 | 0 | 0 | CBD | ||||
2 | 1 | 3 | 0 | 1 | CBD | |||
2 | 3 | 0 | 1 | CBD | ||||
3 | 3 | 0 | 2 | CBD | ||||
3 | 1 | 1 | 3 | 3 | PSP | |||
2 | 1 | 3 | 2 | PSP | ||||
3 | 1 | 3 | 3 | PSP | ||||
4 | 1 | 2 | 2 | 3 | PSP | |||
2 | 1 | 3 | 2 | PSP | ||||
3 | 2 | 3 | 3 | PSP | ||||
5 | 1 | 3 | 0 | 1 | CBD | |||
2 | 2 | 0 | 3 | CBD | ||||
3 | 3 | 0 | 0 | CBD | ||||
6 | 1 | 1 | 3 | 3 | PSP | |||
2 | 0 | 3 | 3 | PSP | ||||
3 | 1 | 3 | 2 | PSP | ||||
7 | 1 | 1 | 3 | 3 | PSP | |||
2 | 1 | 3 | 3 | PSP | ||||
3 | 1 | 3 | 3 | PSP | ||||
8 | 1 | 1 | 1 | 1 | PSP | |||
2 | 1 | 1 | 1 | PSP | ||||
3 | 1 | 2 | 1 | PSP | ||||
9 | 1 | 2 | 0 | 3 | CBD | |||
2 | 3 | 1 | 0 | 1 | CBD | |||
3 | 3 | 0 | 1 | CBD | ||||
10 | 1 | 1 | 2 | 3 | PSP | |||
2 | 1 | 3 | 2 | PSP | ||||
3 | 2 | 2 | 1 | PSP |
Abbreviations: Caudate, caudate nucleus; CB, coiled bodies; Dx, pathologic diagnosis; GP, globus pallidus; MBT, midbrain tectum; RN, red nucleus; STN, subthalamic nucleus.
Decision trees without the subthalamic nucleus or caudate nucleus
Because some of the brain regions used in this study might not be routinely sampled in other research laboratories, we also built decision tree classifiers without the subthalamic nucleus or caudate nucleus. Without the subthalamic nucleus, the scores of NFT in the caudate nucleus were used instead of coiled bodies in the subthalamic nucleus (Figure S3A). The testing score of this decision tree with a depth of 3 was 0.987, slightly lower than that of the decision tree that used the subthalamic nucleus. When the caudate nucleus was excluded from the decision tree, the root node used the scores of coiled bodies in the midbrain tectum (Figure S3B). The testing score of this decision tree with a depth of 3 was 0.964, lower than that of the previous two decision trees. Training and testing scores of these two decision trees at each maximum depth are given in Table S3.
DISCUSSION
The first aim of the present study was to investigate whether the distribution and severity of tau pathology are sufficient to distinguish PSP and CBD, regardless of the morphology of astrocytic tau lesions. We used a score of astrocytic tau lesions for both tufted astrocytes in PSP and astrocytic plaques in CBD; therefore, the morphology of astrocytic tau lesions was not taken into account for the clustering. Nevertheless, the two diseases were almost entirely separated into two clusters, indicating that, in addition to morphology, the distribution patterns of tau pathology are distinct between the two diseases. This result also indicates that information about neuronal loss and gliosis was not necessary to distinguish the two diseases. Building on this finding, the second aim of the present study was to develop decision tree classifiers to differentiate PSP and CBD using the tau pathology scores in select brain regions. As we expected from the heatmap finding, several tau pathology scores were sufficient to distinguish these diseases.
Although the heatmap visualised the different distribution of tau pathology between the two diseases and we could observe some differential features, the decision tree was helpful to identify the parameters that contributed to classifying the two diseases.[22] Surprisingly, when only one variable was selected (tau threads in the caudate nucleus), the testing score was 0.977. This indicates that the score of tau threads in the caudate nucleus is the strongest predictor for diagnosing CBD. The diagnostic criteria of CBD state that “the neuropathologic diagnosis of CBD is based upon detection of tau‐positive neuronal and glial lesions, in particular astrocytic plaques and thread‐like processes in grey and white matter, in a characteristic distribution”.[5] The results of our study suggest that the high burden of tau threads in the caudate nucleus has the highest value in the differential diagnosis of CBD and PSP, followed by the paucity of astrocytic tau lesions in the red nucleus in CBD.
Table 3 shows that the decision tree achieved the best testing score when the maximum depth was 6; however, we did not show this decision tree as the main result of this study because it is too complicated for practical use. This decision tree consisted of 6 layers and 16 decision nodes, using the tau pathology scores from the caudate nucleus, subthalamic nucleus, red nucleus, globus pallidus, midbrain tectum, substantia nigra, temporal cortex and motor cortex. Although the testing score was slightly lower, we propose that a decision tree with a depth of 3 is helpful for decision‐making for pathologic diagnosis of PSP and CBD.
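The complexity of a fitted tree (its depth and number of decision nodes) can be read directly from a scikit‐learn classifier, and the tree can be printed as text rules that resemble a flowchart. The sketch below uses synthetic data, a hypothetical labelling rule and placeholder feature names.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(2)
X = rng.integers(0, 4, size=(500, 4))  # synthetic 0-3 scores, NOT study data
# Hypothetical rule: class 1 when feature 0 is 3 and feature 1 is below 2.
y = ((X[:, 0] == 3) & (X[:, 1] < 2)).astype(int)

clf = DecisionTreeClassifier(criterion="gini", max_depth=3, random_state=2)
clf.fit(X, y)

n_leaves = clf.get_n_leaves()
# Every node that is not a leaf is a decision node.
n_decision_nodes = clf.tree_.node_count - n_leaves
depth = clf.get_depth()

feature_names = ["feature_0", "feature_1", "feature_2", "feature_3"]  # placeholders
rules = export_text(clf, feature_names=feature_names)
print(depth, n_decision_nodes, n_leaves)
print(rules)
```

Counting decision nodes in this way gives a quick measure of how many scores a reader would need to assess when following the tree by hand.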
To determine whether this decision tree is useful in decision‐making, we performed a validation study using ten additional cases. Three investigators blindly assessed tau‐immunostained slides in select brain regions without any clinical or other pathological information. Morphology of astrocytic tau lesions was not considered; only the semi‐quantitative tau scores were used to make a diagnosis. Although there was some discrepancy in scores, all cases were correctly diagnosed as either PSP or CBD by all three investigators who have varying experience in diagnostic neuropathology. Although the size of the validation set is relatively small, this result indicates that the decision tree classifier in the present study has sufficient accuracy for making a diagnosis of PSP and CBD.
In addition to the main decision tree, as shown in Figure 4, we also created two decision trees that do not include the subthalamic nucleus or caudate nucleus because these brain regions might not be routinely sampled in some brain banks. As we expected, the decision tree without the subthalamic nucleus showed a similar testing score because the most important region, the caudate nucleus, was included. Unexpectedly, the other decision tree, which lacked the caudate nucleus, also achieved relatively high scores when the maximum depth was two or higher. This finding suggests that even without the caudate nucleus, a combination of two or more tau pathology scores, such as the coiled bodies in the midbrain tectum, tau threads in the substantia nigra, and coiled bodies in the subthalamic nucleus, can achieve high accuracy in the diagnosis of PSP and CBD.
Clinicopathological correlations of each disease are not the main focus of this study, but the heatmap implies that the distribution and severity of tau pathology are not sufficient to predict a clinical diagnosis of PSP or CBS. Although the pathologic diagnoses of PSP and CBD were almost completely divided into different clusters, clinical diagnoses (i.e., RS, CBS and others) were not clearly separated by our clustering analysis. Clinicopathological correlations in PSP and CBD need to be addressed in future studies using more detailed clinical information, ideally from a longitudinal prospective cohort.[28]
Interestingly, when Cluster 2 was divided into four clusters, Cluster 2–2 had some distinct features. As shown in the heatmap, this cluster is characterised by the high burden of tau pathology in the temporal cortex, which can be explained, in part, by the high frequency of concurrent AD pathology.[29] Patients in this cluster were older and were given a clinical diagnosis of RS less frequently than other clusters. These findings indicate that concurrent AD can be seen in elderly PSP patients, and this co‐pathology may modify clinical presentations, leading to a clinical diagnosis of CBS or dementia. Although it is less obvious, Cluster 2–4 is also an interesting group. Less severe astrocytic pathology in the globus pallidus, substantia nigra, subthalamic nucleus and red nucleus is similar to the pattern in CBD (Cluster 1). The frequency of clinical diagnosis of CBS was nominally higher in this cluster compared to other “typical” PSP clusters (Cluster 2–1 and 2–3). These findings suggest that the distribution pattern of tau pathology may characterise “subtypes” of PSP and reflect different clinical presentations.[9, 30, 31] Mimuro and Yoshida described three pathological subtypes of PSP: typical PSP, pallido‐nigro‐luysian type and CBD‐like type based on the severity of neurodegeneration and the amount of tau deposition.[9, 11] Our heatmap did not reproduce their classification, probably because it did not include the severity of neurodegeneration. To further discuss the correlations between clinical presentations and underlying pathology in each disease, not only tau pathology but also neuronal loss and gliosis should be taken into account. In addition, other co‐pathologies, such as argyrophilic grain disease, Lewy body disease and TDP‐43 pathology, may also be included in future studies.[32, 33, 34]
A major limitation of this study is the external validity of the decision trees. The decision trees were made using the tau pathology scores assigned by a single neuropathologist. When other neuropathologists use our decision tree classifiers, inter‐rater discrepancy of tau pathology scores might be inevitable because the scores are assigned semi‐quantitatively and not defined by objective means. Ideally, tau scores should be made by objective methods, including digital imaging analysis and machine learning‐based object detection.[17, 19] As the validation study showed, however, the decision tree seems robust against some discrepancy of the scores due to the combination of multiple decision nodes. Another potential limitation is that tau pathology scores were made based on immunohistochemistry with CP13 antibody, which is less widely used than AT8. We believe, however, that the difference between CP13 and AT8 does not affect the result because both antibodies show similar staining patterns in PSP and CBD.[35] In addition, we also compared CP13‐ and AT8‐stained slides using the sections of caudate nucleus in PSP and CBD, which were indistinguishable (Figure S1). Therefore, our decision tree classifiers should also be applicable to AT8‐immunostained slides. Finally, our decision tree classifiers can be used only for the dichotomy of PSP and CBD. In a practical setting, not only PSP and CBD, but other 4‐repeat tauopathies, such as globular glial tauopathy and tauopathies due to MAPT mutations, might be raised in the differential diagnosis.[36, 37] Patients with MAPT mutations have an unusual distribution pattern of tau pathology; thus, they may fit neither PSP nor CBD.[38] Future studies need to include these diseases and concurrent pathologies to develop more practical and versatile machine learning‐based diagnostic algorithms.
Conversely, a strength of this study is the large sample size of both PSP and CBD cases. The Mayo Clinic brain bank in Florida has a focus on a wide range of neurodegenerative disorders, as well as normal and pathological aging. PSP and CBD cases include not only advanced disease, but also preclinical and early‐stage disease. The decision tree classifiers were made using cases with a wide range of severity. Thus, we think our decision tree classifiers can be applied more widely by investigators using their own case material.
In conclusion, the present study indicates that the semi‐quantitative scores of tau pathology in select brain regions are sufficient to distinguish PSP and CBD, regardless of the morphology of astrocytic tau lesions. The burden of tau threads in the caudate nucleus is the most decisive predictive factor for the diagnosis of CBD. The diagnostic flowchart created by a machine learning‐based decision tree classifier can be readily used by neuropathologists and may assist decision‐making in the neuropathologic diagnosis of PSP and CBD.
AUTHOR CONTRIBUTIONS
Shunsuke Koga contributed to study concept and design; acquisition, analysis and interpretation of data; drafting of manuscript; execution of the statistical analysis and machine learning; writing of the first draft. Xiaolai Zhou contributed to acquisition, analysis and interpretation of data; review and critique. Dennis W. Dickson contributed to study concept and design; acquisition and interpretation of data; review and critique.
ETHICAL STATEMENT
Brain autopsies were obtained after consent of the legal next‐of‐kin or individuals with legal authority to grant autopsy permission. De‐identified studies of autopsy samples are considered exempt from human subject research by the Mayo Clinic Institutional Review Board.
ACKNOWLEDGEMENTS
The authors thank the patients and their families who donated brains to help further the scientific understanding of neurodegeneration. The authors also thank Virginia Phillips, Jo A. Landino Garcia, and Ariston L. Librero (Mayo Clinic, Jacksonville) for histologic support and Monica Castanedes‐Casey (Mayo Clinic, Jacksonville) for immunohistochemistry support. This work is supported by a Karin & Sten Mortstedt CBD Solutions Research Grant, CurePSP, the Rainwater Charitable Trust, and the Jaye F. and Betty F. Dyer Foundation Fellowship in progressive supranuclear palsy research, as well as the NINDS Tau Center without Walls (U54‐NS100693).
Koga S, Zhou X, Dickson DW. Machine learning‐based decision tree classifier for the diagnosis of progressive supranuclear palsy and corticobasal degeneration. Neuropathol Appl Neurobiol. 2021;47(7):931-941. 10.1111/nan.12710
DATA AVAILABILITY STATEMENT
The data that support the findings of this study are available from the corresponding author upon reasonable request.
REFERENCES
- 1. Kouri N, Whitwell JL, Josephs KA, Rademakers R, Dickson DW. Corticobasal degeneration: a pathologically distinct 4R tauopathy. Nat Rev Neurol. 2011;7:263‐272.
- 2. Hoglinger GU, Respondek G, Stamelou M, et al. Clinical diagnosis of progressive supranuclear palsy: The movement disorder society criteria. Mov Disord. 2017;32:853‐864.
- 3. Steele JC, Richardson JC, Olszewski J. Progressive supranuclear palsy. A heterogeneous degeneration involving the brain stem, basal ganglia and cerebellum with vertical gaze and pseudobulbar palsy, nuchal dystonia and dementia. Arch Neurol. 1964;10:333‐359.
- 4. Armstrong MJ, Litvan I, Lang AE, et al. Criteria for the diagnosis of corticobasal degeneration. Neurology. 2013;80:496‐503.
- 5. Dickson DW, Bergeron C, Chin SS, et al. Office of rare diseases neuropathologic criteria for corticobasal degeneration. J Neuropathol Exp Neurol. 2002;61:935‐946.
- 6. Hauw JJ, Daniel SE, Dickson D, et al. Preliminary NINDS neuropathologic criteria for Steele‐Richardson‐Olszewski syndrome (progressive supranuclear palsy). Neurology. 1994;44:2015‐2019.
- 7. Litvan I, Hauw JJ, Bartko JJ, et al. Validity and reliability of the preliminary NINDS neuropathologic criteria for progressive supranuclear palsy and related disorders. J Neuropathol Exp Neurol. 1996;55:97‐105.
- 8. Dickson DW. Neuropathologic differentiation of progressive supranuclear palsy and corticobasal degeneration. J Neurol. 1999;246(Suppl 2):II6‐II15.
- 9. Yoshida M. Astrocytic inclusions in progressive supranuclear palsy and corticobasal degeneration. Neuropathology. 2014;34:555‐570.
- 10. Komori T, Arai N, Oda M, et al. Astrocytic plaques and tufts of abnormal fibers do not coexist in corticobasal degeneration and progressive supranuclear palsy. Acta Neuropathol. 1998;96:401‐408.
- 11. Mimuro M, Yoshida M. Chameleons and mimics: Progressive supranuclear palsy and corticobasal degeneration. Neuropathology. 2020;40:57‐67.
- 12. Saijo E, Metrick MA 2nd, Koga S, et al. 4‐Repeat tau seeds and templating subtypes as brain and CSF biomarkers of frontotemporal lobar degeneration. Acta Neuropathol. 2020;139:63‐77.
- 13. Esteva A, Robicquet A, Ramsundar B, et al. A guide to deep learning in healthcare. Nat Med. 2019;25:24‐29.
- 14. Topol EJ. High‐performance medicine: the convergence of human and artificial intelligence. Nat Med. 2019;25:44‐56.
- 15. Gulshan V, Peng L, Coram M, et al. Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. JAMA. 2016;316:2402‐2410.
- 16. Korbar B, Olofson AM, Miraflor AP, et al. Deep learning for classification of colorectal polyps on whole‐slide images. J Pathol Inform. 2017;8:30.
- 17. Signaevsky M, Prastawa M, Farrell K, et al. Artificial intelligence in neuropathology: deep learning‐based assessment of tauopathy. Lab Invest. 2019;99:1019‐1029.
- 18. Coudray N, Ocampo PS, Sakellaropoulos T, et al. Classification and mutation prediction from non‐small cell lung cancer histopathology images using deep learning. Nat Med. 2018;24:1559‐1567.
- 19. Tang Z, Chuang KV, DeCarli C, et al. Interpretable classification of Alzheimer's disease pathologies with a convolutional neural network pipeline. Nat Commun. 2019;10:2173.
- 20. Vizcarra JC, Gearing M, Keiser MJ, Glass JD, Dugger BN, Gutman DA. Validation of machine learning models to detect amyloid pathologies across institutions. Acta Neuropathol Commun. 2020;8:59.
- 21. Vial A, Stirling D, Field M, et al. The role of deep learning and radiomic feature extraction in cancer‐specific predictive modelling: a review. Transl Cancer Res. 2018;7:803‐816.
- 22. Hayashi Y. The right direction needed to develop white‐box deep learning in radiology, pathology, and ophthalmology: a short review. Front Robot AI. 2019;6:1‐8. 10.3389/frobt.2019.00024
- 23. Babapour Mofrad R, Schoonenboom NSM, Tijms BM, et al. Decision tree supports the interpretation of CSF biomarkers in Alzheimer's disease. Alzheimers Dement (Amst). 2019;11:1‐9.
- 24. Braak H, Braak E. Neuropathological stageing of Alzheimer‐related changes. Acta Neuropathol. 1991;82:239‐259.
- 25. Thal DR, Rub U, Orantes M, Braak H. Phases of A beta‐deposition in the human brain and its relevance for the development of AD. Neurology. 2002;58:1791‐1800.
- 26. Hyman BT, Trojanowski JQ. Consensus recommendations for the postmortem diagnosis of Alzheimer disease from the National Institute on Aging and the Reagan Institute Working Group on diagnostic criteria for the neuropathological assessment of Alzheimer disease. J Neuropathol Exp Neurol. 1997;56:1095‐1097.
- 27. Pedregosa F, Varoquaux G, Gramfort A, et al. Scikit‐learn: Machine learning in Python. J Mach Learn Res. 2011;12:2825‐2830.
- 28. Ghirelli A, Tosakulwong N, Weigand SD, et al. Sensitivity‐specificity of tau and amyloid beta positron emission tomography in frontotemporal lobar degeneration. Ann Neurol. 2020;88:1009‐1022.
- 29. Vasilevskaya A, Taghdiri F, Multani N, et al. PET tau imaging and motor impairments differ between corticobasal syndrome and progressive supranuclear palsy with and without Alzheimer's disease biomarkers. Front Neurol. 2020;11:574.
- 30. Williams DR, de Silva R, Paviour DC, et al. Characteristics of two distinct clinical phenotypes in pathologically proven progressive supranuclear palsy: Richardson's syndrome and PSP‐parkinsonism. Brain. 2005;128:1247‐1258.
- 31. Kovacs GG, Lukic MJ, Irwin DJ, et al. Distribution patterns of tau pathology in progressive supranuclear palsy. Acta Neuropathol. 2020;140:99‐119.
- 32. Jecmenica Lukic M, Kurz C, Respondek G, et al. Copathology in progressive supranuclear palsy: Does it matter? Mov Disord. 2020;35(6):984‐993. 10.1002/mds.28011
- 33. Koga S, Sanchez‐Contreras M, Josephs KA, et al. Distribution and characteristics of transactive response DNA binding protein 43 kDa pathology in progressive supranuclear palsy. Mov Disord. 2017;32:246‐255.
- 34. Koga S, Kouri N, Walton RL, et al. Corticobasal degeneration with TDP‐43 pathology presenting with progressive supranuclear palsy syndrome: a distinct clinicopathologic subtype. Acta Neuropathol. 2018;136:389‐404.
- 35. Xia Y, Prokop S, Gorion KM, et al. Tau Ser208 phosphorylation promotes aggregation and reveals neuropathologic diversity in Alzheimer's disease and other tauopathies. Acta Neuropathol Commun. 2020;8:88.
- 36. Ahmed Z, Bigio EH, Budka H, et al. Globular glial tauopathies (GGT): consensus recommendations. Acta Neuropathol. 2013;126:537‐544.
- 37. Forrest SL, Kril JJ, Stevens CH, et al. Retiring the term FTDP‐17 as MAPT mutations are genetic forms of sporadic frontotemporal tauopathies. Brain. 2018;141:521‐534.
- 38. Tacik P, DeTure M, Hinkle KM, et al. A novel tau mutation in exon 12, p.Q336H, causes hereditary Pick disease. J Neuropathol Exp Neurol. 2015;74:1042‐1052.