Author manuscript; available in PMC: 2026 Apr 14.
Published in final edited form as: Cell. 2026 Jan 29;189(3):956–968.e13. doi: 10.1016/j.cell.2025.12.036

Molecular features of human pathological tau distinguish tauopathy-associated dementias

Mukesh Kumar 1,13, Christoph N Schlaffner 1,2,13, Shaojun Tang 1,3,13, Maaike A Beuvink 1,13, Arthur Viode 4, Waltraud Mair 1, Meenakshi Jha 4, Ceren Uncu 1, Hendrik Wesseling 1, Tian Wang 1, Derek H Oakley 5, Pieter Beerepoot 1, Jie Xue 1, Theresa R Connors 5, David A Davis 6, Matthew P Frosch 5, Melissa E Murray 7, Salvatore E Spina 8, Lea T Grinberg 7, William W Seeley 8, Bruce L Miller 8, Adam L Boxer 9, Daniel H Geschwind 10, Kenneth S Kosik 11, Dennis W Dickson 7, Bernhard Y Renard 2, Michael DeTure 7, Ann C McKee 12, Bradley T Hyman 5, Hanno Steen 1,4,14, Judith A Steen 1,14,15,*
PMCID: PMC13075643  NIHMSID: NIHMS2150432  PMID: 41616780

SUMMARY

In Alzheimer’s disease (AD), pathological tau protein shows a progressive accumulation of post-translational modifications (PTMs), reflecting disease severity, progression, and prion-like activity. Although many neurodegenerative diseases with dementia display tau aggregates, the pathological proteoforms of tau protein from each disease type remain unknown. Here, using a quantitative mass spectrometry-based proteomics platform, FLEXITau, deep characterization of pathological tau protein isolated from the brains of 203 human subjects with AD, familial AD (fAD), chronic traumatic encephalopathy (CTE), corticobasal degeneration (CBD), Pick’s disease (PiD), progressive supranuclear palsy (PSP), dementia with Lewy bodies (DLB)—a non-tauopathy symptomatic control—and healthy controls (CTR) is performed. Unsupervised data analyses and supervised machine learning identify distinct molecular features of pathological tau for each disease, enabling molecular disease stratification. This study identifies potential disease-specific biomarkers and therapeutic targets for tauopathies and provides critical quantitative information for pharmacokinetic modeling required for therapeutic and disease mechanism studies.

In brief

The FLEXITau platform provides detailed, quantitative, peptide-resolved molecular maps of tau for six tauopathies, as well as symptomatic and asymptomatic controls. The data identify disease-specific molecular signatures using machine learning, enabling accurate classification of tauopathies and providing critical targets for diagnostic and therapeutic development.

Graphical Abstract

INTRODUCTION

This study was conducted to address knowledge gaps in tauopathies, where a lack of human molecular information hinders patient stratification and therapeutic development. Pathological tau protein aggregation is a shared feature in over 20 neurodegenerative diseases,1–4 collectively referred to as tauopathies, including Alzheimer’s disease (AD),5–7 familial AD (fAD),8 corticobasal degeneration (CBD),9 progressive supranuclear palsy (PSP),10 Pick’s disease (PiD),11 and chronic traumatic encephalopathy (CTE).12,13 The shared clinical phenotypes, pathological heterogeneity, and co-morbidities14–19 of these diseases, and other non-tauopathies such as dementia with Lewy bodies (DLB), impede early clinical differentiation.20–23 Current clinical or neuroimaging biomarkers cannot accurately or robustly detect and distinguish the different types of tauopathies.2 The development of positron emission tomography (PET) imaging reagents and other diagnostic modalities is hindered by a lack of information regarding the specific molecular features and absolute concentrations of pathological tau in different diseases. Cryo-electron microscopy (cryo-EM) studies have revealed distinct structures of tau filaments isolated from different tauopathies,24–28 providing valuable information for identifying binding sites for small molecules and other reagents. However, the lack of biomarkers for most tauopathies obstructs drug discovery and robust therapeutic trials. These studies rely on tau’s primary amino acid sequences and do not account for protein variants/proteoforms.29,30 A large body of research indicates that different patterns of tau post-translational modifications (PTMs)31–34 and cleavages contribute to heterogeneity of tau structures,35–37 disease progression,33,38 and clinical presentation39 in AD. This leads to the hypothesis that molecular features of pathogenic tau define the stages and the types of diseases. Consequently, a deep and quantitative characterization of tau protein aggregates propagating in various human tauopathies is required to understand the mechanisms of tau aggregation and determine whether and what differences exist.

Here, we employed quantitative mass spectrometry (MS)-based proteomics for the detailed mapping of the tau proteoform landscape across tauopathies, using insoluble pathological tau aggregates from multiple human tauopathies, including AD, fAD, CTE, CBD, PSP, and PiD. This analysis identified distinct isoforms, PTM patterns, and proteolytic cleavage sites of tau in different diseases, having implications for understanding tau proteoforms in the different tauopathies. Furthermore, unsupervised clustering and machine learning approaches identified disease-specific combinations of pathological tau features, enabling the accurate classification of different disease types. The molecular characteristics of pathological tau may enable the diagnosis, prognosis, and development of directed therapies toward multiple tauopathies.

RESULTS AND DISCUSSION

Study design

Brain tissues (cortical regions with intermediate pathological tau burden) from a cohort of 203 human subjects, classified pathologically as either tauopathy dementia (n = 165) or controls (n = 38), were obtained from six NIH brain banks across the USA. Disease groups with symptomatic dementia and tauopathy included AD (n = 42), fAD (n = 4), CTE (n = 24), CBD (n = 34), PiD (n = 20), and PSP (n = 41). The control groups included symptomatic non-tauopathy disease controls with synuclein pathology, i.e., DLB (n = 14), and non-demented healthy controls (CTR, n = 24) (Figure 1A). A secondary cohort of 142 human subjects, composed of AD (n = 28), CBD (n = 26), PiD (n = 28), PSP (n = 30), and CTR (n = 30), was included as a validation cohort. Detailed patient demographics for both cohorts are provided in Table S1.

Figure 1. Workflow and quantification of pathological insoluble tau and isoforms across tauopathies and control human subjects.

(A) Schematic of the workflow used for the preparation and analysis of insoluble pathological tau in human subjects, including asymptomatic controls (CTR), symptomatic controls (DLB), and tauopathies (PSP, PiD, CBD, CTE, fAD, and AD). Pathological tau from post-mortem brain tissues of the subjects listed were analyzed by MS and subjected to PTM mapping and FLEXITau quantitative analyses. Unsupervised methods and machine learning were used to classify and derive key molecular features from the data.

(B–D) Boxplots show the first (Q1) and third (Q3) quartiles and the median line; whiskers extend to the minimum and maximum values within 1.5 × (Q3 − Q1) of the quartiles. Outliers are defined as values beyond this range. P values are from a two-sided t test with FDR-based multiple testing correction and an alpha level of 0.05.

(B) Absolute insoluble pathological tau amounts were calculated using FLEXITau.

(C) 3R/4R ratios were calculated from data-dependent acquisition (DDA) data for each group of human subjects.

(D) Absolute amounts of 2N, 1N, and 0N isoforms were calculated using FLEXITau.

(E) A cumulative summary of all modifications of insoluble pathological tau identified in this study.

See also Figure S1.
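The legend states that P values were corrected for multiple testing using an FDR-based method at an alpha level of 0.05, without naming the procedure. As a minimal sketch, assuming the Benjamini-Hochberg procedure (an assumption, not confirmed by the text), the adjustment could look as follows. The P values are hypothetical and serve only to illustrate the arithmetic.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the original order.

    Assumption: the study's "FDR-based" correction is BH; the source
    does not name the procedure.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value to the smallest so that the adjusted
    # values stay monotone in the rank.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p values from pairwise group comparisons.
raw = [0.001, 0.008, 0.039, 0.041, 0.30]
adj = benjamini_hochberg(raw)
significant = [q <= 0.05 for q in adj]
```

In this toy example, two raw P values below 0.05 (0.039 and 0.041) no longer pass the threshold after adjustment, which is the behavior the correction is meant to provide.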

MS-based discovery proteomics analysis of pathological sarkosyl-insoluble tau was carried out to identify PTMs, endogenous proteolytic cleavages, and proteoforms. In addition, the targeted MS analysis method, FLEXITau,40,41 was used to quantify absolute amounts of pathological tau, isoform abundance, and peptide-resolved PTM extent. The discovery data were curated to provide discrete (presence/absence) PTM and endogenous proteolytic cleavage information, while the targeted MS data produced continuous quantitative PTM extent information. These three datasets (PTM, proteolytic cleavage, and FLEXITau) were each divided into independent training and testing sets and utilized to build individual classifiers.

Quantification of insoluble pathological tau and isoforms

FLEXITau40,41 measured the absolute amount of pathological insoluble tau in each subject across disease groups (Figure 1B). The total pathological tau amounts, presented as median values in fmol per mg of wet weight (fmol/mg) of brain tissue, showed the following trend (high to low): fAD (∼3,450) > AD (∼1,220) > CTE (∼190) > CBD (∼150) > PSP (∼30) > PiD (∼30) > DLB (∼20) > CTR (∼15). The low median tau abundance for DLB is consistent with its use as a symptomatic control without tau pathology. However, two DLB subjects had significantly higher amounts of tau than control subjects, at 1,202 and 925 fmol/mg (Figure 1B), comparable to the median of the AD samples. Revisiting the pathological reports of these subjects confirmed a secondary AD co-pathology.42

Six isoforms of human tau result from the alternative splicing of exons 2, 3, and 10. Exons 2 and 3 code for 1N and 2N, respectively, and the exon 10 inclusion codes for the 4R isoform. The 0N isoform excludes exons 2 and 3, and the 3R isoform excludes exon 10.43 The lowest median ratio of 3R/4R tau was observed for 4R tauopathies,44 including CBD (3R/4R ∼0.1) and PSP (3R/4R ∼0.3) (Figure 1C). This confirms that the dominant isoform observed for CBD and PSP is 4R. Similarly, in fAD (∼0.7), the 4R isoform is dominant, and AD (∼0.9) displays a ratio approaching 1. On the other hand, the highest mean ratio of 3R/4R was observed in PiD, a 3R tauopathy,45 at ∼1.6. In CTR, DLB, and CTE, the 3R/4R ratios were all ∼1.1. A summary of the underlying values and results from statistical tests from Figures 1B and 1C is provided in Table S2.

Absolute quantification of 0N, 1N, and 2N isoforms shows that all tauopathies exhibit a higher amount of the 0N tau isoform compared with 1N and 2N isoforms in the insoluble fractions, which results from the splicing out of exons 2 and 3 (Figure 1D). A summary of the underlying values and results from the statistical tests presented in Figure 1D is provided in Table S2. Figure S1A presents an overview of the quantification of 0N, 1N, and 2N isoforms using data-dependent acquisition (DDA) for the soluble fraction. The corresponding statistical results and underlying values are provided in Table S2.

A deep, unbiased MS-based discovery proteomics methodology was used to identify PTMs on insoluble pathological tau. In total, 145 PTM sites were mapped, including 66 phosphorylation, 27 ubiquitination, 28 acetylation, 13 citrullination, and 11 methylation sites. The cumulative map of identified PTMs across all subjects in the insoluble fraction is shown in Figure 1E. The corresponding map for the soluble fraction is presented in Figure S1B.
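The 3R/4R ratios reported for Figure 1C can be easier to compare when expressed as the share of 4R tau. As a small worked example, if r is the 3R/4R ratio and only 3R and 4R isoforms are considered, the 4R fraction is 1 / (1 + r). The ratios below are the approximate values quoted in the text; the function name is illustrative.

```python
# Approximate 3R/4R ratios quoted in the text (Figure 1C).
ratio_3r_to_4r = {
    "CBD": 0.1,
    "PSP": 0.3,
    "fAD": 0.7,
    "AD": 0.9,
    "CTR": 1.1,
    "DLB": 1.1,
    "CTE": 1.1,
    "PiD": 1.6,
}


def four_repeat_fraction(ratio):
    """Convert a 3R/4R ratio into the fraction of tau that is 4R.

    With r = 3R/4R and 3R + 4R = total, the 4R share is 1 / (1 + r).
    """
    return 1.0 / (1.0 + ratio)


percent_4r = {
    group: round(100.0 * four_repeat_fraction(r), 1)
    for group, r in ratio_3r_to_4r.items()
}

for group, pct in percent_4r.items():
    print(f"{group}: about {pct}% 4R")
```

Under this conversion, a ratio of ∼0.1 (CBD) corresponds to roughly 91% 4R tau, whereas ∼1.6 (PiD) corresponds to roughly 38% 4R tau, in line with their classification as 4R and 3R tauopathies.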

Patient frequency of tau PTMs and endogenous proteolytic cleavage

Patient frequency analyses are essential for identifying tauopathy disease-specific PTMs, PTMs common to all tauopathies, and nonspecific PTMs identified in controls. PTM and cleavage identification are dependent on the detectability of the peptide and the abundance of the peptide carrying the modification. This frequency measurement is binary and does not reflect the stoichiometry of that PTM or cleavage site. A PTM/cleavage site must be identified at least once in every patient of a disease group to have a 100% patient frequency. Adjacent modifications may be found on one or both possible sites. Figure 2A shows the patient frequencies of PTMs, and Figure 2B shows the patient frequencies of proteolytic cleavages. A PTM required a patient frequency of at least 10% in a single disease group to be included in the chart as a molecular feature. The MS data can localize a cleavage site to a single amino acid. However, we observed “cleavage hotspots,” i.e., peptide cleavage products with multiple cleavage sites within a sequence. Cleavage is denoted by annotating the amino acid following the cleavage site. Detailed information on the observed PTMs and cleavage sites is provided in Table S3.
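The patient frequency described above is a simple per-group percentage over binary detections. The following sketch illustrates the calculation and the 10% inclusion threshold, reading the threshold as "at least 10% in any one disease group" (an interpretation of the text). The subjects, detections, and the site named `hypothetical_site` are invented for illustration.

```python
from collections import defaultdict


def patient_frequency(detections, groups):
    """Percent of subjects per group in which each site was detected.

    detections: {subject_id: set of site labels detected at least once}
    groups:     {subject_id: disease group label}
    """
    group_sizes = defaultdict(int)
    hits = defaultdict(lambda: defaultdict(int))
    for subject, group in groups.items():
        group_sizes[group] += 1
        for site in detections.get(subject, ()):
            hits[site][group] += 1
    return {
        site: {g: 100.0 * by_group.get(g, 0) / n for g, n in group_sizes.items()}
        for site, by_group in hits.items()
    }


def charted_sites(freq, min_percent=10.0):
    """Sites reaching the minimum frequency in at least one group."""
    return sorted(
        site
        for site, by_group in freq.items()
        if any(p >= min_percent for p in by_group.values())
    )


# Toy cohort: 2 AD subjects and 12 controls (all values hypothetical).
groups = {"AD_1": "AD", "AD_2": "AD"}
groups.update({f"CTR_{i}": "CTR" for i in range(1, 13)})
detections = {subject: {"pS202"} for subject in groups}
detections["AD_1"].add("citR406")
detections["CTR_1"].add("hypothetical_site")

freq = patient_frequency(detections, groups)
charted = charted_sites(freq)
```

Here `hypothetical_site` is seen in 1 of 12 controls (about 8.3%) and in no AD subject, so it falls below the threshold and is not charted, while a site seen in 1 of 2 AD subjects (50%) is retained.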

Figure 2. The subject frequency of insoluble tau PTMs and proteolytic cleavages across human subjects highlights common and specific features between disease groups.

(A) The frequency of PTM occurrence in each group, including CTR, DLB, PSP, PiD, CBD, CTE, fAD, and AD, is identified as a percentage of subjects with the specified PTM.

(B) The frequency of proteolytic cleavages across human subjects is identified as a percentage of subjects with one or more cleavages at a site or a “hotspot” of multiple sequential cleavages. Start and end sites are highlighted for each region. The color legends below denote the percentages of subjects with the PTM and proteolytic cleavage identified. Specific modification identities are color coded and provided with the 2N4R amino acid number. High PTM or cleavage frequency across subjects does not indicate high per-molecule stoichiometry. PTMs at the same site could reflect different patients rather than co-occurring modifications.

The only high-frequency PTM that appears across all control and disease groups is phosphorylation at serine 202 (pS202). All other phosphorylation sites, including threonine 181 (pT181), pT231, pS262, pT403, and pS404, show varying frequencies between 10% and 100%, with CTE, fAD, and AD having frequencies >80% for these phosphorylation sites. Unique PTMs exhibiting high frequencies in one or two disease groups were also observed.

For example, acetylation at lysine 311 (acK311) and 370 (acK370) was identified at a frequency of greater than 70% in fAD, AD, and CTE; however, acetylation at lysine 281 (acK281) and ubiquitination at lysine 353 (ubK353) occurred at a higher frequency in CBD. Citrullination at arginine 406 (citR406) is highly frequent only in AD and fAD, whereas ubiquitination at lysine 385 (ubK385) is only present in fAD.

Next, we analyzed the endogenous proteolytic cleavages of tau across all subjects (Figure 2B). The most frequent cleavage sites, “hotspots,” in the tauopathies include 7–13, 103–115, 227–229, 244–247, 300–312, 323–332, 355–361, 385, and 408–409 (numbers denote the amino acid positions in 2N4R tau). The diseases with the highest total pathological tau showed high-frequency cleavages (>50%); for example, cleavages at 355–361 were found only in CBD, CTE, AD, and fAD. Furthermore, other cleavages appear more disease specific, as cleavages between 300–312 were primarily detected in CTE, AD, and fAD, whereas cleavage at 279–280 was only observed in fAD. These data suggest that PTMs and cleavage sites could be harnessed as molecular features to characterize different diseases.

Unsupervised clustering of subjects based on PTMs and proteolytic cleavages

Unsupervised hierarchical clustering of binary data (presence/absence of PTMs/cleavages) was used to investigate whether PTMs or cleavages could distinguish among different disease groups (Figure 3). For this analysis, fAD and AD were considered a single patient group called AD. Unsupervised clustering of the patients with PTMs/cleavages ordered from N-terminal (N-term) to C-terminal (C-term) can be found in Figure S2. Clustering of the PTM data resulted in six distinct clusters of subjects (Figure 3A: P1–P6). The subject groups were clustered by the increasing presence of PTMs, with the P1–P4 group displaying the fewest and the P5–P6 group the most PTMs. P1 is the subcluster with the fewest PTMs and is dominated by symptomatic and asymptomatic control subjects with five or fewer PTMs observed per subject. Additional phosphorylation sites in the proline-rich region (PRR) were observed in cluster P2, representing a group of mixed neuropathology diagnoses mainly consisting of PSP and CBD patients. Although the overall number of PTMs increases in P3 and P4, the clustering of P3 is driven by additional acetylation and ubiquitination sites observed in CBD patients, which dominate P3. The mixed neuropathology cluster P4 is distinguishable from the P5 CTE and the P6 AD cluster, showing less phosphorylation in the PRR and C-terminal domains. Clustering of P5 and P6, namely CTE and fAD/AD, identified subjects with the greatest number of PTMs. Cluster P5 contains subjects with CTE and lower Braak stage AD, whereas subcluster P6 comprises subjects with the highest Braak stage AD. The two subjects with DLB neuropathology and comorbid AD were also found in this cluster.
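To make the clustering step concrete, the following is a minimal sketch of agglomerative hierarchical clustering on binary presence/absence profiles using Euclidean distance. The linkage criterion (average linkage here), the subject names, and the six-feature profiles are assumptions for illustration; the text only specifies binary input, and the Figure 3 legend mentions a Euclidean distance measure.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length binary profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def agglomerate(profiles, n_clusters):
    """Average-linkage agglomerative clustering (linkage is an assumption).

    profiles: {subject_id: list of 0/1 values, one per PTM or cleavage}
    Returns a list of clusters, each a sorted list of subject ids.
    """
    clusters = [[name] for name in profiles]

    def linkage(c1, c2):
        dists = [euclidean(profiles[a], profiles[b]) for a in c1 for b in c2]
        return sum(dists) / len(dists)

    while len(clusters) > n_clusters:
        pairs = (
            (i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        # Merge the two closest clusters.
        i, j = min(pairs, key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical subjects with increasing numbers of PTMs (1 = detected).
profiles = {
    "CTR_a": [1, 0, 0, 0, 0, 0],
    "CTR_b": [1, 0, 0, 0, 0, 0],
    "PSP_a": [1, 1, 0, 0, 0, 0],
    "AD_a": [1, 1, 1, 1, 1, 0],
    "AD_b": [1, 1, 1, 1, 1, 1],
    "CTE_a": [1, 1, 1, 1, 0, 0],
}
result = agglomerate(profiles, n_clusters=2)
```

In this toy example, the subjects with few PTMs (controls and PSP) separate from those with many (AD and CTE), mirroring the low-to-high PTM ordering described for clusters P1–P6.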

Figure 3. Pathological state of human subjects is revealed by the hierarchical clustering of insoluble tau PTMs and proteolytic cleavage data.

Each row represents a human subject, and each column in the matrix is a single PTM (A) or cleavage (B) event. To indicate the enrichment of specific disease groups and features in individual clusters, we provide color-coded PTM and region identities above and below the matrix. The Braak stages and the neuropathological diagnosis (NPDX) are provided to the right of the matrix. Color codes for the identity of the PTMs, tau region, matrix, Braak stages, and disease groups are in the legend below the cluster panels. Heatmaps without clustering of PTMs and cleavages are shown in Figure S4.

(A) Each column is assigned to one PTM, with the filled pixel denoting its presence. Unsupervised clustering of binary PTM data (presence/absence) produced six clusters with distinct and increasing numbers of PTMs per subject.

(B) Binary proteolytic cleavage data (presence/absence) were subjected to the same Euclidean distance measure clustering analysis, producing six clusters dominated by specific groups of patients.

Like the hierarchical clustering of PTMs, the unsupervised analysis of the presence and absence of proteolytic cleavage sites identified six distinct clusters (Figure 3B, C1–C6). The subjects with the fewest cleavage sites were found in clusters C1 and C2, dominated by PSP subjects and symptomatic/asymptomatic (DLB/CTR) controls. The C3 and C4 clusters are mixed; however, C4 mainly comprises subjects with AD, CTE, and CBD. On the other end of the cleavage spectrum are the C5 and C6 clusters, with the most cleavage sites. The larger cluster C5 contains AD, CTE, and the two DLB subjects with comorbid AD. C6 consists entirely of AD samples, with all four fAD subjects falling within this cluster. The tau PTM and cleavage profiles indicate the presence of multiple distinct proteoforms in aggregates across these diseases. These modified tau versions probably affect the structure of fibrils, as recently shown by the cryo-EM studies performed on pure synthetic modified versions of tau, where tau fragment sequences and phosphorylation affect the tau structure.24,35,40,41 Data from AD subjects at multiple stages of the disease show that the PTMs accumulate sequentially, the number of modifications increases with disease severity, and the final paired helical filaments reflect the end-stage fibrils. The mixed clusters are likely from subjects whose tau aggregates represent intermediate proteoforms and are not yet dominated by the disease-specific form. Although the binary PTM and cleavage data are not quantitative, they remain valuable in highlighting key PTMs and cleavage patterns that emerge distinctly across disease groups. Additional studies could help to clarify the regional relevance of each tau modification and its potential contribution to aggregation and disease mechanisms.

Unsupervised clustering of subjects based on FLEXITau data

To determine whether the quantitative proteomics data would provide novel insights into the disease groups, we collected and analyzed FLEXITau data. FLEXITau measures the modification extent of detected tau peptides from the N to the C terminus, providing a peptide-resolved quantitative view of the tau protein.40,41 The heatmap provided in the clustering diagram thus shows the modification extent by measuring unmodified peptides from N to C termini. The peptides with the highest abundance and the lowest modification extent are blue, while the peptides with the lowest abundance and hence the highest modification extent are red. The low abundance of peptides can result from chemical modification, proteolytic cleavage, or splicing differences. The unsupervised clustering of the FLEXITau quantitative peptide profiles identified six clusters, F1–F6 (Figure 4). Cluster F1 consists almost entirely of CTR and DLB subjects and correlates with the least insoluble pathological tau. Tau cluster F2 consists primarily of subjects diagnosed with PiD. This PiD group shows a higher modification extent for tau peptides 195–209, 299–317, and 384–395 than the subjects in the F1 cluster. Additionally, PSP subjects dominate the F3 cluster. This cluster is defined by a higher abundance of 4R unmodified tau peptides, specifically those ranging from amino acids 275–281, 281–290, and 299–317, which are from exon 10, compared with cluster F1 (control subjects) and F2 (PiD). In the PSP group, a greater modification extent is observed for peptides in the PRR (175–180, 181–190, 195–209, and 210–224) and the C-terminal peptide 384–395, which distinguishes it from the controls. PSP is distinguishable from all other tauopathies by the N-terminal 6–24 peptide, which is not modified, as also observed for control subjects. 
CBD subjects group together in cluster F4 and display a higher abundance of unmodified tau peptides of amino acids 275–281, 281–290, and 299–317, reflecting their classification as a 4R tauopathy. In contrast to PSP, CBD shows an even higher abundance of these peptides in the 4R domain and shows a higher modification extent at the N and C termini. In addition, the CBD cluster differs from the PSP cluster by showing a higher modification extent of the 210–224 peptide. The CBD cluster also differs from all other clusters by having a lower abundance, and thus a greater extent of modification, of peptide 354–370, which could be explained by ubiquitination sites shown in Figure 2. Cluster F5 consists mainly of CTE subjects and differs from AD and CBD by showing a higher abundance of unmodified peptides spanning amino acids 175–180, 181–190, and 260–267. The PRR is highly phosphorylated in AD and CBD. Lastly, cluster F6 contains mainly AD subjects. It shows the highest degree of modification (∼70%–95% on average) across most peptides in tau, with enrichment (low modification extent) of the microtubule domain, compared with control subjects for peptides 341–347 and 354–370. The FLEXITau data from the soluble fraction are shown in Figure S3. These data show no disease-related clustering, highlighting that this fraction is largely homogeneous.
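The relationship between unmodified peptide abundance and modification extent can be illustrated with a short calculation. This is a simplified sketch, not the published FLEXITau normalization: it assumes each unmodified peptide is measured as a light-to-heavy ratio against a spiked-in standard, that the peptides with the highest ratios are essentially unmodified and can serve as the reference, and that the extent is one minus the normalized ratio. The peptide ranges are taken from the text; the ratios and the `n_reference` parameter are hypothetical.

```python
def modification_extent(light_to_heavy, n_reference=2):
    """Convert unmodified-peptide ratios into percent modification extent.

    light_to_heavy: {peptide label: ratio of endogenous unmodified peptide
                     to the spiked-in standard peptide}
    n_reference:    number of highest-ratio peptides averaged to define
                    the "fully unmodified" level (an assumption).
    """
    top = sorted(light_to_heavy.values(), reverse=True)[:n_reference]
    reference = sum(top) / len(top)
    return {
        peptide: max(0.0, 100.0 * (1.0 - ratio / reference))
        for peptide, ratio in light_to_heavy.items()
    }


# Hypothetical ratios for four tau peptides (2N4R numbering).
ratios = {
    "6-24": 1.0,
    "195-209": 0.25,
    "354-370": 1.0,
    "384-395": 0.5,
}
extent = modification_extent(ratios)
```

A peptide recovered at a quarter of the reference level is reported as 75% modified. As noted in the text, this loss of unmodified peptide can reflect chemical modification, proteolytic cleavage, or splicing differences, so the extent is not specific to one type of modification.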

Figure 4. Clustering of human disease groups using insoluble fraction FLEXITau data.

FLEXITau measures the extent of modification for each peptide, and these data include contributions from cleavage and chemical modification. Each row represents one patient, and each column represents the modification extent of one peptide measured by FLEXITau. The separation into six clusters highlights distinct patient groups that overlap with the neuropathology diagnosis. The legend provides the modification extent as a colored scale, showing 100% as red and 0% as blue. The total tau amount and the Braak stage are provided to the right of the matrix. The legend denotes the identity of each patient group, listed using the color code.

See also Figure S3.

The FLEXITau clustering analysis uses continuous quantitative data, measuring the modification extent by both PTM and cleavage instead of discrete data regarding the presence or absence of PTMs and cleavage. The clustering analysis using FLEXITau is thus more successful at separating diseases, as the quantitative data account for the domain-specific PTMs, the cleavages, and the extent of modification.

Supervised classification of subjects using machine learning

The distinct patterns of PTMs, cleavage, and FLEXITau modification extent displayed by the unsupervised hierarchical clustering analyses suggest that supervised classification employing machine learning could be used to build classifiers for diagnostic and therapeutic purposes for future patient samples. Supervised classifiers use a training dataset of known disease groups and features to identify distinctive sets of features that can categorize patient groups. Such a classifier separates a specific tauopathy from all other diseases using selected molecular features, which provide excellent targets for companion diagnostics and therapeutic intervention toward specific tauopathies.

Individual classifiers were trained for each data type and tauopathy with at least five samples, resulting in three sets of six classifiers. The symptomatic and healthy controls were combined into a single control group. Different classification methods were tested, including gradient boosting, support vector machines, regression, and random forest (RF), and at least ten models were trained for each disease and dataset, as shown in Figure 5A and Table S4. RF was identified as the most successful classifier across all disease groups, showing high sensitivity and specificity, with an average area under the receiver operating characteristic curve (AUC) of 0.86 ± 0.13 (SD) across all three datasets, where the pairwise comparison was between a specific tauopathy and a balanced group of mixed tauopathies and controls (Figures 5B–5D).
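The AUC used to compare classifiers has a direct interpretation: it is the probability that a randomly chosen subject with the target tauopathy receives a higher classifier score than a randomly chosen subject from the comparison group. The sketch below computes it from that definition for a one-versus-rest setting. The labels and scores are hypothetical and do not come from the study's classifiers.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for the target disease, 0 for the comparison group
    scores: classifier scores (higher means more likely target disease)
    Ties between a positive and a negative count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes are required to compute an AUC")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical test set: 3 target-disease subjects, 4 others.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1]
auc = roc_auc(labels, scores)
```

In this example, 11 of the 12 positive-negative pairs are ranked correctly, giving an AUC of about 0.92. An AUC of 0.5 would correspond to random ranking and 1.0 to perfect separation.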

Figure 5. Key disease-specific features of insoluble tau identified by machine learning.

(A) A schematic of the classifier development, including the training and test phases. Each data type (PTM, including 4 isoform peptides; cleavage, including 4 isoform peptides; and FLEXITau) was divided into a training and test set. The training set was used to construct a computational classifier to discriminate each category using different methods, including RF. The modeled classifier was then applied to the independent test dataset to evaluate its performance.

(B–D) The results of the testing for three data types are shown in (B), (C), and (D). AUC and TSNE distribution of disease groups using PTM (B), cleavage (C), and FLEXITau (D) data. The performance of classifiers for each patient category was assessed on an independent test set, and the AUCs for each category are provided to the left of each TSNE distribution.

(E) Tau key molecular features associated with pathological tau across the different tauopathies. Pathological tau levels in CTR, PSP, PiD, CBD, CTE, and AD are arranged from low to high abundance. Key PTMs and cleavage sites are mapped using color-coded markers on the 2N4R tau isoform. Filled circles and triangles indicate that the presence of this modification, and cleavage is important for the disease classifier. In contrast, circle and triangle outlines indicate the importance of the absence of individual PTMs and cleavages. A blue to red gradient represents the modification extent across protein domains (0%–100%).

See also Figures S4 and S5.

The PTM classifier separated AD, CTE, and CBD from the other tauopathies with AUC values above 0.84, as highlighted through the t-distributed stochastic neighbor embedding (TSNE) representation in Figure 5B. AD and CTE groups also clearly separate from the other diseases and controls based on cleavage (Figure 5C), highlighting that proteolytic cleavage is the most prevalent feature in these two tauopathies. Lastly, the FLEXITau data provided the most apparent distinction between groups (Figure 5D), emphasizing the value of continuous quantitative data to classify and stratify patients accurately. To validate the classifications, the classifiers for FLEXITau were retrained on a smaller set of 11 features and tested on an independent cohort, as shown in Figure S4, highlighting comparable classification performance between cohorts. Detailed patient demographics of this independent cohort are provided in Table S1. In summary, all classifiers are excellent at discriminating specific tauopathies from both symptomatic and healthy controls. For PiD, complementing the FLEXITau classifier with the PTM classifier would improve specificity.

The classifiers also enable the identification of the most important features driving the classification, and Figure 5E summarizes the most statistically significant features identified for each disease, providing insights relevant to both diagnosis and therapeutic considerations. Details of individual classifiers are presented in Figure S5, which highlights the top 5 features identified from each data type (PTMs, cleavage sites, and FLEXITau), showing that each disease is distinct and can be differentiated by the specific set of features identified. Some features, such as pT231, were found to be significantly absent in the control subjects; pT231 is present in all tauopathy disease subjects but does not contribute to the classification of the individual tauopathies.

The schematic in Figure 5E compares tau pathology across diseases, arranged from the lowest to the highest levels of insoluble pathological tau. The control (CTR) group, mixed with the symptomatic controls, DLB, serves as a baseline, exhibiting minimal tau pathology. PSP has ∼2 times more tau aggregates than CTR and features the fewest modifications and cleavage sites. PiD also displays ∼2 times the amount of aggregated tau relative to CTR; however, the phosphorylation extent is slightly higher than CTR and PSP tau.

The absolute abundance of tau (Figure 1B) was included in the classifier input and is significant for AD and control. We depict the median abundance ranking in the graphical abstract. CBD displays ∼10 times more pathological tau than CTR, characterized by extensive phosphorylation, citrullination of the PRR domain, and acetylation and ubiquitination of the microtubule-binding domain (244–365), alongside C-terminal cleavage. CTE exhibits even greater pathology, with ∼12 times more tau aggregation than CTR. AD shows the most severe pathology, with tau aggregation levels ∼81 times higher than CTR, accompanied by the most extensive PTMs and cleavage events. A common phenomenon is that tau cleavage is associated with a significant increase in pathological tau. In addition, charge neutralization of the microtubule-binding domain (MBD) by acetylation and ubiquitination likely stabilizes tau aggregates, potentially resulting in the highest levels of pathological tau.
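The fold changes quoted above follow from the median insoluble tau amounts reported for Figure 1B. As a worked check, the sketch below divides each rounded median by the CTR median. Using CTR alone as the baseline is an assumption, and small differences from the quoted values (for example, for CTE) are expected because the medians in the text are themselves approximate.

```python
# Approximate median insoluble tau amounts (fmol/mg wet weight) from the text.
median_tau = {
    "fAD": 3450,
    "AD": 1220,
    "CTE": 190,
    "CBD": 150,
    "PSP": 30,
    "PiD": 30,
    "DLB": 20,
    "CTR": 15,
}

# Fold change of each group relative to healthy controls (CTR).
baseline = median_tau["CTR"]
fold_vs_ctr = {group: amount / baseline for group, amount in median_tau.items()}

# Print groups from the lowest to the highest fold change.
for group, fold in sorted(fold_vs_ctr.items(), key=lambda item: item[1]):
    print(f"{group}: about {fold:.1f}x CTR")
```

With these rounded medians, PSP and PiD come out at 2 times CTR, CBD at 10 times, CTE at roughly 12.7 times, and AD at roughly 81 times, consistent with the approximate values given in the text.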

Conclusions

Although it has been established that the protein structure of fibrils in each tauopathy is unique,24–28,37,46,47 the protein chemistry of tau across tauopathies has not been explored. Here, using our FLEXITau platform, which includes untargeted and targeted MS-based proteomics, we provide a detailed map of tau, PTMs, cleavage patterns, and patient frequency of PTMs and cleavages in each disease group. The platform further specifies the absolute tau abundance and the modification extent (stoichiometry) from the N to C termini of tau. Unsupervised clustering methods result in the clustering of patients based on the data provided without considering the clinical diagnosis. Interestingly, the FLEXITau data, which provide the extent of modification (cleavages and PTMs) for each peptide, are the most effective in separating groups. Machine learning methods are then used to train classifiers that can distinguish the patient groups and provide key chemical and cleavage sites important to disease classification. This information is critical for disease stratification, diagnosis, and developing therapeutic molecules, as explained below.

Proteins are nanomachines optimized to carry out specific functions or enzymatic reactions; however, PTMs and cleavages can tune or drastically change the physico-chemical properties, structure, and hence function of these nanomachines. In humans, the tau protein has six isoforms that stabilize microtubules (the highways of transport within a cell body, an axon, or a dendrite), enabling the distribution of mitochondria, nutrients, mRNAs, and exosomes. For example, isoforms of tau exhibit functional differences in binding affinity to microtubules, depending on whether they are the 3R or 4R isoforms.48–50 The 145 PTMs and 195 cleavage sites that were mapped across tau change the molecular properties of the protein, meaning that the primary sequence of tau and its chemistry differ between health and disease. These changes result in distinct disease proteoforms of tau, where physicochemical features, including degrees of freedom as well as steric and ionic properties, are likely to affect tertiary and quaternary structures. Importantly, the study finds that each disease has a dominant and specific proteoform, which may also affect the fibril structure.

This study highlights the observation that a single gene and its isoforms result in multiple protein chemistries by the action of specific pathways involving enzymes that can modify the properties of the primary amino acids by introducing protein PTMs and cleavages. These modifications tune or modify the function of proteins. These enzymes are writers if they add a chemical group to the protein and erasers if they remove a chemical group. Writer and eraser molecules are induced or shut down during normal development, aging, and stress conditions. The fact that the PTMs and cleavage patterns of tau are particular to each disease suggests that writers and erasers are likely induced by different pathways for each tauopathy. In addition, the differences in tau chemistries suggest that tau’s properties are distinct and can contribute to the variations in disease progression and severity observed for each disease.

Current MS instrumentation is extremely sensitive, enabling the detection of PTM-modified peptides at low concentrations. This high sensitivity is crucial for assessing the frequency of a modification across a patient group, and the stoichiometry of that modification, as it provides insights into the relevance of specific PTMs and cleavage sites to disease.51 Thus, the absolute quantification of tau and the extent-of-modification data, together with the machine learning approach, are essential to identifying key features.

Our analyses show that in AD, CTE, and CBD, tau is extensively cleaved in the fibrils. The cleavage removes the N regions and enriches the MBD. This suggests that the N region interferes with protein aggregation and that the majority of tau in the aggregates contains full-length 0N or tau, with cleavage sites flanking the MBD. Interestingly, tauopathies with cleaved tau contain the highest concentration of fibrils, suggesting that the cleavage contributes to the stability of the fibril, together with the charge-neutralizing MBD modifications, including ubiquitin and acetylation, identified for AD and CTE. For decades, in vitro studies have shown that tau lacking its N and C termini forms fibrils more readily than full-length tau, likely due to an entropic effect arising from fewer degrees of freedom. At the other end of the tauopathy spectrum, PSP tau proteoforms show the least number of modifications, with only phosphorylation in the PRR and C-terminal domain, and display minimal cleavage. Further, given that the abundance of fibrilized tau is low, this PSP tau proteoform does not fibrilize effectively. However, PSP is associated with younger patients and progresses faster than sporadic AD.52 Thus, the modifications observed in PSP likely drive disease progression. One interpretation is that the PSP tau forms unstable fibrils, which are mostly oligomeric—the most toxic form, as it can be transferred from cell to cell. This raises the possibility that the rapid formation of stable tau fibrils may be protective and less prone to spreading53 —a testable hypothesis.

In conclusion, the level of detail gleaned with the FLEXITau platform enables the identification of key molecular features of tauopathies that are critical for designing diagnostics and therapeutics. Clinical pathological analyses for the post-mortem diagnosis of tauopathy use a single antibody, namely AT8, to the S202 and T205 phosphorylation sites, which measure patterns of staining across different brain regions for definitive diagnosis. The identified features from the classifiers in Figure 5E will be key to expanding the repertoire of reagents in the future, toward staging and understanding each disease. Further, the quantitative information for proteoforms identified with PTMs and cleavage sites is critical for developing PET ligands and body-fluid diagnostic assays. The quantitative and qualitative data for pathologic and normal tau across patients provided here allow chemists to design molecules and model dosage, kinetics, and affinity for candidate PET ligands, biologics, and small-molecule therapies. To advance the field toward precision medicine, this comprehensive characterization of tau proteoforms across tauopathies provides a molecular foundation for developing disease-specific interventions targeting each condition’s unique pathological signatures.

Limitations of the study

This study presents several considerations that should be acknowledged to provide context for the findings. First, the study benefits from analyzing a cohort of 203 post-mortem brain tissue samples across eight different subject groups. Although this represents a substantial dataset, expanding the cohort size could enhance the statistical power and robustness of biomarker identification. Specifically, increasing the size and balance of disease groups could help provide a more comprehensive representation of each group, strengthening the reliability and relevance of findings across different tauopathies. For example, classifier performance was robust for AD, CBD, and controls but more variable for rare tauopathies such as PSP and PiD, highlighting the need for larger, balanced cohorts to improve generalizability. Another point to consider is the use of post-mortem brain tissue, which offers valuable insights into detailed proteomic analysis. Although pT217, which we previously identified as an early disease feature distinguishing control from AD patients,33 is now an FDA-approved clinical biomarker of AD in blood, this required several other studies to test and validate pT217 as a biomarker.54–56 The features identified here likewise require future validation in more accessible sample types, such as blood or cerebrospinal fluid, to extend the clinical relevance of the findings and facilitate their broader application as biomarkers.

RESOURCE AVAILABILITY

Lead contact

Further information and requests for resources and reagents should be directed to the lead contact, Judith A. Steen (judith.steen@childrens.harvard.edu).

Materials availability

No new, unique reagents were generated in this study.

Data and code availability

  • Data: the MS proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE57 partner repository with the dataset identifiers PRIDE: PXD069495; PRIDE: PXD069437; PRIDE: PXD069249; and PRIDE: PXD069217 and are publicly available as of the date of publication.

  • Code: all code and intermediary data files for the processing and analysis of the MS data and for data visualization are available via Zenodo at https://doi.org/10.5281/zenodo.17712260 and are publicly available as of the date of publication.

  • Any additional information required to reanalyze data reported in this paper is available from the lead contact upon request.

STAR★METHODS

EXPERIMENTAL MODEL AND STUDY PARTICIPANT DETAILS

In total, 203 human participants with AD, CTE, CBD, PiD, and PSP and symptomatic (DLB) and asymptomatic controls (CTR) were selected for the first cohort. Of these, 109 brain tissue samples were selected from the Massachusetts Alzheimer’s Disease Research Center (MADRC) brain bank, 31 from the Boston University Chronic Traumatic Encephalopathy Center (VA-BU-CLF) through the UNITE brain bank, 8 from the Harvard Brain Tissue Resource Center (HBTRC) at McLean Hospital, 30 from the Mayo Clinic Brain Bank, 19 from the Human Brain and Spinal Fluid Resource Center (HBSFRC), and 6 from the University of Maryland Brain and Tissue Bank (UMD BTB). The second cohort for FLEXITau analysis encompasses 142 human participants with AD, CBD, PiD, PSP, and CTR. Of these, 55 brain tissue samples were selected from the Human Brain and Spinal Fluid Resource Center (HBSFRC), 39 from the Harvard Brain Tissue Resource Center (HBTRC) at McLean Hospital, 35 from the Neurodegenerative Disease Brain Bank (NDBB) at University of California, San Francisco, 7 from the University of Maryland Brain and Tissue Bank (UMD BTB), and 6 from the University of Miami Brain Bank Cohort (MBB). Samples were collected at the respective brain banks, with informed consent of patients or their relatives and approval of local institutional review boards. Samples were selected based on the following criteria: (1) clinically diagnosed with dementia due to one of the aforementioned diseases (probable diagnosis); (2) diagnosis of any of the aforementioned diseases confirmed post-mortem by a neuropathologist; and (3) Braak neurofibrillary tangle (NFT) stage V or VI for AD, as determined by the location of NFTs with a total Tau immunostain and Bielschowsky’s silver stain.20–22 Age at death, post-mortem interval, and sex were also collected, and the patient demographics summary is listed in Table S1.
Autopsy tissues from human brains were collected at the respective brain banks with the informed consent of patients or their relatives and the approval of local institutional review boards.

Demographics of Cohort

This study utilizes existing autopsy brain tissue from de-identified individuals obtained from public repositories, which were characterized and diagnosed by pathologists from NIH/NIA-funded biobanks that strictly adhere to privacy and IRB stipulations. As such, these samples are archived with IRB consent and approval and distributed without any identifying information.

Sex and/or Gender

All samples from the biobanks included balanced numbers of each sex whenever possible. Brain tissues from subjects with frontotemporal degeneration are rare; therefore, obtaining equal numbers of subjects of each sex is sometimes not feasible. The exception is Chronic Traumatic Encephalopathy (CTE) for sociopolitical reasons explained below.

CTE subjects in brain banks are male sportsmen and veterans. Females were not included in the study because samples are currently unavailable. The participation of females in contact sports only began in 1972, following the passage of Title IX, the Education Amendments Act, which mandated the participation of women in all sports, including contact sports, in which they had previously been underrepresented. As such, these participants have not yet reached the age at which tauopathy would result in death. Additionally, the ban on women in combat was only lifted in 2015; therefore, female subjects with CTE have also not reached an age at which they would donate their brain tissues. It is possible that the lack of female CTE subjects in this study affected the CTE cohort results. Including women in future studies on chronic traumatic encephalopathy could potentially yield different results from those observed in this study.

Age and developmental stage of subjects

The incidence of tauopathy increases with age, and subjects with tauopathy have an average age of 73 years. To ensure that differences between control and disease subjects were not due to age, age-matched control subjects were used whenever possible. PCA analyses of the study data do not reveal age-dependent clusters within the cohort.

Ancestry, Race, and Ethnicity

The NIH tissue banks do not exclude tissue donations from individuals based on ancestry, race, or ethnicity; however, the cohort studied is likely largely Caucasian, given the demographics of donors to the NIH brain banks. The biobanks providing the tissues did not supply us with race, ethnicity, or ancestry, so no data analysis could be performed to evaluate the influence of these factors in the study. It is possible that this factor may bias the results of this study.

METHOD DETAILS

Preparation of the sarkosyl fractions

Approximately 150 mg post-mortem brain tissue was homogenized using a Precellys® 24 tissue homogenizer (5500 speed, 3 cycles of 20 sec with a pause of 30 sec in between) in 640 μL TBS lysis buffer containing 50 mM Trizma hydrochloride (Tris-HCl), 150 mM sodium chloride (NaCl), 0.5 mM magnesium sulfate (MgSO4), 10 mM ethylenediaminetetraacetic acid (EDTA), 10 mM ethylene glycol tetraacetic acid (EGTA), 1 mM dithiothreitol (DTT), 10 mM nicotinamide, 2 μM Trichostatin A, phosphatase inhibitor cocktail 2 and 3 (Sigma), and phosphatase and protease inhibitor cocktail tablets (Roche). Tissue lysates were centrifuged at 14000 rpm for 20 min at 4 °C. The supernatant (S1) was transferred to a new tube. The pellet was resuspended in 640 μL of 1X salt sucrose buffer, containing 0.8 M NaCl, 10% sucrose, 10 mM Tris-HCl, 1 mM EDTA, 1 mM EGTA, 10 mM nicotinamide, 2 μM Trichostatin A, 1 mM DTT, phosphatase inhibitor cocktail 2 and 3 (Sigma), and phosphatase and protease inhibitor cocktail tablets (Roche). The samples were homogenized as above and sonicated on a Qsonica sonicator (time 30 sec, pulse 10 sec with 5-sec pause and an amplitude of 20%, 1 round for each sample). The samples were centrifuged as above, and the supernatant (S2) was transferred into a new tube. To supernatant S1, 640 μL of 2X salt sucrose buffer was added to get the same concentration of salt sucrose buffer in both the S1 and S2 samples. A final concentration of 1% sarkosyl was added to each sample and incubated at 25°C for 90 min at 300 rpm. Samples were transferred to microfuge polypropylene tubes and ultra-centrifuged at 50000 rpm for 90 min at 4 °C. The supernatant, containing the sarkosyl-soluble proteins, was transferred to a new tube, and the pellet containing pathogenic insoluble filament Tau was stored at −80 °C until further processing.

Filter-aided sample preparation of the sarkosyl fractions

Sarkosyl-insoluble and sarkosyl-soluble fractions were further processed and analyzed by liquid chromatography-tandem mass spectrometry (LC-MS/MS). Sarkosyl-insoluble fractions were resuspended in 100 μL pellet buffer, containing 50 mM Tris-HCl, 5% sodium dodecyl sulfate (SDS), 10 mM nicotinamide, 2 μM Trichostatin A, 8 M urea, phosphatase inhibitor cocktail 2 and 3 (Sigma), and phosphatase and protease inhibitor cocktail tablets (Roche).

Total protein concentration was determined by bicinchoninic acid assay (Pierce BCA Protein Assay Kit, Thermo Fisher Scientific, Waltham, MA, USA). Approximately 50 μg of the total protein sample was digested using the filter-aided sample preparation (FASP) method as previously described,58 and outlined in detail below.

Total protein was denatured using 250 μL 8 M urea (urea dissolved in 50 mM ammonium bicarbonate (ABC)), and disulfide bonds were reduced using 30 μL 10 mM dithiothreitol. The samples were incubated for 30 minutes at 37 °C and 600 rpm on a thermomixer. Samples were transferred to 10 kDa molecular weight cut-off spin column filters (Millipore) and centrifuged at 13400 rpm and 23 °C for 30 minutes. Afterwards, 200 μL 8 M urea was added to the sample, which was again centrifuged at 13400 rpm and 23 °C for 25 minutes. Cysteines were alkylated using 1% acrylamide, and samples were incubated for 30 minutes at 23 °C and 600 rpm in the dark on a thermomixer. Samples were centrifuged for 20 minutes at 13400 rpm and 23 °C. Then, 100 μL 8 M urea was added, followed by a 15-minute centrifugation at 13400 rpm and 23 °C. This step was done twice. It was followed by four washing steps with 100 μL 50 mM ABC, each followed by centrifugation for 15 minutes at 13400 rpm and 23 °C. Reduced and alkylated protein mixtures were digested with the endoproteinase trypsin (sequencing grade modified trypsin, Promega, Madison, WI, USA) overnight at 37 °C with a protease:protein ratio of 1:25 (w/w).

The following day, the Millipore filters were transferred to a new tube, and were centrifuged for 10 minutes, at 13400 rpm, at 23 °C. Peptides were eluted first with 50 μL 50 mM ABC, and finally with 0.5 M NaCl (pH 8.5), and centrifuged for 20 minutes at 13400 rpm for each step.

Peptide eluates were acidified with formic acid (FA) and desalted using reversed-phase C-18 microspin columns (SEMSS18V, Nest Group, Ipswich, MA, USA). Peptides were vacuum-dried and reconstituted in sample buffer (5% formic acid, 5% acetonitrile) compatible with LC-MS for subsequent analysis.

Mass spectrometric analysis of the sarkosyl fractions

Samples were randomized and analyzed using a timsTOF Pro mass spectrometer coupled with an ultra-high-pressure nano-flow liquid chromatography nanoElute system (Bruker, Bremen, Germany). Peptides were loaded onto a reversed-phase Aurora Series C18 analytical column (25 cm x 75 μm ID, 1.6 μm C18) fitted with a captive spray insert (IonOpticks, Fitzroy, Australia). The column temperature was maintained at 50 °C, and mobile phases A (2% acetonitrile and 0.1% formic acid in water) and B (0.1% formic acid in acetonitrile) were used to separate the peptides at a constant flow of 400 nL/min using a linear gradient starting from 0% to 30% B in 90 min, followed by an increase to 80% B within 10 min, followed by washing and re-equilibration for 20 min. Mass spectra were acquired on a hybrid trapped ion mobility spectrometry (TIMS) – quadrupole time of flight (TOF) mass spectrometer (timsTOF Pro) with a modified nano-electrospray CaptiveSpray ion source (Bruker, Bremen, Germany). The mass spectrometer was operated in parallel accumulation-serial fragmentation (PASEF) mode. Full mass spectra were acquired in a mass range of 100–1700 m/z and an ion mobility (1/k0) range of 0.60–1.60. Ten PASEF MS/MS scans per topN acquisition cycle were acquired.

Mass spectrometric analysis for the secondary cohort was carried out as previously described.33 Briefly, mass spectrometric data were acquired on a Q Exactive mass spectrometer (Thermo) coupled to a micro-autosampler AS2 and a nanoflow HPLC pump (Eksigent). Peptides were separated using an in-house packed C18 analytical column (Magic C18 particles, 75 μm x 15 cm; AQUA C18/ 3 μm, Michrom Bioresource) by a linear 120 min gradient starting from 95% buffer A (0.1% (v/v) formic acid in HPLC-H2O) and 5% buffer B (0.2% (v/v) formic acid in acetonitrile) to 35% buffer B. A full mass spectrum with a resolution of 70,000 (relative to an m/z of 200) was acquired in a mass range of 300–1500 m/z. The 10 most intense ions were selected for fragmentation via higher-energy C-trap dissociation.

FLEXITau standard preparation

FLEXITau analysis was performed as previously described,40,41 with one notable difference: for the primary cohort, the 0N and 1N isoform-specific peptides were part of the FLEXITau construct.

In brief, the heavy-labelled FLEXITau standard was expressed in a lysine and arginine dual auxotrophic strain of E. coli using MDAG-135 synthetic media59 (containing 1 M magnesium sulfate (MgSO4), 1000X metals, 20% glucose, 25% aspartate (containing aspartic acid and sodium hydroxide (NaOH)), 50xM (containing sodium phosphate dibasic (Na2HPO4), potassium phosphate monobasic (KH2PO4), ammonium chloride (NH4Cl), and sodium sulfate (Na2SO4)), 15 amino acids (10 mg/mL, no C, M, Y, K & R), methionine (25 mg/mL), 10 mM thiamine and kanamycin (30 mg/mL)), supplemented with heavy isotope-labelled lysine and arginine. Expression was induced by IPTG, and the cells were harvested after 4 hours of induction at 37 °C. The cells were centrifuged at 5000 rpm and 4 °C for 10 minutes, and the supernatant was discarded. The pellet was resuspended in PBS (1X, pH 7.4), aliquoted, and stored at −20 °C until further processing.

For further processing, 40 μL of protease inhibitor cocktail solution (1 tablet dissolved in 1 mL water) and 5 μL of 10x benzonase were added to 500 μL of E. coli lysate. The mixture was incubated at 37 °C for 30 minutes on a thermomixer.

Total protein concentration was determined by bicinchoninic acid assay (Pierce BCA Protein Assay Kit, Thermo Fisher Scientific, Waltham, MA, USA). Approximately 100 μg of the total protein sample was digested using the filter-aided sample preparation (FASP) method as described above, with one change in the protocol: 30 kDa molecular weight cut-off spin column filters (Millipore) were used instead of the 10 kDa filters (Millipore).

Peptides were vacuum-dried and reconstituted in sample buffer (5% formic acid, 5% acetonitrile). The prepared heavy standard peptides were spiked into the sarkosyl fractionations together with a known amount of aqua 3R and aqua flex peptides.

FLEXITau analysis of sarkosyl-insoluble Tau

For the primary and secondary cohorts, a total of 15 and 11 Tau peptides, respectively, were measured by the FLEXITau assay. The targeted FLEXITau analysis was conducted using a micro-autosampler AS2 and a nanoflow HPLC pump module (Eksigent / Sciex, Framingham, MA, USA) coupled to a triple quadrupole mass spectrometer (Sciex Qtrap 5500). The chromatographic separation was performed on a Protecol C18G (200Å, 250 mm x 300 μm ID; Trajan, Australia) at a flow rate of 5 μL/min over a 25-min run. A gradient of solvents A (water and 0.1% formic acid) and B (acetonitrile and 0.1% formic acid) was used for elution. Samples were loaded onto the column, and peptides were eluted using the following gradient: 2%–10% solvent B in 2 min, 10%–35% B in 13 min, and 35%–90% B in 5 min. Finally, the column was re-equilibrated at 2% B for 5 min. Three to five transitions were monitored for each precursor by selected reaction monitoring (SRM) with a retention time window of 45 s and a target scan time of 0.5 s to ensure an optimal number of data points per peak. Data were analyzed with Skyline-daily (version 20.1.9.234),60 and each paired light and heavy precursor was manually checked for similar ratios among transitions, peak shape, and retention time. The peak areas of light and heavy peptides were then exported for further calculations.
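The SRM gradient described above can be written down as a piecewise-linear function of time, which makes it easy to check that the segments add up to the 25-min run. The following is a minimal sketch only; the function name `percent_b` and the constants are hypothetical, and the return from 90% to 2% B at the start of re-equilibration is assumed to be instantaneous, which the text does not specify.

```python
from bisect import bisect_right

# (time in min, % solvent B) breakpoints taken from the gradient in the text:
# 2-10% B in 2 min, 10-35% B in 13 min, 35-90% B in 5 min.
GRADIENT = [(0.0, 2.0), (2.0, 10.0), (15.0, 35.0), (20.0, 90.0)]
# Assumption: %B drops to 2% immediately after 20 min and is held there
# for the 5-min re-equilibration.
REEQUILIBRATION_B = 2.0
RUN_LENGTH_MIN = 25.0


def percent_b(t_min: float) -> float:
    """Return the programmed % solvent B at time t_min (minutes)."""
    if not 0.0 <= t_min <= RUN_LENGTH_MIN:
        raise ValueError("time outside the 25-min run")
    if t_min > GRADIENT[-1][0]:
        return REEQUILIBRATION_B
    times = [t for t, _ in GRADIENT]
    i = bisect_right(times, t_min) - 1
    if i == len(GRADIENT) - 1:
        return GRADIENT[-1][1]
    (t0, b0), (t1, b1) = GRADIENT[i], GRADIENT[i + 1]
    # Linear interpolation within the current gradient segment.
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


if __name__ == "__main__":
    for t in (0, 1, 2, 8.5, 15, 20, 22):
        print(t, percent_b(t))
```

The segment durations (2 + 13 + 5 + 5 min) sum to the stated 25-min run length.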

QUANTIFICATION AND STATISTICAL ANALYSIS

Data analysis

The DDA timsTOF data were converted into MGF format using Bruker DataAnalysis software and subsequently analyzed using ProteinPilot Software v5.0.1 (Sciex) against a Homo sapiens UniProt protein database containing 42389 entries (downloaded on 2020-08-14), supplemented with common contaminants cRAP protein sequences (116 entries, downloaded on 2019-03-03). The following settings were applied: instrument type ‘Orbi-FT MS (1–3 ppm)’; ‘LTQ MSMS’; ‘Urea denaturation’; ‘Cysteine alkylation-Acrylamide’; ‘Digestion-Trypsin’; ‘thorough’ search mode; ‘phosphorylation emphasis’; ‘ID focus on biological modifications’; ‘FDR Analysis-Yes’. As a second search, the above settings were kept while ‘phosphorylation emphasis’ was changed to ‘acetylation emphasis’.

Tau peptides were extracted from the ProteinPilot results tables at a local peptide FDR level of 5% for each sample using in-house Java code (Java v1.8.0_361) in the Eclipse IDE (v4.20.0). Additionally, peptides were filtered so they only contained phosphorylation on STY, methylation on K, acetylation on KR, ubiquitination on KS, and citrullination on R. Oxidation on M and deamidation of QN were only considered if another modification mentioned above was present. Sequences containing any other modification were removed. Peptides were then realigned to the Tau protein isoform sequences of 2N4R, 0N, 1N, and 3R, and modification and cleavage sites were calculated relative to the 2N4R coordinates, with proteolytic cleavage sites denoted by the amino acid following the cleavage site. Peptides and sites not present in these isoforms were removed. Results from both the ‘phosphorylation emphasis’ and ‘acetylation emphasis’ searches were combined for each sample, and overall results were combined across the entire sample set.
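The modification filter described above can be illustrated with a short sketch. This is not the in-house Java code; the function name, the modification labels, and the example peptides are hypothetical. Two readings of the text are assumed and marked in the comments: unmodified peptides are retained, and a peptide carrying only oxidation or deamidation is removed.

```python
# Residues on which each modification is accepted (taken from the text).
ALLOWED = {
    "phospho": set("STY"),
    "methyl": set("K"),
    "acetyl": set("KR"),
    "ubiquitin": set("KS"),
    "citrulline": set("R"),
}
# Accepted only alongside at least one modification from ALLOWED.
CONDITIONAL = {"oxidation": set("M"), "deamidation": set("QN")}


def filter_peptide(sequence, mods):
    """Apply the filter to one peptide.

    mods is a list of (modification name, 1-based position) tuples.
    Returns the retained modifications, or None if the peptide is removed.
    """
    primary, conditional = [], []
    for name, pos in mods:
        residue = sequence[pos - 1]
        if residue in ALLOWED.get(name, ()):
            primary.append((name, pos))
        elif residue in CONDITIONAL.get(name, ()):
            conditional.append((name, pos))
        else:
            return None  # any other modification removes the sequence
    if conditional and not primary:
        # Assumption: oxidation/deamidation without an accompanying
        # accepted modification removes the peptide.
        return None
    # Assumption: unmodified peptides pass through with an empty list.
    return primary + conditional


if __name__ == "__main__":
    # Hypothetical peptide used for illustration only.
    print(filter_peptide("AMSK", [("phospho", 3), ("oxidation", 2)]))
    print(filter_peptide("AMSK", [("oxidation", 2)]))
```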

Individual modifications and cleavage sites were then identified, and the presence and absence of modifications were encoded with 1 and 0, respectively, resulting in matrices of modifications and cleavage sites across all samples using in-house R scripts in R (v4.1.0) using RStudio (v1.4.1717). Additionally, individual consecutive cleavage sites were collapsed into cleavage hotspots, and frequencies of cleavages within each hotspot per sample were calculated. Subsequently, frequencies per disease group were also calculated.
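As an illustration of the encoding step, the sketch below builds a presence/absence matrix and collapses consecutive cleavage positions into hotspots. It is written in Python rather than the in-house R scripts, all names and example positions are hypothetical, and the per-sample hotspot frequency is assumed to be the fraction of positions within a hotspot that are cleaved in that sample, a definition the text does not spell out.

```python
def presence_matrix(sites_per_sample):
    """Encode presence (1) or absence (0) of every observed site per sample."""
    all_sites = sorted(set().union(*sites_per_sample.values()))
    matrix = {
        sample: [1 if site in sites else 0 for site in all_sites]
        for sample, sites in sites_per_sample.items()
    }
    return all_sites, matrix


def collapse_hotspots(sites):
    """Group consecutive residue positions into (start, end) hotspots."""
    hotspots = []
    for pos in sorted(set(sites)):
        if hotspots and pos == hotspots[-1][1] + 1:
            hotspots[-1] = (hotspots[-1][0], pos)  # extend the current run
        else:
            hotspots.append((pos, pos))  # start a new hotspot
    return hotspots


def hotspot_frequency(sample_sites, hotspots):
    """Fraction of each hotspot's positions cleaved in one sample (assumed definition)."""
    return {
        (start, end): sum(p in sample_sites for p in range(start, end + 1))
        / (end - start + 1)
        for start, end in hotspots
    }


if __name__ == "__main__":
    # Hypothetical cleavage positions for two samples.
    samples = {"s1": {2, 3, 4, 10}, "s2": {3, 11}}
    sites, matrix = presence_matrix(samples)
    hotspots = collapse_hotspots(sites)
    print(sites, matrix, hotspots)
    print(hotspot_frequency(samples["s2"], hotspots))
```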

To identify structures in the data, binary and frequency data for PTMs and cleavage sites, respectively, were clustered in R (v4.1.0) using hierarchical clustering. The data were clustered using the Euclidean distance metric and the ‘ward.D’ option as the clustering metric.
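To make the clustering settings concrete, the following is a minimal pure-Python sketch of agglomerative clustering on Euclidean distances using the Lance-Williams Ward update. It assumes, based on our reading of the R `hclust` documentation, that the ‘ward.D’ option applies this update to the dissimilarities as supplied (unsquared), in contrast to ‘ward.D2’. The function name and toy data are hypothetical, and the sketch is not the analysis code used in the study.

```python
import math
from itertools import combinations


def ward_d_clusters(points, n_clusters):
    """Merge clusters until n_clusters remain; return sorted member index lists."""
    clusters = {i: [i] for i in range(len(points))}
    # Pairwise Euclidean distances between the initial singleton clusters.
    dist = {
        frozenset((i, j)): math.dist(points[i], points[j])
        for i, j in combinations(clusters, 2)
    }
    next_id = len(points)
    while len(clusters) > n_clusters:
        pair = min(dist, key=dist.get)  # closest pair of clusters
        i, j = tuple(pair)
        d_ij = dist[pair]
        n_i, n_j = len(clusters[i]), len(clusters[j])
        updated = {}
        for k in clusters:
            if k in (i, j):
                continue
            n_k = len(clusters[k])
            d_ki = dist[frozenset((k, i))]
            d_kj = dist[frozenset((k, j))]
            # Lance-Williams update for Ward linkage.
            updated[k] = (
                (n_i + n_k) * d_ki + (n_j + n_k) * d_kj - n_k * d_ij
            ) / (n_i + n_j + n_k)
        members = clusters.pop(i) + clusters.pop(j)
        dist = {p: d for p, d in dist.items() if i not in p and j not in p}
        clusters[next_id] = members
        for k, d in updated.items():
            dist[frozenset((k, next_id))] = d
        next_id += 1
    return sorted(sorted(m) for m in clusters.values())


if __name__ == "__main__":
    # Hypothetical toy data: two well-separated groups of three samples.
    toy = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    print(ward_d_clusters(toy, 2))
```

In practice, a library routine would be used instead; note that Ward implementations differ in whether they expect squared or unsquared distances, so results should be compared with care across software.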

For isoform quantification using label-free data, the DDA data from the primary and secondary cohorts were searched with MSFragger61 (v3.6), processed with Philosopher62 (v4.8.1), and quantified with IonQuant63 (v1.8.10) using FragPipe (v19.0) with the default LFQ parameters from MSFragger. Analysis was performed against a Homo sapiens UniProt database containing 42502 entries (downloaded on 2024-02-05). The following variable modifications were added: 15.9949 on M, 42.0106 on [^ (protein N-terminus), 79.96633 on STY, 114.0429 on K, and 42.0106 on K, with 71.0371 on C (cysteine) as a fixed modification. Protein-level and peptide-level summaries were generated. For Quant (MS1), no match between runs or normalization was used.

FLEXITau and Label-free Quantification

For FLEXITau analysis, intensities for light- and heavy-labeled peptide forms were used because of different charge and oxidation states. First, light-to-heavy ratios (L/H) were calculated for each peptide form before peptide forms, along with peptides containing missed tryptic cleavages, were collapsed into their mean, forming peptide region ratios. Lastly, for each sample, the top three peptide regions with the highest L/H ratio were selected, and their mean value was used to normalize the L/H ratios across the Tau peptides. An exception to this normalization was made for the separate spike-in peptide unique to the 3R isoform of Tau. Peptides were realigned to the 2N4R isoform for consistent site identification, and the overall results were stored as a data matrix enriched with the absolute abundance of total Tau and the 3R/4R isoform ratio. Total Tau abundance was tested for each group against the control group using a two-sided t-test with FDR-based multiple testing correction and an alpha level of 0.05. The 0N, 1N and 2N isoforms were compared within each pathology for differential abundance using two-sided t-tests with FDR-based multiple testing correction and an alpha level of 0.05. The dominance of 3R and 4R isoforms was assessed using the 3R/4R isoform ratio of the FLEXITau data using a two-sided t-test with FDR-based multiple testing correction and an alpha level of 0.05 against the control group.
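The normalization step can be sketched as follows: compute L/H per peptide region, take the mean of the three highest ratios in the sample, and divide all ratios by it. This is an illustrative sketch with hypothetical names and intensities, not the study code. It assumes that the exempted 3R spike-in peptide is both left unnormalized and excluded when the top three regions are chosen; the text only states that it is an exception.

```python
from statistics import mean


def normalize_lh(light, heavy, exempt=()):
    """Normalize light-to-heavy ratios by the mean of the top three regions.

    light and heavy map peptide region -> intensity for one sample.
    Regions listed in exempt keep their raw L/H ratio (assumption: they
    are also ignored when selecting the top three regions).
    """
    ratios = {region: light[region] / heavy[region] for region in light}
    candidates = [r for region, r in ratios.items() if region not in exempt]
    scale = mean(sorted(candidates, reverse=True)[:3])
    return {
        region: (r if region in exempt else r / scale)
        for region, r in ratios.items()
    }


if __name__ == "__main__":
    # Hypothetical peak areas for four peptide regions and one spike-in.
    light = {"A": 8.0, "B": 6.0, "C": 4.0, "D": 3.0, "3R": 5.0}
    heavy = {"A": 2.0, "B": 2.0, "C": 2.0, "D": 2.0, "3R": 2.0}
    print(normalize_lh(light, heavy, exempt=("3R",)))
```

With these toy values the three highest ratios are 4, 3, and 2, so every non-exempt ratio is divided by 3.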

Additionally, the same FLEXITau peptides and all isoform-specific peptides for 3R and 4R were extracted with label-free intensities from the FragPipe results. Peptide intensities were averaged for each isoform before calculating the 3R/4R ratio. The dominance of the 3R and 4R isoforms was assessed as described for the FLEXITau data.
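The group comparisons above rely on an FDR-based multiple testing correction at an alpha level of 0.05. The text does not name the procedure; the sketch below assumes the Benjamini-Hochberg method, which is a common choice, and uses hypothetical p-values. It adjusts a list of p-values that would come from the two-sided t-tests (the t-tests themselves are not reproduced here).

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical raw p-values, one per disease group versus control.
    raw = [0.01, 0.04, 0.03, 0.20]
    adjusted = benjamini_hochberg(raw)
    significant = [q < 0.05 for q in adjusted]
    print(adjusted, significant)
```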

To identify structures in the data, FLEXITau values for peptide regions were clustered in R using hierarchical clustering. The data were clustered using the Euclidean distance metric and the ‘ward.D’ option as the clustering metric.

Machine Learning

To separate the different disease groups, we employed supervised computational classification using the three molecular feature types of Tau as described above, namely binary presence/absence data for PTMs and proteolytic cleavages, as well as the quantitative modification extent from the FLEXITau measurements. To provide a balanced dataset for the training and testing of the classifiers per disease, each of the three datasets was split into the current ‘self’ tauopathy of interest and the remaining ‘non-self’ category. The ‘non-self’ category was then downsampled to achieve equal numbers in both the ‘self’ and ‘non-self’ categories. Asymptomatic controls (CTR) and symptomatic controls with dementia with Lewy bodies (DLB) were combined into one control (CTR) group. Due to the nature of PTMs and proteolytic cleavages, many redundant and non-informative features that could define a particular tauopathy were expected. Therefore, feature elimination was performed using random forest-based recursive feature elimination (RFE) with five times repeated cross-validation and a maximum feature set of 20, as the average balanced dataset size was 40 observations. Features were ranked by importance based on cross-validation model accuracy using the ‘caret’ package in R. After feature shortlisting, several supervised learning classifiers for each tauopathy were constructed and evaluated by their performance, including neural networks (Nnet), k-nearest neighbor (KNN), learning vector quantization (LVQ), linear discriminant analysis (LDA), support vector machines (SVM), and random forest (RF). Classification models were assessed based on accuracy (the proportion of correctly classified instances among all cases) and sensitivity (the proportion of true positives among all positive cases). Performance was analyzed for each classifier based on the true positive and false positive rates using receiver operating characteristic (ROC) curves.
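The ‘self’ versus ‘non-self’ balancing can be illustrated with a short sketch: merge DLB into the control group, then randomly downsample the ‘non-self’ samples to the size of the ‘self’ group. The function name, the seed handling, and the example labels are hypothetical; the study's own R implementation may differ in how the random subset is drawn.

```python
import random


def balance_self_vs_nonself(labels, target, seed=0):
    """Return sorted sample indices with equal 'self' and 'non-self' counts."""
    # DLB (symptomatic control) is merged into the CTR group, as in the text.
    merged = ["CTR" if lab == "DLB" else lab for lab in labels]
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    self_idx = [i for i, lab in enumerate(merged) if lab == target]
    nonself_idx = [i for i, lab in enumerate(merged) if lab != target]
    # Downsample 'non-self' without replacement to match the 'self' count.
    keep = rng.sample(nonself_idx, k=min(len(self_idx), len(nonself_idx)))
    return sorted(self_idx + keep)


if __name__ == "__main__":
    # Hypothetical diagnosis labels for eight samples.
    labels = ["AD", "AD", "AD", "CBD", "PSP", "DLB", "CTR", "PiD"]
    chosen = balance_self_vs_nonself(labels, target="AD")
    print(chosen, [labels[i] for i in chosen])
```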
The power of each disease classifier was further assessed by the area under the ROC curve (AUC). RF achieved the highest sensitivity and specificity and was therefore selected for further model evaluation. Random feature sets were chosen as “splitter variables” at each node, and each tree was grown until terminal nodes were pure ‘self’ or ‘non-self’ samples. A maximum of 500 ensemble trees were grown for each classifier while raw vote counts were returned to allow combining results from different runs; sample replacement was not allowed. Trees were tested using ‘out of bag’ samples set aside (33.3% per group) for accuracy estimation of the classifiers. The forests were then applied to independent held-back data (20% per group). New disease classifiers were built using RF for a reduced set of 11 FLEXITau features consistent with the FLEXITau features in the secondary cohort for AD, CBD, PiD, PSP, and asymptomatic CTR. The new classifiers were assessed by area under the ROC curve (AUC) on the test split from the reduced first cohort data and were additionally tested on the independent second cohort. Lastly, for low-dimensional visualization of the classification capabilities of the RF models, dimensionality reduction was performed using t-distributed stochastic neighbor embedding (t-SNE). t-SNE was applied to the top 5 most important features separating each tauopathy from the ‘non-self’ groups.
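For readers less familiar with the AUC, it can be computed directly from classifier scores as the probability that a randomly chosen ‘self’ sample receives a higher score than a randomly chosen ‘non-self’ sample, with ties counted as one half. The sketch below uses this rank-based definition with hypothetical scores (for example, the fraction of random forest votes for ‘self’); the study itself used R packages such as pROC and ROCR.

```python
def roc_auc(scores, is_self):
    """Area under the ROC curve from scores and boolean 'self' labels."""
    pos = [s for s, y in zip(scores, is_self) if y]
    neg = [s for s, y in zip(scores, is_self) if not y]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    # Count 'self' vs 'non-self' pairs ranked correctly; ties count as 0.5.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical vote fractions for four held-back samples.
    scores = [0.9, 0.8, 0.4, 0.3]
    is_self = [True, False, True, False]
    print(roc_auc(scores, is_self))
```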

Analyses were performed and figures were created in R (v4.1.0) using RStudio (v1.4.1717) with the packages R.utils (v2.11.0), stringr (v1.4.0), GetoptLong (v1.0.5), reshape2 (v1.4.4), circlize (v0.4.13), ComplexHeatmap (v2.11.1), dendsort (v0.3.4), dendextend (v1.15.2), ggplot2 (v3.3.5), ggpubr (v0.4.0), ggdendro (v0.1.22), ggpmisc (v0.4.5), scales (v1.1.1), and gridExtra (v2.3). Machine learning classifiers were trained and analyzed in R (v4.1.1) with the additional packages caret (v6.0.89), glmnet (v4.1.2), e1071 (v1.7.9), randomForest (v4.6.14), Matrix (v1.3.4), lattice (v0.20.44), mlbench (v2.1.3), Boruta (v7.0.0), pROC (v1.18.0), ROCR (v1.0.11), and Rtsne (v0.15).

Supplementary Material

S1
S2
S3
S4
5

Supplemental information can be found online at https://doi.org/10.1016/j.cell.2025.12.036.

KEY RESOURCES TABLE

REAGENT or RESOURCE SOURCE IDENTIFIER
Biological Samples

Alzheimer’s disease (AD) brain tissue for the first cohort Massachusetts Alzheimer’s Disease Research Center (MADRC) brain bank Patient Demographics details are provided in Table S1
Alzheimer’s disease (AD) brain tissue for the first cohort Human Brain and Spinal Fluid Resource Center (HBSFRC) Patient Demographics details are provided in Table S1
Alzheimer’s disease (AD) brain tissue for the first cohort Harvard Brain Tissue Resource Center (HBTRC) at McLean Hospital Patient Demographics details are provided in Table S1
Alzheimer’s disease (AD) brain tissue for the first cohort Boston University Chronic Traumatic Encephalopathy Center (VA-BU-CLF) through the UNITE brain bank Patient Demographics details are provided in Table S1
Corticobasal degeneration (CBD) brain tissue for the first cohort Massachusetts Alzheimer’s Disease Research Center (MADRC) brain bank Patient Demographics details are provided in Table S1
Corticobasal degeneration (CBD) brain tissue for the first cohort University of Maryland Brain and Tissue Bank (UMD BTB) Patient Demographics details are provided in Table S1
Corticobasal degeneration (CBD) brain tissue for the first cohort Mayo Clinic Brain Bank Patient Demographics details are provided in Table S1
Progressive supranuclear palsy (PSP) brain tissue for the first cohort Massachusetts Alzheimer’s Disease Research Center (MADRC) brain bank Patient Demographics details are provided in Table S1
Progressive supranuclear palsy (PSP) brain tissue for the first cohort Human Brain and Spinal Fluid Resource Center (HBSFRC) Patient Demographics details are provided in Table S1
Progressive supranuclear palsy (PSP) brain tissue for the first cohort Harvard Brain Tissue Resource Center (HBTRC) at McLean Hospital Patient Demographics details are provided in Table S1
Progressive supranuclear palsy (PSP) brain tissue for the first cohort Mayo Clinic Brain Bank Patient Demographics details are provided in Table S1
Pick’s disease (PiD) brain tissue for the first cohort Massachusetts Alzheimer’s Disease Research Center (MADRC) brain bank Patient Demographics details are provided in Table S1
Pick’s disease (PiD) brain tissue for the first cohort Human Brain and Spinal Fluid Resource Center (HBSFRC) Patient Demographics details are provided in Table S1
Pick’s disease (PiD) brain tissue for the first cohort Harvard Brain Tissue Resource Center (HBTRC) at McLean Hospital Patient Demographics details are provided in Table S1
Chronic traumatic encephalopathy (CTE) brain tissue for the first cohort Boston University Chronic Traumatic Encephalopathy Center (VA-BU-CLF) through the UNITE brain bank Patient Demographics details are provided in Table S1
Dementia with Lewy bodies (DLB) brain tissue for the first cohort Massachusetts Alzheimer’s Disease Research Center (MADRC) brain bank Patient Demographics details are provided in Table S1
Healthy control brain tissue for the first cohort Massachusetts Alzheimer’s Disease Research Center (MADRC) brain bank Patient Demographics details are provided in Table S1
Healthy control brain tissue for the first cohort Human Brain and Spinal Fluid Resource Center (HBSFRC) Patient Demographics details are provided in Table S1
Healthy control brain tissue for the first cohort Harvard Brain Tissue Resource Center (HBTRC) at McLean Hospital Patient Demographics details are provided in Table S1
Alzheimer’s disease (AD) brain tissue for the validation cohort Human Brain and Spinal Fluid Resource Center (HBSFRC) Patient Demographics details are provided in Table S1
Alzheimer’s disease (AD) brain tissue for the validation cohort Harvard Brain Tissue Resource Center (HBTRC) at McLean Hospital Patient Demographics details are provided in Table S1
Alzheimer’s disease (AD) brain tissue for the validation cohort Neurodegenerative Disease Brain Bank (NDBB) at University of California, San Francisco Patient Demographics details are provided in Table S1
Corticobasal degeneration (CBD) brain tissue for the validation cohort Harvard Brain Tissue Resource Center (HBTRC) at McLean Hospital Patient Demographics details are provided in Table S1
Corticobasal degeneration (CBD) brain tissue for the validation cohort Neurodegenerative Disease Brain Bank (NDBB) at University of California, San Francisco Patient Demographics details are provided in Table S1
Corticobasal degeneration (CBD) brain tissue for the validation cohort University of Maryland Brain and Tissue Bank (UMD BTB) Patient Demographics details are provided in Table S1
Corticobasal degeneration (CBD) brain tissue for the validation cohort University of Miami Brain Bank Cohort (MBB) Patient Demographics details are provided in Table S1
Progressive supranuclear palsy (PSP) brain tissue for the validation cohort Human Brain and Spinal Fluid Resource Center (HBSFRC) Patient Demographics details are provided in Table S1
Progressive supranuclear palsy (PSP) brain tissue for the validation cohort Harvard Brain Tissue Resource Center (HBTRC) at McLean Hospital Patient Demographics details are provided in Table S1
Progressive supranuclear palsy (PSP) brain tissue for the validation cohort Neurodegenerative Disease Brain Bank (NDBB) at University of California, San Francisco Patient Demographics details are provided in Table S1
Pick’s disease (PiD) brain tissue for the validation cohort Human Brain and Spinal Fluid Resource Center (HBSFRC) Patient Demographics details are provided in Table S1
Pick’s disease (PiD) brain tissue for the validation cohort Harvard Brain Tissue Resource Center (HBTRC) at McLean Hospital Patient Demographics details are provided in Table S1
Pick’s disease (PiD) brain tissue for the validation cohort Neurodegenerative Disease Brain Bank (NDBB) at University of California, San Francisco Patient Demographics details are provided in Table S1
Pick’s disease (PiD) brain tissue for the validation cohort University of Miami Brain Bank Cohort (MBB) Patient Demographics details are provided in Table S1
Healthy control brain tissue for the validation cohort Human Brain and Spinal Fluid Resource Center (HBSFRC) Patient Demographics details are provided in Table S1
Healthy control brain tissue for the validation cohort Harvard Brain Tissue Resource Center (HBTRC) at McLean Hospital Patient Demographics details are provided in Table S1
Healthy control brain tissue for the validation cohort Neurodegenerative Disease Brain Bank (NDBB) at University of California, San Francisco Patient Demographics details are provided in Table S1

Chemicals, peptides and recombinant proteins

Trizma® hydrochloride solution (Tris-HCl) Sigma-Aldrich Cat#T2663-1L
Sodium Chloride (NaCl) Sigma Cat#S7653-1KG
Magnesium Sulfate (MgSO4) Sigma-Aldrich Cat#M7506-500G
Ethylenediamine tetraacetic Acid (EDTA) solution Sigma-Aldrich Cat#03690-100mL
Ethylene Glycol Tetraacetic Acid (EGTA) solution Research Products International Corp. Cat#E14100-50.0
DL-Dithiothreitol (DTT) Sigma-Aldrich Cat#D0632-10G
Nicotinamide Sigma Cat# 72340-100G
Trichostatin A Sigma-Aldrich Cat# T8552-1MG
cOmplete ULTRA tablets, Mini, EDTA-free Roche Cat#05 892 791 001
PhosSTOP Roche Cat#04 906 837 001
Phosphatase Inhibitor Cocktail #2 Sigma-Aldrich Cat#P5726-5ML
Phosphatase Inhibitor Cocktail #3 Sigma-Aldrich Cat#P0044-5ML
Sucrose Sigma Cat#84097-1KG
20% Sarkosyl solution Teknova Cat#S3379
Sodium dodecyl sulfate (SDS) Sigma-Aldrich Cat#L4509-10G
Urea Sigma-Aldrich Cat#U5378-1KG
Acrylamide solution Sigma-Aldrich Cat# 01697-500mL
Ammonium bicarbonate (ABC) Sigma-Aldrich Cat# 09830-500G
Trypsin (sequencing grade modified trypsin) Promega, Madison, WI Cat#V5111
Water Thermo Fisher Scientific Cat#W6-4
Acetonitrile Thermo Fisher Scientific Cat#A955-4
Formic acid (FA) Optima LC/MS Fisher Chemical Cat#A117-50
1000X metals Teknova Cat#T1001
Glucose Sigma Cat#G7021-1KG
Sodium Hydroxide (NaOH) Sigma Cat#S5881-1KG
Sodium phosphate dibasic (Na2HPO4) Sigma Cat#S5136-500G
Potassium phosphate monobasic (KH2PO4) Sigma Cat#60218-500G
Ammonium chloride (NH4Cl) Sigma Cat#09718-250G
Sodium sulfate (Na2SO4) Sigma Cat#6547-500G
L-Glutamic acid Sigma Cat#G8415-100G
L-Aspartic acid Sigma Cat#A7219-100G
L-Histidine Sigma Cat#H6034-25G
L-Alanine Sigma Cat#A7469-25G
L-Proline Sigma Cat#P5607-25G
Glycine Sigma Cat#G8790-100G
L-Threonine Sigma Cat#T8441-25G
L-Serine Sigma Cat#S4311-25G
L-Glutamine Sigma Cat#G8540-25G
L-Asparagine Sigma Cat#A4159-25G
L-Valine Sigma Cat#V0513-25G
L-Leucine Sigma Cat#L8912-25G
L-Isoleucine Sigma Cat#I7403-25G
L-Phenylalanine Sigma Cat#P5482-25G
L-Tryptophan Sigma Cat#T8941-25G
L-Methionine Sigma Cat#M5308-25G
Thiamine hydrochloride Sigma Cat#T1270-25G
Kanamycin Teknova Cat#K800
Heavy L-Lysine (K6) Cambridge Isotope Laboratories, Inc. Cat#CLM-2247-H-1
Heavy L-Arginine (R10) Cambridge Isotope Laboratories, Inc. Cat#CNLM-539-H-1
IPTG Teknova Cat#I3430
PBS pH 7.4 (1X) Gibco Cat#10010-023
Benzonase Nuclease EMD Millipore Corporation Cat#71206-3
Heavy 3R Aqua peptide Thermo Scientific VQJVYKPVDLS(K)
Light Flex Aqua peptide Thermo Scientific SENLYFQGDIS(R)

Critical commercial assays

Pierce BCA Protein Assay Kit Thermo Fisher Scientific Cat#23225

Deposited data

Raw and analyzed data This paper ProteomeXchange identifiers: PRIDE: PXD069495; PRIDE: PXD069437; PRIDE: PXD069249; and PRIDE: PXD069217
Code This paper https://doi.org/10.5281/zenodo.17712260

Software and algorithms

Skyline-daily (version 20.1.9.234) MacCoss Lab Software, 2020 StartPage:/home/software/Skyline
Bruker Data Analysis software v5.3.236 Bruker Bruker software
ProteinPilot Software (v5.0) Sciex ProteinPilot - Next-Level Quantification & Identification
FragPipe (v19.0) Nesvizhskii Lab Software, 2017 https://fragpipe.nesvilab.org/
MSFragger (v3.6) Nesvizhskii Lab Software, 2017 https://msfragger.nesvilab.org/
IonQuant (v1.8.10) Nesvizhskii Lab Software, 2020 https://ionquant.nesvilab.org/
Philosopher (v4.8.1) Nesvizhskii Lab Software, 2020 https://philosopher.nesvilab.org/
R (v4.1.0) and R (v4.1.1) using RStudio (v1.4.1717) CRAN R-project and Posit Software, PBC https://cran.r-project.org/ and https://posit.co/download/rstudio-desktop/
Java (v1.8.0_361) using Eclipse IDE (v4.20.0) Oracle and Eclipse Foundation https://www.oracle.com/java/technologies/downloads/ https://eclipseide.org/

Other

TimsTOF Pro Bruker, Germany N/A
NanoElute Bruker, Germany N/A
Q Exactive mass spectrometer Thermo Fisher Scientific N/A
Nanoflow HPLC pump Eksigent N/A
Qtrap 5500 Sciex N/A

Highlights

  • Comprehensive mapping of tau identifies 145 PTMs and 195 cleavage sites in tauopathies

  • Provides tau molar abundance and peptide modification stoichiometry in disease

  • Machine learning classifies tauopathies using tau molecular features

  • Identified disease-specific features are potential drug targets and diagnostics

ACKNOWLEDGMENTS

We sincerely thank the patients and families who donated tissues to the NIH Brain Banks. Tissues were from the NIH NeuroBioBank; Massachusetts Alzheimer’s Disease Research Center (MADRC), funded by grant P30AG062421; the VA-BU Chronic Traumatic Brain Center, funded by NIA P30-AG072978 and NINDS U54-NS-115266; the Mayo Clinic Brain Bank, funded by Cure PSP and the Rainwater Charitable Foundation; the Human Brain and Spinal Fluid Resource Center at the VA West LA Healthcare Center, funded by NINDS/NIMH; the University of Maryland Brain and Tissue Bank, funded by the NIA and NINDS; the University of Miami Brain Endowment Bank, funded by HHS-NIH-NIDA(MH)-12-265, HHSN-271-2013-00028, and 75N95019C0005, as well as the Francis and Norris McGowan Endowment Fund; the Harvard Brain Tissue Resource Center, funded by HHSN-271-2013-00030C; and the Neurodegenerative Disease Brain Bank at UCSF, which is supported by the Rainwater Charitable Foundation, the Bluefield Project to Cure FTD, and NIH grants AG019724, AG062422, AG063911, and AG057195. This study was funded by the NIA R01 AG071858 (J.A.S.), UG3 NS104095 (J.A.S., D.W.D., and D.G.), the Tau Consortium (TC-Rainwater Foundation) (J.A.S., B.T.H., L.T.G., W.W.S., K.S.K., D.W.D., D.G., B.L.M., A.C.M., and A.L.B.), the ADDF (J.A.S. and H.S.), the Ellison Foundation, and the Cure Alzheimer’s Disease Fund (J.A.S. and B.T.H.). The instruments, software, and infrastructure used for this work were funded by National Institutes of Health grants R01 GM112007, RC4GM096319 (QTRAP 5500 to H.S.), and R01 NS066973 (J.A.S.). RF1AG059789, P30AG062421, and R56AG061196 support B.T.H. European Research Council Grant (eXplAInProt, 101124385) supports C.N.S. and B.Y.R. RF1AG059789, P30AG062421, and R56AG061196 fund A.C.M. and the BU-VA CTE Brain Bank. 
We are grateful to the TC and Rainwater Charitable Foundation; Todd Rainwater; Walter Rainwater; Matthew Rainwater; Amy Rommel, PhD; Jeremy Smith; Beth Taylor; Jordan Brainerd; Saralyn Carrillo; Glenn Harris, PhD; Bradley Boeve, MD; Howard Feldman, MD; Beth Hoffman, PhD; Patrick May, PhD; Eric Nestler, MD, PhD; Maria Grazia Spillantini, PhD; Hui Zheng, PhD; and the TC scientific community for insightful comments and support of the work in this manuscript.

Footnotes

DECLARATION OF INTERESTS

B.T.H. has a family member who works at Novartis and owns stock in Novartis; he serves on the SAB of Dewpoint and owns stock. He serves on a scientific advisory board or is a consultant for AbbVie, Aprinoia Therapeutics, Arvinas, Avrobio, Axial, Biogen, BMS, Cure Alz Fund, Cell Signaling, Eisai, Genentech, Ionis, Novartis, Sangamo, Sanofi, Takeda, the US Department of Justice, Vigil, and Voyager. His laboratory is supported by research grants from the National Institutes of Health, Cure Alzheimer’s Fund, Tau Consortium, and the JPB Foundation—and sponsored research agreements from AbbVie and BMS.

A.L.B. is a co-founder of Neurovanda Therapeutics and has equity in Alector and Arvinas. He serves as a consultant for Alector, Arvinas, Alexion, Arrowhead, BMS, Eli Lilly, Janssen, Merck, Neurocrine, Novartis, Oligomerix, Ono, Oscotec, Switch, and Transposon. He receives research support from Biogen, Eisai, and Regeneron. His laboratory is supported by grants from the NIH, the Alzheimer’s Association, Association for Frontotemporal Degeneration, Bluefield Project, GHR Foundation, Gates Ventures, and Rainwater Charitable Foundation.

REFERENCES

  • 1. Williams DR (2006). Tauopathies: classification and clinical update on neurodegenerative diseases associated with microtubule-associated protein tau. Intern. Med. J 36, 652–660. 10.1111/j.1445-5994.2006.01153.x.
  • 2. Josephs KA (2017). Current Understanding of Neurodegenerative Diseases Associated With the Protein Tau. Mayo Clin. Proc 92, 1291–1303. 10.1016/j.mayocp.2017.04.016.
  • 3. Ballatore C, Lee VMY, and Trojanowski JQ (2007). Tau-mediated neurodegeneration in Alzheimer’s disease and related disorders. Nat. Rev. Neurosci 8, 663–672. 10.1038/nrn2194.
  • 4. Spillantini MG, and Goedert M. (2013). Tau pathology and neurodegeneration. Lancet Neurol. 12, 609–622. 10.1016/S1474-4422(13)70090-5.
  • 5. Kosik KS, Joachim CL, and Selkoe DJ (1986). Microtubule-associated protein tau (tau) is a major antigenic component of paired helical filaments in Alzheimer disease. Proc. Natl. Acad. Sci. USA 83, 4044–4048. 10.1073/pnas.83.11.4044.
  • 6. Wood JG, Mirra SS, Pollock NJ, and Binder LI (1986). Neurofibrillary tangles of Alzheimer disease share antigenic determinants with the axonal microtubule-associated protein tau (tau). Proc. Natl. Acad. Sci. USA 83, 4040–4043. 10.1073/pnas.83.11.4040.
  • 7. Grundke-Iqbal I, Iqbal K, Quinlan M, Tung YC, Zaidi MS, and Wisniewski HM (1986). Microtubule-associated protein tau. A component of Alzheimer paired helical filaments. J. Biol. Chem 261, 6084–6089. 10.1016/S0021-9258(17)38495-8.
  • 8. Bateman RJ, Aisen PS, De Strooper B, Fox NC, Lemere CA, Ringman JM, Salloway S, Sperling RA, Windisch M, and Xiong C. (2011). Autosomal-dominant Alzheimer’s disease: a review and proposal for the prevention of Alzheimer’s disease. Alzheimers Res. Ther 3, 1. 10.1186/alzrt59.
  • 9. Gibb WR, Luthert PJ, and Marsden CD (1989). Corticobasal degeneration. Brain 112, 1171–1192. 10.1093/brain/112.5.1171.
  • 10. Dickson DW, Rademakers R, and Hutton ML (2007). Progressive supranuclear palsy: pathology and genetics. Brain Pathol. 17, 74–82. 10.1111/j.1750-3639.2007.00054.x.
  • 11. Dickson DW (2001). Neuropathology of Pick’s Disease. Neurology 56, S16–S20. 10.1212/wnl.56.suppl_4.s16.
  • 12. McKee AC, Cairns NJ, Dickson DW, Folkerth RD, Keene CD, Litvan I, Perl DP, Stein TD, Vonsattel JP, Stewart W, et al. (2016). The first NINDS/NIBIB consensus meeting to define neuropathological criteria for the diagnosis of chronic traumatic encephalopathy. Acta Neuropathol. 131, 75–86. 10.1007/s00401-015-1515-z.
  • 13. Bieniek KF, Cairns NJ, Crary JF, Dickson DW, Folkerth RD, Keene CD, Litvan I, Perl DP, Stein TD, Vonsattel JP, et al. (2021). The second NINDS/NIBIB consensus meeting to define neuropathological criteria for the diagnosis of chronic traumatic encephalopathy. J. Neuropathol. Exp. Neurol 80, 210–219. 10.1093/jnen/nlab001.
  • 14. Boeve BF, Lang AE, and Litvan I. (2003). Corticobasal degeneration and its relationship to progressive supranuclear palsy and frontotemporal dementia. Ann. Neurol 54, S15–S19. 10.1002/ana.10570.
  • 15. Dickson DW, Ahmed Z, Algom AA, Tsuboi Y, and Josephs KA (2010). Neuropathology of variants of progressive supranuclear palsy. Curr. Opin. Neurol 23, 394–400. 10.1097/WCO.0b013e32833be924.
  • 16. Dickson DW, Kouri N, Murray ME, and Josephs KA (2011). Neuropathology of Frontotemporal Lobar Degeneration-Tau (FTLD-Tau). J. Mol. Neurosci 45, 384–389. 10.1007/s12031-011-9589-0.
  • 17. Murray ME, Graff-Radford NR, Ross OA, Petersen RC, Duara R, and Dickson DW (2011). Neuropathologically defined subtypes of Alzheimer’s disease with distinct clinical characteristics: a retrospective study. Lancet Neurol. 10, 785–796. 10.1016/S1474-4422(11)70156-9.
  • 18. Tsuchiya K, Piao YS, Oda T, Mochizuki A, Arima K, Hasegawa K, Haga C, Kakita A, Hori K, Tominaga I, et al. (2006). Pathological heterogeneity of the precentral gyrus in Pick’s disease: a study of 16 autopsy cases. Acta Neuropathol. 112, 29–42. 10.1007/s00401-005-0028-6.
  • 19. Wakabayashi K, and Takahashi H. (2004). Pathological heterogeneity in progressive supranuclear palsy and corticobasal degeneration. Neuropathology 24, 79–86. 10.1111/j.1440-1789.2003.00543.x.
  • 20. Williams DR, and Lees AJ (2009). Progressive supranuclear palsy: clinicopathological concepts and diagnostic challenges. Lancet Neurol. 8, 270–279. 10.1016/S1474-4422(09)70042-0.
  • 21. Respondek G, Stamelou M, Kurz C, Ferguson LW, Rajput A, Chiu WZ, van Swieten JC, Troakes C, Al Sarraj S, Gelpi E, et al. (2014). The phenotypic spectrum of progressive supranuclear palsy: a retrospective multicenter study of 100 definite cases. Mov. Disord 29, 1758–1766. 10.1002/mds.26054.
  • 22. Murray ME, Kouri N, Lin W-L, Jack CR, Dickson DW, and Vemuri P. (2014). Clinicopathologic assessment and imaging of tauopathies in neurodegenerative dementias. Alzheimers Res. Ther 6, 1. 10.1186/alzrt231.
  • 23. Ling H, de Silva R, Massey LA, Courtney R, Hondhamuni G, Bajaj N, Lowe J, Holton JL, Lees A, and Revesz T. (2014). Characteristics of progressive supranuclear palsy presenting with corticobasal syndrome: a cortical variant. Neuropathol. Appl. Neurobiol 40, 149–163. 10.1111/nan.12037.
  • 24. Fitzpatrick AWP, Falcon B, He S, Murzin AG, Murshudov G, Garringer HJ, Crowther RA, Ghetti B, Goedert M, and Scheres SHW (2017). Cryo-EM structures of tau filaments from Alzheimer’s disease. Nature 547, 185–190. 10.1038/nature23002.
  • 25. Falcon B, Zivanov J, Zhang W, Murzin AG, Garringer HJ, Vidal R, Crowther RA, Newell KL, Ghetti B, Goedert M, et al. (2019). Novel tau filament fold in chronic traumatic encephalopathy encloses hydrophobic molecules. Nature 568, 420–423. 10.1038/s41586-019-1026-5.
  • 26. Zhang W, Tarutani A, Newell KL, Murzin AG, Matsubara T, Falcon B, Vidal R, Garringer HJ, Shi Y, Ikeuchi T, et al. (2020). Novel tau filament fold in corticobasal degeneration. Nature 580, 283–287. 10.1038/s41586-020-2043-0.
  • 27. Falcon B, Zhang W, Murzin AG, Murshudov G, Garringer HJ, Vidal R, Crowther RA, Ghetti B, Scheres SHW, and Goedert M. (2018). Structures of filaments from Pick’s disease reveal a novel tau protein fold. Nature 561, 137–140. 10.1038/s41586-018-0454-y.
  • 28. Shi Y, Zhang W, Yang Y, Murzin AG, Falcon B, Kotecha A, van Beers M, Tarutani A, Kametani F, Garringer HJ, et al. (2021). Structure-based classification of tauopathies. Nature 598, 359–363. 10.1038/s41586-021-03911-7.
  • 29. Smith LM, Agar JN, Chamot-Rooke J, Danis PO, Ge Y, Loo JA, Pasa-Tolic L, Tsybin YO, and Kelleher NL; Consortium for Top-Down Proteomics (2021). The Human Proteoform Project: Defining the human proteome. Sci. Adv 7, eabk0734. 10.1126/sciadv.abk0734.
  • 30. Aebersold R, Agar JN, Amster IJ, Baker MS, Bertozzi CR, Boja ES, Costello CE, Cravatt BF, Fenselau C, Garcia BA, et al. (2018). How many human proteoforms are there? Nat. Chem. Biol 14, 206–214. 10.1038/nchembio.2576.
  • 31. Iqbal K, and Grundke-Iqbal I. (1991). Ubiquitination and abnormal phosphorylation of paired helical filaments in Alzheimer’s disease. Mol. Neurobiol 5, 399–410. 10.1007/BF02935561.
  • 32. Grundke-Iqbal I, Iqbal K, Tung YC, Quinlan M, Wisniewski HM, and Binder LI (1986). Abnormal phosphorylation of the microtubule-associated protein tau (tau) in Alzheimer cytoskeletal pathology. Proc. Natl. Acad. Sci. USA 83, 4913–4917. 10.1073/pnas.83.13.4913.
  • 33. Wesseling H, Mair W, Kumar M, Schlaffner CN, Tang S, Beerepoot P, Fatou B, Guise AJ, Cheng L, Takeda S, et al. (2020). Tau PTM Profiles Identify Patient Heterogeneity and Stages of Alzheimer’s Disease. Cell 183, 1699–1713.e13. 10.1016/j.cell.2020.10.029.
  • 34. Irwin DJ, Cohen TJ, Grossman M, Arnold SE, Xie SX, Lee VMY, and Trojanowski JQ (2012). Acetylated tau, a novel pathological signature in Alzheimer’s disease and other tauopathies. Brain 135, 807–818. 10.1093/brain/aws013.
  • 35. Lövestam S, Wagstaff JL, Katsinelos T, Freund SMV, Goedert M, and Scheres SHW (2025). Twelve phosphomimetic mutations induce the assembly of recombinant full-length human tau into paired helical filaments. eLife 14, RP104778. 10.7554/eLife.104778.3.
  • 36. Carlomagno Y, Manne S, DeTure M, Prudencio M, Zhang YJ, Hanna Al-Shaikh R, Dunmore JA, Daughrity LM, Song Y, Castanedes-Casey M, et al. (2021). The AD tau core spontaneously self-assembles and recruits full-length tau to filaments. Cell Rep. 34, 108843. 10.1016/j.celrep.2021.108843.
  • 37. Arakhamia T, Lee CE, Carlomagno Y, Kumar M, Duong DM, Wesseling H, Kundinger SR, Wang K, Williams D, DeTure M, et al. (2021). Posttranslational Modifications Mediate the Structural Diversity of Tauopathy Strains. Cell 184, 6207–6210. 10.1016/j.cell.2021.11.029.
  • 38. Hu W, Zhang X, Tung YC, Xie S, Liu F, and Iqbal K. (2016). Hyperphosphorylation determines both the spread and the morphology of tau pathology. Alzheimers Dement. 12, 1066–1077. 10.1016/j.jalz.2016.01.014.
  • 39. Dujardin S, Commins C, Lathuiliere A, Beerepoot P, Fernandes AR, Kamath TV, De Los Santos MB, Klickstein N, Corjuc DL, Corjuc BT, et al. (2020). Tau molecular diversity contributes to clinical heterogeneity in Alzheimer’s disease. Nat. Med 26, 1256–1263. 10.1038/s41591-020-0938-9.
  • 40. Mair W, Muntel J, Tepper K, Tang S, Biernat J, Seeley WW, Kosik KS, Mandelkow E, Steen H, and Steen JA (2016). FLEXITau: Quantifying Post-translational Modifications of Tau Protein in Vitro and in Human Disease. Anal. Chem 88, 3704–3714. 10.1021/acs.analchem.5b04509.
  • 41. Kumar M, Quittot N, Dujardin S, Schlaffner CN, Viode A, Wiedmer A, Beerepoot P, Chun JE, Glynn C, Fernandes AR, et al. (2024). Alzheimer proteopathic tau seeds are biochemically a forme fruste of mature paired helical filaments. Brain 147, 637–648. 10.1093/brain/awad378.
  • 42. Greally S, Kumar M, Schlaffner C, van der Heijden H, Lawton ES, Biswas D, Berretta S, Steen H, and Steen JA (2024). Dementia with lewy bodies patients with high tau levels display unique proteome profiles. Mol. Neurodegener 19, 98. 10.1186/s13024-024-00782-0.
  • 43. Goedert M, Spillantini MG, Jakes R, Rutherford D, and Crowther RA (1989). Multiple isoforms of human microtubule-associated protein tau: sequences and localization in neurofibrillary tangles of Alzheimer’s disease. Neuron 3, 519–526. 10.1016/0896-6273(89)90210-9.
  • 44. Rösler TW, Tayaranian Marvian A, Brendel M, Nykänen NP, Höllerhage M, Schwarz SC, Hopfner F, Koeglsperger T, Respondek G, Schweyer K, et al. (2019). Four-repeat tauopathies. Prog. Neurobiol 180, 101644. 10.1016/j.pneurobio.2019.101644.
  • 45. Irwin DJ, Brettschneider J, McMillan CT, Cooper F, Olm C, Arnold SE, Van Deerlin VM, Seeley WW, Miller BL, Lee EB, et al. (2016). Deep clinical and neuropathological phenotyping of Pick disease. Ann. Neurol 79, 272–287. 10.1002/ana.24559.
  • 46. Scheres SHW, Ryskeldi-Falcon B, and Goedert M. (2023). Molecular pathology of neurodegenerative diseases by cryo-EM of amyloids. Nature 621, 701–710. 10.1038/s41586-023-06437-2.
  • 47. Soto C, and Pritzkow S. (2018). Protein misfolding, aggregation, and conformational strains in neurodegenerative diseases. Nat. Neurosci 21, 1332–1340. 10.1038/s41593-018-0235-9.
  • 48. Parra Bravo C, Naguib SA, and Gan L. (2024). Cellular and pathological functions of tau. Nat. Rev. Mol. Cell Biol 25, 845–864. 10.1038/s41580-024-00753-9.
  • 49. Limorenko G, and Lashuel HA (2022). Revisiting the grammar of Tau aggregation and pathology formation: how new insights from brain pathology are shaping how we study and target Tauopathies. Chem. Soc. Rev 51, 513–565. 10.1039/D1CS00127B.
  • 50. Chang CW, Shao E, and Mucke L. (2021). Tau: Enabler of diverse brain disorders and target of rapidly evolving therapeutic strategies. Science 371, eabb8255. 10.1126/science.abb8255.
  • 51. Guo T, Steen JA, and Mann M. (2025). Mass-spectrometry-based proteomics: from single cells to clinical applications. Nature 638, 901–911. 10.1038/s41586-025-08584-0.
  • 52. Rowe JB, Holland N, and Rittman T. (2021). Progressive supranuclear palsy: diagnosis and management. Pract. Neurol 21, 376–383. 10.1136/practneurol-2020-002794.
  • 53. Zwang TJ, Sastre ED, Wolf N, Ruiz-Uribe N, Woost B, Hoglund Z, Fan Z, Bailey J, Nfor L, Buée L, et al. (2024). Neurofibrillary tangle-bearing neurons have reduced risk of cell death in mice with Alzheimer’s pathology. Cell Rep. 43, 114574. 10.1016/j.celrep.2024.114574.
  • 54. Cummings JL, Teunissen CE, Fiske BK, Le Ber I, Wildsmith KR, Schöll M, Dunn B, and Scheltens P. (2025). Biomarker-guided decision making in clinical drug development for neurodegenerative disorders. Nat. Rev. Drug Discov 24, 589–609. 10.1038/s41573-025-01165-w.
  • 55. Leuzy A, Janelidze S, Mattsson-Carlgren N, Palmqvist S, Jacobs D, Cicognola C, Stomrud E, Vanmechelen E, Dage JL, and Hansson O. (2021). Comparing the Clinical Utility and Diagnostic Performance of CSF P-Tau181, P-Tau217, and P-Tau231 Assays. Neurology 97, e1681–e1694. 10.1212/WNL.0000000000012727.
  • 56. Lai R, Li B, and Bishnoi R. (2024). P-tau217 as a Reliable Blood-Based Marker of Alzheimer’s Disease. Biomedicines 12, 1836. 10.3390/biomedicines12081836.
  • 57. Perez-Riverol Y, Bandla C, Kundu DJ, Kamatchinathan S, Bai J, Hewapathirana S, John NS, Prakash A, Walzer M, Wang S, et al. (2025). The PRIDE database at 20 years: 2025 update. Nucleic Acids Res. 53, D543–D553. 10.1093/nar/gkae1011.
  • 58. Wiśniewski JR (2018). Filter-Aided Sample Preparation for Proteome Analysis. Methods Mol. Biol 1841, 3–10. 10.1007/978-1-4939-8695-8_1.
  • 59. Kumar M, Joseph SR, Augsburg M, Bogdanova A, Drechsel D, Vastenhouw NL, Buchholz F, Gentzel M, and Shevchenko A. (2018). MS Western, a Method of Multiplexed Absolute Protein Quantification is a Practical Alternative to Western Blotting. Mol. Cell. Proteomics 17, 384–396. 10.1074/mcp.O117.067082.
  • 60. MacCoss MJ, and Tabb DL (2020). Skyline-Daily. Version 20.1.9.234 (MacCoss Lab Software).
  • 61. Kong AT, Leprevost FV, Avtonomov DM, Mellacheruvu D, and Nesvizhskii AI (2017). MSFragger: ultrafast and comprehensive peptide identification in mass spectrometry-based proteomics. Nat. Methods 14, 513–520. 10.1038/nmeth.4256.
  • 62. da Veiga Leprevost F, Haynes SE, Avtonomov DM, Chang HY, Shanmugam AK, Mellacheruvu D, Kong AT, and Nesvizhskii AI (2020). Philosopher: a versatile toolkit for shotgun proteomics data analysis. Nat. Methods 17, 869–870. 10.1038/s41592-020-0912-y.
  • 63. Yu F, Haynes SE, and Nesvizhskii AI (2021). IonQuant Enables Accurate and Sensitive Label-Free Quantification With FDR-Controlled Match-Between-Runs. Mol. Cell. Proteomics 20, 100077. 10.1016/j.mcpro.2021.100077.

Data Availability Statement

  • Data: the MS proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE57 partner repository with the dataset identifiers PRIDE: PXD069495; PRIDE: PXD069437; PRIDE: PXD069249; and PRIDE: PXD069217 and are publicly available as of the date of publication.

  • Code: all code and intermediary data files for the processing and analysis of the MS data and for data visualization are available via Zenodo at https://doi.org/10.5281/zenodo.17712260 and are publicly available as of the date of publication.

  • Any additional information required to reanalyze data reported in this paper is available from the lead contact upon request.
