An aging population is a double-edged sword. On one hand, advancements in biotechnology and healthcare allow more people to enjoy longer lives. On the other hand, growth of the aging population is accompanied by a surge in age-associated diseases, particularly neurodegenerative disorders. Since aging is the primary risk factor for many neurodegenerative disorders, living longer does not necessarily equate to maintaining a reasonable quality of life (Feigin et al., 2020).
The two most common neurodegenerative disorders are Alzheimer’s disease and Parkinson’s disease. Alzheimer’s disease is a multifactorial dementia disorder, primarily characterized by abnormal accumulation of extracellular amyloid β plaques and intracellular tau neurofibrillary tangles (Kumar et al., 2024). Parkinson’s disease is likewise a multifactorial movement disorder, primarily characterized by abnormal accumulation of α-synuclein (α-syn)-dominant aggregates in neuronal cells, termed Lewy bodies and Lewy neurites (Poewe et al., 2017).
Despite the urgent need, there are currently no disease-modifying therapeutics that can effectively halt or reverse the progression of almost any neurodegenerative disorder. Many clinical trials aimed at finding effective treatments have failed. One of the main reasons for these failures is the heterogeneity of symptoms among these disorders. For example, although Alzheimer’s disease is believed to be the most prevalent dementia disorder, others such as vascular cognitive decline and impairment, frontotemporal dementia, dementia with Lewy bodies, and Parkinson’s disease dementia share a similar phenotypic presentation despite having distinct, though partly overlapping, pathophysiology (Kumar et al., 2024). Similarly, Parkinson’s disease shares its cardinal clinical symptoms (bradykinesia, rigidity, and tremor) with other parkinsonian synucleinopathies such as dementia with Lewy bodies and multiple system atrophy, and with tauopathies such as corticobasal degeneration and progressive supranuclear palsy, yet these disorders differ in their underlying pathophysiology (Taha and Bogoniewski, 2024).
Definitive diagnosis of these neurodegenerative disorders can only be achieved through a neuropathological postmortem examination. This is largely due to the lack of objective, precise, reliable, and accurate antemortem biomarkers to achieve an appropriate differential diagnosis that represents the underlying pathophysiology (Taha and Bogoniewski, 2024). Therefore, persons are often misdiagnosed and mistakenly enrolled in the wrong clinical trials. This is also exacerbated by the heterogeneity within and among these disorders, where each specific case may present a unique spectrum of symptoms and pathophysiology (e.g., mixed disorders), making it challenging to develop targeted, effective disease-modifying therapeutics (Poewe et al., 2017; Kumar et al., 2024).
To accurately diagnose neurodegenerative disorders, it is important to develop objective, accurate, and reliable biomarkers. One popular approach is through immunoassays that can detect specific analytes in blood or cerebrospinal fluid (CSF). However, the development of these immunoassays is frequently hampered by the use of flawed and substantially inaccurate methodologies. This issue is compounded by the high costs and complexities involved, leading many researchers to either forgo development or employ erroneous techniques that compromise the validity of the assays in terms of accuracy and reliability (Dutta et al., 2023).
Therefore, this perspective (Figure 1) aims to illustrate simple principles for developing an immunoassay (Ortiz-Ross et al., 2024; Taha, 2024) while pinpointing common mistakes, as illustrated by the failed development of a phosphorylated α-syn at Serine 129 (pS129-α-syn) electrochemiluminescence ELISA (Dutta et al., 2023). Although the guidelines provided here are not exhaustive, they offer a starting point for those interested in developing an immunoassay for detecting biomarkers or looking to refine their experimental methodologies for doing so.
Figure 1. Schematic summary of the perspective’s main points.
(A) Advances in biotechnology and healthcare lead to increased aging, but also increase the incidence and prevalence of persons with neurodegenerative disorders. (B) Technical difficulties in accurately differentially diagnosing various neurodegenerative disorders illustrate the need for objective biomarkers measured using appropriately developed immunoassays. Created with BioRender.com. Aβ: Amyloid-β.
Metrics: The following metrics should be taken into consideration when developing an immunoassay.
Robustness: Measures the capacity of an immunoassay to remain unaffected by small, deliberate variations in method parameters and indicates its reliability under a variety of conditions (Lee et al., 2006). Typically, robustness is evaluated by first identifying a set of parameters to test, such as temperature (e.g., room temperature vs. controlled incubator temperature), reagent quality or source (different batches or supplies), user (different individuals performing the assay), equipment (different instruments or variations in calibration), timing (variation in incubation time or processing speed), and sample handling (differences in sample storage and preparation). A reliable immunoassay should not be affected by these factors. Unfortunately, in the failed pS129-α-syn immunoassay (Dutta et al., 2023), the authors did not design any experiments to assess robustness, suggesting that the failed pS129-α-syn immunoassay is unreliable.
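One way to plan a robustness evaluation of this kind is to enumerate the combinations of parameters to be tested with the same control sample. The sketch below is illustrative only: the factor names and levels are hypothetical placeholders, not conditions taken from any of the cited studies, and a full-factorial layout is just one possible design.

```python
from itertools import product

# Hypothetical factors and levels; replace with the parameters of your own assay.
FACTORS = {
    "temperature": ["room_temperature", "incubator"],
    "reagent_lot": ["lot_A", "lot_B"],
    "operator": ["operator_1", "operator_2"],
    "incubation_min": [55, 60, 65],
}

def full_factorial(factors):
    """Return every combination of factor levels as a list of dicts."""
    names = list(factors)
    return [dict(zip(names, levels)) for levels in product(*factors.values())]

conditions = full_factorial(FACTORS)
print(len(conditions), "conditions to run with the same control sample")
```

Each condition would then be run with replicates of the same sample, and the resulting concentrations compared across conditions.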
Variability: Intra- and inter-assay coefficients of variation (CV) are important metrics that indicate the degree of variability obtained when the same sample is measured multiple times within the same assay run (intra-assay CV) or in different runs of the assay (inter-assay CV) under unchanged conditions, thereby reflecting the precision of the assay. In the unsuccessful pS129-α-syn immunoassay, the authors not only ran the samples on the plate under inconsistent robustness parameters, but also failed to demonstrate that these parameters have no effect (see above). This means that the authors did not provide a bona-fide estimate of the CV, as the differences seen could easily be attributed to the inconsistent robustness parameters rather than to the antibody comparisons.
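The two CV metrics can be computed as in the minimal sketch below. The function names and the example concentrations are hypothetical; the intra-assay CV is taken here as the mean of per-sample replicate CVs, which is one common convention rather than the only one.

```python
from statistics import mean, stdev

def cv_percent(values):
    """Coefficient of variation in percent: sample SD / mean * 100."""
    return stdev(values) / mean(values) * 100.0

def intra_assay_cv(replicates_by_sample):
    """Average the per-sample CVs of replicates measured within one run."""
    return mean([cv_percent(reps) for reps in replicates_by_sample])

def inter_assay_cv(run_means):
    """CV of the same sample's mean concentration across separate runs."""
    return cv_percent(run_means)

# Hypothetical concentrations in pg/mL.
within_run = [[98.0, 100.0, 102.0], [490.0, 500.0, 510.0]]
across_runs = [95.0, 100.0, 105.0]
print(f"intra-assay CV: {intra_assay_cv(within_run):.1f}%")
print(f"inter-assay CV: {inter_assay_cv(across_runs):.1f}%")
```

These values are only meaningful if every replicate and run was performed under the same conditions.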
Trueness: Indicates how close measurements are to the true value of the quantity being measured. Surprisingly, in the failed pS129-α-syn assay (Dutta et al., 2023), the authors indicate that the upper limit of quantification for the best antibody combination is 66,167 pg/mL, yet claim to have used a standard curve from 100 to 100,000 pg/mL in some of the figures. At times, the authors used these values interchangeably. Regardless, the trueness of their measurements varied greatly, indicating that the quantified analyte levels are likely not representative of the bona-fide levels in the sample. Therefore, it is essential that researchers developing an immunoassay know and confirm exactly which standard curve is being generated and how much calibrator is added to it. In addition, the authors should have avoided discussing absolute concentrations of pS129-α-syn in their discussion, given the failure of their in-house immunoassay. Absolute concentrations should be addressed only once there is a consensus on the choice of calibrator material and a demonstrated lack of matrix effects (more on this below).
Uncertainty: Quantifies the range of values within which the true value of the measurement is expected to lie. It can be estimated by identifying potential sources such as instrumental precision and operator variability, estimating their impact through statistical analysis of repeated measurements, and combining these estimates using the root sum square method to give a combined uncertainty figure. This combined figure is then often multiplied by a coverage factor (typically 2, corresponding to approximately 95% confidence) to provide an interval that is reported alongside results to indicate their reliability (Farrance and Frenkel, 2012). This was unfortunately neglected by the authors in two separate studies utilizing the failed pS129-α-syn immunoassay (Dutta et al., 2023; Taha et al., 2023).
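The root sum square combination described above can be sketched as follows. This is a minimal illustration that assumes the components are independent standard uncertainties expressed in the same units; the component names and values are hypothetical.

```python
import math

def combined_uncertainty(components):
    """Root sum of squares of independent standard uncertainties."""
    return math.sqrt(sum(u ** 2 for u in components))

def expanded_uncertainty(components, coverage_factor=2.0):
    """Combined uncertainty multiplied by a coverage factor (typically 2)."""
    return coverage_factor * combined_uncertainty(components)

# Hypothetical standard uncertainties, all in pg/mL.
sources = {"instrument": 3.0, "operator": 4.0}
u_combined = combined_uncertainty(sources.values())
u_expanded = expanded_uncertainty(sources.values())
print(f"combined: {u_combined:.1f} pg/mL, expanded (k = 2): {u_expanded:.1f} pg/mL")
```

A result would then be reported as the measured value plus or minus the expanded uncertainty, together with the factor used.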
Matrix effects: Refers to the influence of other substances within a sample on the measurement of the analyte of interest, which can lead to higher or lower estimated values.
To exclude a matrix effect, three experiments are typically necessary: dilution linearity, parallelism, and spike recovery. Acceptable recovery in each experiment should be within ±20% of the expected value (i.e., 80–120%). The experimental guidelines described below are in agreement with the Clinical and Laboratory Standards Institute and the Food and Drug Administration guidelines, which the authors of the failed pS129-α-syn immunoassay (Dutta et al., 2023) did not take into consideration.
Dilution linearity is assessed by spiking the sample of interest above the upper limit of quantification and then diluting it to concentrations within the working range of the immunoassay. Typically, dilutions of 1/2, 1/4, 1/8, and 1/16 are performed. Unfortunately, in the failed pS129-α-syn immunoassay (Dutta et al., 2023), the authors claim to have performed dilution linearity, yet never spiked the sample above the upper limit of quantification. Instead, the authors serially diluted unspiked samples, which does not constitute a dilution linearity experiment.
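A dilution linearity calculation along these lines could look like the sketch below. It assumes that `spiked_conc` is the nominal concentration of the spiked, undiluted sample and that recovery is judged against a ±20% tolerance; the function name, the 1 to 1000 pg/mL working range, and the readings are hypothetical.

```python
def dilution_linearity(spiked_conc, measured_by_factor, lloq, uloq, tolerance=20.0):
    """Recovery (%) of dilution-corrected readings against the spiked concentration.

    measured_by_factor maps a dilution factor (2 for 1/2, 4 for 1/4, ...)
    to the concentration read for the diluted sample.
    Returns {factor: (recovery_percent, within_tolerance)}.
    """
    if spiked_conc <= uloq:
        raise ValueError("sample must be spiked above the upper limit of quantification")
    results = {}
    for factor, measured in measured_by_factor.items():
        if not lloq <= measured <= uloq:
            continue  # reading outside the working range cannot be evaluated
        recovery = measured * factor / spiked_conc * 100.0
        results[factor] = (recovery, abs(recovery - 100.0) <= tolerance)
    return results

# Hypothetical assay with a working range of 1-1000 pg/mL.
readings = {2: 2100.0, 4: 980.0, 8: 520.0, 16: 190.0}
linearity = dilution_linearity(4000.0, readings, lloq=1.0, uloq=1000.0)
for factor, (recovery, ok) in linearity.items():
    print(f"1/{factor}: {recovery:.0f}% {'pass' if ok else 'fail'}")
```

The explicit check on `spiked_conc` reflects the requirement that the starting sample sits above the upper limit of quantification before dilution.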
Parallelism is assessed by testing whether the dose-response curve of an analyte in a sample matrix parallels the curve obtained using a standard calibrator. To perform such an experiment, the user should start with a sample matrix of interest and dilute it 1/2, 1/4, 1/8, and 1/16. After diluting the samples, relative recovery compared to the standard calibrator of recombinant protein is calculated. An important note is to make sure that the diluted sample matrix does not fall below the lower limit of quantification of the immunoassay. Parallelism was not assessed in the failed pS129-α-syn immunoassay (Dutta et al., 2023). Instead of assessing relative accuracy against the standard calibrator, the authors performed a sham experiment, one that lacked scientific basis and met the criteria for neither dilution linearity nor parallelism, and proceeded to graph the recovery percentages, all of which fell outside the acceptable ±20% range. This suggests that their immunoassay is not compatible with any biofluid, including serum, plasma, CSF, and saliva.
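The parallelism check can be sketched as below. This is one possible formulation under stated assumptions: each reading is taken to be a concentration interpolated from the recombinant calibrator curve, readings below the lower limit of quantification are discarded, and the least-diluted usable reading serves as the reference. The function name and numbers are hypothetical.

```python
def parallelism(measured_by_factor, lloq, tolerance=20.0):
    """Relative recovery (%) of dilution-corrected concentrations of an unspiked sample.

    measured_by_factor maps a dilution factor (2 for 1/2, 4 for 1/4, ...)
    to the concentration interpolated from the calibrator curve.
    Returns {factor: (relative_recovery_percent, within_tolerance)}.
    """
    usable = {f: m for f, m in measured_by_factor.items() if m >= lloq}
    if len(usable) < 2:
        raise ValueError("need at least two dilutions at or above the LLOQ")
    corrected = {f: m * f for f, m in usable.items()}
    reference = corrected[min(corrected)]  # least-diluted usable sample
    results = {}
    for factor in sorted(corrected):
        relative = corrected[factor] / reference * 100.0
        results[factor] = (relative, abs(relative - 100.0) <= tolerance)
    return results

# Hypothetical readings in pg/mL; the 1/16 dilution falls below the LLOQ.
sample = {2: 40.0, 4: 21.0, 8: 9.0, 16: 0.8}
parallel = parallelism(sample, lloq=1.0)
for factor, (relative, ok) in parallel.items():
    print(f"1/{factor}: {relative:.0f}% {'pass' if ok else 'fail'}")
```

If the sample curve parallels the calibrator curve, the dilution-corrected concentrations agree with one another within the tolerance.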
Spike recovery is assessed by adding a known amount of pure protein (e.g., a recombinant calibrator), chosen based on the dynamic range of the immunoassay, to the sample of interest. For example, if the dynamic range is between 1 pg/mL and 1000 pg/mL, one could choose 100 pg/mL as the low, 250 pg/mL as the low-mid, 500 pg/mL as the mid, 750 pg/mL as the mid-high, and 900 pg/mL as the high spike. In their failed pS129-α-syn immunoassay, the authors claim the dynamic range is between 14 and 66,167 pg/mL, and sometimes up to 100,000 pg/mL, yet spiked their samples with 250, 500, and 1000 pg/mL. The authors went on to falsely claim that these are low, mid, and high spikes. Unfortunately, these values are all extremely low, fall within the last 3 points of an 11-point standard curve, and do not reflect the wrongfully claimed dynamic range of the immunoassay. It is unclear why such low values were chosen, as doing so runs counter to basic scientific principles for developing an immunoassay.
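Spike recovery is commonly computed as the difference between the spiked and unspiked readings divided by the amount added. The sketch below uses the example spike levels for a 1 to 1000 pg/mL dynamic range given above; the function name, the endogenous level, and the readings are hypothetical.

```python
def spike_recovery(unspiked, spiked, spike_added, tolerance=20.0):
    """Recovery (%) = (spiked reading - unspiked reading) / spike added * 100."""
    recovery = (spiked - unspiked) / spike_added * 100.0
    return recovery, abs(recovery - 100.0) <= tolerance

# Hypothetical readings in pg/mL for an assay with a 1-1000 pg/mL dynamic range.
endogenous = 50.0
spiked_readings = {100.0: 145.0, 250.0: 310.0, 500.0: 520.0, 750.0: 640.0, 900.0: 930.0}

recoveries = {
    added: spike_recovery(endogenous, reading, added)
    for added, reading in spiked_readings.items()
}
for added, (recovery, ok) in recoveries.items():
    print(f"spike {added:.0f} pg/mL: {recovery:.1f}% {'pass' if ok else 'fail'}")
```

The spike levels should be chosen so that the endogenous level plus the spike still falls within the dynamic range.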
In both the falsely claimed dilution linearity and spike recovery experiments, the authors’ recovery percentages were well outside the acceptable ±20% range in all the matrices assessed (CSF, saliva, plasma, and serum), suggesting that their immunoassay suffers from substantial matrix effects. Ultimately, this suggests that their immunoassay is not suitable for quantifying bona-fide pS129-α-syn levels in biological fluids. A prime illustration is the fact that the authors report high levels of pS129-α-syn in the CSF. However, data from Lashuel’s group suggest that pS129-α-syn is not quantifiable in the CSF (Cariulo et al., 2019; Lashuel et al., 2022). For instance, Cariulo et al. (2019) used a more sensitive platform based on the Singulex Erenna, with a lower limit of quantification of 4.19 pg/mL, roughly 3.4 times lower than that of the failed pS129-α-syn ECLIA (14.4 pg/mL). Interestingly, although Cariulo et al. (2019) used a much more sensitive immunoassay, they did not detect any pS129-α-syn in the CSF. Similarly, Lashuel et al. (2022) used a highly sensitive mass spectrometer and did not detect pS129-α-syn in the CSF. The fact that two more sensitive techniques did not detect any pS129-α-syn in CSF, while the authors of the failed pS129-α-syn immunoassay did, raises alarming questions about their immunoassay and methodology.
Stability: Refers to the consistency of the analyte measurements when samples are subjected to different conditions over time. Proper evaluation of stability is crucial to ensure that both the standard calibrator and the sample maintain their integrity and performance over time and under various handling scenarios. Typically, this involves side-by-side comparisons of freshly prepared samples or standard calibrators versus those subjected to controlled freeze/thaw cycles or stored for different lengths of time (e.g., t0, t1, and t2). In their failed pS129-α-syn immunoassay, the authors claim to have tested the standard calibrator’s stability over time, even though they never tested the frozen protein at different time points or after different numbers of freeze/thaw cycles within the same plate run. Therefore, the validity of the stability assessment remains questionable.
Cross-reactivity: The synuclein family includes two other members (β-synuclein and γ-synuclein), and α-synuclein is subject to numerous post-translational modifications besides phosphorylation at Serine 129, as well as C-terminal truncations. Despite this, the authors did not evaluate cross-reactivity against any of these species. In fact, the authors used the EP1536Y antibody to capture pS129-α-synuclein; however, Lashuel et al. (2022) demonstrated that this antibody is not reliable for detecting pathology-specific forms of pS129-α-syn. This is due to the antibody’s susceptibility to cross-react with other proteins, its failure to recognize C-terminally truncated forms of α-syn that co-occur with pS129-α-syn, its reduced signal for pS129/nY125/nY133/nY136-α-syn fibrils, and its inability to detect pS129-α-syn in Western blotting. Therefore, their immunoassay does not represent bona-fide α-syn-associated changes in persons suffering from a synucleinopathy and has no clinical, translational, or basic utility.
Conclusion: The goal of this perspective is to briefly describe metrics that researchers should take into account when developing an immunoassay. The study by Dutta et al. (2023) on the pS129-α-syn immunoassay illustrates critical failings in adhering to these principles. The assay exhibited significant deficiencies in robustness, trueness, and matrix effects, leading to questionable reliability and validity of the results. This serves as a cautionary tale, emphasizing the necessity for rigorous validation and adherence to established immunoassay metrics to ensure the accuracy and utility of biomarker measurements in neurodegenerative diseases. It also cautions readers against accepting everything reported in a paper, as on many occasions the conclusions do not follow from the results shown.
Additional file: Open peer review report 1.
Footnotes
Open peer reviewer: Sherif Bayoumy, Amsterdam UMC Locatie VUmc, Netherlands.
P-Reviewer: Bayoumy S; C-Editors: Zhao M, Sun Y, Qiu Y; T-Editor: Jia Y
References
- Cariulo C, Martufi P, Verani M, Azzollini L, Bruni G, Weiss A, Deguire SM, Lashuel HA, Scaricamazza E, Sancesario GM, Schirinzi T, Mercuri NB, Sancesario G, Caricasole A, Petricca L. Phospho-S129 alpha-synuclein is present in human plasma but not in cerebrospinal fluid as determined by an ultrasensitive immunoassay. Front Neurosci. 2019;13:889. doi: 10.3389/fnins.2019.00889.
- Dutta S, Hornung S, Taha HB, Biggs K, Siddique I, Chamoun LM, Shahpasand-Kroner H, Lantz C, Herrera-Vaquero M, Stefanova N, Loo JA, Bitan G. Development of a novel electrochemiluminescence ELISA for quantification of α-synuclein phosphorylated at Ser(129) in biological samples. ACS Chem Neurosci. 2023;14:1238–1248. doi: 10.1021/acschemneuro.2c00676.
- Farrance I, Frenkel R. Uncertainty of measurement: a review of the rules for calculating uncertainty components through functional relationships. Clin Biochem Rev. 2012;33:49–75.
- Feigin VL, Vos T, Nichols E, Owolabi MO, Carroll WM, Dichgans M, Deuschl G, Parmar P, Brainin M, Murray C. The global burden of neurological disorders: translating evidence into policy. Lancet Neurol. 2020;19:255–265. doi: 10.1016/S1474-4422(19)30411-9.
- Kumar A, Sidhu J, Goyal A, Tsao JW. StatPearls. Treasure Island (FL): StatPearls Publishing; 2024. Alzheimer disease.
- Lashuel HA, Mahul-Mellier AL, Novello S, Hegde RN, Jasiqi Y, Altay MF, Donzelli S, DeGuire SM, Burai R, Magalhaes P, Chiki A, Ricci J, Boussouf M, Sadek A, Stoops E, Iseli C, Guex N. Revisiting the specificity and ability of phospho-S129 antibodies to capture alpha-synuclein biochemical and pathological diversity. NPJ Parkinsons Dis. 2022;8:136. doi: 10.1038/s41531-022-00388-7.
- Lee JW, Devanarayan V, Barrett YC, Weiner R, Allinson J, Fountain S, Keller S, Weinryb I, Green M, Duan L, Rogers JA, Millham R, O’Brien PJ, Sailstad J, Khan M, Ray C, Wagner JA. Fit-for-purpose method development and validation for successful biomarker measurement. Pharm Res. 2006;23:312–328. doi: 10.1007/s11095-005-9045-3.
- Ortiz-Ross X, Taha HB, Press E, Rhone S, Blumstein DT. Validating an immunoassay to measure fecal glucocorticoid metabolites in yellow-bellied marmots. bioRxiv [Preprint]. 2024. doi: 10.1016/j.cbpa.2024.111738; doi: 10.1101/2024.05.20.595012.
- Poewe W, Seppi K, Tanner CM, Halliday GM, Brundin P, Volkmann J, Schrag AE, Lang AE. Parkinson disease. Nat Rev Dis Primers. 2017;3:17013. doi: 10.1038/nrdp.2017.13.
- Taha HB, Hornung S, Dutta S, Fenwick L, Lahgui O, Howe K, Elabed N, Del Rosario I, Wong DY, Duarte Folle A, Markovic D, Palma JA, Kang UJ, Alcalay RN, Sklerov M, Kaufmann H, Fogel BL, Bronstein JM, Ritz B, Bitan G. Toward a biomarker panel measured in CNS-originating extracellular vesicles for improved differential diagnosis of Parkinson’s disease and multiple system atrophy. Transl Neurodegener. 2023;12:14. doi: 10.1186/s40035-023-00346-0.
- Taha HB. Early detection of subjective cognitive decline and Alzheimer’s disease: analytical validation of a newly developed pT217-tau assay. Alzheimers Dement. 2024;20:3112–3113. doi: 10.1002/alz.13707.
- Taha HB, Bogoniewski A. Analysis of biomarkers in speculative CNS-enriched extracellular vesicles for parkinsonian disorders: a comprehensive systematic review and diagnostic meta-analysis. J Neurol. 2024;271:1680–1706. doi: 10.1007/s00415-023-12093-3.
