Abstract
Single-molecule detection of post-translational modifications (PTMs) such as phosphorylation plays a crucial role in the early diagnosis of diseases and in therapeutics development. Although single-molecule surface-enhanced Raman spectroscopy (SM-SERS) detection of PTMs has been demonstrated, data analysis and detection accuracy have been hindered by interference from citrate signals and by the lack of reference databases. Previous reports required complete coverage of the nanoparticle surface by analyte molecules to replace citrates, which hampered the detection limit. Here, we developed a high-accuracy SM-SERS approach that combines a plasmonic particle-in-pore sensor to collect SM-SERS spectra of phosphorylation at serine and tyrosine, k-means-based clustering for citrate signal removal, and a one-dimensional convolutional neural network (1D-CNN) for phosphorylation identification. Significantly, we collected SM-SERS data with submonolayer analyte coverage of the particle surface and discriminated the phosphorylation in serine and tyrosine with over 95% and 97% accuracy, respectively. Finally, the 1D-CNN features were interpreted by a one-dimensional gradient feature weight and SM-SERS peak occurrence frequencies.


Post-translational modifications (PTMs) play essential roles in protein signaling, function, localization, and other important biological processes. Most PTMs attach small chemical groups, such as phosphoryl (153.32 Da), to amino acids of proteins, which can ultimately cause serious health consequences. Phosphorylation is a ubiquitous PTM in which a phosphate group is covalently attached to a specific amino acid residue, which has a significant impact on vibrational modes. Phosphorylation events of serine (Ser) and tyrosine (Tyr) residues modulate critical signaling pathways in eukaryotic cells; studying them can provide insight into the mechanisms underlying cancer development and reveal potential therapeutic targets for developing new drugs or therapies for prostate cancers. Phosphorylation at serine residues has been implicated in several physiological and pathological processes, including insulin signaling, tumor progression, inhibition of pathological calcification, and modulation of inflammatory and oncogenic pathways in metabolic disorders such as diabetes, chronic kidney disease, and cancer. On the other hand, phosphorylation at tyrosine plays a critical role in the pathogenesis of Parkinson’s disease (PD).
Mass spectrometry is currently the main technology for phosphorylation analysis, but its sensitivity is limited by the requirement of 10^6–10^8 copies of molecules. On the other hand, fluorescent sensors offer single-molecule sensitivity but often rely on antibodies or fluorescent tags, which can limit their specificity and applicability. Surface-enhanced Raman scattering (SERS), which is highly sensitive to molecular vibrations, has emerged as a powerful analytical tool for detecting protein PTMs at the single-molecule level. However, conventional SERS substrates based on nanoparticles suffer from the strong signals of aromatic amino acids, which can overwhelm the signals of PTMs at non-aromatic amino acids. In contrast, the plasmonic particle-in-pore sensor (Figure 1a) traps a gold nanoparticle in a gold nanopore to generate a single, ultrasmall gap-mode plasmonic hot spot able to detect both aromatic and non-aromatic amino acids in single peptides. The high spatial and spectral resolution of the particle-in-pore sensor makes it promising for detecting small PTMs, such as proline hydroxylation, in amino acids, peptides, and even at the protein level.
Figure 1. Schematic of the deep learning-assisted SERS method for single-PTM detection. (a) Schematic of the plasmonic particle-in-pore sensor with a hot spot that excites the molecule or part of the molecule. (b) Molecular structures of Ser, pSer, Tyr, and pTyr. (c) Heatmap plot of the SM-SERS spectra time series. (d) 1D-CNN deep learning model for phosphorylation identification.
Nevertheless, single-molecule SERS (SM-SERS) analysis of phosphorylation is challenging for the traditional analysis method, which identifies phosphorylation from the shift and intensity change of SERS peaks with reference to a SERS database. A database built from multimolecule data becomes unreliable for analyzing SM-SERS spectra, which show continuous peak shifts and intensity fluctuations, commonly referred to as "blinking", that arise from the Brownian motion and dynamic conformational changes of the molecule within the plasmonic hot spot. In addition, citrate, commonly used as a stabilizing surfactant in nanoparticle systems, seldom generates background signals in multimolecule SERS systems of nanoparticles, because it is completely replaced by the analyte molecules on the nanoparticle surface. In strong contrast, citrate signals significantly interfere with the SM-SERS signals from an SM-SERS sensor in which analyte molecules are adsorbed on the nanoparticle surface at submonolayer coverage, because the citrates compete with the target analytes for adsorption sites in the hot spot. Other research groups used oxygen plasma to remove the citrates from the enhancing surface of the SM-SERS sensor, which limits the applications. Finally, the lack of single-molecule SERS databases of pure pSer and pTyr poses another barrier.
In this work, we developed a high-accuracy SM-SERS approach by combining a plasmonic particle-in-pore sensor (Figure 1a) to collect SM-SERS spectra of phosphorylation at serine and tyrosine (Figure 1b), k-means clustering for citrate signal removal, and a one-dimensional convolutional neural network (1D-CNN) for phosphorylation identification. While the particle-in-pore SERS sensor provides a single, highly localized hot spot to detect SM-SERS signals with an ultralow analyte coverage of 1.23% of the particle surface (Figure 1c), citrate-interfered spectra were effectively excluded by k-means clustering before training and post-evaluation of the 1D-CNN model. The customized 1D-CNN model consists of hierarchical convolutional layers and pooling operations (Figure 1d), which enable extraction of subtle spectral differences caused by phosphorylation from the blinking single-molecule SERS data (Figure 1c). Taking advantage of the large amount of single-molecule SERS data acquired from the particle-in-pore sensor, the synergy of the sensor with the 1D-CNN model allows automated information extraction from complex spectra for single-molecule PTM analysis, which overcomes the above-mentioned drawbacks of traditional SERS analysis. Notably, we achieved an overall accuracy of over 95% for the identification of Ser from pSer and 97% for the identification of Tyr from pTyr. This work demonstrates high-accuracy SERS detection of pSer and pTyr at the single-molecule level, a significant step toward single-molecule PTM analysis, which has vast applications in personalized medicine, drug discovery, and therapeutic intervention monitoring.
The plasmonic particle-in-pore sensor with SM-SERS sensitivity was fabricated to collect the SM-SERS spectra of PTMs according to the protocol in our previous papers. Gold nanopores of 200 nm diameter were fabricated on a silicon nitride (SiN) membrane by focused ion beam milling (see fabrication details in the Supporting Information). After adsorbing 1/80 monolayer, i.e., 1.23% of the particle surface area, of analyte molecules on a gold nanoparticle (AuNP) of 50 nm diameter (details in the Supporting Information Table S1), the nanoparticle was trapped in the nanopore for minutes under 785 nm laser illumination in a ThermoFisher DXR2xi Raman microscope with 15 mW laser power, a slit width of 50 μm, and 0.1 s exposure time. Consequently, a single plasmonic hot spot with a strong field enhancement was generated on the nanoparticle to excite the molecule and emit SM-SERS signals.
The SM-SERS spectra of Ser, Tyr, and their phosphorylation products, phosphorylated serine (pSer) and phosphorylated tyrosine (pTyr), were collected by the particle-in-pore sensor. The SM-SERS measurements in our particle-in-pore platform are quite reproducible, as we collected more than 20,000 SM-SERS spectra for each molecule (citrate, Ser, Tyr, pSer, pTyr). We used 10 nanopores for each type of molecule. For each independent measurement, a new particle was trapped in the nanopore to produce an SM-SERS spectra time series. We considered 13 independent measurements for citrate, 11 for tyrosine, 23 for phosphorylated tyrosine, 13 for serine, and 15 for phosphorylated serine. Each independent measurement contains 2000 spectra, although some yield a better signal than others, depending on the molecular structure. Typical preprocessed SM-SERS spectra time series, processed by the SERS signal processing pipeline in Supporting Information Figure S1, are shown in Figure 1c.
Unlike multimolecule SERS spectra, in which spectral features are averaged over many molecules, SM-SERS spectra exhibit significant temporal fluctuations in peak position, intensity, and bandwidth. This variability is due to the probabilistic nature of molecular adsorption within the hot spots. Additionally, peak shifts can occur due to charge transfer interactions between the target molecule and the gold surface in different orientations. Despite these fluctuations, SM-SERS retains high chemical specificity and is capable of resolving subtle differences in molecular structure, isotopic composition, and even intermolecular interactions. Instead of using peak intensity, we apply a histogram of SM-SERS peak occurrence frequency, for example, in Figure 2a,b (pSer) and Figure 2c,d (pTyr), to visualize and emphasize the narrow, continuously occurring SM-SERS peaks. These peaks characterize the most probable molecular conformation of the analyte molecules and citrates within the plasmonic hot spot. Similarly, the SM-SERS spectra of Tyr and Ser and their peak occurrence frequencies can be found in Supporting Information Figure S2.
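The idea of a peak occurrence histogram can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the peak-picking rule (a simple local maximum above a height threshold), the 10 cm–1 bin width, and the synthetic "blinking" band near 825 cm–1 are all assumptions chosen only to make the example self-contained.

```python
import numpy as np

def peak_occurrence_histogram(spectra, shifts, min_height, bin_width=10.0):
    """Count how often a peak appears in each Raman-shift bin across a time series.

    spectra: (n_spectra, n_points) intensities; shifts: (n_points,) Raman shifts in cm^-1.
    The local-maximum rule and bin width are illustrative assumptions.
    """
    inner = spectra[:, 1:-1]
    # A point counts as a peak if it exceeds both neighbours and the height threshold.
    is_peak = (inner > spectra[:, :-2]) & (inner > spectra[:, 2:]) & (inner >= min_height)
    peak_shifts = np.broadcast_to(shifts[1:-1], inner.shape)[is_peak]
    edges = np.arange(shifts.min(), shifts.max() + bin_width, bin_width)
    counts, edges = np.histogram(peak_shifts, bins=edges)
    return counts, edges

# Synthetic time series: one band that appears in about 60% of frames and shifts slightly.
rng = np.random.default_rng(0)
shifts = np.linspace(500.0, 1650.0, 1151)            # 1 cm^-1 grid
present = rng.random(200) < 0.6                       # frames in which the band appears
centres = 825.0 + rng.integers(-2, 3, size=200)       # small frame-to-frame shifts
spectra = present[:, None] * np.exp(-0.5 * ((shifts[None, :] - centres[:, None]) / 3.0) ** 2)

counts, edges = peak_occurrence_histogram(spectra, shifts, min_height=0.5)
print("most frequent bin starts at", edges[counts.argmax()], "cm^-1")
```

Because the histogram counts occurrences rather than summing intensities, a weak but persistent band can stand out over an intense but rare one, which is the property the text relies on.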
Figure 2. Examples of SM-SERS spectra and histograms of SM-SERS peak occurrence frequency. (a) Fluctuating SM-SERS spectra of pSer, with the red line indicating the most frequently occurring Raman shift. (b) SM-SERS peak occurrence frequency of pSer. (c) Fluctuating SM-SERS spectra of pTyr. (d) SM-SERS peak occurrence frequency of pTyr.
In our previous paper, we identified hydroxylation at the single molecule level, whereas this work focuses on the phosphorylation of serine and tyrosine. The citrate interference is a bottleneck for single molecule data analysis. Previously, we experimentally mitigated this issue by substituting citrate with an analyte monolayer. However, this approach has two key limitations. First, it does not completely eliminate citrates, leading to residual interference, as citrate could still access the hot spot. Second, the requirement of a substantial number of analyte molecules to form a complete monolayer on the nanoparticle limits further improvements in the detection sensitivity.
To address these limitations, we engineered the platform such that the analyte molecules occupied only 1.23% of the particle surface. The remaining surface was covered by citrates, which generated SM-SERS noise when they were excited by the hot spot. To exclude citrate influence, we implemented a k-means-based clustering algorithm to remove citrate-contaminated spectra from our SM-SERS data sets. The k-means-based clustering has three stages: (1) clustering with the k-means algorithm; (2) identification of the contaminated cluster (i.e., the cluster containing most of the pure-citrate spectra); and (3) identification of citrate-affected spectra of the target molecule by iteratively searching within the contaminated cluster. First, SM-SERS spectra of pure citrate were collected using bare nanoparticles (i.e., only citrate surfactants on the particle surface) in the particle-in-pore platform. Second, using 2051 spectra of pure citrate as a reference, we implemented the k-means algorithm in MATLAB to cluster the SERS spectra. k-Means clustering minimizes within-cluster variance by iteratively assigning spectra to their nearest cluster centroids based on spectral similarity metrics. We used three clusters, based on our prior knowledge and on the optimal number of clusters given by the elbow method (Figure S3 in the Supporting Information). Following the k-means clustering in Figure 3b and e, we identified the cluster predominantly composed of citrate spectra (the contaminated cluster) using the “mode” function in MATLAB. This cluster was subsequently used as a reference database to filter out citrate-affected analyte spectra. The contaminated cluster primarily contains spectra of citrate, along with citrate-affected spectra from the analytes. We identified and removed any citrate-affected target molecule spectra that had been assigned to this contaminated cluster using the “find” function in MATLAB.
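The three stages can be sketched in Python as follows. This is only an illustration under stated assumptions: the original analysis was done in MATLAB, whereas here a plain Lloyd's k-means with a farthest-point initialisation is written in NumPy, the function names `kmeans` and `remove_citrate_affected` are hypothetical, and the toy 10-point "spectra" are synthetic.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's k-means (illustrative); returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    # Farthest-point initialisation: start from a random spectrum, then repeatedly
    # add the spectrum farthest from the centroids chosen so far.
    centroids = [X[rng.integers(len(X))]]
    for _ in range(k - 1):
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centroids], axis=0)
        centroids.append(X[np.argmax(d)])
    centroids = np.array(centroids)
    for _ in range(n_iter):
        dists = np.stack([np.linalg.norm(X - c, axis=1) for c in centroids], axis=1)
        labels = dists.argmin(axis=1)                      # stage 1: assign to nearest centroid
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
                        for j in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    return labels, centroids

def remove_citrate_affected(analyte, citrate, k=3, seed=0):
    """Drop analyte spectra that fall into the citrate-dominated cluster."""
    X = np.vstack([citrate, analyte])
    labels, _ = kmeans(X, k, seed=seed)
    cit_labels = labels[: len(citrate)]
    contaminated = np.bincount(cit_labels, minlength=k).argmax()   # stage 2: mode of citrate labels
    keep = labels[len(citrate):] != contaminated                   # stage 3: filter analyte spectra
    return analyte[keep], keep

# Toy data: each "spectrum" has 10 points with one dominant band (purely synthetic).
rng = np.random.default_rng(1)
def blob(band_index, n):
    mean = np.zeros(10)
    mean[band_index] = 10.0
    return mean + 0.5 * rng.standard_normal((n, 10))

citrate = blob(0, 50)
analyte = np.vstack([blob(1, 40), blob(2, 40), blob(0, 15)])  # last 15 mimic citrate-affected frames
clean, keep = remove_citrate_affected(analyte, citrate)
print(clean.shape, int(keep.sum()))
```

In this toy case the 15 citrate-like analyte spectra land in the same cluster as the pure-citrate reference and are removed, while the other 80 are kept.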
Figure 3. Workflow of k-means-based clustering. The black star (★) indicates the centroid of each cluster. (a) t-SNE visualization of Ser, pSer, and citrate (Cit) in two-dimensional space; (b) t-SNE visualization of k-means clustered spectra of Ser, pSer, and citrate-contaminated (Cont) data; (c) t-SNE visualization of citrate-free, k-means clustered Ser and pSer spectra; (d) t-SNE visualization of Tyr, pTyr, and citrate (Cit) in two-dimensional space; (e) t-SNE visualization of k-means clustered Tyr, pTyr, and citrate-contaminated (Cont) spectra; and (f) t-SNE visualization of k-means clustered, citrate-free spectra of Tyr and pTyr.
To visualize the different clusters distinctly, we first implemented principal component analysis (PCA) to reduce the dimensionality of the original spectra. Each spectrum initially contained 1048 features (Raman shift intensity values), which we reduced to the 30 most significant principal components. We then applied t-distributed stochastic neighbor embedding (t-SNE), a robust non-linear dimensionality reduction technique, to visualize the spectral clustering outcomes as distinct groups in a two-dimensional space. (Detailed implementation available in the Supporting Information.)
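As a rough illustration of the first step, the reduction from 1048 intensity values to 30 principal components can be written with a singular value decomposition. The sketch below uses random placeholder data and a hypothetical helper `pca_reduce`; the subsequent t-SNE embedding is not reproduced here and would normally be delegated to a library implementation such as scikit-learn's `TSNE`.

```python
import numpy as np

def pca_reduce(X, n_components=30):
    """Project the rows of X onto the top principal components via SVD."""
    Xc = X - X.mean(axis=0)                              # centre each feature
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[:n_components].T                    # coordinates in PC space
    explained = (S[:n_components] ** 2).sum() / (S ** 2).sum()
    return scores, explained

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 1048))    # placeholder for 200 spectra with 1048 features
scores, explained = pca_reduce(X)
print(scores.shape, round(float(explained), 3))
```

The 30-column `scores` array is what would then be passed to the non-linear embedding for the two-dimensional plots.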
As a result, some spectra from Ser, pSer, Tyr, and pTyr are reassigned as contaminated, meaning that these spectra are highly correlated with citrate or strongly affected by citrate. The original, clustered, and citrate-free data sets of Ser and pSer are presented in Figure 3a-c. Similarly, the t-SNE visualization of Tyr and pTyr before and after citrate removal using the k-means-based clustering algorithm is shown in Figure 3d-f. We quantified the relative number of spectra in each cluster in Figure S4 and the profile of spectra excluded by the k-means-based algorithm in Figure S5 in the Supporting Information. PCA, a linear technique, focuses on the global structure of the data. In contrast, t-SNE maintains local data relationships by constructing a probability distribution that reflects pairwise spectral similarities in the high-dimensional space. Subsequently, it optimizes a low-dimensional embedding that preserves these intricate spectral relationships. Thus, the integration of k-means clustering with an iterative search of individual spectra within the citrate database effectively mitigates citrate-induced interference, ensuring the selection of SERS spectra for precise molecular characterization.
Using the k-means-based citrate removal technique, we obtained a citrate-free data subset, on which we then developed and validated a customized 1D-CNN model. Our 1D-CNN model consists of an input layer, three convolutional blocks, a flattening layer, two fully connected (dense) blocks, and an output layer. Each convolutional block consists of two convolutional layers, two batch normalization (BN) layers, a max-pooling layer, and dropout layers. The convolutional layers start with 16 filters, and the filter count doubles at each step up to 64. Batch normalization accelerates convergence, max-pooling reduces the spatial dimension, and dropout helps prevent overfitting. The flattening layer transforms the multiple feature maps produced by the convolutional layers into a 1D vector. The rectified linear unit (ReLU) activation function introduces non-linearity into the model and mitigates the vanishing gradient problem, and L2 kernel regularization is used. The fully connected (dense) layers make the final decision after the convolutional and pooling layers have extracted features, learning global patterns in their input feature space to classify the spectra. Finally, the sigmoid activation function in the output layer converts the raw output score into a probability for the binary classes (detailed in Figure S6 in the Supporting Information). CNNs have been applied to spectroscopic data analysis because of their excellent performance in feature extraction and identification problems. We implemented the 1D-CNN in Python 3.11.5 with the TensorFlow framework (details in Table S3 of the Supporting Information) to distinguish Ser from pSer and Tyr from pTyr based on their SM-SERS spectra.
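A model of this general shape could be written in Keras roughly as follows. This is a hedged sketch, not the published architecture: the kernel size, dropout rates, dense-layer widths, L2 strength, optimizer, the 16/32/64 filter assignment per block, and the input length of 1048 points are all assumptions; the actual settings are those given in the Supporting Information and the authors' repository.

```python
import tensorflow as tf
from tensorflow.keras import layers, regularizers

def conv_block(x, filters, kernel_size=5, drop=0.2, l2=1e-4):
    """Two Conv1D + BatchNorm pairs, then max-pooling and dropout (hyperparameters assumed)."""
    for _ in range(2):
        x = layers.Conv1D(filters, kernel_size, padding="same", activation="relu",
                          kernel_regularizer=regularizers.l2(l2))(x)
        x = layers.BatchNormalization()(x)
    x = layers.MaxPooling1D(pool_size=2)(x)
    return layers.Dropout(drop)(x)

def build_model(n_points=1048):
    """Binary classifier: three conv blocks, flatten, two dense blocks, sigmoid output."""
    inputs = tf.keras.Input(shape=(n_points, 1))
    x = inputs
    for filters in (16, 32, 64):          # filter count doubles from 16 to 64
        x = conv_block(x, filters)
    x = layers.Flatten()(x)
    for units in (128, 64):               # dense widths are illustrative only
        x = layers.Dense(units, activation="relu")(x)
        x = layers.Dropout(0.3)(x)
    outputs = layers.Dense(1, activation="sigmoid")(x)   # probability of the positive class
    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model

model = build_model()
model.summary()
```

A single sigmoid unit with binary cross-entropy is the usual pairing for a two-class task such as Ser versus pSer; one such model would be trained per identification task.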
A total of 2572 spectra of Ser, 3803 spectra of pSer, 4231 spectra of Tyr, and 5108 spectra of pTyr were extracted and divided into training, validation, and post-evaluation sets. For each identification task, two data sets were prepared: the original data set (citrate-affected data set) and a citrate-free data set. The citrate-free data set was obtained by removing citrate-contaminated spectra using the k-means clustering algorithm, resulting in 1605 spectra of Ser, 2281 spectra of pSer, 2484 spectra of Tyr, and 3125 spectra of pTyr. Each data set was randomly split into training and validation sets in a 70:30 ratio. Model performance was subsequently evaluated using an unseen data set, referred to as the post-evaluation set, as summarized in Table S2 of the Supporting Information.
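A random 70:30 split of the kind described can be sketched in a few lines of NumPy. The helper name `split_70_30`, the seed, and the placeholder array of 1000 spectra are assumptions for illustration; the actual set sizes are those listed in Table S2.

```python
import numpy as np

def split_70_30(X, y, train_fraction=0.7, seed=0):
    """Shuffle the spectra once, then cut into training and validation parts."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))
    n_train = int(round(train_fraction * len(X)))
    train, val = order[:n_train], order[n_train:]
    return X[train], y[train], X[val], y[val]

X = np.zeros((1000, 1048))                 # placeholder spectra
y = np.arange(1000) % 2                    # placeholder binary labels
X_train, y_train, X_val, y_val = split_70_30(X, y)
print(len(X_train), len(X_val))
```

The post-evaluation spectra would be held out before this split so that they are never seen during training or validation.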
The classification accuracies of the 1D-CNN model on the validation and post-evaluation sets, evaluated by confusion matrices, are shown in Figure 4a for Ser vs pSer identification based on the citrate-affected data set and in Figure 4b for the citrate-free data set. We achieved post-evaluation accuracies of over 81% for the citrate-affected data set and over 93% for the citrate-free data set, indicating that using the k-means clustering algorithm to remove citrate-contaminated spectra significantly improves classification performance. Figure 4c shows the corresponding receiver operating characteristic (ROC) curves, demonstrating high sensitivity and specificity. The area under the curve (AUC), precision, and recall values from both training and post-evaluation are presented in Supporting Information Table S4. The model’s performance on the validation sets of the citrate-free and citrate-interfered data during training is shown in Supporting Information Figure S7. However, the accuracies and ROC curves for Tyr versus pTyr in Figure 4d-f before and after citrate signal removal do not show such a significant increase in accuracy. This could be due to the aromatic molecular structure of Tyr and pTyr, in contrast to the non-aromatic structure of Ser and pSer, because the former can generate much stronger SM-SERS signals than citrate.
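For readers less familiar with these metrics, the following sketch shows how a confusion matrix, accuracy, precision, recall, and AUC follow from predicted probabilities. The 0.5 decision threshold and the six example scores are assumptions made for illustration; they are not data from this study.

```python
import numpy as np

def binary_metrics(y_true, y_prob, threshold=0.5):
    """Confusion matrix and summary metrics for a binary classifier."""
    y_true = np.asarray(y_true)
    y_prob = np.asarray(y_prob, dtype=float)
    y_pred = (y_prob >= threshold).astype(int)
    tp = int(np.sum((y_pred == 1) & (y_true == 1)))
    tn = int(np.sum((y_pred == 0) & (y_true == 0)))
    fp = int(np.sum((y_pred == 1) & (y_true == 0)))
    fn = int(np.sum((y_pred == 0) & (y_true == 1)))
    # AUC as the probability that a random positive is scored above a random negative.
    pos, neg = y_prob[y_true == 1], y_prob[y_true == 0]
    diff = pos[:, None] - neg[None, :]
    auc = (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / (len(pos) * len(neg))
    return {
        "confusion": [[tn, fp], [fn, tp]],
        "accuracy": (tp + tn) / len(y_true),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "auc": float(auc),
    }

# Hypothetical scores: label 1 could stand for pSer and label 0 for Ser.
m = binary_metrics([0, 0, 0, 1, 1, 1], [0.1, 0.4, 0.6, 0.3, 0.8, 0.9])
print(m)
```

The pairwise AUC formula used here is equivalent to the area under the ROC curve, but it scales quadratically with the number of spectra, so a rank-based routine is preferable for large sets.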
Figure 4. Performance metrics of the 1D-CNN on the identification of Ser from pSer and Tyr from pTyr. The confusion matrices show the classification accuracies at the validation (Val) and post-evaluation (Post-Eval) stages for (a,d) citrate-affected spectra (Cit-A) and (b,e) citrate-free spectra (Cit-free), respectively. (c,f) Corresponding ROC curves for citrate-affected and citrate-free spectra on the validation and post-evaluation sets.
To interpret the 1D-CNN results and gain insight into the data analysis, we combined the histogram of SM-SERS peak occurrence frequencies of all citrate-free data sets with Gradient-Weighted Class Activation Mapping (Grad-CAM). The latter elucidates the decision-making process of neural networks by highlighting the more discriminative spectral regions. In Figure 5, we present the normalized Grad-CAM feature weights of Ser/pSer and Tyr/pTyr (colored curves) alongside the SM-SERS peak occurrence frequencies (gray histogram) within the 500–1650 cm–1 spectral region. The distinct differences in the high Grad-CAM features could correspond either to inherent molecular vibrations of the two molecules or merely to data differences without corresponding vibrational modes. A good example of the latter is the high pTyr Grad-CAM feature at around 1027 cm–1, which does not correspond to a high SM-SERS peak occurrence frequency in Figure 5d. Therefore, only those Grad-CAM regions that overlap with a high SM-SERS peak occurrence frequency were confirmed to represent molecular structural differences and can be assigned to a certain vibrational mode. Due to the lack of a comprehensive spectral database for tyrosine and serine phosphorylation, some frequently occurring peaks may not be assigned.
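The Grad-CAM weighting and the overlap criterion can be illustrated with a small NumPy sketch. It assumes that the activations of the last convolutional layer and the gradients of the class score with respect to them have already been extracted and resampled onto the same grid as the occurrence histogram; the 0.5 cutoffs and the function names are hypothetical, since the text does not state numerical thresholds.

```python
import numpy as np

def grad_cam_1d(activations, gradients):
    """1D Grad-CAM from (length, channels) activations and matching gradients."""
    weights = gradients.mean(axis=0)                  # one importance weight per channel
    cam = np.maximum(activations @ weights, 0.0)      # weighted channel sum, then ReLU
    return cam / cam.max() if cam.max() > 0 else cam  # normalise to [0, 1]

def confirmed_regions(cam, occurrence, cam_min=0.5, occ_min=0.5):
    """Positions where both Grad-CAM weight and normalised peak occurrence are high."""
    occ = occurrence / occurrence.max()
    return (cam >= cam_min) & (occ >= occ_min)

# Toy example with 6 spectral positions and 2 channels.
activations = np.array([[0, 0], [1, 0], [0, 0], [0, 2], [0, 0], [0, 0]], dtype=float)
gradients = np.tile([1.0, 0.5], (6, 1))
occurrence = np.array([0, 10, 0, 1, 0, 0], dtype=float)

cam = grad_cam_1d(activations, gradients)
mask = confirmed_regions(cam, occurrence)
print(cam.tolist(), mask.tolist())
```

Positions 1 and 3 both receive the maximum Grad-CAM weight, but only position 1 coincides with a frequently occurring peak, so only it would be considered for a vibrational assignment. This mirrors the 1027 cm–1 pTyr case discussed in the text.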
Figure 5. Normalized Grad-CAM feature weights extracted by the 1D-CNN model with a histogram of the peak occurrence frequency of the entire citrate-free data set. The blue curve in (a) represents the 1D Grad-CAM feature weights for Ser, while the red curve in (b) shows those for pSer. Similarly, the blue curve in (c) corresponds to Tyr, and the red curve in (d) corresponds to pTyr. Gray spikes in each panel indicate the peak occurrence frequencies of the corresponding molecules.
For Ser and pSer in Figure 5a,b, we observed the most significant spectral regions within the ranges 610–620 cm–1, 760–770 cm–1, 820–830 cm–1, 1000–1010 cm–1, 1090–1100 cm–1, 1240–1250 cm–1, 1300–1310 cm–1, 1320–1330 cm–1, 1420–1430 cm–1, and 1570–1580 cm–1. We assign peaks based on high occurrence frequencies, allowing a Raman shift tolerance of 10 cm–1. We rely primarily on the peak occurrence frequency for assignment because it correlates directly with structural differences. In contrast, the 1D gradient feature map, derived from data differences, focuses on unique features for robust and accurate binary classification (e.g., Ser from pSer). This gradient map may also incorporate Raman peaks if they positively contribute to distinguishing the two classes; in that case, they can be assigned to a certain vibrational mode.
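The tolerance-based assignment can be expressed as a simple nearest-band lookup. In the sketch below, the reference bands are a subset of the Ser/pSer entries of Table 1, while the helper `assign_peak` and the rule of picking the closest band within 10 cm–1 are assumptions about how such a lookup might be coded.

```python
# (band in cm^-1, molecule, vibration mode); subset of the Ser/pSer rows of Table 1
REFERENCE_BANDS = [
    (611, "pSer", "δ(COO–)"),
    (770, "Ser", "γ(COO–)"),
    (829, "pSer", "ν_as(PO)"),
    (1004, "pSer", "γ(CH2)"),
    (1021, "Ser", "γ(CH2)"),
    (1097, "Ser, pSer", "ν(CO)"),
]

def assign_peak(observed, bands=REFERENCE_BANDS, tolerance=10.0):
    """Return the closest reference band within the tolerance, or None if there is none."""
    best = min(bands, key=lambda band: abs(band[0] - observed))
    return best if abs(best[0] - observed) <= tolerance else None

print(assign_peak(825))    # falls within 10 cm^-1 of the 829 cm^-1 band
print(assign_peak(1010))   # closer to 1004 than to 1021
print(assign_peak(900))    # no reference band within tolerance
```

Peaks that return `None` correspond to the frequently occurring but unassigned peaks mentioned above, for which no reference band is available.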
While pSer shares some spectral features with Ser, it also introduces a distinct vibrational band due to the presence of the phosphate group. For instance, both Ser and pSer exhibit CO stretching at 1097 cm–1 and CH2 bending modes at 1361 cm–1. Furthermore, shared vibrational modes shift relative to each other because of the structural differences; for example, the out-of-plane bending mode of CH2 appears at 1004 cm–1 for pSer and at 1021 cm–1 for Ser. Additionally, phosphorylation induces unique peaks, such as the P–O asymmetric stretching at 829 cm–1, which is a direct consequence of the added phosphate group, consistent with the literature findings summarized in Table 1.
Table 1. SERS Bands with SM-SERS Peak Occurrence Frequencies as Shown in Figure 5.
| Ser and pSer Band (cm–1) | Molecule | Vibration Mode | Reference |
|---|---|---|---|
| 611 | pSer | δ(COO–) | |
| 770 | Ser | γ(COO–) | |
| 829 | pSer | ν_as(PO) | |
| 1004 | pSer | γ(CH2) | |
| 1021 | Ser | γ(CH2) | |
| 1097 | Ser, pSer | ν(CO) | |
| 1245 | Ser | δ(COH) | |
| 1301 | Ser | γ(CH) | |
| 1324 | pSer | γ(CH) | |
| 1423 | Ser | ν(COO–) | |
| 1571 | Ser | ν_as(COO–) | |

| Tyr and pTyr Band (cm–1) | Molecule | Vibration Mode | Reference |
|---|---|---|---|
| 742 | Tyr | ω(CH) | |
| 759 | pTyr | ν_s(PO) | |
| 800 | Tyr, pTyr | ω(CC) | |
| 1027 | Tyr | ν(CN) | |
| 1077 | Tyr, pTyr | δ(CH) | |
| 1140 | Tyr, pTyr | ν(CCH), ring | |
| 1259 | Tyr | ν(C4OH) | |
| 1566 | Tyr, pTyr | ν(ring) | |
In the case of Tyr and pTyr in Figure 5c,d, prominent vibrational modes were identified in the ranges of 740–760 cm–1, 800–810 cm–1, 1020–1030 cm–1, 1070–1080 cm–1, 1140–1150 cm–1, 1250–1260 cm–1, and 1560–1570 cm–1. We observed features common to both molecules, including wagging of CC (800 cm–1), bending of CH (1077 cm–1), and ring stretching (1140 cm–1 and 1566 cm–1). Uniquely, asymmetric PO stretching at 759 cm–1 was observed, which is typically associated with pTyr, while the C4OH stretching at 1259 cm–1 characterizes Tyr. A comprehensive assignment of these vibrational modes is provided in Table 1.
In summary, we detected single-molecule SERS spectra of Ser, Tyr, and their phosphorylated forms with analyte molecules occupying only 1.23% of the particle surface using the particle-in-pore sensor, effectively excluded the citrate-affected spectra using a k-means clustering algorithm, and distinguished single amino acids from their phosphorylated forms using a deep learning model. Our 1D-CNN model achieved accuracies of over 95% in distinguishing Ser from pSer and 97% in distinguishing Tyr from pTyr. This work demonstrates the SM-SERS identification of serine and tyrosine from their phosphorylated forms. Achieving accurate differentiation of amino acids from their PTMs at the single-molecule level lays the foundation for further study of peptide and protein phosphorylation. Consequently, AI-enhanced SERS could transform fields such as biochemistry, molecular biology, proteomics, genomics, and medicine by enabling precise molecular characterization at single-molecule sensitivity.
Our findings highlight deep learning analysis of SM-SERS spectra as a way to detect PTMs. Manual analysis of thousands of single-molecule SERS spectra is impractical and susceptible to bias; in contrast, deep learning automates this task and avoids human intervention. Notably, deep learning requires a large amount of data for training, validating, and post-evaluating the 1D-CNN model. The Brownian motion of molecules in the hot spot allows us to collect a large amount of original data without the need for augmentation of the SERS spectra, which may generate false spectra. This is because the SM-SERS spectra have only one dimension, so augmenting by shifting or adding noise may introduce fake spectra. Furthermore, the high sensitivity of SM-SERS spectra renders traditional augmentation less effective in the single-molecule regime. Future research may extend this approach to peptide and protein identification and sequencing at the single-molecule level, and the current model can be further customized for peptide and protein sequence analysis. Because peptides have long sequences and share many amino acids with their phosphorylated forms, the PTM site might not necessarily be located in the hot spot for SERS detection. This leads to heavy overlap between the spectra of a peptide and its PTMs. Therefore, deeper and more complex tuning of the model architecture may be needed to differentiate peptide spectra effectively.
Supplementary Material
Acknowledgments
This research receives support from Academy Research Fellow project: TwoPoreProSeq (project number 347652), Biocenter Oulu emerging project (DigiRaman) and DigiHealth project (project number 326291), a strategic profiling project at the University of Oulu that is supported by the Academy of Finland and the University of Oulu.
The code is available on GitHub: https://github.com/MulusewWondie/Single-molecule-Phosphorylation-Identification-.
The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jpclett.5c01753.
Mulusew W. Yaltaye developed the SERS spectra preprocessing pipeline, prepared the data sets, developed and validated the 1D-CNN model, analyzed the results, and drafted and revised the manuscript. Yingqi Zhao designed the molecules, fabricated the particle-in-pore devices, determined the Raman measurement protocol, collected Raman spectra, and revised the manuscript. Kuo Zhan helped with data analysis and revised the manuscript. Eva Bozo helped with Raman measurements. Pei-Lin Xin helped in revising the manuscript. Francesco De Angelis contributed to the device fabrication and revised the manuscript. Vahid Farrahi helped with data analysis and revised the manuscript. Jianan Huang conceived the idea, acquired funding, supervised the work, and revised the manuscript.
The authors declare no competing financial interest.
References
- Shortreed M. R., Wenger C. D., Frey B. L., Sheynkman G. M., Scalf M., Keller M. P., Attie A. D., Smith L. M.. Global Identification of Protein Post-Translational Modifications in a Single-Pass Database Search. J. Proteome Res. 2015;14(11):4714–4720. doi: 10.1021/acs.jproteome.5b00599. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Petushkova N. A., Zgoda V. G., Pyatnitskiy M. A., Larina O. V., Teryaeva N. B., Potapov A. A., Lisitsa A. V.. Post-Translational Modifications of FDA-Approved Plasma Biomarkers in Glioblastoma Samples. PLoS One. 2017;12(5):e0177427. doi: 10.1371/journal.pone.0177427. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ramazi S., Zahiri J.. Post-Translational Modifications in Proteins: Resources, Tools and Prediction Methods. Database. 2021:baab012. doi: 10.1093/database/baab012. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Johnson L. N., Lewis R. J.. Structural Basis for Control by Phosphorylation. Chem. Rev. 2001;101(8):2209–2242. doi: 10.1021/cr000225s. [DOI] [PubMed] [Google Scholar]
- Daniels G., Pei Z., Logan S. K., Lee P.. Mini-Review: Androgen Receptor Phosphorylation in Prostate Cancer. Am. J. Clin Exp Urol. 2013;1(1):25–29. [PMC free article] [PubMed] [Google Scholar]
- Ochieng J., Nangami G., Sakwe A., Moye C., Alvarez J., Whalen D., Thomas P., Lammers P.. Impact of Fetuin-A (AHSG) on Tumor Progression and Type 2 Diabetes. Int. J. Mol. Sci. 2018;19(8):2211. doi: 10.3390/ijms19082211. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ricken F., Can A. D., Gräber S., Häusler M., Jahnen-Dechent W.. Post-Translational Modifications Glycosylation and Phosphorylation of the Major Hepatic Plasma Protein Fetuin-A Are Associated with CNS Inflammation in Children. PLoS One. 2022;17(10):e0268592. doi: 10.1371/journal.pone.0268592. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Cavallarin N., Vicario M., Negro A.. The Role of Phosphorylation in Synucleinopathies: Focus on Parkinsons Disease. CNS Neurol Disord Drug Targets. 2010;9(4):471–481. doi: 10.2174/187152710791556140. [DOI] [PubMed] [Google Scholar]
- Sano K., Iwasaki Y., Yamashita Y., Irie K., Hosokawa M., Satoh K., Mishima K.. Tyrosine 136 Phosphorylation of α-Synuclein Aggregates in the Lewy Body Dementia Brain: Involvement of Serine 129 Phosphorylation by Casein Kinase 2. Acta Neuropathol Commun. 2021;9(1):182. doi: 10.1186/s40478-021-01281-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Aebersold R., Mann M.. Mass-Spectrometric Exploration of Proteome Structure and Function. Nature. 2016:347–355. doi: 10.1038/nature19949. [DOI] [PubMed] [Google Scholar]
- Ma H., Han X. X., Zhao B. Enhanced Raman Spectroscopic Analysis of Protein Post-Translational Modifications. TrAC Trends in Analytical Chemistry. 2020;131:116019. doi: 10.1016/j.trac.2020.116019.
- Bi X., Czajkowsky D. M., Shao Z., Ye J. Digital Colloid-Enhanced Raman Spectroscopy by Single-Molecule Counting. Nature. 2024;628(8009):771–775. doi: 10.1038/s41586-024-07218-1.
- Lin L. L., Alvarez-Puebla R., Liz-Marzán L. M., Trau M., Wang J., Fabris L., Wang X., Liu G., Xu S., Han X. X., Yang L., Shen A., Yang S., Xu Y., Li C., Huang J., Liu S.-C., Huang J.-A., Srivastava I., Li M., Tian L., Nguyen L. B. T., Bi X., Cialla-May D., Matousek P., Stone N., Carney R. P., Ji W., Song W., Chen Z., Phang I. Y., Henriksen-Lacey M., Chen H., Wu Z., Guo H., Ma H., Ustinov G., Luo S., Mosca S., Gardner B., Long Y.-T., Popp J., Ren B., Nie S., Zhao B., Ling X. Y., Ye J. Surface-Enhanced Raman Spectroscopy for Biomedical Applications: Recent Advances and Future Challenges. ACS Appl. Mater. Interfaces. 2025;17(11):16287–16379. doi: 10.1021/acsami.4c17502.
- Wei F., Zhang D., Halas N. J., Hartgerink J. D. Aromatic Amino Acids Providing Characteristic Motifs in the Raman and SERS Spectroscopy of Peptides. J. Phys. Chem. B. 2008;112(30):9158–9164. doi: 10.1021/jp8025732.
- Kurouski D., Postiglione T., Deckert-Gaudig T., Deckert V., Lednev I. K. Amide I Vibrational Mode Suppression in Surface (SERS) and Tip (TERS) Enhanced Raman Spectra of Protein Specimens. Analyst. 2013;138(6):1665. doi: 10.1039/c2an36478f.
- Huang J., Mousavi M. Z., Giovannini G., Zhao Y., Hubarevich A., Soler M. A., Rocchia W., Garoli D., De Angelis F. Multiplexed Discrimination of Single Amino Acid Residues in Polypeptides in a Single SERS Hot Spot. Angew. Chem., Int. Ed. 2020;59(28):11423–11431. doi: 10.1002/anie.202000489.
- Huang J.-A., Mousavi M. Z., Zhao Y., Hubarevich A., Omeis F., Giovannini G., Schütte M., Garoli D., De Angelis F. SERS Discrimination of Single DNA Bases in Single Oligonucleotides by Electro-Plasmonic Trapping. Nat. Commun. 2019;10(1):5321. doi: 10.1038/s41467-019-13242-x.
- Zhao Y., Zhan K., Xin P.-L., Chen Z., Li S., De Angelis F., Huang J.-A. Single-Molecule SERS Discrimination of Proline from Hydroxyproline Assisted by a Deep Learning Model. Nano Lett. 2025;25(18):7499–7506. doi: 10.1021/acs.nanolett.5c01177.
- Park K.-D., Muller E. A., Kravtsov V., Sass P. M., Dreyer J., Atkin J. M., Raschke M. B. Variable-Temperature Tip-Enhanced Raman Spectroscopy of Single-Molecule Fluctuations and Dynamics. Nano Lett. 2016;16(1):479–487. doi: 10.1021/acs.nanolett.5b04135.
- Zhao X., Liu X., Chen D., Shi G., Li G., Tang X., Zhu X., Li M., Yao L., Wei Y., Song W., Sun Z., Fan X., Zhou Z., Qiu T., Hao Q. Plasmonic Trimers Designed as SERS-Active Chemical Traps for Subtyping of Lung Tumors. Nat. Commun. 2024;15(1):5855. doi: 10.1038/s41467-024-50321-0.
- Deriu C., Morozov A. N., Mebel A. M. Direct and Water-Mediated Adsorption of Stabilizers on SERS-Active Colloidal Bimetallic Plasmonic Nanomaterials: Insight into Citrate–AuAg Interactions from DFT Calculations. J. Phys. Chem. A. 2022;126(32):5236–5251. doi: 10.1021/acs.jpca.2c00455.
- Ochieng J., Nangami G., Sakwe A., Moye C., Alvarez J., Whalen D., Thomas P., Lammers P. Impact of Fetuin-A (AHSG) on Tumor Progression and Type 2 Diabetes. Int. J. Mol. Sci. 2018;19(8):2211. doi: 10.3390/ijms19082211.
- Krizhevsky A., Sutskever I., Hinton G. E. ImageNet Classification with Deep Convolutional Neural Networks. http://code.google.com/p/cuda-convnet/.
- LeCun Y., Bengio Y. Convolutional networks for images, speech, and time series. The Handbook of Brain Theory and Neural Networks. 1998.
- Goodfellow I., Bengio Y., Courville A. Deep Learning; MIT Press: 2016.
- Albrecht T., Slabaugh G., Alonso E., Al-Arif S. M. R. Deep Learning for Single-Molecule Science. Nanotechnology. 2017;28(42):423001. doi: 10.1088/1361-6528/aa8334.
- Ghosh K., Stuke A., Todorović M., Jørgensen P. B., Schmidt M. N., Vehtari A., Rinke P. Deep Learning Spectroscopy: Neural Networks for Molecular Excitation Spectra. Advanced Science. 2019;6(9):1801367. doi: 10.1002/advs.201801367.
- McCarthy M., Lee K. L. K. Molecule Identification with Rotational Spectroscopy and Probabilistic Deep Learning. J. Phys. Chem. A. 2020;124(15):3002–3017. doi: 10.1021/acs.jpca.0c01376.
- Pieczonka N. P. W., Aroca R. F. Single Molecule Analysis by Surfaced-Enhanced Raman Scattering. Chem. Soc. Rev. 2008;37(5):946. doi: 10.1039/b709739p.
- Zou Y., Jin H., Ma Q., Zheng Z., Weng S., Kolataj K., Acuna G., Bald I., Garoli D. Advances and Applications of Dynamic Surface-Enhanced Raman Spectroscopy (SERS) for Single Molecule Studies. Nanoscale. 2025;17(7):3656–3670. doi: 10.1039/D4NR04239E.
- Zrimsek A. B., Chiang N., Mattei M., Zaleski S., McAnally M. O., Chapman C. T., Henry A.-I., Schatz G. C., Van Duyne R. P. Single-Molecule Chemistry with Surface- and Tip-Enhanced Raman Spectroscopy. Chem. Rev. 2017;117(11):7583–7613. doi: 10.1021/acs.chemrev.6b00552.
- Etchegoin P. G., Le Ru E. C. A Perspective on Single Molecule SERS: Current Status and Future Challenges. Phys. Chem. Chem. Phys. 2008;10(40):6079. doi: 10.1039/b809196j.
- Qiu Y., Kuang C., Liu X., Tang L. Single-Molecule Surface-Enhanced Raman Spectroscopy. Sensors. 2022;22(13):4889. doi: 10.3390/s22134889.
- Gautam R., Vanga S., Ariese F., Umapathy S. Review of Multidimensional Data Processing Approaches for Raman and Infrared Spectroscopy. EPJ Tech Instrum. 2015;2(1):8. doi: 10.1140/epjti/s40485-015-0018-6.
- Pareek J., Jacob J. Data Compression and Visualization Using PCA and T-SNE. Proceedings of AICTC 2019. 2021:327–337. doi: 10.1007/978-981-15-5421-6_34.
- Van Der Maaten L., Hinton G. Visualizing Data Using T-SNE; 2008; Vol. 9.
- Zhang Y., Lyu X., Xing Y., Ji Y., Zhang L., Wu G., Liu X., Qin L., Wu Y., Wang X., Wu J., Li Y. Advancing DNA Structural Analysis: A SERS Approach Free from Citrate Interference Combined with Machine Learning. J. Phys. Chem. Lett. 2025;16(5):1199–1205. doi: 10.1021/acs.jpclett.4c03478.
- Chin C.-L., Chang C.-E., Chao L. Interpretable Multiscale Convolutional Neural Network for Classification and Feature Visualization of Weak Raman Spectra of Biomolecules at Cell Membranes. ACS Sens. 2025;10(4):2652–2666. doi: 10.1021/acssensors.4c03260.
- Lu P., Lin D., Chen N., Wang L., Zhang X., Chen H., Ma P. CNN-Assisted SERS Enables Ultra-Sensitive and Simultaneous Detection of Scr and BUN for Rapid Kidney Function Assessment. Analytical Methods. 2023;15(3):322–332. doi: 10.1039/D2AY01573K.
- Li J. Q., Dukes P. V., Lee W., Sarkis M., Vo-Dinh T. Machine Learning Using Convolutional Neural Networks for SERS Analysis of Biomarkers in Medical Diagnostics. J. Raman Spectrosc. 2022;53(12):2044–2057. doi: 10.1002/jrs.6447.
- Xiong C.-C., Zhu S.-S., Yan D.-H., Yao Y.-D., Zhang Z., Zhang G.-J., Chen S. Rapid and Precise Detection of Cancers via Label-Free SERS and Deep Learning. Anal Bioanal Chem. 2023;415(17):3449–3462. doi: 10.1007/s00216-023-04730-7.
- Kirchberger-Tolstik T., Pradhan P., Vieth M., Grunert P., Popp J., Bocklitz T. W., Stallmach A. Towards an Interpretable Classifier for Characterization of Endoscopic Mayo Scores in Ulcerative Colitis Using Raman Spectroscopy. Anal. Chem. 2020;92(20):13776–13784. doi: 10.1021/acs.analchem.0c02163.
- Andrushchenko V., Benda L., Páv O., Dračínský M., Bouř P. Vibrational Properties of the Phosphate Group Investigated by Molecular Dynamics and Density Functional Theory. J. Phys. Chem. B. 2015;119(33):10682–10692. doi: 10.1021/acs.jpcb.5b05124.
- Jarmelo S., Reva I., Carey P. R., Fausto R. Infrared and Raman Spectroscopic Characterization of the Hydrogen-Bonding Network in l-Serine Crystal. Vib Spectrosc. 2007;43(2):395–404. doi: 10.1016/j.vibspec.2006.04.025.
- Kolesov B. A., Boldyreva E. V. Difference in the Dynamic Properties of Chiral and Racemic Crystals of Serine Studied by Raman Spectroscopy at 3–295 K. J. Phys. Chem. B. 2007;111(51):14387–14397. doi: 10.1021/jp076083o.
- Zhu G., Zhu X., Fan Q., Wan X. Raman Spectra of Amino Acids and Their Aqueous Solutions. Spectrochim Acta A Mol. Biomol Spectrosc. 2011;78(3):1187–1195. doi: 10.1016/j.saa.2010.12.079.
- dos Santos C. A. A. S. S., Carvalho J. O., da Silva Filho J. G., Rodrigues J. L., Lima R. J. C., Pinheiro G. S., Freire P. T. C., Façanha Filho P. F. High-Pressure Raman Spectra and DFT Calculations of l-Tyrosine Hydrochloride Crystal. Physica B Condens Matter. 2018;531:35–44. doi: 10.1016/j.physb.2017.11.090.
- Madzharova F., Heiner Z., Kneipp J. Surface Enhanced Hyper-Raman Scattering of the Amino Acids Tryptophan, Histidine, Phenylalanine, and Tyrosine. J. Phys. Chem. C. 2017;121(2):1235–1242. doi: 10.1021/acs.jpcc.6b10905.
- Pavlou E., Kourkoumelis N. Deep Adversarial Data Augmentation for Biomedical Spectroscopy: Application to Modelling Raman Spectra of Bone. Chemometrics and Intelligent Laboratory Systems. 2022;228:104634. doi: 10.1016/j.chemolab.2022.104634.
Data Availability Statement
The code is available on GitHub: https://github.com/MulusewWondie/Single-molecule-Phosphorylation-Identification-.




