Abstract
Background
Nasopharyngeal carcinoma (NPC) is a malignant tumor with high incidence in Southeast Asia and Southern China, characterized by difficulties in early diagnosis and high recurrence rates after treatment. Metabolic reprogramming plays a crucial role in the development and progression of tumors. In-depth studies on the metabolic characteristics and molecular mechanisms of NPC are essential to identify novel diagnostic and therapeutic targets.
Objectives
This study aimed to systematically reveal the metabolic characteristics and molecular mechanisms of NPC cell lines by integrating untargeted metabolomics, transcriptomics, and confocal micro-Raman spectroscopy (CMRS), and to explore potential biomarkers for prognostic evaluation and precision treatment of NPC.
Methods
We performed an integrated analysis of transcriptomic, metabolomic, and Raman spectral data on five NPC cell lines (CNE1, CNE2, 5-8F, 6-10B, and SUNE1) and the immortalized nasopharyngeal epithelial cell line NPEC1-BMI1. The analysis included association analysis of differentially expressed metabolites (DEMs) and differentially expressed genes (DEGs), pathway enrichment analysis, and network analysis to elucidate the interplay between gene expression and metabolic alterations. Furthermore, we employed machine learning models to achieve efficient discrimination between NPC cell lines and NPEC1-BMI1 using Raman spectroscopy. Finally, we validated the expression levels of selected DEGs using quantitative polymerase chain reaction (qPCR), Western blotting (WB), and immunohistochemistry (IHC).
Results
Significant differences in metabolic and gene expression profiles were observed between NPC cells and normal cells. CMRS analysis, combined with a multilayer perceptron (MLP) model, achieved high-precision discrimination between NPC cells and normal cells (accuracy 99.3%, AUC = 1.00). Further integrated analysis revealed significant correlations between KYNU and other DEGs, multiple DEMs, and specific Raman spectral features, suggesting their potential as diagnostic and prognostic biomarkers. Validation using The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO) databases showed high KYNU expression in head and neck squamous cell carcinoma (HNSCC) and NPC tissues. Consistently high expression of KYNU was confirmed in NPC cell lines and tissues by qPCR, WB, and IHC.
Conclusions
This study elucidated the unique metabolic characteristics and molecular signatures of NPC, clarified how molecular changes regulate gene expression, and provided new potential targets for prognostic evaluation and precision treatment of NPC.
Supplementary Information
The online version contains supplementary material available at 10.1007/s12672-025-03349-7.
Keywords: Nasopharyngeal carcinoma, Multi-omics, Metabolomics, Transcriptomics, Confocal micro-Raman spectroscopy, Machine learning
Introduction
NPC primarily originates from the pharyngeal recess of the nasopharynx, with over 95% of cases classified as non-keratinizing squamous cell carcinoma [1]. It exhibits a significant geographical distribution, with over 75% of cases occurring in Southeast Asia and Southern China [2]. Epstein-Barr virus (EBV) infection and genetic susceptibility are the main pathogenic factors of NPC. Specifically, EBV infection interacts with genetic susceptibility through multiple mechanisms to promote NPC development. Firstly, genetic factors, such as specific allelic variations in human leukocyte antigen (HLA) genes, influence individual susceptibility and immune response to EBV infection. For instance, certain HLA alleles are associated with an increased risk of NPC [3]. Secondly, upon EBV infection, the virus can alter host cell gene expression and signaling pathways, such as activating the PI3K/Akt/mTOR and NF-κB pathways through the viral protein LMP1, thereby promoting tumor cell growth and proliferation [4]. Furthermore, EBV infection may induce epigenetic changes in host cells, such as DNA methylation and histone modification, further affecting gene expression [5]. Other factors, such as environmental exposures (e.g., diet, smoking), also contribute to NPC development [6]. Despite significant advances in NPC treatment, including chemotherapy, radiotherapy, targeted therapy, and immunotherapy, challenges remain, such as drug resistance, tumor heterogeneity, frequent recurrence and metastasis, and a narrow therapeutic window, which often limit treatment efficacy [7]. Tumor cell metabolism is closely related to NPC cell growth, proliferation, invasion, and metastasis [8–10]. Multiple studies have shown that tumor cells frequently undergo metabolic reprogramming to adapt to the challenging tumor microenvironment [11].
Metabolomics, as a rapidly evolving analytical tool, has demonstrated significant potential in elucidating tumor metabolic phenotypes, aiding in diagnosis, and predicting prognosis [12]. By systematically analyzing metabolites within biological organisms, metabolomics reveals notable metabolic differences between tumor cells and normal cells. For instance, tumor cells frequently exhibit the Warburg effect, a metabolic reprogramming that provides energy and biosynthetic building blocks for rapid proliferation [13]. Studies have unveiled significant metabolic alterations in NPC cells compared to normal cells, offering valuable insights into the underlying mechanisms of NPC development and progression [14]. In-depth transcriptomic analyses of NPC patients have identified key genes associated with tumorigenesis, progression, invasion, and metastasis. Aberrant gene expression drives tumor malignancy and significantly influences tumor cell metabolism by regulating the expression of metabolic enzymes and transporters [15]. Integrating metabolomics and transcriptomics can further elucidate the specific roles of these genes in modulating tumor metabolic networks.
CMRS is an emerging non-invasive spectroscopic technique that obtains the “molecular fingerprint” of cells by detecting Raman scattered light generated from molecular vibrations [16]. Compared to traditional radiological and histological methods, CMRS enables more direct detection of intracellular biomolecular changes, revealing cellular metabolic states. For example, CMRS can sensitively detect changes in biomolecules such as DNA, proteins, and lipids, as well as metabolites like glucose and lactate [17]. Thus, CMRS provides a novel perspective for studying cellular functions and diseases. However, CMRS also has limitations and potential challenges. Firstly, the Raman scattering signal is weak and susceptible to fluorescence interference, leading to a reduced signal-to-noise ratio. Secondly, sample preparation processes may introduce artifacts, affecting the accuracy of the spectra. Additionally, CMRS data analysis is complex, requiring specialized knowledge and experience. Lastly, the high cost of equipment and limited penetration depth also restrict its application scope [18].
Although reports on the CMRS spectral characteristics of NPC at the single-cell level are limited, studies have demonstrated that CMRS, by analyzing cellular biomolecular vibrational information and integrating machine learning models, can effectively distinguish between normal and tumor cells. A novel blood test using Raman spectroscopy has been proposed for the detection of NPC. By examining blood samples from NPC patients and healthy individuals, the results revealed high sensitivity and specificity, effectively differentiating NPC from normal samples, with similar detection efficacy observed at two wavelengths. This suggests the potential application of Raman spectroscopy-based blood tests in NPC detection [19]. This approach offers a promising avenue for the screening and validation of NPC-related biomarkers [20].
Advances in technologies such as metabolomics, transcriptomics, and Raman spectroscopy have provided powerful tools for studying tumor metabolism. By systematically analyzing the metabolism of NPC cells, we can gain deeper insights into their molecular mechanisms, laying the foundation for improving the prognosis of NPC patients. In recent years, advances in omics technologies have opened new avenues for cancer biomarker research. Multi-omics technologies have been widely applied in common cancers such as lung cancer [21] and breast cancer [22], achieving significant progress. Currently, metabolomics and transcriptomics studies of NPC cells still have significant limitations and data scarcity, severely hindering our comprehensive understanding of NPC metabolic characteristics [23]. Specifically, in transcriptomics, there is insufficient functional validation and clinical application of miRNA, and lncRNA and RNA sequencing studies lack widely recognized biomarkers [24]. In metabolomics, the application of high-throughput technologies is lagging, and clinical validation and long-term follow-up data are insufficient [25]. Therefore, there is an urgent need to conduct systematic multi-omics studies, using methods such as technological innovation, clinical validation, optimized design, and multi-omics integration, to deeply reveal the metabolic mechanisms of NPC and provide new scientific evidence for the early diagnosis and precise treatment of NPC. Integrating metabolomic and transcriptomic analyses, combined with real-time metabolite information provided by Raman spectroscopy, can more comprehensively reveal the molecular changes in NPC. Transcriptomic data provide information at the gene expression level, metabolomic data reflect the actual levels of metabolites, and Raman spectroscopy directly captures the dynamic changes of intracellular metabolic activities. 
The combination of these three can reveal how changes in gene expression affect metabolic pathways and how metabolites regulate gene expression, thereby providing more effective strategies for the screening and validation of NPC-related biomarkers.
This study aimed to elucidate the changes in metabolomics, transcriptomics, and CMRS in NPC cell lines and to validate the impact of these changes on gene expression. We hypothesized that integrating untargeted metabolomics, transcriptomics, and CMRS analyses could identify DEGs specifically associated with NPC and elucidate how molecular changes drive alterations in gene expression. Specifically, we performed untargeted metabolomics, transcriptomics, and CMRS analyses on five NPC cell lines (CNE1, CNE2, 5-8F, 6-10B, and SUNE1) and the immortalized nasopharyngeal epithelial cell line NPEC1-BMI1 to identify DEMs and DEGs. Subsequently, we analyzed the relative expression levels of DEGs significantly correlated with DEMs in HNSCC using the TCGA database and validated these findings in the GEO database. To validate the roles of the screened DEGs in NPC, we employed qPCR and WB techniques, which revealed high expression of KYNU in NPC cell lines. Furthermore, we successfully validated the high expression of KYNU on NPC tissue microarrays. By systematically analyzing metabolomic, transcriptomic, and CMRS data, combined with validation, we not only revealed the unique molecular characteristics and heterogeneity of NPC but also elucidated how molecular changes regulate gene expression, providing new insights into the pathogenesis of NPC. Looking forward, we will further investigate the functional roles of these key DEGs in the development and progression of NPC and explore their potential as therapeutic targets. We believe that this study lays the foundation for precision treatment strategies for NPC and provides an important reference for future research directions.
Materials and methods
Cell culture
The development and progression of NPC are multifactorial processes, with EBV infection being one of the most significant etiological factors. This study utilized a variety of NPC cell lines, including EBV-negative cell lines (SUNE1, 5-8F, 6-10B, CNE1, CNE2, HK1, HONE1) and EBV-positive cell lines (CNE2-EBV, HK1-EBV). The inclusion of both EBV-positive and EBV-negative cell lines aimed to investigate the impact of EBV infection status on the biological characteristics and potential differences of NPC cells, given the critical role of EBV infection in NPC pathogenesis.
The immortalized nasopharyngeal epithelial cell lines NP69 and NPEC1-BMI1 were cultured as non-cancer controls under the following conditions. NP69 cells were maintained in Keratinocyte Serum-Free Medium (KSFM) supplemented with epidermal growth factor and bovine pituitary extract. NPEC1-BMI1 cells, immortalized nasopharyngeal epithelial cells induced by Bmi-1, were cultured in KSFM (Invitrogen). All NPC cell lines were cultured in RPMI 1640 medium (GIBCO) supplemented with 5% fetal bovine serum (FBS, GIBCO). All cells were cultured at 37 °C in a humidified incubator with 5% CO2. Among these NPC cell lines, 5-8F exhibited high metastatic potential, 6-10B exhibited low metastatic potential, CNE1 was a well-differentiated squamous cell line, and CNE2 was a poorly differentiated squamous cell line. All cell lines mentioned above were kindly provided by Professor Musheng Zeng from Sun Yat-sen University Cancer Center.
Ultra-performance liquid chromatography-tandem mass spectrometry (UPLC-MS/MS) metabolomics assay and data analysis
Untargeted metabolomics analysis was performed as previously described [36] at the Calibra Metabolomics laboratory (DIAN Diagnostics, Hangzhou, Zhejiang, China). Small-molecule metabolites were extracted using 500 µL of 80% methanol extraction buffer, followed by centrifugation. The analysis was conducted using four different UPLC-MS/MS methods on a Waters ACQUITY 2D UPLC system coupled to a Thermo Fisher Q Exactive (QE) mass spectrometer. The chromatographic separations employed C18 reversed-phase (UPLC BEH C18, 2.1 × 100 mm, 1.7 μm) and HILIC (UPLC BEH Amide, 2.1 × 150 mm, 1.7 μm) columns. For each method, samples were dried under nitrogen and reconstituted in a suitable solvent prior to injection. The QE mass spectrometer was operated in full scan mode over a mass range of 70–1000 m/z with a resolution of 35,000.
Raw mass spectrometry data were pre-processed using proprietary in-house metabolomics software, including ion feature extraction and quality control. Metabolite identification was performed by comparing the data to an in-house reference library, requiring matches in chromatographic retention index, ion mass accuracy within 10 ppm, and high spectral similarity scores to achieve MSI Tier 1 confidence levels. Identified metabolites were median-normalized, log2-transformed, and statistically analyzed using Orthogonal Partial Least Squares Discriminant Analysis (OPLS-DA) to classify NPEC1-BMI1 cells and NPC cells. OPLS-DA was chosen for its efficacy in maximizing inter-group separation and providing VIP scores for key metabolite identification. DEMs were identified based on VIP values > 1.0 and P < 0.05, determined through t-tests or non-parametric statistical tests. Random forest (RF) analysis was also employed to identify potential biomarkers in NPC cells compared to NPEC1-BMI1 cells. RF was selected for its ability to handle high-dimensional data, robustness against overfitting, and feature importance assessment for biomarker identification. The original mass spectrometry data are available in Supplementary Raw Data 1.
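The DEM screening rule described above (median normalization, log2 transformation, then VIP > 1.0 and P < 0.05) can be sketched as follows. This is only an illustration under stated assumptions, not the in-house pipeline: the function name `select_dems` is hypothetical, median normalization is assumed to be applied per sample, the significance test is fixed to Welch's t-test, and the VIP scores are assumed to come from a separately fitted OPLS-DA model.

```python
import numpy as np
from scipy import stats


def select_dems(intensity, is_npc, vip, vip_cutoff=1.0, alpha=0.05):
    """Return a boolean mask marking candidate DEMs (hypothetical helper).

    intensity : (n_samples, n_metabolites) positive peak areas
    is_npc    : (n_samples,) True for NPC samples, False for controls
    vip       : (n_metabolites,) VIP scores from an OPLS-DA model fitted elsewhere
    """
    intensity = np.asarray(intensity, dtype=float)
    is_npc = np.asarray(is_npc, dtype=bool)

    # Assumption: median normalization is done per sample (row).
    normalized = intensity / np.median(intensity, axis=1, keepdims=True)
    log_values = np.log2(normalized)

    # Welch's t-test per metabolite (column), NPC versus control.
    _, p_values = stats.ttest_ind(
        log_values[is_npc], log_values[~is_npc], axis=0, equal_var=False
    )

    # A metabolite is kept only if it passes both thresholds.
    return (np.asarray(vip, dtype=float) > vip_cutoff) & (p_values < alpha)
```

Requiring both criteria means a metabolite must contribute to the multivariate separation and also differ in a univariate test, which is why the two thresholds are combined with a logical AND.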
RNA extraction, transcriptome sequencing, and data analysis
Total RNA was extracted using TRIzol Reagent (Thermo Fisher Scientific, 15596018). RNA integrity was assessed using a Bioanalyzer 2100, and samples with an RNA Integrity Number (RIN) greater than 7.0 were retained. A total of 5 µg of RNA was purified using Dynabeads Oligo(dT) beads (Thermo Fisher Scientific) and fragmented at 94 °C using the Magnesium RNA Fragmentation Module. cDNA was synthesized using SuperScript II Reverse Transcriptase (Invitrogen, cat. 1896649) and ligated to a bi-exponential adapter. The cDNA library was amplified by PCR (95 °C for 3 min, followed by 8 cycles of 98 °C for 15 s, 60 °C for 15 s, and 72 °C for 30 s) to generate an average insert size of 300 ± 50 bp. The library was then sequenced on an Illumina NovaSeq 6000 platform (LC-Bio Technology CO., Ltd, Hangzhou, China) using 2 × 150 bp paired-end sequencing. Differential gene expression analysis was performed between NPC cell lines and NPEC1-BMI1 cells using both DESeq2 and edgeR software. Genes with an adjusted p-value (FDR) less than 0.05 and an absolute fold change greater than or equal to 2 were considered differentially expressed. Gene Ontology (GO) term and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were performed on these DEGs. The GO reference gene set was obtained from the official website of the Gene Ontology Consortium (https://geneontology.org/), and the KEGG reference gene set was obtained from the official website of KEGG (https://www.kegg.jp/). The original transcriptome data are available in Supplementary Raw Data 2.
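The DEG threshold above (FDR < 0.05 and absolute fold change ≥ 2) can be illustrated with a short sketch. It assumes that per-gene log2 fold changes and raw p-values have already been exported from DESeq2 or edgeR, and it assumes the Benjamini-Hochberg procedure for the FDR adjustment, which the text does not name explicitly. The function names are hypothetical.

```python
import numpy as np


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values."""
    p = np.asarray(p_values, dtype=float)
    n = p.size
    order = np.argsort(p)
    # Scale each sorted p-value by n / rank.
    scaled = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, working down from the largest p-value.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.clip(scaled, 0.0, 1.0)
    return adjusted


def select_degs(log2_fold_change, p_values, fdr_cutoff=0.05, min_fold_change=2.0):
    """Boolean mask of genes with FDR < cutoff and |fold change| >= minimum."""
    fdr = benjamini_hochberg(p_values)
    lfc = np.asarray(log2_fold_change, dtype=float)
    # An absolute fold change of 2 corresponds to |log2 FC| >= 1.
    return (fdr < fdr_cutoff) & (np.abs(lfc) >= np.log2(min_fold_change))
```

Working on the log2 scale treats up- and down-regulation symmetrically, so a gene halved in expression passes the same threshold as a gene doubled.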
CMRS detection and single-cell Raman spectroscopy pretreatment
NPEC1-BMI1 and NPC cells were dissociated into single cells, washed once with PBS, fixed with 4% paraformaldehyde for 15 min, and washed three times with PBS. Cell suspensions were air-dried on aluminum Raman slides, preserving cellular morphology. Single-cell Raman spectra were acquired using a Witec A300 confocal Raman spectrometer equipped with a 532 nm laser and a 100×/0.9 NA objective lens. The laser power was set to 12 mW. The spectrometer was calibrated against a silicon reference, with the Raman peak located at 520.73 cm⁻¹. Spectra were collected over a spectral range of 279–2187 cm⁻¹ using a 1200 lines/mm diffraction grating with a 9-second integration time. Five spectra were acquired from different positions on each cell, resulting in a total of 40 single-cell spectra per sample. Three technical replicates and three biological replicates were performed for each sample. After quality control, a total of 5685 single-cell Raman spectra were obtained, including 846 from NPEC1-BMI1 cells, 964 from 5-8F cells, 1006 from 6-10B cells, 1023 from SUNE1 cells, 820 from CNE1 cells, and 1026 from CNE2 cells. All Raman spectroscopy data are available in Supplementary Raw Data 3.
Raman spectra were pre-processed using Labspec6 software (Horiba). The spectral range was limited to 279–2187 cm⁻¹ to remove cosmic ray spikes and reduce background noise. Baseline correction was performed using a 10th-order polynomial fitting algorithm. Spectral normalization was achieved by dividing each spectrum by its total area. Biomolecule quantification was performed by integrating the area under the corresponding Raman bands. Unpaired two-sided t-tests were used to compare the mean values of different samples, and p-values were adjusted for multiple comparisons using the Bonferroni correction. Significantly different wavenumbers and their corresponding biomolecular structures are listed in Table 1.
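The preprocessing steps described above (polynomial baseline correction, area normalization, and band integration) can be sketched in NumPy. This is a simplified illustration and not the Labspec6 algorithm, whose internals are not described here. The function names, the ±10 cm⁻¹ integration window, and the shift to non-negative intensities are assumptions. The shift is needed in this sketch because a plain least-squares fit leaves residuals that sum to roughly zero, which would make the total area unusable as a divisor.

```python
import numpy as np
from numpy.polynomial import Polynomial


def trapezoid_area(x, y):
    """Area under y(x) by the trapezoidal rule."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


def preprocess_spectrum(wavenumber, intensity, degree=10):
    """Subtract a fitted polynomial baseline, then normalize to unit area."""
    # Polynomial.fit rescales x internally, keeping a 10th-order fit stable.
    baseline = Polynomial.fit(wavenumber, intensity, deg=degree)(wavenumber)
    corrected = intensity - baseline
    # Assumption: shift so that the smallest intensity is zero.
    corrected = corrected - corrected.min()
    return corrected / trapezoid_area(wavenumber, corrected)


def band_area(wavenumber, intensity, center, half_width=10.0):
    """Integrate the spectrum over center +/- half_width (cm^-1)."""
    mask = np.abs(wavenumber - center) <= half_width
    return trapezoid_area(wavenumber[mask], intensity[mask])
```

For example, `band_area(wavenumber, spectrum, 1655)` would give a relative measure of the Amide I band; per-band areas computed this way for each cell are the quantities that would then be compared between groups with t-tests.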
Table 1.
Biological attribution of Raman wavenumbers
| Raman wavenumber (cm⁻¹) | Biomolecule assignment | References |
|---|---|---|
| 405 | Glucose | [26, 27] |
| 525 | DNA/RNA backbone vibrations, phosphodiester bond stretching or bending | [28] |
| 666 | Guanine, Thymine (ring breathing modes in the DNA bases); tyrosine-G backbone in RNA | [20] |
| 780 | Cytosine/uracil ring breathing (nucleotide) | [29] |
| 860 | Tryptophan | [30] |
| 876 | Hydroxyproline | [21] |
| 900 | Saccharides | [27] |
| 934 | Proline, valine | [30] |
| 1125 | Amide III vibrations of proteins | [31] |
| 1247 | Guanine, cytosine (NH2) | [32, 33] |
| 1265 | Unsaturated lipids | [33] |
| 1305 | Amide III vibrations of proteins | [31] |
| 1440 | CH2 bending mode of proteins and lipids | [33] |
| 1580 | C-C stretching | [34] |
| 1603 | C=C in-plane bending mode of phenylalanine and tyrosine | [35] |
| 1616 | C=C stretching mode of tyrosine and tryptophan | [35] |
| 1655 | Amide I vibrations of proteins | [31] |
Considering the potential for high dimensionality, non-linearity, and complex inter-class distribution characteristics inherent in Raman spectroscopy data, this study selected and comparatively analyzed a variety of machine learning models to identify the most suitable classification method for this type of data. The models investigated include MLP, Gated Recurrent Unit (GRU), K-Nearest Neighbors (KNN), Gradient Boosting (GB), Linear Support Vector Machine (LSVM), Quadratic Discriminant Analysis (QDA), RF, Linear Discriminant Analysis (LDA), Logistic Regression (LR), and Naive Bayes (NB). All model analyses were implemented using the Python 3.9 programming language. The libraries utilized encompassed, but were not limited to: SciPy for statistical analysis, Scikit-learn for machine learning models, TensorFlow and PyTorch for deep learning models, and Matplotlib and Seaborn for data visualization.
For models that do not take image or sequence inputs, feature data were first subjected to Principal Component Analysis (PCA) for dimensionality reduction. This step aimed to remove redundant information and decorrelate the input variables, thereby enhancing model training efficiency and generalization capability. Raman spectroscopy data typically contain a large number of correlated wavenumber features; PCA can extract the principal components of variance, reduce feature dimensionality, mitigate the risk of overfitting, and provide more stable inputs for subsequent linear models.
Subsequently, we employed the following traditional machine learning models.
LSVM is suited to high-dimensional data and typically generalizes well on small-sample datasets. It was adopted as a baseline to examine classification performance under linearly separable conditions: because it identifies the optimal separating hyperplane in high-dimensional space, it can in theory distinguish linearly separable or approximately linearly separable spectral data.
LDA is a linear model that performs classification concurrently with dimensionality reduction and is appropriate when class discrimination relies primarily on linear decision boundaries. It assumes that the data follow a normal distribution and that all classes share a common covariance matrix, and it was used to investigate whether significant linear separability exists between the classes of Raman spectra when these assumptions are met.
QDA extends LDA by allowing each class its own covariance matrix, which yields quadratic decision boundaries and accommodates more complex data distributions. It was employed for scenarios in which the distributions of spectral features from different cell types are expected to differ markedly in shape and covariance structure.
RF is a robust ensemble learning method that constructs multiple decision trees and aggregates their votes. It handles high-dimensional, non-linear data, is robust to noise and missing values, and provides feature importance estimates that help identify which wavenumber ranges are crucial for classification.
GB is another ensemble learning method that improves performance by sequentially fitting weak learners (typically decision trees), each to the residual errors of the ensemble built so far. It excels at non-linear problems and complex data structures and can capture subtle discriminatory patterns in spectral data.
LR is a simple yet effective linear classification model, particularly applicable to binary classification problems. It was used to evaluate whether the PCA-reduced spectral data can be separated by a linear model and to assess the strength and direction of the relationships between features and classes.
NB is a probabilistic classifier based on Bayes’ theorem. Despite its assumption of independence between features, it is computationally efficient and can perform reasonably well on datasets with many features but relatively few samples, making it a useful rapid baseline.
KNN is a non-parametric, instance-based learning algorithm that classifies by neighborhood voting and makes no assumptions about the underlying data distribution. Because Raman spectra may exhibit complex local patterns, KNN was selected to explore whether classification patterns based on local similarity exist in the data.
In contrast, for spectral data exhibiting potential sequential characteristics, we employed deep learning models. GRU, a type of recurrent neural network particularly well-suited for processing sequential data, can capture potential long-range dependencies within the spectral data, which may be crucial for distinguishing subtly different spectral features. The selection of GRU was motivated by the consideration that Raman spectroscopy data might, in certain contexts, possess “sequential” information. For instance, the specific arrangement patterns of spectral peaks or the data acquisition process itself could contain class-discriminative information, and the recurrent structure of GRU enables it to capture such sequential dependencies. MLP, a fundamental feedforward neural network with strong non-linear fitting capabilities, can learn complex combinations of PCA-reduced spectral features, serving as a baseline for comparing the performance of deep learning models. The choice of MLP was based on its status as a classic and powerful non-linear model capable of learning intricate feature representations from high-dimensional data, thus providing a comparative benchmark for evaluating the performance of deep learning models.
Traditional machine learning models underwent hyperparameter optimization via GridSearchCV (cv = 5), and the selected optimal parameters were used for prediction on the test set. Deep learning models, including MLP and GRU, were fine-tuned based on common architectures, with a validation split ratio of validation_split = 0.2. All models were trained with a batch size of 128 and a learning rate of 0.001, employing the ReduceLROnPlateau strategy for automatic learning rate adjustment to enhance training efficiency. The single-cell Raman spectroscopy data were randomly partitioned into training and test sets at an 80:20 ratio to evaluate model performance and generalization capability. Receiver Operating Characteristic (ROC) curves were generated using the test set, and the Area Under the Curve (AUC) was calculated. Sensitivity, specificity, and overall accuracy were computed using the binary classification confusion matrix, according to the following formulas: Sensitivity = TP / (TP + FN), Specificity = TN / (TN + FP), Overall Accuracy = (TN + TP) / (TN + FP + FN + TP), where TP represents true positives, TN represents true negatives, FP represents false positives, and FN represents false negatives.
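The evaluation protocol for the traditional models (80:20 split, PCA, five-fold grid search, and confusion-matrix metrics) can be sketched with Scikit-learn. This is an illustration rather than the study's code: the helper names, the standardization step, the stratified split, the small parameter grid, and the choice of `LinearSVC` as the example classifier are all assumptions made for the sketch.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC


def binary_metrics(y_true, y_pred):
    """Sensitivity, specificity, and accuracy from the 2x2 confusion matrix."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }


def evaluate_linear_svm(spectra, labels, seed=0):
    """Tune PCA + linear SVM with 5-fold CV, then score on a held-out 20%."""
    # Assumption: the split is stratified so both classes appear in the test set.
    x_train, x_test, y_train, y_test = train_test_split(
        spectra, labels, test_size=0.2, stratify=labels, random_state=seed
    )
    pipeline = make_pipeline(StandardScaler(), PCA(), LinearSVC(max_iter=10000))
    # Hypothetical grid; the values searched in the study are not reported here.
    grid = {"pca__n_components": [5, 10], "linearsvc__C": [0.1, 1.0]}
    search = GridSearchCV(pipeline, grid, cv=5).fit(x_train, y_train)

    metrics = binary_metrics(y_test, search.predict(x_test))
    metrics["auc"] = roc_auc_score(y_test, search.decision_function(x_test))
    return metrics
```

Placing PCA inside the pipeline means it is refitted on the training folds only during cross-validation, so no information from the validation fold or the test set leaks into the dimensionality reduction.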
Differential metabolite correlation analysis with differential genes and Raman spectroscopy
This study used transcriptomics and untargeted metabolomics analyses to investigate gene expression and metabolic differences between NPEC1-BMI1 and NPC cells. DEGs and DEMs were identified and mapped to the KEGG pathway database to identify shared pathways. The top 32 DEGs and DEMs from these shared pathways were selected for further analysis. Spearman correlation coefficients were calculated between these top 32 DEGs and DEMs using R (4.1.3), and the results were visualized as cluster heatmaps. Additionally, Spearman correlation coefficients were calculated between these top 32 DEGs and biologically significant Raman spectral features, and the results were also visualized as cluster heatmaps.
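The gene-by-metabolite Spearman correlation matrix underlying such a heatmap can be computed as sketched below. The study performed this step in R; the Python version here is only an equivalent illustration, and the function name `spearman_cross_correlation` is hypothetical. It relies on the fact that Spearman's rho is the Pearson correlation of the ranks.

```python
import numpy as np
from scipy.stats import rankdata


def spearman_cross_correlation(genes, metabolites):
    """Spearman rho between every gene column and every metabolite column.

    genes       : (n_samples, n_genes) expression values
    metabolites : (n_samples, n_metabolites) abundances for the same samples
    returns     : (n_genes, n_metabolites) correlation matrix
    """
    gene_ranks = rankdata(np.asarray(genes, dtype=float), axis=0)
    met_ranks = rankdata(np.asarray(metabolites, dtype=float), axis=0)

    # Standardize the ranks column by column (population standard deviation).
    gene_z = (gene_ranks - gene_ranks.mean(axis=0)) / gene_ranks.std(axis=0)
    met_z = (met_ranks - met_ranks.mean(axis=0)) / met_ranks.std(axis=0)

    # Mean product of standardized ranks is the Pearson correlation of ranks.
    return gene_z.T @ met_z / gene_ranks.shape[0]
```

The same function applies unchanged to the DEG versus Raman feature comparison if the second argument holds per-sample band intensities instead of metabolite abundances.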
Differential gene expression in NPC and HNSCC
Spearman correlation analysis was employed to identify DEGs correlated with multiple metabolites, which may serve as biomarkers. The TCGA database (https://portal.gdc.cancer.gov/) is a comprehensive cancer genomics data resource encompassing genomic, transcriptomic, epigenetic, and proteomic information for 33 cancer types. We used the UALCAN portal (http://ualcan.path.uab.edu/index.html) to analyze the expression of these target genes in HNSCC within the TCGA database. Additionally, the GSE53819 microarray dataset (GPL6480 platform) from the GEO database (https://www.ncbi.nlm.nih.gov/geo/) was used to validate the expression of these target genes in NPC. This dataset includes 18 tumor tissues and 18 normal tissues.
Immunohistochemistry analysis
The NPC tissue microarray used in this study was provided by the Sun Yat-sen University Cancer Center and contained tumor tissue samples from 157 NPC patients. IHC staining was performed using a KYNU polyclonal antibody (Wuhan Fine Biotech Co., Ltd.). Detailed experimental procedures and reagent information are provided in Supplementary documentation 2.
Western blot analysis
Following cell lysis, total protein was extracted and subjected to Western blot analysis. The protein expression level of KYNU was detected using a KYNU polyclonal antibody (Wuhan Fine Biotech Co., Ltd.), with GAPDH serving as an internal control. Detailed experimental procedures and reagent information are provided in Supplementary documentation 2.
Quantitative polymerase chain reaction
qPCR was performed using the 2×RealStar Fast SYBR qPCR Mix (Low ROX) from GenStar Biosolutions (Beijing, China) to validate the expression levels of relevant genes. Primer sequences are provided in Supplementary documentation 2. Relative gene expression levels were calculated using the 2^(-ΔΔCt) method, with GAPDH serving as an internal control. Detailed reaction conditions and cycling parameters are provided in Supplementary documentation 2.
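As a worked illustration of the 2^(-ΔΔCt) calculation, the sketch below computes the fold change of a target gene relative to a control sample after normalizing each to its reference gene. The function name and the Ct values in the usage note are hypothetical and are not data from this study.

```python
def relative_expression(ct_target_sample, ct_reference_sample,
                        ct_target_control, ct_reference_control):
    """Fold change of the target gene by the 2^(-ddCt) method."""
    # dCt: target Ct minus reference-gene Ct, within each condition.
    delta_ct_sample = ct_target_sample - ct_reference_sample
    delta_ct_control = ct_target_control - ct_reference_control
    # ddCt: sample dCt minus control dCt.
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2.0 ** (-delta_delta_ct)
```

With hypothetical Ct values of 22 (target) and 18 (reference) in a tumor cell line and 25 and 18 in the control line, ΔΔCt = (22 − 18) − (25 − 18) = −3, so the target is expressed 2³ = 8-fold higher in the tumor line.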
Statistical analysis
Differential Raman spectral intensities and DEG expression levels are presented as mean ± standard deviation (SD). Parametric t-tests, ANOVA, non-parametric Wilcoxon rank-sum tests, or Kruskal-Wallis tests were employed to assess the statistical significance of differences in spectral intensities and DEG expression levels among different cell lines. Data visualization was performed using R (4.1.3) and GraphPad Prism (9.5.0); most plots were generated with the OmicStudio tools (https://www.omicstudio.cn/tool) and with https://www.bioinformatics.com.cn (last accessed on 10 Dec 2024), an online platform for data analysis and visualization. Unless otherwise specified, statistical significance was determined at a p-value threshold of 0.05.
Results
Differential metabolite screening and metabolic pathway analysis
To characterize the metabolic profiles of NPC cell lines (SUNE1, CNE1, CNE2, 5-8F, and 6-10B) and immortalized nasopharyngeal epithelial cells (NPEC1-BMI1), an untargeted metabolomics analysis was performed using UPLC-MS/MS. A total of 600 metabolites were identified. PCA revealed a clear clustering pattern among the six cell lines, indicating distinct metabolic profiles after dimensionality reduction (Fig. 1A).
Fig. 1.
Differential metabolite profiling and pathway analysis. A PCA clearly separated the six cell lines into distinct clusters in a two-dimensional space. B OPLS-DA effectively distinguished NPC samples from non-NPC samples. VIP values indicate the contribution of each variable to the model, with higher values indicating a more significant contribution. C Volcano plot illustrating DEMs between NPC and non-NPC groups. Metabolites with VIP values > 1.0 and p-values < 0.05 were considered statistically significant. Red dots represent up-regulated metabolites, while blue dots represent down-regulated metabolites. D Heatmap showing the expression patterns of the top 100 DEMs between NPC and non-NPC groups. The color intensity represents the relative abundance of each metabolite, with red indicating up-regulation and blue indicating down-regulation. E RF analysis identified the top 50 most important metabolites for differentiating between the two groups. F Enrichment analysis of differential metabolites in KEGG pathways
The metabolomics data of NPC cell lines (SUNE1, CNE1, CNE2, 5–8 F, and 6-10B) were combined into an NPC group, while NPEC1-BMI1 cells were designated as the non-NPC group. Inter-cell line variability was mitigated by standardizing the data, applying log2 transformation, and averaging metabolite values, generating a comprehensive NPC group metabolic profile. All cell line data underwent identical preprocessing, and quality control samples ensured data consistency. An orthogonal partial least squares-discriminant analysis (OPLS-DA) model was constructed, yielding a coefficient of determination (R²) of 0.999 and a predictive power (Q²) of 0.994 (Fig. 1B). Volcano plots were generated based on variable importance in projection (VIP) scores > 1.0 and p-values < 0.05 (Fig. 1C). Both the OPLS-DA model and volcano plot analysis indicated significant metabolic differences between the NPC and non-NPC groups. The top 100 DEMs, ranked by q-value, were selected for hierarchical cluster analysis (Fig. 1D). Subsequently, a supervised RF analysis was performed to identify the top 50 most important metabolites for distinguishing between the NPC and non-NPC groups (Fig. 1E). Fourteen metabolites were identified as potential biomarkers based on their ability to discriminate between the two groups in both the hierarchical clustering and RF analyses: taurine, sphingomyelin (d18:1/18:1, d18:2/18:0), myristoyl dihydrosphingomyelin (d18:0/14:0)*, imidazole propionate, N-acetyl-aspartyl-glutamate (NAAG), 3-ureidopropionate, asparagine, beta-alanine, cysteine, glutamate gamma-methyl ester, mannitol/sorbitol, 10-nonadecenoate (19:1n9), aspartate, and carnitine.
We performed a KEGG pathway enrichment analysis to elucidate the biological implications of the top 100 DEMs between NPC and non-NPC groups. Fifty metabolic pathways with q-values < 0.05 were identified (Fig. 1F). Notable differences between the NPC and non-NPC groups were observed in pathways including carnitine metabolism (deoxycarnitine, carnitine), fatty acid metabolism (particularly dihydroxybutyrate), and short-chain acylcarnitine metabolism (acetylcarnitine (C2)). These significant alterations in metabolites and metabolic pathways may contribute to the development and progression of NPC.
Transcriptome data analysis and pathway analysis by RNA-seq
Transcriptome sequencing identified 35,279 genes expressed across the six cell lines. PCA revealed distinct clustering patterns among the cell lines (Fig. 2A). To identify DEGs associated with NPC, we compared the five NPC cell lines (SUNE1, CNE1, CNE2, 5–8 F, and 6-10B) with the NPEC1-BMI1 cell line. To mitigate inter-cell line variability, data from the five NPC cell lines were combined, generating a composite NPC group gene expression profile through standardization and averaging. All cell line data underwent identical batch processing to ensure consistency. Genes with an adjusted p-value (q-value) < 0.05 and an absolute log2 fold change (|log2FC|) > 1 were considered significantly differentially expressed. A total of 5020 DEGs were identified, comprising 2436 upregulated genes and 2584 downregulated genes (Fig. 2B). Hierarchical clustering analysis of the top 500 DEGs, ranked by q-value, further highlighted the distinct transcriptional profiles of NPC cells (Fig. 2C).
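The DEG thresholds described above (q-value < 0.05 and |log2FC| > 1) can be sketched as follows. This is an illustrative example only: the Benjamini-Hochberg procedure is assumed as the p-value adjustment, which the text does not specify, and the function names and toy values are hypothetical.

```python
import numpy as np

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (assumed adjustment method)."""
    p = np.asarray(p_values, dtype=float)
    n = p.size
    order = np.argsort(p)
    # Scale each sorted p-value by n / rank.
    ranked = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, working down from the largest p-value.
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]
    q = np.empty(n)
    q[order] = np.clip(ranked, 0.0, 1.0)
    return q

def classify_degs(log2_fc, p_values, q_cut=0.05, lfc_cut=1.0):
    """Label each gene 'up', 'down', or 'ns' using q < q_cut and |log2FC| > lfc_cut."""
    q = benjamini_hochberg(p_values)
    lfc = np.asarray(log2_fc, dtype=float)
    labels = np.full(lfc.shape, "ns", dtype=object)
    significant = q < q_cut
    labels[significant & (lfc > lfc_cut)] = "up"
    labels[significant & (lfc < -lfc_cut)] = "down"
    return labels, q

# Toy values for five hypothetical genes (not data from the study).
log2_fc = [2.5, -1.8, 0.4, 1.2, -3.0]
p_values = [0.001, 0.004, 0.030, 0.200, 0.0005]
labels, q = classify_degs(log2_fc, p_values)
for lfc, q_value, label in zip(log2_fc, q, labels):
    print(f"log2FC={lfc:+.1f}  q={q_value:.4f}  {label}")
```

In this toy input, the third gene passes the q-value cutoff but not the fold-change cutoff, and the fourth passes the fold-change cutoff but not the q-value cutoff, so both are labeled as not significant.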
Fig. 2.
DEGs between NPC and non-NPC groups and functional enrichment analysis of these DEGs. A PCA of the six cell lines based on transcriptomic data demonstrated that the distinct cell lines are separated in the two-dimensional feature space. B Differential gene volcano plot. Genes with a q-value < 0.05 and |log2FC| >1 were considered significantly differentially expressed. Red and blue dots represent upregulated and downregulated genes, respectively. C Hierarchical clustering analysis of DEGs between NPC and non-NPC groups. Each column corresponds to one sample, and the color gradient indicates the gene expression levels. D Enrichment analysis of DEGs in KEGG Level 1 and Level 2 pathways. E GO functional enrichment analysis of DEGs. BP, CC, and MF denote biological processes, cellular components, and molecular functions
To further explore the biological functions of the DEGs, we performed a KEGG pathway enrichment analysis. A total of 6 and 44 pathways were significantly enriched at KEGG levels 1 and 2, respectively (Fig. 2D). Signal transduction was the most significantly enriched pathway among the DEGs. GO enrichment analysis identified 362 significantly enriched terms. The top 10 significantly enriched terms in the biological process (BP), cellular component (CC), and molecular function (MF) categories are shown in Fig. 2E.
Integrated analysis of transcriptomics and metabolomics data
We integrated transcriptomic and metabolomic data to gain deeper insights into the molecular mechanisms underlying NPC. A joint pathway enrichment analysis identified 35 shared pathways significantly enriched by both DEGs and DEMs (Fig. 3A). Notably, pathways such as oxidative phosphorylation, retrograde endocannabinoid signaling, sphingolipid signaling, purine metabolism, and phosphatidylinositol signaling were prominently enriched (Fig. 3B). Further analysis revealed that 18 genes and 28 metabolites were significantly upregulated, while 14 genes and 4 metabolites were significantly downregulated within these shared pathways (Fig. 3C-D). Spearman correlation analysis was performed to identify significant correlations between DEGs and DEMs within these pathways, yielding a transcript-metabolite interaction network (P < 0.05) (Fig. 3E-F). Among them, upregulated genes such as LPCAT1, KYNU, PDE10A, ALPP, GNAO1, TYMS, and NME1 were significantly positively correlated with the following metabolites: (R)-3-hydroxybutyrylcarnitine (VIP = 1.2449, q value = 1.6442E-05, Fold Change = 13.204), oleoylcarnitine (C18:1) (VIP = 1.2344, q value = 2.3710E-05, Fold Change = 184.93), 1-ribosyl-imidazoleacetate (VIP = 1.1910, q value = 3.9062E-05, Fold Change = 0.092), N-stearoyltaurine (VIP = 1.2449, q value = 6.5875E-06, Fold Change = 29.466), docosapentaenoate (n3 DPA; 22:5n3) (VIP = 1.2286, q value = 7.9329E-07, Fold Change = 27.544), eicosenoylcarnitine (C20:1) (VIP = 1.2366, q value = 4.1058E-05, Fold Change = 92.492), allantoin (VIP = 1.2457, q value = 2.3581E-07, Fold Change = 121.402), homoarginine (VIP = 1.2438, q value = 2.5925E-06, Fold Change = 52.256), trigonelline (N’-methylnicotinate) (VIP = 1.2442, q value = 2.5217E-06, Fold Change = 6.156), pipecolate (VIP = 1.2324, q value = 7.8871E-07, Fold Change = 5.406), 3-ureidopropionate (VIP = 1.2394, q value = 4.6160E-06, Fold Change = 75.092), glycochenodeoxycholate (VIP = 1.2398, q value = 1.2258E-05, Fold Change = 10.092), tyrosine (VIP = 1.2043, q value = 4.2138E-05, Fold Change = 3.018), imidazole propionate (VIP = 1.2356, q value = 5.5467E-08, Fold Change = 34.628), and pyridoxine (Vitamin B6) (VIP = 1.2361, q value = 4.1367E-07, Fold Change = 57.93). In contrast, the significantly downregulated gene CBS exhibited a strong negative correlation with several metabolites, including homoarginine, trigonelline (N’-methylnicotinate), and pipecolate. Table 2 shows that 21 DEGs from common pathways in the KEGG pathway analysis were assigned to six metabolic pathways.
To further explore the potential molecular mechanisms underlying the different subtypes of NPC, we performed a comparative analysis of DEGs and DEMs between the highly differentiated CNE1 and lowly differentiated CNE2 cell lines, as well as between the highly metastatic 5–8 F and lowly metastatic 6-10B cell lines. Hierarchical clustering analysis was used to visualize the distinct expression patterns of these genes and metabolites (Supplementary Fig. 1A-D). Notably, genes such as NME1, TYMS, and KYNU, along with metabolites like 3-ureidopropionate, 1-ribosyl-imidazoleacetate, and pyridoxine (vitamin B6), exhibited distinct clustering patterns between highly and lowly differentiated cell lines. Conversely, genes like TYMS, PDE10A, and GNAO1, as well as metabolites like pipecolate and imidazole propionate, showed significant clustering differences between high and low metastatic cell lines. These findings suggest that specific gene-metabolite correlations may underlie the molecular characteristics of different NPC subtypes.
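The gene-metabolite Spearman correlation step can be sketched as below, assuming SciPy is available. The function name, the gene and metabolite labels, and the values are hypothetical, and the sketch filters on the raw p-value only; any multiple-testing correction would need to be added separately.

```python
from scipy.stats import spearmanr

def gene_metabolite_correlations(genes, metabolites, alpha=0.05):
    """Spearman correlation for every gene/metabolite pair across samples.

    genes, metabolites: dicts mapping a name to per-sample values, with the
    same sample order in both. Returns (gene, metabolite, rho, p) tuples for
    pairs with p < alpha; these tuples can serve as network edges.
    """
    edges = []
    for gene_name, gene_values in genes.items():
        for met_name, met_values in metabolites.items():
            rho, p = spearmanr(gene_values, met_values)
            if p < alpha:
                edges.append((gene_name, met_name, float(rho), float(p)))
    return edges

# Toy data for six hypothetical samples (not data from the study).
genes = {
    "GENE_A": [1, 2, 3, 4, 5, 6],
    "GENE_B": [6, 5, 4, 3, 2, 1],
}
metabolites = {
    "MET_X": [10, 20, 30, 40, 60, 50],
    "MET_Y": [2, 6, 1, 5, 3, 4],
}
edges = gene_metabolite_correlations(genes, metabolites)
for gene_name, met_name, rho, p in edges:
    print(f"{gene_name} vs {met_name}: rho={rho:+.3f}, p={p:.4f}")
```

Only the pairs involving MET_X pass the cutoff in this toy input, one with a positive and one with a negative coefficient, which mirrors how positive and negative edges would enter a correlation network.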
Fig. 3.
Integration of Transcriptomic and Metabolomic Analyses. A Venn diagram showing the number of KEGG pathways co-enriched in the analysis of DEGs and DEMs. B Histogram showing the distribution of KEGG pathways co-enriched in DEGs and DEMs. C, D Lists of DEMs and DEGs involved in the common pathways, respectively. E Spearman correlation coefficient matrix heatmap showing the correlation between DEMs and DEGs involved in the common pathways. Red and blue colors indicate positive and negative correlations, respectively. *q < 0.05; **q < 0.01. F Correlation networks of DEMs and DEGs involved in common pathways constructed by Cytoscape. Nodes represent DEMs or DEGs, edges represent Spearman correlations between them, and the color and thickness of the edges indicate the strength and direction of the correlation. Only correlations with q < 0.05 were retained
Table 2.
DEGs enriched in metabolic pathways
| Classification level 3 | Classification level 1 | Gene number | Rich. Factor | Genes |
|---|---|---|---|---|
| Metabolic pathways | Metabolism | 19 | 0.30 | NME1 ↑; HKDC1 ↑; CHPT1 ↑; NAT8L ↑; TYMS ↑; ENPP1 ↑; KYNU ↑; PLAAT3 ↑; LPCAT1 ↑; ALPP ↑; PDE4A ↑; GLUL ↓; ITPKC ↓; NAGS ↓; ABAT ↓; ARG2 ↓; PIK3CD ↓; CBS ↓; PCBD1 ↓ |
| Propanoate metabolism | Metabolism | 2 | 0.47 | ABAT ↓; SUCLG2 ↓ |
| Pyrimidine metabolism | Metabolism | 3 | 0.41 | NME1 ↑; TYMS ↑; ENPP1 ↑ |
| Carbon metabolism | Metabolism | 4 | 0.35 | HKDC1 ↑; PIK3R1 ↓; SUCLG2 ↓; PIK3CD ↓ |
| Arginine and proline metabolism | Metabolism | 1 | 0.42 | ARG2 ↓ |
| Folate biosynthesis | Metabolism | 2 | 0.47 | PCBD1 ↓; ALPP ↑ |
Machine learning analysis of Raman spectroscopy data structures of NPC cell lines and NPEC1-BMI1 cell lines
In our previous study, we successfully identified cell-specific lipid metabolic signatures in NPC cells using CMRS coupled with machine learning, and established a high-precision diagnostic model [36]. Building upon this foundation, we integrated single-cell Raman spectroscopy data from five NPC cell lines (SUNE1, CNE1, CNE2, 5–8 F, and 6-10B) and the NPEC1-BMI1 cell line to comprehensively characterize the metabolic profiles of NPC cells using CMRS. Initially, following background removal and normalization, we obtained single-cell Raman spectra within the spectral range of 500–2000 cm⁻¹ for the five NPC cell lines and the NPEC1-BMI1 cell line (Fig. 4A). Analysis using t-distributed stochastic neighbor embedding (t-SNE), a non-linear dimensionality reduction technique (Fig. 4B), revealed a clear distinction between the Raman spectra of the NPEC1-BMI1 cell line and the NPC cell lines. Subsequently, to minimize inter-cell line variability, we merged the Raman spectroscopy data from the five NPC cell lines into a single NPC group and, through standardization and averaging, obtained single-cell Raman spectra for the NPC group and the NPEC1-BMI1 cell line within the spectral range of 300–1800 cm⁻¹ (Fig. 4C). To ensure data consistency, all cell line data underwent identical batch processing. In total, 17 wavenumbers exhibited substantial variations in the NPC group compared to the non-NPC group (Table 1), providing insights into the underlying biomolecular changes associated with NPC. Semi-quantitative analysis revealed that the NPC group exhibited a relative increase in spectral intensity at 666 and 1247 cm⁻¹ (nucleic acid) and at 780, 1580, 1603, and 1616 cm⁻¹ (protein), while showing a relative decrease at 405 cm⁻¹ (hydrocarbon), 525 cm⁻¹ (nucleic acid), 1265 cm⁻¹ (lipid), 1440 cm⁻¹ (protein, lipid), 900 cm⁻¹ (carbohydrate), and 860, 876, 934, 1125, 1305, and 1655 cm⁻¹ (protein). Supplementary Table 2 provides a comprehensive list of the biomolecular structures and their biological significance corresponding to these differentially expressed Raman spectral peaks.
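As a rough illustration of how group-level intensity differences can be derived from normalized spectra, the sketch below applies total-area normalization to each spectrum and then subtracts the group means. It assumes baseline-corrected, non-negative intensities; the function names and toy arrays are hypothetical and do not reproduce the study's preprocessing.

```python
import numpy as np

def area_normalize(spectra):
    """Scale each spectrum (one row per cell) so its summed intensity equals 1."""
    spectra = np.asarray(spectra, dtype=float)
    total = spectra.sum(axis=1, keepdims=True)
    return spectra / total

def group_mean_difference(spectra_a, spectra_b):
    """Mean normalized spectrum of group A minus that of group B, per wavenumber."""
    mean_a = area_normalize(spectra_a).mean(axis=0)
    mean_b = area_normalize(spectra_b).mean(axis=0)
    return mean_a - mean_b

# Toy spectra with three wavenumber bins (not data from the study).
group_a = [[1.0, 2.0, 1.0], [2.0, 4.0, 2.0]]
group_b = [[1.0, 1.0, 2.0]]
difference = group_mean_difference(group_a, group_b)
print(difference)
```

Positive entries mark bins where group A is relatively more intense after normalization, and negative entries mark bins where it is relatively less intense.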
Fig. 4.
Discrimination of NPC and Non-NPC Cell Lines by Single-Cell Raman Spectroscopy (SCRS). A Normalized average Raman spectra of five NPC cell lines (SUNE1, CNE1, CNE2, 5–8 F, and 6-10B) and the NPEC1-BMI1 cell line. Spectral data were baseline-corrected and normalized to the total area. B Visualization of high-dimensional data using the non-linear dimensionality reduction technique t-SNE (t-Distributed Stochastic Neighbor Embedding). C Normalized average Raman spectra of the NPC group (red) and the non-NPC group (blue, NPEC1-BMI1). Spectral data were baseline-corrected and normalized to the total area. D Analysis of SCRS data from the NPC and non-NPC cell lines using the first two principal components of the PCA model. PCA analysis revealed significant differences in Raman spectral features between the two groups. E Confusion matrix of the PC-MLP model for classifying NPC and non-NPC groups. F ROC curve demonstrating the excellent performance of the PC-MLP model in classifying NPC and non-NPC cells, with an AUC value of 1.00, indicating extremely high classification accuracy. These results suggest that single-cell Raman spectroscopy combined with machine learning models can effectively distinguish between NPC and non-NPC cells
To validate the classification performance of Raman spectroscopy data, single-cell Raman spectroscopy data were randomly divided into training and test sets at an 80:20 ratio, and dimensionality reduction was performed using PCA to 50 principal components. On the training set, hyperparameters of various machine learning models were optimized using 5-fold cross-validation (cv = 5). Subsequently, ten machine learning models, including MLP, GRU, KNN, GB, LSVM, QDA, RF, LDA, LR, and NB, were trained on the training set, and their prediction results were compared on the test set. The results demonstrated that MLP exhibited the best performance in Raman spectroscopy prediction (Supplementary Table 1), with its training results (Supplementary Fig. 2) indicating good convergence and generalization capabilities, effectively learning and generalizing data features, thus providing a reliable foundation for subsequent classification tasks.
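The workflow described above (80:20 split, PCA to 50 components, 5-fold cross-validated hyperparameter search, MLP classifier) can be sketched with scikit-learn. This is a sketch under stated assumptions rather than the study's implementation: the spectra are synthetic, the standardization step and the hyperparameter grid are assumptions, and PCA is fitted inside the pipeline so that it only sees the training folds.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic "spectra": class 1 has a stronger band in one spectral region.
rng = np.random.default_rng(0)
n_per_class, n_wavenumbers = 200, 300
X0 = rng.normal(0.0, 1.0, size=(n_per_class, n_wavenumbers))
X1 = rng.normal(0.0, 1.0, size=(n_per_class, n_wavenumbers))
X1[:, 100:120] += 1.5
X = np.vstack([X0, X1])
y = np.array([0] * n_per_class + [1] * n_per_class)

# 80:20 split, stratified so both classes appear in each set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Scaling (assumed), PCA to 50 components, then an MLP classifier.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("pca", PCA(n_components=50, random_state=0)),
    ("mlp", MLPClassifier(max_iter=500, random_state=0)),
])

# Hypothetical hyperparameter grid, tuned with 5-fold cross-validation.
param_grid = {
    "mlp__hidden_layer_sizes": [(32,), (64, 32)],
    "mlp__alpha": [1e-4, 1e-2],
}
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

# Evaluate the refitted best model on the held-out 20%.
test_accuracy = search.score(X_test, y_test)
print("Best parameters:", search.best_params_)
print(f"Held-out accuracy: {test_accuracy:.3f}")
```

Keeping the PCA step inside the pipeline means each cross-validation fold learns its own projection, which avoids leaking information from validation spectra into the dimensionality reduction.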
PCA revealed a clear separation between the NPC and non-NPC groups in the two-dimensional PCA space, indicating significant spectral differences (Fig. 4D). Based on the biomolecular attributions summarized in Table 1, we focused on several key Raman spectral regions, including 860, 1125, 1305, 1440, 1655, 780, 1247, 1580, 1603, 1616, and 1265 cm⁻¹. These spectral regions contributed significantly to PC1 and PC2 in the PCA. Notably, the Raman bands at 860, 1125, 1305, 1440, and 1655 cm⁻¹ exhibited positive loadings on both PC1 and PC2, while the bands at 780, 1247, 1580, 1603, 1616, and 1265 cm⁻¹ showed positive loadings on PC1 and negative loadings on PC2. To validate the PCA results, a PC-MLP model was developed using the PCA scores as input features and an MLP as the classifier. The binary confusion matrix for the NPC and non-NPC groups is presented in Fig. 4E and Table 3, demonstrating a prediction accuracy of 99.3%. Additionally, the ROC curve for the test set yielded an AUC of 1.00 (Fig. 4F). These results indicate that the PC-MLP model, based on Raman spectroscopy, effectively discriminates between NPC and non-NPC groups with high accuracy.
Table 3.
PC-MLP model distinguishes NPC group cell lines from non-NPC group cell lines
| PC-MLP prediction | Reference: NPC | Reference: NPEC1-BMI1 |
|---|---|---|
| NPC | 158 | 2 |
| NPEC1-BMI1 | 1 | 159 |
| Sensitivity (%) | 99.4 | – |
| Specificity (%) | – | 98.8 |
| Accuracy (%) | 99.3 | |
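The sensitivity and specificity in Table 3 follow from the confusion-matrix counts when the columns are read as reference labels and the rows as PC-MLP predictions, with NPC as the positive class. The short sketch below shows the arithmetic; the function name is hypothetical.

```python
def sensitivity_specificity(tp, fn, fp, tn):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # true positives / all reference positives
    specificity = tn / (tn + fp)  # true negatives / all reference negatives
    return sensitivity, specificity

# Counts from Table 3 (rows: prediction, columns: reference; NPC = positive).
tp, fp = 158, 2   # predicted NPC: reference NPC, reference non-NPC
fn, tn = 1, 159   # predicted non-NPC: reference NPC, reference non-NPC

sens, spec = sensitivity_specificity(tp, fn, fp, tn)
print(f"Sensitivity: {100 * sens:.1f}%")
print(f"Specificity: {100 * spec:.1f}%")
```

Here sensitivity is 158/159 and specificity is 159/161, which round to the 99.4% and 98.8% listed in the table.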
Correlation analysis between DEMs and DEGs with differential Raman spectra and differential analysis of potential Raman spectral biomarkers
We identified significant correlations between 32 DEMs and 32 DEGs within the integrated metabolic and transcriptomic pathways (Fig. 3E-F). Furthermore, we performed Spearman correlation analysis between the 17 biomolecularly significant Raman spectral features (Table 1) and the 32 DEGs and DEMs (Fig. 5A-B). Elevated Raman spectral intensities at 666 and 1616 cm⁻¹ in the NPC group were significantly positively correlated with the expression of genes such as NAT8L and PDE10A. Conversely, decreased Raman spectral intensities at 900 and 934 cm⁻¹ were significantly positively correlated with the expression of PCBD1 and PIK3R1, respectively. Additionally, Raman shifts at 876, 1305, and 1655 cm⁻¹ were significantly negatively correlated with NDUFS6, CHKA, CHPT1, and GDA expression. In the correlation analysis between Raman spectra and DEMs, elevated Raman spectral intensities at 1603, 1616, 780, 1580, and 666 cm⁻¹ in the NPC group were significantly positively correlated with metabolites such as eicosenoylcarnitine (C20:1), trans-4-hydroxyproline, docosapentaenoate (n-3 DPA; 22:5n3), (R)-3-hydroxybutyrylcarnitine, allantoin, propionylcarnitine (C3), homoarginine, trigonelline (N’-methylnicotinate), butyrylcarnitine (C4), pipecolate, 3-ureidopropionate, and glycochenodeoxycholate. Interestingly, a negative correlation was observed between the 780 cm⁻¹ Raman band and adenine levels. Decreased Raman spectral intensities at 1440, 1305, 1655, 860, and 1125 cm⁻¹ exhibited significant negative correlations with metabolites such as mannitol/sorbitol, 10-nonadecenoate (19:1n9), betaine, homoarginine, trigonelline, N-acetylaspartate (NAA), N,N,N-trimethyl-5-aminovalerate, allantoin, and propionylcarnitine (C3). In addition, the 860 and 1125 cm⁻¹ bands showed significant positive correlations with adenine and pantothenate, respectively.
Fig. 5.
Correlation Analysis of DEMs and DEGs with Raman Spectra. A, B Heatmaps demonstrating Spearman correlations between DEMs and DEGs in common pathways and differential Raman peak intensities. Red and blue colors indicate positive and negative correlations, respectively. Color labels indicate the absolute values of correlation coefficients. *q < 0.05, **q < 0.01. C–J Differences in the intensity of Raman peaks of potential biomarkers in different groups were compared. *q < 0.05, **q < 0.01, ***q < 0.001, ****q < 0.0001
Our analysis revealed that nine Raman spectral features (666, 1616, 876, 1305, 1655, 1603, 1440, 780, and 860 cm⁻¹) exhibited significant correlations with multiple DEGs and DEMs involved in integrated metabolic and transcriptomic pathways (Fig. 5C-J). Among these, the Raman spectral features at 860, 1603, 1616, and 1655 cm⁻¹ showed significant differences between the NPC and non-NPC groups. Furthermore, these spectral features also discriminated between high and low metastatic and high and low differentiated NPC cell lines, suggesting their potential as biomarkers for subtype-specific molecular features. These findings suggest a complex interplay between Raman spectral information, metabolic profiles, and gene expression patterns in NPC.
Expression analysis of potential biomarkers in HNSCC and NPC
Based on the correlation analysis between DEGs and DEMs within the integrated metabolic and transcriptomic pathways, eight key genes were identified: LPCAT1, KYNU, PDE10A, ALPP, GNAO1, TYMS, CBS, and NME1. In the TCGA database, these genes showed significantly higher expression in HNSCC tumor samples (n = 520) than in adjacent normal tissues (n = 44) (P < 0.0001) (Fig. 6A-H).
Fig. 6.
Differential Expression of Potential Transcriptomic Biomarkers in TCGA HNSCC Samples and GEO NPC Tissue Samples. A–H Expression of LPCAT1, KYNU, PDE10A, ALPP, GNAO1, TYMS, CBS, and NME1 was significantly elevated in HNSCC tissues compared to normal tissues. ****P < 0.0001. I–T KYNU, NME1, and TYMS expression was significantly elevated in NPC tissues compared with normal tissues; in the TCGA database, KYNU, NME1, and TYMS expression also differed significantly by cancer stage, tumor grade, and nodal metastasis in HNSCC patients. *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001
To validate the TCGA database findings, we selected the GSE53819 dataset from the GEO database for analysis. This dataset included whole-genome expression profiling of 18 primary NPC tumors and 18 non-cancerous nasopharyngeal tissues. The median age of NPC patients was 46 years (range: 19–77 years), and the median age of the non-cancerous cohort was 45 years (range: 18–78 years), with nearly one-third of the patients being female. All samples were collected prior to any anti-cancer treatment. This detailed clinical information, along with the paired tumor and normal tissue samples, provided an important clinical context and a reliable basis for validating the differential expression of genes in NPC tumor tissues. Our analysis confirmed the overexpression of KYNU, TYMS, and NME1 in NPC tissues (Fig. 6I-T). Notably, these genes exhibited significantly higher expression levels in HNSCC tumors compared to normal controls, regardless of tumor stage (I, II, III, and IV) or lymph node metastasis (N0 and N1) (P < 0.05). These findings suggest that KYNU, TYMS, and NME1 may serve as potential biomarkers for the diagnosis and prognosis of NPC.
KYNU is highly expressed in NPC cell lines and tissues
The results of this study demonstrated a significant trend of KYNU overexpression in NPC. First, qPCR validation revealed that KYNU mRNA levels were significantly elevated in NPC cell lines SUNE1, 5–8 F, 6-10B, and HONE1 compared to the normal nasopharyngeal epithelial cell line NP69 (Fig. 7A). Subsequently, Western blot experiments further confirmed the high expression of KYNU protein in these NPC cell lines (Fig. 7B-C). Notably, KYNU mRNA and protein levels were also significantly higher in the EBV-positive NPC cell lines CNE2-EBV and HK1-EBV compared to their corresponding EBV-negative cell lines CNE2 and HK1 (Fig. 7D-F). Supplementary documentation 1 provides the full uncropped gel and blot images. Furthermore, immunohistochemical (IHC) staining of NPC tissue microarrays revealed three distinct expression levels of KYNU (low, medium, and high) in clinical samples (Fig. 7G), suggesting a potential correlation with the clinicopathological features of NPC. In summary, the high expression of KYNU in NPC cell lines, particularly EBV-positive NPC cell lines, and its differential expression in clinical samples indicate a potentially crucial role in the development and progression of NPC.
Fig. 7.
Expression of KYNU in NPC Cell Lines and Tissues. A Quantitative PCR (qPCR) results revealed a significant increase in KYNU mRNA levels in NPC cell lines SUNE1, 5–8 F, 6-10B, and HONE1 compared to the normal nasopharyngeal epithelial cell line NP69 (**P < 0.01, ***P < 0.001, ****P < 0.0001). B, C Western Blot analysis demonstrated elevated KYNU protein levels in SUNE1, 5–8 F, 6-10B, and HONE1 cell lines compared to NP69 (*P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001). D qPCR results showed a significant increase in KYNU mRNA levels in EBV-positive NPC cell lines CNE2-EBV and HK1-EBV compared to EBV-negative cell lines CNE2 and HK1 (*P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001). E, F Western Blot analysis indicated higher KYNU protein levels in CNE2-EBV and HK1-EBV compared to CNE2 and HK1 (*P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001). G Immunohistochemical (IHC) staining of an NPC tissue microarray showed varying levels of KYNU protein expression (low, medium, and high) in clinical samples (100× and 200× magnification). HE staining illustrates the morphological characteristics of NPC tissues
Discussion
The development and progression of NPC involve a complex interplay of geographical location, environmental factors, genetic susceptibility, and EBV infection [37]. Metabolic reprogramming is considered a key characteristic of NPC cells adapting to high energy demands and promoting rapid tumor growth [38]. This study integrated Raman spectroscopy, metabolomics, and transcriptomics analyses to elucidate metabolic-related gene alterations in NPC cells and to explore the potential of KYNU as a candidate metabolic-related biomarker.
This study integrated metabolomics data to more comprehensively elucidate the molecular mechanisms of NPC, overcoming the limitations of transcriptomics in reflecting dynamic metabolic states and signaling pathway regulation [39]. Through multi-omics integrated analysis, we identified 35 significantly altered common pathways and found significant correlations between DEGs (including KYNU, NME1, and TYMS) and multiple DEMs within these pathways, suggesting that these genes may play a central role in metabolic dysregulation in NPC. We observed that the upregulation of KYNU in NPC cells was positively correlated with the levels of (R)-3-hydroxybutyryl carnitine, oleoyl carnitine (C18:1), 1-ribosylimidazoleacetic acid*, and N-stearoyl taurine. This is consistent with the role of KYNU as a key enzyme in tryptophan catabolism, which may drive fatty acid oxidation and ketone body accumulation to meet the energy demands of tumor cells through the production of kynurenine-pathway metabolites [40]. Metabolomic analysis revealed significant alterations in the tryptophan metabolism pathway in NPC cells (Fig. 1F), further corroborating the potentially crucial role of KYNU in NPC metabolic reprogramming. The upregulation of NME1 and TYMS and their positive correlations with specific metabolites suggest that they may synergistically promote NPC cell proliferation by participating in multiple pathways such as nucleotide synthesis, amino acid metabolism, and energy metabolism [41, 42].
CMRS combined with the PC-MLP model enabled high-precision discrimination between NPC cells and NPEC1-BMI1 cells, outperforming the traditional SVM model [36], which highlights the potential of Raman spectroscopy in label-free detection and revealing the metabolic heterogeneity of cancer cells [17]. The differences in Raman spectral intensity at specific wavenumbers in NPC cells, correlated with DEGs and DEMs, suggest that alterations in these genes and metabolites may lead to changes in cellular molecular composition and structure [43–46].
Notably, we found a close association between the upregulation of the KYNU gene in NPC cells and significant alterations in Raman spectra, suggesting that KYNU expression levels may influence the overall molecular composition of the cells. Specifically, the increase in intensity at 1603 cm⁻¹ associated with KYNU upregulation may reflect enhanced aromatic amino acid metabolism and changes in protein structure, while the decrease in intensity at 1655 cm⁻¹ suggests alterations in protein secondary structure. These observations are consistent with the elevated levels of KYNU-positively correlated metabolites, including (R)-3-hydroxybutyryl carnitine, oleoyl carnitine (C18:1), 1-ribosylimidazoleacetic acid*, and N-stearoyl taurine, further supporting the critical role of aromatic amino acid metabolism in providing energy and influencing cell signaling in NPC tumor cells. KYNU, as a key enzyme in the tryptophan metabolic pathway, may directly affect the metabolic flux in NPC cells by accelerating tryptophan catabolism along the kynurenine pathway, which supplies precursors for NAD⁺ synthesis and whose intermediate kynurenine activates AhR, regulating downstream gene expression. The metabolic changes we observed may be related to the increased levels of tryptophan catabolites mediated by KYNU, subsequently promoting fatty acid oxidation and ketone body production to provide energy and biosynthetic precursors for the rapid proliferation of NPC cells. The upregulation of TYMS and its positive correlation with decreased intensity at 1655 cm⁻¹ and elevated levels of metabolites such as L-arginine, L-tyrosine, 3-ureidopropionic acid, and glycochenodeoxycholic acid highlight the crucial role of nucleic acid and protein synthesis in NPC progression. The accumulation of these metabolites may provide necessary precursors for DNA and protein synthesis in tumor cells, thereby promoting cell proliferation and differentiation.
In summary, the integration of Raman spectroscopy, metabolomics, and transcriptomics provides a novel strategy for comprehensively analyzing metabolic alterations in NPC and demonstrates its potential value in distinguishing different NPC subtypes.
Through integrated omics analyses, we identified eight DEGs closely associated with metabolic alterations in NPC cells, among which KYNU, TYMS, and NME1 were significantly upregulated in NPC tissues and were closely correlated with tumor stage, grade, and lymph node metastasis, suggesting their potential as biomarkers for NPC progression. KYNU, as a key enzyme in tryptophan metabolism, is involved in immunosuppression [47]. Our validation confirmed the elevated expression of KYNU in NPC tissues and cell lines (especially in EBV-positive cell lines), further supporting its potential as an NPC biomarker. Based on the significant correlations between KYNU and various metabolites and Raman spectral features, we speculate that KYNU may influence NPC development and progression by regulating specific metabolic pathways. Our findings are consistent with observations of KYNU upregulation in other malignancies. For instance, Zhao et al. [48] found that high KYNU expression was associated with tumor clinical stage in colorectal cancer (CRC), and inhibiting KYNU significantly reduced the proliferation, migration, and invasion abilities of CRC cells, which aligns with our speculation that KYNU may promote tumor progression in NPC. The study by Lu et al. [49] showed that KYNU expression was elevated in cisplatin-resistant esophageal cancer cells and tissues, suggesting that KYNU may be involved in tumor drug resistance mechanisms. Analysis of multiple cancer types by León-Letelier et al. [50] revealed that high KYNU expression was significantly associated with an immunosuppressive tumor microenvironment and with high expression of immune checkpoint genes, including PD-L1, PD-1, and CTLA4, implying that KYNU may affect the immunotherapeutic response in NPC patients. The significant alterations in the tryptophan metabolic pathway observed in our study (Fig. 1F) also support the notion that KYNU may regulate the immune microenvironment by influencing the production of metabolites such as kynurenine. Similarly, Shen et al. [51] reported that high KYNU expression was associated with a poor immune microenvironment and poor patient prognosis in gastric cancer, and that KYNU knockdown inhibited the proliferation and invasion of gastric cancer cells. In this study, we validated the high expression of KYNU in NPC tissues and cell lines and observed its close correlation with tumor stage, grade, and lymph node metastasis, echoing the findings of Shen et al. in gastric cancer regarding the association of high KYNU expression with poor prognosis, further suggesting KYNU as a potential indicator of poor prognosis in NPC patients. In conclusion, KYNU plays a significant role in various cancers and may promote tumor progression and influence treatment response by affecting the malignant biological behavior of tumor cells and regulating the tumor microenvironment. Therefore, in-depth investigation of the functional mechanisms of KYNU in NPC and its potential clinical value, such as its use as a diagnostic and prognostic biomarker and as a target for therapeutic strategies, is of significant importance.
This study integrated Raman spectroscopy, metabolomics, and transcriptomics, providing preliminary evidence for the roles of KYNU, TYMS, and NME1 in NPC metabolic reprogramming and highlighting KYNU as a potential biomarker. However, the study has several limitations, including a limited sample size, the complexity of Raman spectroscopy and metabolomics analyses, potential discrepancies between transcriptomic data and protein expression, and insufficient functional validation. With regard to potential sex- or demographic-based biases in interpreting our metabolomics and transcriptomics data, the GEO dataset GSE53819 used for validation comprised 18 pairs of NPC tumors and normal tissues, with a median patient age of 46 years and approximately one-third of patients being female. While this dataset was an important resource for validation, its sample size and demographic distribution limit a comprehensive assessment of sex- or age-related differences. In addition, the cell lines we used were derived primarily from Asian populations, which may limit the generalizability of our findings to other ethnic groups. Future studies should therefore expand the sample size, investigate the specific mechanisms of these genes in more depth, incorporate proteomics data for validation, and further optimize the analytical methods for Raman spectroscopy and metabolomics. They should also include larger and more diverse patient cohorts and perform more detailed stratified analyses to explore potential sex- or demographic-specific differences in the metabolic and transcriptomic profiles of NPC, leading to a more comprehensive understanding of the disease’s molecular mechanisms.
Although this study offers scientific insight, translating complex multi-omics data into practical clinical applications, such as early diagnostic tools or personalized treatment strategies, remains challenging and requires careful evaluation. Nevertheless, the potential biomarker KYNU identified in this study shows promise for clinical translation. As a key enzyme that is significantly overexpressed in NPC tissues and cell lines and correlated with tumor stage, grade, and lymph node metastasis, KYNU could be further developed as a diagnostic or prognostic marker for NPC. Future research should focus on validating the clinical diagnostic and prognostic value of KYNU in larger, independent patient cohorts and on developing convenient detection methods based on KYNU expression levels, such as IHC or serological assays. Furthermore, a deeper understanding of the specific functional mechanisms of KYNU in NPC progression may provide new directions for developing therapeutic strategies targeting KYNU. Limitations of sample size and technology may restrict the applicability of our findings to specific patient subgroups, which underscores the importance of clearly stating the study’s limitations and their impact on data interpretation. Additionally, multi-omics research involves substantial patient data, so its ethical and social implications, including patient data privacy and security and strategies to prevent misuse of research findings, must be considered. By balancing the potential value of multi-omics against its inherent risks and challenges, this research can be translated more effectively towards clinical applications for NPC.
Conclusion
This study, through the integration of Raman spectroscopy, metabolomics, and transcriptomics analyses, elucidated key genes and pathways involved in the metabolic reprogramming of NPC, clarified the significant roles of KYNU, TYMS, and NME1 in NPC tumorigenesis and progression, and demonstrated that Raman spectroscopy coupled with a machine learning model can effectively discriminate NPC cells, particularly highlighting the value of KYNU as a potential metabolism-related biomarker.
Supplementary Information
Below is the link to the electronic supplementary material.
Supplementary documentation 1. The full uncropped gel and blot images
Supplementary documentation 2. Supplementary Materials and Methods
Figure S1. Clustering analysis of DEG and DEM expression in different subtypes of NPC cell lines. (A, C) Cluster analysis of DEGs and DEMs between the CNE1 and CNE2 cell lines. (B, D) Cluster analysis of DEGs and DEMs between the 5-8F and 6-10B cell lines. In all panels, each column represents a sample and each row represents a gene or metabolite.
Figure S2. Training results of the MLP model. Performance of the MLP model over 100 epochs of training, including the changes in accuracy and loss on the training and validation sets. As the number of training epochs increased, accuracy on both the training and validation sets rapidly improved and stabilized, with close agreement between the two. Simultaneously, the loss values rapidly decreased and converged, indicating that the model converges well and generalizes to the validation data, thus providing a reliable foundation for subsequent classification tasks.
Supplementary raw data 1. Original metabolomics data
Supplementary raw data 2. Original transcriptomics data
Supplementary raw data 3. Original single-cell Raman spectroscopy data
Acknowledgements
We thank Professor Mu-Sheng Zeng (Cancer Center, Sun Yat-sen University) for providing the NPEC1-BMI1, SUNE1, 5-8F, 6-10B, CNE1, and CNE2 cell lines.
Author contributions
Ziman Wu and Haiyan Yang contributed equally to the design and writing of this manuscript and should be considered co-first authors; Yafei Xu, Xiang Ji, and Dayang Chen collected and analyzed the data; Chuang Zhang, Xinying Li and Mingjie Liang performed literature searches and were responsible for data visualization; Xiuming Zhang and Dan Xiong carefully and rigorously revised the manuscript; all authors have read and approved the final manuscript.
Funding
This study was supported by the National Natural Science Foundation of China (Grant No. 81772921), the Science and Technology Planning Project of Shenzhen Municipality, China (JCYJ20230807142806014), and the Shenzhen Key Medical Discipline (No. SZXK054).
Data availability
The authors declare that the main data supporting the findings of this study are available within the article and its Supplementary files. Additional data are available from the corresponding author upon request.
Declarations
Ethics approval and consent to participate
For this study, the NPC tissue microarray was supplied by Sun Yat-sen University Cancer Center. Written informed consent was obtained from all patients regarding the use of their biopsies, with approval from the Ethics Committee of Sun Yat-sen University Cancer Center (Ethics Approval Number: GZR2018939). This study strictly adhered to the Declaration of Helsinki (as revised) and all relevant ethical guidelines and regulations. No animal experiments were involved in this study; therefore, the ARRIVE guidelines are not applicable.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Ziman Wu and Haiyan Yang should be considered joint first authors.
Contributor Information
Xiuming Zhang, Email: zhangxiuming0760@163.com.
Dan Xiong, Email: sunny543@126.com.
References
- 1. Chen YP, Chan ATC, Le QT, Blanchard P, Sun Y, Ma J. Nasopharyngeal carcinoma. Lancet. 2019;394(10192):64–80.
- 2. Huang H, Yao Y, Deng X, Huang Z, Chen Y, Wang Z, Hong H, Huang H, Lin T. Immunotherapy for nasopharyngeal carcinoma: current status and prospects (Review). Int J Oncol. 2023;63(2):97.
- 3. Tsao SW, Tsang CM, Lo KW. Epstein-Barr virus infection and nasopharyngeal carcinoma. Philos Trans R Soc Lond B Biol Sci. 2017;372(1732):20160270.
- 4. Trinh CTH, Tran DN, Nguyen LTT, Tran NT, Nguyen MTG, Nguyen VTP, Vu NTH, Dang KD, Van Vo K, Chau HC, Phan PTP, Truc Phuong MH. LMP1-EBV gene deletion mutations and HLA genotypes of nasopharyngeal cancer patients in Vietnam. Pathophysiology. 2023;30(1):1–12.
- 5. Guo R, Liang JH, Zhang Y, Lutchenkov M, Li Z, Wang Y, Trujillo-Alonso V, Puri R, Giulino-Roth L, Gewurz BE. Methionine metabolism controls the B cell EBV epigenome and viral latency. Cell Metab. 2022;34(9):1280–1297.e9.
- 6. Renaud S, Lefebvre A, Mordon S, Moralès O, Delhem N. Novel therapies boosting T cell immunity in Epstein Barr virus-associated nasopharyngeal carcinoma. Int J Mol Sci. 2020;21(12):4292.
- 7. Juarez-Vignon Whaley JJ, Afkhami M, Onyshchenko M, Massarelli E, Sampath S, Amini A, Bell D, Villaflor VM. Recurrent/metastatic nasopharyngeal carcinoma treatment from present to future: where are we and where are we heading? Curr Treat Options Oncol. 2023;24(9):1138–66.
- 8. Tang M, Dong X, Xiao L, Tan Z, Luo X, Yang L, Li W, Shi F, Li Y, Zhao L, Liu N, Du Q, Xie L, Hu J, Weng X, Fan J, Zhou J, Gao Q, Wu W, Zhang X, Liao W, Bode AM, Cao Y. CPT1A-mediated fatty acid oxidation promotes cell proliferation via nucleoside metabolism in nasopharyngeal carcinoma. Cell Death Dis. 2022;13(4):331.
- 9. Namba T, Nardelli J, Gressens P, Huttner WB. Metabolic regulation of neocortical expansion in development and evolution. Neuron. 2021;109(3):408–19.
- 10. Li L, Tang Q, Ge J, Wang D, Mo Y, Zhang Y, Wang Y, Xiong F, Yan Q, Liao Q, Guo C, Wang F, Zhou M, Xiang B, Zeng Z, Shi L, Chen P, Xiong W. METTL14 promotes lipid metabolism reprogramming and sustains nasopharyngeal carcinoma progression via enhancing m6A modification of ANKRD22 mRNA. Clin Transl Med. 2024;14(7):e1766.
- 11. Zanotelli MR, Zhang J, Reinhart-King CA. Mechanoresponsive metabolism in cancer cell migration and metastasis. Cell Metab. 2021;33(7):1307–21.
- 12. Fuller H, Zhu Y, Nicholas J, Chatelaine HA, Drzymalla EM, Sarvestani AK, Julián-Serrano S, Tahir UA, Sinnott-Armstrong N, Raffield LM, Rahnavard A, Hua X, Shutta KH, Darst BF. Metabolomic epidemiology offers insights into disease aetiology. Nat Metab. 2023;5(10):1656–72.
- 13. Nascentes Melo LM, Lesner NP, Sabatier M, Ubellacker JM, Tasdogan A. Emerging metabolomic tools to study cancer metastasis. Trends Cancer. 2022;8(12):988–1001.
- 14. Zhou J, Deng Y, Huang Y, Wang Z, Zhan Z, Cao X, Cai Z, Deng Y, Zhang L, Huang H, Li C, Lv X. An individualized prognostic model in patients with locoregionally advanced nasopharyngeal carcinoma based on serum metabolomic profiling. Life (Basel). 2023;13(5):1167.
- 15. Xue K, Cao J, Wang Y, Zhao X, Yu D, Jin C, Xu C. Identification of potential therapeutic gene markers in nasopharyngeal carcinoma based on bioinformatics analysis. Clin Transl Sci. 2020;13(2):265–74.
- 16. Zhuang Z, Li N, Guo Z, Zhu M, Xiong K, Chen S. Study of molecule variations in renal tumor based on confocal micro-Raman spectroscopy. J Biomed Opt. 2013;18(3):31103.
- 17. Yue S, Cheng JX. Deciphering single cell metabolism by coherent Raman scattering microscopy. Curr Opin Chem Biol. 2016;33:46–57.
- 18. Elumalai S, Managó S, De Luca AC. Raman microscopy: progress in research on cancer cell sensing. Sensors (Basel). 2020;20(19):5525.
- 19. Lin H, Zhou J, Wu Q, Hung TM, Chen W, Yu Y, Chang JT, Pan J, Qiu S, Chen R. Human blood test based on surface-enhanced Raman spectroscopy technology using different excitation light for nasopharyngeal cancer detection. IET Nanobiotechnol. 2019;13(9):942–5.
- 20. Chan JW, Taylor DS, Zwerdling T, Lane SM, Ihara K, Huser T. Micro-Raman spectroscopy detects individual neoplastic and normal hematopoietic cells. Biophys J. 2006;90(2):648–56.
- 21. Song Y, Zhou J, Zhao X, Zhang Y, Xu X, Zhang D, Pang J, Bao H, Ji Y, Zhan M, Wang Y, Ou Q, Hu J. Lineage tracing for multiple lung cancer by spatiotemporal heterogeneity using a multi-omics analysis method integrating genomic, transcriptomic, and immune-related features. Front Oncol. 2023;13:1237308.
- 22. Tao M, Song T, Du W, Han S, Zuo C, Li Y, Wang Y, Yang Z. Classifying breast cancer subtypes using multiple kernel learning based on omics data. Genes (Basel). 2019;10(3):200.
- 23. Zhang SQ, Pan SM, Liang SX, Han YS, Chen HB, Li JC. Research status and prospects of biomarkers for nasopharyngeal carcinoma in the era of high-throughput omics (Review). Int J Oncol. 2021;58(4):9.
- 24. Spence T, Bruce J, Yip KW, Liu FF. MicroRNAs in nasopharyngeal carcinoma. Chin Clin Oncol. 2016;5(2):17.
- 25. Tang T, Zhou Z, Chen M, Li N, Sun J, Chen Z, Xiao T, Wang X, Zhang L, Wang Y, Zhang H, Zheng X, Chen B, Ye F, Guan J. Plasma metabolic profiles-based prediction of induction chemotherapy efficacy in nasopharyngeal carcinoma: results of a bidirectional clinical trial. Clin Cancer Res. 2024;30(14):2925–36.
- 26. Góral J, Zichy V. Fourier transform Raman studies of materials and compounds of biological importance. Spectrochimica Acta Part A: Mol Spectrosc. 1990;46(2):253–75.
- 27. Wiercigroch E, Szafraniec E, Czamara K, Pacia MZ, Majzner K, Kochan K, et al. Raman and infrared spectroscopy of carbohydrates: a review. Spectrochim Acta Part A Mol Biomol Spectrosc. 2017;185:317–35.
- 28. Lin C, Li Y, Peng Y, Zhao S, Xu M, Zhang L, Huang Z, Shi J, Yang Y. Recent development of surface-enhanced Raman scattering for biosensing. J Nanobiotechnol. 2023;21(1):149.
- 29. Pezzotti G. Raman spectroscopy in cell biology and microbiology. J Raman Spectrosc. 2021;52(12):2348–443.
- 30. Zhu G, Zhu X, Fan Q, Wan X. Raman spectra of amino acids and their aqueous solutions. Spectrochim Acta Part A Mol Biomol Spectrosc. 2011;78(3):1187–95.
- 31. Barth A, Zscherp C. What vibrations tell us about proteins. Q Rev Biophys. 2002;35(4):369–430.
- 32. Ruiz-Chica AJ, Medina MA, Sánchez-Jiménez F, Ramírez FJ. Characterization by Raman spectroscopy of conformational changes on guanine–cytosine and adenine–thymine oligonucleotides induced by aminooxy analogues of spermidine. J Raman Spectrosc. 2004;35(2):93–100.
- 33. Czamara K, Majzner K, Pacia MZ, Kochan K, Kaczor A, Baranska M. Raman spectroscopy of lipids: a review. J Raman Spectrosc. 2015;46(1):4–20.
- 34. Movasaghi Z, Rehman S, Rehman DIU. Raman spectroscopy of biological tissues. Appl Spectrosc Rev. 2007;42(5):493–541.
- 35. Stone N, Kendall C, Shepherd N, Crow P, Barr H. Near-infrared Raman spectroscopy for the classification of epithelial pre-cancers and cancers. J Raman Spectrosc. 2002;33(7):564–73.
- 36. Xu J, Chen D, Wu W, Ji X, Dou X, Gao X, Li J, Zhang X, Huang WE, Xiong D. A metabolic map and artificial intelligence-aided identification of nasopharyngeal carcinoma via a single-cell Raman platform. Br J Cancer. 2024;130(10):1635–46.
- 37. Chang ET, Ye W, Zeng YX, Adami HO. The evolving epidemiology of nasopharyngeal carcinoma. Cancer Epidemiol Biomarkers Prev. 2021;30(6):1035–47.
- 38. Huang H, Li S, Tang Q, Zhu G. Metabolic reprogramming and immune evasion in nasopharyngeal carcinoma. Front Immunol. 2021;12:680955.
- 39. Maan K, Baghel R, Dhariwal S, Sharma A, Bakhshi R, Rana P. Metabolomics and transcriptomics based multi-omics integration reveals radiation-induced altered pathway networking and underlying mechanism. NPJ Syst Biol Appl. 2023;9(1):42.
- 40. Al-Mansoob M, Gupta I, Stefan Rusyniak R, Ouhtit A. KYNU, a novel potential target that underpins CD44-promoted breast tumour cell invasion. J Cell Mol Med. 2021;25(5):2309–14.
- 41. Attwood PV, Muimo R. The actions of NME1/NDPK-A and NME2/NDPK-B as protein kinases. Lab Invest. 2018;98(3):283–90.
- 42. Furuta E, Okuda H, Kobayashi A, Watabe K. Metabolic genes in cancer: their roles in tumor progression and clinical implications. Biochim Biophys Acta. 2010;1805(2):141–52.
- 43. Overman SA, Thomas GJ Jr. Raman spectroscopy of the filamentous virus Ff (fd, fl, M13): structural interpretation for coat protein aromatics. Biochemistry. 1995;34(16):5440–51.
- 44. Guleken Z, Ceylan Z, Aday A, Bayrak AG, Hindilerden İY, Nalçacı M, Jakubczyk P, Jakubczyk D, Kula-Maximenko M, Depciuch J. Detection of primary myelofibrosis in blood serum via Raman spectroscopy assisted by machine learning approaches; correlation with clinical diagnosis. Nanomedicine. 2023;53:102706.
- 45. Miura T, Thomas GJ Jr. Structure and dynamics of interstrand guanine association in quadruplex telomeric DNA. Biochemistry. 1995;34(29):9645–54.
- 46. Bai Y, Yu Z, Yi S, Yan Y, Huang Z, Qiu L. Raman spectroscopy-based biomarker screening by studying the fingerprint characteristics of chronic lymphocytic leukemia and diffuse large B-cell lymphoma. J Pharm Biomed Anal. 2020;190:113514.
- 47. Shen K, Chen B, Yang L, Gao W. KYNU as a biomarker of tumor-associated macrophages and correlates with immunosuppressive microenvironment and poor prognosis in gastric cancer. Int J Genomics. 2023;2023:4662480.
- 48. Zhao L, Wang B, Yang C, Lin Y, Zhang Z, Wang S, Ye Y, Shen Z. TDO2 knockdown inhibits colorectal cancer progression via TDO2-KYNU-AhR pathway. Gene. 2021;792:145736.
- 49. Lu Y, Zhao X, Yuan M, Zhao M, Liu K, Zhang M, Qiu X, Yu X, Liu X, Wei D, Xie J, Cheng Z. KYNU expression promotes cisplatin resistance in esophageal cancer. J Cancer. 2024;15(9):2475–85. doi:10.7150/jca.93229.
- 50. León-Letelier RA, Abdel Sater AH, Chen Y, Park S, Wu R, Irajizad E, Dennison JB, Katayama H, Vykoukal JV, Hanash S, Ostrin EJ, Fahrmann JF. Kynureninase upregulation is a prominent feature of NFR2-Activated cancers and is associated with tumor immunosuppression and poor prognosis. Cancers (Basel). 2023;15(3):834.
- 51. Shen K, Chen B, Yang L, Gao W. KYNU as a biomarker of tumor-associated macrophages and correlates with immunosuppressive microenvironment and poor prognosis in gastric cancer. Int J Genomics. 2023;2023:4662480.