Highlights
• Early diagnosis could improve lung cancer survival rate.
• The availability of blood-based screening could increase lung cancer patient uptake.
• An interdisciplinary mechanism combines metabolomics and machine learning methods.
• Metabolic biomarkers could be potential screening biomarkers for early detection of lung cancer.
• Naïve Bayes is recommended as an exploitable tool for early lung tumor prediction.
Keywords: Lung cancer, Metabolites, Biomarker, Early diagnosis, Machine learning
Abstract
Early diagnosis has been shown to improve the survival rate of lung cancer patients, and the availability of blood-based screening could increase screening uptake among early lung cancer patients. The present study sought to identify plasma metabolites in Chinese patients that can serve as diagnostic biomarkers for lung cancer. We use an interdisciplinary approach, applied here to lung cancer for the first time, that combines metabolomics with machine learning methods to detect diagnostic biomarkers for early lung cancer. A total of 110 lung cancer patients and 43 healthy individuals were enrolled. Levels of 61 plasma metabolites were obtained from a targeted metabolomic study using LC-MS/MS. Notably, a specific combination of six metabolic biomarkers enabled discrimination between stage I lung cancer patients and healthy individuals (AUC = 0.989, Sensitivity = 98.1%, Specificity = 100.0%). The five metabolic biomarkers with the highest relative importance according to the FCBF algorithm could also be potential screening biomarkers for early detection of lung cancer. Naïve Bayes is recommended as an exploitable tool for early lung tumor prediction. This research provides strong support for the feasibility of blood-based screening and offers a more accurate, rapid and integrated tool for early lung cancer diagnosis. The proposed interdisciplinary method could be adapted to cancers other than lung cancer.
Introduction
Worldwide, lung carcinoma has been the leading cause of cancer death over the past few decades. In January 2019, the National Central Cancer Registry of China (NCCRC) released its latest nationwide tumor statistics, based on population-based registry data gathered from 368 tumor registries in 2015. According to this report, lung cancer ranks first in incidence among malignant tumors in China. Moreover, the 5-year survival rate for patients with lung tumors is low, at 18%. However, the survival rate can increase to approximately 55% if lung cancer is diagnosed early. It has also been reported that early-stage patients who receive proper treatment can have a 5-year survival rate of around 40% [1]. Unfortunately, over 70% of patients are diagnosed when their tumors have progressed to advanced stages, and most of them are not suitable for surgery [2]. This is partly because current early-stage diagnostic methods are still not sensitive and specific enough. Finding robust diagnostic biomarkers of lung cancer, particularly for the diagnosis of early lung tumor progression, is therefore an important step.
Metabolomics studies have been used to identify the metabolic pathways and metabolites that regulate tumor progression and physiological function [3], [4], [5]. Metabolomics can provide information on the cellular metabolic processes that drive tumorigenesis and tumor progression. These metabolites can also help to distinguish tumor stage, histological type, and even the response to drug treatment [6]. Such changes in metabolite patterns have been used to evaluate the clinical characteristics of colorectal tumor [7], ovarian tumor [8], renal tumor [9], oral tumor [10], and pancreatic tumor [11]. For lung cancer, however, more specific and sensitive biomarkers still need to be identified through metabolic analysis.
Artificial intelligence (AI) is the capacity of machines to imitate human behavior, and it is well suited to handling large amounts of data. Machine learning is an application of AI that allows computer systems to learn automatically from experience without being explicitly programmed [12]. Fundamentally, machine learning means using algorithms to parse data, learning from it, and then making a prediction or decision about new data sets [13]. In cancer, machine learning has already been used to explore survival and prognostic prediction models in pancreatic cancer, bladder cancer, advanced nasopharyngeal carcinoma and breast cancer [14], [15], [16], [17]. In some cases, performance has become comparable to that of human experts [18]. Machine learning can be seen as an approach to designing a model that learns from experience and improves its performance [19]. These models aim to find the effective variables and the relationships between them. Over the past few years, the field of AI has moved from largely theoretical studies to real-world applications [20,21]. The application of AI in several domains is now associated with great expectations, yet a large gap remains in cancer research, especially for lung cancer.
In this study, the major aim of the metabolomics research on lung cancer was to discover clinical metabolic biomarkers that show representative alterations between lung tumor patients and healthy individuals. We also focused on biomarkers for distinguishing histological subtypes and disease stages, especially early stages. For the first time, we applied machine learning to plasma metabolite features to develop a diagnostic model for early-stage lung cancer.
Materials and methods
Patients and groups
A total of 110 patients and 43 healthy individuals from Hubei Taihe Hospital were included in this study. The Institutional Review Board of Taihe Hospital, Hubei University of Medicine approved the study involving patients and healthy individuals. All individuals provided written informed consent prior to participation in the investigation, including permission for sample collection, usage and data analysis. The final diagnosis was ascertained by clinical symptoms and histopathological examination of operative specimens. According to the TNM staging system, patients were classified as stage I (n = 54), stage II (n = 31), or stage III (n = 25). Based on the WHO classification of tumors [22], the tumors were classified as adenocarcinomas (n = 63), squamous carcinomas (n = 41) and other histological types (n = 6).
Targeted metabolomic study using LC-MS/MS
The targeted metabolomic study was performed with previously reported LC-MS methods [23,24]. In brief, plasma samples (200 μL) were thawed on ice and mixed with N-ethylmaleimide PBS buffer (10 mM, 200 μL) and 1000 μL of methanol containing 10 ng/mL of the internal standard (IS) Phe-d5. The mixed solution was then incubated for 20 min at −20 °C and centrifuged at 13,000 rpm for 10 min at 4 °C. The supernatants were dried under nitrogen flow at 4 °C and reconstituted with 30% methanol for LC-MS analysis.
The chromatographic separation was carried out with a Waters XBridge™ BEH C18 analytical column (2.5 μm, 3.0 × 100 mm; Waters, Torrance, CA) using a Waters ACQUITY UPLC coupled with a 4000 Q-TRAP mass spectrometer. The mobile phase was composed of 0.1% formic acid in water (solvent A) and methanol (solvent B), run with the following gradient program: 0–3.0 min (0%–1% B); 3.0–10.0 min (1–3% B); 10.0–14.0 min (3–50% B); 14.0–18.0 min (50–95% B); 18.0–22.0 min (95–0% B); followed by a 3-min re-equilibration step. The flow rate was 0.6 mL/min, and 10 μL was injected for LC-MS. The mass spectra were acquired in both the negative and positive electrospray ionization (ESI) modes with the following parameters: gas temperature, 450 °C; ion spray voltage, ±4500 V; ion source gas 1 (nebulizer gas), 40 psi (N2); ion source gas 2 (auxiliary gas), 40 psi (N2); curtain gas, 20 psi. Targeted MS/MS (MRM) mode was used, with the collision energy ranging from 10 V to 40 V. All LC-MS data were acquired with AB Analyst Software (Version 1.6.2). The intensity of each ion was normalized to the peak area of the IS prior to multivariate statistical analysis. For the metabolomic assay, principal component analysis (PCA) and orthogonal projection to latent structures discriminant analysis (OPLS-DA) were performed on the tested groups with SIMCA-P 14 software (Umetrics AB, Umeå, Sweden). Variables were first screened by VIP values, and variables with values exceeding 1 were considered eligible for group discrimination. The selected metabolites were further confirmed by a comparison between normal controls and disease controls, with a P-value of less than 0.05. MetaboAnalyst 3.0 was used for integrated analysis, pathway impact and enrichment.
Statistical analysis
All statistical analyses were performed using SPSS Statistics 22.0 (SPSS, Chicago, IL, United States), GraphPad Prism 8.0 (GraphPad Software, La Jolla, CA, United States), TBtools v0.6735 [25] and Orange software 3.23 [26]. Receiver operating characteristic (ROC) curve analysis was used to evaluate the diagnostic performance of metabolites. P values < 0.05 were considered statistically significant.
Machine learning methods
Six machine learning techniques, K-nearest neighbor (KNN), Naïve Bayes, AdaBoost, Support Vector Machine (SVM), Random Forest, and Neural Network, were used with 10-fold cross-validation for early lung tumor prediction based on the metabolomic biomarker features. SVM is a classification algorithm that constructs a decision boundary between two categories, enabling the prediction of labels from feature vectors [27]. KNN is a simple nonparametric classification method and is a preferred choice when there is little prior knowledge of the data [28]. Random Forest is an ensemble tree method in which trees are grown by binary recursive splitting of right-censored data [29]. Naïve Bayes is a statistical classifier used to predict class membership probabilities [30]. It assumes that all variables contribute to the classification independently and provides the result for prediction [31]. Neural Network aims to simulate neurons and the human brain. Its artificial neurons assign suitable mathematical weights to the input features, so that the network is eventually able to predict an output [32].
Stratified sampling from each group was used to select 80% of the samples, comprising stage I lung tumor patients (n = 43) and healthy individuals (n = 35), as the training set used to train all models. The remaining 20% of the samples, comprising stage I lung cancer patients (n = 11) and healthy individuals (n = 8), made up the test set used for evaluation. The training set was used to generate the prediction model, which then predicted the diagnosis of the test set. To test and compare the models, the sensitivity, specificity, precision, classification accuracy, and AUC (area under the curve) of each model were used to measure performance. The sensitivity, specificity, and classification accuracy values were obtained from the counts of true negatives (TN), false negatives (FN), true positives (TP), and false positives (FP), which are defined as follows:
TP = Lung tumor patients correctly diagnosed as patients.
FP = Healthy individuals incorrectly identified as patients.
TN = Healthy individuals correctly identified as healthy.
FN = Lung tumor patients incorrectly identified as healthy.
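The four counts defined above determine the reported metrics directly. The following is a minimal sketch of that arithmetic, not the Orange implementation used in the study; the function name `confusion_metrics` is hypothetical. As an assumption, it also computes a class-weighted average precision, because that quantity appears to reproduce the Precision column of Table 2 when the table's counts are plugged in.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Derive summary metrics from the four confusion-matrix counts.

    Positive class = lung tumor patient, negative class = healthy.
    """
    total = tp + fp + tn + fn
    n_patients = tp + fn   # all true patients
    n_healthy = tn + fp    # all true healthy individuals
    # Per-class precision; 0.0 if a class is never predicted.
    prec_patient = tp / (tp + fp) if (tp + fp) else 0.0
    prec_healthy = tn / (tn + fn) if (tn + fn) else 0.0
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / n_patients,
        "specificity": tn / n_healthy,
        "precision_patient": prec_patient,
        # Assumption: average of per-class precision weighted by class size.
        "precision_weighted": (n_patients * prec_patient
                               + n_healthy * prec_healthy) / total,
    }


# Counts taken from the KNN training-set row of Table 2.
knn_train = confusion_metrics(tp=38, fp=0, tn=35, fn=5)
print({name: round(value, 3) for name, value in knn_train.items()})
```

With the KNN training-set counts (TP = 38, FP = 0, TN = 35, FN = 5) this gives an accuracy of 0.936, a sensitivity of 0.884, a specificity of 1.000 and a weighted precision of 0.944, matching the corresponding row of Table 2.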
Results
Metabolic biomarkers for detection of early lung tumor
Our study aimed to identify metabolites that could act as promising biomarkers for distinguishing lung tumor patients from healthy individuals, and for distinguishing disease stages and pathological patterns, with high sensitivity and specificity. First, stage I lung cancer patients (n = 54) and healthy individuals (n = 43) were separated using unsupervised hierarchical clustering, with the heat map shown in Fig. 1. By the Mann–Whitney U test, 46 of the 61 metabolites (Fig. 2A and Table S1) showed a statistically significant difference (p-value < 0.05). Increased levels of l-Leucine, l-Valine, serine and 25 other metabolites were observed in stage I lung cancer patients compared with healthy individuals, whereas fumaric acid, citric acid, PC (36:4), and 15 other metabolites were downregulated. These 46 influential metabolites are potential markers for pre-clinical screening of lung tumor.
Next, these 46 influential metabolic biomarkers were used to construct ROC curves. Based on the AUC (area under the ROC curve), sensitivity and specificity, the top 10 metabolic biomarkers with higher diagnostic value (AUC > 0.800) are shown in Table 1 and Fig. S1. In addition, PCA (principal component analysis) with these 10 metabolites revealed a clear separation between stage I lung tumor patients and healthy individuals (Fig. 2B and Table S2). In this comparison, proline showed the best AUC value of 0.923 (95% CI: 0.871–0.975), with a sensitivity of 79.6% and a specificity of 93.0% at a cut-off value of 24.350.
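The AUC, optimal cut-off and Youden index reported for each metabolite can be illustrated with a short sketch. It assumes that higher levels indicate disease and that the optimal cut-off is the one maximizing the Youden index (sensitivity + specificity − 1); the study does not state how SPSS selected its cut-offs. The values below are hypothetical, not study data, and the function names are illustrative.

```python
def auc_from_pairs(patients, controls):
    """AUC as the fraction of (patient, control) pairs ranked correctly.

    This equals the Mann-Whitney U statistic divided by n1 * n2;
    ties count as half a correct pair.
    """
    wins = 0.0
    for p in patients:
        for c in controls:
            if p > c:
                wins += 1.0
            elif p == c:
                wins += 0.5
    return wins / (len(patients) * len(controls))


def youden_cutoff(patients, controls):
    """Return (cutoff, sensitivity, specificity, J) maximizing Youden's J.

    A sample is called positive when its value is >= cutoff.
    """
    best = None
    for cutoff in sorted(set(patients) | set(controls)):
        sens = sum(v >= cutoff for v in patients) / len(patients)
        spec = sum(v < cutoff for v in controls) / len(controls)
        j = sens + spec - 1.0
        if best is None or j > best[3]:
            best = (cutoff, sens, spec, j)
    return best


# Hypothetical metabolite levels (arbitrary units), not study data.
patients = [26.0, 31.5, 24.9, 28.2, 23.8, 30.0]
controls = [18.3, 21.0, 23.5, 19.7, 20.4, 25.0]

auc = auc_from_pairs(patients, controls)
cutoff, sens, spec, j = youden_cutoff(patients, controls)
print(round(auc, 3), cutoff, round(sens, 3), round(spec, 3), round(j, 3))
```

On this toy data, 34 of the 36 patient/control pairs are ordered correctly (AUC ≈ 0.944), and the cut-off 23.8 gives a sensitivity of 1.000, a specificity of 0.833 and a Youden index of 0.833.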
Table 1.
Biomarker | AUC | Std. error | Asymptotic 95% CI, lower bound | Asymptotic 95% CI, upper bound | Optimal cut off | Sensitivity | Specificity | Youden index |
---|---|---|---|---|---|---|---|---|
ROC analysis of metabolomic biomarkers and combined variates in early lung tumor detection. | ||||||||
L-Kynurenine | 0.825 | 0.043 | 0.740 | 0.909 | 0.975 | 85.2% | 72.1% | 0.573 |
Proline | 0.923 | 0.026 | 0.871 | 0.975 | 24.350 | 79.6% | 93.0% | 0.727 |
Spermidine | 0.890 | 0.035 | 0.821 | 0.958 | 7.195 | 81.5% | 90.7% | 0.722 |
Amino-hippuric acid | 0.811 | 0.045 | 0.722 | 0.900 | 4.035 | 68.5% | 93.0% | 0.615 |
Palmitoyl-l-carnitine | 0.906 | 0.032 | 0.843 | 0.969 | 3.655 | 74.1% | 100.0% | 0.741 |
Taurine | 0.920 | 0.032 | 0.856 | 0.983 | 71.300 | 88.9% | 95.3% | 0.842 |
Phenylalanine | 0.848 | 0.038 | 0.774 | 0.922 | 125.500 | 79.6% | 76.7% | 0.564 |
L-Valine | 0.876 | 0.036 | 0.806 | 0.946 | 167.000 | 68.5% | 95.3% | 0.639 |
o-Tyr | 0.822 | 0.043 | 0.738 | 0.906 | 24.650 | 83.3% | 72.1% | 0.554 |
Carnitine | 0.848 | 0.040 | 0.769 | 0.926 | 4.680 | 72.2% | 93.0% | 0.652 |
Combination of two | 0.933 | 0.028 | 0.878 | 0.978 | 0.337 | 85.2% | 93.0% | 0.782 |
Combination of three | 0.968 | 0.019 | 0.931 | 1.000 | −0.147 | 94.4% | 97.7% | 0.921 |
Combination of six | 0.989 | 0.011 | 0.967 | 1.000 | −0.102 | 98.1% | 100.0% | 0.981 |
ROC analysis of metabolomic biomarkers and combined variates of adenocarcinoma and squamous carcinoma patients. | ||||||||
L-Kynurenine | 0.423 | 0.060 | 0.306 | 0.540 | 1.050 | 77.8% | 24.4% | 0.022 |
Proline | 0.580 | 0.057 | 0.469 | 0.692 | 35.150 | 54.0% | 65.9% | 0.198 |
Carnitine | 0.536 | 0.058 | 0.422 | 0.650 | 6.835 | 38.1% | 75.6% | 0.137 |
Hypoxanthine | 0.639 | 0.055 | 0.531 | 0.746 | 0.092 | 69.8% | 56.1% | 0.259 |
Hippuric acid | 0.628 | 0.056 | 0.519 | 0.737 | 2.620 | 49.2% | 77.5% | 0.267 |
Combination of four | 0.740 | 0.049 | 0.644 | 0.837 | 0.556 | 58.7% | 78.0% | 0.368 |
Abbreviations: ROC,receiver operating characteristic; AUC, area under the curve.
Moreover, combination schemes of metabolic biomarkers based on logistic regression analysis were evaluated to enhance the sensitivity and accuracy of early-stage lung cancer diagnosis. As shown in Table 1, Table S3 and Fig. 2C, the combination of six variables (metabolites) remarkably enhanced the AUC to 0.989 (95% CI: 0.967–1.000, Sensitivity = 98.1%, Specificity = 100.0%). The metabolites used were proline, l-kynurenine (AUC = 0.825, Sensitivity = 85.2%, Specificity = 72.1%), spermidine (AUC = 0.890, Sensitivity = 81.5%, Specificity = 90.7%), amino-hippuric acid (AUC = 0.811, Sensitivity = 68.5%, Specificity = 93.0%), palmitoyl-l-carnitine (AUC = 0.906, Sensitivity = 74.1%, Specificity = 100.0%) and taurine (AUC = 0.920, Sensitivity = 88.9%, Specificity = 95.3%). These results indicate that these 6 metabolic biomarkers could act as a promising combination for early detection of lung tumor.
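Combining markers through logistic regression can be sketched as follows: fit a logistic model on several markers, then treat its linear predictor (the logit) as a single combined score and evaluate that score by ROC analysis. This is an illustrative sketch under stated assumptions, not the SPSS procedure used in the study. The gradient-descent fitter, the function names and the two-marker toy data are all hypothetical.

```python
import math


def linear_score(w, x):
    """Logit of the fitted model; serves as the combined marker value."""
    return w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit [bias, w1, ..., wk] by batch gradient descent on the log-loss."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = 1.0 / (1.0 + math.exp(-linear_score(w, xi))) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad)]
    return w


def auc(pos, neg):
    """Fraction of (positive, negative) pairs ranked correctly."""
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical standardized levels of two metabolites (not study data).
patients = [(1.0, -0.2), (-0.2, 1.0), (0.8, 0.6), (0.5, 0.4)]
healthy = [(-1.0, 0.2), (0.2, -1.0), (-0.8, -0.6), (-0.5, -0.4)]
X = patients + healthy
y = [1] * len(patients) + [0] * len(healthy)

w = fit_logistic(X, y)
auc_first = auc([p[0] for p in patients], [h[0] for h in healthy])
auc_second = auc([p[1] for p in patients], [h[1] for h in healthy])
auc_combined = auc([linear_score(w, p) for p in patients],
                   [linear_score(w, h) for h in healthy])
print(auc_first, auc_second, auc_combined)
```

In this toy example each marker alone misorders one patient/control pair (AUC = 0.9375), while the combined score separates the two groups completely (AUC = 1.0). The same mechanism explains why a combined panel can outperform its individual members, although an AUC estimated on the data used for fitting is optimistic.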
Metabolic biomarkers for disease progression
We were also interested in how metabolite levels change across stages. To identify metabolites whose levels change with tumor stage progression, the Kruskal–Wallis test was applied. Fig. 3A shows the 10 metabolites that differed significantly among stage I (n = 54), stage II (n = 31), and stage III (n = 25) lung tumor patients and healthy individuals (n = 43). Although there were statistically significant differences between lung tumor patients and healthy individuals (Fig. S2 and Table S4), the metabolic biomarkers performed poorly in discriminating between stage I, II, and III lung cancer patients (Fig. 3B–D, Table S5 and Table S6). Both the non-parametric tests and the receiver operating characteristic curves suggested persistently abnormal metabolite levels in lung cancer patients compared with healthy individuals.
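For readers unfamiliar with the Kruskal–Wallis test, the statistic H compares the rank sums of several groups. The sketch below computes H with average ranks for tied values and without the tie correction factor. It is a simplified illustration rather than the SPSS routine used in the study, and the group values are hypothetical.

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (average ranks, no tie correction)."""
    pooled = sorted(v for g in groups for v in g)
    n = len(pooled)
    # Map each distinct value to the mean of the 1-based ranks it occupies.
    rank_of = {}
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        i = j
    rank_term = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    return 12.0 / (n * (n + 1)) * rank_term - 3.0 * (n + 1)


# Hypothetical metabolite levels in three groups (not study data).
healthy = [18.3, 19.7, 20.4]
stage_one = [24.9, 26.0, 28.2]
stage_two = [30.0, 31.5, 33.1]

h_stat = kruskal_wallis_h([healthy, stage_one, stage_two])
print(round(h_stat, 3))
```

For these completely separated groups H = 7.2, which exceeds 5.991, the chi-square critical value for 2 degrees of freedom at the 0.05 level. With groups this small the chi-square approximation is rough, so the comparison is only indicative.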
Metabolic biomarkers for evaluation of different histological types
Predicting the histological type of lung cancer, particularly distinguishing squamous carcinoma from adenocarcinoma, is a significant diagnostic requirement in clinical practice. Several clinical studies have demonstrated that the toxicity and efficacy of treatment differ by tumor histological type [33]. Identifying the tumor histological type will therefore help to improve treatment efficiency. We used the metabolite levels of the adenocarcinoma (n = 63) and squamous carcinoma (n = 41) patients to construct ROC curves and performed the Mann–Whitney U test. Only 2 metabolites differed significantly between adenocarcinoma and squamous carcinoma: hippuric acid (p-value = 0.029) and hypoxanthine (p-value = 0.017). In the ROC analysis (Fig. S3), hypoxanthine showed an AUC value of 0.639 (95% CI: 0.531–0.746), with a sensitivity of 69.8% and a specificity of 56.1% at a cut-off value of 0.092. Hippuric acid showed an AUC value of 0.628 (95% CI: 0.519–0.737), with a sensitivity of 49.2% and a specificity of 77.5% at a cut-off value of 2.620. As shown in Fig. 2D and Table 1, the combination of four variates, namely hypoxanthine, l-Kynurenine (AUC = 0.423, sensitivity = 77.8%, specificity = 24.4%), proline (AUC = 0.580, sensitivity = 54.0%, specificity = 65.9%) and carnitine (AUC = 0.536, sensitivity = 38.1%, specificity = 75.6%), enhanced the AUC to 0.740 (95% CI: 0.644–0.837, sensitivity = 58.7%, specificity = 78.0%). These results indicate that the metabolic biomarkers performed poorly in distinguishing lung tumor histological types in our study, since all AUC values were below 0.800, with poor sensitivity and specificity.
Utilization of machine learning methods
To develop the early lung tumor prediction model, we considered six machine learning techniques: K-nearest-neighbor (KNN), Naïve Bayes, AdaBoost, Support Vector Machine (SVM), Random Forest, and Neural Network (Fig. 4A). The training set of stage I lung tumor patients (n = 43) and healthy individuals (n = 35) was used to develop machine learning models based on the metabolic biomarker features. On the Orange 3.23 platform, all 61 metabolites were used as features to develop the machine learning models. To determine which model would provide the most precise predictions on the lung tumor metabolomics data, the sensitivity, specificity, precision, classification accuracy, and AUC value of the six machine learning models were assessed (Supplementary methods), and the best value for each evaluation metric is highlighted in Table 2. On the training set, Naïve Bayes and Neural Network gave better results than the other techniques (KNN, AdaBoost, SVM, and Random Forest): their precision, classification accuracy, specificity, sensitivity and AUC were all 100.0%. AdaBoost showed the poorest performance (precision = 0.885, classification accuracy = 0.885, specificity = 0.886, sensitivity = 0.884, and AUC = 0.885).
Table 2.
Data set | Model | TP | FP | TN | FN | Classification accuracy | Sensitivity | Specificity | AUC | Precision |
---|---|---|---|---|---|---|---|---|---|---|
Training set | KNN | 38 | 0 | 35 | 5 | 0.936 | 0.884 | 1.000 | 1.000 | 0.944 |
Training set | SVM | 43 | 2 | 33 | 0 | 0.974 | 1.000 | 0.943 | 1.000 | 0.975 |
Training set | Random Forest | 41 | 0 | 35 | 2 | 0.974 | 0.953 | 1.000 | 1.000 | 0.976 |
Training set | Neural Network | 43 | 0 | 35 | 0 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
Training set | Naïve Bayes | 43 | 0 | 35 | 0 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
Training set | AdaBoost | 38 | 4 | 31 | 5 | 0.885 | 0.884 | 0.886 | 0.885 | 0.885 |
Test set | KNN | 9 | 0 | 8 | 2 | 0.895 | 0.818 | 1.000 | 1.000 | 0.916 |
Test set | SVM | 10 | 0 | 8 | 1 | 0.947 | 0.909 | 1.000 | 1.000 | 0.953 |
Test set | Random Forest | 11 | 2 | 6 | 0 | 0.895 | 1.000 | 0.750 | 1.000 | 0.911 |
Test set | Neural Network | 10 | 0 | 8 | 1 | 0.947 | 0.909 | 1.000 | 1.000 | 0.953 |
Test set | Naïve Bayes | 11 | 0 | 8 | 0 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
Test set | AdaBoost | 4 | 0 | 8 | 7 | 0.632 | 0.364 | 1.000 | 0.682 | 0.804 |
Abbreviations: AdaBoost, Adaptive Boosting; SVM, support vector machines; KNN, k-nearest neighbor; TN, true negative; FN, false negative; TP, true positive; FP, false positive; AUC, area under the curve.
Subsequently, we used the trained machine learning models to classify the test set, which consisted of stage I lung cancer patients (n = 11) and healthy individuals (n = 8), to evaluate their performance. As shown in Table 2, the Naïve Bayes model showed the best performance on all evaluation parameters. The specificity of Naïve Bayes, Neural Network, KNN, AdaBoost, and SVM was 1.000, which indicates good predictive power for healthy individuals. For overall quality of prediction, the AUC of Naïve Bayes, Neural Network, KNN, Random Forest, and SVM was 1.000. The sensitivity of Naïve Bayes and Random Forest was 1.000, and that of SVM and Neural Network was 0.909, which means few false negatives (FN). In a medical setting, false negatives matter more than false positives (FP) [34]. Consequently, these four models (Naïve Bayes, Random Forest, SVM and Neural Network) were appropriate for the diagnosis of early lung tumor. In classification accuracy, the Naïve Bayes model reached 1.000, followed by the SVM and Neural Network models at 0.947. All three models are accurate enough to be considered for medical application. Naïve Bayes is a simple probabilistic classifier based on applying Bayes' theorem with strong independence and normality assumptions between the variables [35]. It is one of the most effective machine learning algorithms and has been widely employed for prediction [36]. Taken together, our study showed that Naïve Bayes was the best model, with the highest sensitivity, specificity, and accuracy. Therefore, Naïve Bayes is recommended as an exploitable tool for early lung tumor prediction.
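The independence and normality assumptions mentioned above can be made concrete with a small Gaussian Naïve Bayes sketch. This is an illustrative implementation, not the Orange Naïve Bayes learner used in the study, which may treat continuous features differently. The function names, feature values and labels are hypothetical.

```python
import math


def fit_gaussian_nb(X, y):
    """Estimate a log-prior and per-feature mean/variance for each class."""
    model = {}
    for label in sorted(set(y)):
        rows = [x for x, t in zip(X, y) if t == label]
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # Population variance plus a small floor to avoid division by zero.
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(zip(*rows), means)]
        model[label] = (math.log(n / len(y)), means, variances)
    return model


def predict_gaussian_nb(model, x):
    """Pick the class with the highest log-posterior.

    Independence assumption: per-feature log-likelihoods are simply summed.
    Normality assumption: each feature is Gaussian within a class.
    """
    def log_posterior(label):
        log_prior, means, variances = model[label]
        log_lik = sum(-0.5 * math.log(2 * math.pi * var)
                      - (v - m) ** 2 / (2 * var)
                      for v, m, var in zip(x, means, variances))
        return log_prior + log_lik
    return max(model, key=log_posterior)


# Hypothetical levels of two metabolites per sample (not study data).
X_train = [[80, 30], [75, 28], [90, 33], [60, 20], [55, 22], [65, 18]]
y_train = ["patient", "patient", "patient", "healthy", "healthy", "healthy"]

nb_model = fit_gaussian_nb(X_train, y_train)
predictions = [predict_gaussian_nb(nb_model, x) for x in ([85, 31], [58, 21])]
print(predictions)
```

On this toy data the first new sample is assigned to the patient class and the second to the healthy class. Because each class needs only a mean and a variance per feature, the model has few parameters, which may partly explain why it remains stable on a small data set.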
The Fast Correlation-Based Filter (FCBF) algorithm is a supervised method based on information theory [37]. It both identifies features that are correlated with the class and eliminates redundant features [38]. Based on the symmetrical uncertainty (SU), FCBF ranks features in descending order of correlation and discards redundant features that are less correlated [39]. The result is an optimal subset of correlated, non-redundant features, which could maximize the diagnostic potential of the extractable information. The relative importance of each metabolic biomarker feature was determined using the FCBF algorithm. All metabolite features were ranked and scored according to their ability to discern the classification label of an object (Table S7). According to the ranking, the top 8 metabolic biomarkers were used to develop the machine learning models. Table S8 and Fig. S4 show how the AUC values changed with the number of variates. When the top 5 variates, taurine, Palmitoyl-l-carnitine, proline, 2-DG, and PE (36:4), were used as metabolite features, the prediction model showed excellent performance, similar to that of the previous models. Therefore, these 5 metabolic biomarkers could be potential candidates for pre-clinical screening of lung cancer. Based on the results of logistic regression analysis, the combination of the top three variates (taurine, Palmitoyl-l-carnitine, and proline) showed an AUC value of 0.968 (95% CI: 0.931–1.000), with a sensitivity of 94.4% and a specificity of 97.7% in classical analysis (Table 1).
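The ranking-and-pruning idea behind FCBF can be sketched in a few lines. Symmetrical uncertainty is SU(X, Y) = 2·I(X; Y) / (H(X) + H(Y)), where H is entropy and I is mutual information. The sketch below assumes features that are already discretized (here, binary high/low levels) and is not the Orange implementation used in the study. The feature names and values are hypothetical.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum(c / n * math.log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """SU = 2 * mutual information / (H(x) + H(y)), between 0 and 1."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0
    mutual_info = hx + hy - entropy(list(zip(x, y)))
    return 2.0 * mutual_info / (hx + hy)


def fcbf(features, labels, threshold=0.0):
    """Rank features by SU with the class, then drop redundant ones.

    A feature is dropped when an already selected, higher-ranked feature is
    at least as correlated with it as the feature is with the class.
    """
    su_class = {name: symmetrical_uncertainty(col, labels)
                for name, col in features.items()}
    ranked = sorted((n for n in features if su_class[n] > threshold),
                    key=lambda n: su_class[n], reverse=True)
    selected = []
    for name in ranked:
        if all(symmetrical_uncertainty(features[s], features[name])
               < su_class[name] for s in selected):
            selected.append(name)
    return selected


# Hypothetical binarized levels (1 = high, 0 = low); 1 in labels = patient.
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
features = {
    "marker_a": [1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
    "marker_a_like": [1, 1, 1, 0, 0, 0, 0, 0, 0, 0],  # mostly repeats marker_a
    "marker_c": [1, 0, 1, 1, 1, 0, 0, 1, 0, 0],
}

selected = fcbf(features, labels)
print(selected)
```

On this toy data `marker_a_like` ranks second by SU with the class but is discarded because it is more strongly associated with `marker_a` than with the class, whereas the weaker but complementary `marker_c` is retained. Continuous metabolite levels would need a discretization step first, and that choice affects the ranking.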
To validate the diagnostic performance of the machine learning models and to demonstrate the specificity of the metabolic biomarker features of early lung cancer patients found in our study, we performed a control experiment [40]. We created a scrambled training set in which the correct labels were replaced with randomly assigned labels. As expected, the accuracy of our machine learning models dropped markedly, to a level equivalent to random guessing, showing no predictive value and thereby validating the specific predictive signature of the metabolic biomarker features (Fig. 4B).
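The logic of this label-scrambling control can be sketched as follows. As simplifying assumptions, the sketch uses a nearest-centroid classifier as a stand-in for the six models and leave-one-out accuracy instead of the separate test set used in the study. The data and function names are hypothetical.

```python
import random


def loo_accuracy(X, y):
    """Leave-one-out accuracy of a nearest-centroid classifier."""
    correct = 0
    for i in range(len(X)):
        centroids = {}
        for label in sorted(set(y)):
            rows = [x for j, (x, t) in enumerate(zip(X, y))
                    if t == label and j != i]
            if rows:
                centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        predicted = min(
            centroids,
            key=lambda lab: sum((a - b) ** 2
                                for a, b in zip(X[i], centroids[lab])))
        correct += predicted == y[i]
    return correct / len(X)


def scrambled_accuracies(X, y, n_rounds=200, seed=0):
    """Repeat the evaluation with randomly permuted labels."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_rounds):
        shuffled = list(y)
        rng.shuffle(shuffled)
        results.append(loo_accuracy(X, shuffled))
    return results


# Hypothetical levels of two metabolites per sample (not study data).
X = [(80, 30), (75, 28), (90, 33), (85, 31), (78, 29), (88, 32),
     (60, 20), (55, 22), (65, 18), (58, 21), (62, 19), (57, 23)]
y = ["patient"] * 6 + ["healthy"] * 6

true_accuracy = loo_accuracy(X, y)
scrambled = scrambled_accuracies(X, y)
mean_scrambled = sum(scrambled) / len(scrambled)
print(true_accuracy, round(mean_scrambled, 3))
```

With the correct labels every held-out sample is classified correctly, whereas the mean accuracy over scrambled labelings falls to roughly chance level or below. A large gap of this kind is what indicates that the model relies on a genuine association between features and labels rather than on an artifact of the procedure.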
Discussion
Lung cancer is the leading cause of cancer-related mortality worldwide, and early diagnosis could improve the survival rate. At present, high-risk people are generally recommended to undergo annual radiologic screening by low-dose computed tomography (LDCT) [41], which is also the only clinical method of lung cancer detection. Owing to its significant cost and high false-discovery rate [42], implementation of CT screening remains unsatisfactory. Therefore, the availability of blood-based screening, including detection of plasma metabolic biomarkers, could increase screening uptake among lung cancer patients. Our present study sought to identify plasma metabolites in Chinese patients as predictive biomarkers for lung cancer diagnosis. In this work, we use an interdisciplinary approach, applied here to lung cancer for the first time, that combines metabolomics and machine learning methods to detect diagnostic biomarkers for early lung cancer. The highly sensitive and accurate metabolomics technology and the machine learning methods distinguish our work from previous studies that used miRNAs as biomarkers for early lung cancer diagnosis.
Previous studies of breast cancer, prostate cancer and other cancers have mainly screened miRNAs as biomarkers for distinguishing cancer patients and different tumor subtypes [43]. Nevertheless, miRNA screening has limitations, such as high cost and technical monopolization [44]. Routine clinical blood tests for antigens such as CEA, CA125 and SCC still suffer from low sensitivity and low accuracy [45]. At the same time, a small biopsy specimen involves trauma and can yield false negatives because of the limited sampling site [46]. In the present study, we focused on metabolic biomarkers as diagnostic biomarkers for early-stage lung cancer. We identified a new range of metabolites by determining their plasma profiles in patients prior to the clinical diagnosis of lung tumor. Among the 61 metabolites, we found 10 metabolic biomarkers that could act as promising biomarkers for early detection of lung tumor. In particular, the combination of six variates remarkably enhanced the AUC to 0.989, with a sensitivity of 98.1% and a specificity of 100.0%, which has not been reported so far. Based on these results, we recommend the specific combination of these six metabolites as a screening biomarker panel for early detection of lung tumor. We believe that this finding could allow us to develop a specific, sensitive, and minimally invasive tool for early lung tumor prevention and prediction.
To date, machine learning approaches, which provide a promising alternative to classical data analysis methods, have been utilized in varied biomedical applications, including drug discovery [47] and biomarker development [48]. Unlike conventional types of data analysis, which require prior awareness of biological dependencies, machine learning can improve diagnostic capability through abundant, high-quality data. One team collected 23 items of demographic data and tumor-related parameters from 102 cervical cancer patients who had undergone radical hysterectomy, and investigated diverse machine learning models to predict the 5-year survival rate [49]. Another group used colposcopy findings and human papillomavirus (HPV) biomarkers from 2267 women to develop a clinical decision support scoring system based on artificial neural networks (ANN) for cervical intraepithelial neoplasia patients, in which the ANN predicted with higher accuracy than cytology with or without an HPV test [50]. Moreover, it has been shown that machine learning can be used to identify a serum microRNA panel as a potential biomarker for the detection of gastric cancer [51]. Six types of machine learning techniques were used to select three biomarkers (miR-21-5p, miR-29c-3p, and miR-22-3p) from a published miRNA profiling study (GSE23739). Here, we used metabolic biomarkers as machine learning features for lung cancer diagnosis for the first time. The five metabolic biomarkers with the highest relative importance according to the FCBF algorithm (taurine, Palmitoyl-l-carnitine, proline, PE (36:4) and 2-DG) could be potential candidates for pre-clinical screening of lung cancer. In this study, several machine learning models were applied and compared, and they were evaluated with the test set and the control experiment (Fig. 4B and Table 2). The evaluation gave the highest assessment values for Naïve Bayes, and decent values for Neural Network and SVM. For standalone screening models, high sensitivity is desirable to minimize false negatives. As shown in Table 2, the Naïve Bayes, Random Forest, Neural Network, and SVM models, which had high sensitivity, have dependable and stable potential for early lung tumor prediction. Furthermore, the specificity thresholds of the Naïve Bayes, Neural Network, and SVM models defined in the training set achieved a similar specificity when applied to the test set, indicating that the models are well calibrated. Taken together, Naïve Bayes, Neural Network, and SVM, based on the metabolic biomarker features, may be useful for the diagnosis of early lung tumor. These results provide strong support for the feasibility of blood-based screening based on metabolomics technology and machine learning for early lung cancer diagnosis.
Cancer progression is strongly related to cellular metabolism, and dysregulated metabolism contributes to tumor initiation and progression [52]. Cancer cells alter their metabolic pathways, which require specific enzymes to catalyze biochemical reactions, to meet the growing demands of rapid cell proliferation and division. Metabolites, including the metabolic biomarkers for lung cancer diagnosis found in our study, have functions related to tumorigenesis and tumor progression. Proline dehydrogenase (PRODH), which catalyzes the first step of proline degradation, is activated by lymphoid-specific helicase (LSH) to decrease proline levels [53]. Proline catabolism involving PRODH has been shown either to promote tumor survival through ROS-induced autophagy or ATP production, or to act as a tumor suppressor by initiating ROS-mediated apoptosis, depending on the tumor microenvironment [54,55]. One recent study found that PRODH promotes lung cancer tumorigenesis by eliciting the expression of IKKα-dependent inflammatory genes and epithelial to mesenchymal transition (EMT) [56]. High levels of l-kynurenine could provide a microenvironment for lung tumor growth by initiating T-cell apoptosis, inhibiting T-cell proliferation and leading to immune tolerance [57,58]. Spermidine/spermine N1-acetyltransferase (SSAT) is the pivotal protein involved in the homeostasis and synthesis of spermidine [59]. SSAT catalyzes the rate-limiting step in the catabolism of polyamines, which play a particular role in maintaining the membrane potential and regulating cell volume [60, 61]. Recent studies have reported that SSAT is upregulated in lung cancer [62]. 2-Deoxy-d-glucose (2-DG), a glucose analogue, is converted to 2-DG-P by hexokinase. 2-DG-P cannot be metabolized, but it can allosterically suppress hexokinase, which is the rate-limiting enzyme of glycolysis [63]. By blocking glycolysis, 2-DG influences various biological processes. It can inhibit N-linked glycosylation, increase oxidative stress, and efficiently suppress cell growth and invasion [64]. The amino acid 2-aminoethanesulfonic acid, commonly known as taurine, has widespread physiological effects and has been confirmed as an endogenous anti-injury substance [65]. It can up-regulate the expression of N-acetyl galactosaminyl transferase 2, down-regulate the expression of matrix metalloproteinase-2, and inhibit potential invasion and metastasis [66]. Previous studies have proposed that changes in taurine levels could be used to predict the malignant transformation and formation of breast, bladder and colorectal tumors [67], [68], [69].
In March 2020, Chabon et al. integrated genomic features for non-invasive early lung cancer detection [70], providing an initial demonstration that machine learning methods can be used for lung cancer detection. Based on cell-free DNA (cfDNA) features, the researchers developed and prospectively validated a machine-learning method termed ‘lung cancer likelihood in plasma’ (Lung-CLiP), which could discriminate early lung cancer patients from controls. In the screening test, they observed a sensitivity of 63% for stage I lung cancer patients at 80% specificity. By comparison, the Naïve Bayes, Neural Network, and SVM models developed in our study, which use metabolic biomarkers as features, achieved sensitivity and specificity > 85%. Although our machine learning models are less accurate than LDCT, this strategy could potentially increase the total number of patients screened. As this strategy progresses into clinical trials, more abundant sample data will allow performance to be improved by using more advanced machine-learning algorithms.
Nevertheless, our current study still has several limitations. First, our analysis was built on data from a single healthcare institution in a limited geographic region. In addition, apart from non-small cell lung cancer, the number of patients with other types of lung cancer was inadequate. Therefore, further confirmatory studies at other institutions are necessary prior to implementation. Second, our data included only metabolite levels. More information on lung tumor patients and healthy individuals, such as age, smoking history, concurrent tumor diagnoses, and past medical history, would be helpful for further study. We propose that integrating plasma metabolic biomarkers with CT screening or other lung tumor features could further improve performance.
In any case, a proper method is needed to apply this strategy in clinical practice, for example by combining it with an electronic chip system so that plasma testing and model application can run as an assembly line. One potential application of our study is as a preliminary screening for some lung cancer patients. Despite being candidates for LDCT as high-risk individuals, these patients are not being screened because of concerns about false positives, limited access, and other reasons. Patients who test positive would then be referred to LDCT screening. Additionally, by modifying the machine learning methods and incorporating features appropriate for other cancer types, we expect that it will be feasible to develop strategies combining metabolomics and machine learning for the diagnosis of a diverse range of malignancies.
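The referral step in this two-stage workflow can be made concrete with a small worked calculation. The sketch below is purely illustrative: the function name, the cohort size of 10,000, and the 1% prevalence are hypothetical assumptions, and the sensitivity and specificity of 0.85 simply echo the > 85% figures quoted above rather than measured operating points.

```python
from typing import Dict


def screening_outcomes(
    n_screened: int, prevalence: float, sensitivity: float, specificity: float
) -> Dict[str, float]:
    """Expected counts when a blood test is used as a first-line screen
    and every positive result is referred to LDCT."""
    diseased = n_screened * prevalence
    healthy = n_screened - diseased
    true_positives = diseased * sensitivity          # cancers correctly referred
    false_positives = healthy * (1.0 - specificity)  # healthy people referred
    referred = true_positives + false_positives
    # Positive predictive value: share of referrals that truly have cancer.
    ppv = true_positives / referred if referred else 0.0
    return {
        "referred": referred,
        "true_positives": true_positives,
        "false_positives": false_positives,
        "missed_cancers": diseased - true_positives,
        "ppv": ppv,
    }


# Hypothetical cohort and prevalence, for illustration only.
result = screening_outcomes(
    n_screened=10_000, prevalence=0.01, sensitivity=0.85, specificity=0.85
)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Under these assumed inputs, roughly 1,570 of 10,000 people would be referred to LDCT, of whom about 85 would have cancer. This shows why specificity matters as much as sensitivity when the blood test acts as a gatekeeper for LDCT.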
Conclusions
A pioneering interdisciplinary method was proposed in this study to detect early lung cancer diagnostic biomarkers by combining metabolomics and machine learning methods. Metabolic biomarkers demonstrate significant diagnostic strength for the early detection of lung tumors.
CRediT authorship contribution statement
Ying Xie: Methodology, Investigation, Writing - review & editing. Wei-Yu Meng: Writing - original draft, Methodology, Formal analysis. Run-Ze Li: Writing - original draft, Conceptualization, Investigation. Yu-Wei Wang: Software, Data curation. Xin Qian: Investigation, Resources. Chang Chan: Investigation, Resources. Zhi-Fang Yu: Investigation, Resources. Xing-Xing Fan: Resources. Hu-Dan Pan: Resources. Chun Xie: Resources, Project administration. Qi-Biao Wu: Resources. Pei-Yu Yan: Resources. Liang Liu: Resources, Supervision. Yi-Jun Tang: Resources, Supervision. Xiao-Jun Yao: Conceptualization, Writing - review & editing, Funding acquisition. Mei-Fang Wang: Conceptualization, Resources, Writing - review & editing. Elaine Lai-Han Leung: Conceptualization, Writing - review & editing, Funding acquisition.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Funding
This work was funded by The Science and Technology Development Fund, Macau SAR (File no: 0096/2018/A3, 0003/2019/AKP, 0055/2020/A) and the NSFC overseas and Hong Kong and Macao Scholars Cooperative Research Fund Project (Project no: 81828013).
Footnotes
Supplementary material associated with this article can be found, in the online version, at doi:10.1016/j.tranon.2020.100907.
Contributor Information
Xiao-Jun Yao, Email: xjyao@must.edu.mo.
Mei-Fang Wang, Email: wmfpps02@hotmail.com.
Elaine Lai-Han Leung, Email: lhleung@must.edu.mo.
References
- 1. Ettinger D.S., Wood D.E., Aisner D.L., Akerley W., Bauman J., Chirieac L.R., D'Amico T.A., DeCamp M.M., Dilling T.J., Dobelbower M., Doebele R.C., Govindan R., Gubens M.A., Hennon M., Horn L., Komaki R., Lackner R.P., Lanuti M., Leal T.A., Leisch L.J., Lilenbaum R., Lin J., Loo B.W., Jr., Martins R., Otterson G.A., Reckamp K., Riely G.J., Schild S.E., Shapiro T.A., Stevenson J., Swanson S.J., Tauer K., Yang S.C., Gregory K., Hughes M. Non-small cell lung cancer, version 5.2017, NCCN clinical practice guidelines in oncology. J. Natl. Compr. Cancer Netw. 2017;15:504–535. doi: 10.6004/jnccn.2017.0050.
- 2. Chen Z., Fillmore C.M., Hammerman P.S., Kim C.F., Wong K.K. Non-small-cell lung cancers: a heterogeneous set of diseases. Nat. Rev. Cancer. 2014;14:535–546. doi: 10.1038/nrc3775.
- 3. Jin X., Yun S.J., Jeong P., Kim I.Y., Kim W.J., Park S. Diagnosis of bladder cancer and prediction of survival by urinary metabolomics. Oncotarget. 2014;5:1635–1645. doi: 10.18632/oncotarget.1744.
- 4. Wang H., Chen J., Feng Y., Zhou W., Zhang J., Yu Y.U., Wang X., Zhang P. (1)H nuclear magnetic resonance-based extracellular metabolomic analysis of multidrug resistant Tca8113 oral squamous carcinoma cells. Oncol. Lett. 2015;9:2551–2559. doi: 10.3892/ol.2015.3128.
- 5. Lam C.W., Law C.Y. Untargeted mass spectrometry-based metabolomic profiling of pleural effusions: fatty acids as novel cancer biomarkers for malignant pleural effusions. J. Proteome Res. 2014;13:4040–4046. doi: 10.1021/pr5003774.
- 6. Bamji-Stocke S., van Berkel V., Miller D.M., Frieboes H.B. A review of metabolism-associated biomarkers in lung cancer diagnosis and treatment. Metabolomics. 2018;14:81. doi: 10.1007/s11306-018-1376-2.
- 7. Zhu J., Djukovic D., Deng L., Gu H., Himmati F., Chiorean E.G., Raftery D. Colorectal cancer detection using targeted serum metabolic profiling. J. Proteome Res. 2014;13:4120–4130. doi: 10.1021/pr500494u.
- 8. Guan W., Zhou M., Hampton C.Y., Benigno B.B., Walker L.D., Gray A., McDonald J.F., Fernández F.M. Ovarian cancer detection from metabolomic liquid chromatography/mass spectrometry data by support vector machines. BMC Bioinform. 2009;10:259. doi: 10.1186/1471-2105-10-259.
- 9. Kim K., Aronov P., Zakharkin S.O., Anderson D., Perroud B., Thompson I.M., Weiss R.H. Urine metabolomics analysis for kidney cancer detection and biomarker discovery. Mol. Cell Proteom. 2009;8:558–570. doi: 10.1074/mcp.M800165-MCP200.
- 10. Tiziani S., Lopes V., Günther U.L. Early stage diagnosis of oral cancer using 1H NMR-based metabolomics. Neoplasia. 2009;11:269–276. doi: 10.1593/neo.81396.
- 11. Urayama S., Zou W., Brooks K., Tolstikov V. Comprehensive mass spectrometry based metabolic profiling of blood plasma reveals potent discriminatory classifiers of pancreatic cancer. Rapid Commun. Mass Spectrom. 2010;24:613–620. doi: 10.1002/rcm.4420.
- 12. Nindrea R.D., Aryandono T., Lazuardi L., Dwiprahasto I. Diagnostic accuracy of different machine learning algorithms for breast cancer risk calculation: a meta-analysis. Asian Pac. J. Cancer Prev. 2018;19:1747–1752. doi: 10.22034/APJCP.2018.19.7.1747.
- 13. Vamathevan J., Clark D., Czodrowski P., Dunham I., Ferran E., Lee G., Li B., Madabhushi A., Shah P., Spitzer M., Zhao S. Applications of machine learning in drug discovery and development. Nat. Rev. Drug Discov. 2019;18:463–477. doi: 10.1038/s41573-019-0024-5.
- 14. Dalal V., Carmicheal J., Dhaliwal A., Jain M., Kaur S., Batra S.K. Radiomics in stratification of pancreatic cystic lesions: machine learning in action. Cancer Lett. 2020;469:228–237. doi: 10.1016/j.canlet.2019.10.023.
- 15. Zhang B., He X., Ouyang F., Gu D., Dong Y., Zhang L., Mo X., Huang W., Tian J., Zhang S. Radiomic machine-learning classifiers for prognostic biomarkers of advanced nasopharyngeal carcinoma. Cancer Lett. 2017;403:21–27. doi: 10.1016/j.canlet.2017.06.004.
- 16. Mucaki E.J., Zhao J.Z.L., Lizotte D.J., Rogan P.K. Predicting responses to platin chemotherapy agents with biochemically-inspired machine learning. Signal Transduct. Target. Ther. 2019;4:1. doi: 10.1038/s41392-018-0034-5.
- 17. Xu W., Xu M., Wang L., Zhou W., Xiang R., Shi Y., Zhang Y., Piao Y. Integrative analysis of DNA methylation and gene expression identified cervical cancer-specific diagnostic biomarkers. Signal Transduct. Target. Ther. 2019;4:55. doi: 10.1038/s41392-019-0081-6.
- 18. Huang S., Yang J., Fong S., Zhao Q. Artificial intelligence in cancer diagnosis and prognosis: opportunities and challenges. Cancer Lett. 2020;471:61–71. doi: 10.1016/j.canlet.2019.12.007.
- 19. Lynch C.M., Abdollahi B., Fuqua J.D., de Carlo A.R., Bartholomai J.A., Balgemann R.N., van Berkel V.H., Frieboes H.B. Prediction of lung cancer patient survival via supervised machine learning classification techniques. Int. J. Med. Inform. 2017;108:1–8. doi: 10.1016/j.ijmedinf.2017.09.013.
- 20. Luo S., Xu J., Jiang Z., Liu L., Wu Q., Leung E.L., Leung A.P. Artificial intelligence-based collaborative filtering method with ensemble learning for personalized lung cancer medicine without genetic sequencing. Pharmacol. Res. 2020;160. doi: 10.1016/j.phrs.2020.105037.
- 21. Emin E.I., Emin E., Papalois A., Willmott F., Clarke S., Sideris M. Artificial intelligence in obstetrics and gynaecology: is this the way forward? In Vivo. 2019;33:1547–1551. doi: 10.21873/invivo.11635.
- 22. Travis W.D., Brambilla E., Burke A.P., Marx A., Nicholson A.G. Introduction to the 2015 World Health Organization classification of tumors of the lung, pleura, thymus, and heart. J. Thorac. Oncol. 2015;10:1240–1242. doi: 10.1097/JTO.0000000000000663.
- 23. Suguro R., Pang X.C., Yuan Z.W., Chen S.Y., Zhu Y.Z., Xie Y. Combinational applicaton of silybin and tangeretin attenuates the progression of non-alcoholic steatohepatitis (NASH) in mice via modulating lipid metabolism. Pharmacol. Res. 2020;151. doi: 10.1016/j.phrs.2019.104519.
- 24. Pan H., Zheng Y., Liu Z., Yuan Z., Ren R., Zhou H., Xie Y., Liu L. Deciphering the pharmacological mechanism of Guan-Jie-Kang in treating rat adjuvant-induced arthritis using omics analysis. Front. Med. 2019;13:564–574. doi: 10.1007/s11684-018-0676-2.
- 25. Chen C., Xia R., Chen H., He Y. TBtools: a toolkit for biologists integrating various HTS-data handling tools with a user-friendly interface. bioRxiv. 2018:289660.
- 26. Demsar J., Curk T., Erjavec A., Gorup C., Hocevar T., Milutinovic M. Orange: data mining toolbox in Python. J. Mach. Learn. Res. 2013;14:2349–2353.
- 27. Huang S., Cai N., Pacheco P.P., Narrandes S., Wang Y., Xu W. Applications of support vector machine (SVM) learning in cancer genomics. Cancer Genom. Proteom. 2018;15:41–51. doi: 10.21873/cgp.20063.
- 28. Shi P., Ray S., Zhu Q., Kon M.A. Top scoring pairs for feature selection in machine learning and applications to cancer outcome prediction. BMC Bioinform. 2011;12:375. doi: 10.1186/1471-2105-12-375.
- 29. Ambale-Venkatesh B., Yang X., Wu C.O., Liu K., Hundley W.G., McClelland R., Gomes A.S., Folsom A.R., Shea S., Guallar E., Bluemke D.A., Lima J.A.C. Cardiovascular event prediction by machine learning: the multi-ethnic study of atherosclerosis. Circ. Res. 2017;121:1092–1101. doi: 10.1161/CIRCRESAHA.117.311312.
- 30. Zaidi N.A., Cerquides J., Carman M.J., Webb G.I. Alleviating Naive Bayes attribute independence assumption by attribute weighting. J. Mach. Learn. Res. 2013;14:1947–1988.
- 31. Banu A.B., Thirumalaikolundusubramanian P. Comparison of Bayes classifiers for breast cancer classification. Asian Pac. J. Cancer Prev. 2018;19:2917–2920. doi: 10.22034/APJCP.2018.19.10.2917.
- 32. Rashidi H.H., Tran N.K., Betts E.V., Howell L.P., Green R. Artificial intelligence and machine learning in pathology: the present landscape of supervised methods. Acad. Pathol. 2019;6. doi: 10.1177/2374289519873088.
- 33. Cooper W.A., O'Toole S., Boyer M., Horvath L., Mahar A. What's new in non-small cell lung cancer for pathologists: the importance of accurate subtyping, EGFR mutations and ALK rearrangements. Pathology. 2011;43:103–115. doi: 10.1097/PAT.0b013e328342629d.
- 34. Liu B., Kim S.J., Cho K.J., Oh S. Development of machine learning models for diagnosis of glaucoma. PLoS One. 2017;12:e0177726. doi: 10.1371/journal.pone.0177726.
- 35. Ahmed M.S., Shahjaman M., Rana M.M., Mollah M.N.H. Robustification of Naïve Bayes classifier and its application for microarray gene expression data analysis. Biomed. Res. Int. 2017;2017. doi: 10.1155/2017/3020627.
- 36. Zhang H., Ren J.X., Ma J.X., Ding L. Development of an in silico prediction model for chemical-induced urinary tract toxicity by using naïve Bayes classifier. Mol. Divers. 2019;23:381–392. doi: 10.1007/s11030-018-9882-8.
- 37. Hubbard T., Barker D., Birney E., Cameron G., Chen Y., Clark L., Cox T., Cuff J., Curwen V., Down T., Durbin R., Eyras E., Gilbert J., Hammond M., Huminiecki L., Kasprzyk A., Lehvaslaiho H., Lijnzaad P., Melsopp C., Mongin E., Pettett R., Pocock M., Potter S., Rust A., Schmidt E., Searle S., Slater G., Smith J., Spooner W., Stabenau A., Stalker J., Stupka E., Ureta-Vidal A., Vastrik I., Clamp M. The Ensembl genome database project. Nucleic Acids Res. 2002;30:38–41. doi: 10.1093/nar/30.1.38.
- 38. Li F., Yang M., Li Y., Zhang M., Wang W., Yuan D., Tang D. An improved clear cell renal cell carcinoma stage prediction model based on gene sets. BMC Bioinform. 2020;21:232. doi: 10.1186/s12859-020-03543-0.
- 39. Barroso-García V., Gutiérrez-Tobal G.C., Kheirandish-Gozal L., Álvarez D., Vaquerizo-Villar F., Núñez P., Del Campo F., Gozal D., Hornero R. Usefulness of recurrence plots from airflow recordings to aid in paediatric sleep apnoea diagnosis. Comput. Methods Programs Biomed. 2020;183. doi: 10.1016/j.cmpb.2019.105083.
- 40. Ko J., Bhagwat N., Black T., Yee S.S., Na Y.J., Fisher S., Kim J., Carpenter E.L., Stanger B.Z., Issadore D. miRNA profiling of magnetic nanopore-isolated extracellular vesicles for the diagnosis of pancreatic cancer. Cancer Res. 2018;78:3688–3697. doi: 10.1158/0008-5472.CAN-17-3703.
- 41. Zhao W., Yang J., Sun Y., Li C., Wu W., Jin L., Yang Z., Ni B., Gao P., Wang P., Hua Y., Li M. 3D deep learning from CT scans predicts tumor invasiveness of subcentimeter pulmonary adenocarcinomas. Cancer Res. 2018;78:6881–6889. doi: 10.1158/0008-5472.CAN-18-0696.
- 42. Pinsky P.F., Gierada D.S., Black W., Munden R., Nath H., Aberle D., Kazerooni E. Performance of lung-RADS in the National Lung Screening Trial: a retrospective assessment. Ann. Intern. Med. 2015;162:485–491. doi: 10.7326/M14-2086.
- 43. Li L., Chen J., Chen X., Tang J., Guo H., Wang X., Qian J., Luo G., He F., Lu X., Ding Y., Yang Y., Huang W., Hou G., Lin X., Ouyang Q., Li H., Wang R., Jiang F., Pu R., Lu J., Jin M., Tan Y., Gonzalez F.J., Cao G., Wu M., Wen H., Wu T., Jin L., Chen L., Wang H. Serum miRNAs as predictive and preventive biomarker for pre-clinical hepatocellular carcinoma. Cancer Lett. 2016;373:234–240. doi: 10.1016/j.canlet.2016.01.028.
- 44. Poell J.B., van Haastert R.J., Cerisoli F., Bolijn A.S., Timmer L.M., Diosdado-Calvo B., Meijer G.A., van Puijenbroek A.A., Berezikov E., Schaapveld R.Q., Cuppen E. Functional microRNA screening using a comprehensive lentiviral human microRNA expression library. BMC Genom. 2011;12:546. doi: 10.1186/1471-2164-12-546.
- 45. Yang Q., Zhang P., Wu R., Lu K., Zhou H. Identifying the best marker combination in CEA, CA125, CY211, NSE, and SCC for lung cancer screening by combining ROC curve and logistic regression analyses: is it feasible? Dis. Markers. 2018;2018:1–12. doi: 10.1155/2018/2082840.
- 46. Kitazono S., Fujiwara Y., Tsuta K., Utsumi H., Kanda S., Horinouchi H., Nokihara H., Yamamoto N., Sasada S., Watanabe S., Asamura H., Tamura T., Ohe Y. Reliability of small biopsy samples compared with resected specimens for the determination of programmed death-ligand 1 expression in non–small-cell lung cancer. Clin. Lung Cancer. 2015;16:385–390. doi: 10.1016/j.cllc.2015.03.008.
- 47. Riniker S., Wang Y., Jenkins J.L., Landrum G.A. Using information from historical high-throughput screens to predict active compounds. J. Chem. Inf. Model. 2014;54:1880–1891. doi: 10.1021/ci500190p.
- 48. Jeon J., Nim S., Teyra J., Datti A., Wrana J.L., Sidhu S.S., Moffat J., Kim P.M. A systematic approach to identify novel cancer drug targets using machine learning, inhibitor design and high-throughput screening. Genome Med. 2014;6:57. doi: 10.1186/s13073-014-0057-7.
- 49. Obrzut B., Kusy M., Semczuk A., Obrzut M., Kluska J. Prediction of 5-year overall survival in cervical cancer patients treated with radical hysterectomy using computational intelligence methods. BMC Cancer. 2017;17:840. doi: 10.1186/s12885-017-3806-3.
- 50. Kyrgiou M., Pouliakis A., Panayiotides J.G., Margari N., Bountris P., Valasoulis G., Paraskevaidi M., Bilirakis E., Nasioutziki M., Loufopoulos A., Haritou M., Koutsouris D.D., Karakitsos P., Paraskevaidis E. Personalised management of women with cervical abnormalities using a clinical decision support scoring system. Gynecol. Oncol. 2016;141:29–35. doi: 10.1016/j.ygyno.2015.12.032.
- 51. Huang Y., Zhu J., Li W., Zhang Z., Xiong P., Wang H., Zhang J. Serum microRNA panel excavated by machine learning as a potential biomarker for the detection of gastric cancer. Oncol. Rep. 2018;39:1338–1346. doi: 10.3892/or.2017.6163.
- 52. He Y., Gao M., Tang H., Cao Y., Liu S., Tao Y. Metabolic intermediates in tumorigenesis and progression. Int. J. Biol. Sci. 2019;15:1187–1199. doi: 10.7150/ijbs.33496.
- 53. Phang J.M. Proline metabolism in cell regulation and cancer biology: recent advances and hypotheses. Antioxid. Redox Signal. 2019;30:635–649. doi: 10.1089/ars.2017.7350.
- 54. Liu W., Glunde K., Bhujwalla Z.M., Raman V., Sharma A., Phang J.M. Proline oxidase promotes tumor cell survival in hypoxic tumor microenvironments. Cancer Res. 2012;72:3677–3686. doi: 10.1158/0008-5472.CAN-12-0080.
- 55. Liu W., Hancock C.N., Fischer J.W., Harman M., Phang J.M. Proline biosynthesis augments tumor cell growth and aerobic glycolysis: involvement of pyridine nucleotides. Sci. Rep. 2015;5:17206. doi: 10.1038/srep17206.
- 56. Liu Y., Mao C., Wang M., Liu N., Ouyang L., Liu S., Tang H., Cao Y., Liu S., Wang X., Xiao D., Chen C., Shi Y., Yan Q., Tao Y. Cancer progression is mediated by proline catabolism in non-small cell lung cancer. Oncogene. 2020;39:2358–2376. doi: 10.1038/s41388-019-1151-5.
- 57. Kolodziej L.R., Paleolog E.M., Williams R.O. Kynurenine metabolism in health and disease. Amino Acids. 2011;41:1173–1183. doi: 10.1007/s00726-010-0787-9.
- 58. Chuang S.C., Fanidi A., Ueland P.M., Relton C., Midttun O., Vollset S.E., Gunter M.J., Seckl M.J., Travis R.C., Wareham N., Trichopoulou A., Lagiou P., Trichopoulos D., Peeters P.H., Bueno-de-Mesquita H.B., Boeing H., Wientzek A., Kuehn T., Kaaks R., Tumino R., Agnoli C., Palli D., Naccarati A., Aicua E.A., Sánchez M.J., Quirós J.R., Chirlaque M.D., Agudo A., Johansson M., Grankvist K., Boutron-Ruault M.C., Clavel-Chapelon F., Fagherazzi G., Weiderpass E., Riboli E., Brennan P.J., Vineis P., Johansson M. Circulating biomarkers of tryptophan and the kynurenine pathway and lung cancer risk. Cancer Epidemiol. Biomark. Prev. 2014;23:461–468. doi: 10.1158/1055-9965.EPI-13-0770.
- 59. Pegg A.E. Spermidine/spermine-N(1)-acetyltransferase: a key metabolic regulator. Am. J. Physiol. Endocrinol. Metab. 2008;294:E995–1010. doi: 10.1152/ajpendo.90217.2008.
- 60. Babbar N., Hacker A., Huang Y., Casero R.A., Jr. Tumor necrosis factor alpha induces spermidine/spermine N1-acetyltransferase through nuclear factor kappaB in non-small cell lung cancer cells. J. Biol. Chem. 2006;281:24182–24192. doi: 10.1074/jbc.M601871200.
- 61. Kingsnorth A.N., Wallace H.M. Elevation of monoacetylated polyamines in human breast cancers. Eur. J. Cancer Clin. Oncol. 1985;21:1057–1062. doi: 10.1016/0277-5379(85)90291-3.
- 62. Singhal S., Rolfo C., Maksymiuk A.W., Tappia P.S., Sitar D.S., Russo A., Akhtar P.S., Khatun N., Rahnuma P., Rashiduzzaman A., Ahmed Bux R., Huang G., Ramjiawan B. Liquid biopsy in lung cancer screening: the contribution of metabolomics. Results of a pilot study. Cancers. 2019;11:1069. doi: 10.3390/cancers11081069.
- 63. Parniak M., Kalant N. Incorporation of glucose into glycogen in primary cultures of rat hepatocytes. Can. J. Biochem. Cell Biol. 1985;63:333–340. doi: 10.1139/o85-049.
- 64. Zhang D., Li J., Wang F., Hu J., Wang S., Sun Y. 2-Deoxy-d-glucose targeting of glucose metabolism in cancer cells as a potential therapy. Cancer Lett. 2014;355:176–183. doi: 10.1016/j.canlet.2014.09.003.
- 65. Zhang X., Tu S., Wang Y., Xu B., Wan F. Mechanism of taurine-induced apoptosis in human colon cancer cells. Acta Biochim. Biophys. Sin. 2014;46:261–272. doi: 10.1093/abbs/gmu004.
- 66. Neary P.M., Hallihan P., Wang J.H., Pfirrmann R.W., Bouchier-Hayes D.J., Redmond H.P. The evolving role of taurolidine in cancer therapy. Ann. Surg. Oncol. 2010;17:1135–1143. doi: 10.1245/s10434-009-0867-9.
- 67. El Agouza I.M., Eissa S.S., El Houseini M.M., El-Nashar D.E., Abd El Hameed O.M. Taurine: a novel tumor marker for enhanced detection of breast cancer among female patients. Angiogenesis. 2011;14:321–330. doi: 10.1007/s10456-011-9215-3.
- 68. Srivastava S., Roy R., Singh S., Kumar P., Dalela D., Sankhwar S.N., Goel A., Sonkar A.A. Taurine – a possible fingerprint biomarker in non-muscle invasive bladder cancer: a pilot study by 1H NMR spectroscopy. Cancer Biomark. 2010;6:11–20. doi: 10.3233/CBM-2009-0115.
- 69. Tu S., Zhang X.L., Wan H.F., Xia Y.Q., Liu Z.Q., Yang X.H., Wan F.S. Effect of taurine on cell proliferation and apoptosis human lung cancer A549 cells. Oncol. Lett. 2018;15:5473–5480. doi: 10.3892/ol.2018.8036.
- 70. Chabon J.J., Hamilton E.G., Kurtz D.M., Esfahani M.S., Diehn M. Integrating genomic features for non-invasive early lung cancer detection. Nature. 2020;580:245–251. doi: 10.1038/s41586-020-2140-0.