Author manuscript; available in PMC: 2016 Nov 1.
Published in final edited form as: Cancer Epidemiol Biomarkers Prev. 2015 Aug 17;24(11):1716–1723. doi: 10.1158/1055-9965.EPI-15-0427

Investigation of Metabolomic Blood Biomarkers for Detection of Adenocarcinoma Lung Cancer

Johannes F Fahrmann 1, Kyoungmi Kim 2, Brian C DeFelice 1, Sandra L Taylor 2, David R Gandara 3, Ken Y Yoneda 4, David T Cooke 5, Oliver Fiehn 1,6, Karen Kelly 3, Suzanne Miyamoto 3,#
PMCID: PMC4633344  NIHMSID: NIHMS716198  PMID: 26282632

Abstract

Background

Untargeted metabolomics was utilized in case control studies of adenocarcinoma (ADC) lung cancer in order to develop and test metabolite classifiers in serum and plasma as potential biomarkers for diagnosing lung cancer.

Methods

Serum and plasma were collected and used in two independent case-control studies (ADC1 and ADC2). Controls were frequency matched for gender, age and smoking history. There were 52 ADC cases and 31 controls in ADC1 and 43 ADC cases and 43 controls in ADC2. Metabolomics was conducted using gas chromatography time-of-flight mass spectrometry. Differential analysis was performed on ADC1 and the top candidates (FDR < 0.05) for serum and plasma used to develop individual and multiplex-classifiers that were then tested on an independent set of serum and plasma samples (ADC2).

Results

Aspartate provided the best accuracy (81.4%) for an individual metabolite classifier in serum whereas pyrophosphate had the best accuracy (77.9%) in plasma when independently tested. Multiplex classifiers of either 2 or 4 serum metabolites had an accuracy of 72.7% when independently tested. For plasma, a multi-metabolite classifier consisting of 8 metabolites gave an accuracy of 77.3% when independently tested. Comparison of overall diagnostic performance between the two blood matrices yielded similar performances. However, serum is most ideal given higher sensitivity for low abundant metabolites.

Conclusion

This study shows the potential of metabolite-based diagnostic tests for detection of lung adenocarcinoma. Further validation in a larger pool of samples is warranted.

Impact

These biomarkers could improve early detection and diagnosis of lung cancer.

Keywords: Metabolomics, Biomarker, Adenocarcinoma Lung Cancer

Introduction

Lung cancer continues to be a leading cause of cancer mortality in both men and women in the United States (1, 2). Among the different lung cancer types, non-small cell lung cancer (NSCLC) accounts for approximately 85% of all lung cancer cases with adenocarcinoma being the most common histological type (3).

Recently, the National Lung Screening Trial (NLST) demonstrated that low dose computed tomography (LDCT) screening could reduce mortality due to lung cancer by 20%. However, LDCT screening is largely hindered by high false positive rates (96%), particularly in high risk populations (heavy smokers), due to the low prevalence rates (less than 2%) of malignant tumors and high incidence of benign lung nodules. Consequently, complementary biomarkers which can be used in conjunction with LDCT screening to improve diagnostic capacities and reduce false-positive rates are highly desirable (4, 5). Preferably such complementary tools should be non-invasive and exhibit high sensitivity and specificity. The “–omic” sciences (genomics, transcriptomics, proteomics and metabolomics) represent valuable tools for discovery and validation of potential biomarkers which can be used for detection of NSCLC. Of these omic-sciences, metabolomics has received considerable attention for its application in cancer (6). Metabolomics is the assessment of small molecules and biochemical intermediates (metabolites) using analytical instrumentation. Metabolites in blood are the product of all cellular processes, which are highly responsive to conditions of disease and environment, and represent the final output products of all organs, forming a detailed systemic representation of an individual’s current physiological state (7).

Metabolomics has been applied to gain new insights into the pathology of cancer, develop methods predictive of disease onset and reveal new biomarkers associated with diagnosis and prognosis (6, 8, 9). As such, the application of metabolomics in NSCLC adenocarcinoma represents a promising avenue of new research for the identification and validation of potential biomarkers associated with diagnosis and prognosis.

In the current study, we utilized an untargeted metabolomics approach using gas chromatography time-of-flight mass spectrometry (GCTOF MS) to analyze the metabolome of serum and plasma samples, both collected from the same patients, that were organized into two independent case-control studies (ADC1 and ADC2). In both studies, only NSCLC adenocarcinoma was investigated. The overall objectives were to: 1) determine if individual metabolites or combinations of metabolites could be used as a diagnostic test to distinguish NSCLC adenocarcinoma from controls; and 2) determine which matrix, plasma or serum, provides more accurate classifiers for the detection of lung cancer. We developed individual and multi-metabolite classifiers using a training set from the ADC1 study and evaluated the performance of the constructed classifiers, individually or in combination, in an independent test/validation study (ADC2).

Materials and Methods

Patient population and collection of patient samples

Subjects were recruited over a 4 year period (2010–2014) from the UC Davis Medical Center and Cancer Center Clinics. All cases were diagnosed with NSCLC adenocarcinoma prior to specimen collection. Blood samples (serum and plasma) were collected from NSCLC adenocarcinoma and control subjects with patient consent using approved IRB protocols (LC001 for cancer cases and LC002 for control cases). The control population was heavily recruited from spouses and family members accompanying a lung cancer patient to their clinic in order to keep environment and lifestyle, especially diet and smoking history, as similar as possible. Cases were frequency matched with controls for gender, age and smoking history. Only cases diagnosed with NSCLC adenocarcinoma were used in these studies. Fasting status was not controlled for as individuals were recruited upon their arrival to the clinic.

Patient characteristics are described in Table 1. Detailed information on blood sample collection protocols is provided in Supplemental Methods.

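Table 1.

Patient Characteristics

| Variable | ADC1 (Training set)* for development, Plasma | ADC1 (Training set)* for development, Serum | ADC2 (Test set)* for validation, Plasma/Serum |
|---|---|---|---|
| Total Sample Size, N | 83 | 80 | 86 |
| Healthy controls, N (%) | 31 (37.3%) | 31 (38.8%) | 43 (50%) |
| Cancer cases, N (%) | 52 (62.7%) | 49 (61.2%) | 43 (50%) |
| Stage I, N (%) | 21 (40.38) | 19 (36.54) | 18 (41.86) |
| Stage II, N (%) | 7 (13.46) | 7 (13.46) | 3 (6.98) |
| Stage III, N (%) | 14 (26.92) | 14 (26.92) | 7 (16.28) |
| Stage IV, N (%) | 10 (19.23) | 9 (17.31) | 15 (34.88) |
| Gender, N (Males/Females): Controls† | 11/20 | 11/20 | 21/22 |
| Gender, N (Males/Females): Cancer Cases | 17/35 | 17/32 | 21/22 |
| Age (yr), mean ± SD: Controls† | 64.1 ± 8.97 | 64.1 ± 8.97 | 65.9 ± 8.05 |
| Age (yr), mean ± SD: Cancer Cases | 65.9 ± 9.66 | 65.9 ± 9.87 | 67.3 ± 10.10 |
| Packs per year, mean ± SD: Controls† | 29.8 ± 19.54 | 29.8 ± 19.54 | 38.6 ± 26.46 |
| Packs per year, mean ± SD: Cancer Cases | 34.6 ± 19.33 | 33.9 ± 20.06 | 39.5 ± 27.23 |

* No statistical differences in variables between cases and controls within each set and between the training set and test set.

† A single value is reported for the ADC1 controls, who provided both plasma and serum.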

Metabolomic Profiling

The MiniX database (10) was used as a Laboratory Information Management System (LIMS) and for sample randomization prior to all analytical procedures. Sample identifications were kept blinded during the entire metabolomics analysis to minimize potential bias. Serum and plasma samples were equally distributed for analysis so they could be compared directly.

Plasma and Serum Sample Preparation

Detailed information on sample preparation, instrument parameters and data acquisition is provided in Supplemental Methods.

Samples (30 μL of serum or plasma) were thawed, extracted and derivatized as previously described (11). Mass spectrometry analysis and data acquisition were performed using an Agilent 7890A gas chromatograph coupled to a Leco Pegasus IV time-of-flight (TOF) mass spectrometer. Acquired spectra were further processed using the BinBase database (10, 12).

Data Analysis

Prior to statistical analysis, metabolite intensity values were total quantity normalized and log2 transformed. Missing intensity values were imputed with one-half of the minimum value observed for the same metabolite in the same matrix. Differential analysis was implemented to identify significant metabolomic differences between cancer and control samples in serum and plasma separately for both the training (ADC1) and test (ADC2) sets. For each matrix, intensity values were regressed on the covariates (age, gender and smoking history [packs per year]) and the residuals were used to calculate t statistics for the difference between cancer and control groups, adjusting for the covariates. Significance between cancer and control groups was determined based on a permutation null distribution consisting of 100,000 permutations. False discovery rates (FDR) were calculated to account for multiple testing and FDR <0.05 was considered significant.
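The preprocessing and testing steps described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the order of imputation and normalization, the Welch form of the t statistic, and the Benjamini-Hochberg procedure for FDR are all assumptions, the covariate regression step is omitted for brevity, and the toy intensities are hypothetical. The number of permutations is reduced from the 100,000 used in the study so that the example runs quickly.

```python
import math
import random
from statistics import mean, variance


def impute_half_min(values):
    """Replace missing (None) intensities of one metabolite with half its observed minimum."""
    observed = [v for v in values if v is not None]
    fill = 0.5 * min(observed)
    return [fill if v is None else v for v in values]


def normalize_and_log2(sample):
    """Divide one sample's intensities by their sum, then take log2."""
    total = sum(sample)
    return [math.log2(v / total) for v in sample]


def welch_t(a, b):
    """Welch t statistic for the difference in means (assumed form of the t statistic)."""
    return (mean(a) - mean(b)) / math.sqrt(variance(a) / len(a) + variance(b) / len(b))


def permutation_p_value(cases, controls, n_perm=2000, seed=0):
    """Two-sided p-value from a null distribution built by shuffling group labels."""
    rng = random.Random(seed)
    observed = abs(welch_t(cases, controls))
    pooled = list(cases) + list(controls)
    n_cases = len(cases)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if abs(welch_t(pooled[:n_cases], pooled[n_cases:])) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value above zero.
    return (hits + 1) / (n_perm + 1)


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical, already transformed values for a single metabolite.
cases = [5.1, 5.3, 5.2, 5.4, 5.0, 5.5]
controls = [4.0, 4.2, 4.1, 3.9, 4.3, 4.4]
p = permutation_p_value(cases, controls)
# The other two p-values stand in for other metabolites and are made up.
q_values = benjamini_hochberg([p, 0.20, 0.03])
```

In the study itself the t statistics were computed on residuals after regressing intensities on age, gender and smoking history; a full reproduction would insert that regression before the permutation step.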

Development of classifiers was carried out on the training set (ADC1). Classifiers were built both from individual metabolites and from multiplex panels of metabolites, for classifying samples as cancer or control (13). Only metabolites with a significant FDR (< 0.05) were used in constructing classifiers. Further, in developing classifiers we used residuals from adjusting for age, gender and smoking history and scaled the residuals to a variance of 1 for comparability between data sets.

Development of Classifiers: Classifiers were developed using a strategy based on the use of “voting classifiers” as previously described (13). Detailed information regarding classifier development is provided in the Supplemental Material.
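Because the details of classifier development are deferred to the Supplemental Material, the following is only a generic sketch of what a voting classifier can look like, not the procedure used in this study. The midpoint threshold rule, the tie-breaking rule, the metabolite names (`met_a`, `met_b`, `met_c`) and all values are hypothetical.

```python
from statistics import mean


def fit_threshold(case_values, control_values):
    """Return (threshold, direction) for one metabolite from training data.

    Assumption: the cut-off is the midpoint between the two group means;
    direction is +1 if cases run higher than controls, otherwise -1.
    """
    case_mean, control_mean = mean(case_values), mean(control_values)
    direction = 1 if case_mean >= control_mean else -1
    return (case_mean + control_mean) / 2.0, direction


def vote(value, threshold, direction):
    """One metabolite's vote; True means 'cancer'."""
    return direction * (value - threshold) > 0


def classify(sample, rules):
    """Majority vote across metabolites; ties are called 'control' (assumption)."""
    votes = sum(vote(sample[name], thr, d) for name, (thr, d) in rules.items())
    return "cancer" if votes * 2 > len(rules) else "control"


# Hypothetical scaled residuals: (training cases, training controls) per metabolite.
training = {
    "met_a": ([1.0, 1.4, 0.8], [-0.9, -1.1, -0.4]),
    "met_b": ([-0.7, -1.2, -0.5], [0.6, 1.0, 0.8]),
    "met_c": ([0.5, 0.9, 0.1], [-0.3, -0.6, 0.0]),
}
rules = {name: fit_threshold(cases, controls) for name, (cases, controls) in training.items()}

new_sample = {"met_a": 0.9, "met_b": -0.4, "met_c": -0.2}
label = classify(new_sample, rules)
```

Calling a tie "control" is one way to favor specificity over sensitivity; a different tie rule would shift that balance.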

Results

Subject Characteristics

Patient characteristics for the two independent studies are provided in Table 1. The first set (ADC1) used for biomarker development consisted of serum and plasma samples obtained from 52 Stage I–IV NSCLC adenocarcinoma patients (52 plasma and 49 serum), and 31 healthy controls (31 pairs of serum and plasma) for a total of 163 samples. Only 31 controls were enrolled because individual control subjects were matched to multiple cancer cases. This set was regarded as the training set for biomarker discovery and classifier development. A second, independent case control study (ADC2) consisting of serum and plasma samples collected from 43 Stage I–IV NSCLC adenocarcinoma patients and 43 healthy controls (total 172 samples) was used as an independent test set for biomarker evaluation. Samples for ADC2 were collected and analyzed at a different time from those for ADC1. There were no significant differences between the matching variables of age, gender, and smoking history (packs per year) for the case and control cohorts in the two separate case control studies (ADC1 and ADC2).

Identification of Metabolites Discriminatory of NSCLC Adenocarcinoma in Serum

Untargeted GC-TOF based metabolomics was conducted on each sample in the ADC1 and then the ADC2 set. A total of 511 metabolites were detected in ADC1, of which 181 had known annotated structures, whereas 413 metabolites were detected in ADC2, of which 152 were known (Supplemental Table S1 and S2). Of all metabolites detected between the two studies, 296 were repeatedly measured in serum and plasma (Supplemental Table S1 and S2). Notably, many of the metabolites that were unique to either ADC1 or ADC2 were unknowns. These unknowns may represent artifacts, low abundant compounds or xenobiotic compounds which were removed during the data filtering process (see Supplemental Methods). Pearson correlation coefficients illustrating the association between measured metabolites are provided in Supplemental Table S3. Differential analysis (cancer versus control) on the ADC1 training set identified 80 differential metabolites in the serum with a raw p-value < 0.05 (Supplemental Table S1). Only four metabolites (xylose, glutamate, aspartate and Bin_225393) remained significant after false-discovery rate adjustment (FDR <0.05) in ADC1 (Supplemental Table S4). Three of the four significant metabolites (glutamate, aspartate and Bin_225393) were found to be elevated in cancer relative to control in ADC1, whereas xylose was decreased (box-plots for each metabolite are provided in Supplemental Figure S1). Using the independent test set (ADC2), we conducted a separate differential analysis to confirm whether the metabolites identified in the first study (ADC1) were significantly and consistently differential in a different cohort of samples. Out of the 80 metabolites with a raw p-value <0.05 in ADC1, 15 (18.8%) were also found to be significant (raw p-value <0.05) in ADC2 (Supplemental Table S4). More importantly, all 15 metabolites changed in the same direction (increased or decreased) in both studies (Supplemental Table S4). When comparing only those metabolites which were significantly different following FDR-adjustment in ADC1, 3 of the 4 metabolites (aspartate, glutamate and Bin_225393) were also found to be significantly (FDR-adjusted p-value <0.05) elevated in adenocarcinoma in ADC2 (Supplemental Figure S1).

Developing Serum Metabolite Classifiers Using the ADC1 Training Set

Serum classifiers were developed using the metabolites whose peak intensities differed significantly between cancer and control in ADC1 after FDR-adjustment. Determination of classification thresholds and rules, construction of classifiers and cross-validation of classifier performance for both individual metabolites and panels of metabolites were performed as described in the Methods section. Performances of the developed classifiers for each individual metabolite in ADC1 are provided in Table 2. Bin_225393 displayed the highest individual accuracy of 72.5% (AUC=0.766) (Table 2 and Figure 1A). A ROC curve and its confidence interval for Bin_225393 are provided in Figure 1A and Supplemental Table S6, respectively. We next evaluated whether combining multiple metabolites in ADC1 could yield improved classifications. The order in which each metabolite entered the multiplex classifier is provided in Table 2. Overall, the highest accuracies achieved were 72.5% with Bin_225393 alone and 68.8% with the first three metabolites (Bin_225393, aspartate and xylose) (Table 2). A ROC curve and its confidence interval for the 3-metabolite classifier (Bin_225393, aspartate and xylose) are provided in Figure 1B and Supplemental Table S6, respectively.

Table 2.

Performances of developed individual- and multi-metabolite serum classifiers in ADC1 (training) and ADC2 (test) sets. All values are percentages.

Individual Metabolite Classifiers

| Metabolite | Accuracy, ADC1 (Training) | Accuracy, ADC2 (Test) | Sensitivity, ADC2 (Test) | Specificity, ADC2 (Test) |
|---|---|---|---|---|
| Xylose | 53.8 | 50.0 | 76.7 | 23.3 |
| Glutamate | 61.3 | 74.4 | 65.1 | 83.7 |
| Aspartate | 63.8 | 81.4 | 67.4 | 95.4 |
| Bin_225393 | 72.5 | 64.0 | 74.4 | 53.5 |

Multi-Metabolite Classifiers

| Order of entry into classifier | Metabolite | Accumulated Accuracy, ADC1 (Training) | Accumulated Accuracy, ADC2 (Test) | Sensitivity, ADC2 (Test) | Specificity, ADC2 (Test) |
|---|---|---|---|---|---|
| 1 | Bin_225393 | 72.5 | 64.0 | 74.4 | 53.5 |
| 2 | Aspartate | 68.1 | 72.7 | 70.9 | 74.4 |
| 3 | Xylose | 68.8 | 67.4 | 74.4 | 60.5 |
| 4 | Glutamate | 66.9 | 72.7 | 70.9 | 74.4 |
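As a reading aid for Table 2, the reported ADC2 accuracies of the individual serum classifiers can be checked against the reported sensitivities and specificities, since accuracy is the case- and control-weighted average of the two. The sketch below copies the values from Table 2 and the ADC2 group sizes from Table 1; the function name is our own, and small discrepancies are expected because the table values are rounded.

```python
def accuracy_from_rates(sensitivity, specificity, n_cases, n_controls):
    """Overall accuracy (%) implied by sensitivity and specificity (both in %)."""
    correct = sensitivity * n_cases + specificity * n_controls
    return correct / (n_cases + n_controls)


# ADC2 serum values for the individual classifiers, copied from Table 2:
# (reported accuracy, sensitivity, specificity), all in percent.
table2_adc2 = {
    "Xylose": (50.0, 76.7, 23.3),
    "Glutamate": (74.4, 65.1, 83.7),
    "Aspartate": (81.4, 67.4, 95.4),
    "Bin_225393": (64.0, 74.4, 53.5),
}

N_CASES, N_CONTROLS = 43, 43  # ADC2 group sizes from Table 1

implied = {
    name: accuracy_from_rates(sens, spec, N_CASES, N_CONTROLS)
    for name, (_, sens, spec) in table2_adc2.items()
}

for name, (reported, _, _) in table2_adc2.items():
    print(f"{name}: reported {reported:.1f}, implied {implied[name]:.2f}")
```

With equal numbers of cases and controls, as in ADC2, the accuracy reduces to the simple mean of sensitivity and specificity.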

Figure 1. ROC curves for individual- and multi-metabolite classifiers in serum.

Figure 1

A) ROC curves for aspartate and Bin_225393 in serum. B) ROC curves for two multi-metabolite classifiers in serum. The 4 metabolite classifier contains all the metabolites included in the classifier (Table 2). The 3 metabolite classifier includes Bin_225393, aspartate and xylose. Confidence intervals for AUCs are provided in Supplemental Table S6.

Testing/validation of Serum Classifiers Developed with ADC1 Training Set in an Independent ADC2 Test Set

We next evaluated the performance of the serum metabolite classifiers developed using the ADC1 training set on the independent ADC2 test set. Individually, aspartate yielded the best performance with a classification accuracy of 81.4% (AUC=0.855) when tested in ADC2 (Table 2). A ROC curve and its confidence interval for aspartate are provided in Figure 1A and Supplemental Table S6, respectively. We then tested the multi-metabolite classifiers consisting of up to 4 metabolites, developed with the ADC1 training set, in the test set to assess performance accuracy. The highest performance was achieved with all four metabolites in the classifier, yielding an accuracy of 72.7% (Table 2). A ROC curve and its confidence interval for the 4-metabolite classifier (Bin_225393, aspartate, xylose and glutamate) are shown in Figure 1B and Supplemental Table S6, respectively.
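For readers less familiar with the AUC values quoted alongside the accuracies, the area under the ROC curve equals the probability that a randomly chosen case receives a higher classifier score than a randomly chosen control, with ties counted as one half. The sketch below computes it directly from that definition; the scores are hypothetical and the function name is our own.

```python
def auc_from_scores(case_scores, control_scores):
    """AUC as the fraction of case/control pairs in which the case scores higher."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5  # ties count as half a win
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical classifier scores, for illustration only.
case_scores = [0.9, 0.8, 0.4]
control_scores = [0.5, 0.3, 0.2]
auc = auc_from_scores(case_scores, control_scores)
```

Unlike accuracy, this quantity does not depend on any single classification threshold, which is why a metabolite can rank differently by AUC than by accuracy.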

Identification of Metabolites Discriminatory of NSCLC Adenocarcinoma in Plasma

In addition to serum, we examined the performance of plasma-derived metabolite classifiers as potential biomarkers for NSCLC adenocarcinoma. Differential analysis identified 68 differential metabolites in plasma samples with a raw p-value <0.05 (Supplemental Table S2) in the ADC1 set. Only 14 (21%) of the 68 metabolites remained significant following FDR-adjustment (Supplemental Table S2). Of these 68 metabolites, 18 (26.5%) were also found to be significantly different in the ADC2 test set (Supplemental Table S5). When comparing only those 14 metabolites that remained significant after FDR-adjustment in the ADC1 set, 6 (pyrophosphate, maltotriose, citrulline, adenosine-5-phosphate, Bin_226841 and Bin_36799) were also found to remain significant following FDR-adjustment (FDR p-value <0.05) in the ADC2 set (Supplemental Table S2). All 6 of these metabolites displayed the same direction of change in ADC2 (increased or decreased) as observed in ADC1 (Supplemental Figure S1).

Developing Plasma Metabolite Classifiers Using the ADC1 Training Set

Plasma classifiers were developed from the thirteen (Supplemental Table S2) discriminating metabolites which remained significant following FDR-adjustment. Performances of the developed classifiers for each individual metabolite are provided in Table 3. Four metabolites (maltotriose, maltose, cellobiotol, and Bin_715929) had individual accuracy scores above 70% (Table 3). However, 3 metabolites (cellobiotol, Bin_715929, and Bin_299216) were not detected in the ADC2 test set and were consequently excluded when developing classifiers to apply to the test set for performance evaluation. The non-detection of cellobiotol and the unknown compounds (Bin_715929 and Bin_299216) is suspected to be the consequence of low spectral abundance or detection in only a few patients, resulting in removal of these compounds during the data filtering processes (see Supplemental Methods). Overall, maltose performed the best with an accuracy of 72.3% (Table 3 and Figure 2A). A ROC curve and its confidence interval for maltose are provided in Figure 2A and Supplemental Table S6, respectively. We subsequently evaluated whether combining multiple plasma metabolites could serve as a better classification test. The order in which each metabolite entered the classifier is provided in Table 3. Overall, the highest accuracy achieved for the plasma classifiers was 79.5% using a panel of 5 metabolites (Table 3 and Figure 2B), suggesting that several metabolites in the classifier can improve classification relative to individual metabolite classifiers. A ROC curve and its confidence interval for the 5-metabolite classifier are provided in Figure 2B and Supplemental Table S6, respectively.

Table 3.

Performances of developed individual- and multi-metabolite plasma classifiers in ADC1 (training) and ADC2 (test) sets. All values are percentages.

Individual Metabolite Classifier

| Metabolite | Accuracy, ADC1 (Training) | Accuracy, ADC2 (Test) | Sensitivity, ADC2 (Test) | Specificity, ADC2 (Test) |
|---|---|---|---|---|
| Tryptophan | 50.6 | 52.3 | 39.5 | 65.1 |
| Pyrophosphate | 66.3 | 77.9 | 60.5 | 95.4 |
| Maltotriose | 71.1 | 64.0 | 79.1 | 48.8 |
| Maltose | 72.3 | 62.8 | 55.8 | 69.8 |
| Cystine | 66.3 | 68.6 | 58.1 | 79.1 |
| Citrulline | 67.5 | 66.3 | 55.8 | 76.7 |
| Cellobiotol£ | 71.1 | ND¥ | ND | ND |
| Adenosine-5-phosphate | 68.7 | 72.1 | 67.4 | 76.7 |
| 3-Phosphoglycerate | 69.9 | 51.2 | 30.2 | 72.1 |
| Bin_226841 | 60.2 | 64.0 | 37.2 | 90.7 |
| Bin_715929£ | 72.3 | ND | ND | ND |
| Bin_367991 | 59.0 | 65.1 | 39.5 | 90.7 |
| Bin_299216£ | 60.2 | ND | ND | ND |

Multi-Metabolite Classifier

| Order of entry into classifier | Metabolite | Accumulated Accuracy, ADC1 (Training) | Accumulated Accuracy, ADC2 (Test) | Sensitivity, ADC2 (Test) | Specificity, ADC2 (Test) |
|---|---|---|---|---|---|
| 1 | Maltose | 72.3 | 62.8 | 55.8 | 69.8 |
| 2 | Maltotriose | 71.7 | 63.4 | 67.4 | 59.3 |
| 3 | Cystine | 75.9 | 69.8 | 69.8 | 69.8 |
| 4 | 3-Phosphoglycerate | 76.5 | 69.2 | 61.6 | 76.7 |
| 5 | Citrulline | 79.5 | 73.3 | 65.1 | 81.4 |
| 6 | Pyrophosphate | 77.1 | 75.6 | 64.0 | 87.2 |
| 7 | Tryptophan | 77.1 | 76.7 | 60.5 | 93.0 |
| 8 | Adenosine-5-Phosphate | 75.3 | 77.3 | 64.0 | 90.7 |
| 9 | Bin_226841 | 77.1 | 76.7 | 60.5 | 93.0 |
| 10 | Bin_367991 | 73.5 | 75.0 | 55.8 | 94.2 |

¥ ND: not detected

£ metabolite not included in multi-metabolite classifier

Figure 2. ROC curves for individual- and multi-metabolite classifiers in plasma.

Figure 2

A) ROC curves for maltose and pyrophosphate in plasma. B) ROC curves for the best multi-metabolite ADC1 classifier for plasma consisting of 5 metabolites (blue line) and the classifier consisting of 8 (black line) when applied to ADC2 are shown (Table 3). Confidence intervals for AUCs are provided in Supplemental Table S6.

Testing/validation of Plasma Classifiers Developed with ADC1 Training Set in an Independent ADC2 Test Set

We next evaluated the performance of the plasma metabolite classifiers developed using the training set (ADC1) in the discovery phase on the independent ADC2 test set. Individually, plasma metabolites showed modest performance in classifying ADC2 samples, with pyrophosphate achieving the highest accuracy (77.9%) and specificity (95.4%) but low sensitivity (60.5%) (Table 3). A ROC curve and its confidence interval for pyrophosphate are shown in Figure 2A and Supplemental Table S6, respectively. We then evaluated the multiplex classifiers in the independent test set using the refined 10-metabolite classifiers developed in the training set. Collectively, the best performance was achieved when using a combination of 8 metabolites in the plasma classifier, resulting in an accuracy of 77.3% (Table 3). A ROC curve and its confidence interval for the 8-metabolite classifier are shown in Figure 2B and Supplemental Table S6, respectively.

Comparison of Classifier Performances between Plasma and Serum Samples

It is of particular interest for clinical utility to determine which type of blood specimen (serum or plasma) is best suited for obtaining optimal classifiers for the detection of lung cancer. For this reason, we collected both serum and plasma from the same individuals and developed the classifiers from these two biofluids independently. Collectively, 3 metabolites (maltotriose, glutamate and Bin_223618) were found to be consistently differential between ADC and control (Supplemental Table S1 and S2) in both serum and plasma.

Comparison of individual metabolite classifier performances in either serum or plasma yielded comparable results although classifier accuracies were slightly better in serum (ranging from 50–81% accuracy) than plasma (ranging from 51–78% accuracy) (Tables 2 and 3). However, plasma provided slightly better performance metrics using a multi-metabolite classifier compared to serum (77.3% versus 72.7% accuracy in ADC2, respectively) (Tables 2 and 3), indicating that serum may be better suited for individual-metabolite classifiers; whereas plasma may be more suited for multi-metabolite classifiers.

Discussion

In the present study, we identified multiple circulating metabolites (annotated and unknown) that are significantly elevated or reduced in patients with NSCLC adenocarcinoma compared to healthy controls. In the discovery phase of our experimental design, we identified and developed classifiers in a training set which were then applied to an independent test set for testing/validation. This approach is consistent with the guidelines set forth by the US National Cancer Institute for evaluating potential diagnostic cancer biomarkers (14). Within each study, sample sets were matched by age, gender and smoking history and rigorous statistical evaluations were implemented to analyze the test performance of the metabolite compositions.

Overall, among the individual-metabolite classifiers, aspartate showed the best accuracy (81.4%) in serum, whereas pyrophosphate provided the best accuracy (77.9%) in plasma when tested in the independent ADC2 test/validation study. A combination of either 2 or 4 metabolites (Table 2) in the serum classifier gave the best performance with an accuracy of 72.7% when applied to the ADC2 test set. For plasma, a multi-panel classifier consisting of 8 metabolites (Table 3) provided the best performance with an accuracy of 77.3% in the ADC2 test set. Although the ideal (but unrealistic) test would have 100% sensitivity and specificity, in this study we focused on building a multiplex test that favors high specificity over sensitivity. In this way, nearly all cancer-free individuals are correctly identified as cancer free, keeping false positives to a minimum.

The performance of the developed classifiers in our study for identification of NSCLC adenocarcinoma is comparable with those of others (15–17). Patz et al showed that a panel of four serum proteins (CEA, retinol binding protein, α1-antitrypsin and squamous cell carcinoma antigen) had a sensitivity of 89.3% and specificity of 84.7% in a case control training set for lung cancer, and a sensitivity of 77.8% and specificity of 75.4% in a validation study (15). Li et al developed a 13-protein classifier from a panel of 371 protein candidates, previously identified in 143 plasma samples obtained from patients with benign and malignant lung nodules. This 13-protein classifier was validated on an independent set of plasma samples (n = 104), yielding a negative predictive value (NPV) of 90% and specificity of 44 ± 13% (16). Results from analysis of a third independent study showed an NPV of 94% and specificity of 56% (16). AUCs of 0.82, 0.60, and 0.74 were obtained for discovery, validation study 1 and validation study 2, respectively (16). As in our study, they also determined that their classifier score was independent of smoking history and age (16).
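When comparing studies that report sensitivity and specificity with studies that report NPV, it helps to recall that predictive values depend on disease prevalence. The sketch below applies Bayes' rule to the sensitivity and specificity of the 8-metabolite plasma classifier in ADC2 (Table 3); the prevalence values are hypothetical and chosen only for illustration, since a case-control design does not estimate population prevalence.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from Bayes' rule; all inputs are fractions in [0, 1]."""
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Sensitivity 64.0% and specificity 90.7% are taken from Table 3 (ADC2, 8 metabolites).
# The prevalence values below are hypothetical.
for prevalence in (0.02, 0.25, 0.50):
    ppv, npv = predictive_values(0.640, 0.907, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

The same sensitivity and specificity therefore translate into very different predictive values in a screening population than in a balanced case-control set.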

Recently, Sozzi et al demonstrated the effectiveness of a plasma miRNA signature classifier using samples collected from smokers within the randomized Multicenter Italian Lung Detection (MILD) trial, both individually and in conjunction with LDCT (17). Using this miRNA classifier, a sensitivity of 87% and specificity of 81% were achieved for all case control samples, while 88% sensitivity and 80% specificity were achieved for cases from the low-dose CT arm (17).

Taken together, both methods (miRNA and metabolomics) or all methods (miRNAs, proteomics and metabolomics) might eventually be combined to produce a test with even better performance than each individual test for early detection of all types of lung cancer. Despite the potential prospects of combined methodologies (such as miRNA and metabolomics), our results highlight the application of metabolomics in the discovery phase of potential biomarkers and yield candidate classifiers which will be expanded upon in future studies. Moreover, we have tested both plasma and serum from the same individuals to determine which type of blood specimen would be more suitable for metabolomics-derived classifiers. Specifically, we aimed to determine which biofluid would provide the most reliable classifiers with the potential for general utility, as this is a clinically relevant question. Overall, there were no major advantages to using either serum or plasma, as both blood-specimen types yielded comparable overall performance, although serum performed slightly better for individual-metabolite classifiers whereas plasma performed slightly better for multi-metabolite classifiers. These findings are in agreement with those of Wedge et al and Yu et al, who similarly stated that the two biofluids were comparable, although in both studies plasma provided slightly better results (18, 19). However, Yu et al also found that metabolite concentrations were higher in serum, allowing for more sensitive biomarker detection (19). We suggest that serum is preferable given its higher sensitivity for low abundant metabolites, as these low abundant metabolites/compounds could potentially be the most biologically relevant.

While the application of metabolomics in identification of potential biomarkers is of immense value, it also provides useful information about the pathophysiology of the disease. Recently, we distinguished metabolic differences between matched malignant and non-malignant lung tissue from subjects with early stage (Stage IA–IB) NSCLC adenocarcinoma (20). Particularly, we identified that glutamate, malate, adenosine-5-phosphate and xanthine were significantly increased whereas glutamine was reduced in malignant tissue compared to non-malignant tissue (20). In the current investigation, we also found that glutamate, malate, adenosine-5-phosphate and xanthine were consistently elevated in NSCLC whereas glutamine was consistently reduced. However, only glutamate was found to be reliably significant. The elevation in glutamate and reduction in glutamine is particularly noteworthy given the recently recognized importance of glutamate and glutamine in energy metabolism and macromolecule biosynthesis in lung cancer (21). The abundances of these metabolites and others mentioned above in serum or plasma may therefore be a reflection of tumorigenesis and cumulatively point towards alterations in nucleotide and energy metabolism.

Lastly, it is important to recognize both the strengths and limitations of the current study. A strength of the current study is that our analysis only examined adenocarcinoma NSCLC. By evaluating only adenocarcinoma we exclude any potential biases from mixed pathologies during the classifier construction, allowing for the identification of adenocarcinoma specific biomarkers. This is particularly important for diagnosis and treatment options. Unlike in many biomarker studies, candidate biomarkers were selected based on their false discovery rates (FDRs) and not on raw p-values, which lowers the rate of false positive candidates. A limitation of this study is the relatively small sample size for each cohort (52 cases and 31 controls for ADC1, and 43 cases and 43 controls for ADC2) (22), since patient variability can be a big factor in smaller studies. However, the current study is still part of the discovery phase and, as such, will require further validation in a larger population. Additionally, values of detected metabolites are based on qualitative peak heights rather than absolute concentrations. The intent of this study was to obtain maximal coverage of the metabolome for the identification, generation and evaluation of classifiers which could be used as potential diagnostic markers for NSCLC adenocarcinoma. However, it will be important to verify and quantitate absolute concentrations in future studies. The inclusion of both early and late stage adenocarcinoma may also lead to masking of potential biomarkers given that heterogeneity exists amongst tumor stages. Lastly, while collection of samples from a single institution can be a strength due to consistency in protocols, it also poses a limitation, as there is potential bias from site-specific confounding effects attributed to differences in sample collection and sample handling. Therefore, larger studies are warranted that focus on early stage NSCLC adenocarcinoma only and include samples from multiple institutions to evaluate reproducibility and performance independent of the collection site.

In conclusion, our results highlight the application and validity of metabolomics in the discovery and validation of candidate biomarkers for diagnosing NSCLC adenocarcinoma. More specifically, we have identified individual metabolites and multi-metabolite panels as part of the discovery phase which will serve as the basis for classifiers used in future studies.

Supplementary Material

1
2

Acknowledgments

Financial Support: We acknowledge the following grants: Tobacco Related Disease Research Program 20PT0034 (Kelly), LUNGevity Foundation 201118739 (Miyamoto), and NIH West Coast Metabolomics Resource Core U24 DK097154 (Fiehn).

We acknowledge the assistance of Valerie Kuderer, Olesya Litovka and Yahanna Sandoval Torres with collection of samples and managing the clinical specimen database. We also acknowledge the services and help with the storage of samples in the UC Davis Cancer Biorepository (Regina Gandour-Edwards and Irmgard Feldman).

Footnotes

Conflicts-of-interest: The authors declare no conflict-of-interest.

References

  • 1. Centers for Disease Control and Prevention. National Center for Health Statistics. CDC WONDER On-line Database, compiled from Compressed Mortality File 1999–2012. 2014 Series 20 No. 2R.
  • 2. American Cancer Society. Cancer Facts and Figures. 2014:1–60. (http://www.cancer.org/research/cancerfactsstatistics/cancerfactsfigures2014/index)
  • 3. Breathnach OS, Freidlin B, Conley B, Green MR, Johnson DH, Gandara DR, et al. Twenty-two years of phase III trials for patients with advanced non-small-cell lung cancer: sobering results. Journal of clinical oncology: official journal of the American Society of Clinical Oncology. 2001;19:1734–42. doi: 10.1200/JCO.2001.19.6.1734.
  • 4. Pass HI, Beer DG, Joseph S, Massion P. Biomarkers and molecular testing for early detection, diagnosis, and therapeutic prediction of lung cancer. Thoracic surgery clinics. 2013;23:211–24. doi: 10.1016/j.thorsurg.2013.01.002.
  • 5. Hassanein M, Callison JC, Callaway-Lane C, Aldrich MC, Grogan EL, Massion PP. The state of molecular biomarkers for the early detection of lung cancer. Cancer prevention research (Philadelphia, Pa). 2012;5:992–1006. doi: 10.1158/1940-6207.CAPR-11-0441.
  • 6. Serkova NJ, Glunde K. Metabolomics of cancer. Methods in molecular biology (Clifton, NJ). 2009;520:273–95. doi: 10.1007/978-1-60327-811-9_20.
  • 7. Kwon H, Oh S, Jin X, An YJ, Park S. Cancer metabolomics in basic science perspective. Archives of pharmacal research. 2015. doi: 10.1007/s12272-015-0552-4.
  • 8. Spratlin JL, Serkova NJ, Eckhardt SG. Clinical applications of metabolomics in oncology: a review. Clinical cancer research: an official journal of the American Association for Cancer Research. 2009;15:431–40. doi: 10.1158/1078-0432.CCR-08-1059.
  • 9. Claudino WM, Goncalves PH, di Leo A, Philip PA, Sarkar FH. Metabolomics in cancer: a bench-to-bedside intersection. Critical reviews in oncology/hematology. 2012;84:1–7. doi: 10.1016/j.critrevonc.2012.02.009.
  • 10. Scholz M, Fiehn O. SetupX--a public study design database for metabolomic projects. Pac Symp Biocomput. 2007:169–80.
  • 11. Fiehn O, Wohlgemuth G, Scholz M, Kind T, Lee do Y, Lu Y, et al. Quality control for plant metabolomics: reporting MSI-compliant studies. Plant J. 2008;53:691–704. doi: 10.1111/j.1365-313X.2007.03387.x.
  • 12. Fiehn O, Wohlgemuth G, Scholz M. Setup and annotation of metabolomic experiments by integrating biological and mass spectrometric metadata. Data Integration in the Life Sciences, Proceedings. 2005;3615:224–39.
  • 13. Taylor SL, Kim K. A jackknife and voting classifier approach to feature selection and classification. Cancer informatics. 2011;10:133–47. doi: 10.4137/CIN.S7111.
  • 14. McShane LM, Cavenagh MM, Lively TG, Eberhard DA, Bigbee WL, Williams PM, et al. Criteria for the use of omics-based predictors in clinical trials: explanation and elaboration. BMC medicine. 2013;11:220. doi: 10.1186/1741-7015-11-220.
  • 15. Patz EF Jr, Campa MJ, Gottlin EB, Kusmartseva I, Guan XR, Herndon JE 2nd. Panel of serum biomarkers for the diagnosis of lung cancer. Journal of clinical oncology: official journal of the American Society of Clinical Oncology. 2007;25:5578–83. doi: 10.1200/JCO.2007.13.5392.
  • 16. Li XJ, Hayward C, Fong PY, Dominguez M, Hunsucker SW, Lee LW, et al. A blood-based proteomic classifier for the molecular characterization of pulmonary nodules. Science translational medicine. 2013;5:207ra142. doi: 10.1126/scitranslmed.3007013.
  • 17. Sozzi G, Boeri M, Rossi M, Verri C, Suatoni P, Bravi F, et al. Clinical utility of a plasma-based miRNA signature classifier within computed tomography lung cancer screening: a correlative MILD trial study. Journal of clinical oncology: official journal of the American Society of Clinical Oncology. 2014;32:768–73. doi: 10.1200/JCO.2013.50.4357.
  • 18. Wedge DC, Allwood JW, Dunn W, Vaughan AA, Simpson K, Brown M, et al. Is serum or plasma more appropriate for intersubject comparisons in metabolomic studies? An assessment in patients with small-cell lung cancer. Analytical chemistry. 2011;83:6689–97. doi: 10.1021/ac2012224.
  • 19. Yu Z, Kastenmuller G, He Y, Belcredi P, Moller G, Prehn C, et al. Differences between human plasma and serum metabolite profiles. PloS one. 2011;6:e21230. doi: 10.1371/journal.pone.0021230.
  • 20. Wikoff W, Grapov D, Fahrmann J, DeFelice B, Rom W, Pass H, et al. Metabolomic Markers of Altered Nucleotide Metabolism in Early Stage Adenocarcinoma. Cancer prevention research (Philadelphia, Pa). 2015.
  • 21. Mohamed A, Deng X, Khuri FR, Owonikoko TK. Altered glutamine metabolism and therapeutic opportunities for lung cancer. Clinical lung cancer. 2014;15:7–15. doi: 10.1016/j.cllc.2013.09.001.
  • 22. Pepe MS, Etzioni R, Feng Z, Potter JD, Thompson ML, Thornquist M, et al. Phases of biomarker development for early detection of cancer. Journal of the National Cancer Institute. 2001;93:1054–61. doi: 10.1093/jnci/93.14.1054.
