Author manuscript; available in PMC: 2018 May 1.
Published in final edited form as: Cancer Epidemiol Biomarkers Prev. 2016 Dec 2;26(5):675–683. doi: 10.1158/1055-9965.EPI-16-0366

Metabolomic Characterization of Hepatocellular Carcinoma in Patients with Liver Cirrhosis for Biomarker Discovery

Cristina Di Poto 1, Alessia Ferrarini 1, Yi Zhao 2, Rency S Varghese 1, Chao Tu 1, Yiming Zuo 1, Minkun Wang 1, Mohammad R Nezami Ranjbar 1, Yue Luo 1, Chi Zhang 1, Chirag S Desai 3, Kirti Shetty 4, Mahlet G Tadesse 5, Habtom W Ressom 1,*
PMCID: PMC5413442  NIHMSID: NIHMS834809  PMID: 27913395

Abstract

Background

Metabolomics plays an important role in providing insight into the etiology and mechanisms of hepatocellular carcinoma (HCC). This is accomplished by a comprehensive analysis of patterns involved in metabolic alterations in human specimens. This study compares the levels of plasma metabolites in HCC cases versus cirrhotic patients and evaluates the ability of candidate metabolites in distinguishing the two groups. Also, it investigates the combined use of metabolites and clinical covariates for detection of HCC in patients with liver cirrhosis.

Methods

Untargeted analysis of metabolites in plasma from 128 subjects (63 HCC cases and 65 cirrhotic controls) was conducted using gas chromatography coupled to mass spectrometry (GC-MS). This was followed by targeted evaluation of selected metabolites. LASSO regression was used to select a set of metabolites and clinical covariates that are associated with HCC. The performance of candidate biomarkers in distinguishing HCC from cirrhosis was evaluated through a leave-one-out cross-validation based on area under the receiver operating characteristics (ROC) curve.

Results

We identified 11 metabolites and three clinical covariates that differentiated HCC cases from cirrhotic controls. Combining these features in a panel for disease classification using support vector machines (SVM) yielded better area under the ROC curve compared to alpha-fetoprotein (AFP).

Conclusions

This study demonstrates the combination of metabolites and clinical covariates as an effective approach for early detection of HCC in patients with liver cirrhosis.

Impact

Further investigation of these findings may improve understanding of HCC pathophysiology and possible implication of the metabolites in HCC prevention and diagnosis.

Keywords: Hepatocellular carcinoma, Biomarkers, Metabolomics, Multivariate analysis, Risk factors

Introduction

Hepatocellular carcinoma (HCC) is the fifth most common cancer in the world and the third leading cause of cancer mortality worldwide (1). An estimated 35,660 new cases of liver cancer (including intrahepatic bile duct cancers) were expected to occur in the US in 2015, approximately three-fourths of them HCC (2). The survival rate of patients with HCC is still 5% (3), and it can be significantly improved only if diagnoses are made at earlier stages, when treatment is more effective. Ultrasonography, performed every 6 months, is the currently recommended screening and surveillance method for patients with established liver cirrhosis (4). The diagnosis of HCC by imaging techniques requires availability of equipment and correct interpretation of the results, both of which are limited in regions with a high HCC burden (5). Other than liver imaging and histology, current diagnosis of HCC relies on measuring the level of the serum biomarker α-fetoprotein (AFP). However, the sensitivity and specificity of AFP are not sufficient for diagnosis of HCC, because elevated AFP levels may also be seen in patients with cirrhosis or chronic hepatitis (5). Different variants of AFP such as AFP-L1, AFP-L2 and AFP-L3 have been studied in order to improve its diagnostic performance (6). Des-gamma-carboxy-prothrombin (DCP) has also been investigated as a potential biomarker for HCC (7). However, reliable serological biomarkers for early detection of HCC in the high-risk population of cirrhotic patients are yet to be found and validated.

Metabolomics has been broadly used for biomarker discovery for many human diseases, including cancer (8). It provides simultaneous assessment of a broad range of metabolites. In this paper, we evaluate the levels of plasma metabolites measured by gas chromatography coupled with selected ion monitoring mass spectrometry (GC-SIM-MS), combined with clinical covariates in detecting early stage HCC cases in patients with liver cirrhosis recruited at MedStar Georgetown University Hospital (MGUH), Washington, DC. Metabolites and clinical covariates relevant for detecting HCC in cirrhotic patients were selected through least absolute shrinkage and selection operator (LASSO) logistic regression (9). We observed that the combination of LASSO-selected metabolites and AFP, Child-Pugh score and etiologic factors leads to improved area under the ROC curve compared to AFP. We used correlation and network analyses to evaluate any associations among the selected metabolites and clinical covariates. Finally, we performed pathway enrichment analysis to examine the biological meaning of the results.

Materials and Methods

Study cohort and sample collection

Adult patients were recruited from the hepatology clinic at MGUH. The characteristics of 128 patients investigated in this study are summarized in Table 1. All participants provided informed consent to the study approved by the Institutional Review Board at Georgetown University. The patients were diagnosed to have liver cirrhosis on the basis of established clinical, laboratory and/or imaging criteria. Cases were diagnosed to have HCC based on well-established diagnostic imaging criteria and/or histology. Clinical stages for HCC cases were determined based on the tumor-node-metastasis (TNM) staging system. Controls were required to be HCC free for at least 6 months from the time of study entry.

Table 1.

Characteristics of the study population

Categorical rows give n (%); continuous rows give the statistic named in the second column.

| Characteristic | Level or statistic | Case (N=63) | Control (N=65) | p-value* |
|---|---|---|---|---|
| Age | Mean (SD) | 60.0 (6.4) | 58.6 (7.2) | 0.2561 |
| Race | African American | 17 (27.0) | 15 (23.1) | 0.3938 |
| | White | 33 (52.4) | 43 (66.2) | |
| | Asian | 6 (9.5) | 2 (3.1) | |
| | Hispanic/Latino | 4 (6.3) | 2 (3.1) | |
| | Other | 3 (4.8) | 3 (4.6) | |
| Gender | Female | 18 (28.6) | 19 (29.2) | 1.0000 |
| | Male | 45 (71.4) | 46 (70.8) | |
| Smoker | Current | 14 (22.2) | 15 (23.1) | 1.0000 |
| | Former | 31 (49.2) | 32 (49.2) | |
| | None | 18 (28.6) | 17 (26.2) | |
| Alcohol | Current | 8 (12.7) | 11 (16.9) | 0.7809 |
| | Former | 33 (52.4) | 34 (52.3) | |
| | None | 21 (33.3) | 19 (29.2) | |
| BMI | Mean (SD) | 30.2 (6.6) | 29.2 (6.3) | 0.3838 |
| Diabetes | No | 39 (61.9) | 40 (61.5) | 1.0000 |
| | Yes | 24 (38.1) | 23 (35.4) | |
| Family history of cancer | No | 25 (39.7) | 24 (36.9) | 0.6129 |
| | Unknown | 2 (3.2) | 5 (7.7) | |
| | Yes | 36 (57.1) | 35 (53.8) | |
| Etiology | Alcohol | 17 (27.0) | 25 (38.5) | 0.0210 |
| | Autoimmune | 2 (3.2) | 1 (1.5) | |
| | Cryptogenic | 1 (1.6) | 0 (0.0) | |
| | HBV | 9 (14.3) | 1 (1.5) | |
| | HCV | 39 (61.9) | 29 (44.6) | |
| | NAFLD | 4 (6.3) | 3 (4.6) | |
| | Other | 2 (3.2) | 6 (9.2) | |
| | PBC | 0 (0.0) | 3 (4.6) | |
| | PSC | 2 (3.2) | 4 (6.2) | |
| HCV Ab | Negative | 24 (38.1) | 34 (52.3) | 0.0860 |
| | Positive | 37 (58.7) | 27 (41.5) | |
| Anti HBc | Negative | 30 (47.6) | 40 (61.5) | 0.1822 |
| | Positive | 27 (42.9) | 19 (29.2) | |
| | Unknown | 1 (1.6) | 1 (1.5) | |
| HBsAg | Negative | 49 (77.8) | 55 (84.6) | 0.1247 |
| | Positive | 8 (12.7) | 3 (4.6) | |
| Ascites | No | 37 (58.7) | 21 (33.3) | 0.0038 |
| | Yes | 24 (38.1) | 42 (66.7) | |
| AST | Median (IQR) | 83.0 (74.0) | 67.0 (52.5) | 0.1216 |
| ALT | Median (IQR) | 70.0 (67.0) | 49.5 (36.8) | 0.0009 |
| AFP | Median (IQR) | 28.8 (102.1) | 4.5 (11.0) | 0.0000 |
| MELD | Median (IQR) | 10.0 (5.0) | 14.0 (7.0) | 0.0000 |
| | ≤10 | 30 (47.6) | 10 (15.4) | 0.0000 |
| | >10 | 28 (44.4) | 54 (83.1) | |
| | Mean (SD) | 11.4 (4.1) | 16.2 (13.8) | 0.0087 |
| Stage | I | 37 (58.7) | | |
| | II | 20 (31.7) | | |
| | III | 6 (9.5) | | |
| HCV RNA | Median (IQR) | 350800.0 (856449.0) | 293900.0 (878781.4) | 0.7622 |
| | >281 | 27 (42.9) | 16 (24.6) | 0.3415 |
| | ≤281 | 5 (7.9) | 7 (10.8) | |
| Child Pugh Score | Median (IQR) | 7.0 (3.0) | 9.0 (3.0) | 0.0001 |
| | Mean (SD) | 7.1 (2.1) | 8.8 (2.4) | 0.0001 |
| Child Pugh Grade | A | 24 (38.1) | 9 (13.8) | 0.0011 |
| | B | 23 (36.5) | 34 (52.3) | |
| | C | 7 (11.1) | 18 (27.7) | |

* Fisher exact test was used for categorical variables. Wilcoxon rank sum test was used for continuous variables that were not symmetrically distributed.
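To make the categorical comparison in Table 1 concrete, the following is a minimal sketch of a two-sided Fisher exact test for a 2x2 table, applied to the gender counts reported in Table 1. The function name `fisher_exact_2x2` and the small tolerance used when comparing probabilities are illustrative choices, not the authors' implementation; the original analysis software is not stated in the table footnote.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher exact p-value for the table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    # The relative tolerance guards against floating-point ties.
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-7))


# Gender counts from Table 1: rows = case / control, columns = female / male.
p_gender = fisher_exact_2x2(18, 45, 19, 46)
print(round(p_gender, 4))
```

For variables with more than two levels (race, etiology), the same idea extends to r x c tables, which this sketch does not cover.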

A single blood sample was drawn by peripheral venipuncture into a 10 mL BD Vacutainer sterile vacuum tube containing EDTA anticoagulant. The blood was immediately centrifuged at 1000g for 10 minutes at room temperature. The plasma supernatant was carefully collected and centrifuged at 2500g for 10 minutes at room temperature. After aliquoting, plasma was kept frozen at −80°C until use.

Chemicals and reagents

Deuterium-labeled internal standards were purchased from CDN Isotopes (Pointe-Claire, QC, Canada). These include Tyrosine-d2 (D-1611), L-glutamic-2,3,3,4,4-d5 acid (D-899), L-alanine-2,3,3,3-d4 (D-1488) and L-phenyl-d5-alanine-2,3,3-d3 (D-1241). Glycine-d5 (175838), Myristic acid-d27 (366889), an alkane standard mixture (68281), the fatty acid methyl ester standards (FAMEs) C8 (260673), C9 (245895), C10 (299030), C12 (234591), C14 (P5177), C16 (P5177), C18 (S5376), C20 (10941), C22 (11940), C24 (87115), C26 (H6389) and C28 (74701), methoxyamine hydrochloride (226904), and pyridine (360570) were purchased from Sigma Aldrich (St. Louis, MO, USA). The C30 FAME standard (T0812) was purchased from TCI Chemicals (Portland, OR, USA). MSTFA (TS-48910) was purchased from Thermo Scientific (Waltham, MA, USA). HPLC grade 2-propanol, acetonitrile and water were used for metabolite extraction. Helium was purchased from Robert Oxygen (Rockville, MD, USA).

Experimental design and quality assessment

Among the 128 participants, plasma collected from 120 subjects (60 HCC cases and 60 patients with liver cirrhosis) was used for untargeted analysis and plasma from 84 subjects (40 HCC cases and 44 patients with liver cirrhosis) was used for targeted analysis, with an overlap of 74 participants between the two analyses. Plasma samples were divided into batches, with balanced proportions of cases and controls by the clinical covariates age, sex and ethnicity, to allow adequate time intervals between sample derivatization and GC-MS instrument calibration prior to each analysis. For the untargeted experiment, samples were split into three batches of 40 samples each, while for targeted analysis, samples were divided into two batches of 41 and 43 samples, respectively. To monitor the system’s stability and performance, quality assurance procedures were applied as follows. First, a retention index (RI) standard mixture was run at the beginning and the end of each batch for retention index calibration. The standard mixture was prepared by mixing a series of fatty acid methyl esters (FAMEs, C8–C30) and alkanes (C10–C40), as previously described (10). Then, blank samples were prepared together with the patient samples by adding the derivatization reagents to an empty tube and following the same steps, in order to monitor possible contamination and background ions introduced by the derivatization process. Finally, quality control (QC) samples were prepared by pooling 10 μL of each derivatized sample within the batch; they were run at the beginning of each batch for system equilibration, in between runs, and at the end for quality assessment.

Metabolite extraction

Plasma metabolites were extracted by adding 1 mL of a working solution composed of acetonitrile, isopropanol and water (3:3:2) to 30 μL of plasma. The working solution contained isotope-labeled internal standards (Tyrosine-d2, L-glutamic-2,3,3,4,4-d5 acid, L-alanine-2,3,3,3-d4, L-phenyl-d5-alanine-2,3,3-d3, Glycine-d5, Myristic acid-d27) at a concentration of 1.25 μg/mL, which were used to evaluate the quality of metabolite extraction. After vortexing, samples were centrifuged at 14,500g for 15 minutes at room temperature. The supernatant was then split into aliquots of 460 μL for subsequent untargeted and targeted analyses by GC-MS. Each aliquot was then concentrated to dryness in a SpeedVac. The dried samples were kept at −20°C until derivatization prior to analysis by GC-MS. For derivatization, 20 μL of a 20 mg/mL solution of methoxyamine hydrochloride in pyridine was added to the dried extracts, which were vortexed and incubated at 30°C for 90 minutes. After the samples had returned to room temperature, 80 μL of MSTFA was added, and the samples were vortexed and incubated at 30°C for 30 minutes. Samples were then centrifuged at 14,500rpm for 15 minutes, and 60 μL of the supernatant was transferred into 250 μL clear glass autosampler vials.

Data acquisition and pre-processing

Untargeted metabolomic data were acquired by analyzing metabolites extracted from the plasma samples. The analysis was carried out using two GC-MS systems operated at full scan: a GC-qMS (Agilent Technologies 5975C MSD coupled to an Agilent Technologies 7890A GC) and a GC-TOFMS (LECO Pegasus TOF coupled to an Agilent Technologies 7890A GC). The GC-MS data acquisition and pre-processing were performed following the methods we reported previously (10). For targeted analysis, 46 metabolites from the following three sources were considered: 1) metabolites with statistically significant changes in the untargeted analysis of the samples derived from 120 participants of the same cohort [see (10) for details on the statistical method], 2) metabolites selected from our previous GC-MS study conducted on an Egyptian cohort (10) and 3) metabolites retrieved from the literature by text mining. Targeted quantification was performed in selected ion monitoring (SIM) mode by using the GC-qMS platform, as described previously (10). For each analyte, four ions were selected based on their specificity, where one ion was used as a quantifier for intensity calculation and the other three as qualifiers for confirmation. The fragments were manually selected based on the uniqueness across co-eluting analytes and their relative intensity compared to the base peak in the spectrum. Time segments were set up to allow at least 10msec dwell time for each ion monitored. The complete list of the targeted metabolites (together with the IS) is shown in Supplementary Table S1. The GC-SIM-MS data were pre-processed by the Automated Mass Spectral Deconvolution and Identification System (AMDIS) for peak detection, deconvolution and identification (11). The resulting peaks were aligned using Mass Profiler Professional (MPP) from Agilent Technologies.

Selection of metabolites and clinical covariates by LASSO

Two LASSO regression models were applied to select a set of metabolites and a set of clinical covariates, respectively, based on their association with disease status (HCC versus cirrhosis). For the metabolites, the data matrix was obtained by pre-processing the GC-SIM-MS runs of the 84 plasma samples, using the intensity of the quantifier ion selected for each of the 46 metabolite targets. The metabolite intensities were log-transformed and the batch effect was removed using ComBat in R. For the clinical covariates, AFP measurements (reported in ng/mL) were also log-transformed to satisfy the linearity assumption with the log-odds of HCC status. For both LASSO models, the tuning parameter was chosen by leave-one-out cross-validation with deviance as the loss function. A univariate logistic regression model was also fit on each individual metabolite to examine its association with the risk of HCC. Adjusted p-values were calculated following the Benjamini-Hochberg procedure (12). To further investigate the performance of the metabolites for early detection of HCC, a multinomial logistic regression model was fit considering HCC stages I and II combined as one group and using the cirrhotic controls as the reference group.
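The selection step can be illustrated with a small, self-contained sketch of L1-penalized (LASSO) logistic regression whose penalty is chosen by leave-one-out cross-validated deviance. This is not the authors' R code: the proximal-gradient solver, the function names, the penalty grid and the two-feature toy data are all assumptions made for illustration, and the iteration counts are kept small for speed.

```python
import math


def soft_threshold(z, t):
    """Proximal operator of the L1 penalty: shrink z toward zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def sigmoid(z):
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_l1_logistic(X, y, lam, step=0.1, n_iter=500):
    """Minimise mean logistic loss + lam * ||beta||_1 by proximal gradient."""
    n, p = len(X), len(X[0])
    b0, beta = 0.0, [0.0] * p
    for _ in range(n_iter):
        resid = [sigmoid(b0 + sum(bj * xj for bj, xj in zip(beta, row))) - yi
                 for row, yi in zip(X, y)]
        b0 -= step * sum(resid) / n  # the intercept is not penalised
        for j in range(p):
            grad_j = sum(r * row[j] for r, row in zip(resid, X)) / n
            beta[j] = soft_threshold(beta[j] - step * grad_j, step * lam)
    return b0, beta


def deviance(b0, beta, row, yi):
    """Binomial deviance contribution of one held-out sample."""
    prob = sigmoid(b0 + sum(bj * xj for bj, xj in zip(beta, row)))
    prob = min(max(prob, 1e-12), 1 - 1e-12)
    return -2.0 * (yi * math.log(prob) + (1 - yi) * math.log(1 - prob))


def loocv_deviance(X, y, lam, n_iter=300):
    """Mean held-out deviance over leave-one-out folds."""
    total = 0.0
    for i in range(len(X)):
        b0, beta = fit_l1_logistic(X[:i] + X[i + 1:], y[:i] + y[i + 1:],
                                   lam, n_iter=n_iter)
        total += deviance(b0, beta, X[i], y[i])
    return total / len(X)


# Hypothetical toy data: 6 "cases" (1) and 6 "controls" (0), two features.
y = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
x0 = [1.2, 0.8, 1.5, 0.3, 0.9, -0.2, -1.0, -0.7, -1.3, 0.1, -0.9, -0.7]
x1 = [0.5, -0.4, 0.1, -0.6, 0.3, 0.2, 0.4, -0.5, 0.2, -0.3, 0.6, -0.5]
X = [[a, b] for a, b in zip(x0, x1)]

lambdas = [0.01, 0.05, 0.2, 1.0]  # illustrative penalty grid
cv = {lam: loocv_deviance(X, y, lam) for lam in lambdas}
best_lam = min(cv, key=cv.get)
b0, beta = fit_l1_logistic(X, y, best_lam)
print(best_lam, beta)
```

Features whose coefficient is shrunk exactly to zero at the chosen penalty are the ones LASSO drops; in practice features would be standardized first and a dedicated solver would be used.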

Performance evaluation of predictors

Logistic regression models and support vector machines (SVM) were built to evaluate the performance of the predictors selected by LASSO. We evaluated the performances of four sets of predictors: 1) AFP measurements only; 2) clinical covariates selected by LASSO; 3) metabolites selected by LASSO; and 4) the combination of 2) and 3). Receiver operating characteristics (ROC) curves and 95% confidence interval (CI) of area under the ROC curve (AUC) calculated based on leave-one-out cross-validation were used for performance evaluation.
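The evaluation scheme can be sketched as follows: each sample is held out once, a model is trained on the remaining samples, and the pooled held-out scores are summarised by the AUC. In this sketch a simple nearest-centroid score stands in for the logistic regression and SVM models used in the study, and the one-feature toy data are hypothetical; only the cross-validation and AUC logic are the point.

```python
def auc(scores, labels):
    """AUC as the probability that a random case outscores a random control."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def centroid(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]


def centroid_score(train_X, train_y, x):
    """Stand-in classifier: larger when x is closer to the case centroid."""
    c1 = centroid([r for r, l in zip(train_X, train_y) if l == 1])
    c0 = centroid([r for r, l in zip(train_X, train_y) if l == 0])
    d1 = sum((a - b) ** 2 for a, b in zip(x, c1))
    d0 = sum((a - b) ** 2 for a, b in zip(x, c0))
    return d0 - d1


def loocv_scores(X, y):
    """Score each sample with a model trained on all other samples."""
    return [centroid_score(X[:i] + X[i + 1:], y[:i] + y[i + 1:], X[i])
            for i in range(len(X))]


# Hypothetical toy data: one feature, 6 cases (1) and 6 controls (0).
y = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
X = [[1.2], [0.8], [1.5], [0.3], [0.9], [-0.2],
     [-1.0], [-0.7], [-1.3], [0.1], [-0.9], [-0.7]]

loocv_auc = auc(loocv_scores(X, y), y)
print(round(loocv_auc, 3))
```

A confidence interval for the AUC, as reported in the study, would require an additional step (for example a bootstrap), which is omitted here.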

Correlation analysis

Pearson correlation coefficients were calculated to investigate associations between the LASSO-selected metabolites. Separate correlation graphs were obtained for the HCC and cirrhotic groups by using the R corrplot package. The p-values were adjusted for multiple comparisons by the Benjamini-Hochberg procedure (12). The significance cutoff was set to 0.05. Associations between LASSO-selected metabolites and a subset of clinical covariates were also investigated, excluding 17 patients who had missing values for the clinical covariates.
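The two calculations named above can be sketched in a few lines: a Pearson correlation coefficient and the Benjamini-Hochberg adjustment of a list of p-values. The function names and the example numbers are illustrative, and the step that converts each correlation into a p-value (a t-test on the coefficient) is omitted.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values from four pairwise correlation tests.
raw_p = [0.01, 0.04, 0.03, 0.005]
adj_p = benjamini_hochberg(raw_p)
significant = [p <= 0.05 for p in adj_p]
print(adj_p, significant)
```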

Metabolites ID verification by standards

Identities of the majority of the metabolites selected by LASSO were confirmed by analysis of authentic compounds purchased from Sigma Aldrich: L-valine (PHR1172), glycine (G7403), DL-isoleucine (298689), creatinine (C4255), L-pyroglutamic acid (83160) / [L-glutamic acid (95436)], alpha-D-glucosamine 1-phosphate (G9753), tagatose (T2751) [sorbose (S4887)], linoleic acid (L1376), lauric acid (61609). Individual 0.25 mg/mL stock standard solutions were prepared in the appropriate solvent and stored at −20 °C until analysis. Working standard solutions, at a concentration of 1.25 μg/mL, were prepared by appropriate dilution of the stock standard solutions in acetonitrile, isopropanol, and water (3:3:2). Standards were then concentrated to dryness and derivatized following the same procedure applied to plasma metabolites, as described in the Metabolite extraction section. Each standard was analyzed on both the GC-qMS and GC-TOF-MS platforms, following the same GC and MS methods as described in Ranjbar et al. (10). Acquired spectra of the individual standards were cross-matched with the corresponding spectra extracted from the analysis of the plasma samples.

Network and pathway analysis

To investigate the associations among LASSO-selected metabolites, group-specific networks were built for HCC cases and cirrhotic controls through a Gaussian graphical model (GGM) and the graphical LASSO algorithm implemented in the R glasso package (13). In a GGM network, a connection between two nodes indicates conditional dependence between them given all the others; the absence of a connection indicates conditional independence. The sparsity parameter was tuned based on the result of a 10-fold cross-validation applying the one standard error rule (14). The shared and group-specific connections between the HCC and cirrhotic GGM networks were also investigated. Further evaluation of the shared connections between the two GGM networks was conducted by recovering metabolites that were not detected in our experiment but are reported in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database to interact with a pair of nodes in the GGM networks. This was accomplished by using the Matlab MetaboNetworks toolbox to find the shortest path between each metabolite pair in the KEGG database (15). To get a deeper understanding of the biological relation between the metabolites we detected in our study and those derived from KEGG in HCC and cirrhotic patients, we performed pathway enrichment analysis using MetaboAnalyst 3.0 (16).
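The meaning of an edge in a GGM can be illustrated with a small sketch that converts a covariance matrix into partial correlations through its inverse (the precision matrix). A zero off-diagonal entry of the precision matrix corresponds to a missing edge. The three-variable covariance matrix below is a hypothetical chain X, Y, Z, and the sketch does not implement the graphical LASSO itself, which additionally applies an L1 penalty when estimating the precision matrix.

```python
import math


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(matrix)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def partial_correlations(cov):
    """Partial correlation of each pair given all other variables."""
    theta = invert(cov)  # precision matrix
    n = len(theta)
    return [[1.0 if i == j
             else -theta[i][j] / math.sqrt(theta[i][i] * theta[j][j])
             for j in range(n)] for i in range(n)]


# Hypothetical covariance of a chain: Y = X + noise, Z = Y + noise.
cov = [[1, 1, 1],
       [1, 2, 2],
       [1, 2, 3]]
pc = partial_correlations(cov)

marginal_xz = cov[0][2] / math.sqrt(cov[0][0] * cov[2][2])
print(round(marginal_xz, 3), round(abs(pc[0][2]), 3))
```

In this example X and Z are marginally correlated but have zero partial correlation given Y, so a GGM would connect X to Y and Y to Z but not X to Z.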

Results

Metabolites selected by LASSO regression

LASSO regression selected eleven metabolites whose levels jointly differentiate HCC cases from cirrhotic controls. Table 2 presents statistical results for these variables based on multivariable analysis and univariate logistic regression (p-values and FDR-adjusted p-values), along with fold changes (mean and median values calculated on the raw intensities after correcting for the batch effect). Their fold changes ranged between +2.48 (alpha-D-glucosamine 1-phosphate) and −2.18 (tagatose). Figure 1 depicts their dot plots. The selected metabolites include amino acids and their derivatives (valine, serine, glycine, isoleucine, creatinine and pyroglutamic acid/glutamic acid), sugars and their derivatives (a furanose sugar and alpha-D-glucosamine 1-phosphate), fatty acids (linoleic acid and lauric acid), and one inorganic acid (phosphoric acid). Among the metabolites selected by LASSO, those found to be significant in the multinomial logistic regression model as discriminating Stage I & II HCC cases from cirrhotic controls are indicated in Table 2. While we were able to confirm with high confidence (match value >850) the identity of nine of the selected metabolites by comparing their fragmentation patterns with those from commercial and/or in-house libraries, we were unable to determine with certainty the identity of the metabolite belonging to the class of furanose sugars. However, based on the similarity with the RI of the standard, we determined tagatose as the likely identification. We also could not distinguish between pyroglutamic acid and glutamic acid based on our data. Although both names (pyroglutamic and glutamic acid) are kept as the identification of the selected metabolite in the following sections of the paper, only glutamic acid was chosen for investigation by literature search and pathway analysis. This is because pyroglutamic acid is most likely a product of glutamate cyclisation during the chemical derivatization process (17).
Higher levels of valine, serine, isoleucine, alpha-D-glucosamine 1-phosphate and linoleic acid were found in HCC cases, while glycine, creatinine, glutamic acid, tagatose, lauric acid and phosphoric acid were elevated in cirrhotic controls.
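The signed fold changes reported in Table 2 can be read with the help of a short sketch. It assumes a common convention that the paper does not spell out: the ratio of the HCC summary statistic to the cirrhosis summary statistic is reported as is when it is at least 1, and as the negative reciprocal when it is below 1. The function name and the intensity values are hypothetical.

```python
from statistics import mean, median


def signed_fold_change(case_values, control_values, stat=mean):
    """Signed fold change of cases versus controls (assumed convention).

    Returns ratio when ratio >= 1, otherwise -1 / ratio, so that a halving
    is reported as -2.0 rather than 0.5.
    """
    ratio = stat(case_values) / stat(control_values)
    return ratio if ratio >= 1 else -1.0 / ratio


# Hypothetical batch-corrected raw intensities for one metabolite.
hcc = [140.0, 160.0, 150.0, 310.0]
cirr = [100.0, 90.0, 110.0, 100.0]

print(signed_fold_change(hcc, cirr, mean), signed_fold_change(hcc, cirr, median))
```

Reporting both the mean-based and median-based values, as Table 2 does, shows how sensitive the fold change is to a few extreme samples.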

Table 2.

Metabolites and clinical covariates selected by LASSO

| Class | Metabolite or covariate | Fold change, raw (mean) | Fold change, raw (median) | Multivariable analysis: coefficient (p-value) | Univariate logistic regression: coefficient (p-value, adj. p-value*) |
|---|---|---|---|---|---|
| Amino acids and derivatives | valine (a, b) | ↑ +1.42 | ↑ +1.46 | 1.249 (0.036) | 1.016 (0.007, 0.091) |
| | serine | ↑ +1.23 | ↑ +1.13 | 0.958 (0.042) | 0.428 (0.152, 0.381) |
| | glycine (a) | ↓ −1.31 | ↓ −1.21 | −1.821 (0.073) | 2.117 (0.002, 0.091) |
| | isoleucine (a, b) | ↑ +1.29 | ↑ +1.28 | 1.251 (0.138) | 1.301 (0.021, 0.168) |
| | creatinine | ↓ −1.58 | ↓ −1.53 | −0.782 (0.138) | −0.508 (0.086, 0.330) |
| | pyroglutamic acid/glutamic acid (a, b) | ↓ −1.24 | ↓ −1.19 | −2.101 (0.079) | 2.101 (0.006, 0.0914) |
| Sugars and alcohols | alpha-D-glucosamine 1-phosphate | ↑ +1.49 | ↑ +2.48 | 0.521 (0.012) | 0.146 (0.160, 0.381) |
| | tagatose (furanose sugar) (a, b) | ↓ −2.18 | ↓ −1.97 | 0.092 (0.633) | 0.359 (0.011, 0.102) |
| Fatty acids | linoleic acid (a) | ↑ +1.42 | ↑ +1.77 | 1.808 (0.007) | 0.863 (0.005, 0.091) |
| | lauric acid | ↓ −1.31 | ↓ −1.35 | 1.344 (0.024) | −0.543 (0.123, 0.355) |
| Inorganic acid | phosphoric acid | ↓ −1.03 | ↓ −1.01 | −1.530 (0.140) | −0.331 (0.514, 0.685) |
| Clinical covariates | AFP | ↑ +1.99 | ↑ +2.53 | 0.405 (0.032) | 0.547 (0.003, 0.065) |
| | Child Pugh Score | - | - | −0.287 (0.142) | 0.469 (0.005, 0.065) |
| | Etiology (Alcohol vs. HCV) | - | - | −1.725 (0.168) | 2.287 (0.042, 0.283) |
| | Etiology (HBV vs. HCV) | - | - | 16.929 (0.993) | 17.358 (0.994, 0.994) |
| | Etiology (NAFLD vs. HCV) | - | - | −0.378 (0.827) | −0.901 (0.482, 0.765) |
| | Etiology (Other vs. HCV) | - | - | −0.851 (0.370) | −1.055 (0.179, 0.479) |

* Multiple testing adjusted p-values.

a Metabolites found to be significant in the multinomial logistic regression model as discriminating HCC Stage I and II versus cirrhotic controls.

b Metabolites previously reported in our GC-MS study on an Egyptian cohort (10).

Figure 1.


Individual dot plots of LASSO-selected metabolites and AFP. Horizontal lines represent median.

Clinical covariates selected by LASSO regression

LASSO regression analysis applied to the clinical variables selected AFP (dot plot in Figure 1), Child-Pugh score, and etiologic factors comprising alcohol, viral infection (HBV, HCV), non-alcoholic fatty liver disease (NAFLD), and other less frequent etiologies including autoimmune, primary biliary cirrhosis (PBC), primary sclerosing cholangitis (PSC) and cryptogenic. The results of multivariable analysis and univariate logistic regression of these clinical covariates are shown in Table 2. Because the clinical covariates age, sex and ethnicity were matched between HCC cases and cirrhotic controls in our study, we did not anticipate that the LASSO model would select them; we nevertheless included them in the analysis and, as expected, they did not turn out to be significant.

Performance evaluation of predictors

Through leave-one-out cross-validation, we evaluated the ability of the predictors selected by LASSO to distinguish cirrhotic controls from early stage HCC, excluding the group of patients with HCC stage III (n=3). Figures 2A, 2B, and 2C present box plots for AUC values and the corresponding 95% CI calculated based on the training set during leave-one-out cross-validation of logistic regression and SVM models. Figures 2D and 2E depict ROC curves obtained while testing the logistic regression and SVM models during the leave-one-out cross-validation. As shown in these figures, the logistic regression models built with the LASSO-selected metabolites (AUC = 0.808) and with the clinical covariates (AUC = 0.788) led to improved performance compared to AFP (AUC = 0.723). Although the logistic regression model built by combining the LASSO-selected metabolites with clinical covariates in a panel (AUC = 0.733) did not perform well, the SVM built using these predictors (AUC = 0.857) outperformed the remaining three sets of predictors: LASSO-selected metabolites (AUC = 0.805), LASSO-selected clinical covariates (AUC = 0.786), and AFP (AUC = 0.712).

Figure 2.


Box plots of AUC values obtained based on the training set during leave-one-out cross-validation of logistic regression (A) and SVM (B) models, and the corresponding 95% CI (C). ROC curves obtained while testing the logistic regression (D) and SVM (E) models during the leave-one-out cross-validation using four sets of predictors: AFP (dot dashed line), clinical covariates (dotted line), metabolites only (dashed line), combined metabolites and clinical covariates (solid line).

Correlation analysis

Pearson correlation among the panel of LASSO-selected metabolites showed strong relationships between amino acids, fatty acids, alpha-D-glucosamine 1-phosphate and pyroglutamic acid/glutamic acid in HCC patients (Supplementary Figure S1). In particular, creatinine is strongly correlated with pyroglutamic acid/glutamic acid, alpha-D-glucosamine 1-phosphate and lauric acid. Lauric acid is also positively correlated with pyroglutamic acid/glutamic acid, alpha-D-glucosamine 1-phosphate, and linoleic acid. On the other hand, in cirrhotic controls (Supplementary Figure S2), phosphoric acid shows a positive correlation with linoleic acid, which also presents a moderate and strong negative correlation with a furanose sugar. Valine shows a moderate and a strong positive correlation with isoleucine in HCC cases and cirrhotic controls, respectively. In addition to the correlation among the LASSO-selected metabolites, we investigated their relationship with two of the three clinical covariates selected by LASSO (AFP and Child-Pugh score). In this panel, AFP was negatively correlated with glycine, creatinine and Child-Pugh score in cirrhotic controls. The Child-Pugh score presented a negative correlation with isoleucine in both groups of patients and with creatinine in HCC cases only.

Network and pathway analysis

Figure 3 shows GGM networks built for HCC and cirrhotic controls separately, along with the merged one. As shown in the merged graph (Figure 3, merged panel), there are four connected pairs, composed of five metabolites with statistically significant differences between HCC and cirrhotic groups (darker blue nodes). The five metabolites are alpha D-glucosamine 1-phosphate, valine, serine, lauric acid, and linoleic acid. Among them, alpha D-glucosamine 1-phosphate serves as a hub metabolite connected to all the other four. Further evaluation of these four metabolites by searching for the shortest path between each metabolite pair against the KEGG database (15) revealed 22 metabolites connecting the four metabolites as shown in Supplementary Figure S3. Metabolite names and KEGG IDs for the original and recovered analytes are listed in Supplementary Table S2. Pathway enrichment analysis, using MetaboAnalyst 3.0, based on the original and recovered metabolites, derived from the GGM network analysis (Figure S3), showed the involvement of the selected metabolites in nine specific pathways common to both HCC and cirrhotic groups (Supplementary Table S3).
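The recovery of intermediate metabolites by shortest-path search can be sketched with a breadth-first search over an adjacency list. The study used the Matlab MetaboNetworks toolbox against KEGG; the graph below is a hypothetical toy network with placeholder node names (M1 to M5), not KEGG data, and the function name is illustrative.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Breadth-first search; returns one shortest path or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


# Hypothetical reaction network: keys are metabolites, values are neighbours.
toy_graph = {
    "M1": ["M2", "M3"],
    "M2": ["M4"],
    "M3": ["M4"],
    "M4": ["M5"],
}

path = shortest_path(toy_graph, "M1", "M5")
recovered = path[1:-1]  # intermediates that were not among the measured pair
print(path, recovered)
```

Here M1 and M5 play the role of two connected GGM nodes, and the intermediates on the path correspond to the "recovered" metabolites added to the network.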

Figure 3.


Network analysis. Each node is shaded in proportion to its significance level – the darker the node, the smaller the adjusted p-value; the node shape represents the fold change between HCC and CIRR (diamond nodes for fold change > 1, and circular nodes for fold change < 1); the edges represent whether the association between the metabolites was based on the data from the HCC group (dotted line), cirrhotic group (solid line), or shared by both (double line).

Discussion

In this study, we conducted targeted analysis of metabolites in plasma samples from HCC cases and patients with liver cirrhosis. LASSO regression analysis of the metabolomic data selected eleven metabolites and three clinical covariates including AFP. Combined by SVM in a panel, these predictors showed improved performance in disease classification, compared to AFP only (Figure 2B). If successfully validated, the panel can potentially improve the ability to detect and monitor HCC in high risk population of cirrhotic patients.

Among the 11 LASSO-selected metabolites, four (valine, isoleucine, glutamic acid, and the furanose sugar) were also found statistically significant in a GC-MS-based metabolomic analysis we conducted previously using plasma samples from HCC cases and patients with liver cirrhosis recruited in Tanta, Egypt (10). Of the three branched-chain amino acids (BCAAs), valine and isoleucine were elevated in HCC versus cirrhosis in both the U.S. and Egyptian cohorts. Although the difference was not statistically significant, leucine also showed an increased level in HCC versus cirrhosis, consistent with our previous findings (10). BCAAs have been reported to have a crucial role in cancer by regulating the anabolic processes involving protein synthesis and degradation, needs that are shared by both tumor and normal cells (18). The severe muscle wasting syndrome experienced by many cancer patients has motivated the use of BCAA supplements, which are already used extensively in athletics to improve performance and muscle mass. The use of BCAAs as biomarkers is therefore challenging, owing to the competing energetic and proliferative demands in both healthy and disease states (18–20). However, high levels of BCAAs in HCC samples could be due to their potential tumorigenic effect in the liver, and BCAAs may be a significant component of diagnostic testing panels for monitoring the risk of cancer (19). While we found a reduced level of glutamic acid in HCC versus cirrhosis in this study, we observed an increased level in our previous study involving the Egyptian cohort (10). In another plasma metabolomics study conducted on patients with liver diseases (21), the level of glutamic acid was decreased in all three types of liver disease (hepatitis, cirrhosis, HCC) when compared to healthy controls. According to the authors, this remarkable reduction can be explained by altered activities and ratios of two monitored transaminases across the three groups of patients.
Among the LASSO-selected sugars, tagatose appears to be downregulated in HCC, similarly to sorbose, another furanose sugar, identified in our previous study using the Egyptian cohort (10). Complementary techniques capable of discriminating sugar isomers will be necessary to investigate the nature of these sugars and their contribution to promoting hypoxia-inducible factors, which are prevalent in low-oxygen environments such as solid tumors like HCC (17).

The selection by LASSO of multiple clinical variables in addition to AFP agrees with several epidemiological and clinical studies showing that the sensitivity of early detection of HCC in clinical practice increases when longitudinal data or patient characteristics are incorporated alongside the conventional AFP assay (22).

Our correlation analysis shows that valine correlates positively with isoleucine, moderately in HCC cases and strongly in cirrhotic controls, suggesting that this relationship is connected not only to cancer but also to other liver diseases such as cirrhosis, as previously reported (18). Also, the Child-Pugh score, a prognostic indicator in liver disease and of the need for liver transplantation, correlated negatively with isoleucine in both groups of patients. This correlation has been previously reported in patients with liver diseases in a study of muscle and blood amino acid metabolism (23).
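As an illustration of what a per-group correlation comparison looks like, the sketch below computes a correlation coefficient separately for two groups. It assumes a Pearson coefficient, which this section does not specify; the function name `pearson` and all values are made up for illustration and are not study data.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Made-up metabolite levels (arbitrary units), NOT data from this study.
hcc = {
    "valine": [1.0, 2.0, 3.0, 4.0, 5.0],
    "isoleucine": [1.2, 1.9, 3.5, 3.1, 4.8],
}
cirrhosis = {
    "valine": [1.0, 2.0, 3.0, 4.0, 5.0],
    "isoleucine": [1.1, 2.0, 2.9, 4.2, 5.0],
}

# Correlate the same pair of metabolites within each group separately.
r_hcc = pearson(hcc["valine"], hcc["isoleucine"])
r_cirrhosis = pearson(cirrhosis["valine"], cirrhosis["isoleucine"])

print("HCC:", round(r_hcc, 3))
print("Cirrhosis:", round(r_cirrhosis, 3))
```

Computing the coefficient within each group, rather than on the pooled samples, keeps a between-group shift in metabolite levels from inflating or masking the within-group relationship.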

The pathway enrichment analysis (Table S3, Figure S3) highlights the interchange in the hepatic metabolome between lipids and water-soluble metabolites, which is crucial for liver energy production and consumption (24) and therefore essential for the aberrant metabolic reprogramming that occurs in cancer cells (25).

In summary, combining metabolites with clinical covariates, including AFP, yielded a better area under the ROC curve for distinguishing early HCC cases among patients with liver cirrhosis than AFP alone. Previous HCC-related metabolomics studies, conducted using complementary metabolomics platforms and multivariate analysis, including a GC-MS study of an Egyptian cohort, reported findings similar to those in this paper. Because of the small sample size of this study, these findings should be replicated in a larger cohort that includes samples representing diverse populations. Most of the clinical covariates selected by LASSO are commonly reported as HCC risk factors. Thus, following appropriate validation, the metabolites discovered in this study, combined with the selected clinical covariates in a panel, could contribute to a better understanding of the development of HCC and improve our ability to detect early-stage HCC in patients with liver cirrhosis.
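The area under the ROC curve used for this comparison can be read as the probability that a randomly chosen HCC case receives a higher score than a randomly chosen cirrhotic control. The sketch below computes it from that pairwise definition. The function name `auc` is hypothetical and the scores are made up for illustration; they are not results from this study, and the sketch does not reproduce the SVM or cross-validation procedure.

```python
def auc(case_scores, control_scores):
    """Rank-based AUC: fraction of (case, control) pairs in which the case
    scores higher than the control; tied pairs count as half."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

# Made-up classifier scores, NOT data from this study.
single_marker_cases = [0.9, 0.4, 0.6, 0.3]
single_marker_controls = [0.5, 0.2, 0.4, 0.1]
panel_cases = [0.9, 0.7, 0.8, 0.6]
panel_controls = [0.5, 0.2, 0.65, 0.1]

auc_single = auc(single_marker_cases, single_marker_controls)
auc_panel = auc(panel_cases, panel_controls)

print("single marker AUC:", auc_single)  # 0.78125
print("panel AUC:", auc_panel)           # 0.9375
```

In a leave-one-out setting, each subject's score would come from a model fitted without that subject, and the held-out scores would then be pooled into a single AUC as above.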

Supplementary Material

1

Acknowledgments

Financial support: This work was supported by U01CA185188 awarded to Habtom W. Ressom.

The authors acknowledge Dr. Tsung-Heng Tsai for providing constructive comments.

Footnotes

Disclosure of Potential Conflicts of Interest: The authors declare no competing financial interests.

References

  • 1. Torre LA, Bray F, Siegel RL, Ferlay J, Lortet-Tieulent J, Jemal A. Global cancer statistics, 2012. CA: a cancer journal for clinicians. 2015;65:87–108. doi: 10.3322/caac.21262.
  • 2. Situ BCI. Cancer Facts. 2015.
  • 3. Forner A, Llovet JM, Bruix J. Hepatocellular carcinoma. Lancet. 2012 Mar 31;379:1245–55. doi: 10.1016/S0140-6736(11)61347-0.
  • 4. Bruix J, Sherman M. Management of hepatocellular carcinoma: an update. Hepatology. 2011;53:1020–2. doi: 10.1002/hep.24199.
  • 5. Kim JU, Shariff MI, Crossey MM, Gomez-Romero M, Holmes E, Cox IJ, et al. Hepatocellular carcinoma: Review of disease and tumor biomarkers. World journal of hepatology. 2016;8:471. doi: 10.4254/wjh.v8.i10.471.
  • 6. Li D, Mallory T, Satomura S. AFP-L3: a new generation of tumor marker for hepatocellular carcinoma. Clinica chimica acta. 2001;313:15–9. doi: 10.1016/s0009-8981(01)00644-1.
  • 7. Lok AS, Sterling RK, Everhart JE, Wright EC, Hoefs JC, Di Bisceglie AM, et al. Des-γ-carboxy prothrombin and α-fetoprotein as biomarkers for the early detection of hepatocellular carcinoma. Gastroenterology. 2010;138:493–502. doi: 10.1053/j.gastro.2009.10.031.
  • 8. Liesenfeld DB, Habermann N, Owen RW, Scalbert A, Ulrich CM. Review of mass spectrometry-based metabolomics in cancer research. Cancer Epidemiol Biomarkers Prev. 2013 Dec;22:2182–201. doi: 10.1158/1055-9965.EPI-13-0584.
  • 9. Tibshirani R. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society. Series B (Methodological) 1996:267–88.
  • 10. Ranjbar MRN, Luo Y, Di Poto C, Varghese RS, Ferrarini A, Zhang C, et al. GC-MS based plasma metabolomics for identification of candidate biomarkers for hepatocellular carcinoma in Egyptian cohort. PloS one. 2015;10:e0127299. doi: 10.1371/journal.pone.0127299.
  • 11. Stein SE. An integrated method for spectrum extraction and compound identification from gas chromatography/mass spectrometry data. Journal of the American Society for Mass Spectrometry. 1999;10:770–81.
  • 12. Benjamini Y, Hochberg Y. Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. Journal of the Royal Statistical Society. Series B (Methodological) 1995;57:289–300.
  • 13. Friedman J, Hastie T, Tibshirani R. Sparse inverse covariance estimation with the graphical lasso. Biostatistics. 2008 Jul;9:432–41. doi: 10.1093/biostatistics/kxm045.
  • 14. Breiman L, Friedman J, Stone CJ, Olshen RA. Classification and regression trees. CRC press; 1984.
  • 15. Posma JM, Robinette SL, Holmes E, Nicholson JK. MetaboNetworks, an interactive Matlab-based toolbox for creating, customizing and exploring sub-networks from KEGG. Bioinformatics. 2014 Mar 15;30:893–5. doi: 10.1093/bioinformatics/btt612.
  • 16. Xia J, Sinelnikov IV, Han B, Wishart DS. MetaboAnalyst 3.0--making metabolomics more meaningful. Nucleic Acids Res. 2015 Jul 1;43:W251–7. doi: 10.1093/nar/gkv380.
  • 17. Armitage EG, Kotze HL, Allwood JW, Dunn WB, Goodacre R, Williams KJ. Metabolic profiling reveals potential metabolic markers associated with Hypoxia Inducible Factor-mediated signalling in hypoxic cancer cells. Sci Rep. 2015 Oct 28;5:15649. doi: 10.1038/srep15649.
  • 18. O’Connell TM. The complex role of branched chain amino acids in diabetes and cancer. Metabolites. 2013;3:931–45. doi: 10.3390/metabo3040931.
  • 19. Tom A, Nair KS. Assessment of branched-chain amino Acid status and potential for biomarkers. J Nutr. 2006 Jan;136:324S–30S. doi: 10.1093/jn/136.1.324S.
  • 20. Liu KA, Lashinger LM, Rasmussen AJ, Hursting SD. Leucine supplementation differentially enhances pancreatic cancer growth in lean and overweight mice. Cancer Metab. 2014 Mar 31;2:6. doi: 10.1186/2049-3002-2-6.
  • 21. Lin X, Zhang Y, Ye G, Li X, Yin P, Ruan Q, et al. Classification and differential metabolite discovery of liver diseases based on plasma metabolic profiling and support vector machines. Journal of separation science. 2011;34:3029–36. doi: 10.1002/jssc.201100408.
  • 22. Singal AG, El-Serag HB. Hepatocellular Carcinoma from Epidemiology to Prevention: Translating Knowledge into Practice. Clinical Gastroenterology and Hepatology. 2015. doi: 10.1016/j.cgh.2015.08.014.
  • 23. Dam G, Sørensen M, Buhl M, Sandahl TD, Møller N, Ott P, et al. Muscle metabolism and whole blood amino acid profile in patients with liver disease. Scand J Clin Lab Invest. 2015;75:674–80.
  • 24. Beyoğlu D, Idle JR. The metabolomic window into hepatobiliary disease. J Hepatol. 2013;59:842–58. doi: 10.1016/j.jhep.2013.05.030.
  • 25. Zhou S, Huang C, Wei Y. The metabolic switch and its regulation in cancer cells. Science China Life Sciences. 2010;53:942–58. doi: 10.1007/s11427-010-4041-1.
