World Journal of Hepatology. 2021 Jan 27;13(1):94–108. doi: 10.4254/wjh.v13.i1.94

Integrative analysis of layers of data in hepatocellular carcinoma reveals pathway dependencies

Mamatha Bhat 1, Elisa Pasini 2, Chiara Pastrello 3, Sara Rahmati 4, Marc Angeli 5, Max Kotlyar 6, Anand Ghanekar 7, Igor Jurisica 8,9
PMCID: PMC7856865  PMID: 33584989

Abstract

BACKGROUND

The broader use of high-throughput technologies has led to improved molecular characterization of hepatocellular carcinoma (HCC). 

AIM

To comprehensively analyze and characterize all publicly available genomic, gene expression, methylation, miRNA and proteomic data in HCC, covering 85 studies and 3355 patient sample profiles, to identify the key dysregulated genes and pathways they affect. 

METHODS

We collected and curated all well-annotated and publicly available high-throughput datasets from PubMed and Gene Expression Omnibus derived from human HCC tissue. Comprehensive pathway enrichment analysis was performed using pathDIP for each data type (genomic, gene expression, methylation, miRNA and proteomic), and the overlap of pathways was assessed to elucidate pathway dependencies in HCC.

RESULTS

The PubMed searches retrieved a total of 8733 abstracts covering the different layers of data on human HCC samples, published until December 2016. The common key dysregulated pathways in HCC tissue across different layers of data included the epidermal growth factor receptor (EGFR) and β1-integrin pathways. Genes along these pathways were significantly and consistently dysregulated across the different types of high-throughput data and had prognostic value with respect to overall survival. According to the Comparative Toxicogenomics Database (CTD), estradiol was the agent predicted to best revert the dysregulation of these genes.

CONCLUSION

By analyzing and integrating all available high-throughput genomic, transcriptomic, miRNA, methylation and proteomic data from human HCC tissue, we identified EGFR, β1-integrin and axon guidance as pathway dependencies in HCC. These are master regulators of key pathways in HCC, such as the mTOR, Ras/Raf/MAPK and p53 pathways. The genes implicated in these pathways had prognostic value in HCC, with Netrin and Slit3 being novel proteins of prognostic importance to HCC. Based on this integrative analysis, EGFR and β1-integrin are master regulators that could serve as potential therapeutic targets in HCC.

Keywords: Hepatocellular carcinoma, Gene expression, miRNA, Methylation, Proteomics, High throughput data


Core Tip: Analyzing all available high-throughput genomic, transcriptomic, miRNA, methylation and proteomic data from human hepatocellular carcinoma tissue, we identified master regulators of key pathways in hepatocellular carcinoma, such as the mTOR, Ras/Raf/MAPK and p53 pathways.

INTRODUCTION

The molecular basis of hepatocellular carcinoma (HCC) has been elusive, given the significant heterogeneity of this tumor that arises in the context of various chronic liver diseases[1]. HCC remains a high-fatality cancer, despite large-scale efforts to better characterize and therapeutically target this malignancy. Since the prevalence of cirrhosis due to hepatitis C and fatty liver disease is increasing in North America, the incidence of HCC continues to rise[2]. Five-year survival remains poor at 18% due to late diagnosis and inability to tolerate chemotherapy in patients with cirrhosis[2]. Consequently, there is an urgent need to better understand the molecular basis of this highly fatal cancer.

Clinical management of HCC is optimized based on disease stage[3]. Curative treatment with resection, radiofrequency ablation or transplantation is possible in early stage disease[4]. When HCC is diagnosed at a later stage, sorafenib is the first-line chemotherapy, which is directed against the Ras/Raf/MAPK pathway[4]. This is associated with a very modest improvement in overall survival of 3 additional months as compared to placebo (10.7 mo vs 7.9 mo)[5].

The Cancer Genome Atlas (TCGA) is a large-scale project that has enabled improved characterization of cancers with several layers of data. The TCGA multi-platform analysis of 196 HCC tumors described this cancer as highly heterogeneous and difficult to characterize, although certain key pathways did emerge, including the Ras/Raf/MAPK, mTOR, Wnt/β-catenin, and Sonic Hedgehog pathways[1,6]. Integration of various types of data has previously been performed to map interaction networks. By integrating genomic, transcriptomic and proteomic data, one can understand potential interactions that contribute to a disease condition or process[7,8]. These interactions might not be uncovered on the basis of a single type of data. This systems biology approach has been especially important in cancer, given that alterations in one gene can have a ripple effect on proteins in the rest of a protein-protein interaction network. Therefore, elucidating the layers of data in a disease can provide additional insights into the pathways that drive cancer[9].

In the current study, we aim to characterize the landscape of high-throughput data profiling in HCC and determine the patterns in key dysregulated genes and pathways across these different layers of data. The patterns that emerge could help in better understanding the pathways that drive HCC and could be considered as therapeutic targets.

MATERIALS AND METHODS

Data collection, analysis and database compiling

We downloaded all available high-throughput genomic, transcriptomic, microRNA, methylation, and proteomic datasets related to human HCC samples from published datasets (PubMed, http://www.ncbi.nlm.nih.gov/PubMed and Gene Expression Omnibus (GEO), https://www.ncbi.nlm.nih.gov/geo).

Using PubMed, the following search was performed for whole exome sequencing data on HCC: ("carcinoma, hepatocellular" [MeSH Terms] OR ("carcinoma" [All Fields] AND "hepatocellular" [All Fields]) OR "hepatocellular carcinoma" [All Fields] OR ("hepatocellular" [All Fields] AND "carcinoma" [All Fields])) AND (whole [All Fields] AND ("exome" [MeSH Terms] OR "exome" [All Fields]) AND sequencing [All Fields]).

The following MeSH terms were used to identify gene expression papers: ("carcinoma, hepatocellular" [MeSH Terms] OR ("carcinoma" [All Fields] AND "hepatocellular" [All Fields]) OR "hepatocellular carcinoma" [All Fields] OR ("hepatocellular" [All Fields] AND "carcinoma" [All Fields])) AND ("gene expression" [MeSH Terms] OR ("gene" [All Fields] AND "expression" [All Fields]) OR "gene expression" [All Fields]) AND ("humans" [MeSH Terms] OR "humans" [All Fields]) AND English [All Fields] NOT ("review" [Publication Type] OR "review literature as topic" [MeSH Terms] OR "reviews" [All Fields]).

To identify suitable papers regarding methylation in HCC, we used the following terms: ("methylation" [MeSH Terms] OR "methylation"[All Fields]) AND ("carcinoma, hepatocellular" [MeSH Terms] OR ("carcinoma" [All Fields] AND "hepatocellular" [All Fields]) OR "hepatocellular carcinoma" [All Fields] OR ("hepatocellular" [All Fields] AND "carcinoma" [All Fields]) AND ("humans" [MeSH Terms] AND English [lang]).

Proteomics papers were retrieved using the following search: [("proteomics" [MeSH Terms] OR "proteomics" [All Fields]) AND high [All Fields] AND throughput [All Fields]] AND ("carcinoma, hepatocellular" [MeSH Terms]) OR ("carcinoma" [All Fields] AND "hepatocellular" [All Fields]) OR "hepatocellular carcinoma" [All Fields] OR ("hepatocellular"[All Fields] AND "carcinoma"[All Fields]).

MicroRNAs reported in HCC were identified using these MeSH terms: ("micrornas" [MeSH Terms] OR "micrornas"[All Fields] OR "mirna" [All Fields]) AND profile [All Fields] AND ("carcinoma, hepatocellular" [MeSH Terms] OR ("carcinoma" [All Fields] AND "hepatocellular" [All Fields]) OR "hepatocellular carcinoma" [All Fields] OR ("hepatocellular" [All Fields] AND "carcinoma" [All Fields]).

We considered for inclusion all datasets available in PubMed. 

The datasets publicly available on GEO, a public functional genomics data repository of high-throughput array data (https://www.ncbi.nlm.nih.gov/geo), were retrieved and analyzed using GEO2R (https://www.ncbi.nlm.nih.gov/geo/info/geo2r.html), a web tool available on the portal, to identify genes differentially expressed between HCC samples and the non-tumoral liver portion. GEO2R compares original submitter-supplied processed data tables using the GEOquery and limma R packages from the Bioconductor project. Following the GEO2R instructions available online, we retrieved all dysregulated genes. Only those with an adjusted P value < 0.05 and an expression fold change ≤ 0.5 or ≥ 1.5 were considered for further analysis (Table 1, Supplementary Table 1). The genes included in our list from whole exome sequencing (WES) papers were those reported as affected by nonsynonymous mutations; synonymous mutations were not considered. Putative microRNA gene targets were identified using an online database, mirDIP 4.1[10] (http://ophid.utoronto.ca/mirDIP). The most stringent predictive search option (top 1%) was used to obtain the list of putative targets of all differentially expressed miRNAs.
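
The gene-level selection rule above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the gene names and values are invented, and it assumes that fold change is expressed as a linear HCC to non-tumoral ratio. Because limma-style output typically reports log2 fold change, a small conversion helper is included as an assumption about how the thresholds would be applied.

```python
from dataclasses import dataclass


@dataclass
class GeneResult:
    symbol: str         # gene symbol (placeholder names below)
    adj_p: float        # multiple-testing adjusted P value
    fold_change: float  # linear-scale HCC / non-tumoral ratio (assumed)


def log2fc_to_fold_change(log2_fc: float) -> float:
    """Convert a log2 fold change to a linear ratio."""
    return 2.0 ** log2_fc


def is_dysregulated(g: GeneResult, p_cut: float = 0.05,
                    fc_low: float = 0.5, fc_high: float = 1.5) -> bool:
    """Adjusted P < 0.05 and fold change <= 0.5 or >= 1.5."""
    return g.adj_p < p_cut and (g.fold_change <= fc_low or g.fold_change >= fc_high)


# Invented rows for illustration only; not data from the study.
rows = [
    GeneResult("GENE_A", 0.001, log2fc_to_fold_change(1.2)),   # ratio about 2.30
    GeneResult("GENE_B", 0.030, log2fc_to_fold_change(-1.5)),  # ratio about 0.35
    GeneResult("GENE_C", 0.200, 3.0),  # large change but not significant
    GeneResult("GENE_D", 0.010, 1.1),  # significant but change too small
]

selected = [g.symbol for g in rows if is_dysregulated(g)]
print(selected)  # ['GENE_A', 'GENE_B']
```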

Table 1.

List of the final 85 selected publications for each layer of data. For each publication the number of hepatocellular carcinoma samples and controls and the platform used for the analysis are reported

Gene expression
No. year PMID HCC (n) Controls (n) GEO dataset
1 2004 17393520 35 13 GSE6764
2 2008 18504433 11 2 GSE6222
3 2008 18923165 80 82 GSE10143
4 2009 19098997 47 58 GSE14323
5 2009 19861515 16 47 GSE17967
6 2011 21320499 34 34 GSE20140 (GSE10141, GSE10140)
7 2011 21712445 40 40 GSE28248
8 2013 23691139 15 15 GSE17548
9 2013 23800896 GSE36376_276; GSE25097_211 GSE36376_247; GSE25097_283 GSE36376, GSE25097
10 2014 24498002 46 46 GSE47595
11 2014 24564407 45 45 GSE45114
12 2014 25093504 39 40 GSE57958
13 2014 25141867 11 11 GSE55092
14 2014 25376302 18 18 GSE60502
15 2014 25536056 72 72 GSE39791
16 2015 25666192 132 132 GSE54236
17 2015 25645722 228 168 GSE63898
18 2016 27499918 60 60 GSE64041
19 2016 25964079 26 20 GSE54238
Proteomics
No. year PMID HCC (n) Controls (n)
1 2004 14726492 8 8
2 2008 19003864 12 12
3 2005 15759316 10 10
4 2005 16097030 14 14
5 2007 17627933 12 12
6 2014 23621634 3 3
7 2009 19562805 3 3
8 2016 26709725 24 12
9 2013 23589362 20 20
10 2012 22813877 10 10
11 2012 22082227 11 11
12 2011 21631109 69 123
13 2010 20230046 5 5
14 2010 19956837 20 20
15 2009 19715608 18 18
16 2009 19535095 3 3
17 2009 19161326 80 80
18 2004 15221772 20 20
19 2003 14673798 21 21
20 2003 14654528 21 21
21 2002 12481271 11 11
22 2013 23462207 7 7
23 2005 16335951 8 8
24 2006 16342242 10 10
25 2011 22034872 3 3
26 2005 15852300 7 7
27 2011 21913717 3 3
28 2007 17203974 25 28
29 2007 17586277 10 10
Whole exome sequencing
No. year PMID HCC (n) Controls (n) GEO dataset
1 2013 23912677 3 3 N/A
2 2014 24055508 4 7 N/A
3 2017 28323123 5 5 N/A
4 2014 24798001 231 231 GSE54504
5 2012 22561517 24 24 N/A
Epigenetic_miRNAs
No. year PMID HCC (n) Controls (n) GEO dataset
1 2015 26190160 9 7 N/A
2 2014 24789420 10 9 GSE31383
3 2014 24564407 45 45 GSE10694
4 2011 21298008 73 73 GSE21362
5 2008 18649363 78 10 N/A
6 2012 22135159 20 20 N/A
7 2011 21319996 94 94 N/A
8 2009 19473441 20 20 N/A
9 2009 19173277 35 N/A
10 2007 18171346 10 10 N/A
11 2006 16331254 25 25 N/A
12 2015 26062888 30 30 N/A
13 2015 26046780 327 43 N/A
14 2015 25861255 66 66 GSE54751
15 2015 25500075 6 6 GSE54537
16 2014 24875649 24 24
17 2013 23812667 166 166 GSE31384
18 2013 23390000 9 17 GSE40744
19 2012 23082062 18 18 N/A
20 2014 24586785 29 29 N/A
21 2013 24417970 78 78 N/A
Epigenetic methylation
No. year PMID HCC (n) Controls (n) GEO dataset
1 2011 21500188 13 12 N/A
2 2014 24306662 45 45 N/A
3 2014 25376292 22 22 N/A
4 2015 25945129 8 8 GSE59260
5 2011 21747116 12 12 GSE29720
6 2010 20165882 20 20 GSE18081
7 2012 22234943 62 62 GSE37988
8 2013 24012984 20 8 GSE44970
9 2013 23208076 66 66 GSE54503
10 2014 25093504 59 59 GSE57956
11 2014 25294808 27 27 GSE60753

HCC: Hepatocellular carcinoma; GEO: Gene Expression Omnibus; N/A: Not applicable.

From the selected 11 methylation datasets, raw data from eight studies were available on the GEO website (https://www.ncbi.nlm.nih.gov/geo/). We selected the CpG sites or genes reported to be hyper- or hypomethylated in these publications. A genomic region was considered differentially methylated between HCC tissue and the adjacent non-tumoral sample if the FDR-corrected P value was < 0.01. Furthermore, we filtered out everything that did not satisfy the criterion ∆β ≥ 0.20 or ∆β ≤ -0.20, where ∆β = βHCC - βadjacent is the difference in methylation between the groups specified above. When CpG sites were considered, the Illumina HumanMethylation450K and 27K platforms were used for mapping to the genes. When multiple sites or genes were found to have the same direction of differential methylation, the mean value of ∆β was calculated. Only the CpGs in the 5’UTR, 1st Exon, TSS200, TSS1500 or in CpG islands were considered in our analysis. Proteomic results were retrieved and included only if protein abundance was reported as different in HCC liver samples compared to control samples.
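
A minimal sketch of this methylation filter follows, assuming each record carries a gene label, the two β values and an FDR-corrected P value. The record layout, function name and example values are hypothetical, and averaging ∆β per gene and direction is one reading of the rule described above.

```python
from collections import defaultdict
from statistics import mean


def filter_methylation(sites, fdr_cut=0.01, min_abs_delta=0.20):
    """Keep sites with FDR < 0.01 and |delta beta| >= 0.20, then average
    delta beta per gene and direction (hyper or hypo).

    sites: iterable of (gene, beta_hcc, beta_adjacent, fdr) tuples.
    Note: values lying exactly on the 0.20 boundary can be affected by
    floating-point rounding.
    """
    grouped = defaultdict(list)
    for gene, beta_hcc, beta_adjacent, fdr in sites:
        delta = beta_hcc - beta_adjacent  # delta beta = beta_HCC - beta_adjacent
        if fdr < fdr_cut and abs(delta) >= min_abs_delta:
            direction = "hyper" if delta > 0 else "hypo"
            grouped[(gene, direction)].append(delta)
    return {key: mean(values) for key, values in grouped.items()}


# Invented CpG records for illustration only.
sites = [
    ("GENE_X", 0.80, 0.50, 0.001),  # delta about +0.30, kept
    ("GENE_X", 0.90, 0.50, 0.002),  # delta about +0.40, kept
    ("GENE_Y", 0.20, 0.60, 0.005),  # delta about -0.40, kept
    ("GENE_Z", 0.55, 0.50, 0.001),  # delta too small, dropped
    ("GENE_W", 0.90, 0.40, 0.050),  # FDR too high, dropped
]

result = filter_methylation(sites)
for (gene, direction), avg in sorted(result.items()):
    print(gene, direction, round(avg, 2))
```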

Figure 1 outlines our study workflow. Papers were excluded from each specific search for the following reasons: data from cell lines or animal models; studies of drug efficacy or of the presence of long non-coding RNA; mechanistic studies that did not perform high-throughput profiling or that evaluated the role of a single molecule; papers focused on liver diseases other than HCC or on tissue other than liver; papers without original data, such as review articles; studies reusing already selected datasets; papers not reporting the modulation of the molecules; and papers without available data.

Figure 1.

Flow chart showing the paper selection process and exclusion criteria for each data type: Gene expression, proteomics, whole exome sequencing, microRNAs and methylation.

Available patient data, including etiology of liver disease (hepatitis C, hepatitis B, alcohol, fatty liver disease) on the basis of which the HCC tumors developed, presence of cirrhosis, the Model for End-stage Liver Disease score (MELD score, an assessment of the severity of liver dysfunction), tumor histology, stage of cancer, alpha-fetoprotein level, overall and recurrence-free survival following treatment were also documented (Supplementary Table 2).

Pathway enrichment analysis

The key dysregulated genes from each type of data (genomic, miRNA, methylation, transcriptomic, and proteomic) were fed into the Integrated Interactions Database[11] (IID, http://ophid.utoronto.ca/iid) to obtain a list of the protein-protein interactions. For the miRNA dataset, we determined the target genes of the differentially expressed miRNAs in tumors using the miRNA Data Integration Portal, mirDIP v4.1[10]. The individual lists derived from each type of data were then fed into the pathway Data Integration Portal, pathDIP v3.0 (http://ophid.utoronto.ca/pathDIP)[12], in order to determine the significantly dysregulated pathways in HCC. pathDIP integrates data from 20 major pathway databases and computationally predicts gene association to curated pathways using protein-protein interactions from IID and the significance of their connectivity[12]. We used this comprehensive pathway enrichment analysis portal to obtain a list of significantly enriched pathways based on literature-curated (core) pathway memberships, with a P value (FDR, Benjamini-Hochberg method) of less than 0.05.
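
For readers unfamiliar with enrichment statistics, the sketch below shows a generic over-representation test (hypergeometric upper tail) followed by Benjamini-Hochberg adjustment. It is not pathDIP's implementation, whose internals are not described here; the universe size, pathway sizes and overlap counts are invented.

```python
from math import comb


def hypergeom_sf(k, universe, pathway_size, n_query):
    """P(overlap >= k) when n_query genes are drawn from a universe that
    contains pathway_size pathway members."""
    upper = min(pathway_size, n_query)
    total = comb(universe, n_query)
    hits = sum(comb(pathway_size, i) * comb(universe - pathway_size, n_query - i)
               for i in range(k, upper + 1))
    return hits / total


def benjamini_hochberg(pvals):
    """Return BH-adjusted P values in the original order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Invented example: pathway name -> (pathway size, overlap with query list).
pathway_hits = {
    "PATHWAY_1": (50, 12),
    "PATHWAY_2": (80, 5),
    "PATHWAY_3": (30, 2),
}
universe, n_query = 1000, 40  # hypothetical gene universe and query size

p_values = [hypergeom_sf(k, universe, size, n_query)
            for size, k in pathway_hits.values()]
q_values = benjamini_hochberg(p_values)
significant = [name for name, q in zip(pathway_hits, q_values) if q < 0.05]
print(significant)  # ['PATHWAY_1']
```

The 0.05 cutoff on the adjusted value mirrors the threshold stated above; everything else in the example is illustrative.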

The lists of pathways from each type of data were then assessed for overlap using Venny 2.1, an online tool for Venn diagram design (http://bioinfogp.cnb.csic.es/tools/venny/index.html).
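
The overlap step amounts to a set intersection. The sketch below pools the miRNA and methylation lists into a single epigenetic layer (the Results describe merging these two lists; treating the merge as a union is an assumption) and intersects the four resulting layers. The per-layer memberships and the "hypothetical pathway" entries are invented; only the three shared pathway names come from the reported result.

```python
# Invented per-layer pathway lists; only the final overlap mirrors the paper.
pathways_by_layer = {
    "genomic": {"EGFR", "beta1-integrin", "axon guidance", "hypothetical pathway A"},
    "transcriptomic": {"EGFR", "beta1-integrin", "axon guidance", "hypothetical pathway B"},
    "proteomic": {"EGFR", "beta1-integrin", "axon guidance", "hypothetical pathway A"},
    "miRNA": {"EGFR", "axon guidance", "hypothetical pathway C"},
    "methylation": {"beta1-integrin", "hypothetical pathway B"},
}

# Pool the two epigenetic layers before comparing (assumed to be a union).
epigenetic = pathways_by_layer["miRNA"] | pathways_by_layer["methylation"]

layers = [
    pathways_by_layer["genomic"],
    pathways_by_layer["transcriptomic"],
    pathways_by_layer["proteomic"],
    epigenetic,
]

# Pathways present in every one of the four layers.
common = set.intersection(*layers)
print(sorted(common))  # ['EGFR', 'axon guidance', 'beta1-integrin']
```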

Retrospective validation on independent dataset

In order to determine whether key differentially expressed genes along the overlapping pathways had prognostic value, we used KMplotter, a web-based tool that enables survival analysis across multiple cancers and datasets[13]. Patient samples were split into two groups per autoselection of the best cutoff for each gene, in order to assess its prognostic value. We ran multivariate overall survival analysis based on the high vs low expression of each gene in HCC tumors. The two groups were compared by a Kaplan-Meier survival plot, and the hazard ratio with 95% confidence intervals and log-rank P value were calculated. 
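
To make the survival comparison concrete, here is a small sketch of the Kaplan-Meier estimator applied to two groups split at an expression cutoff. It does not reproduce KMplotter: the cutoff is fixed by hand rather than auto-selected, no hazard ratio or log-rank test is computed, and all patient values are invented.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    events: 1 if death was observed at that time, 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


def median_survival(curve):
    """First time at which survival drops to 0.5 or below (None if never)."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Invented patients: (expression, follow-up in months, event flag).
patients = [
    (1.0, 60, 1), (1.5, 72, 0), (2.0, 48, 1), (2.5, 80, 1), (3.0, 90, 0),
    (6.0, 10, 1), (6.5, 24, 1), (7.0, 30, 0), (8.0, 12, 1), (9.0, 36, 1),
]
cutoff = 5.0  # hypothetical cutoff; KMplotter selects one automatically

low = [p for p in patients if p[0] < cutoff]
high = [p for p in patients if p[0] >= cutoff]

median_low = median_survival(kaplan_meier([p[1] for p in low], [p[2] for p in low]))
median_high = median_survival(kaplan_meier([p[1] for p in high], [p[2] for p in high]))
print(median_low, median_high)  # 80 24
```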

Drug identification by CTD

Putative therapeutic agents able to revert the modulation of the genes of interest, that is, the modulation associated with a worse prognosis, were identified using the online Comparative Toxicogenomics Database (CTD, http://ctdbase.org)[14]. This database provides manually curated information about chemical–gene/protein interactions, chemical–disease and gene–disease relationships.

RESULTS

The PubMed searches retrieved a total of 8733 abstracts covering the different layers of data on human HCC samples, published until December 2016. The flow chart outlining the selection process is detailed in Figure 1.

The numbers of samples included in our analysis were as follows: (1) Whole exome sequencing: 267 HCC and 270 control samples; (2) Gene expression: 870 HCC and 814 control samples; (3) miRNA: 1172 HCC and 771 control samples; (4) Methylation: 354 HCC and 341 control samples; and (5) Proteomics: 421 HCC and 473 control samples. The methodologies and platforms used to obtain these high-throughput data are reported by type of data (genomic, transcriptomic, miRNA, methylation and proteomic) in Table 1. Clinical data regarding the etiology of liver disease (hepatitis C, hepatitis B, alcohol, fatty liver disease) were frequently reported; on the other hand, serum levels of the liver enzymes AST and ALT, which are frequently used to assess liver function, were not available. Pathological details regarding differentiation or stage were frequently absent, as were other crucial clinical variables such as the Child-Pugh/MELD score (Supplementary Table 2).

Integrative analysis reveals most important pathways in HCC

There were 188 overlapping dysregulated genes/proteins across the different types of data. Independently for each type of data, we obtained a list of pathways using pathDIP. We merged the list of dysregulated pathways in miRNA and methylation, given that these epigenetically regulate gene expression, in order to assess for overlapping pathways across the datasets. 

This resulted in a list of 3 common, overlapping pathways among the different types of data: EGFR, β1-integrin, and axon guidance pathways, as depicted in Figure 2. From the previous list of 188 common dysregulated elements in all different layers of data (Figure 3), we were able to identify 35/188 genes that were involved in these 3 shared pathways across the layers of data (Supplementary Table 1).

Figure 2.

Venn diagram showing the three common pathways (epidermal growth factor receptor [EGFR], β1-integrin, and axon guidance) across the four different types of data.

Figure 3.

The 188 common dysregulated elements across all layers of data. A: Number of genes/proteins identified in each data type; B: Venn diagram showing the 188 genes identified as commonly deregulated across the 4 different types of data.

Prognostic value of pathways in HCC

We then examined the prognostic value of the deregulated genes associated with the pathways of interest in HCC using the TCGA RNA-seq dataset, as listed in Table 2. Table 2 also reports the median survival of the 364 TCGA patients used to validate prognostic value. The KMplotter HR results from the TCGA RNA-seq data reflected the direction of modulation identified for these 9 genes in the 19 HCC gene expression papers (Table 2). Among the five upregulated genes, all of which had HR values above 1, CDK5, which is involved in the cell cycle, had the highest HR (1.85, P = 0.0035) (Table 3). The other four upregulated genes, COL2A1, LAMC1, RPS6KA3 and ITGB1, also had HR values above 1 in the KMplotter analysis and are involved in cellular migration (Table 2 and Table 3).

Table 2.

Prognostic value of the 9 dysregulated genes associated with the 3 common dysregulated pathways (epidermal growth factor receptor [EGFR], β1-integrin and axon guidance) among the 4 types of data, obtained with KMplotter

Gene | Modulation in the 19 HCC papers | Probe-ID | HR | CI | Log-rank P value | Median survival low (mo) | Median survival high (mo) | Estradiol gene modulation predicted by CTD
COL2A1 | Up | 1280 | 1.49 | 1.05-2.11 | 0.0229 | 61.7 | 54.1 | N/A
FGA | Down | 2243 | 0.52 | 0.35-0.77 | 0.0009 | 49.7 | 70.5 | +
FGG | Down | 2266 | 0.56 | 0.39-0.79 | 0.0009 | 38.3 | 70.5 | +
LAMC1 | Up | 3915 | 1.43 | 0.98-2.09 | 0.06 | 56.5 | 38.3 | N/A
CDK5 | Up | 1020 | 1.85 | 1.22-2.81 | 0.0035 | 81.9 | 6.2 | N/A
EPHB1 | Down | 2047 | 0.72 | 0.048-1.08 | 0.1135 | 54.1 | 70.5 | N/A
RPS6KA3 | Up | 6197 | 1.2 | 0.8-1.78 | 0.3743 | 54.1 | 56.5 | -
EGFR | Down | 1956 | 0.61 | 0.43-0.89 | 0.0085 | 31 | 70.5 | +
ITGB1 | Up | 3688 | 1.37 | 0.95-1.97 | 0.0924 | 82.9 | 49.7 | N/A

CTD-based prediction identified estradiol as affecting the expression of 4/9 genes in the direction appropriate to their hazard ratio values. HR: Hazard ratio; HCC: Hepatocellular carcinoma; CI: Confidence interval; CTD: Comparative Toxicogenomics Database; N/A: Not applicable.

Table 3.

Modulation of the 9 dysregulated genes associated with the 3 common dysregulated pathways (epidermal growth factor receptor [EGFR], β1-integrin and axon guidance) identified in the 19 hepatocellular carcinoma gene expression papers. Their genetic alteration in hepatocellular carcinoma and their mechanism in cancer are reported

Gene | Modulation in the 19 HCC papers | PMID | Mutation in HCC (PMID) | Role in cancer (PMID)
COL2A1 | Up (2/19) | 23800896/25666192 | (rs3917) polymorphism is associated with higher risk of HCC (21665180) | COL2A1 promotes migration in HCC (29858962)
FGA | Down (9/19) | 21320499/23800896/25093504/25536056/25141867/25376302/25666192/25645722/25666192 | Deleted in HCC patients (27511114) | FGA is a positive predictor of survival in gastric cancer patients (15756001)
FGG | Down (8/19) | 21320499/23800896/25093504/25536056/25141867/25376302/25645722/24498002 | Allelic loss (16980951) | FGG is involved in amino acid and redox metabolism pathway in HCC (28089356)
LAMC1 | Up (4/19) | 23800896/25536056/25141867/25645722 | Not identified | LAMC1 promotes tumor cell invasion and migration in HCC (28928891)
CDK5 | Up (2/19) | 25141867/25376302 | Not identified | CDK5 promotes proliferation in HCC (29312535)
EPHB1 | Down (2/19) | 23800896/25141867 | Missense mutation (19469653) | EPHB1 inhibits cell migration (22242939)
RPS6KA3 | Up (1/19) | 25141867 | Somatic mutation and copy number variations (22561517) | RPS6KA3 increases cell proliferation (15833840)
EGFR | Down (2/19) | 19098997/25141867 | Missense mutation (26436086) | EGFR promotes cell adhesion (31465839)
ITGB1 | Up (1/19) | 25141867 | Somatic number variations (24512821) | ITGB1 promotes migration (30664185)

HCC: Hepatocellular carcinoma.

Four out of 9 genes were reported as downmodulated in the 19 HCC gene expression papers. Among these four, two genes, FGA and FGG, showed the most statistically significant association (P = 0.0009) with a protective role in HCC (HR values 0.52 and 0.56, respectively; Table 2). FGA and FGG were consistently reported as downmodulated in about 45% of our 19 selected gene expression papers (Table 3). The other two downmodulated genes, EPHB1 and EGFR, which had HR values below 1 (Table 2), are reported to be affected by missense mutations leading to a loss of their protective role against cell migration.

Estradiol is a therapeutic agent that appropriately targets HCC genes

Using CTD, we found that estradiol was able to appropriately down- or upmodulate 4 out of 9 cancer-related genes (Table 2). In particular, CTD reported that estradiol upregulates FGA, FGG and EGFR, which were downmodulated in HCC (Table 2), and counteracts the upregulation of RPS6KA3 in HCC, suggesting a possible role for this hormone in HCC treatment.
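
The reversal logic can be checked directly against the values in Table 2. In the sketch below, the gene directions and estradiol annotations are copied from that table; reading "+" as increased and "-" as decreased expression is an assumption consistent with the description above, and the function name is hypothetical.

```python
# Direction of modulation in HCC, as listed in Table 2.
hcc_modulation = {
    "COL2A1": "Up", "FGA": "Down", "FGG": "Down", "LAMC1": "Up", "CDK5": "Up",
    "EPHB1": "Down", "RPS6KA3": "Up", "EGFR": "Down", "ITGB1": "Up",
}

# Estradiol effect predicted by CTD (Table 2); genes marked N/A are omitted.
# Assumed reading: "+" increases expression, "-" decreases it.
estradiol_effect = {"FGA": "+", "FGG": "+", "RPS6KA3": "-", "EGFR": "+"}


def reverts(modulation, effect):
    """True when the drug effect opposes the direction seen in HCC."""
    return (modulation == "Down" and effect == "+") or \
           (modulation == "Up" and effect == "-")


reverted = [gene for gene, modulation in hcc_modulation.items()
            if reverts(modulation, estradiol_effect.get(gene))]
print(reverted, f"{len(reverted)}/{len(hcc_modulation)}")
# ['FGA', 'FGG', 'RPS6KA3', 'EGFR'] 4/9
```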

DISCUSSION

In this study, we evaluate the molecular pathogenesis of HCC using a unique approach, that of combining all publicly available high-throughput data from patient HCC tumors. This encompasses all miRNA, methylation, genomic, transcriptomic and proteomic profiling data present in the literature, and represents the first effort to derive a consensus molecular model of HCC through analysis of these different types of data. Although these datasets originated from different patient cohorts, the integrative analysis presented here offers the opportunity to explore common key pathway dependencies of HCC. Starting with the initial generation of genomics and whole exome sequencing data, previous high-throughput studies have brought forth different lists of dysregulated genes, depending on the type of data evaluated. Dysregulated genes may affect different parts of a pathway. Therefore, a pathway-based approach when evaluating different types of high-throughput data offers the ability to assess the pathways most commonly affected in a given cancer. Additionally, the integrative analysis in our study encompasses a large number of patient samples.

Using this integrative approach, we confirm the importance of EGFR, β1-integrin and axon guidance as pathways critical in hepatocarcinogenesis. EGFR activates the signaling cascades of the Ras/Raf/MAPK and mTOR pathways, two pathways that were identified as key to HCC pathogenesis in the TCGA study[6]. The identification of β1-integrin as being commonly dysregulated in HCC is novel, and its significance is confirmed through its consistent dysregulation across types of data. β1-integrin is a cell surface receptor that senses the extracellular matrix, thereby modulating hallmarks of cancer such as proliferative signaling with continuously activated cell replication, evasion of growth suppressors, induction of angiogenesis, and cancer cell invasion and metastasis[14]. Ras/Raf/MAPK and mTOR are established pathways in hepatocarcinogenesis, and are integrin-dependent signaling pathways[15]. Additionally, β1-integrin is known to crosstalk with EGFR. In fact, the downregulation of β1-integrin was found to decrease phosphorylation of EGFR and c-Met in hepatocytes during liver regeneration[16]. A synergistic relationship between integrins and EGFR has also been demonstrated in tumor progression[17]. The finding of axon guidance pathway-related proteins as being dysregulated across types of data, thereby establishing consistent dysregulation of this pathway in HCC, is also novel. Netrin-1 is the best studied protein in the axon guidance pathway, and is known to be overexpressed in various cancers[13]. It is responsible for regulation of apoptosis, with increased presence of netrin-1 leading to inhibition of apoptosis. The tumor suppressor p53, frequently mutated in the TCGA HCC study, regulates the cell cycle through netrin-1. The axon guidance pathway has previously been identified as a pathway that is significantly mutated in HCC based on integration of all genomic data in HCC[18]. That analysis revealed mutations along the axon guidance pathway as being prognostic of a higher rate of HCC metastasis. We were able to additionally validate the prognostic importance of dysregulated proteins in these pathways using TCGA data.

HCC is a cancer that develops in the context of various chronic liver diseases, which may influence the molecular characteristics of HCC. Additionally, the underlying cirrhosis and liver dysfunction that are often concurrent may influence HCC development and behavior[2]. Patients are often diagnosed at an advanced stage of disease, when it is too late for curative treatment. A unique consideration in HCC is the inability to tolerate hepatotoxic chemotherapy in patients with liver dysfunction, as it is often patients with cirrhosis who develop HCC[19,20]. Therefore, liver function must be considered prior to, during, and after any form of treatment for HCC.

Thus, especially for HCC, it has been suggested that a multi-pronged approach to HCC therapy jointly targeting different pathways be adopted.

Omics technologies are essential in the progress towards elucidating the molecular basis of HCC. The current study represents the largest integration of all publicly available genomic, gene expression, methylation, miRNA and proteomic data in HCC, covering 85 studies and 3355 patient sample profiles. We identified consistently deregulated pathways associated with hepatocarcinogenesis across different types of data using integrative analysis tools, thereby confirming the importance of these genes in HCC pathogenesis. EGFR (activator of Ras/Raf/MAPK and mTOR) and β1-integrin (also modulator of the aforementioned pathways) were clearly identified as pivotal to HCC[5,21-23]. This is in keeping with the efficacy of the Ras/Raf/MAPK inhibitors sorafenib and regorafenib in HCC[24].

Even beyond this, we found these consistently deregulated genes across pathways to be appropriately modulated by estradiol. HCC is less common in women, and there have been clinical studies demonstrating that hormone therapy and female sex are protective against HCC.

Other integrative multi-omics studies have recently been performed for other tumors with high mortality, such as breast and ovarian cancer[6,25]. Several breast cancer studies have emphasized how integration of genomic, transcriptomic and proteomic data has improved the molecular characterization of breast cancer subtypes and helped elucidate the heterogeneity of the disease, its interaction with the microenvironment and its aggressiveness[26,27]. A single source of data was used in the ovarian cancer multi-omics mathematical integration performed by Bhardwaj et al[25]. Copy number variation, gene expression and methylation data from the TCGA data portal were integrated using a mathematical algorithm, which identified 32 co-expressed genes and 6 pathways associated with survival.

The main limitation of our study is that the various types of data represent different patient samples. Nonetheless, there is a large amount of high-throughput data, which allowed us to detect pathway dependency patterns that are compatible with the current HCC literature. Additionally, HCC tumors arise in the setting of various chronic liver diseases. We could not assess for etiology-specific genes and pathways in this study, given that the clinical and genetic data to evaluate these differences were not fully available for all the studies. Therefore, we could only evaluate gene differences over whole datasets, rather than individual patients, owing to incomplete individual annotation of the samples available on GEO for each specific dataset. The HCC samples in this integrative analysis all came from patients who had undergone hepatectomy. There were no specimens from patients who were candidates for ablation therapy (early stage), those who were undergoing liver transplantation, or those with advanced HCC. One might anticipate that the molecular features of such tumors differ, given the different stages of HCC captured, but there is unfortunately a scarcity of data in this regard.

CONCLUSION

In conclusion, our study represents the largest integrative analysis of all publicly available data in HCC, spanning different types of high-throughput data. Pathway enrichment analysis elucidated EGFR, β1-integrin and axon guidance as pathway dependencies in HCC. These are proteins known to serve as master regulators of key pathways in HCC such as Ras/Raf/MAPK, Wnt/β-catenin and mTOR[28], and may serve as potential overarching therapeutic targets in HCC. The axon guidance pathway was identified as being of potential importance to HCC for the first time, with prognostic value suggested in patient sample validation with TCGA. Estradiol affects a large number of deregulated genes across data with appropriate modulation and may be a therapeutic agent that helps in HCC. A combined therapeutic approach conjointly targeting different pathways may be more optimal in the treatment of HCC, especially when underlying hepatic dysfunction compromises the ability to tolerate optimal chemotherapeutic doses.

ARTICLE HIGHLIGHTS

Research background

Hepatocellular carcinoma (HCC) is highly heterogeneous and difficult to characterize, and its molecular basis has remained elusive.

Research motivation

The Cancer Genome Atlas is a large-scale project that has enabled improved characterization of cancers with several layers of data. Elucidating the layers of data in a disease can provide additional insights into the pathways that drive cancer.

Research objectives

A novel integrative approach of all publicly available high-throughput data from patient HCC tumors was used to delineate critical pathway dependencies in HCC.

Research methods

A comprehensive analysis and characterization of all publicly available genomic, gene expression, methylation, miRNA and proteomic data in HCC covered 85 studies and 3355 patient sample profiles and identified the key overlapping dysregulated genes and pathways affected.
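The overlap step described above can be illustrated with a minimal Python sketch. This is not the authors' code (which is available from the corresponding author) and does not use the pathDIP interface; it assumes only that each data layer has already yielded a set of significantly enriched pathways. The function name `shared_pathways`, the `min_layers` parameter and the pathway labels are hypothetical placeholders, not results from the study.

```python
from collections import Counter


def shared_pathways(enriched_by_layer, min_layers=None):
    """Return pathways enriched in at least `min_layers` data layers.

    enriched_by_layer: dict mapping a layer name to an iterable of
    pathway names found enriched for that layer.
    min_layers: minimum number of layers required; defaults to all layers.
    """
    if min_layers is None:
        min_layers = len(enriched_by_layer)
    # Count each pathway at most once per layer.
    counts = Counter(
        pathway
        for pathways in enriched_by_layer.values()
        for pathway in set(pathways)
    )
    return sorted(p for p, n in counts.items() if n >= min_layers)


# Placeholder input: invented pathway labels, for illustration only.
toy_layers = {
    "genomic": {"Pathway A", "Pathway B", "Pathway C"},
    "gene_expression": {"Pathway A", "Pathway B", "Pathway D"},
    "methylation": {"Pathway A", "Pathway B"},
    "miRNA": {"Pathway A", "Pathway B", "Pathway C"},
    "proteomic": {"Pathway A", "Pathway B", "Pathway E"},
}

common_to_all = shared_pathways(toy_layers)
common_to_three = shared_pathways(toy_layers, min_layers=3)
```

Requiring a pathway in every layer is the strictest reading of "overlap"; lowering `min_layers` relaxes it when some layers contain far fewer studies than others.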

Research results

We identified the prognostic value of these genes in HCC, with Netrin and Slit3 in particular being novel proteins of prognostic importance to HCC.

Research conclusions

Our large integrative analysis of all publicly available data in HCC and our pathway enrichment analysis have elucidated epidermal growth factor receptor (EGFR), β1-integrin, and axon guidance as pathway dependencies in HCC.

Research perspectives

Based on our integrative analysis, epidermal growth factor receptor (EGFR) and β1-integrin are master regulators that could be considered as potential therapeutic targets in HCC.

ACKNOWLEDGEMENTS

The authors thank undergraduate students Sujitha Srinathan, Emily Chen, Bishoy Lawendy, Nangi Suo and Amira Abdallah for their help in data curation.

Footnotes

Institutional review board statement: All data were from publicly available sources; no animal or human studies were performed by the authors. No approval was needed.

Conflict-of-interest statement: The authors do not have any conflict of interest to declare.

Manuscript source: Unsolicited manuscript

Peer-review started: August 6, 2020

First decision: September 21, 2020

Article in press: December 4, 2020

Specialty type: Gastroenterology and hepatology

Country/Territory of origin: Canada

Peer-review report’s scientific quality classification

Grade A (Excellent): 0

Grade B (Very good): B

Grade C (Good): 0

Grade D (Fair): 0

Grade E (Poor): 0

P-Reviewer: Troncoso MF; S-Editor: Zhang L; L-Editor: A; P-Editor: Wang LL

Contributor Information

Mamatha Bhat, Multi Organ transplant Program, University Health Network, Toronto M5G2N2, Canada. mamatha.bhat@uhn.ca.

Elisa Pasini, Multi Organ transplant Program, University Health Network, Toronto M5G2N2, Canada.

Chiara Pastrello, Osteoarthritis Research Program, Division of Orthopedic Surgery, Schroeder Arthritis Institute, University Health Network and Krembil Research Institute, University Health Network, Toronto M5T 0S8, Canada.

Sara Rahmati, Osteoarthritis Research Program, Division of Orthopedic Surgery, Schroeder Arthritis Institute, University Health Network and Krembil Research Institute, University Health Network, Toronto M5T 0S8, Canada.

Marc Angeli, Multi Organ transplant Program, University Health Network, Toronto M5G2N2, Canada.

Max Kotlyar, Osteoarthritis Research Program, Division of Orthopedic Surgery, Schroeder Arthritis Institute, University Health Network and Krembil Research Institute, University Health Network, Toronto M5T 0S8, Canada.

Anand Ghanekar, Surgery, University Health Network, Toronto M5G 2C4, Canada.

Igor Jurisica, Osteoarthritis Research Program, Division of Orthopedic Surgery, Schroeder Arthritis Institute, University Health Network and Krembil Research Institute, University Health Network, Toronto M5T 0S8, Canada; Departments of Medical Biophysics and Computer Science, University of Toronto, Toronto M5T 0S8, Canada.

Data sharing statement

Technical appendix and statistical code are available from the corresponding author at mamatha.bhat@uhn.ca. All datasets are publicly available.

References

  • 1. Whittaker S, Marais R, Zhu AX. The role of signaling pathways in the development and treatment of hepatocellular carcinoma. Oncogene. 2010;29:4989–5005. doi: 10.1038/onc.2010.236.
  • 2. El-Serag HB, Rudolph KL. Hepatocellular carcinoma: epidemiology and molecular carcinogenesis. Gastroenterology. 2007;132:2557–2576. doi: 10.1053/j.gastro.2007.04.061.
  • 3. Heimbach JK, Kulik LM, Finn RS, Sirlin CB, Abecassis MM, Roberts LR, Zhu AX, Murad MH, Marrero JA. AASLD guidelines for the treatment of hepatocellular carcinoma. Hepatology. 2018;67:358–380. doi: 10.1002/hep.29086.
  • 4. Bruix J, Sherman M; American Association for the Study of Liver Diseases. Management of hepatocellular carcinoma: an update. Hepatology. 2011;53:1020–1022. doi: 10.1002/hep.24199.
  • 5. Llovet JM, Ricci S, Mazzaferro V, Hilgard P, Gane E, Blanc JF, de Oliveira AC, Santoro A, Raoul JL, Forner A, Schwartz M, Porta C, Zeuzem S, Bolondi L, Greten TF, Galle PR, Seitz JF, Borbath I, Häussinger D, Giannaris T, Shan M, Moscovici M, Voliotis D, Bruix J; SHARP Investigators Study Group. Sorafenib in advanced hepatocellular carcinoma. N Engl J Med. 2008;359:378–390. doi: 10.1056/NEJMoa0708857.
  • 6. Cancer Genome Atlas Research Network. Comprehensive and Integrative Genomic Characterization of Hepatocellular Carcinoma. Cell. 2017;169:1327–1341.e23. doi: 10.1016/j.cell.2017.05.046.
  • 7. Wilk G, Braun R. Integrative analysis reveals disrupted pathways regulated by microRNAs in cancer. Nucleic Acids Res. 2018;46:1089–1101. doi: 10.1093/nar/gkx1250.
  • 8. Srivastava A, Kumar S, Ramaswamy R. Two-layer modular analysis of gene and protein networks in breast cancer. BMC Syst Biol. 2014;8:81. doi: 10.1186/1752-0509-8-81.
  • 9. Zhang H, Liu T, Zhang Z, Payne SH, Zhang B, McDermott JE, Zhou JY, Petyuk VA, Chen L, Ray D, Sun S, Yang F, Chen L, Wang J, Shah P, Cha SW, Aiyetan P, Woo S, Tian Y, Gritsenko MA, Clauss TR, Choi C, Monroe ME, Thomas S, Nie S, Wu C, Moore RJ, Yu KH, Tabb DL, Fenyö D, Bafna V, Wang Y, Rodriguez H, Boja ES, Hiltke T, Rivers RC, Sokoll L, Zhu H, Shih IM, Cope L, Pandey A, Zhang B, Snyder MP, Levine DA, Smith RD, Chan DW, Rodland KD; CPTAC Investigators. Integrated Proteogenomic Characterization of Human High-Grade Serous Ovarian Cancer. Cell. 2016;166:755–765. doi: 10.1016/j.cell.2016.05.069.
  • 10. Tokar T, Pastrello C, Rossos AEM, Abovsky M, Hauschild AC, Tsay M, Lu R, Jurisica I. mirDIP 4.1-integrative database of human microRNA target predictions. Nucleic Acids Res. 2018;46:D360–D370. doi: 10.1093/nar/gkx1144.
  • 11. Kotlyar M, Pastrello C, Sheahan N, Jurisica I. Integrated interactions database: tissue-specific view of the human and model organism interactomes. Nucleic Acids Res. 2016;44:D536–D541. doi: 10.1093/nar/gkv1115.
  • 12. Rahmati S, Abovsky M, Pastrello C, Jurisica I. pathDIP: an annotated resource for known and predicted human gene-pathway associations and pathway enrichment analysis. Nucleic Acids Res. 2017;45:D419–D426. doi: 10.1093/nar/gkw1082.
  • 13. Arakawa H. Netrin-1 and its receptors in tumorigenesis. Nat Rev Cancer. 2004;4:978–987. doi: 10.1038/nrc1504.
  • 14. Davis AP, Grondin CJ, Johnson RJ, Sciaky D, King BL, McMorran R, Wiegers J, Wiegers TC, Mattingly CJ. The Comparative Toxicogenomics Database: update 2017. Nucleic Acids Res. 2017;45:D972–D978. doi: 10.1093/nar/gkw838.
  • 15. Griffiths GS, Grundl M, Leychenko A, Reiter S, Young-Robbins SS, Sulzmaier FJ, Caliva MJ, Ramos JW, Matter ML. Bit-1 mediates integrin-dependent cell survival through activation of the NFkappaB pathway. J Biol Chem. 2011;286:14713–14723. doi: 10.1074/jbc.M111.228387.
  • 16. Speicher T, Siegenthaler B, Bogorad RL, Ruppert R, Petzold T, Padrissa-Altes S, Bachofner M, Anderson DG, Koteliansky V, Fässler R, Werner S. Knockdown and knockout of β1-integrin in hepatocytes impairs liver regeneration through inhibition of growth factor signalling. Nat Commun. 2014;5:3862. doi: 10.1038/ncomms4862.
  • 17. Ivaska J, Heino J. Cooperation between integrins and growth factor receptors in signaling and endocytosis. Annu Rev Cell Dev Biol. 2011;27:291–320. doi: 10.1146/annurev-cellbio-092910-154017.
  • 18. Zhang Y, Qiu Z, Wei L, Tang R, Lian B, Zhao Y, He X, Xie L. Integrated analysis of mutation data from various sources identifies key genes and signaling pathways in hepatocellular carcinoma. PLoS One. 2014;9:e100854. doi: 10.1371/journal.pone.0100854.
  • 19. Mittal S, El-Serag HB. Epidemiology of hepatocellular carcinoma: consider the population. J Clin Gastroenterol. 2013;47 Suppl:S2–S6. doi: 10.1097/MCG.0b013e3182872f29.
  • 20. Fitzmorris P, Shoreibah M, Anand BS, Singal AK. Management of hepatocellular carcinoma. J Cancer Res Clin Oncol. 2015;141:861–876. doi: 10.1007/s00432-014-1806-0.
  • 21. Zhu AX, Abrams TA, Miksad R, Blaszkowsky LS, Meyerhardt JA, Zheng H, Muzikansky A, Clark JW, Kwak EL, Schrag D, Jors KR, Fuchs CS, Iafrate AJ, Borger DR, Ryan DP. Phase 1/2 study of everolimus in advanced hepatocellular carcinoma. Cancer. 2011;117:5094–5102. doi: 10.1002/cncr.26165.
  • 22. Zhou Q, Lui VW, Yeo W. Targeting the PI3K/Akt/mTOR pathway in hepatocellular carcinoma. Future Oncol. 2011;7:1149–1167. doi: 10.2217/fon.11.95.
  • 23. Llovet JM, Villanueva A, Lachenmayer A, Finn RS. Advances in targeted therapies for hepatocellular carcinoma in the genomic era. Nat Rev Clin Oncol. 2015;12:436. doi: 10.1038/nrclinonc.2015.121.
  • 24. Bruix J, Qin S, Merle P, Granito A, Huang YH, Bodoky G, Pracht M, Yokosuka O, Rosmorduc O, Breder V, Gerolami R, Masi G, Ross PJ, Song T, Bronowicki JP, Ollivier-Hourmand I, Kudo M, Cheng AL, Llovet JM, Finn RS, LeBerre MA, Baumhauer A, Meinhardt G, Han G; RESORCE Investigators. Regorafenib for patients with hepatocellular carcinoma who progressed on sorafenib treatment (RESORCE): a randomised, double-blind, placebo-controlled, phase 3 trial. Lancet. 2017;389:56–66. doi: 10.1016/S0140-6736(16)32453-9.
  • 25. Bhardwaj A, Van Steen K. Multi-omics Data and Analytics Integration in Ovarian Cancer. In: Maglogiannis I, Iliadis L, Pimenidis E, editors. Artificial Intelligence Applications and Innovations. 2020:347–357.
  • 26. Wagner J, Rapsomaniki MA, Chevrier S, Anzeneder T, Langwieder C, Dykgers A, Rees M, Ramaswamy A, Muenst S, Soysal SD, Jacobs A, Windhager J, Silina K, van den Broek M, Dedes KJ, Rodríguez Martínez M, Weber WP, Bodenmiller B. A Single-Cell Atlas of the Tumor and Immune Ecosystem of Human Breast Cancer. Cell. 2019;177:1330–1345.e18. doi: 10.1016/j.cell.2019.03.005.
  • 27. Bhatia S, Monkman J, Blick T, Duijf PH, Nagaraj SH, Thompson EW. Multi-Omics Characterization of the Spontaneous Mesenchymal-Epithelial Transition in the PMC42 Breast Cancer Cell Lines. J Clin Med. 2019;8. doi: 10.3390/jcm8081253.
  • 28. Bhat M, Sonenberg N, Gores GJ. The mTOR pathway in hepatic malignancies. Hepatology. 2013;58:810–818. doi: 10.1002/hep.26323.


Articles from World Journal of Hepatology are provided here courtesy of Baishideng Publishing Group Inc