Journal of Oncology. 2022 Sep 9;2022:2965166. doi: 10.1155/2022/2965166

A Deep Neural Network for Gastric Cancer Prognosis Prediction Based on Biological Information Pathways

Jili Hu 1,2,#, Weiqiang Yu 3,4,5,#, Yuting Dai 3,4, Can Liu 1, Yongkang Wang 1,2, Qingfa Wu 3,4,5
PMCID: PMC9481367  PMID: 36117847

Abstract

Background

Gastric cancer (GC) is one of the deadliest cancers in the world, with a 5-year overall survival rate of lower than 20% for patients with advanced GC. Genomic information is now frequently employed for precision cancer treatment due to the rapid advancements of high-throughput sequencing technologies. As a result, integrating multiomics data to construct predictive models for the GC patient prognosis is critical for tailored medical care.

Results

In this study, we integrated multiomics data to design a biological pathway-based gastric cancer sparse deep neural network (GCS-Net) by modifying the P-NET model for long-term survival prediction of GC. The GCS-Net showed higher accuracy (accuracy = 0.844), area under the curve (AUC = 0.807), and F1 score (F1 = 0.913) than traditional machine learning models. Furthermore, the GCS-Net not only enables accurate patient survival prognosis but also provides model interpretability capabilities lacking in most traditional deep neural networks to describe the complex biological process of prognosis. The GCS-Net suggested the importance of genes (UBE2C, JAK2, RAD21, CEP250, NUP210, PTPN1, CDC27, NINL, NUP188, and PLK4) and biological pathways (Mitotic Anaphase, Resolution of Sister Chromatid Cohesion, and SUMO E3 ligases) to GC, which is consistent with the results revealed in biological- and medical-related studies of GC.

Conclusion

The GCS-Net is an interpretable deep neural network built using biological pathway information whose structure represents a nonlinear hierarchical representation of genes and biological pathways. It can not only accurately predict the prognosis of GC patients but also suggest the importance of genes and biological pathways. The GCS-Net opens up new avenues for biological research and could be adapted for other cancer prediction and discovery activities as well.

1. Introduction

Gastric cancer (GC) is one of the deadliest tumors in the world and gastric adenocarcinoma (GAC) is the most common type of gastric cancer [1], with 95% of gastric malignancies being GAC [2]. Although early gastric cancer can be cured by surgical resection, the 5-year overall survival (OS) rate of advanced gastric cancer is less than 20% due to its easy recurrence and metastasis [4]. Therefore, it is imperative to improve the prognosis of gastric cancer patients, in order to guide personalized medical services and carry out tailored treatment plans.

Many types of genomic data have been acquired as a result of the advancements of next-generation high-throughput sequencing technology, including DNA methylation [5], mRNA [6], miRNA [6], and copy number variation (CNV) [7]. Because these datasets provide distinct viewpoints on cancer samples, combining multiomics datasets for cancer type prediction is advantageous. The Cancer Genome Atlas (TCGA) organization has released multiomics sequencing data for 33 cancer types [8], which is useful for comprehensive cancer analysis using multiomics data.

Deep learning (DL) algorithms have recently demonstrated remarkable performance in handling nonlinear multiomics data, and numerous DL-based cancer multiomics analysis methods have been developed. Based on the combination of clinical and multiomics data, Tong et al. proposed an integrative predictive model for colon cancer [9]. Using an autoencoder architecture, Chaudhary et al. integrated multiomics data to predict hepatocellular carcinoma (HCC) survival [10]. Hu et al. developed a random forest deep feature selection (RDFS) approach to increase gastric cancer prediction accuracy by combining gene expression and copy number variation data [11]. Based on multiomics ensemble data, Xu et al. employed a bidirectional deep neural network (BiDNN) model to predict the prognosis of gastric cancer [12]. Tufail et al. summarized DL models for cancer diagnosis and prognosis prediction tasks [13].

Although these models have revolutionized the diagnosis and prediction of cancers, they tend to be black boxes that are poorly interpretable. Conversely, machine learning models based on interpretable biomedical information may contribute to cancer genomic discovery and clinical prediction [14–16]. Hao et al. designed a pathway-associated sparse deep neural network (PASNet) to predict long-term survival in glioblastoma multiforme (GBM) accurately by incorporating biological pathways [17], but the hidden layers of the PASNet model are not entirely based on biological pathway information. Elmarakeby et al. developed P-NET, a biologically informed deep learning model, to classify primary and castration-resistant prostate cancer (CRPC) [18], but the authors did not state why only 5 layers were chosen in the biological information pathway. These studies bring interpretability research to deep learning for cancer clinical prediction.

Using multiomics data to analyze the complex biological mechanisms of cancer patient survival is crucial; however, high-dimensional, nonlinear data pose computational challenges for survival analysis. In this study, we integrated multiomics data and designed a gastric cancer sparse deep neural network (GCS-Net) by modifying the P-NET model for gastric cancer prognosis, which can not only perform patient survival prognosis but also describe the complex biological process of prognosis. The GCS-Net is biologically interpretable with nodes in the neural network corresponding to biological genes and pathways, which can capture the nonlinear and hierarchical effects of biological genes and pathways on gastric cancer patient survival. Applying the GCS-Net to long-term survival prediction of GC, GCS-Net's accuracy, area under the curve (AUC), and F1 score are all higher than those of traditional machine learning models. Furthermore, genes and biological pathways discovered to be significant in the GCS-Net were validated as important genes and pathways for GC in previous biological and medical studies.

The remainder of the paper is organized as follows: Section 2 explains the datasets and data preprocessing procedure used in our study, the structure and operating principle of the GCS-Net, and the traditional machine learning models we compare the GCS-Net against in GC prognosis. Section 3 compares the results of the GCS-Net with those of traditional machine learning models in GC prognosis and inspects the GCS-Net to uncover significant genes and biological pathways. Section 4 presents a discussion of the results in Section 3. Finally, Section 5 provides the concluding remarks.

2. Materials and Methods

2.1. Datasets

We used the R tool “TCGA-assembler 2” [19] to download the GC dataset from TCGA (https://tcga-data.nci.nih.gov/tcga/). The dataset contains two types of omics data, copy number variation (CNV) and somatic mutation, together with clinical data. Integrating copy number alteration and somatic mutation data helps to reveal and predict survival time due to genomic variation in gastric cancer. The dataset has 295 samples, of which 295 have mutation data and 293 have CNV data.

The GCS-Net network architecture is constructed based on the biological pathway database Reactome [20]. We downloaded the Reactome pathway database from https://reactome.org/download-data, which contains three files: the gene matrix file ReactomePathways.gmt, the pathway name file ReactomePathways.txt, and the pathway parent-child relationship file ReactomePathwaysRelation.txt. From the parent-child relationship file, we created a hierarchical network with four levels of pathways, one layer of genes, and one layer of features.

2.2. Data Preprocessing

Long-term survival (LTS) samples were defined as patients who lived for more than 60 months (independent of survival status), while non-LTS (short-term survival) samples were patients who died in less than 60 months. We obtained 183 non-LTS samples and 42 LTS samples, so LTS patients made up approximately 20% of the labeled samples.
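The labeling rule above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name, the record layout, and the decision to leave samples unlabeled when they are alive with less than 60 months of follow-up are hypothetical and not taken from the study's code.

```python
from typing import Optional

LTS_THRESHOLD_MONTHS = 60  # five-year cutoff stated in the text


def survival_label(months: float, deceased: bool) -> Optional[str]:
    """Return 'LTS', 'non-LTS', or None when the rule gives no label."""
    if months > LTS_THRESHOLD_MONTHS:
        return "LTS"        # survived past the cutoff, regardless of status
    if deceased and months < LTS_THRESHOLD_MONTHS:
        return "non-LTS"    # died before the cutoff
    return None             # assumption: such samples are left unlabeled


# Hypothetical records: (follow-up in months, deceased flag)
records = [(72.0, False), (65.5, True), (14.2, True), (30.0, False)]
labels = [survival_label(m, d) for m, d in records]

# Class balance reported in the text: 42 LTS out of 183 + 42 labeled samples
lts_fraction = 42 / (183 + 42)
```

With the reported counts, `lts_fraction` is about 0.187, which matches the "approximately 20%" figure.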

The CNV data were standardized to −2, −1, 0, 1, 2. CNV deletion was defined as −2 and CNV amplification as 2. Somatic mutation data were normalized to 1 and 0, with 1 denoting a gene with at least one site mutation and 0 denoting a gene with no mutation.
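As a rough illustration of this encoding, the sketch below turns a per-gene mutation count and a discrete CNV level into the values described above. The function name and the dictionary layout are assumptions made for the example; the study's actual preprocessing code is not shown in the text.

```python
VALID_CNV_LEVELS = (-2, -1, 0, 1, 2)


def encode_gene(mutation_count: int, cnv_level: int) -> dict:
    """Encode one gene of one sample (hypothetical helper)."""
    if cnv_level not in VALID_CNV_LEVELS:
        raise ValueError(f"unexpected CNV level: {cnv_level}")
    return {
        # 1 if the gene carries at least one mutated site, else 0
        "mutation": 1 if mutation_count > 0 else 0,
        # discrete copy number level in {-2, -1, 0, 1, 2}
        "cnv": cnv_level,
        # the text defines -2 as deletion and 2 as amplification
        "deletion": 1 if cnv_level == -2 else 0,
        "amplification": 1 if cnv_level == 2 else 0,
    }


example = encode_gene(mutation_count=3, cnv_level=-2)
```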

2.3. Construction of the Pathway Layers in the GCS-Net

We read the Reactome pathway file ReactomePathwaysRelation.txt, which contains the parent-child relationships in the pathway, and chose the human relationships by the keyword “HSA.” Then, we used the Python package NetworkX [21] to build a directed acyclic graph based on the chosen human relationships (Figure 1(b)). The distribution of the number of nodes in each layer is shown in Figure 1(a), in which the fourth layer has the largest number of nodes and the fifth layer ranks second.

Figure 1.

(a) The number of nodes in each layer of the network constructed from the Reactome pathways. The first layer has 26 nodes and the last layer has 2 nodes. (b) The parent-child relationship network constructed from the Reactome pathways has a total of 9 layers; each node represents a pathway and is associated with the genes of that pathway.

To capture the relationship between gastric cancer information pathways and reduce network operations, we selected the first four layers to construct the pathway layers in the GCS-Net. In the directed acyclic graph, the directed edges point from parent pathways to child pathways they depend on, while in the GCS-Net, this is reversed, with the outputs of child pathway nodes serving as inputs of parent pathway nodes. Thus, the fourth layer of the directed acyclic graph is the first pathway layer in the GCS-Net, while the first layer of the directed acyclic graph is the last pathway layer in the GCS-Net.
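The construction described in this section can be sketched as follows. The study uses NetworkX; to keep the example dependency-free, plain dictionaries and a breadth-first search are swapped in here. The pathway identifiers are made up, the two-column tab-separated layout of the relationship file is an assumption, and assigning each pathway the depth at which it is first reached from a root is one possible layering rule, not necessarily the one the authors used.

```python
from collections import defaultdict, deque

# Hypothetical excerpt in an assumed "parent<TAB>child" layout.
RELATIONS = """\
R-HSA-1\tR-HSA-11
R-HSA-1\tR-HSA-12
R-HSA-11\tR-HSA-111
R-MMU-1\tR-MMU-11
"""


def load_human_edges(text):
    """Keep only parent-child pairs whose identifiers contain 'HSA'."""
    edges = []
    for line in text.strip().splitlines():
        parent, child = line.split("\t")
        if "HSA" in parent and "HSA" in child:
            edges.append((parent, child))
    return edges


def assign_layers(edges):
    """Give each pathway the depth at which BFS from the roots first reaches it."""
    children = defaultdict(list)
    nodes, has_parent = set(), set()
    for parent, child in edges:
        children[parent].append(child)
        has_parent.add(child)
        nodes.update((parent, child))
    roots = sorted(nodes - has_parent)
    depth = {root: 1 for root in roots}
    queue = deque(roots)
    while queue:
        node = queue.popleft()
        for child in children[node]:
            if child not in depth:
                depth[child] = depth[node] + 1
                queue.append(child)
    return depth


edges = load_human_edges(RELATIONS)
layer_of = assign_layers(edges)

# Keep the first four layers, then reverse each edge so that child
# pathways feed their parents, as in the GCS-Net.
kept = {node for node, d in layer_of.items() if d <= 4}
network_edges = [(child, parent) for parent, child in edges
                 if parent in kept and child in kept]
```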

2.4. The Architecture of the GCS-Net

As shown in Figure 2, the GCS-Net model is made up of one layer of feature data serving as the input layer, one layer of genes, and four layers of pathways. In this study, we used mutations and copy number variations as feature data, and we used the GCS-Net model with these multiomics data as the input to predict patient survival.

Figure 2.

The architecture of the GCS-Net proposed to integrate multiomics data for the GC prognosis prediction. The structure of the GCS-Net consists of a feature layer (multiomics data), a layer of genes, and four layers of biological pathways based on Reactome, and the layers are directly sparsely connected.

2.5. Operating Principle of the Gastric Cancer Sparse Deep Neural Network (GCS-Net)

Based on the Reactome-based network relationship built by NetworkX, we use TensorFlow's high-level API Keras to build multiple linear layers, with each layer followed by dropout and then an activation function.

The input layer represents feature data that need to be fed into the network for training, which is mutations and copy number variation data (encompassing copy number amplifications and copy number deletions) in this study. Each input node represents a feature and they are combined to form an m-column vector, denoted by x={x1, x2,…, xm}.

The gene layer consists of genes involved in the pathways of the first pathway layer. The connection between the input layer and the gene layer is established based on the fourth layer of the pathway database. Each node in the fourth layer of the pathway database is made up of a set of genes; thus, the connection between the input layer and the gene layer is sparse rather than full. We construct a binary adjacency matrix $A \in \{0,1\}^{n \times m}$, where $n$ is the number of pathways in the first pathway layer and $m$ is the number of genes in the gene layer, to encode the connections between the gene layer and the first pathway layer. We set the element $a_{ij}$ of $A$ to one if gene $j$ belongs to pathway $i$, and zero otherwise. This is a sparse coding model established based on the relationship between genes and pathways.
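A minimal sketch of how such a binary adjacency matrix could be assembled from pathway gene sets is given below. The gene and pathway names are placeholders, and the helper `build_adjacency` is hypothetical; it only illustrates the rule that an entry is one when the gene belongs to the pathway and zero otherwise.

```python
def build_adjacency(pathway_genes, pathways, genes):
    """Return an n x m list-of-lists with a 1 where gene j is in pathway i."""
    gene_index = {gene: j for j, gene in enumerate(genes)}
    matrix = [[0] * len(genes) for _ in pathways]
    for i, pathway in enumerate(pathways):
        for gene in pathway_genes.get(pathway, ()):
            j = gene_index.get(gene)
            if j is not None:      # ignore genes absent from the gene layer
                matrix[i][j] = 1
    return matrix


# Placeholder names, not real Reactome content
genes = ["gene_1", "gene_2", "gene_3"]
pathways = ["pathway_A", "pathway_B"]
pathway_genes = {
    "pathway_A": {"gene_1", "gene_2"},
    "pathway_B": {"gene_2", "gene_3", "gene_not_in_layer"},
}

A = build_adjacency(pathway_genes, pathways, genes)
```

Here `A` has one row per pathway and one column per gene, so a gene shared by two pathways (such as `gene_2`) contributes a one to both rows.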

In the subsequent pathway layers, the connections between two adjacent pathway layers are determined by the pathway parent-child relationships in the Reactome pathway dataset and are stored in a binary mask matrix M. During the forward propagation calculation of the network, the output vector y of each layer is jointly determined by the input vector x, the weight matrix W, and the mask matrix M, forming a sparse network model. The calculation formula is as follows:

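$$y = f\left[(W \odot M)\,x + b\right], \tag{1}$$

where $f$ is the activation function and $\odot$ denotes element-wise multiplication, so that connections masked out by $M$ contribute nothing to the output. For each node, we use the following tanh activation function: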

$$f = \tanh(x) = \frac{e^{2x} - 1}{e^{2x} + 1}, \tag{2}$$

and as a result, the value of each node remains in the range [−1, 1]. The activation function of the final output layer is the sigmoid function:

$$f = \frac{1}{1 + e^{-x}}, \tag{3}$$

which outputs a number in the range of (0, 1), with 0 representing good prognosis and 1 representing poor prognosis.
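The masked forward pass of equation (1), together with the tanh and sigmoid activations, can be illustrated with a small dependency-free sketch. The weights, mask, and inputs are toy values chosen for the example, and `masked_layer` is a hypothetical helper rather than the Keras layer used in the study.

```python
import math


def masked_layer(x, W, M, b):
    """Compute tanh((W * M) x + b), where * is element-wise."""
    outputs = []
    for w_row, m_row, bias in zip(W, M, b):
        z = sum(w * m * xi for w, m, xi in zip(w_row, m_row, x)) + bias
        outputs.append(math.tanh(z))
    return outputs


def sigmoid(z):
    """Output activation, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


x = [1.0, 0.0, 1.0]            # toy input vector
W = [[0.5, -0.3, 0.8],         # toy weights
     [0.2, 0.4, -0.6]]
M = [[1, 1, 0],                # toy mask: 1 keeps a connection, 0 removes it
     [0, 1, 1]]
b = [0.0, 0.1]

y = masked_layer(x, W, M, b)

# Changing a masked-out weight leaves the output untouched.
W_changed = [[0.5, -0.3, 99.0], [0.2, 0.4, -0.6]]
y_changed = masked_layer(x, W_changed, M, b)
```

Because the mask multiplies the weights before the matrix product, a masked weight can take any value without affecting the output, which is what keeps the layer sparse.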

To measure the importance of each node in the network model, we use the DeepLIFT [22] gradient-based attribution method to rank the features in all layers. DeepLIFT utilizes a back-propagation method to propagate important signals from output neurons back through layers to the input [22]. The DeepLIFT scheme implemented in this study uses the GitHub library (https://github.com/kundajelab/deeplift).

In this work, to calculate the importance of nodes in each layer, each node needs to be assigned a score. Let $t$ represent the target output and let $x_1, x_2, \ldots, x_n$ represent some intermediate layer neurons that are necessary to compute the target output. Let $t^0$ denote the reference activation of $t$. We define $\Delta t$ as the difference-from-reference:

$$\Delta t = t - t^0. \tag{4}$$

DeepLIFT assigns contribution scores $C_{\Delta x_i \Delta t}$ to $\Delta x_i$ such that

$$\sum_{i=1}^{n} C_{\Delta x_i \Delta t} = \Delta t, \tag{5}$$

where $C_{\Delta x_i \Delta t}$ can be thought of as the amount of difference-from-reference in $t$ that is attributable to the difference-from-reference of $x_i$.
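The summation property above is easiest to see for a single linear unit, where the contribution of each input is its weight times its difference-from-reference. The sketch below checks this numerically with toy values. It does not use the DeepLIFT library referenced in the text, and it covers only the linear case, not the rules DeepLIFT applies to nonlinear activations.

```python
def linear_output(weights, inputs, bias):
    """Toy target neuron: t = sum_i w_i * x_i + b."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias


weights = [0.4, -1.2, 0.7]   # toy values
bias = 0.1
x = [1.0, 0.0, 1.0]          # actual input
x_ref = [0.0, 0.0, 0.0]      # assumed all-zero reference input

t = linear_output(weights, x, bias)
t_ref = linear_output(weights, x_ref, bias)
delta_t = t - t_ref

# For a linear unit, each contribution is w_i * (x_i - x_i^0).
contributions = [w * (xi - ri) for w, xi, ri in zip(weights, x, x_ref)]
```

The contributions add up to `delta_t`, and the input with the largest absolute contribution would be ranked as the most important one for this toy neuron.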

2.6. Parameters Optimization and Model Training

We split the TCGA gastric cancer dataset (containing somatic mutation and copy number data) into an 80% training set, a 10% validation set, and a 10% test set for predicting survival. To make the model training converge smoothly, we initialized the learning rate to 0.001 and reduced it after every 100 epochs. The model was trained using the Adam optimizer [23]. We performed 1000 epochs of training and optimized the parameters according to the cross-entropy loss function:

$$L = -\frac{1}{N}\sum_{i}\left[y_i \cdot \log(p_i) + (1 - y_i) \cdot \log(1 - p_i)\right], \tag{6}$$

where N represents the total number of samples, yi is the label corresponding to sample i, and pi is the LTS probability of sample i calculated according to the sigmoid function.
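For concreteness, the loss in equation (6) can be computed as in the sketch below. The clipping constant `eps` is an addition for numerical safety and is not mentioned in the text; the labels and probabilities are toy values.

```python
import math


def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    """Mean binary cross-entropy over N samples."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)   # avoid log(0); an added safeguard
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(y_true)


labels = [1, 0, 1, 1]                       # toy labels
confident = [0.9, 0.1, 0.8, 0.95]           # toy predictions close to labels
uninformative = [0.5, 0.5, 0.5, 0.5]        # toy predictions with no signal

loss_confident = binary_cross_entropy(labels, confident)
loss_uninformative = binary_cross_entropy(labels, uninformative)
```

Predicting 0.5 for every sample gives a loss of log 2, so any useful model should fall below that value.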

2.7. Methods for Comparison

In this work, we investigated the effectiveness of four traditional machine learning approaches (decision trees, support vector machines, logistic regression, and random forests) in predicting the prognosis of gastric cancer. We used the scikit-learn package [24] to implement these algorithms with the default settings.

3. Results

3.1. Comparison of Weights between the GCS-Net Model and the Dense Network Model

There are far fewer weights in the GCS-Net sparse model than in a fully connected dense network with the same number of nodes. The sparse model has 83,347 weights (Table 1), while the fully connected dense network has more than 300 million weights. The formula for calculating the number of weights in a layer in the fully connected dense network is as follows:

$$w_l = n_l\left(n_{l-1} + 1\right), \tag{7}$$

where $w_l$ is the number of weights in layer $l$, $n_l$ is the number of nodes in that layer, and $n_{l-1}$ is the number of nodes in the previous layer; the $+1$ accounts for the bias of each node. The formula for calculating the number of weights in a layer in the sparse network is as follows:

$$\text{weights} = M \odot W, \tag{8}$$

where $M$ is the mask matrix of each layer, with each element in $M$ being 1 or 0 depending on whether or not the corresponding connection path with the parent-child relationship exists, $W$ is the weight matrix of the layer, and $\odot$ denotes element-wise multiplication.
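The comparison can be checked with simple arithmetic using the layer widths and parameter counts listed in Table 1. The sketch assumes that the dense counterpart keeps the same chain of layers (input, gene layer, four pathway layers) and that the per-layer output heads are left out of the dense count; both are assumptions made for this illustration.

```python
# Layer widths from Table 1: input features, gene layer, four pathway layers
layer_sizes = [34380, 11460, 1061, 447, 147, 26]


def dense_weights(n_prev, n_curr):
    """Equation (7): weights plus one bias per node in a dense layer."""
    return n_curr * (n_prev + 1)


dense_per_layer = [dense_weights(n_prev, n_curr)
                   for n_prev, n_curr in zip(layer_sizes, layer_sizes[1:])]
dense_total = sum(dense_per_layer)

# Parameter counts reported in Table 1 for the sparse model
sparse_hidden = [45840, 22081, 1512, 594, 174]     # h0 to h4
output_heads = [11461, 1062, 448, 148, 27]         # o_linear1 to o_linear5
sparse_total = sum(sparse_hidden) + sum(output_heads)

reduction_factor = dense_total / sparse_total
```

Under these assumptions the first dense layer alone would need 394,006,260 parameters, which already exceeds the 300 million figure quoted in the text, while the sparse total reproduces the 83,347 in Table 1.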

Table 1.

GCS-Net network model parameters.

| Layer (type) | Output shape | Param | Connected to |
|---|---|---|---|
| Inputs (InputLayer) | (none, 34380) | 0 | |
| h0 (Diagonal) | (none, 11460) | 45840 | Inputs[0][0] |
| dropout_0 (Dropout) | (none, 11460) | 0 | h0[0][0] |
| h1 (SparseTF) | (none, 1061) | 22081 | dropout_0[0][0] |
| dropout_1 (Dropout) | (none, 1061) | 0 | h1[0][0] |
| h2 (SparseTF) | (none, 447) | 1512 | dropout_1[0][0] |
| dropout_2 (Dropout) | (none, 447) | 0 | h2[0][0] |
| h3 (SparseTF) | (none, 147) | 594 | dropout_2[0][0] |
| dropout_3 (Dropout) | (none, 147) | 0 | h3[0][0] |
| h4 (SparseTF) | (none, 26) | 174 | dropout_3[0][0] |
| o_linear1 (Dense) | (none, 1) | 11461 | h0[0][0] |
| o_linear2 (Dense) | (none, 1) | 1062 | h1[0][0] |
| o_linear3 (Dense) | (none, 1) | 448 | h2[0][0] |
| o_linear4 (Dense) | (none, 1) | 148 | h3[0][0] |
| o_linear5 (Dense) | (none, 1) | 27 | h4[0][0] |
| Total | | 83347 | |

3.2. Comparison with Other Methods

Traditional machine learning models such as decision trees, support vector machines, logistic regression, and random forests performed worse than the GCS-Net. We trained the GCS-Net and these traditional machine learning models for long-term survival prediction of gastric cancer (GC), and the GCS-Net showed higher accuracy, area under the receiver operating characteristic (ROC) curve (AUC), and F1 score than the traditional classifiers (AUC = 0.807, area under the precision-recall curve (AUPR) = 0.949, and accuracy = 0.844) (Table 2) (Figure 3(a)).

Table 2.

Scores of the GCS-Net and classical machine learning models.

| Model | Accuracy | AUC | AUPR | F1 | Precision | Recall |
|---|---|---|---|---|---|---|
| GCS-Net | 0.844 | 0.807 | 0.949 | 0.913 | 0.840 | 1.000 |
| L2 logistic regression | 0.800 | 0.751 | 0.907 | 0.886 | 0.833 | 0.945 |
| RBF support vector machine | 0.733 | 0.628 | 0.916 | 0.846 | 0.804 | 0.891 |
| Linear support vector machine | 0.777 | 0.743 | 0.943 | 0.871 | 0.829 | 0.918 |
| Random forest | 0.800 | 0.785 | 0.946 | 0.886 | 0.833 | 0.945 |
| Decision tree | 0.755 | 0.692 | 0.893 | 0.857 | 0.825 | 0.891 |
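The F1 scores in Table 2 follow from the reported precision and recall through the harmonic mean. The short check below recomputes them from the rounded table values, so agreement is expected only up to rounding; the tolerance of 0.002 is a choice made for this example.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from Table 2
table2 = {
    "GCS-Net": (0.840, 1.000, 0.913),
    "L2 logistic regression": (0.833, 0.945, 0.886),
    "RBF support vector machine": (0.804, 0.891, 0.846),
    "Linear support vector machine": (0.829, 0.918, 0.871),
    "Random forest": (0.833, 0.945, 0.886),
    "Decision tree": (0.825, 0.891, 0.857),
}

recomputed = {name: f1_score(p, r) for name, (p, r, _) in table2.items()}
max_gap = max(abs(recomputed[name] - reported)
              for name, (_, _, reported) in table2.items())
```

For the GCS-Net row, 2 × 0.840 × 1 / (0.840 + 1) is about 0.913, which matches the reported value.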

Figure 3.

Prediction performance of the GCS-Net. (a) The AUPRC value of the GCS-Net outperforms other classical machine learning models on the test set. (b) The GCS-Net has a true negative rate (TN) of 75% and a true positive rate (TP) of 100% in the test set.

Evaluated on the test set, the GCS-Net model achieved a true negative rate (TN) of 75% and a true positive rate (TP) of 100%, indicating that the model generalizes to some extent and can classify samples that are not in the training set (Figure 3(b)).

3.3. Inspection and Interpretation of the GCS-Net

To understand the connections and interactions between different mutations, copy number variations, genes, and biological pathways from input to output after training, we visualized the entire structure of the GCS-Net using a Sankey diagram (Figure 4).

Figure 4.

GCS-Net model pathway Sankey diagram. The Sankey diagram shows the node importance and mutual influence of each layer of the GCS-Net model; nodes with darker colors are more important. The leftmost nodes represent the input feature data types; the nodes of the second layer represent the gene layer constructed according to the Reactome pathway; each subsequent layer represents a higher-level biological pathway; the last layer represents the prediction result.

From the figure, we can see that compared with copy number variation, mutation has a greater impact on the prognosis, which is consistent with the related studies of gastric cancer. To obtain the importance of each node, we use the DeepLIFT attribution method to calculate the node's contribution score to rank the nodes. UBE2C, JAK2, RAD21, NUP210, PTPN1, CDC27, NUP188, and PLK4 were the top-ranked genes, and they have been reported in related gastric cancer studies (Table 3).

Table 3.

The top genes for survival prediction in GC by the GCS-Net.

| Gene name | Reference |
|---|---|
| UBE2C | [25] |
| JAK2 | [26] |
| RAD21 | [27] |
| NUP210 | [28] |
| PTPN1 | [29] |
| CDC27 | [30] |
| NUP188 | [31] |
| PLK4 | [32] |

At the same time, in the hidden layer of pathways, we found that mitotic anaphase, antigen processing, recruitment of NuMA to mitotic centrosomes, neddylation, centrosome maturation, SUMO E3 ligases, G2/M transition, M phase, SUMOylation, and cell cycle have an important impact on the prognosis of gastric cancer. These pathways involve cell cycle checkpoints, posttranslational modification, and transcriptional regulation. These pathways have been studied in the relevant gastric cancer prognostic literature (Table 4).

Table 4.

The top pathways for survival prediction in GC by the GCS-Net.

| Pathway name | Reference |
|---|---|
| Mitotic anaphase | [33] |
| Antigen processing | [34] |
| Recruitment of NuMA to mitotic centrosomes | [35] |
| Neddylation | [36] |
| Centrosome maturation | [37] |
| SUMO E3 ligases | [38] |
| G2/M transition | [39] |
| M phase | [40] |
| SUMOylation | [41] |
| Cell cycle | [42] |

According to the literature [33], the expression level of the mitotic checkpoint BUB gene family is closely connected with tumor cell proliferation, and BUB overexpression in gastric cancer is a proliferation-dependent phenomenon. Reeves and James reviewed antigen processing and immune regulation in the response to tumors [34]. Pan et al. identified the SUMO E3 ligase CBX4 as a poor prognostic predictor in gastric cancer using a multipronged omics analysis [38].

4. Discussion of Results

Compared with traditional machine learning methods, the GCS-Net performs better and has far fewer learnable parameters than a dense network of the same size. More importantly, it offers excellent model interpretability. Using the DeepLIFT method to measure the importance of different genes and pathways in predicting results, the GCS-Net found known genes related to gastric cancer, such as UBE2C, JAK2, RAD21, CEP250, NUP210, PTPN1, CDC27, NINL, NUP188, and PLK4. In addition, the GCS-Net also identified important biological pathways, such as mitotic anaphase, resolution of sister chromatid cohesion, and SUMO E3 ligases. These important genes and pathways are documented in the relevant gastric cancer biology literature.

Although our method has proved to be robust and reliable in predicting the prognosis of gastric cancer, there are still some concerns that need to be addressed. First, we found that the false-positive rate was high. One possible reason was the imbalance of samples in the dataset. Among them, there were only 42 samples with a good prognosis of gastric cancer with long-term survival greater than 5 years. Second, this experiment uses mutation data and copy number variation data in the multiomics data. If more omics data such as RNA and methylation data had been added, there might have been a higher prediction accuracy. Third, studies [42] have shown that clinical data also help to improve cancer prognosis prediction performance, which is a potential approach to improve model prediction performance.

5. Conclusions

Multiomics data analysis can be used to forecast cancer survival information. In this study, we developed the GCS-Net for predicting gastric cancer prognosis. The GCS-Net utilizes a biological pathway-based architecture and integrates multiomics data for prognosis prediction of gastric cancer.

In the future, we will add more omics data for prediction, use cross-validation to reduce the performance impact of low sample size, and collect more sample data for modeling. In addition, we will optimize the interpretability of deep neural networks through optimization algorithms, such as loss functions, to further improve the accuracy of the model. We will also consider applying this model to the prediction of gastric cancer types, such as diffuse and intestinal types [44].

Finally, the GCS-Net is a deep neural network with interpretable biological pathways for accurate gastric cancer prognosis. Neural networks based on biological information pathways offer a novel approach to biological discovery that might be used for a variety of additional cancer prediction and research applications. To more precisely assess the prognosis of gastric cancer patients, we will combine clinical data and multiomics data and analyze the effect of heterogeneity generated by diverse clinical characteristic data (including age, gender, and pathology) on the prognostic risk of gastric cancer patients.

Acknowledgments

The authors would like to thank University Excellent Talent Funding Project of Anhui Province (Grant no. gxgnfx2020088), Natural Science Project of Anhui University of Chinese Medicine (Grant no. 2020wtzx02), and Industry-University Cooperation Collaborative Education Project of Ministry of Education of the People's Republic of China (Grant no. 202101123001).

Data Availability

Gastric cancer data are obtained from TCGA database (https://tcga-data.nci.nih.gov/tcga/). The Bioinformatics Pathway Database Reactome is from https://reactome.org/download-data.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Authors' Contributions

Qingfa Wu designed the study. Jili Hu developed the computational model. Yuting Dai and Weiqiang Yu collected data. Weiqiang Yu, Can Liu, and Yongkang Wang analyzed the data. Jili Hu wrote the manuscript. All authors reviewed and approved this paper. Jili Hu and Weiqiang Yu contributed equally.

References

  • 1. Maconi G., Manes G., Porro G. B. Role of symptoms in diagnosis and outcome of gastric cancer. World Journal of Gastroenterology. 2008;14(8):1149–1155. doi: 10.3748/wjg.14.1149.
  • 2. Ferlay J., Shin H.-R., Bray F., Forman D., Mathers C., Parkin D. M. Estimates of worldwide burden of cancer in 2008: GLOBOCAN 2008. International Journal of Cancer. 2010;127(12):2893–2917. doi: 10.1002/ijc.25516.
  • 3. Qiu J., Sun M., Wang Y., Chen B. Identification of hub genes and pathways in gastric adenocarcinoma based on bioinformatics analysis. Medical Science Monitor: International Medical Journal of Experimental and Clinical Research. 2020;26. doi: 10.12659/msm.920261.
  • 4. Johnston F. M., Beckman M. Updates on management of gastric cancer. Current Oncology Reports. 2019;21(8):p. 67. doi: 10.1007/s11912-019-0820-4.
  • 5. Stirzaker C., Zotenko E., Song J. Z., et al. Methylome sequencing in triple-negative breast cancer reveals distinct methylation clusters with prognostic value. Nature Communications. 2015;6(1):p. 5899. doi: 10.1038/ncomms6899.
  • 6. Volinia S., Croce C. M. Prognostic microRNA/mRNA signature from the integrated analysis of patients with invasive breast cancer. Proceedings of the National Academy of Sciences. 2013;110(18):7413–7417. doi: 10.1073/pnas.1304977110.
  • 7. Wu Y., Chen H., Jiang G., et al. Genome-wide association study (GWAS) of germline copy number variations (CNVs) reveal genetic risks of prostate cancer in Chinese population. Journal of Cancer. 2018;9(5):923–928. doi: 10.7150/jca.22802.
  • 8. Tomczak K., Czerwińska P., Wiznerowicz M. The Cancer Genome Atlas (TCGA): an immeasurable source of knowledge. Contemporary Oncology. 2015;19:A68–A77. doi: 10.5114/wo.2014.47136.
  • 9. Tong D., Tian Y., Zhou T., et al. Improving prediction performance of colon cancer prognosis based on the integration of clinical and multi-omics data. BMC Medical Informatics and Decision Making. 2020;20(1):p. 22. doi: 10.1186/s12911-020-1043-1.
  • 10. Chaudhary K., Poirion O. B., Lu L., Garmire L. X. Deep learning–based multi-omics integration robustly predicts survival in liver cancer. Clinical Cancer Research. 2018;24(6):1248–1259. doi: 10.1158/1078-0432.ccr-17-0853.
  • 11. Hu Y., Zhao L., Li Z., Dong X., Xu T., Zhao Y. Classifying the multi-omics data of gastric cancer using a deep feature selection method. Expert Systems with Applications. 2022;200:116813. doi: 10.1016/j.eswa.2022.116813.
  • 12. Xu J., Yao Y., Xu B., Li Y., Su Z. Unsupervised learning of cross-modal mappings in multi-omics data for survival stratification of gastric cancer. Future Oncology. 2022;18(2):215–230. doi: 10.2217/fon-2021-1059.
  • 13. Tufail A. B., Ma Y.-K., Kaabar M. K. A., et al. Deep learning in cancer diagnosis and prognosis prediction: a minireview on challenges, recent trends, and future directions. Computational and Mathematical Methods in Medicine. 2021;2021:9025470. doi: 10.1155/2021/9025470.
  • 14. Ma J., Yu M. K., Fong S., et al. Using deep learning to model the hierarchical structure and function of a cell. Nature Methods. 2018;15(4):290–298. doi: 10.1038/nmeth.4627.
  • 15. Yang J. H., Wright S. N., Hamblin M., et al. A white-box machine learning approach for revealing antibiotic mechanisms of action. Cell. 2019;177(6):1649–1661. doi: 10.1016/j.cell.2019.04.016.
  • 16. Kuenzi B. M., Park J., Fong S. H., et al. Predicting drug response and synergy using a deep learning model of human cancer cells. Cancer Cell. 2020;38(5):672–684. doi: 10.1016/j.ccell.2020.09.014.
  • 17. Hao J., Kim Y., Kim T. K., Kang M. PASNet: pathway-associated sparse deep neural network for prognosis prediction from high-throughput data. BMC Bioinformatics. 2018;19(1):p. 510. doi: 10.1186/s12859-018-2500-z.
  • 18. Elmarakeby H. A., Hwang J., Arafeh R., et al. Biologically informed deep neural network for prostate cancer discovery. Nature. 2021;598(7880):348–352. doi: 10.1038/s41586-021-03922-4.
  • 19. Wei L., Jin Z., Yang S., Xu Y., Zhu Y., Ji Y. TCGA-assembler 2: software pipeline for retrieval and processing of TCGA/CPTAC data. Bioinformatics. 2018;34(9):1615–1617. doi: 10.1093/bioinformatics/btx812.
  • 20. Jassal B., Matthews L., Viteri G. The reactome pathway knowledgebase. Nucleic Acids Research. 2020;48(D1):D498–D503. doi: 10.1093/nar/gkz1031.
  • 21. Hagberg A., Swart P., Chult D. S. Exploring Network Structure, Dynamics, and Function Using NetworkX. Los Alamos, NM, USA: Los Alamos National Lab. (LANL); 2008.
  • 22. Shrikumar A., Greenside P., Kundaje A. Learning important features through propagating activation differences. Proceedings of the 34th International Conference on Machine Learning; August 2017; Sydney, Australia. PMLR; pp. 3145–3153.
  • 23. Kingma D. P., Ba J. Adam: a method for stochastic optimization. 2017. https://arxiv.org/abs/1412.6980.
  • 24. Pedregosa F., Varoquaux G., Gramfort A. Scikit-learn: machine learning in Python. Journal of Machine Learning Research. 2011;12(85):2825–2830.
  • 25. Zhang H.-Q., Zhao G., Ke B., et al. Overexpression of UBE2C correlates with poor prognosis in gastric cancer patients. European Review for Medical and Pharmacological Sciences. 2018;22(6):1665–1671. doi: 10.26355/eurrev_201803_14578.
  • 26. Judd L. M., Menheniott T. R., Ling H., et al. Inhibition of the JAK2/STAT3 pathway reduces gastric cancer growth in vitro and in vivo. PLoS One. 2014;9(5):e95993. doi: 10.1371/journal.pone.0095993.
  • 27. Yun J., Song S.-H., Kang J. Y., et al. Reduced cohesin destabilizes high-level gene amplification by disrupting pre-replication complex bindings in human cancers with chromosomal instability. Nucleic Acids Research. 2016;44(2):558–572. doi: 10.1093/nar/gkv933.
  • 28. Gu Q., Hou W., Liu H., et al. NUP210 and MicroRNA-22 modulate fas to elicit HeLa cell cycle arrest. Yonsei Medical Journal. 2020;61(5):371–381. doi: 10.3349/ymj.2020.61.5.371.
  • 29. Zhu Z., Fu H., Wang S., et al. Whole-exome sequencing identifies prognostic mutational signatures in gastric cancer. Annals of Translational Medicine. 2020;8(22):p. 1484. doi: 10.21037/atm-20-6620.
  • 30. Xin Y., Ning S., Zhang L., Cui M. CDC27 facilitates gastric cancer cell proliferation, invasion and metastasis via twist-induced epithelial-mesenchymal transition. Cellular Physiology and Biochemistry. 2018;50(2):501–511. doi: 10.1159/000494164.
  • 31. Xu G., Li K., Zhang N., Zhu B., Feng G. Screening driving transcription factors in the processing of gastric cancer. Gastroenterology Research and Practice. 2016;2016:8431480. doi: 10.1155/2016/8431480.
  • 32. Shinmura K., Kurabe N., Goto M., et al. PLK4 overexpression and its effect on centrosome regulation and chromosome stability in human gastric cancer. Molecular Biology Reports. 2014;41(10):6635–6644. doi: 10.1007/s11033-014-3546-2.
  • 33. Grabsch H., Takeno S., Parsons W. J., et al. Overexpression of the mitotic checkpoint genes BUB1, BUBR1, and BUB3 in gastric cancer—association with tumour cell proliferation. The Journal of Pathology. 2003;200(1):16–22. doi: 10.1002/path.1324.
  • 34. Reeves E., James E. Antigen processing and immune regulation in the response to tumours. Immunology. 2017;150(1):16–24. doi: 10.1111/imm.12675.
  • 35. Shi W., Zhang G., Ma Z., et al. Hyperactivation of HER2-SHCBP1-PLK1 axis promotes tumor cell mitosis and impairs trastuzumab sensitivity to gastric cancer. Nature Communications. 2021;12(1):p. 2812. doi: 10.1038/s41467-021-23053-8.
  • 36.Lan H., Tang Z., Jin H., Sun Y. Neddylation inhibitor MLN4924 suppresses growth and migration of human gastric cancer cells. Scientific Reports . 2016;6(1) doi: 10.1038/srep24218.24218 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37.Kamada K., Yamada Y., Hirao T., et al. Amplification/overexpression of Aurora-A in human gastric carcinoma: potential role in differentiated type gastric carcinogenesis. Oncology Reports . 2004;12(3):593–599. doi: 10.3892/or.12.3.593. [DOI] [PubMed] [Google Scholar]
  • 38.Pan Y., Li Q., Cao Z., Zhao S. The SUMO E3 ligase CBX4 is identified as a poor prognostic marker of gastric cancer through multipronged OMIC analyses. Genes & Diseases . 2021;8(6):827–837. doi: 10.1016/j.gendis.2020.08.010. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39.Ding L., Yang L., He Y., et al. CREPT/RPRD1B associates with Aurora B to regulate Cyclin B1 expression for accelerating the G2/M transition in gastric cancer. Cell Death & Disease . 2018;9(12):1172–1215. doi: 10.1038/s41419-018-1211-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40.Cai R., Ding X., Zhou K., et al. Blockade of TRPC6 channels induced G2/M phase arrest and suppressed growth in human gastric cancer cells. International Journal of Cancer . 2009;125(10):2281–2287. doi: 10.1002/ijc.24551. [DOI] [PubMed] [Google Scholar]
  • 41.Ren Y. H., Liu K. J., Wang M., et al. De-SUMOylation of FOXC2 by SENP3 promotes the epithelial-mesenchymal transition in gastric cancer cells. Oncotarget . 2014;5(16):7093–7104. doi: 10.18632/oncotarget.2197. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42.Otsubo T., Akiyama Y., Yanagihara K., Yuasa Y. SOX2 is frequently downregulated in gastric cancers and inhibits cell growth through cell-cycle arrest and apoptosis. British Journal of Cancer . 2008;98(4):824–831. doi: 10.1038/sj.bjc.6604193. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 43.Cheerla A., Gevaert O. Deep learning with multimodal representation for pancancer prognosis prediction. Bioinformatics . 2019;35(14):i446–i454. doi: 10.1093/bioinformatics/btz342. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 44.Choi Y. J., Kim N. Gastric cancer and family history. Korean Journal of Internal Medicine (Korean Edition) . 2016;31(6):1042–1053. doi: 10.3904/kjim.2016.147. [DOI] [PMC free article] [PubMed] [Google Scholar]


Data Availability Statement

Gastric cancer data were obtained from the TCGA database (https://tcga-data.nci.nih.gov/tcga/). Reactome pathway data are available at https://reactome.org/download-data.


Articles from Journal of Oncology are provided here courtesy of Wiley
