Medicine. 2022 Sep 9;101(36):e30374. doi: 10.1097/MD.0000000000030374

Identification of prognosis-related hub genes of ovarian cancer through bioinformatics analyses and experimental verification

Zhong Yu a, Ling Ouyang a,*
PMCID: PMC10980417  PMID: 36086731

Abstract

Ovarian cancer (OC) is a lethal and highly prevalent disease in women worldwide. The disease is often diagnosed at a late stage, which is associated with rapid progression and a low survival rate. This study aims to identify new prognostic genes for OC. Based on 2 datasets from the National Center for Biotechnology Information Gene Expression Omnibus public database, we constructed 2 Weighted Gene Co-expression Network Analysis networks. Then, we selected and intersected 2 key modules to screen key genes. Enrichment analyses were performed, and a protein-protein interaction network was constructed. The cytoHubba plugin of Cytoscape and survival analysis were used to screen hub genes related to prognosis. The expression of hub genes was analyzed by GEPIA and verified by quantitative Real-Time PCR. Gene alteration frequency analysis, gene set variation analysis, immune infiltration analysis, drug sensitivity analysis, and tumor mutation burden (TMB) and neoantigen analyses were conducted to determine the prognostic value and molecular mechanisms of the hub genes. In total, 214 key genes were selected from 2 Weighted Gene Co-expression Network Analysis networks, and 3 hub genes, namely ALDH1A2, CLDN4, and GPR37, were identified as prognostic candidates through cytoHubba and survival analysis. All 3 hub genes were significantly associated with the overall survival of OC patients. GEPIA and quantitative Real-Time PCR indicated that ALDH1A2 expression was significantly downregulated, while expression of CLDN4 and GPR37 was upregulated in OC samples compared with normal samples. CIBERSORT showed that the 3 hub genes were closely associated with infiltrating immune cells. GDSC analysis showed that hub gene expression influenced the IC50 values of chemotherapeutic drugs. OC patients with high expression of ALDH1A2 and CLDN4 had lower TMB, and low ALDH1A2 expression was associated with a larger number of neoantigens.
In conclusion, the 3 hub genes (ALDH1A2, CLDN4 and GPR37) identified through bioinformatics analyses in the present study may serve as OC prognosis biomarkers. The study findings offer valuable insights into OC progression and mechanisms.

Keywords: immune cell infiltration, ovarian cancer, prognosis biomarkers, tumour mutation burden, Weighted Gene Co-expression Network Analysis

1. Introduction

Ovarian cancer (OC) is a major gynecological malignancy with an overall 5-year survival rate of 45%.[1] Early-stage (stages I and II) OC has a 5-year survival rate of approximately 90%, in contrast to the late-stage (stages III and IV) survival of only 20 to 40% of patients.[2,3] OC is generally treated through surgical resection, chemotherapy, and molecular targeted therapy. However, the optimal treatment for advanced OC remains unknown, necessitating the identification of reliable diagnostic and prognostic biomarkers.

Serum CA125 is a well-known biomarker and is the most widely used indicator in OC diagnosis and prognosis prediction.[4] The advancement of genome-sequencing technologies and bioinformatics algorithms has led to the identification of many molecular signatures and genetic markers, which can improve prognosis prediction in OC patients. Deep-learning applications for analyzing genomic data have been the preferred methods for exploring prognostic signatures to improve the efficacy of OC treatment. The Weighted Gene Co-expression Network Analysis (WGCNA) is extensively employed to analyze large datasets and identify modules with highly correlated genes. Furthermore, the assessment of the relationships between genes and clinical characteristics, as well as candidate biomarker identification, have been performed successfully using the WGCNA. Other applications of the WGCNA include gene clustering, module screening based on resemblances between gene expression profiles, and analyzing the associations between modules and clinical data.[5] In the present study, we identified key prognosis-related genes by screening survival-related modules through the WGCNA, followed by protein–protein interaction (PPI) network construction, finally selecting 3 hub genes through cytoHubba and survival analyses. The expression of hub genes was analyzed by Gene Expression Profiling Interactive Analysis (GEPIA) and verified by quantitative Real-Time PCR (qRT-PCR). We also conducted gene alteration frequency analysis, gene set variation analysis (GSVA), and analyses of immune infiltration, drug sensitivity, tumor mutation burden (TMB), and neoantigens to assess the prognostic value and molecular functions of the candidate prognosis-related OC genes.

2. Materials and Methods

2.1. Data retrieval from Gene Expression Omnibus

The publicly available database Gene Expression Omnibus (GEO, https://www.ncbi.nlm.nih.gov/geo/) serves as a rich source of gene sequencing and expression data obtained through DNA-microarray analysis and next-generation sequencing. The GEO database is freely accessible to users. The species Homo sapiens was selected to obtain information on the ovarian tissues. The datasets with more than 15 samples and containing complete information on survival were selected for subsequent analyses. Two gene expression profiles, namely GSE26712 and GSE54388, were obtained. The array data of GSE26712 comprised 185 OC tumor samples and 10 samples of normal ovarian tissue, whereas those of GSE54388 comprised 16 OC and 6 normal samples. The original expression profile was downloaded, and the robust multi-array average algorithm was used for background correction and quantile normalization. Thereafter, mRNA expression matrices of patients with OC were developed. Furthermore, the expression level was determined through variance analysis and ranked by variance. Finally, the top 5000 genes were selected and used for the construction of a co-expression network.
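The gene-filtering step described above can be illustrated with a short Python sketch that ranks genes by sample variance and keeps the most variable ones. This is not the authors' code (the analysis itself was run in R). The function name `top_variable_genes`, the gene names, and the expression values are hypothetical, and the study kept the top 5000 genes rather than the top 2 shown here.

```python
from statistics import variance


def top_variable_genes(expr, n_top):
    """Return the names of the n_top genes with the largest sample variance.

    expr maps a gene name to its list of expression values (one per sample).
    """
    ranked = sorted(expr, key=lambda gene: variance(expr[gene]), reverse=True)
    return ranked[:n_top]


# Toy matrix: 4 hypothetical genes measured in 4 samples.
toy_expr = {
    "geneA": [5.0, 5.1, 4.9, 5.0],  # nearly constant across samples
    "geneB": [2.0, 8.0, 3.0, 9.0],  # highly variable
    "geneC": [6.0, 6.5, 5.5, 6.0],
    "geneD": [1.0, 4.0, 1.5, 4.5],
}

selected = top_variable_genes(toy_expr, n_top=2)
print(selected)
```

Filtering by variance before network construction reduces the computational cost and removes genes whose expression barely changes between samples.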

2.2. Weighted co-expression network analysis

The suitability of the top 5000 most variable genes was evaluated, and a co-expression network was established using the package “WGCNA” in R. The Pearson’s correlation coefficients between individual pairs of genes were calculated, and the correlation strength between the nodes was determined by constructing an adjacency matrix. Soft thresholds of 9 and 11 were set for the analysis of GSE26712 and GSE54388, respectively (scale-free R^2 = 0.9). A topological overlap matrix (TOM) was constructed from the adjacency matrix; the TOM compares the weighted correlation between 2 nodes and other nodes and thus helps in the quantitative assessment of node similarities.[6] Modules were identified by hierarchical clustering, with each module containing at least 30 genes. Finally, the eigengene of each module was calculated, and the modules were hierarchically clustered to merge similar modules (MEDissThres = 0.2). Subsequently, excluding the grey module, the key module was selected as the one whose eigengene had the highest correlation coefficient and the smallest P value with the trait. To screen for the key genes, the gene significance (GS) and module membership (MM) of each gene were determined; GS refers to the association of an individual gene with the trait of interest, whereas MM represents the relationship between the eigengene of a module and the expression pattern of a gene. A high degree of correlation between GS and MM implies that the genes associated significantly with a trait are also the key elements of the modules associated with the trait. Finally, we selected the genes with GS > 0.5 and MM > 0.8 in the key module for further analysis.
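The adjacency and TOM steps described above can be sketched in plain Python. The sketch assumes an unsigned network, in which the adjacency is the absolute Pearson correlation raised to the soft-threshold power β; the text does not state which network type was used. The function names and toy expression profiles are hypothetical, and the actual analysis used the WGCNA package in R.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (neither may be constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def adjacency(profiles, beta):
    """Unsigned adjacency a_ij = |cor(i, j)| ** beta, with a zero diagonal."""
    n = len(profiles)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            adj[i][j] = adj[j][i] = abs(pearson(profiles[i], profiles[j])) ** beta
    return adj


def topological_overlap(adj):
    """TOM_ij = (sum_u a_iu * a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij)."""
    n = len(adj)
    connectivity = [sum(row) for row in adj]  # diagonal is 0, so self is excluded
    tom = [[1.0] * n for _ in range(n)]  # a gene overlaps fully with itself
    for i in range(n):
        for j in range(i + 1, n):
            shared = sum(adj[i][u] * adj[u][j] for u in range(n) if u not in (i, j))
            denom = min(connectivity[i], connectivity[j]) + 1 - adj[i][j]
            tom[i][j] = tom[j][i] = (shared + adj[i][j]) / denom
    return tom


# Four hypothetical genes measured in five samples.
toy_profiles = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [1.2, 1.9, 3.2, 3.8, 5.1],
    [5.0, 3.0, 4.0, 1.0, 2.0],
    [2.0, 5.0, 1.0, 4.0, 3.0],
]
toy_adj = adjacency(toy_profiles, beta=9)  # beta = 9 as used for GSE26712
toy_tom = topological_overlap(toy_adj)
toy_dissimilarity = [[1 - value for value in row] for row in toy_tom]  # 1-TOM
```

The matrix `toy_dissimilarity` corresponds to the 1-TOM measure that serves as the distance for hierarchical clustering. Raising correlations to a power suppresses weak correlations far more than strong ones, which is what pushes the network toward a scale-free topology.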

2.3. Data source from The Cancer Genome Atlas

The Cancer Genome Atlas (TCGA; https://portal.gdc.cancer.gov/) is the world’s largest cancer-related genetic information database that contains information on diverse aspects such as gene expression, miRNAs, copy number variations, DNA status, histone methylation, and single nucleotide polymorphisms. We used relevant data on OC and the differential expression patterns to process a total of 379 samples belonging to OC patients.

2.4. Functional enrichment analysis and hub gene identification

To analyze gene functions, the “clusterProfiler” R package was used to perform Gene Ontology (GO) annotation and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses. A PPI network was generated using the Search Tool for the Retrieval of Interacting Genes database (version 10.5, https://string-db.org/), and the top 100 candidate hub genes were screened using the “cytoHubba” plugin in Cytoscape version 3.7.2. Survival analysis was then conducted for each of the 100 genes to identify hub genes that were significantly associated with the overall survival of OC patients. Three datasets in the PrognoScan database (GSE26712, DUKE-OC, and GSE8841) were each used to verify the relationship between hub gene expression and OC prognosis. Using OC data from the GEPIA database, the expression levels of the hub genes in OC and normal tissues were compared.
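The text does not specify the estimator or the expression cutoff used for the per-gene survival analysis. As a minimal sketch, assuming a Kaplan-Meier estimator and a median split into high-expression and low-expression groups, one gene could be evaluated as follows. All function names and data values are hypothetical.

```python
from statistics import median


def split_by_median(expression):
    """Return (high, low) lists of sample indices relative to the median."""
    cutoff = median(expression)
    high = [i for i, value in enumerate(expression) if value > cutoff]
    low = [i for i, value in enumerate(expression) if value <= cutoff]
    return high, low


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    events[i] is 1 if the patient died at times[i] and 0 if censored.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored patients both leave the risk set
    return curve


# Hypothetical data for one gene: expression, follow-up (months), death flag.
toy_expression = [2.0, 9.0, 4.0, 7.0, 5.0, 1.0]
toy_times = [30, 6, 24, 10, 14, 36]
toy_events = [0, 1, 1, 1, 0, 0]

high, low = split_by_median(toy_expression)
high_curve = kaplan_meier([toy_times[i] for i in high], [toy_events[i] for i in high])
low_curve = kaplan_meier([toy_times[i] for i in low], [toy_events[i] for i in low])
```

In a full analysis, the two curves would be compared with a significance test such as the log-rank test, which is omitted here for brevity.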

2.5. Quantitative real-time polymerase chain reaction (qRT-PCR)

From March 2021 to October 2021, 10 cases of OC tissues (without preoperative chemotherapy or radiotherapy) and 10 cases of normal ovarian tissues were collected from Shengjing Hospital of China Medical University (Shenyang, China). All samples were kept at −80°C before use. The study was approved by the Ethics Committee of the Shengjing Hospital of China Medical University (2021PS823K). All included patients gave their written informed consent.

Total RNA from normal ovarian samples (n = 10) and OC samples (n = 10) was isolated using TRIpure (RP1001; BioTeke, Beijing, China). After the lysate had been left at room temperature for 5 minutes, 200 μL of chloroform was added and mixed by inversion to isolate RNA. After centrifugation at 10,000 g at 4°C for 10 minutes, the upper aqueous phase was transferred to a new centrifuge tube. Isopropanol equal in volume to the aqueous phase was added and mixed by inversion, and the mixture was kept at −20°C overnight. After centrifugation at 10,000 g at 4°C for 10 minutes, the supernatant was discarded. The RNA precipitate was washed with 1 mL of 75% ethanol and centrifuged at 34,000 g at 4°C for 3 minutes, and the supernatant was discarded. The remaining ethanol was allowed to evaporate at room temperature for 5–6 minutes. Finally, 30 μL of RNase-free H2O was used to completely dissolve the precipitate and obtain the total RNA. The RNA concentration of each sample was determined using a Nano 2000 photometer. The cDNA was synthesized using BeyoRT II M-MLV reverse transcriptase (D7160L; Beyotime, Shanghai, China). After the cDNA was mixed with SYBR Green (SY1020; Beyotime, Beijing, China), quantitative real-time PCR was performed using an Exicycler™ 96 Real-Time PCR System (Bioneer, Daejeon, South Korea) with the housekeeping gene β-actin as an internal control. The program was set as follows: 94°C for 5 minutes; 40 cycles of 94°C for 15 seconds, 60°C for 25 seconds, and 72°C for 30 seconds with a fluorescence scan; 72°C for 5 minutes 30 seconds; 40°C for 2 minutes 30 seconds; melting from 60°C to 94°C in 1.0°C steps of 1 second each; and 25°C for 1–2 minutes.
All procedures were conducted according to the manufacturer’s instructions, and data were analyzed using the 2^(−ΔCt) method, where ΔCt = (Ct of the gene of interest) − (Ct of β-actin).[7] The primers used in this experiment were as follows: ALDH1A2: F: 5′-TGATGATATGCGGATTG-3′, R: 5′-CTGAGTTATTGGCTCTTTC-3′; CLDN4: F: 5′-CAACTGCCTGGAGGATGAAA-3′, R: 5′-AGCGGATTGTAGAAGTCTTGGAT-3′; GPR37: F: 5′-GGGAAACAGCACGAACC-3′, R: 5′-AGATGACCAGCGGAAGG-3′.
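As a worked illustration of the 2^(−ΔCt) calculation, the following Python sketch converts Ct values into expression relative to β-actin. The Ct values are made up for the example and are not measurements from this study; the function name is hypothetical.

```python
def relative_expression(ct_target, ct_reference):
    """Return 2^(-dCt), where dCt = Ct(target gene) - Ct(reference gene)."""
    delta_ct = ct_target - ct_reference
    return 2 ** (-delta_ct)


# Hypothetical Ct values: a target gene and beta-actin in two samples.
normal_sample = relative_expression(ct_target=21.0, ct_reference=18.0)  # dCt = 3
tumor_sample = relative_expression(ct_target=24.0, ct_reference=18.0)   # dCt = 6

# Ratio of tumor to normal expression for this toy pair.
ratio = tumor_sample / normal_sample
print(normal_sample, tumor_sample, ratio)
```

Because each PCR cycle ideally doubles the amount of product, a ΔCt that is 3 cycles larger corresponds to an 8-fold lower relative expression.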

2.6. Gene alteration frequency analysis

The alteration frequencies of the hub genes in OC were assessed using the cBioPortal for Cancer Genomics (http://www.cbioportal.org/).

2.7. Gene set variation analysis

Gene sets from the Molecular Signatures Database (version 7.0) were downloaded, and potential changes in the biological functions in the samples were evaluated by comprehensively scoring each gene set using the GSVA algorithm.

2.8. Gene Multiple Association Network Integration Algorithm Analysis

The Gene Multiple Association Network Integration Algorithm (GeneMANIA) was applied to form an interaction network of the hub genes and explore the possible mechanisms of action of the hub genes in OC.

2.9. Immune cell infiltration analysis

CIBERSORT software is a tool for evaluating the types of immune cells in tumor microenvironments. This tool performs deconvolution analysis on the expression matrix of the various immune cells using support vector regression. It contains 547 biomarkers and 22 human immune cell phenotypes, including T cells, B cells, plasma cells, and myeloid cell subgroups. Spearman’s correlation coefficients between the hub genes and the identified immune cells were determined using the “ggstatsplot” package in R (https://github.com/IndrajeetPatil/ggstatsplot).
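The correlation step can be illustrated with a plain-Python Spearman coefficient, computed as the Pearson correlation of average ranks. This is only a sketch of the statistic; the study computed it in R, and the expression values and immune cell fractions below are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of positions start+1 .. end+1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical hub gene expression and estimated fraction of one immune cell type.
toy_gene_expression = [1.0, 2.5, 3.1, 4.8, 6.0]
toy_cell_fraction = [0.30, 0.22, 0.25, 0.10, 0.05]
rho = spearman(toy_gene_expression, toy_cell_fraction)
print(round(rho, 3))
```

A rank-based coefficient is a natural choice here because estimated cell fractions are bounded and often skewed, so a linear relationship with expression cannot be assumed.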

2.10. Drug sensitivity analysis

The Genomics of Drug Sensitivity in Cancer (GDSC; https://www.cancerrxgene.org/) database provides information about molecular markers of drug sensitivity in cancer cells. The chemotherapeutic sensitivity of each tumor sample was predicted from the database using the R package “pRRophetic”. Regression analysis was used to estimate the IC50 value for each drug, and the accuracy of the regression and prediction was evaluated by 10-fold cross-validation on the GDSC training set. Default values were used for all parameters, including “combat” to remove batch effects and averaging of the expression values of duplicated genes.

2.11. TMB and neoantigen analyses

The TMB is calculated as the number of mutations per megabase of sequenced DNA from a specific cancer type. An increased number of mutations results in more neoantigens, which in turn increases the probability that one or more of those neoantigens will be immunogenic and trigger a T-cell response.[8] The somatic mutation data of 379 OC patients from TCGA were downloaded. TMB was calculated by determining the mutation frequency of each sample, expressed as the number of mutations per exon length. The web-accessible server NetMHCpan (version 3.0) (http://www.cbs.dtu.dk/services/NetMHC-3.0) was used to evaluate the neoantigens of each patient.
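The TMB definition above (mutations per megabase) can be written as a short Python sketch. The exome length of 38 Mb is an assumption chosen for illustration and is not stated in the text; the function name, sample identifiers, and mutation records are likewise hypothetical.

```python
from collections import Counter

# Assumed exome size in base pairs; the text does not give the value used.
EXOME_LENGTH_BP = 38_000_000


def tumor_mutation_burden(mutation_records, exome_length_bp=EXOME_LENGTH_BP):
    """Return {sample_id: somatic mutations per megabase}.

    mutation_records is an iterable of (sample_id, gene) pairs, one pair per
    somatic mutation. Samples with no recorded mutation do not appear.
    """
    counts = Counter(sample_id for sample_id, _gene in mutation_records)
    megabases = exome_length_bp / 1_000_000
    return {sample: n / megabases for sample, n in counts.items()}


# Placeholder records: 76 mutations in sample S1 and 19 in sample S2.
toy_records = [("S1", "GENE_A")] * 76 + [("S2", "GENE_B")] * 19
tmb = tumor_mutation_burden(toy_records)
print(tmb)
```

Normalizing by the sequenced length makes values comparable across samples and platforms that cover different amounts of the genome.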

2.12. Statistical analysis

The R programming language (R Core Team, version 3.6) was used for statistical analyses. Values were expressed as the mean ± SD and were compared using Student’s t test in GraphPad Prism 9.0 software (GraphPad Software, San Diego, California, USA, www.graphpad.com). All statistical tests were two-sided. A P value of < .05 was considered statistically significant.
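For readers unfamiliar with the test, the following Python sketch computes the pooled-variance (Student's) t statistic and its degrees of freedom for two independent groups. It is an illustration only: the study used GraphPad Prism, the values below are hypothetical, and the P value is omitted because the t distribution is not available in the Python standard library.

```python
from math import sqrt
from statistics import mean, variance


def student_t(group_a, group_b):
    """Return (t statistic, degrees of freedom) for an equal-variance t test."""
    n_a, n_b = len(group_a), len(group_b)
    dof = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / dof
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, dof


# Hypothetical relative expression values in two groups of 5 samples.
toy_normal = [1.0, 2.0, 3.0, 4.0, 5.0]
toy_tumor = [3.0, 4.0, 5.0, 6.0, 7.0]
t_stat, dof = student_t(toy_normal, toy_tumor)
print(t_stat, dof)
```

The two-sided P value is then obtained by comparing the absolute t statistic with the t distribution on the returned degrees of freedom.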

3. Results

3.1. WGCNA construction and identification of key modules from GSE26712

The overall analytical framework of this study is shown in Figure 1. The GSE26712 dataset was obtained from the NCBI GEO public database and comprised 195 samples: 10 healthy controls and 185 OC patients. Based on the clinical features of the samples, a WGCNA network was constructed to uncover significant genes playing key roles in OC (Fig. 2). The value of “sft$powerEstimate” was used to determine the soft threshold (β), which was set to 9 (Fig. 3A,B). The dissimilarity measure based on TOM was used for hierarchical clustering, and genes showing similar expression patterns were merged into the same gene module by using the DynamicTreeCut algorithm (Fig. 3C). A total of 16 gene modules, namely black (1032), blue (1416), brown (768), cyan (238), dark red (55), green (510), grey (1818), grey (101), light cyan (106), light green (98), light yellow (97), magenta (612), pink (332), royal blue (77), salmon (159), and turquoise (2581), were detected in this analysis (Fig. 3D). By further analyzing the modules and traits, we found that the magenta module had the strongest correlation with the tumor phenotype (cor = −0.8, P = 2e−45) (Fig. 3E); therefore, this module was selected for subsequent verification.
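The selection rule behind a soft-threshold estimate of this kind can be sketched in Python: among the candidate powers, take the lowest one whose scale-free fit index reaches the chosen cutoff (0.9 in this study). The fit values in the table are hypothetical and the function name is illustrative; the actual estimate came from the WGCNA package in R.

```python
def pick_power(fit_table, r2_cutoff=0.9):
    """Return the lowest power whose scale-free fit R^2 reaches r2_cutoff.

    fit_table is a list of (power, r_squared) pairs. Returns None when no
    candidate power reaches the cutoff.
    """
    for power, r_squared in sorted(fit_table):
        if r_squared >= r2_cutoff:
            return power
    return None


# Hypothetical fit indices for a range of candidate powers.
toy_fit = [(1, 0.12), (3, 0.48), (5, 0.71), (7, 0.85), (9, 0.91), (11, 0.93)]
chosen = pick_power(toy_fit)
print(chosen)
```

Choosing the lowest qualifying power keeps the mean connectivity of the network as high as possible while still approximating a scale-free topology.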

Figure 1.

The framework of the study. GDSC = Genomics of Drug Sensitivity in Cancer, GEO = Gene Expression Omnibus, GEPIA = Gene Expression Profiling Interactive Analysis, GO = Gene Ontology, GSVA = gene set variation analysis, KEGG = Kyoto Encyclopedia of Genes and Genomes, PPI = protein–protein interaction, qRT-PCR = quantitative Real-Time PCR, TMB = Tumor Mutation Burden, WGCNA = Weighted Gene Co-expression Network Analysis.

Figure 2.

Clustering dendrogram of 195 samples from GSE26712.

Figure 3.

(A) The scale-free fit index assessment for various soft-thresholding powers (β). (B) The mean connectivity assessment for various soft-thresholding powers. (C) Dendrogram of the differentially expressed genes grouped according to a dissimilarity measure (1-TOM), along with the assigned merged module colors and the original module colors. (D) Heatmap of the correlations between the modules and OC clinical features; red and blue colors indicate positive and negative correlations, respectively. (E) Scatter plot of GS versus MM in the magenta module. OC = ovarian cancer, GS = gene significance, MM = module membership, TOM = topological overlap matrix.

3.2. WGCNA construction and identification of key modules from GSE54388

We downloaded the GSE54388 dataset from the NCBI GEO public database; it comprised 22 samples, including healthy controls (n = 6) and OC patients (n = 16). The WGCNA network was constructed (Fig. 4), with the soft threshold set to 11 (Fig. 5A,B). Nine gene modules, namely, black (606), blue (1217), brown (3750), dark red (897), dark turquoise (57), grey (199), light green (154), pink (392), and turquoise (2728), were identified through hierarchical clustering (Fig. 5C,D). The blue module exhibited the strongest correlation with the tumor phenotype (cor = −0.89, P = 4e−08) (Fig. 5E). This module was, therefore, chosen for subsequent verification.

Figure 4.

Clustering dendrogram of 22 samples from GSE54388.

Figure 5.

(A) The scale-free fit index assessment for various soft-thresholding powers (β). (B) The mean connectivity assessment for various soft-thresholding powers. (C) Dendrogram of the differentially expressed genes clustered according to a dissimilarity measure (1-TOM), along with the assigned merged module colors and the original module colors. (D) Heatmap of the correlation between the modules and OC clinical features; red and blue colors denote positive and negative correlations, respectively. (E) Scatter plot of GS versus MM in the blue module. OC = ovarian cancer, GS = gene significance, MM = module membership, TOM = topological overlap matrix.

3.3. Enrichment analysis and identification of prognosis-related hub genes

After the WGCNA of the 2 datasets, the 2 key modules were intersected to obtain 214 key genes (Fig. 6A). To investigate the potential physiological functions of these genes, we performed GO and Kyoto Encyclopedia of Genes and Genomes pathway enrichment analyses on all 214 genes. According to the results, the most significantly enriched pathways were “response to wounding”, “sensory organ development”, “heart morphogenesis”, “head development”, and “regulation of endothelial cell proliferation” (Fig. 6B); the relationships between the functional pathways are illustrated in Figure 6C. The PPI network was built and visualized using Cytoscape software (Fig. 7). The top 100 genes were ranked using the cytoHubba plugin, and each was evaluated by survival analysis in TCGA OC database. ALDH1A2, CLDN4, and GPR37 were finally selected as the hub genes associated with prognosis. The expression of ALDH1A2 and CLDN4 was negatively associated with the overall survival of OC patients, whereas GPR37 expression was positively associated with survival (Fig. 8A–C). The prognostic significance of the 3 hub genes in OC patients was also shown using data from the PrognoScan database, and the survival trend was consistent with that in TCGA database (Fig. 8D–F). GEPIA analysis showed that ALDH1A2 levels were significantly lower in OC tissues than in non-cancerous tissues, while the levels of CLDN4 and GPR37 were significantly higher (Fig. 8G–I).
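The screening of key genes can be illustrated with a small Python sketch that applies the GS > 0.5 and MM > 0.8 thresholds given in the Methods to each key module and then intersects the 2 resulting gene sets. Two points are assumptions: that the thresholds are applied to absolute values, and that filtering precedes the intersection. The gene names and all GS and MM values are placeholders, not results from this study.

```python
def key_genes(module_stats, gs_min=0.5, mm_min=0.8):
    """Return the genes whose |GS| and |MM| both exceed the thresholds.

    module_stats maps a gene name to a (GS, MM) pair for one key module.
    """
    return {
        gene
        for gene, (gs, mm) in module_stats.items()
        if abs(gs) > gs_min and abs(mm) > mm_min
    }


# Placeholder statistics for the key module of each dataset.
toy_module_1 = {
    "GENE_A": (-0.72, 0.91),
    "GENE_B": (-0.40, 0.95),  # fails the GS threshold
    "GENE_C": (0.65, 0.88),
    "GENE_D": (0.80, 0.60),   # fails the MM threshold
}
toy_module_2 = {
    "GENE_A": (-0.81, 0.93),
    "GENE_C": (0.70, 0.85),
    "GENE_E": (0.90, 0.90),   # passes, but is absent from the other module
}

shared = key_genes(toy_module_1) & key_genes(toy_module_2)
print(sorted(shared))
```

Requiring a gene to pass in both datasets favors candidates whose association with the tumor phenotype is reproducible across independent cohorts.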

Figure 6.

(A) Venn diagram of the intersection of genes in key modules. (B) GO and KEGG pathway enrichment analyses of key genes. (C) Interaction relationship between the pathways. GO = Gene Ontology, KEGG = Kyoto Encyclopedia of Genes and Genomes.

Figure 7.

Protein–protein interaction (PPI) network of 177 key genes.

Figure 8.

(A–C) Survival analysis of the hub genes in TCGA database. (D–F) Survival analysis of the hub genes in PrognoScan database. (G–I) The hub gene expression levels in GEPIA: (G) ALDH1A2, (H) CLDN4, and (I) GPR37. Red indicates tumor tissues, and grey indicates normal tissues. ALDH1A2 expression was significantly downregulated, while expression of CLDN4 and GPR37 was upregulated in OC samples compared with normal samples (P < .01). The red * represents P < .01. TCGA = The Cancer Genome Atlas.

3.4. Expression validation of hub genes via qRT-PCR

To better characterize the expression levels of the 3 hub genes in normal and OC tissues, 10 normal ovarian samples and 10 OC samples were collected and qRT-PCR was conducted. As shown in Figure 9, we found that compared with normal ovarian samples, the mRNA expression level of ALDH1A2 was significantly decreased in OC samples but the mRNA expression levels of CLDN4 and GPR37 were significantly elevated in OC samples (P < .05).

Figure 9.

The mRNA expression of hub genes in ovarian cancer and normal ovarian tissues was detected via qRT-PCR. (A) ALDH1A2, (B) CLDN4, (C) GPR37. *P < .05, **P < .01, ***P < .001, and ****P < .0001. qRT-PCR = quantitative Real-Time PCR.

3.5. Mutations in the hub genes

The cBioPortal database was used to study mutations in the hub genes in OC patients from TCGA Pan-Cancer dataset. The results showed that hub gene alterations occurred in 25% of cases, with a 9% mutation frequency for each hub gene (Fig. 10A).

Figure 10.

(A) The mutation rate of hub genes: ALDH1A2 (9%), CLDN4 (9%), GPR37 (9%). GSVA enrichment pathway diagram of the hub genes: (B) ALDH1A2, (C) CLDN4, (D) GPR37. The upper right part of each figure shows the pathways enriched in the high-expression group, and the lower left part of each figure shows the pathways enriched in the low-expression group; grey represents the pathways that are not significantly related to gene expression. (E) The interaction network of hub genes generated using GeneMANIA. GeneMANIA = Gene Multiple Association Network Integration Algorithm.

3.6. Identification of the pathways correlated significantly with the hub genes

The GSVA results indicated that ALDH1A2 expression had positive correlations with “epithelial–mesenchymal transition”, “apical junction”, “IL6-JAK-STAT3 signalling”, and “KRAS signalling”, and negative correlations with “DNA repair” and “oxidative phosphorylation” (Fig. 10B). CLDN4 expression was positively correlated with “cholesterol homeostasis”, “coagulation”, and “reactive oxygen species pathway” but was negatively correlated with “E2F targets”, “G2M checkpoint”, “angiogenesis”, “MYC targets”, and “epithelial–mesenchymal transition” (Fig. 10C). GPR37 expression was positively correlated with “peroxisome”, “bile acid metabolism”, and “P53 pathway” and negatively correlated with “hedgehog signalling”, “reactive oxygen species pathway”, “coagulation”, and “hypoxia” (Fig. 10D). The interaction relationships among the 3 hub genes are shown in Figure 10E.

3.7. Investigation of the relationships between hub genes and infiltrating immune cells

The occurrence and clinical prognosis of tumors are believed to be closely associated with the distribution and infiltration of immune cells in the tumor microenvironment.[9] This study used CIBERSORT (http://cibersort.stanford.edu/), which estimates the presence of different immune cell types in complex tissues from gene expression profiles, to analyze the associations between the hub genes and immune infiltration in TCGA. The results indicated that ALDH1A2 expression was significantly and positively correlated with naive B cells and negatively correlated with follicular helper T cells, activated natural killer (NK) cells, and memory B cells; CLDN4 expression was significantly and positively correlated with activated NK cells, activated dendritic cells, and memory B cells and negatively correlated with naive B cells; GPR37 expression exhibited a significant positive correlation with M1 macrophages (Fig. 11).

Figure 11.

Correlation between the hub genes and infiltrating immune cells. (A) ALDH1A2; (B) CLDN4; (C) GPR37. Correlation strength between genes and immune cells is depicted by the dot size, with large dots denoting a strong correlation. The color of the dots represents the P value; the greener the color, the lower is the P value.

3.8. Drug sensitivity analysis of the hub genes in OC

The associations between the hub genes and commonly used chemotherapeutic drugs were explored using chemosensitivity data in the GDSC database. The results showed that OC patients showing low ALDH1A2 expression had high IC50 values for dasatinib (Fig. 12). Patients with high CLDN4 expression had a high IC50 value for cisplatin but low IC50 values for dasatinib, erlotinib, and gefitinib (Fig. 13). The IC50 value for gefitinib was decreased in patients with high expression of GPR37 (Fig. 14).

Figure 12.

Relationship between the IC50 value of chemotherapeutic drugs and ALDH1A2 expression.

Figure 13.

Relationship between the IC50 value of chemotherapeutic drugs and CLDN4 expression.

Figure 14.

Relationship between the IC50 value of chemotherapeutic drugs and GPR37 expression.

3.9. Correlations of the hub genes with TMB and neoantigens

Our results showed that patients with high expression of ALDH1A2 and CLDN4 had lower TMB than those with low levels of these genes (Fig. 15A–C). Moreover, the patients with low ALDH1A2 expression could produce a larger number of neoantigens than those with high ALDH1A2 expression (Fig. 15D–F).

Figure 15.

(A–C) Relationship between the TMB level and hub gene expression. (D–F) Relationship between neoantigens and hub gene expression. TMB = Tumor Mutation Burden.

4. Discussion

Aldehyde dehydrogenase 1 family member A2 (ALDH1A2), a rate-limiting enzyme, catalyzes the synthesis of retinoic acid in the cell.[10] This enzyme mainly metabolizes octanal and decanal but does not efficiently metabolize citral, benzaldehyde, acetaldehyde, and propanal.[11] The results of GEPIA and qRT-PCR showed that ALDH1A2 expression was downregulated in OC. A previous study found the same result in different OC cell lines and reported that ALDH1A2 acts as a tumor suppressor gene that can attenuate the proliferation and invasion of OC.[12] Our analysis of TCGA data suggests that OC patients with low levels of ALDH1A2 have a high survival rate; however, Choi et al reported a possible association of low ALDH1A2 expression with an unfavorable prognosis and shorter disease-free and overall survival in OC patients.[12] The present study is limited by a small sample size, and further studies with larger OC cohorts are warranted to verify the results.

CLDN4, the gene that encodes the tight junction (TJ) protein claudin-4, has been found to be overexpressed in human epithelial ovarian carcinomas, and its overexpression stimulates the migration and invasion of ovarian epithelial cells.[13] CLDN4 may have a role in the aggressive behavior and chemoresistance of OC.[14,15] Furthermore, claudin-4 could be a promising target for the development of a therapeutic antibody, as it is a transmembrane protein with 2 extracellular loops.[16] Claudin-4 is a receptor for the Clostridium perfringens enterotoxin (CPE), which has been used for targeted OC therapy.[17] CPE cytotoxicity was mainly dependent upon claudin-4 expression.[18] Our experimental results also confirmed that CLDN4 is overexpressed in OC. A study on the relationship between claudin-4 expression and prognosis in 500 patients with OC showed no association between claudin-4 expression and disease-specific survival or relapse-free survival.[19] Another study of 42 high-grade serous OC cases also found no association between claudin-4 expression and survival.[20] However, in clear cell renal cell carcinoma and breast cancer, claudin-4 overexpression was associated with poor outcome.[21,22] Our bioinformatic analysis showed that low CLDN4 expression was associated with a high survival rate in OC patients, which is not completely consistent with previous research results.

G protein-coupled receptor 37 (GPR37), a member of the GPCR family, is also known as parkin-associated endothelin receptor-like receptor,[23] and its expression varies among cancers. GPR37 expression has been reported to be reduced in hepatocellular carcinoma, where low expression was associated with poor prognosis.[24] In another study, however, GPR37 was highly expressed in lung adenocarcinoma, and its upregulation was associated with poor outcomes.[25] The relationship between GPR37 expression and OC prognosis has not previously been documented. Our experimental results demonstrated that GPR37 expression was upregulated in OC samples, and our analysis of TCGA data revealed that OC patients with high levels of GPR37 had a high survival rate. Based on the GSVA, we speculate that GPR37 affects the prognosis of OC patients by upregulating the P53 pathway or downregulating the hedgehog signaling pathway; however, the investigation of its precise function warrants further experiments in vitro and in vivo.

Studies have shown that the components of the tumor microenvironment are dynamic and that many biological behaviors of tumors, such as invasion and metastasis, are related to the tumor microenvironment.[26] The CIBERSORT analysis indicated that the 3 hub genes were strongly correlated with tumor-associated immune cells. We speculate that increased ALDH1A2 expression may accelerate OC progression by increasing the content of naive B cells or reducing the contents of follicular helper T cells, activated NK cells, and memory B cells. CLDN4 may enhance tumor growth, invasion, and metastasis by increasing the contents of activated NK cells, activated dendritic cells, and memory B cells or by reducing the content of naive B cells. GPR37 may inhibit OC development by increasing the content of M1 macrophages.

The drug sensitivity results indicated that patients with high ALDH1A2 expression are more sensitive to dasatinib than those with low ALDH1A2 expression. Resistance to dasatinib in patients with OC may be due to decreased expression of ALDH1A2. Patients with high levels of CLDN4 were more sensitive to dasatinib, erlotinib, and gefitinib, whereas those with the low CLDN4 expression may develop resistance to dasatinib, erlotinib, and gefitinib. High expression of CLDN4 may also contribute to cisplatin resistance in OC patients. Patients with high GPR37 expression were more sensitive to gefitinib than those with low GPR37 expression. Resistance to gefitinib in OC patients may be due to decreased GPR37 expression. The 3 hub genes selected in this study could thus be novel gene targets for studying the molecular mechanisms of sensitivity or resistance to chemotherapeutic drugs.

The use of immune-checkpoint inhibitors (ICIs) is a recently developed therapeutic strategy for various cancers. However, patient response to anti-programmed cell death protein-1 (PD-1) or anti-programmed cell death protein-ligand 1 (PD-L1) therapies varies greatly.[27] TMB is considered a useful independent biomarker for predicting patient response to inhibitors of the PD-1/PD-L1 pathway.[28] TMB was initially identified as a biomarker for ICI treatment in melanoma.[29] High TMB has also been associated with favorable outcomes in patients with non-small cell lung cancer,[30,31] bladder cancer,[32] and microsatellite instability cancers[33,34] treated with PD-1/PD-L1 blockade. An analysis of clinical data from TCGA showed that any alteration in the DNA damage response (DDR) is significantly associated with high TMB and that patients with high TMB exhibit a favorable survival outcome.[35] Our results showed that the expression of both ALDH1A2 and CLDN4 was significantly correlated with TMB and that low ALDH1A2 expression could produce a larger number of neoantigens. TMB was high in patients with low ALDH1A2 and CLDN4 expression, indicating that ALDH1A2 and CLDN4 expression may be related to the anti-tumor immune response. Furthermore, the survival analysis showed that low levels of both genes were associated with better outcomes. Thus, we speculate that the high TMB caused by alterations in ALDH1A2 and CLDN4 may promote an anti-tumor immune response, thereby improving the prognosis of OC patients. In addition, ALDH1A2 could be used as a potential indicator for the identification of OC patients who could benefit from ICI treatment.
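TMB is usually expressed as the number of somatic mutations per megabase of sequenced coding region. The sketch below shows that calculation for a few samples. The 38 Mb exome size is a commonly used approximation and is an assumption here, as are the function name, the sample identifiers, and the mutation counts; none of them comes from the study.

```python
from typing import Dict

# Assumed size of the covered coding region in megabases (an approximation,
# not a value reported in the study)
ASSUMED_EXOME_MB = 38.0


def tumor_mutation_burden(n_mutations: int, covered_mb: float = ASSUMED_EXOME_MB) -> float:
    """Return mutations per megabase for one tumor sample."""
    if covered_mb <= 0:
        raise ValueError("covered_mb must be positive")
    if n_mutations < 0:
        raise ValueError("n_mutations must be non-negative")
    return n_mutations / covered_mb


# Hypothetical somatic mutation counts per sample (NOT from the study)
mutation_counts: Dict[str, int] = {"S1": 76, "S2": 190, "S3": 38, "S4": 114}

tmb = {sample: tumor_mutation_burden(n) for sample, n in mutation_counts.items()}
for sample, value in tmb.items():
    print(sample, value)
```

Per-sample values computed this way could then be compared between high- and low-expression groups of a gene such as ALDH1A2.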

5. Conclusion

The present study identified 3 hub genes (ALDH1A2, CLDN4, and GPR37) as prognostic biomarkers and as effective indicators that may be valuable in targeted OC treatment. However, our research relied on secondary mining and analysis of previously released datasets, and we verified the expression of the screened hub genes only at the mRNA level, which carries certain limitations. Thus, more empirical evidence is needed to explore the function of the genes identified in this study.

Author contributions

ZY: conception and design, provision of study materials or patients, collection and assembly of data, data analysis and interpretation. LO: administrative support. All authors read and approved the final manuscript.

Conceptualization: Zhong Yu.

Data curation: Zhong Yu.

Project administration: Ling Ouyang.

Writing – original draft: Zhong Yu.

Writing – review & editing: Zhong Yu.

Abbreviations:

ALDH1A2 = aldehyde dehydrogenase 1 family member A2
CLDN4 = claudin-4
CPE = Clostridium perfringens enterotoxin
DDR = DNA damage response
GDSC = Genomics of Drug Sensitivity in Cancer
GeneMANIA = Gene Multiple Association Network Integration Algorithm
GEO = Gene Expression Omnibus
GEPIA = Gene Expression Profiling Interactive Analysis
GO = Gene Ontology
KEGG = Kyoto Encyclopedia of Genes and Genomes
GPR37 = G protein-coupled receptor 37
GSVA = gene set variation analysis
ICIs = immune-checkpoint inhibitors
OC = ovarian cancer
PD-1 = programmed cell death protein-1
PD-L1 = programmed cell death protein-ligand 1
PPI = protein-protein interaction
qRT-PCR = quantitative Real-Time PCR
TCGA = The Cancer Genome Atlas
TMB = tumor mutation burden
TOM = topological overlap matrix
WGCNA = Weighted Gene Co-expression Network Analysis

The study was approved by the Ethics Committee of the Shengjing Hospital of China Medical University (2021PS823K). All included patients gave their written informed consent.

The authors have no conflicts of interest to disclose.

All data generated or analyzed during this study are included in this published article [and its supplementary information files].

How to cite this article: Yu Z, Ouyang L. Identification of prognosis-related hub genes of ovarian cancer through bioinformatics analyses and experimental verification. Medicine 2022;101:36(e30374).

References

  • [1] Hennessy BT, Coleman RL, Markman M. Ovarian cancer. Lancet. 2009;374:1371–82.
  • [2] Zeppernick F, Meinhold-Heerlein I. The new FIGO staging system for ovarian, fallopian tube, and primary peritoneal cancer. Arch Gynecol Obstet. 2014;290:839–42.
  • [3] Schiavone MB, Herzog TJ, Lewin SN, et al. Natural history and outcome of mucinous carcinoma of the ovary. Am J Obstet Gynecol. 2011;205:480 e1–8.
  • [4] Van Haaften-Day C, Shen Y, Xu F, et al. OVX1, macrophage-colony stimulating factor, and CA-125-II as tumor markers for epithelial ovarian carcinoma: a critical appraisal. Cancer. 2001;92:2837–44.
  • [5] Langfelder P, Horvath S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinf. 2008;9:559.
  • [6] Ravasz E, Somera AL, Mongru DA, et al. Hierarchical organization of modularity in metabolic networks. Science. 2002;297:1551–5.
  • [7] Schmittgen TD, Livak KJ. Analyzing real-time PCR data by the comparative CT method. Nat Protocols. 2008;3:1101–8.
  • [8] Yarchoan M, Johnson BA, Lutz ER, et al. Targeting neoantigens to augment antitumour immunity. Nat Rev Cancer. 2017;17:209–22.
  • [9] Hou P, Kapoor A, Zhang Q, et al. Tumor microenvironment remodeling enables bypass of oncogenic KRAS dependency in pancreatic cancer. Cancer Discov. 2020;10:1058–77.
  • [10] Vasiliou V, Nebert DW. Analysis and update of the human aldehyde dehydrogenase (ALDH) gene family. Hum Genomics. 2005;2:138–43.
  • [11] Kasimanickam VR. Expression of retinoic acid-metabolizing enzymes, ALDH1A1, ALDH1A2, ALDH1A3, CYP26A1, CYP26B1 and CYP26C1 in canine testis during post-natal development. Reprod Domest Anim. 2016;51:901–9.
  • [12] Choi JA, Kwon H, Cho H, et al. ALDH1A2 is a candidate tumor suppressor gene in ovarian cancer. Cancers. 2019;11:1553.
  • [13] Hough CD, Sherman-Baust CA, Pizer ES, et al. Large-scale serial analysis of gene expression reveals genes differentially expressed in ovarian cancer. Cancer Res. 2000;60:6281–7.
  • [14] Agarwal R, D’Souza T, Morin PJ. Claudin-3 and claudin-4 expression in ovarian epithelial cells enhances invasion and is associated with increased matrix metalloproteinase-2 activity. Cancer Res. 2005;65:7378–85.
  • [15] Stewart JJ, White JT, Yan X, et al. Proteins associated with Cisplatin resistance in ovarian cancer cells identified by quantitative proteomic technology and integrated with mRNA expression levels. Mol Cell Proteomics. 2006;5:433–43.
  • [16] Morin PJ. Claudin proteins in human cancer: promising new targets for diagnosis and therapy. Cancer Res. 2005;65:9603–6.
  • [17] Litkouhi B, Kwong J, Lo CM, et al. Claudin-4 overexpression in epithelial ovarian cancer is associated with hypomethylation and is a potential target for modulation of tight junction barrier function using a C-terminal fragment of Clostridium perfringens enterotoxin. Neoplasia. 2007;9:304–14.
  • [18] Tanaka S, Aoyama T, Ogawa M, et al. Cytotoxicity of clostridium perfringens enterotoxin depends on the conditions of claudin-4 in ovarian carcinoma cells. Exp Cell Res. 2018;371:278–86.
  • [19] Boylan KL, Misemer B, DeRycke MS, et al. Claudin 4 is differentially expressed between ovarian cancer subtypes and plays a role in spheroid formation. Int J Mol Sci. 2011;12:1334–58.
  • [20] Litkouhi B, Kwong J, Lo CM, et al. Claudin-4 overexpression in epithelial ovarian cancer is associated with hypomethylation and is a potential target for modulation of tight junction barrier function using a C-terminal fragment of Clostridium perfringens enterotoxin. Neoplasia. 2007;9:304–14.
  • [21] Lanigan F, McKiernan E, Brennan DJ, et al. Increased claudin-4 expression is associated with poor prognosis and high tumour grade in breast cancer. Int J Cancer. 2009;124:2088–97.
  • [22] Lechpammer M, Resnick MB, Sabo E, et al. The diagnostic and prognostic utility of claudin expression in renal cell neoplasms. Mod Pathol. 2008;21:1320–9.
  • [23] Marazziti D, Gallo A, Golini E, et al. Molecular cloning and chromosomal localization of the mouse Gpr37 gene encoding an orphan G-protein-coupled peptide receptor expressed in brain and testis. Genomics. 1998;53:315–24.
  • [24] Liu F, Zhu C, Huang X, et al. A low level of GPR37 is associated with human hepatocellular carcinoma progression and poor patient survival. Pathol Res Pract. 2014;210:885–92.
  • [25] Wang J, Xu M, Li DD, et al. GPR37 promotes the malignancy of lung adenocarcinoma via TGF-beta/Smad pathway. Open Med (Wars). 2021;16:24–32.
  • [26] Binnewies M, Roberts EW, Kersten K, et al. Understanding the tumor immune microenvironment (TIME) for effective therapy. Nat Med. 2018;24:541–50.
  • [27] Topalian SL, Drake CG, Pardoll DM. Immune checkpoint blockade: a common denominator approach to cancer therapy. Cancer Cell. 2015;27:450–61.
  • [28] Chan TA, Yarchoan M, Jaffee E, et al. Development of tumor mutation burden as an immunotherapy biomarker: utility for the oncology clinic. Ann Oncol. 2019;30:44–56.
  • [29] Johnson DB, Pollack MH, Sosman JA. Emerging targeted therapies for melanoma. Expert Opin Emerg Drugs. 2016;21:195–207.
  • [30] Carbone DP, Reck M, Paz-Ares L, et al. First-line nivolumab in stage IV or recurrent non-small-cell lung cancer. N Engl J Med. 2017;376:2415–26.
  • [31] Rizvi H, Sanchez-Vega F, La K, et al. Molecular determinants of response to anti-programmed cell death (PD)-1 and anti-programmed death-ligand 1 (PD-L1) blockade in patients with non-small-cell lung cancer profiled with targeted next-generation sequencing. J Clin Oncol. 2018;36:633–41.
  • [32] Rosenberg JE, Hoffman-Censits J, Powles T, et al. Atezolizumab in patients with locally advanced and metastatic urothelial carcinoma who have progressed following treatment with platinum-based chemotherapy: a single-arm, multicentre, phase 2 trial. Lancet. 2016;387:1909–20.
  • [33] Le DT, Uram JN, Wang H, et al. PD-1 Blockade in Tumors with Mismatch-Repair Deficiency. N Engl J Med. 2015;372:2509–20.
  • [34] Le DT, Durham JN, Smith KN, et al. Mismatch repair deficiency predicts response of solid tumors to PD-1 blockade. Science. 2017;357:409–13.
  • [35] Tian W, Shan B, Zhang Y, et al. Association of high tumor mutation (TMB) with DNA damage repair (DDR) alterations and better prognosis in ovarian cancer. J Clin Oncol. 2018;36(15_suppl):5512.

Articles from Medicine are provided here courtesy of Wolters Kluwer Health
