Heliyon. 2023 Apr 7;9(4):e15096. doi: 10.1016/j.heliyon.2023.e15096

Identifying TME signatures for cervical cancer prognosis based on GEO and TCGA databases

Wen-Tao Xia a, Wang-Ren Qiu a, Wang-Ke Yu a, Zhao-Chun Xu a, Shou-Hua Zhang b
PMCID: PMC10121839  PMID: 37095983

Abstract

The mortality rate from cervical cancer (CESC), a malignant tumor that affects women, has increased significantly globally in recent years. With the advancement of bioinformatics technology, the discovery of biomarkers offers a new direction for the diagnosis of cervical cancer.

The goal of this study was to search the GEO and TCGA databases for potential biomarkers for the diagnosis and prognosis of CESC. Because omics data are high-dimensional with small sample sizes, and because biomarkers are often derived from a single type of omics data, a diagnosis of cervical cancer based on them may be inaccurate and unreliable. We began by downloading CESC DNA methylation data (GSE30760) from GEO, performed differential analysis on the methylation data, and screened out the differential genes. Then, using the ESTIMATE algorithm, we scored immune cells and stromal cells in the tumor microenvironment and performed survival analysis on the gene expression profile data and the most recent clinical data of CESC from TCGA. Next, we used the ‘limma’ package and Venn plots in R to perform differential analysis of genes and screen out overlapping genes, which were then subjected to GO and KEGG functional enrichment analysis. The differential genes screened from the GEO methylation data and those screened from the TCGA gene expression data were intersected to obtain the common differential genes. A protein-protein interaction (PPI) network of the gene expression data was then constructed to discover key genes. The key genes of the PPI network were intersected with the previously identified common differential genes for further validation, and the Kaplan-Meier curve was used to determine the prognostic importance of the key genes. Survival analysis showed that CD3E and CD80 are important for the identification of cervical cancer and can be considered potential biomarkers for cervical cancer.

Keywords: DNA methylation, Differentially expressed genes, Tumor microenvironment, Cervical cancer, Biomarkers

1. Introduction

Although cervical cancer (CESC) is a common female malignancy and is regarded as the only human tumor with a known cause, there has been no significant decline in its incidence and mortality [1]. Worldwide, there were roughly 570,000 new cases of cervical cancer in 2018, making up 3.15% of all malignant tumors, and about 310,000 deaths, making up 3.26% of all malignant tumor deaths. Thus, the global cervical cancer disease burden is severe [2,3]. According to research data, cervical cancer is the 4th most prevalent malignancy among women worldwide, after breast, colorectal and lung cancers, and is ranked 8th among all malignancies in both sexes combined [4]. In China, cervical cancer is the sixth most prevalent malignant tumor in women, accounting for 2.83% of all malignant tumors in 2015, with over 34,000 deaths, accounting for 1.45% of all malignant tumor deaths [5–7].

Cervical cancer (CESC) can be characterized in terms of classification, staging, surgical treatment and prognosis. Because its onset is occult and its clinical signs are non-specific, CESC is often diagnosed late and has a poor prognosis, which makes its treatment an important clinical issue [8]. The tumor microenvironment (TME) is the surrounding environment in which tumor cells exist, and includes blood vessels, immune cells, fibroblasts, bone marrow-derived inflammatory cells, stromal cells, and various signaling molecules. Immune cells and stromal cells, for example, are critical to carcinogenesis and tumor development. According to previous research, the TME plays an important role in the occurrence, development, prognosis, and treatment of cancers [9,10]. It is therefore crucial to understand how the components of the TME interact with one another when studying malignancies. Recent research indicates that the interaction between stromal cells and tumor cells enhances tumor development, invasion, and spread, and that the growth factors, cytokines, and chemokines released by stromal cells considerably influence the features of tumors [11]. Elucidating the characteristics of the TME in CESC will therefore aid our understanding of the pathological mechanisms underlying CESC and provide cues for future treatment.

With the advancement of bioinformatics technology, researchers can quickly collect high-throughput omics data from numerous databases in order to study cancer alterations and mine relevant biomarkers. Biomarkers are biochemical indicators that mark changes or possible changes in the structure or function of systems, organs, tissues, cells, and sub-cellular components. They have a very wide range of uses, such as disease diagnosis [12] and determining the occurrence, development, and prognosis of tumors [13–15], and thus provide new directions for cancer diagnosis and prognosis. Many researchers are currently studying the pathogenesis of various cancers using omics data [16–18] and using a single type of omics data to find relevant biomarkers. However, searching for biomarkers solely on the basis of single-omics data is insufficient, whereas multi-omics data can provide a more comprehensive analysis of the entire genomic profile. On the other hand, combining different omics data requires a deep understanding of them in order to make reliable and valid connections. Nevertheless, there are few studies using multi-omics data for CESC [17,19–21].

In this study, GEO DNA methylation data, TCGA gene expression data, and the most recent clinical data were used to mine biomarkers relevant to CESC [22,23]. First, we screened two sets of differentially expressed genes (DEGs), one from differential analysis of the methylation data and the other from differential analysis of the gene expression data. Second, the ESTIMATE algorithm was applied to the gene expression data to calculate stromal and immune scores in cervical cancer tumor tissue, to estimate tumor purity, and to assess the association between stromal/immune cells and clinical features. Third, the differential genes filtered from the GEO methylation data and the TCGA gene expression data were intersected to screen out the common differential genes. Fourth, the overlapping DEGs in the TCGA dataset were subjected to GO and KEGG functional enrichment analysis [24,25], which revealed enrichment of immune-related signaling pathways. Because tumor malignancy is directly correlated with poor prognosis, the top 10 pivotal genes were then identified from the PPI network. Finally, the ten key genes were intersected with the common differential genes to screen out the key genes, and the Kaplan-Meier curve was used to validate their prognostic value. Survival analysis showed that CD3E and CD80 are important in identifying cervical cancer and are potential biomarkers of cervical cancer.

2. Materials and methods

Our research in this section focuses on mining CESC-related biomarkers. Fig. 1 depicts the process's three major components: data pre-processing, data analysis, and data validation.

Fig. 1. Workflow for mining cervical cancer TME signatures.

2.1. Ethical guidelines and informed consent

Because the data included in the study were obtained from public databases (GEO and TCGA), and no patients were recruited and no personal information was collected, no ethical approval or patient consent was required.

2.2. Data collection

We obtained the CESC microarray dataset for this investigation from the Gene Expression Omnibus (GEO) database [26]. When we compared the microarray dataset with the next generation sequencing (NGS) datasets, we found that the sample size of the CESC datasets in GEO generated with NGS methods is relatively small. It would be difficult to reach a convincing conclusion from such high-dimensional, small-sample data [16]. We therefore analyzed CESC using clinical data together with two types of omics data, namely DNA methylation data from GEO and gene expression data from TCGA.

The TCGA data portal (https://portal.gdc.cancer.gov/) was used to obtain RNA gene expression data and corresponding clinical information for CESC patients. The inclusion criteria for CESC samples were: (1) RNA gene expression data available for CESC; (2) comprehensive clinical details, including age, stage, gender, and overall survival, available for CESC patients [18,27]. In total, 306 CESC patients were included in the study.

The GEO data portal (https://www.ncbi.nlm.nih.gov/geo/) was used to obtain DNA methylation data for CESC patients. The inclusion criterion for CESC samples was the availability of DNA methylation data for CESC. In total, 215 CESC patients were included in the study.

2.3. Stromal and immune scoring

Malignant tumor tissue consists of tumor cells as well as tumor-associated stromal, immune, vascular, and normal epithelial cells. Stromal cells are believed to be crucial for tumor development, disease progression, and drug resistance [28]. Immune cells are a key component of the normal cells in tumor tissues [29]; they not only interfere with tumor signaling in molecular studies but also play an important role in tumor biology.

ESTIMATE (Estimation of STromal and Immune cells in MAlignant Tumor tissues using Expression data) is a method for inferring the proportion of stromal and immune cells in tumor samples from gene expression signatures. The algorithm analyzes the presence of stromal and immune cells in malignant tissue, predicts the stromal and immune scores (and hence the stromal and immune cell content), and calculates the tumor purity of each tumor sample. If the stromal and immune cell content is high, tumor purity is low; conversely, if the content is low, tumor purity is high. The ESTIMATE algorithm has been applied to gastric adenocarcinoma, head and neck squamous carcinoma, and other tumors [18,30].

To estimate the tumor purity of each CESC sample, stromal scores and immune scores for the TCGA gene expression data were calculated using the ESTIMATE algorithm (https://bioinformatics.mdanderson.org/estimate/). The correlations between stromal/immune scores and clinical characteristics were then examined for the TCGA clinical data. Finally, the prognostic value of the stromal and immune scores was assessed using the Kaplan-Meier method [31]. The Kaplan-Meier method was proposed by Kaplan and Meier in 1958 and uses the multiplication rule of probability to calculate survival rates [32], so it is also known as the product-limit method.
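The product-limit idea can be illustrated with a short sketch. The Python code below is not the authors' R workflow; the function name `kaplan_meier` and the toy follow-up data are assumptions made purely for illustration. At each distinct time with at least one observed death, the running survival probability is multiplied by (1 − deaths / number at risk), and censored patients simply leave the risk set.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one event.

    times  : follow-up durations (any consistent unit)
    events : 1 if death was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather every subject whose follow-up ends exactly at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Product-limit step: conditional probability of surviving past t.
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored subjects both leave the risk set after t.
        n_at_risk -= removed
    return curve


# Toy data (hypothetical): five patients, two of them censored.
toy_times = [2, 3, 3, 5, 8]
toy_events = [1, 1, 0, 1, 0]
for t, s in kaplan_meier(toy_times, toy_events):
    print(f"t={t}: S(t)={s:.3f}")
```

In practice a dedicated survival package would be used instead, since it also provides confidence intervals and plotting.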

2.4. Heatmap cluster analysis and identification of differentially expressed genes (DEGs)

Heatmap cluster analysis plots the expression of genes across samples as a heatmap, with each column representing a sample, each row representing a gene, and the color shade indicating the expression level of that gene in that sample [33,34].

The “limma” package in R was used to identify DEGs [35]. The thresholds for identifying DEGs in the RNA gene expression data were |logFC|>2, p-value<0.05, and FDR<0.01. The same method was used to identify DEGs in the DNA methylation data, with the thresholds |logFC|>1, p-value<0.05, and FDR<0.01. Across samples, genes were differentially expressed, hypermethylated, or hypomethylated, and such genes may be connected to CESC. It is therefore plausible that genes appearing in both DEG sets are connected to CESC, so we took the intersection of the differential genes from the methylation data and the differential genes from the gene expression data.
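The thresholding and intersection steps can be sketched as follows. This Python fragment is only an illustration and does not reproduce limma itself; the function name `select_degs`, the dictionary keys, and the gene labels `GENE_A` to `GENE_D` are hypothetical placeholders, and the input is assumed to be a table of per-gene statistics already produced by a differential analysis.

```python
def select_degs(rows, min_abs_logfc, max_p=0.05, max_fdr=0.01):
    """Return the set of gene names passing all three thresholds.

    rows: iterable of dicts with keys 'gene', 'logFC', 'p_value', 'fdr'
          (an assumed layout for a differential-analysis result table).
    """
    return {
        r["gene"]
        for r in rows
        if abs(r["logFC"]) > min_abs_logfc
        and r["p_value"] < max_p
        and r["fdr"] < max_fdr
    }


# Hypothetical result tables; the numbers are made up for illustration.
expression_rows = [
    {"gene": "GENE_A", "logFC": 2.6, "p_value": 0.001, "fdr": 0.004},
    {"gene": "GENE_B", "logFC": -3.1, "p_value": 0.0002, "fdr": 0.001},
    {"gene": "GENE_C", "logFC": 1.4, "p_value": 0.01, "fdr": 0.03},
]
methylation_rows = [
    {"gene": "GENE_A", "logFC": -1.3, "p_value": 0.003, "fdr": 0.008},
    {"gene": "GENE_C", "logFC": 1.8, "p_value": 0.0001, "fdr": 0.0005},
    {"gene": "GENE_D", "logFC": 0.4, "p_value": 0.2, "fdr": 0.5},
]

# |logFC| > 2 for expression data, |logFC| > 1 for methylation data.
expression_degs = select_degs(expression_rows, min_abs_logfc=2)
methylation_degs = select_degs(methylation_rows, min_abs_logfc=1)

# Genes that are differential in both data types.
common_degs = expression_degs & methylation_degs
print(sorted(common_degs))
```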

2.5. Functional enrichment analysis using GO and KEGG

We applied GO and KEGG enrichment analysis to the differential genes in the gene expression data to learn more about the underlying biological processes, cellular components, molecular functions, and enriched signaling pathways of the DEGs [36]. Gene function enrichment analysis is a statistical analysis, carried out with the aid of various databases and analytical tools, that identifies the functional categories of genes that are significantly relevant to the biological question being addressed. The statistical principle is to test, using the hypergeometric distribution, whether a given functional class is significantly over-represented within a set of genes (co-expressed or differentially expressed). Through this significance analysis, enrichment analysis, and false-positive control, it yields the functional classes that are significantly associated with the experimental purpose and have low false positive rates [37].
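As a worked illustration of the hypergeometric test, the sketch below computes the one-sided enrichment p-value P(X ≥ k), where N is the number of background genes, K the number of background genes annotated to a functional class, n the size of the DEG set, and k the number of DEGs in that class. The function name and the toy numbers are assumptions for illustration; the enrichment tools used in the study also apply multiple-testing correction, which is not shown here.

```python
from math import comb


def hypergeom_enrichment_p(N, K, n, k):
    """One-sided p-value P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: background genes, K: background genes in the functional class,
    n: genes in the DEG set, k: DEGs that fall in the functional class.
    """
    upper = min(K, n)
    total = comb(N, n)
    # Sum the probability of observing k, k+1, ..., upper annotated genes.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Toy numbers (hypothetical): 10 background genes, 4 in the class,
# 3 DEGs of which 2 are in the class.
p = hypergeom_enrichment_p(N=10, K=4, n=3, k=2)
print(f"p = {p:.4f}")
```

With these toy numbers the tail is (6·6 + 4·1)/120 = 1/3, so the class would not be called enriched.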

2.6. PPI network analysis and key gene screening

In the literature on transcriptional regulation [38,39], protein interaction networks are often used for in-depth analysis. Typically, such work first identifies a series of differentially expressed genes or proteins between grouped samples by RNA-seq, expression profiling microarrays, or proteomic analysis [39,40]. The STRING database (https://string-db.org/) is then searched for potential interactions between the encoded proteins, and a protein interaction network is constructed to describe the relationships that exist between these genes or proteins (e.g. physical contact or targeted regulation), ultimately yielding a meaningful molecular regulatory network in the organism [41].

The identified DEGs were entered into the STRING database (https://string-db.org/) to obtain the interactions between the products of the overlapping DEGs, and a PPI network was constructed and visualized using Cytoscape software [42]. Finally, we used the ‘cytoHubba’ plug-in to calculate the node scores of genes in the PPI network, and the top ten key genes were chosen.
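The idea of scoring nodes and keeping the highest-ranked ones can be sketched as below. The text does not state which cytoHubba scoring method was used, so this Python example assumes simple degree (number of distinct interaction partners) as the node score; the function name `top_hubs_by_degree` and the protein labels `P1` to `P4` are hypothetical.

```python
from collections import Counter


def top_hubs_by_degree(edges, top_n=10):
    """Rank nodes of an undirected network by degree.

    edges: iterable of (node_a, node_b) pairs; duplicates and
           self-loops are ignored.
    Returns a list of (node, degree), highest degree first; ties are
    broken alphabetically so the result is deterministic.
    """
    degree = Counter()
    seen = set()
    for a, b in edges:
        if a == b:
            continue  # skip self-interactions
        key = frozenset((a, b))
        if key in seen:
            continue  # skip the same undirected edge listed twice
        seen.add(key)
        degree[a] += 1
        degree[b] += 1
    ranked = sorted(degree.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]


# Toy interaction list (hypothetical placeholders, not STRING output).
toy_edges = [
    ("P1", "P2"),
    ("P1", "P3"),
    ("P1", "P4"),
    ("P2", "P3"),
    ("P3", "P2"),  # duplicate of the previous edge, ignored
]
print(top_hubs_by_degree(toy_edges, top_n=2))
```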

2.7. Survival analysis and TIMER database validation

The Kaplan-Meier (product-limit) method [43] described in Section 2.3 was used to calculate survival rates. DEGs linked with poor prognosis in CESC were identified using Kaplan-Meier survival analysis with a log-rank test [32,44]. The log-rank test was also used for the univariate analysis comparing clinical features and stromal/immune scores. P < 0.05 was considered statistically significant, and R software was used to plot Venn diagrams, heatmaps, and survival curves.
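A two-group log-rank test can be written compactly, as in the following sketch. It is not the authors' R code: the function name `logrank_test` and the toy survival times are assumptions for illustration. At every distinct event time, the observed deaths in group A are compared with the number expected if both groups shared the same hazard, and the resulting statistic is referred to a chi-square distribution with one degree of freedom.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test. Returns (chi-square statistic, p-value).

    times_*  : follow-up durations for each group
    events_* : 1 if death was observed, 0 if censored
    """
    times_a, times_b = list(times_a), list(times_b)
    events_a, events_b = list(events_a), list(events_b)
    all_pairs = list(zip(times_a + times_b, events_a + events_b))
    event_times = sorted({t for t, e in all_pairs if e == 1})

    observed_a = 0.0
    expected_a = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)  # at risk in group A
        n_b = sum(1 for x in times_b if x >= t)  # at risk in group B
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e == 1)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            # Hypergeometric variance of the deaths in group A at time t.
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Toy data (hypothetical): a short-survival and a long-survival group.
chi2, p = logrank_test([1, 2, 3, 4], [1, 1, 1, 1],
                       [10, 11, 12, 13], [1, 1, 1, 1])
print(f"chi2 = {chi2:.3f}, p = {p:.4f}")
```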

The TIMER database (http://timer.comp-genomics.org/) is a web resource that uses high-throughput sequencing data to analyze the infiltration of immune cells in tumor tissues [45–47]. It mainly reports infiltration levels for six immune cell types, including B cells, CD4+ T cells, and CD8+ T cells. After validation by survival analysis, the screened genes were imported into the database to confirm their immune correlation with cervical cancer.

3. Results

3.1. GEO methylation data differential gene screening

Based on the platform annotation file, probes in the GEO DNA methylation data file were mapped to genes. When several probes matched the same gene, their average was used, and genes with null values were excluded. Unlike gene expression data, DNA methylation data must be converted from Beta values to M values after normalization, as M values are better suited to the statistical analyses used to assess the methylation ratio at each CpG site [19]. The transformation is given in Equation (1), where Beta is the ratio of the methylated allele intensity to the total (methylated plus unmethylated) intensity. It is a continuous variable ranging from 0 to 1. Beta ≥ 0.6 was considered fully methylated, 0.2 < Beta < 0.6 partially methylated, and Beta ≤ 0.2 completely unmethylated.

$$M=\log_2\left(\frac{\mathrm{Beta}}{1-\mathrm{Beta}}\right) \tag{1}$$
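Equation (1) is the base-2 logit of Beta, and it can be checked numerically with a few lines of Python. The function names and the small clamping constant `eps` (used so that Beta values of exactly 0 or 1 do not produce an infinite logarithm) are assumptions added for this illustration and are not taken from the study.

```python
import math


def beta_to_m(beta, eps=1e-6):
    """Convert a methylation Beta value in [0, 1] to an M value (Equation 1)."""
    # Clamp away from 0 and 1 so the logarithm stays finite (assumed guard).
    b = min(max(beta, eps), 1.0 - eps)
    return math.log2(b / (1.0 - b))


def m_to_beta(m):
    """Inverse transformation: recover Beta from an M value."""
    return 2.0 ** m / (2.0 ** m + 1.0)


for beta in (0.2, 0.5, 0.6, 0.8):
    m = beta_to_m(beta)
    print(f"Beta={beta:.1f} -> M={m:+.3f} -> Beta={m_to_beta(m):.3f}")
```

A Beta of 0.5 maps to M = 0, and the thresholds Beta = 0.2 and Beta = 0.8 map to M = −2 and M = +2, which shows how the M scale spreads out values near the ends of the Beta range.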

In this paper, methylation data were analyzed in R using the “limma” package. DEGs in the DNA methylation data were identified with the thresholds |logFC|>1, p-value<0.05, and FDR<0.01. As shown in Fig. 2, 122 up-regulated genes and 1783 down-regulated genes were screened, and 51 crossover genes were then screened using a Venn diagram.

Fig. 2. Venn diagram of up- and down-regulated genes.

3.2. Stromal and immune scoring

To determine stromal and immune scores, tumor tissue RNA gene expression data of 306 patients were retrieved from the TCGA database. According to the ESTIMATE algorithm, the immune scores ranged from −1166.31 to 3346.06 (Fig. 3 (a_1)) and the stromal scores ranged from −2458.38 to 871.12 (Fig. 3 (a_2)). To evaluate the potential association between overall survival (OS) and the stromal and immune scores, cases were separated into high- and low-score groups based on stromal/immune cell content. As shown in Fig. 3 (b_1), survival analysis found that patients with high immune scores had significantly longer survival (p = 0.038), whereas there was no significant difference in OS between the low and high stromal score groups (p = 0.255; see Fig. 3 (b_2)).

Fig. 3. (a) Immune score and stromal score; (b) Survival analysis; (c) Relationship between immune score and stromal score and clinical staging.

Stromal and immune scores were also analyzed in relation to clinical staging, as shown in Fig. 3 (c_1). The immune scores ranged from −203.65 to 1500.73 and the stromal scores ranged from −1700.63 to −474.73. There was a significant difference in immune scores by stage (p = 0.024), indicating that immune cell levels may play a role in tumor treatment. There was no statistically significant difference in stromal scores by stage (p = 0.738, Fig. 3 (c_2)).

3.3. Differential gene screening of TCGA gene expression data

The gene expression data of the 306 CESC cases were divided into high- and low-score groups by stromal and immune scores and displayed as heatmaps. Heatmaps of the DEGs for the stromal scores and the immune scores are shown in Fig. 4 (a_1) and Fig. 4 (a_2), respectively. Genes with higher expression are depicted in red, genes with lower expression in green, and genes with the same expression level in black. Comparing the high and low stromal score groups identified 1355 up-regulated genes and 46 down-regulated genes. In the immune score comparison, 1162 up-regulated genes and 619 down-regulated genes were identified. The Venn diagram further revealed that 776 genes were up-regulated in both the stromal and immune score groups, whereas 26 genes were down-regulated in both (Fig. 4 (b)). From the Venn diagram intersections, a total of 802 genes were therefore screened as DEGs, using the thresholds |logFC|>2, p-value<0.05, and FDR<0.01.

Fig. 4. (a) Heatmaps are created using the mean linkage method and the Pearson distance metric. Genes with greater levels of expression are highlighted in red, genes with lower levels of expression are highlighted in green, and genes with the same level of expression are highlighted in black; (b) Venn diagram depicting the number of DEGs that are up- or down-regulated in the stromal and immune scoring groups.

3.4. Functional assessment

To understand the underlying biological processes, cellular components, molecular functions, and enriched signaling pathways, we performed Gene Ontology (GO) analysis and KEGG enrichment analysis on the differentially expressed genes (DEGs) in the gene expression data.

The Gene Ontology Consortium created GO, a database that is updated as research advances, to provide a standard semantic vocabulary for qualifying and describing gene and protein function across a wide range of species [36]. GO provides a set of terms for describing concepts/classes of gene function and the relationships between these concepts, and thus gives a functional annotation of each gene. Gene products are described in terms of cellular component (CC), molecular function (MF) and biological process (BP), so that GO enrichment analysis gives a rough idea of the biological function, pathway or cellular localization in which the differential genes are enriched.

The Kyoto Encyclopedia of Genes and Genomes (KEGG) is a database resource for understanding the high-level functions of biological systems from molecular-level information, particularly the large molecular datasets generated by genome sequencing and other high-throughput experimental techniques. KEGG enrichment analysis identifies the functions, and in particular the pathways, in which a gene set is significantly concentrated.

According to the GO and KEGG enrichment analysis, the differential genes were highly correlated with the immune response. In the GO enrichment analysis, DEGs were significantly enriched in GO terms such as leukocyte mediated immunity, lymphocyte mediated immunity, immunoglobulin complex, and immunoglobulin receptor binding, as shown in Fig. 5(a). The KEGG enrichment analysis revealed that DEGs were significantly enriched in pathways such as cytokine-cytokine receptor interaction, chemokine signaling pathway, and cell adhesion molecules, as shown in Fig. 5(b). The KEGG enrichment analysis also indicated that changes in immune-related signaling pathways (such as the T cell pathway) were directly related to tumor malignancy and prognosis, which suggests that most DEG-related pathways were significantly associated with immune cells and may have an inhibitory effect on the growth of cancer cells.

Fig. 5. Results of functional enrichment analysis of DEGs: (a) Results of GO enrichment analysis; (b) Results of KEGG enrichment analysis.

3.5. Building PPI networks and identifying key genes

The interaction of DEGs was investigated further by constructing a PPI network (as shown in Fig. 6(a)) using the online website https://string-db.org/ and the Cytoscape software [48]. Individual networks with ten or more nodes were included for further analysis; networks with fewer than ten nodes were excluded, and the connectivity of each network node was calculated. The top 10 key genes CD4, CD3E, IL10, LCP2, CD28, CD80, CD3D, CD8A, PTPRC, and ITK were then screened from the PPI network using the plugin cytoHubba (Fig. 6 (b)).

Fig. 6. (a) PPI network; (b) Top ten key genes.

3.6. Validation of survival analysis and TIMER analysis

Survival analysis was performed on the top 10 key genes screened from the PPI network to investigate the potential prognostic value of individual DEGs. Among these ten key genes, eight were significantly associated with poorer overall survival (P < 0.05), and two of them, CD3E and CD80, were shared by the differential genes from the GEO methylation data and the differential genes from the TCGA gene expression data. Fig. 7(a) shows that the p value of CD3E is less than 0.05, and Fig. 7(b) shows that the p value of CD80 is also less than 0.05. Both genes are therefore significantly correlated with survival, indicating that they have potential prognostic value.

Fig. 7. Validation of the correlation between genes extracted from the TCGA and GEO databases and overall survival rates.

Finally, we performed further immunological validation of the screened genes CD3E and CD80 using the TIMER database. As shown in Fig. 8(a), CD3E is highly correlated with immune infiltration in cervical cancer, and as shown in Fig. 8(b), CD80 is also relatively highly correlated with immune infiltration in cervical cancer. These results further suggest that the two genes have potential prognostic value.

Fig. 8. The immune correlation of CD3E and CD80.

4. Validation of related work

According to the work of Kaufmann et al. [49], co-culturing peripheral blood lymphocytes (PBLs) with tumor cells resulted in the proliferation of CD4+ and CD8+ T lymphocyte subsets. Compared with control cell lines, CD80-expressing tumor cells increased the proliferation of allogeneic PBLs two- to sixfold. After three weeks of co-culture with CD80-positive tumor cells, homozygous PBL produced cytotoxic T cells capable of lysing untransfected parental tumor cell lines. These findings show that CD80 expression on cervical cancer cells has an immunostimulatory effect, which provides clues for cervical cancer treatment.

A related study [50] reported that the number of prognostic genes, the number of prognostic immune genes (PIGs), and the hazard ratios (HRs) of PIGs vary greatly between cancer types. From the PIGs of 6 cancer types, that study screened 48 PIGs common to at least 5 cancer types. The STRING database shows that 11 of the 48 PIGs are involved in the TCR signaling pathway, and the TCR signaling triggering module includes genes such as CD3E and CD3D. Expression of the PIGs implicated in TCR signaling is linked to better overall survival in five cancer types (cervical and other tumors).

Therefore, combined with our research results and relevant literature, we argue that CD80 and CD3E have a great impact on the prognosis of cervical cancer.

5. Discussion

TME signatures may be easily altered during cancer progression, which may make it difficult to translate these findings into practical clinical applications. Related studies have found that the TME contains a large number of cytokines and chemokines that play a vital role in tumor invasion and metastasis; TNF, for example, can regulate tumor progression through a variety of pathways. In follow-up work, we will therefore further analyze the cytokines and chemokines in the TME of cervical cancer.

HPV infection and viral oncoprotein over-expression are possible triggers of TME remodeling. Previous studies have shown that CDKN2A is a reliable host marker for HPV infection, so we performed a gene correlation analysis of CD3E, CD80 and CDKN2A. The results showed that CD3E and CD80 were each positively correlated with CDKN2A (p < 0.05), as shown in Fig. 9(a) and Fig. 9(b).

Fig. 9. Correlation between CD3E, CD80 and CDKN2A.

CD3E and CD80 play a vital role in the development of T cells. Defects in the CD3E and CD80 genes can lead to severe immune deficiency, and the interaction of CDKN2A with CD3E and CD80 is likely to inhibit tumor cells and to have a beneficial effect on the prognosis of cervical cancer. In the future, we will focus on analyzing the effect of the interaction between CD3E, CD80 and CDKN2A on the induction of cancer.

6. Conclusion

Data mining from the TCGA and GEO datasets has been widely used to predict cancer prognosis, and recent research has indicated that the TME plays a major role in CESC growth and progression. In this study, we therefore mined the TCGA and GEO databases for TME-related genes that have a significant impact on CESC prognosis. These genes are associated with the stromal and immune components of the TME.

Using the GEO methylation data, we first identified 51 differential genes (Fig. 2). Second, we extracted stromal and immune scores from the TCGA gene expression data to see whether they correlated with clinical characteristics and overall survival in CESC patients. The findings revealed that they were related to clinical progression and prognostic indicators such as tumor classification and staging. Third, by comparing the high and low stromal and immune score groups, 802 DEGs were found (Fig. 4). GO enrichment analysis then revealed that the majority of the screened genes were involved in the TME, whereas KEGG pathway enrichment analysis revealed that the majority of DEGs were significantly associated with immune responses, implying that stromal and immune cell functions are interconnected and together make up the tumor microenvironment in CESC.

After identifying DEGs using the TCGA gene expression data, the differential genes were imported into the STRING database (https://string-db.org/) to extract interactions between overlapping DEGs, and Cytoscape software was used to generate and display the PPI network. We then used the cytoHubba plugin to compute the node scores of genes in the PPI network and chose the ten key genes with the highest scores. We next conducted survival analyses to investigate the potential prognostic value of these ten key genes, which revealed a significant correlation between gene expression and poor prognosis in CESC patients. These ten genes (CD4, CD3E, IL10, LCP2, CD28, CD80, CD3D, CD8A, PTPRC, and TYROBP) are important for predicting survival in CESC, demonstrating the utility of our data analysis based on the ESTIMATE algorithm for mining key prognostic genes in CESC. Among these ten genes, CD3E and CD80 were also found among the differential genes of the methylation data. Finally, survival analysis and a search of the relevant literature indicated that two genes, CD3E and CD80, could be considered potential prognostic biomarkers for CESC. Further research into these genes in CESC patients could help provide insight into, and a comprehensive understanding of, the potential link between the TME and CESC prognosis.

The strength of this study lies in mining biomarkers by analyzing TME-associated genes using multi-omics data and the most recent clinical datasets. Although we have carried out extensive bioinformatics analysis, the study has important limitations: because we do not have a clinical laboratory, all of the research is based on the analysis and verification of data from public databases, so the findings still need to be confirmed by relevant clinical experiments. In the future, we will investigate the application of other approaches to additional omics data and continue to improve the methodology of this study. Furthermore, we anticipate that the method used here to mine TME-related genes can be widely applied to the analysis of other malignancies, potentially identifying additional biomarkers with prognostic value for CESC or other malignancies.

Availability of data and materials

The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author/s.

Funding

This work was supported by grants from the National Natural Science Foundation of China (Nos. 62162032, 62062043, 32270789), the Science and Technology Research Project of Education Department of Jiangxi Province, China (GJJ2201004), and the Key Program for S&T Cooperation Projects of Jiangxi Province, China (No. 20212BDH80021).

Author's contributions

W-R Q: Conceived and designed the experiments; Wrote the paper.

W-T X: Performed the experiments; Analyzed and interpreted the data; Contributed reagents, materials, analysis tools or data; Wrote the paper.

W-K Y: Performed the experiments.

Z-C X: Performed the experiments.

S-H Z: Contributed reagents, materials, analysis data.

All authors have reviewed and approved the submitted version of this manuscript.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

  • 1. Zhao F., Wen Y., Li Y., Tao S., Ma L., Zhao Y., Dang L., Wang Y., Zhao F., Lang J., Qiao Y., Yang C.X. Epidemiologic and health economic evaluation of cervical cancer screening in rural China. Asian Pac. J. Cancer Prev. APJCP. 2020;21:1317–1325. doi: 10.31557/APJCP.2020.21.5.1317.
  • 2. Small W., Jr., Bacon M.A., Bajaj A., Chuang L.T., Fisher B.J., Harkenrider M.M., Jhingran A., Kitchener H.C., Mileshkin L.R., Viswanathan A.N., Gaffney D.K. Cervical cancer: a global health crisis. Cancer. 2017;123:2404–2412. doi: 10.1002/cncr.30667.
  • 3. Arbyn M., Weiderpass E., Bruni L., de Sanjosé S., Saraiya M., Ferlay J., Bray F. Estimates of incidence and mortality of cervical cancer in 2018: a worldwide analysis. Lancet Global Health. 2020;8:e191–e203. doi: 10.1016/S2214-109X(19)30482-6.
  • 4. Twardella D., Geiss K., Radespiel-Troger M., Benner A., Ficker J.H., Meyer M. [Trends in incidence of lung cancer according to histological subtype among men and women in Germany: analysis of cancer registry data with the application of multiple imputation techniques]. Bundesgesundheitsblatt - Gesundheitsforsch. - Gesundheitsschutz. 2018;61:20–31. doi: 10.1007/s00103-017-2659-x.
  • 5. Di J., Rutherford S., Chu C. Review of the cervical cancer burden and population-based cervical cancer screening in China. Asian Pac. J. Cancer Prev. APJCP. 2015;16:7401–7407. doi: 10.7314/apjcp.2015.16.17.7401.
  • 6. Yuanyue L., Baloch Z., Shanshan L., Yasmeen N., Xiaomei W., Khan J.M., Xueshan X. Cervical cancer, human papillomavirus infection, and vaccine-related knowledge: awareness in Chinese women. Cancer Control. 2018;25. doi: 10.1177/1073274818799306.
  • 7. He Y., Li X.M., Yin C.H., Wu Y.M. Killing cervical cancer cells by specific chimeric antigen receptor-modified T cells. J. Reprod. Immunol. 2020;139. doi: 10.1016/j.jri.2020.103115.
  • 8. Yu S., Li X., Zhang J., Wu S. Development of a novel immune infiltration-based gene signature to predict prognosis and immunotherapy response of patients with cervical cancer. Front. Immunol. 2021;12. doi: 10.3389/fimmu.2021.709493.
  • 9. Cao R., Yuan L., Ma B., Wang G., Tian Y. Tumour microenvironment (TME) characterization identified prognosis and immunotherapy response in muscle-invasive bladder cancer (MIBC). Cancer Immunol. Immunother. 2021;70:1–18. doi: 10.1007/s00262-020-02649-x.
  • 10. Ho W.J., Jaffee E.M., Zheng L. The tumour microenvironment in pancreatic cancer - clinical challenges and opportunities. Nat. Rev. Clin. Oncol. 2020;17:527–540. doi: 10.1038/s41571-020-0363-5.
  • 11. Ramaglia V., Florescu A., Zuo M., Sheikh-Mohamed S., Gommerman J.L. Stromal cell-mediated coordination of immune cell recruitment, retention, and function in brain-adjacent regions. J. Immunol. 2021;206:282–291. doi: 10.4049/jimmunol.2000833.
  • 12. Bi J., Bi F., Pan X., Yang Q. Establishment of a novel glycolysis-related prognostic gene signature for ovarian cancer and its relationships with immune infiltration of the tumor microenvironment. J. Transl. Med. 2021;19:382. doi: 10.1186/s12967-021-03057-0.
  • 13. Lee E.C., Fragala M.S., Kavouras S.A., Queen R.M., Pryor J.L., Casa D.J. Biomarkers in sports and exercise: tracking health, performance, and recovery in athletes. J. Strength Condit Res. 2017;31:2920–2937. doi: 10.1519/JSC.0000000000002122.
  • 14. Wei L., Zhou C., Chen H., Song J., Su R. ACPred-FL: a sequence-based predictor using effective feature representation to improve the prediction of anti-cancer peptides. Bioinformatics. 2018;34:4007–4016. doi: 10.1093/bioinformatics/bty451.
  • 15. Su R., Liu X., Xiao G., Wei L. Meta-GDBP: a high-level stacked regression model to improve anticancer drug response prediction. Briefings Bioinf. 2020;21:996–1005. doi: 10.1093/bib/bbz022.
  • 16. Hu Y., Lan W., Miller D. Handling high-dimension (High-Feature) MicroRNA data. Methods Mol. Biol. 2017;1617:179–186. doi: 10.1007/978-1-4939-7046-9_13.
  • 17. Deng Z., Wang J., Xu B., Jin Z., Wu G., Zeng J., Peng M., Guo Y., Wen Z. Mining TCGA database for tumor microenvironment-related genes of prognostic value in hepatocellular carcinoma. BioMed Res. Int. 2019. doi: 10.1155/2019/2408348.
  • 18. Ren N., Liang B., Li Y. Identification of prognosis-related genes in the tumor microenvironment of stomach adenocarcinoma by TCGA and GEO datasets. Biosci. Rep. 2020;40. doi: 10.1042/BSR20200980.
  • 19. Qiu W.R., Qi B.B., Lin W.Z., Zhang S.H., Yu W.K., Huang S.F. Predicting the lung adenocarcinoma and its biomarkers by integrating gene expression and DNA methylation data. Front. Genet. 2022;13. doi: 10.3389/fgene.2022.926927.
  • 20. Khan A.M., Singer A. Biomarkers in cervical precancer management: the new frontiers. Future Oncol. 2008;4:515–524. doi: 10.2217/14796694.4.4.515.
  • 21. Reza M.S., Harun-Or-Roshid M., Islam M.A., Hossen M.A., Hossain M.T., Feng S., Xi W., Mollah M.N.H., Wei M.S. Bioinformatics screening of potential biomarkers from mRNA expression profiles to discover drug targets and agents for cervical cancer. Int. J. Mol. Sci. 2022;23. doi: 10.3390/ijms23073968.
  • 22. Li X., Schottker B., Holleczek B., Brenner H. Associations of DNA methylation algorithms of aging and cancer risk: results from a prospective cohort study. EBioMedicine. 2022;81. doi: 10.1016/j.ebiom.2022.104083.
  • 23. Papanicolau-Sengos A., Aldape K. DNA methylation profiling: an emerging paradigm for cancer diagnosis. Annu. Rev. Pathol. 2022;17:295–321. doi: 10.1146/annurev-pathol-042220-022304.
  • 24. Kanehisa M., Furumichi M., Tanabe M., Sato Y., Morishima K. KEGG: new perspectives on genomes, pathways, diseases and drugs. Nucleic Acids Res. 2017;45:D353–D361. doi: 10.1093/nar/gkw1092.
  • 25. Kanehisa M., Sato Y., Kawashima M. KEGG mapping tools for uncovering hidden features in biological data. Protein Sci. 2022;31:47–53. doi: 10.1002/pro.4172.
  • 26. Hwang B., Lee J.H., Bang D. Single-cell RNA sequencing technologies and bioinformatics pipelines. Exp. Mol. Med. 2018;50:1–14. doi: 10.1038/s12276-018-0071-8.
  • 27. Arbyn M., Weiderpass E., Bruni L., de Sanjose S., Saraiya M., Ferlay J., Bray F. Estimates of incidence and mortality of cervical cancer in 2018: a worldwide analysis. Lancet Global Health. 2020;8:e191–e203. doi: 10.1016/S2214-109X(19)30482-6.
  • 28. Hagmeijer M.H., Korpershoek J.V., Crispim J.F., Chen L.T., Jonkheijm P., Krych A.J., Saris D.B.F., Vonk L.A. The regenerative effect of different growth factors and platelet lysate on meniscus cells and mesenchymal stromal cells and proof of concept with a functionalized meniscus implant. J Tissue Eng Regen Med. 2021;15:648–659. doi: 10.1002/term.3218.
  • 29. T G.S. Innate and adaptive immune cells in Tumor microenvironment. Gulf J Oncolog. 2021;1:77–81.
  • 30. Feng H., Zhong L., Yang X., Wan Q., Pei X., Wang J. Development and validation of prognostic index based on autophagy-related genes in patient with head and neck squamous cell carcinoma. Cell Death Dis. 2020;6:59. doi: 10.1038/s41420-020-00294-y.
  • 31. Calabuig J.M., Garcia-Raffi L.M., Garcia-Valiente A., Sanchez-Perez E.A. Kaplan-meier type survival curves for COVID-19: a health data based decision-making tool. Front. Public Health. 2021;9. doi: 10.3389/fpubh.2021.646863.
  • 32. Hage A., Hage F. Kaplan-meier survival, actuarial survival, censoring, and competing events-what is what? Ann. Thorac. Surg. 2022;114:40–43. doi: 10.1016/j.athoracsur.2022.03.044.
  • 33. Kreis J., Nedic B., Mazur J., Urban M., Schelhorn S.E., Grombacher T., Geist F., Brors B., Zuhlsdorf M., Staub E. RosettaSX: reliable gene expression signature scoring of cancer models and patients. Neoplasia. 2021;23:1069–1077. doi: 10.1016/j.neo.2021.08.005.
  • 34. Ni J., Liu S., Qi F., Li X., Yu S., Feng J., Zheng Y. Screening TCGA database for prognostic genes in lower grade glioma microenvironment. Ann. Transl. Med. 2020;8:209. doi: 10.21037/atm.2020.01.73.
  • 35. Ding L., Fan L., Xu X., Fu J., Xue Y. Identification of core genes and pathways in type 2 diabetes mellitus by bioinformatics analysis. Mol. Med. Rep. 2019;20:2597–2608. doi: 10.3892/mmr.2019.10522.
  • 36. Ding J., Zhang Y. Analysis of key GO terms and KEGG pathways associated with carcinogenic chemicals. Comb. Chem. High Throughput Screen. 2017.
  • 37. Yuan F., Lu L., Zhang Y., Wang S., Cai Y.D. Data mining of the cancer-related lncRNAs GO terms and KEGG pathways by using mRMR method. Math. Biosci. 2018;304:1–8. doi: 10.1016/j.mbs.2018.08.001.
  • 38. Devraj R., Deshpande M. Demographic and health-related predictors of proton pump inhibitor (PPI) use and association with chronic kidney disease (CKD) stage in NHANES population. Res. Soc. Adm. Pharm. 2020;16:776–782. doi: 10.1016/j.sapharm.2019.08.032.
  • 39. Zhou J., Xiong W., Wang Y., Guan J. Protein function prediction based on PPI networks: network reconstruction vs edge enrichment. Front. Genet. 2021;12. doi: 10.3389/fgene.2021.758131.
  • 40. Mahdipour E., Ghasemzadeh M. The protein-protein interaction network alignment using recurrent neural network. Med. Biol. Eng. Comput. 2021;59:2263–2286. doi: 10.1007/s11517-021-02428-5.
  • 41. Xu T., Wang Q., Liu M. A network pharmacology approach to explore the potential mechanisms of huangqin-baishao herb pair in treatment of cancer. Med. Sci. Mon. Int. Med. J. Exp. Clin. Res. 2020;26. doi: 10.12659/MSM.923199.
  • 42. Doncheva N.T., Morris J.H., Gorodkin J., Jensen L.J. Cytoscape StringApp: network analysis and visualization of proteomics data. J. Proteome Res. 2019;18:623–632. doi: 10.1021/acs.jproteome.8b00702.
  • 43. Rich J.T., Neely J.G., Paniello R.C., Voelker C.C., Nussenbaum B., Wang E.W. A practical guide to understanding Kaplan-Meier curves. Otolaryngol. Head Neck Surg. 2010;143:331–336. doi: 10.1016/j.otohns.2010.05.007.
  • 44. Li H., Li Q., Jing H., Zhao J., Zhang H., Ma X., Wei L., Dai R., Sun W., Suo Z. Expression and prognosis analysis of JMJD5 in human cancers. Front. Biosci. 2021;26:707–716. doi: 10.52586/4981.
  • 45. Li T., Fu J., Zeng Z., Cohen D., Li J., Chen Q., Li B., Liu X.S. TIMER2.0 for analysis of tumor-infiltrating immune cells. Nucleic Acids Res. 2020;48:W509–W514. doi: 10.1093/nar/gkaa407.
  • 46. Li T., Fan J., Wang B., Traugh N., Chen Q., Liu J.S., Li B., Liu X.S. TIMER: a web server for comprehensive analysis of tumor-infiltrating immune cells. Cancer Res. 2017;77:e108–e110. doi: 10.1158/0008-5472.CAN-17-0307.
  • 47. Li B., Severson E., Pignon J.C., Zhao H., Li T., Novak J., Jiang P., Shen H., Aster J.C., Rodig S., Signoretti S., Liu J.S., Liu X.S. Comprehensive analyses of tumor immunity: implications for cancer immunotherapy. Genome Biol. 2016;17:174. doi: 10.1186/s13059-016-1028-7.
  • 48. Zou X.D., An K., Wu Y.D., Ye Z.Q. PPI network analyses of human WD40 protein family systematically reveal their tendency to assemble complexes and facilitate the complex predictions. BMC Syst. Biol. 2018;12:41. doi: 10.1186/s12918-018-0567-9.
  • 49. Kaufmann A.M., Gissmann L., Street D., Schreckenberger C., Hunter M., Qiao L. Expression of CD80 enhances immunogenicity of cervical carcinoma cells in vitro. Cell. Immunol. 1996;169:246–251. doi: 10.1006/cimm.1996.0115.
  • 50. Wang Q., Li P., Wu W. A systematic analysis of immune genes and overall survival in cancer patients. BMC Cancer. 2019;19:1225. doi: 10.1186/s12885-019-6414-6.

Articles from Heliyon are provided here courtesy of Elsevier