Abstract
Immune-related genes play a significant role in predicting overall survival and monitoring the status of the cancer immune microenvironment. This study aimed to identify differentially expressed immune-related genes (DEIRGs) and to establish a Cox prediction model for evaluating prognosis in patients with non-small cell lung cancer (NSCLC). Transcriptome expression data, immune gene data, and tumor transcription factor data from The Cancer Genome Atlas (TCGA), the Immunology Database and Analysis Portal, and the Cistrome Cancer database were analyzed to detect differentially expressed genes (DEGs), DEIRGs, and differentially expressed transcription factors (DETFs). Multivariate Cox regression analysis was used to identify DEIRGs that could serve as independent prognostic factors. The Oncomine, Human Protein Atlas (HPA), and TIMER databases were used to validate the mRNA and protein expression levels of the DEIRGs, and the TIMER database was also used to explore immunocyte infiltration. In total, 7,448 DEGs, 536 DEIRGs, and 87 DETFs were identified from 1,037 NSCLC tissues and 108 normal tissues in the TCGA database. A fifteen-DEIRG signature (THBS1, S100P, S100A16, DLL4, CD70, DKK1, IL33, NRTN, PDGFB, STC2, VGF, GCGR, HTR3A, LGR4, SHC3) could be perceived as an independent prognostic factor for predicting the overall survival of patients with NSCLC (P = 4.89e-09). Immune cell correlation analysis showed that neutrophils and B cells were positively and negatively correlated with the riskScore of the prediction model, respectively. Our study identified a Cox prediction model based on DEIRGs to predict the overall survival of patients with NSCLC. The immunocyte infiltration analysis provided a novel horizon for monitoring the status of the NSCLC immune microenvironment.
Keywords: Non-small cell lung cancer, immune-related genes, transcription factors, tumor immune microenvironment, overall survival, prognostic biomarkers
Introduction
Lung cancer remains the leading cause of cancer-related mortality worldwide [1,2]. Non-small cell lung cancer (NSCLC) accounts for approximately 85% of lung cancer cases and mainly comprises lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC) [3,4]. With the increasing incidence and mortality associated with NSCLC, the survival rates of patients with LUAD and LUSC remain suboptimal [5]. For clinical tumor-node-metastasis (TNM) stage IIIB and IV NSCLC, the 5-year survival rates are 7% and 2%, respectively [6]. High-throughput technologies, such as microarray and RNA sequencing, can be applied to identify data profiles related to prognostic biomarkers [7-9].
Recently, some studies have shown that the prediction model based on RNA sequencing data could precisely predict the survival of patients with cancers [9-12]. The immune microenvironment (IME), including immune cells associated with immune-related genes (IRGs), has a significant impact on predicting the prognosis of cancers [13]. A study has shown that an immune-related prognostic model in the context of the TP-53 status could forecast the prognosis of colorectal cancer [14]. Another study identified an immune signature prediction model to forecast the prognosis of LUAD [7]. Recently, a study revealed that the clinical immune signature could be perceived as a prognostic biomarker for forecasting the prognosis of non-squamous NSCLC [15]. However, a Cox predictive model in the context of IRGs and clinical relevance in LUAD and LUSC is currently lacking.
This research mainly aimed to acquire differentially expressed genes (DEGs), differentially expressed immune-related genes (DEIRGs), and differentially expressed transcription factors (DETFs) based on The Cancer Genome Atlas (TCGA), Immunology Database and Analysis Portal (ImmPort), and Cistrome Cancer databases. Subsequently, a Cox prediction model based on DEIRGs was constructed for predicting the prognosis of LUAD and LUSC. Furthermore, a DEIRG-related DETF regulatory network was built to reveal the potential mechanism of DEIRGs in patients with LUSC and LUAD. Moreover, correlation analysis between the riskScore of the prediction model and immunocyte infiltration was performed to estimate the status of the tumor microenvironment.
Materials and methods
Clinical patients and data collection
We downloaded the fragments per kilobase of exon model per million mapped reads (FPKM) data for LUSC and LUAD from the transcriptome RNA-Seq data in the TCGA database, comprising 1,037 LUSC and LUAD tissues and 108 normal tissues. We acquired IRGs from the ImmPort database (http://www.immport.org/) [16]. Furthermore, we used the Cistrome Project (http://www.cistrome.org/) to download the cancer transcription factor targets [17].
Differential gene expression analysis in LUAD and LUSC
We used the limma R package (http://www.bioconductor.org/packages/release/bioc/html/limma.html) to perform differential gene expression analysis on all transcriptome RNA-Seq gene expression data, cancer TF targets, and IRGs, with thresholds of |log2 fold change| > 1 and a false discovery rate-adjusted P < 0.05. DETFs were obtained from the DEGs using the Cistrome Cancer database. Moreover, DEIRGs were extracted from the DEGs using the ImmPort database.
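The thresholding step described above can be illustrated with a minimal Python sketch. It is not the limma workflow itself (limma fits moderated linear models in R); it only shows the filtering logic, assuming that the adjusted P-values are Benjamini-Hochberg false discovery rates. The function names and the toy gene names and values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(genes, log2_fc, pvalues, fc_cutoff=1.0, fdr_cutoff=0.05):
    """Keep genes with |log2FC| > fc_cutoff and adjusted P < fdr_cutoff."""
    fdr = benjamini_hochberg(pvalues)
    selected = {}
    for gene, lfc, q in zip(genes, log2_fc, fdr):
        if abs(lfc) > fc_cutoff and q < fdr_cutoff:
            selected[gene] = "up" if lfc > 0 else "down"
    return selected


# Hypothetical toy input (not data from this study).
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
log2_fc = [2.3, -1.8, 0.4, 1.2]
pvalues = [0.0001, 0.004, 0.0002, 0.30]
degs = select_degs(genes, log2_fc, pvalues)
print(degs)
```

In this toy input, GENE_C fails the fold-change threshold and GENE_D fails the adjusted P-value threshold, so only two genes are retained.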
Functional enrichment analysis of DEIRGs in LUSC and LUAD
We further investigated the functions of the DEIRGs based on their expression profiles. The Database for Annotation, Visualization and Integrated Discovery (http://www.david.niaid.nih.gov) was used to perform Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis, together with Gene Ontology (GO) analysis [18]. In the GO analysis, P < 0.05 denoted statistical significance. The GOCircle plot and GO Chord plot were obtained using the GOplot R package (https://cran.r-project.org/web/packages/GOplot/citation.html) [19]. Furthermore, the KEGG pathway analysis was conducted with the clusterProfiler R package (http://www.bioconductor.org/packages/release/bioc/html/clusterProfiler.html) [20]. P < 0.05 denoted statistical significance. In addition, we used Cytoscape (version 3.6.1) to establish the DEIRG-related KEGG pathway network for visual analysis [21].
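To make the idea of term enrichment concrete, the following Python sketch computes an over-representation P-value with the hypergeometric test, which is the test used by clusterProfiler-style enrichment. It is an illustration only: DAVID applies a modified Fisher exact test, multiple-testing adjustment is omitted, and the function name and the toy counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """P(X >= k) for a hypergeometric draw.

    N: genes in the background, K: background genes annotated to the term,
    n: genes in the query list, k: query genes annotated to the term.
    """
    upper = min(n, K)
    total = comb(N, n)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible configurations contribute nothing to the sum.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Hypothetical toy counts: 20 background genes, 5 annotated to the term,
# 6 genes in the query list, 3 of which carry the annotation.
p_value = hypergeom_enrichment_p(k=3, n=6, K=5, N=20)
print(p_value)
```

A small P-value indicates that the term is annotated to more query genes than expected by chance given the background.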
Establishment of the DEIRG-based prediction Cox model in LUAD and LUSC
We used univariate Cox regression analyses to obtain survival-related DEIRGs. The survival-related DEIRGs were then entered as candidate prognostic biomarkers into multivariate Cox regression analyses (P < 0.05). Based on the median riskScore value, patients with LUAD and LUSC were classified into low-risk and high-risk groups. Receiver operating characteristic (ROC) analysis was then performed to evaluate the DEIRG signature (low-risk vs. high-risk) based on overall survival (OS). The area under the ROC curve was calculated to evaluate the Cox prediction model and to explore the prognostic biomarkers in patients with LUAD and LUSC.
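The median split and the ROC evaluation can be sketched in Python as follows. This is a simplified illustration under stated assumptions: samples with a riskScore above the median are labeled high-risk, and the area under the curve is computed against a binary survival status, whereas survival studies often use time-dependent ROC methods. The function names and toy values are hypothetical.

```python
from statistics import median


def split_by_median(scores):
    """Label each sample 'high' if its score exceeds the median, else 'low'."""
    cutoff = median(scores)
    return ["high" if s > cutoff else "low" for s in scores]


def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for an event (death), 0 otherwise. Ties count as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy riskScores and survival status (not data from this study).
scores = [0.2, 0.5, 0.9, 1.4, 1.1]
status = [0, 0, 1, 1, 0]
groups = split_by_median(scores)
auc_value = roc_auc(scores, status)
print(groups, auc_value)
```

With an odd number of samples, the sample at the median falls into the low-risk group under this rule, so the low-risk group is larger by one.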
DEIRG-mediated DETF regulatory network in LUAD and LUSC
TFs are important molecules that regulate gene expression levels. Thus, investigating how TFs regulate the expression of OS-related DEIRGs is of great importance. The Cistrome Cancer database provides the regulatory relationships between the transcriptome and TFs in TCGA profiles. Therefore, we constructed a DEIRG-mediated DETF regulatory network for visual analysis, using thresholds of P < 0.001 and correlation coefficient > 0.4, to investigate the potential mechanism of DEIRGs.
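The edge-filtering step for such a network can be sketched in Python. The sketch assumes that the correlation threshold applies to the absolute value of the coefficient, so that negative regulatory relationships can also pass the filter; this is an interpretation rather than a stated detail of the method. The function name and all TF and gene names and values below are hypothetical.

```python
def filter_regulatory_edges(pairs, cor_cutoff=0.4, p_cutoff=0.001):
    """Keep TF-gene pairs with |correlation| > cor_cutoff and P < p_cutoff.

    pairs: iterable of (tf, gene, correlation, p_value) tuples.
    Returns (tf, gene, direction) tuples for the retained edges.
    """
    edges = []
    for tf, gene, cor, p in pairs:
        if abs(cor) > cor_cutoff and p < p_cutoff:
            direction = "positive" if cor > 0 else "negative"
            edges.append((tf, gene, direction))
    return edges


# Hypothetical toy correlations (not data from this study).
pairs = [
    ("TF_A", "GENE_X", 0.62, 1e-8),
    ("TF_B", "GENE_X", -0.45, 5e-5),
    ("TF_C", "GENE_Y", 0.35, 1e-6),   # fails the correlation threshold
    ("TF_D", "GENE_Y", 0.51, 0.02),   # fails the P-value threshold
]
edges = filter_regulatory_edges(pairs)
print(edges)
```

The retained edges, together with their direction, are what a tool such as Cytoscape would take as input for drawing the network.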
Independent prognosis analysis
We used univariate and multivariate Cox independent prognosis analyses to further investigate whether the fifteen-DEIRG signature could be used as an independent prognostic factor. We screened 668 LUAD and LUSC patients with complete clinical characteristic information and integrated the expression and riskScore of the fifteen-DEIRG signature to perform the univariate and multivariate Cox independent prognosis analyses. P < 0.05 denoted statistical significance.
Clinical relevance analysis between prognosis-related DEIRGs and clinical characteristics in LUAD and LUSC
Correlation analysis was conducted to further identify the interaction between prognosis-related DEIRGs and clinical characteristics in LUSC and LUAD. The clinical characteristics included age (≤ 65/> 65 years), sex (female/male), pathological T stage (T3&T4/T1&T2), pathological TNM stage (III&IV/I&II), pathological M stage (M1/M0), and pathological N stage (N1-N3/N0). P < 0.05 denoted statistical significance.
Oncomine, Human Protein Atlas (HPA), and TIMER validation
Oncomine is an online database that automatically detects differential expression profiles for cancer types and subtypes, allowing a gene of interest to be analyzed across all datasets [22]. We used Oncomine to analyze and visualize the differential expression of prognosis-related DEIRGs in 20 cancer types. The Human Protein Atlas (HPA) database can be used as a tool for identifying potential novel cancer biomarkers from protein profiles with in silico methods [23]. We used the immunohistology data of HPA to validate the expression level of prognosis-related DEIRGs. TIMER 2.0 is an online database that can analyze immunocyte infiltration, gene mutation, and differential expression for all genes [24]. We used the Diff Exp section of TIMER 2.0 to validate the expression level of prognosis-related DEIRGs across cancer types.
Survival analysis of prognosis-related DEIRGs
The Kaplan-Meier plotter (www.kmplot.com) includes clinical data and gene expression data for lung cancer [25], gastric cancer [26], breast cancer [27], and ovarian cancer [28]. We used the Kaplan-Meier plotter database with the log-rank test to conduct the survival analysis of prognosis-related DEIRGs.
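For readers unfamiliar with the curves produced by such tools, the Kaplan-Meier (product-limit) estimate can be sketched in a few lines of Python. This is an illustrative implementation with hypothetical names and toy data, not the code behind the Kaplan-Meier plotter, and it does not implement the log-rank test used to compare groups.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times: follow-up time per patient.
    events: 1 if death was observed at that time, 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set after time t.
        at_risk -= leaving
    return curve


# Hypothetical toy follow-up data in months (not data from this study).
times = [5, 8, 8, 12, 15]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
print(curve)
```

Censored patients do not lower the curve themselves, but they shrink the risk set, which makes later deaths weigh more heavily.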
Infiltration of immune cells
Tumor Immune Estimation Resource (TIMER) is an online database that can analyze and visualize the levels of tumor-infiltrating immunocytes [29]. TIMER reanalyzed the gene expression data of 10,897 samples from 32 types of cancer in TCGA to reveal correlations between tumor-infiltrating immunocytes, including macrophages, neutrophils, dendritic cells, B cells, CD8 T cells, and CD4 T cells (https://zenodo.org/record/57669#.Xeezu9V5uMo), and other characteristics. We downloaded the TIMER immune estimation file from the TIMER online database. Subsequently, we merged the TIMER immune estimation file with the immune riskScore file to perform the correlation analysis.
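The merge-then-correlate step can be sketched in Python as follows. The sketch assumes that the two files are joined on a shared sample identifier and that a Pearson coefficient is used; the correlation method is not stated above, so this choice is an assumption. The function names, sample identifiers, and values are hypothetical.

```python
from math import sqrt


def merge_by_sample(risk_scores, infiltration):
    """Return sample IDs present in both tables, plus the paired values."""
    shared = sorted(set(risk_scores) & set(infiltration))
    x = [risk_scores[s] for s in shared]
    y = [infiltration[s] for s in shared]
    return shared, x, y


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical toy tables keyed by sample ID (not data from this study).
risk_scores = {"S1": 0.5, "S2": 1.0, "S3": 1.5, "S4": 2.0}
b_cell_level = {"S1": 0.4, "S2": 0.3, "S3": 0.2, "S5": 0.9}
shared, x, y = merge_by_sample(risk_scores, b_cell_level)
r = pearson(x, y)
print(shared, r)
```

Samples that appear in only one of the two tables are dropped before the correlation is computed.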
Results
Acquisition of DEGs, DEIRGs, and DETFs
In total, 7,448 DEGs, 536 DEIRGs, and 87 DETFs were obtained using the limma R package according to the set standards (P < 0.05, |logFC| > 1) between 1,037 LUAD and LUSC specimens and 108 adjacent normal specimens. Based on the criteria, we identified 5,552 up-regulated DEGs and 1,898 down-regulated DEGs, 340 up-regulated DEIRGs and 196 down-regulated DEIRGs, as well as 53 up-regulated DETFs and 34 down-regulated DETFs. The top 50 most statistically significant DEGs and DEIRGs and all DETFs are shown in Tables S1, S2, and S3, respectively. Volcano plots and heatmaps were generated to show the distribution of all DEGs, DEIRGs, and DETFs (Figure 1A-F). The flow diagram of the whole study is illustrated in Figure 1G.
Figure 1.

DEGs, DEIRGs, and DETFs between LUAD and LUSC specimens and adjacent normal specimens. A-C. Volcano plot of DEGs, DEIRGs, and DETFs. D-F. Heatmap of DEGs, DEIRGs, and DETFs. G. Flow chart of the research study.
Functional annotation and pathway enrichment analysis of DEIRGs
We performed GO and KEGG pathway analyses to further verify the function of DEIRGs in LUAD and LUSC. In the GO analysis, the most significant terms belonged to the cellular component category. The most significant term was GO:0005576, extracellular region (adjusted P-value = 1.26E-115). The GOCircle plot shows the top eight GO terms, including two cellular component terms, one molecular function term, and five biological process terms (P < 0.05) (Figure 2A, 2B). The GO Chord plot shows the top 30 most significant DEIRGs related to the top eight GO terms (Figure 2C). As shown in Figure 2C, the top 30 most significant DEIRGs were mostly enriched in GO:0005576 extracellular region. IFNG, IGLV6-57, IGHV3-11, IGKV1-5, PPBP, IGKV1-39, IGLV7-43, IGKV4-1, IGLV3-27, and IGLV3-25 were enriched in GO:0006955 immune response. IGLV6-57, IGHV3-11, IGKV1-5, IGKV1-39, IGLV7-43, IGLV3-27, IGKV4-1, and IGLV3-25 were enriched in GO:0050776 regulation of immune response, GO:0006958 complement activation, classical pathway, and GO:0038096 Fc-gamma receptor signaling pathway involved in phagocytosis. KEGG pathway analysis demonstrated that DEIRGs were enriched in 65 KEGG pathways (Table S4). The bar plot shows the top 12 most statistically significant KEGG pathways, including Cytokine-cytokine receptor interaction, Viral protein interaction with cytokine and cytokine receptor, Rheumatoid arthritis, Chemokine signaling pathway, Neuroactive ligand-receptor interaction, JAK-STAT signaling pathway, Hematopoietic cell lineage, Th17 cell differentiation, Natural killer cell mediated cytotoxicity, IL-17 signaling pathway, Graft-versus-host disease, and Intestinal immune network for IgA production. DEIRGs were most enriched in the cytokine-cytokine receptor interaction pathway (Figure 3A). The dot plot shows the top 10 KEGG pathways (Figure 3B). Furthermore, we constructed the DEIRG-pathway network using Cytoscape 3.6.1 (Figure 3C). As shown in Figure 3C, both up- and down-regulated DEIRGs were enriched in the 65 KEGG pathways.
Figure 2.
GO analysis of DEIRGs in LUAD and LUSC. A. The inner circle indicates the statistically significant GO terms (log10-adjusted P values). The outer circle indicates the logFC of DEIRGs in the GO terms. Blue dots in the GO terms indicate down-regulated DEIRGs. Red dots in the GO terms indicate up-regulated DEIRGs. B. GO Chord indicates the relationship between the top 30 DEIRGs associated with eight enriched GO terms. C. The top eight GO terms.
Figure 3.
KEGG pathway analysis of DEIRGs in LUAD and LUSC. A. The bar plot of the top 12 KEGG pathways which were enriched in DEIRGs. B. The dot plot of the top 10 KEGG pathways which were enriched in DEIRGs. C. DEIRG-mediated KEGG pathway network. Green diamonds represent the KEGG pathways. Red circles represent the up-regulated DEIRGs; Blue circles represent the down-regulated DEIRGs.
Cox prediction model based on DEIRGs
We conducted Kaplan-Meier curve analysis to reveal the prognostic value of DEIRGs. We performed univariate and multivariate Cox regression analyses to establish a Cox prediction model. Screening by univariate Cox regression analysis yielded 33 OS-related DEIRGs (P < 0.05). Subsequently, these 33 OS-related DEIRGs were used to conduct the multivariate Cox regression analysis. Finally, 15 DEIRGs were selected to establish a Cox prediction model. The riskScore was calculated from the weighted coefficients as follows: riskScore value = (0.09607 × THBS1 expression + 0.03865 × S100P expression + 0.11711 × S100A16 expression + 0.07232 × DKK1 expression + (-0.14257) × IL33 expression + 0.16969 × CD70 expression + 0.20890 × DLL4 expression + (-0.20366) × NRTN expression + 0.12007 × PDGFB expression + 0.10827 × STC2 expression + 0.13460 × VGF expression + (-0.31152) × GCGR expression + 0.10294 × HTR3A expression + 0.13890 × LGR4 expression + (-0.29843) × SHC3 expression). The results of the multivariate Cox regression analysis are shown in Table S5. Based on the median riskScore value, 887 LUAD and LUSC specimens with complete survival status and time data were classified into low- (n = 444) and high-risk (n = 443) groups. Figure 4A illustrates that the high-risk group had a notably poorer prognosis than the low-risk group (P = 4.89e-09). The area under the receiver operating characteristic curve showed that the Cox prediction model based on DEIRGs achieved good accuracy in the monitoring of survival (Figure 4B). The riskScore plot and the survival time and status plot are shown in Figure 4C and 4D, respectively. The heatmap shows the expression of the 15 DEIRGs in the high- and low-risk groups (Figure 4E).
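The riskScore formula above is a linear combination of expression values, which can be written as a short Python sketch. The coefficients are copied from the formula; the function name and the example expression profile are hypothetical, and the scale of the expression values is assumed to match the one used when the model was fitted.

```python
# Coefficients of the 15-DEIRG Cox prediction model, as given in the text.
COEFFICIENTS = {
    "THBS1": 0.09607,
    "S100P": 0.03865,
    "S100A16": 0.11711,
    "DKK1": 0.07232,
    "IL33": -0.14257,
    "CD70": 0.16969,
    "DLL4": 0.20890,
    "NRTN": -0.20366,
    "PDGFB": 0.12007,
    "STC2": 0.10827,
    "VGF": 0.13460,
    "GCGR": -0.31152,
    "HTR3A": 0.10294,
    "LGR4": 0.13890,
    "SHC3": -0.29843,
}


def risk_score(expression):
    """Sum of coefficient x expression over the 15 signature genes."""
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())


# Hypothetical expression profile: every gene set to 1.0 for illustration.
example_profile = {gene: 1.0 for gene in COEFFICIENTS}
example_score = risk_score(example_profile)
print(round(example_score, 5))
```

Genes with a negative coefficient (IL33, NRTN, GCGR, SHC3) lower the riskScore as their expression increases, whereas the other eleven genes raise it.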
Figure 4.
Prognostic model based on DEIRGs in LUAD and LUSC. A. Kaplan-Meier analyses of OS in patients with LUAD and LUSC based on the 15-DEIRG signature. B. ROC analysis of the 15-DEIRG signature. C. The plot of the riskScore based on the prediction model in high- versus low-risk groups. D. The plot of survival status based on the prediction model in low- versus high-risk groups. Red and green dots represent the high- and low-risk groups, respectively. E. The heatmap of DEIRGs in the prediction model in the high- and low-risk groups.
Prognostic value of DEIRGs and construction of the DEIRG-related DETF regulatory network
Screening by univariate Cox regression analysis identified 33 OS-related DEIRGs, including 11 low-risk DEIRGs and 22 high-risk DEIRGs (P < 0.05) (Figure 5A). To further explore the mechanism of OS-related DEIRGs, we constructed the DEIRG-related DETF network. The network showed that 21 DETFs were associated with 12 DEIRGs. THBS1, HNF4G, DLL4, STC2, IL11, and S100A16 were high-risk DEIRGs in the network. In contrast, ADRB2, IL33, LTB4R2, LTB4R, VIPR1, and SHC3 were low-risk DEIRGs in the network. As shown in Figure 5B, the relationships between DEIRG DLL4 and DETF CENPA, DEIRG VIPR1 and DETF NCAPG, DEIRG DLL4 and DETF RARG, DEIRG DLL4 and DETF SNAI2, DEIRG DLL4 and DETF TBL1XR1, and DEIRG DLL4 and DETF TP63 were negative regulatory relationships in the network. The relationships between DEIRGs and DETFs are shown in Table S6. Univariate Cox regression analyses indicated that pathological T stage, pathological TNM stage, riskScore, and pathological N stage were related to OS (P < 0.001). The pathological M stage was related to OS (P < 0.05) (Figure 5C). Multivariate Cox independent prognosis analysis indicated that pathological T stage and riskScore could be perceived as independent prognostic factors for patients with LUAD and LUSC (P < 0.05) (Table S7; Figure 5D).
Figure 5.
Prognostic value of the 15-DEIRG signature in LUAD and LUSC. A. Forest plot of prognosis-related DEIRGs. B. DEIRG-related DETF regulatory network. Green diamonds indicate DETFs. Blue circles represent downregulated DEIRGs. Red circles represent upregulated DEIRGs. Solid and dashed lines represent the positive and negative regulatory relationships, respectively. C. Forest plot of the univariate independent prognosis analysis. D. Forest plot of the multivariate independent prognosis analysis. Red in the forest plot indicates that the clinicopathological characteristic could be used as a high-risk factor. Green in the forest plot indicates that the clinicopathological characteristic could be used as a low-risk factor.
Correlation analysis of DEIRGs and clinical pathological characteristics in patients with LUAD and LUSC
We performed a correlation analysis between the DEIRGs in the predictive model and the clinical characteristics of LUAD and LUSC specimens (pathological TNM stage, age, pathological T stage, sex, pathological N stage, pathological M stage). Nine DEIRGs exhibited significant differences in clinical features (P < 0.05). The expression levels of CD70, GCGR, and STC2 were significantly different in patients with lung cancer at pathological M0 stage compared with those at pathological M1 stage (P < 0.05) (Figure 6A-C). The expression levels of DLL4, SHC3, STC2, and S100A16 were significantly different between male and female patients with lung cancer (P < 0.05) (Figure 6D-G). Differences in the expression levels of IL33, PDGFB, and SHC3 were statistically significant between pathological N0 stage and pathological N1-N3 stage (P < 0.05) (Figure 6H-J). DLL4 expression was significantly different in patients with lung cancer between pathological T3-T4 stage and pathological T1-T2 stage (P = 0.045) (Figure 6K). The expression of S100A16 was significantly different between patients with lung cancer aged ≤ 65 and > 65 years (P = 0.011) (Figure 6L). Differences in the expression levels of S100A16 and S100P were statistically significant between pathological TNM stages III&IV and pathological TNM stages I&II (P < 0.05) (Figure 6M, 6N). The riskScore was significantly different between pathological N0 and N1-N3 stages and between pathological TNM stages III&IV and I&II (P < 0.05) (Figure 6O, 6P) (Table S8).
Figure 6.
Comparison of DEIRG expression levels in different pathological features in LUAD and LUSC. A-C. Differences in the expression of DEIRGs between the pathological M0/M1 stages. D-G. Differences in the expression of DEIRGs between the sexes (male/female). H-J. Differences in the expression of DEIRGs between the pathological N0/N1-N3 stages. K. Differences in the expression of DEIRGs between the pathological T1-T2/T3-T4 stages. L. Differences in the expression of DEIRGs in terms of age (≤ 65/> 65 years). M, N. Differences in the expression of DEIRGs between the pathological TNM stages (I&II/III&IV). O, P. Differences in the riskScore of DEIRGs in pathological N0/N1-N3 stages and pathological TNM stages (I&II/III&IV).
Validation of prognosis-related DEIRGs
We used the Oncomine and TIMER 2.0 databases to validate the mRNA expression level of prognosis-related DEIRGs. The HPA database was used to verify the protein expression level of prognosis-related DEIRGs. Figure 7A shows the heatmap of the mRNA expression level of the 15 DEIRGs in 20 cancer types. As shown in Figure 7A, S100P, S100A16, DKK1, VGF, HTR3A, and LGR4 showed high mRNA expression. Figure 7B shows the protein expression levels of THBS1, S100P, S100A16, DLL4, PDGFB, LGR4, and SHC3 in LUAD, LUSC, and normal tissues. As shown in Figure 7B, the expression levels of S100P, S100A16, and LGR4 were higher in LUAD and LUSC tissues than in normal tissues. Compared with the normal tissues, the expression levels of THBS1, DLL4, PDGFB, and SHC3 were lower in LUSC and LUAD tissues, which is consistent with our results. Figure 8 shows the expression levels of prognosis-related DEIRGs in 20 cancer types. As shown in Figure 8, the mRNA expression levels of S100P, S100A16, NRTN, VGF, GCGR, HTR3A, and LGR4 were significantly higher in LUAD and LUSC tissues than in normal tissues (Figure 8A-G). Compared with the normal tissues, the mRNA expression levels of THBS1, DLL4, IL33, PDGFB, and SHC3 were significantly lower in LUAD and LUSC tissues (Figure 8H-L).
Figure 7.
Validation of prognosis-related DEIRGs of NSCLC at the mRNA and protein levels. A. The heatmap of the gene summary of mRNA expression levels in 20 cancer types from the Oncomine database. Red indicates up-regulation; blue indicates down-regulation. B. The protein expression levels of THBS1, S100P, S100A16, DLL4, PDGFB, LGR4, and SHC3 in NSCLC tissues and normal tissues in the immunohistology assay from the HPA database.
Figure 8.
Validation of prognosis-related DEIRGs in the TIMER 2.0 database. A-F. The mRNA expression levels of S100P, S100A16, NRTN, VGF, GCGR, and HTR3A across cancer types. G-L. The mRNA expression levels of LGR4, THBS1, DLL4, IL33, PDGFB, and SHC3 across cancer types. Red in the box plot indicates tumor tissues. Blue in the box plot indicates normal tissues. * represents P < 0.05, ** represents P < 0.01, *** represents P < 0.001.
Kaplan-Meier plotter analysis of prognosis-related DEIRGs
We used the Kaplan-Meier plotter database to perform the survival analysis of prognosis-related DEIRGs. Figure 9 showed that 13 prognosis-related DEIRGs were associated with overall survival (P < 0.05), while the high-risk DEIRG LGR4 and the low-risk DEIRG NRTN were not correlated with overall survival in NSCLC (P > 0.05).
Figure 9.
Kaplan-Meier plotter survival analysis of 15 OS-related DEIRGs. A-K. The survival analysis of 11 high-risk DEIRGs. L-O. The survival analysis of 4 low-risk DEIRGs.
Correlation analysis of DEIRGs in the prediction model and infiltration of immunocytes
To further investigate whether the DEIRGs identified in the prediction model reflected the status of lung cancer IME, we performed a correlation analysis between immunocyte infiltration and the DEIRGs riskScore. As shown in Figure 10, B cells had a negative relationship with the DEIRGs riskScore (P < 0.05). In contrast, neutrophils had a positive relationship with the DEIRGs riskScore (P < 0.05), which may provide a novel horizon for investigating the lung cancer IME.
Figure 10.

Correlation analysis of the prognostic value and six types of immunocyte infiltration. A. B-cells. B. CD4-T cells. C. Dendritic cells. D. Macrophages. E. CD8-T cells. F. Neutrophils.
Discussion
Recently, the great significance of IRGs in the development of tumors and immunology has been recognized [30-33]. However, RNA sequencing associated with DEIRGs and the construction of the prediction model has not been explored thus far. Therefore, it is of crucial importance to verify underlying biomarkers for predicting the OS of patients with LUAD and LUSC. In the present study, we first used the limma R package to identify DEGs, followed by analyses of the ImmPort and Cistrome Cancer databases to obtain DEIRGs and DETFs, respectively. Subsequently, the DEIRG-mediated DETF regulatory network was established. Using univariate and multivariate Cox regression analyses, a Cox prediction model based on DEIRGs was established to determine whether these DEIRGs could be used as independent prognostic factors in patients with LUAD and LUSC. Furthermore, immunocyte estimation in the TIMER database was used to perform the correlation analysis between immunocytes and the riskScore of the prediction model. In this study, a prediction model associated with DEIRGs was adopted for monitoring immunocyte infiltration and estimating the prognosis of LUAD and LUSC.
In recent years, the identification of potential molecular targets associated with the tumor IME for the investigation of promising prognostic biomarkers related to OS has attracted considerable attention [26,34-37]. A study indicated that a prediction model based on four IRGs could estimate the prognosis of LUAD [30]. Recently, it was demonstrated that a prognostic genetic signature associated with the tumor IME could predict the survival of patients with NSCLC [38]. Another study revealed that an immune response related to mutational signatures could predict the prognosis of NSCLC [39]. PD-L1 expression could be used as a potential prognostic biomarker for predicting responses to immunotherapy and the prognosis of NSCLC [40]. The molecular expression subtypes of LUSC and LUAD indicate that determining the tumor expression subtype could be used as a potential prognostic biomarker for immunotherapy [41]. However, the tumor IME related to IRGs and potential prognostic biomarkers based on the prediction model in LUAD and LUSC remain unknown.
In this study, we initially obtained DEGs, DEIRGs, and DETFs. Subsequently, a DEIRG-related DETF regulatory network was constructed. Furthermore, a prediction model based on DEIRGs was established to detect the DEIRGs that could be perceived as independent prognostic factors. In contrast to previous studies, the present study first constructed a Cox prediction model based on DEIRGs and then built a DEIRG-mediated DETF regulatory network to investigate the mechanism of DEIRGs. Meanwhile, the TIMER database provided an estimation of the infiltration of immunocytes.
To further investigate the underlying mechanism of DEIRGs at the molecular level, we constructed a DEIRG-mediated DETF regulatory network, which revealed significant DETFs regulating DEIRGs in the network. According to the network, we hypothesized that ETS1 and MITF could regulate THBS1, and that ETV1 and KAT2B could regulate SHC3. In the network, VIPR1 was down-regulated in both LUAD and LUSC. Consistent with our study, a recent study showed that VIPR1 plays an important role in patients with LUAD as a tumor suppressor [42]. Based on the network, we hypothesized that the TFs CBX7, TCF21, and NCAPG could regulate VIPR1, which provides a novel insight for exploring the DEIRG VIPR1 at the molecular level.
In the present study, we conducted univariate and multivariate Cox regression analyses and established a Cox prediction model based on DEIRGs. The 15-DEIRG signature provided a novel horizon to forecast the OS of patients with LUAD and LUSC. Furthermore, an independent prognosis analysis was performed, revealing that the 15 DEIRGs (GCGR, HTR3A, VGF, NRTN, CD70, SHC3, DKK1, STC2, LGR4, IL33, DLL4, PDGFB, S100P, THBS1, and S100A16) could be perceived as independent prognostic factors to forecast the survival of patients with LUSC and LUAD. CD70, which acts as a tumor necrosis factor, may play a pivotal role in forecasting the prognosis of patients with malignant pleural mesothelioma [43]. CD70 is a potential molecular target, which plays a significant role in cancer immunotherapy [44]. STC2 may act as a novel target for potential interventions against glioblastoma [45]. LGR4/GPR48 could be used as a potential unique target in regulating TLR2/4-mediated autoimmune diseases [46]. Compared with previous investigations, our study was the first to identify a 15-DEIRG signature that could be used as an independent prognostic factor in patients with LUAD and LUSC. This evidence may provide a new horizon for forecasting the survival of patients with LUAD and LUSC.
According to the correlation analysis between the DEIRGs in the prediction model and the clinical features of LUAD and LUSC, a total of nine DEIRGs were associated with the clinical characteristics. More significantly, our study was the first to assess the correlation between the riskScore of the prediction model and the infiltration of six types of immunocytes in TIMER, which may provide important insight into monitoring the status of the tumor IME.
Conclusions
Our study identified a Cox prediction model based on DEIRGs from the TCGA, ImmPort, and Cistrome Cancer databases for predicting the OS of patients with NSCLC. Furthermore, multivariate Cox regression analysis was performed to verify that the 15 prognosis-related DEIRGs could be used as independent prognostic factors. Moreover, the Oncomine, HPA, TIMER, and Kaplan-Meier plotter databases validated the prognosis-related DEIRGs at the mRNA and protein levels. Finally, the immunocyte infiltration analysis provided a new insight into monitoring the status of the NSCLC IME.
Acknowledgements
We thank the Charlesworth Group’s author services for professional language editing. This work was supported by grants from the Major Scientific and Technological Innovation Project of Shandong Province (grant no. 2018CXGC1212), the Science and Technology Foundation of Shandong Province (grant no. 2014GSF118084), the CSCO-Qilu Cancer Research Fund (grant no. Y-Q201802-014), the Medical and Health Technology Innovation Plan of Jinan City (grant no. 201805002) and the National Natural Science Foundation of China (grant no. 81372333).
Disclosure of conflict of interest
None.
Abbreviations
- DEGs: differentially expressed genes
- DEIRGs: differentially expressed immune-related genes
- DETF: differentially expressed transcription factor
- GO: Gene Ontology
- IME: immune microenvironment
- ImmPort: Immunology Database and Analysis Portal
- IRG: immune-related gene
- KEGG: Kyoto Encyclopedia of Genes and Genomes
- LUAD: lung adenocarcinoma
- LUSC: lung squamous cell carcinoma
- NSCLC: non-small cell lung cancer
- OS: overall survival
- TCGA: The Cancer Genome Atlas
- TF: transcription factor
- TIMER: Tumor Immune Estimation Resource
Supporting Information
References
- 1. Torre LA, Bray F, Siegel RL, Ferlay J, Lortet-Tieulent J, Jemal A. Global cancer statistics, 2012. CA Cancer J Clin. 2015;65:87–108. doi: 10.3322/caac.21262.
- 2. Chen W, Zheng R, Baade PD, Zhang S, Zeng H, Bray F, Jemal A, Yu XQ, He J. Cancer statistics in China, 2015. CA Cancer J Clin. 2016;66:115–132. doi: 10.3322/caac.21338.
- 3. Travis WD, Brambilla E, Noguchi M, Nicholson AG, Geisinger K, Yatabe Y, Ishikawa Y, Wistuba I, Flieder DB, Franklin W, Gazdar A, Hasleton PS, Henderson DW, Kerr KM, Petersen I, Roggli V, Thunnissen E, Tsao M. Diagnosis of lung cancer in small biopsies and cytology: implications of the 2011 international association for the study of lung cancer/American thoracic society/European respiratory society classification. Arch Pathol Lab Med. 2013;137:668–684. doi: 10.5858/arpa.2012-0263-RA.
- 4. Travis W, Brambilla E, Noguchi M, Nicholson A, Geisinger K, Yatabe Y, Ishikawa Y, Wistuba I, Flieder D, Franklin W, Gazdar A, Hasleton P, Henderson D, Kerr K, Nakatani Y, Petersen I, Roggli V, Thunnissen E, Tsao M. Diagnosis of lung adenocarcinoma in resected specimens: implications of the 2011 international association for the study of lung cancer/American thoracic society/European respiratory society classification. Arch Pathol Lab Med. 2013;137:685–705. doi: 10.5858/arpa.2012-0264-RA.
- 5. Cagle PT, Allen TC, Olsen RJ. Lung cancer biomarkers: present status and future developments. Arch Pathol Lab Med. 2013;137:1191–1198. doi: 10.5858/arpa.2013-0319-CR.
- 6. Goldstraw P, Crowley J, Chansky K, Giroux D, Groome P, Rami-Porta R, Postmus P, Rusch V, Sobin L; International Association for the Study of Lung Cancer International Staging Committee. The IASLC lung cancer staging project: proposals for the revision of the TNM stage groupings in the forthcoming (seventh) edition of the TNM classification of malignant tumours. J Thorac Oncol. 2007;2:706–714. doi: 10.1097/JTO.0b013e31812f3c1a.
- 7. Shi X, Li R, Dong X, Chen AM, Liu X, Lu D, Feng S, Wang H, Cai K. IRGS: an immune-related gene classifier for lung adenocarcinoma prognosis. J Transl Med. 2020;18:55. doi: 10.1186/s12967-020-02233-y.
- 8. Duan S, Wang P, Liu F, Huang H, An W, Pan S, Wang X. Novel immune-risk score of gastric cancer: a molecular prediction model combining the value of immune-risk status and chemosensitivity. Cancer Med. 2019;8:2675–2685. doi: 10.1002/cam4.2077.
- 9. Wang S, Yang L, Ci B, Maclean M, Gerber DE, Xiao G, Xie Y. Development and validation of a nomogram prognostic model for SCLC patients. J Thorac Oncol. 2018;13:1338–1348. doi: 10.1016/j.jtho.2018.05.037.
- 10. Chang CF, Huang PW, Chen JS, Chen YY, Lu CH, Chang PH, Hung YS, Chou WC. Prognostic factors for advanced pancreatic cancer treated with gemcitabine Plus S-1: retrospective analysis and development of a prognostic model. Cancers (Basel). 2019;11:57. doi: 10.3390/cancers11010057.
- 11. Liu A, Hou F, Qin Y, Song G, Xie B, Xu J, Jiao W. Predictive value of a prognostic model based on pathologic features in lung invasive adenocarcinoma. Lung Cancer. 2019;131:14–22. doi: 10.1016/j.lungcan.2019.03.002.
- 12. Ahlberg J, Giragossian C, Li H, Myzithras M, Raymond E, Caviness G, Grimaldi C, Brown SE, Perez R, Yang D, Kroe-Barrett R, Joseph D, Pamulapati C, Coble K, Ruus P, Woska JR, Ganesan R, Hansel S, Mbow ML. Retrospective analysis of model-based predictivity of human pharmacokinetics for anti-IL-36R monoclonal antibody MAB92 using a rat anti-mouse IL-36R monoclonal antibody and RNA expression data (FANTOM5). MAbs. 2019;11:956–964. doi: 10.1080/19420862.2019.1615345.
- 13. Lu Y, Zhou X, Liu Z, Wang B, Wang W, Fu W. Assessment for risk status of colorectal cancer patients: a novel prediction model based on immune-related genes. DNA Cell Biol. 2020;39:958–964. doi: 10.1089/dna.2019.5195.
- 14. Zhao X, Liu J, Liu S, Yang F, Chen E. Construction and validation of an immune-related prognostic model based on TP53 status in colorectal cancer. Cancers (Basel). 2019;11. doi: 10.3390/cancers11111722.
- 15. Song Q, Shang J, Yang Z, Zhang L, Zhang C, Chen J, Wu X. Identification of an immune signature predicting prognosis risk of patients in lung adenocarcinoma. J Transl Med. 2019;17:70. doi: 10.1186/s12967-019-1824-4.
- 16. Bhattacharya S, Andorf S, Gomes L, Dunn P, Schaefer H, Pontius J, Berger P, Desborough V, Smith T, Campbell J, Thomson E, Monteiro R, Guimaraes P, Walters B, Wiser J, Butte AJ. ImmPort: disseminating data to the public for the future of immunology. Immunol Res. 2014;58:234–239. doi: 10.1007/s12026-014-8516-1.
- 17. Mei S, Meyer CA, Zheng R, Qin Q, Wu Q, Jiang P, Li B, Shi X, Wang B, Fan J, Shih C, Brown M, Zang C, Liu XS. Cistrome cancer: a web resource for integrative gene regulation modeling in cancer. Cancer Res. 2017;77:e19–e22. doi: 10.1158/0008-5472.CAN-17-0327.
- 18. Dennis G Jr, Sherman BT, Hosack DA, Yang J, Gao W, Lane HC, Lempicki RA. DAVID: database for annotation, visualization, and integrated discovery. Genome Biol. 2003;4:P3.
- 19. Walter W, Sánchez-Cabo F, Ricote M. GOplot: an R package for visually combining expression data with functional analysis. Bioinformatics. 2015;31:2912–2914. doi: 10.1093/bioinformatics/btv300.
- 20. Yu G, Wang LG, Han Y, He QY. clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS. 2012;16:284–287. doi: 10.1089/omi.2011.0118.
- 21. Reimand J, Isserlin R, Voisin V, Kucera M, Tannus-Lopes C, Rostamianfar A, Wadi L, Meyer M, Wong J, Xu C, Merico D, Bader GD. Pathway enrichment analysis and visualization of omics data using g:Profiler, GSEA, Cytoscape and EnrichmentMap. Nat Protoc. 2019;14:482–517. doi: 10.1038/s41596-018-0103-9.
- 22. Rhodes DR, Kalyana-Sundaram S, Mahavisno V, Varambally R, Yu J, Briggs BB, Barrette TR, Anstet MJ, Kincead-Beal C, Kulkarni P, Varambally S, Ghosh D, Chinnaiyan AM. Oncomine 3.0: genes, pathways, and networks in a collection of 18,000 cancer gene expression profiles. Neoplasia. 2007;9:166–180. doi: 10.1593/neo.07112.
- 23. Ponten F, Schwenk JM, Asplund A, Edqvist PH. The human protein atlas as a proteomic resource for biomarker discovery. J Intern Med. 2011;270:428–446. doi: 10.1111/j.1365-2796.2011.02427.x.
- 24. Yang S, Liu T, Nan H, Wang Y, Chen H, Zhang X, Zhang Y, Shen B, Qian P, Xu S, Sui J, Liang G. Comprehensive analysis of prognostic immune-related genes in the tumor microenvironment of cutaneous melanoma. J Cell Physiol. 2020;235:1025–1035. doi: 10.1002/jcp.29018.
- 25. Sun CC, Zhou Q, Hu W, Li SJ, Zhang F, Chen ZL, Li G, Bi ZY, Bi YY, Gong FY, Bo T, Yuan ZP, Hu WD, Zhan BT, Zhang Q, Tang QZ, Li DJ. Transcriptional E2F1/2/5/8 as potential targets and transcriptional E2F3/6/7 as new biomarkers for carcinoma. Aging (Albany NY). 2018;10:973–987. doi: 10.18632/aging.101441.
- 26. Pan JH, Zhou H, Cooper L, Huang JL, Zhu SB, Zhao XX, Ding H, Pan YL, Rong L. LAYN is a prognostic biomarker and correlated with immune infiltrates in gastric and colon cancers. Front Immunol. 2019;10:6. doi: 10.3389/fimmu.2019.00006.
- 27. Sun CC, Li SJ, Hu W, Zhang J, Zhou Q, Liu C, Li LL, Songyang YY, Zhang F, Chen ZL, Li G, Bi ZY, Bi YY, Gong FY, Bo T, Yuan ZP, Hu WD, Zhan BT, Zhang Q, He QQ, Li DJ. Comprehensive analysis of the expression and prognosis for E2Fs in human breast cancer. Mol Ther. 2019;27:1153–1165. doi: 10.1016/j.ymthe.2019.03.019. [Retracted]
- 28. Gou R, Zhu L, Zheng M, Guo Q, Hu Y, Li X, Liu J, Lin B. Annexin A8 can serve as potential prognostic biomarker and therapeutic target for ovarian cancer: based on the comprehensive analysis of Annexins. J Transl Med. 2019;17:275. doi: 10.1186/s12967-019-2023-z.
- 29. Li T, Fan J, Wang B, Traugh N, Chen Q, Liu JS, Li B, Liu XS. TIMER: a web server for comprehensive analysis of tumor-infiltrating immune cells. Cancer Res. 2017;77:e108–e110. doi: 10.1158/0008-5472.CAN-17-0307.
- 30. Zhang M, Zhu K, Pu H, Wang Z, Zhao H, Zhang J, Wang Y. An immune-related signature predicts survival in patients with lung adenocarcinoma. Front Oncol. 2019;9:1314. doi: 10.3389/fonc.2019.01314.
- 31. Sundar R, Qamra A, Tan ALK, Zhang S, Ng CCY, Teh BT, Lee J, Kim KM, Tan P. Transcriptional analysis of immune genes in Epstein-Barr virus-associated gastric cancer and association with clinical outcomes. Gastric Cancer. 2018;21:1064–1070. doi: 10.1007/s10120-018-0851-9.
- 32. Lin P, Guo YN, Shi L, Li XJ, Yang H, He Y, Li Q, Dang YW, Wei KL, Chen G. Development of a prognostic index based on an immunogenomic landscape analysis of papillary thyroid cancer. Aging (Albany NY). 2019;11:480–500. doi: 10.18632/aging.101754.
- 33. Yang S, Wu Y, Deng Y, Zhou L, Yang P, Zheng Y, Zhang D, Zhai Z, Li N, Hao Q, Song D, Kang H, Dai Z. Identification of a prognostic immune signature for cervical cancer to predict survival and response to immune checkpoint inhibitors. Oncoimmunology. 2019;8:e1659094. doi: 10.1080/2162402X.2019.1659094.
- 34. Taube JM, Galon J, Sholl LM, Rodig SJ, Cottrell TR, Giraldo NA, Baras AS, Patel SS, Anders RA, Rimm DL, Cimino-Mathews A. Implications of the tumor immune microenvironment for staging and therapeutics. Mod Pathol. 2018;31:214–234. doi: 10.1038/modpathol.2017.156.
- 35. Wagner NB, Weide B, Gries M, Reith M, Tarnanidis K, Schuermans V, Kemper C, Kehrel C, Funder A, Lichtenberger R, Sucker A, Herpel E, Holland-Letz T, Schadendorf D, Garbe C, Umansky V, Utikal J, Gebhardt C. Tumor microenvironment-derived S100A8/A9 is a novel prognostic biomarker for advanced melanoma patients and during immunotherapy with anti-PD-1 antibodies. J Immunother Cancer. 2019;7:343. doi: 10.1186/s40425-019-0828-1.
- 36. Senbabaoglu Y, Gejman RS, Winer AG, Liu M, Van Allen EM, de Velasco G, Miao D, Ostrovnaya I, Drill E, Luna A, Weinhold N, Lee W, Manley BJ, Khalil DN, Kaffenberger SD, Chen Y, Danilova L, Voss MH, Coleman JA, Russo P, Reuter VE, Chan TA, Cheng EH, Scheinberg DA, Li MO, Choueiri TK, Hsieh JJ, Sander C, Hakimi AA. Tumor immune microenvironment characterization in clear cell renal cell carcinoma identifies prognostic and immunotherapeutically relevant messenger RNA signatures. Genome Biol. 2016;17:231. doi: 10.1186/s13059-016-1092-z.
- 37. Lazăr DC, Avram MF, Romosan I, Cornianu M, Tăban S, Goldis A. Prognostic significance of tumor immune microenvironment and immunotherapy: novel insights and future perspectives in gastric cancer. World J Gastroenterol. 2018;24:3583–3616. doi: 10.3748/wjg.v24.i32.3583.
- 38. Li J, Li X, Zhang C, Zhang C, Wang H. A signature of tumor immune microenvironment genes associated with the prognosis of nonsmall cell lung cancer. Oncol Rep. 2020;43:795–806. doi: 10.3892/or.2020.7464.
- 39. Chen H, Chong W, Teng C, Yao Y, Wang X, Li X. The immune response-related mutational signatures and driver genes in non-small-cell lung cancer. Cancer Sci. 2019;110:2348–2356. doi: 10.1111/cas.14113.
- 40. Takada K, Toyokawa G, Shoji F, Okamoto T, Maehara Y. The significance of the PD-L1 expression in non-small-cell lung cancer: trenchant double swords as predictive and prognostic markers. Clin Lung Cancer. 2018;19:120–129. doi: 10.1016/j.cllc.2017.10.014.
- 41. Faruki H, Mayhew GM, Serody JS, Hayes DN, Perou CM, Lai-Goldman M. Lung adenocarcinoma and squamous cell carcinoma gene expression subtypes demonstrate significant differences in tumor immune landscape. J Thorac Oncol. 2017;12:943–953. doi: 10.1016/j.jtho.2017.03.010.
- 42. Zhao L, Yu Z, Zhao B. Mechanism of VIPR1 gene regulating human lung adenocarcinoma H1299 cells. Med Oncol. 2019;36:91. doi: 10.1007/s12032-019-1312-y.
- 43. Inaguma S, Lasota J, Czapiewski P, Langfort R, Rys J, Szpor J, Waloszczyk P, Okon K, Biernat W, Schrump DS, Hassan R, Kasai K, Miettinen M, Ikeda H. CD70 expression correlates with a worse prognosis in malignant pleural mesothelioma patients via immune evasion and enhanced invasiveness. J Pathol. 2020;250:205–216. doi: 10.1002/path.5361.
- 44. Jacobs J, Deschoolmeester V, Zwaenepoel K, Rolfo C, Silence K, Rottey S, Lardon F, Smits E, Pauwels P. CD70: an emerging target in cancer immunotherapy. Pharmacol Ther. 2015;155:1–10. doi: 10.1016/j.pharmthera.2015.07.007.
- 45. Tarassishin L, Lim J, Weatherly DB, Angeletti RH, Lee SC. Interleukin-1-induced changes in the glioblastoma secretome suggest its role in tumor progression. J Proteomics. 2014;99:152–168. doi: 10.1016/j.jprot.2014.01.024.
- 46. Du B, Luo W, Li R, Tan B, Han H, Lu X, Li D, Qian M, Zhang D, Zhao Y, Liu M. Lgr4/Gpr48 negatively regulates TLR2/4-associated pattern recognition and innate immunity by targeting CD14 expression. J Biol Chem. 2013;288:15131–15141. doi: 10.1074/jbc.M113.455535.