Scientific Reports. 2020 Sep 23;10:15495. doi: 10.1038/s41598-020-72488-4

Molecular subtyping of glioblastoma based on immune-related genes for prognosis

Xueran Chen 1,2,✉,#, Xiaoqing Fan 3,4,#, Chenggang Zhao 1,5, Zhiyang Zhao 1,5, Lizhu Hu 1,5, Delong Wang 3,4, Ruiting Wang 3,4, Zhiyou Fang 1,2
PMCID: PMC7511296  PMID: 32968155

Abstract

Glioblastoma (GBM) is associated with an increasing mortality and morbidity and is considered as an aggressive brain tumor. Recently, extensive studies have been carried out to examine the molecular biology of GBM, and the progression of GBM has been suggested to be correlated with the tumor immunophenotype in a variety of studies. Samples in the current study were extracted from the ImmPort and TCGA databases to identify immune-related genes affecting GBM prognosis. A total of 92 immune-related genes displaying a significant correlation with prognosis were mined, and a shrinkage estimate was conducted on them. Among them, the 14 most representative genes showed a marked correlation with patient prognosis, and LASSO and stepwise regression analysis was carried out to further identify the genes for the construction of a predictive GBM prognosis model. Then, samples in training and test cohorts were incorporated into the model and divided to evaluate the efficiency, stability, and accuracy of the model to predict and classify the prognosis of patients and to identify the relevant immune features according to the median value of RiskScore (namely, Risk-H and Risk-L). In addition, the constructed model was able to instruct clinicians in diagnosis and prognosis prediction for various immunophenotypes.

Subject terms: Cancer microenvironment, CNS cancer, Tumour biomarkers, Tumour immunology

Introduction

Glioblastoma (GBM), an aggressive primary malignancy in the central nervous system, has a median survival time of 12–15 months and a 5-year survival rate of < 5%1,2. According to the Clinical Practice Guideline formulated by the American National Comprehensive Cancer Network, chemotherapy is still the preferred choice for stage III–IV GBM3. Currently, there are various chemotherapy regimens, but some patients do not benefit from chemotherapy, and imaging examination could be applied to examine cancer development. In addition, the distinct long-term clinical outcomes may be detected based on tumor heterogeneities among these cases with the same pathological subtype4. However, some problems remain to be solved: how to assess tumor heterogeneities prior to treatment for these cases in a non-invasive or less traumatic way, estimate the risk of cancer progression, evaluate tumor response to chemotherapy in individual patients, and to estimate the different long-time overall survival (OS) among groups with different cancer heterogeneities5.

Currently, an immune disorder that can promote tumor genesis has been recognized as the enabling feature in the glioma genesis process6. Glioma cells can remarkably induce an immune response; in some cases, they can subjugate such a response to establish an appropriate microenvironment to promote their development7. Standard treatment cannot achieve a satisfying effect; thus, immunotherapy is being intensively investigated as an additional method8. Meanwhile, some parameters related to immunity have been reported to predict the disease prognosis, which has highlighted the significance of different immune states in identifying glioma outcomes9,10. Nonetheless, immune phenotypes in a glioma microenvironment, together with their relationship with prognosis, are rarely examined systemically.

Biomarkers are able to accurately estimate disease prognosis and patient survival, which are thereby valuable for decision-making in clinical GBM treatment11,12. Recently, an increasing number of studies have suggested that the expression patterns of genes can predict and classify the survival outcomes of GBM patients13. Nonetheless, this proposal has still not been identified as a clinical routine practice, which may be related to the lack of evidence, small sample size, and tremendous data fitting in most studies. Consequently, use of large-scale databases that are accessible to the public and involve the expression patterns of genes, like TCGA, makes it possible to identify the most reliable biomarkers to predict and classify GBM prognosis. In this study, a model to predict the prognosis of GBM was constructed and verified based on immune-related genes, according to the clinical characteristics of patients extracted from the ImmPort and TCGA databases. Our results can help clinicians evaluate the efficacy, predict the disease prognosis, and select the suitable GBM treatment.

Results

Mining of specific immune-related genes based on GBM patient survival and prognostic outcomes

At first, related data were collected based on the ImmPort and TCGA databases, followed by a pre-processing. Then, all immune-related genes and survival data were analyzed using the univariate Cox proportional hazards regression model based on the R survival package coxph function, with the significance level set at p < 0.05 (Supplementary Table S1). Finally, 92 prognosis-specific immune-related genes were mined. The association between the p values for these 92 genes and expression intensities (log2(EXP)), together with hazard ratios (HRs), is presented in Fig. 1A,B.

Figure 1.


Construction of the prognosis prediction model for glioblastoma (GBM) patients by least absolute shrinkage and selection operator (LASSO) analysis. (A) The relationships between the p values of 92 genes and the hazard ratio (HR). (B) The relationships between the p values of 92 genes and the expression levels. Red dots represent significantly different immune-related genes regarding prognosis. (C) The changing trajectory of each independent variable. The horizontal axis represents the log value of the independent variable lambda, and the vertical axis represents the coefficient of the independent variable. With the increase in lambda, the number of independent variable coefficients tending to 0 also increases. (D) Confidence intervals for each lambda. The optimal model is acquired when the lambda is 0.04456.

Altogether, 92 immune-related genes were identified, but most of them were not suitable for clinical detection. Therefore, the number of immune-related genes was reduced while high accuracy was maintained: the 92 genes were narrowed down using least absolute shrinkage and selection operator (LASSO) regression to decrease the number of genes recruited into the risk model. LASSO is a biased (shrinkage) estimator suited to multicollinear data; it selects variables while fitting the model and thereby mitigates the multicollinearity problem in regression analysis. Here, the R package glmnet was utilized for LASSO regression analysis. The coefficient trajectory of each independent variable is presented in Fig. 1C, which shows that the coefficients of most variables shrank toward zero as lambda gradually increased. The model was established by means of tenfold cross-validation. Figure 1D displays the confidence interval for each lambda, which reveals that the optimal model was acquired when lambda was 0.04456. Therefore, this model was selected as the final model, involving 34 immune-related genes (Supplementary Table S2). Next, the R package MASS was used for stepwise regression analysis according to the Akaike information criterion (AIC), and 14 genes were retained for the risk model construction (Supplementary Tables S3 and S4). The formula is presented in the “Methods” section.
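For reference, the two selection steps described above can be written in a standard textbook form. This is a sketch under stated assumptions, not the exact objective used by the software: the symbols δ_i (event indicator), R(t_i) (set of patients still at risk at time t_i), x_i (expression vector of patient i), and k (number of fitted coefficients) are notation introduced here for illustration, tied event times are ignored, and glmnet applies its own scaling to the likelihood term.

```latex
% Cox log partial likelihood (no tied event times)
\ell(\beta) = \sum_{i:\,\delta_i = 1}
  \Bigl[ x_i^{\top}\beta - \log \sum_{j \in R(t_i)} \exp\bigl(x_j^{\top}\beta\bigr) \Bigr]

% LASSO-penalized estimate: a larger \lambda drives more coefficients to exactly zero
\hat{\beta}(\lambda) = \arg\min_{\beta}
  \Bigl\{ -\ell(\beta) + \lambda \sum_{j} \lvert \beta_j \rvert \Bigr\}

% Akaike information criterion minimized by the stepwise search
\mathrm{AIC} = 2k - 2\,\ell(\hat{\beta})
```

Read this way, Fig. 1C traces the components of the penalized estimate as lambda grows, and the stepwise step then trades goodness of fit against the number of retained genes.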

Construction of the model to predict prognosis for GBM patients

Then, all samples in the training set were substituted into the formula to calculate the RiskScore value. The median RiskScore value was used as the threshold to classify patients into high- (Risk-H) and low-risk (Risk-L) groups. Receiver operating characteristic (ROC) analysis was also performed for prognosis classification according to the RiskScore value. The OS of all samples was 1–3 years (Supplementary Fig. S1). As a result (Fig. 2A), the model prediction efficiency for 1–3-year OS was examined, and the average area under the curve (AUC) was as high as 0.793. Moreover, Fig. 2B shows the sample distribution in Risk-H and Risk-L groups for various OS, suggesting no statistically significant differences in 0- and 1-year sample sizes between the two groups. Moreover, the 1.5-year sample size of Risk-H group was remarkably decreased compared with that of Risk-L group, which was more obvious with the OS extension (Fig. 2C). We next extracted the gene expression profile for the clustering analysis using log10 for all expression values. We also used the hierarchical clustering method to calculate the Euclidean distance between different features. Figure 2D shows the results of sample clustering of the training set. As expected, the above-mentioned 14 genes were markedly clustered into high and low expression groups, respectively, and the training set samples were also divided into two groups. Additionally, the RiskScore values between these two subclasses were compared (Fig. 2E).
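The AUC reported above can be read as the probability that a patient who died within the chosen horizon received a higher RiskScore than a patient who survived past it. The sketch below illustrates that interpretation for a single fixed horizon; it treats the outcome as binary and ignores censoring, so it is a simplification and not necessarily the time-dependent ROC procedure used in the study. All scores and labels are made-up values.

```python
def auc(scores, labels):
    """Rank-based AUC: chance that a random positive case outscores a random
    negative case, with ties counted as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical RiskScore values and outcomes (1 = died within the horizon).
risk = [2.1, 1.4, -0.2, 0.3, -1.0, 0.9]
died = [1, 1, 1, 0, 0, 0]

print(round(auc(risk, died), 4))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two outcome groups.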

Figure 2.


Verification of the stability of the prognosis prediction model including 14 immune-related genes of GBM patients in the training set. (A) The 1–3-year overall survival (OS) predicted receiver operating characteristic (ROC) curves of a 14-gene risk model in the training set. (B) The distribution of samples in Risk-H and Risk-L groups of the training set was done using the 14-gene risk model under different OS. (C) The level of Risk-L group/Total sample size with the extension in OS in the training set. (D) The clustering results of the training set samples. Fourteen genes were used for hierarchical clustering. The distance between different features was calculated by a Euclidean distance analysis. These genes clustered into high- and low-expression groups, and samples in the training set were also divided into two groups. (E) Difference in the RiskScore between the two groups, which had been clustered by the expression of 14 genes of training set samples.

To validate the model reliability, the expression patterns of the above 14 genes were extracted from the test cohort and substituted into the model. The RiskScore values of all samples were computed, and the test set data were used to evaluate the model efficacy in predicting OS at 1–3 years, as presented in Supplementary Fig. S2, which also displays the sample distribution in the Risk-H and Risk-L groups at various OS. The difference in the distribution of the 0–1-year sample size between the two groups was not statistically significant. Moreover, the 2-year sample size in the Risk-H group was notably decreased compared with that in the Risk-L group, and this difference became even more obvious as OS extended (Supplementary Fig. S2). Supplementary Figure S2 also shows the results of sample clustering of the test cohort, as well as the different RiskScore values between these two subgroups.

Moreover, we retrieved the GSE74187 data set with prognosis follow-up information from the GEO database. The expression matrix of these 14 genes was extracted from the expression profile, and the risk score of each sample was calculated using the same method. ROC analysis of the risk score indicated that the average AUC at 1, 2, and 3 years was 0.83 (Supplementary Fig. S3A). When samples were divided at the median risk score, the prognosis of the high-risk group was significantly worse than that of the low-risk group (Supplementary Fig. S3B), which was consistent with the training and test sets.

In addition, the expression patterns of 14 genes extracted based on all the above 523 samples were substituted into the model to calculate the RiskScore values to validate the model reliability and stability (Supplementary Fig. S4), which exhibits the results of sample clustering and different RiskScore values between these two subgroups. Overall, the RiskScore model established based on the expression patterns of 14 immune-related genes presented favorable accuracy and stability to identify immunity-related features.

Finally, we plotted the Kaplan–Meier survival curves of the Risk-H and Risk-L groups based on the 14-gene-based risk model in the training (n = 261) and test cohorts (n = 262), and in all the samples (n = 523), separately, as shown in Fig. 3A–C (p < 0.0001, p < 0.001, and p < 0.0001, respectively).

Figure 3.


The Kaplan–Meier survival curve of the 14-gene-based risk model predicting the Risk-H and Risk-L groups in the training set (A, n = 261), test set (B, n = 262), and all samples (C, n = 523).

Functional annotations of immune-related genes and enrichment of signaling pathways specific to prognosis

The gene families of the above 14 genes were first annotated according to the human gene classification in the HGNC database (Supplementary Table S5). The genes were significantly enriched in the galanin receptor and endothelin receptor gene families (p < 0.05). Additionally, the R package clusterProfiler was used for the enrichment analysis of the 14 genes. Supplementary Fig. S5 shows the results of the GO enrichment analysis and Supplementary Table S6 shows the related data, which indicated that most genes were enriched in distinct immune-related signaling pathways and biological processes.

The ssGSEA function of the R package GSVA was used for KEGG functional enrichment. Associations with the RiskScore values were examined based on the pathway enrichment scores among the different samples, yielding a total of 21 KEGG-related pathways (Supplementary Tables S7–S9). These 21 pathways were chosen for clustering analysis in accordance with the sample enrichment results from the training cohort (Fig. 4A). Additionally, the relationship between the enrichment score and the RiskScore value was examined for the two pathways with the highest GSEA enrichment scores (namely, vascular smooth muscle contraction and the JAK-STAT signaling pathway). The sample distribution in the two groups was also explored. We found that the pathway enrichment scores differed between the Risk-H and Risk-L groups (Fig. 4B,C).

Figure 4.


Correlation of RiskScore with signaling pathways. KEGG functional enrichment scores of each sample were analyzed and their correlation with RiskScore was calculated based on the enrichment score of each pathway in each sample. All 21 pathways related to the KEGG pathways are shown. (A) The clustering analysis was conducted according to the enrichment scores. (B) The distribution of JAK-STAT KEGG pathway enrichment scores in Risk-H and Risk-L groups for GBM patients. (C) Distribution of the vascular smooth muscle contraction KEGG pathway enrichment scores in Risk-H and Risk-L groups for GBM patients.

Relationships of the RiskScore values with the clinical characteristics of samples

Subsequently, the associations between various parameters (such as neoadjuvant, sex, and age) and the RiskScore value were examined (Fig. 5A–C). Clearly, other features were not related to the RiskScore value (p > 0.05), except for age, and the constructed RiskScore model was dependent on patient age.

Figure 5.


The relationships of different clinical factors with RiskScore values of GBM patients. Comparison of RiskScore among different ages (A), sexes (B), and neoadjuvants (C). The horizontal axis represents the different clinical factors, and the vertical axis represents RiskScore values. The constructed RiskScore model was dependent on patient age. (D) The nomogram model constructed by combining the clinical features (age, sex, neoadjuvant) with the RiskScore of GBM patients. There was an obvious association with the greatest influence on predicting the survival rate. (E) The forest plot constructed by combining age with RiskScore for GBM patients. The HR for RiskScore was approximately 1.4 in the forest plots established in combination with RiskScore and age (p < 0.05).

At last, the RiskScore values combined with the clinical characteristics were used to construct the nomogram model. Use of a nomogram, an approach to intuitively and effectively present risk model results, is convenient for predicting patient outcomes. Specifically, the straight-line length in a nomogram represents the effects of different parameters and their significance on the outcome. Here, a nomogram was constructed to combine the RiskScore, age, neoadjuvant, and sex, respectively, as displayed in Fig. 5D. RiskScore characteristics showed an obvious association with the greatest influence on predicting the survival rate, indicating that the 14-gene-based risk model had a superb prognosis prediction ability.

The forest plot was established based on the clinical characteristics and RiskScore. In Fig. 5E, the HRs for RiskScore were approximately 1.4 (p < 0.05).

We also analyzed the relationship between the expression levels of 36 immune-checkpoint genes and the RiskScore (Supplementary Table S10). Except for eight genes, including PDCD1 and CTLA4, the expression levels of the other 26 genes showed a positive correlation with the RiskScore, suggesting that the constructed model could guide clinicians in diagnosing and predicting the prognosis for various immunophenotypes.

Practical application of the prediction model for GBM patients

Using the prognostic prediction model, we analyzed the clinical follow-up data of 24 GBM patients, who were divided into Risk-H and Risk-L groups (n = 12 each) based on the median RiskScore value. There was an inverse correlation between the RiskScore value and OS (p = 0.0392) (Fig. 6A), with an AUC of 0.7465 (Fig. 6B).

Figure 6.


Clinical practice application of the prognostic predictor. (A) OS curves of the two clusters predicted from 24 GBM patients using the prognosis model. The log-rank test was used to assess the statistical significance of the difference. The red line indicates the Risk-H group, while the blue line indicates the Risk-L group, based on the median RiskScore value. (B) ROC curve with AUC under the final prognostic predictor. (C) Relationship between the RiskScore value and the ratio of CD3+CD4+/CD3+CD8+ cells in the peripheral blood samples of the 24 GBM patients. The RiskScore value was negatively associated with the ratio of CD3+CD4+/CD3+CD8+ cells. (D) Relationship between the RiskScore value and the percentage of CD4+CD25+ Tregs in peripheral blood samples of the 24 GBM patients. The RiskScore value was positively related with the percentage of CD4+CD25+ Tregs. (E) Immunohistochemical (IHC) analysis of PD-L1 (left) and PD-L2 (right) for the 24 GBM patients. (F) Relationship between the IHC score of PD-L1 (yellow) or PD-L2 (green) and the RiskScore groups. The IHC score was positively correlated with the RiskScore value.

CD4+CD25+ regulatory T cells (Tregs) play an important role in anti-tumor immune responses, and a poor prognosis and declining survival rates are closely related with high Treg expression in cancer patients14,15. Consistent with these, the RiskScore value showed a negative relationship with CD3+CD4+/CD3+CD8+ (r = − 0.9635, p < 0.0001; Fig. 6C), but a positive relationship with CD4+CD25+ Tregs percentage (r = 0.5167, p = 0.0116; Fig. 6D). Notably, PD-L1 or PD-L2 immunohistochemical (IHC) analysis results showed that the IHC score was positively correlated with the RiskScore (Fig. 6E,F).
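The r values quoted above are correlation coefficients between the RiskScore and each immune measurement. As a minimal sketch, and assuming a Pearson correlation (the text does not state which coefficient was used), r can be computed as follows; the paired values are invented for illustration and are not patient data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical pairs: RiskScore versus CD4+/CD8+ ratio (made-up numbers).
risk_scores = [-1.2, -0.5, 0.1, 0.8, 1.5]
cd4_cd8_ratio = [2.4, 2.0, 1.7, 1.1, 0.9]

print(round(pearson_r(risk_scores, cd4_cd8_ratio), 3))
```

A value near −1 indicates that the ratio falls almost linearly as the RiskScore rises, which is the direction reported for Fig. 6C.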

Taken together, we concluded that this prognostic predictor showed great promise in clinical practice application.

Discussion

Currently, GBM treatments include surgery alone for an early-stage disease and adjuvant radio/chemotherapy plus surgical resection for an advanced stage. However, surgical resection cannot provide a satisfactory effect because cancer cells may have invaded the local adjacent tissues or developed metastasis16. Moreover, it is still controversial whether systemic adjuvant therapy can be prescribed following surgery owing to tumor heterogeneity or potential adverse effects17. Consequently, it is important to mine the potential biomarkers to predict GBM prognosis; this way, high-risk GBM cases can benefit from early adjuvant therapy. This can also assist in the clinical management of individual patients and thereby accurately distinguish patients that can be completely treated using adjuvant treatment from those that can avoid treatment and the possible chemotherapeutics-derived toxicity18. In the current work, a candidate signature was examined as a reliable method to predict GBM prognosis.

Due to the emerging next-generation sequencing techniques, a number of candidate biomarkers for the diagnosis and prognosis prediction of GBM were identified, which makes it possible to more specifically classify and more accurately predict GBM outcomes19. Several molecular markers, such as isocitrate dehydrogenase, O6-methylguanine DNA methyltransferase, phosphatase and tensin homolog, and epidermal growth factor receptor, are conventionally examined in clinical GBM cases20,21. These molecular markers facilitate targeted anti-GBM treatments and individualized therapeutic methods. Nonetheless, GBM has a dismal prognosis, so new treatment strategies and molecular biomarkers are urgently needed to illustrate the underlying GBM mechanisms and improve the OS of patients.

The limited availability of clinical data and of fresh tumor specimens representing the transitional steps from tumor initiation to progression is an important barrier to improving clinical outcomes in GBM patients. Methylation-based subtypes that predict GBM patient survival have been reported. Notably, the methylation levels of different subgroups could reflect different molecular genetic features22,23. Increasing attention has been paid to the relationship between the immune system and malignancy progression and pathogenesis, which contributes to GBM treatment and thereby promotes the development of anti-tumor treatments. CD68+ and CD163+ cells were the most abundant populations in GBM, and the percentage of CD163+ cells correlated with a poorer prognosis. Mesenchymal GBMs displayed the highest percentages of microglia, macrophage, and lymphocyte infiltration24. IDH1 wild-type and mesenchymal-subtype GBMs presented strongly immunosuppressive microenvironments, while IDH1-mutated tumors and the TCGA proneural subtype exhibited a significantly less immunosuppressive state25. Regarding tumor origin (namely, the immune system), the approach of regulating and killing cancer cells by modulating the immune system and promoting anti-cancer immunity in the tumor microenvironment is novel. Therefore, screening of novel significant prognosis-specific immune-related genes is meaningful for predicting disease prognosis and identifying novel therapeutic targets. Some researchers have reported gene expression-based immunoprofiling of GBM using TCGA data. For example, Arivazhagan et al. reported a 14-gene expression signature that predicted survival in GBM patients. A network analysis specifically revealed inflammatory response pathway activation in the high-risk group26. Zhang et al. showed that samples with high tumor microenvironment (TME) scores were characterized by immune activation, TGF pathway activation, and high expression of immune checkpoint genes, while those with low TME scores were characterized by a high frequency of IDH1 and MET mutations27. Zhang et al. identified six immune-related genes (CANX, HSPA1B, KLRC2, PSMC6, RFXAP, and TAP1) as risk signatures. Importantly, Kaplan–Meier and ROC curves, as well as risk plotting, verified their performance in TCGA and CGGA datasets28. Zhang et al. observed that a high immune score was associated with low methylation and copy number variation levels, a high expression of immunosuppressive markers (CD27, PDL1 and CTLA4), and a shorter recurrence-free survival29. Here, GBM classification based on the prognosis-specific and immune-related signature could precisely estimate the clinical outcomes and identify those with a high or low risk of postoperative recurrence. Notably, PD-L1 or PD-L2 IHC analysis results showed that the IHC score was positively correlated with the RiskScore. Moreover, the RiskScore value showed a negative relationship with CD3+CD4+/CD3+CD8+, but a positive relationship with CD4+CD25+ Tregs percentage.

Here, 14 prognosis-specific immune-related genes were mined by big data mining, TCGA and ImmPort database sorting, and statistical analyses. Two key points must be carefully considered to ensure the validity of a prognosis model: clinical utility and transportability across different cohorts. In this respect, our prognosis model compares favorably with other GBM prognosis models that were not replicated in independent GBM cohorts. Additionally, our validation set was a multi-institutional cohort involving cases from different hospitals, which suggests that our constructed GBM model is applicable to different clinical settings and patient types. The 14-gene-based model was constructed for prognosis prediction, and RiskScore values for all cases were computed. Then, the model was applied for prediction and validation. The prognosis model was established based on the expression patterns of specific immune-related genes, and it could classify patients at a certain clinical stage into various subgroups according to the estimated survival outcomes.

Nine of these 14 genes were previously suggested to be involved in malignant transformation, pathogenesis, progression, and immune microenvironment of GBM, including S100A9, HSPA1A, GALR2, EDNRB, IL13RA2, ELN, NR1D1, HDGF, and MET30–35. They were markedly correlated with patient survival and prognosis, which means that our bioinformatic mining displayed a high reliability and accuracy. However, the relationship of the other two genes (namely, CRLF1 and GRAP2) with GBM has not been validated in a clinical or basic study, and we are interested in this topic. CRLF1 has been verified to be involved in regulating malignant cancer cell proliferation and invasion, which can affect signaling pathways (such as MAPK/ERK and Akt/PI3K) and modulate the maturation of the immune and nervous systems during fetal development36,37. GRAP2 has also been found to be a candidate tumor suppressor, and it is recognized as a prognosis prediction marker for different types of cancers, which can regulate tumor cell sensitivity to immunotherapy38,39.

In conclusion, our results assist in identifying novel biomarkers for predicting the clinical prognosis of GBM. Additionally, the 14-gene-based risk model can provide a variety of targets for an accurate GBM treatment, and it can also help classify GBM patients according to the molecular subtypes. In addition, the constructed model may be used to instruct clinicians in the medication, prognosis prediction, and diagnosis of GBM patients with various immunophenotypes.

Methods

GBM tissue specimens were collected from 24 patients (ages 42–75) who underwent curative resection for glioma with informed consent between 2017 and 2019 at Hefei Cancer Hospital, Chinese Academy of Sciences (CAS), with Institutional Review Board approval. All methods were performed in accordance with the relevant guidelines and regulations, as stated in relevant sections below.

Pre-processing of original sample data and preliminary selection of immune-related genes in GBM

The up-to-date clinical follow-up information was extracted from TCGA GDC API. Altogether, 539 RNA-Seq data samples were mined (as displayed in Supplementary Table S11), and 529 of them were tumor tissues. Additionally, the immune-related gene set involving 1811 genes was also acquired based on the ImmPort database40 (Supplementary Table S12).

At first, 529 tumor tissues were subjected to pre-processing (Supplementary Table S13), and 523 of them involving 1,108 genes were used for further model analysis. Supplementary Table S14 presents the clinical characteristics of samples. Afterwards, these 523 samples were classified into training and test sets. Random grouping with replacement was carried out 100 times on all samples to remove the influence of random allocation bias on model stability. The training (n = 261) and test set (n = 262) samples are displayed in Supplementary Tables S15 and S16, respectively. The eventual data of training and test set samples are shown in Supplementary Table S14. Differences between the two sets were not statistically significant, indicating a reasonable sample grouping.
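A single random 1:1 allocation of the kind described above can be sketched as follows. This is only an illustration under assumptions: it performs one seeded shuffle without replacement, whereas the study repeated the grouping 100 times; the function name and seed are hypothetical.

```python
import random


def split_half(sample_ids, seed=0):
    """Shuffle the sample IDs reproducibly and cut them into two halves.
    With an odd count, the second half receives the extra sample."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)  # local generator, global state untouched
    cut = len(ids) // 2
    return ids[:cut], ids[cut:]


# 523 samples give 261 training and 262 test samples, as in the study.
train_ids, test_ids = split_half(range(523))
print(len(train_ids), len(test_ids))
```

After such a split, the clinical characteristics of the two sets would still need to be compared, as done in Supplementary Table S14, to confirm that the allocation is balanced.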

Univariate survival analysis for immune-related samples in training set

The univariate Cox proportional hazards regression model was utilized to analyze the immune-related genes and the survival data using the coxph function of the R package survival41. A p < 0.05 was regarded as statistically significant.
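To make the per-gene screening step concrete, the following is a minimal pure-Python sketch of a one-covariate Cox fit by Newton's method. It is not the coxph implementation: it uses the Breslow form for tied times, reports only a Wald p value, has no safeguards against non-convergence, and runs on invented data. The function and variable names are hypothetical.

```python
from math import erfc, exp, sqrt


def univariate_cox(times, events, x, max_iter=50, tol=1e-10):
    """Fit h(t | x) = h0(t) * exp(beta * x) by maximizing the partial likelihood.
    Returns (beta, hazard_ratio, two_sided_wald_p)."""
    n = len(times)
    beta = 0.0
    info = 0.0
    for _ in range(max_iter):
        grad = 0.0
        info = 0.0
        for i in range(n):
            if not events[i]:
                continue  # censored samples contribute only through risk sets
            at_risk = [j for j in range(n) if times[j] >= times[i]]
            weights = [exp(beta * x[j]) for j in at_risk]
            s0 = sum(weights)
            s1 = sum(w * x[j] for w, j in zip(weights, at_risk))
            s2 = sum(w * x[j] ** 2 for w, j in zip(weights, at_risk))
            grad += x[i] - s1 / s0                # score contribution
            info += s2 / s0 - (s1 / s0) ** 2      # observed information
        step = grad / info
        beta += step
        if abs(step) < tol:
            break
    z = beta * sqrt(info)                # Wald statistic, SE = 1 / sqrt(info)
    p_value = erfc(abs(z) / sqrt(2))     # two-sided normal tail probability
    return beta, exp(beta), p_value


# Made-up cohort: x = 1 marks high expression of a hypothetical gene.
toy_times = [1, 2, 3, 4, 5, 6, 7, 8]
toy_events = [1, 1, 1, 1, 1, 1, 1, 1]
toy_x = [1, 1, 0, 1, 0, 0, 1, 0]

beta, hazard_ratio, p_value = univariate_cox(toy_times, toy_events, toy_x)
print(round(hazard_ratio, 2), round(p_value, 3))
```

In a screening loop, this fit would be repeated once per gene, and genes whose p value falls below 0.05 would be kept as candidates.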

Screening of immune-related genes specific to GBM prognosis, and establishment of the model to predict prognosis

At first, the R package MASS and glmnet functions were used for stepwise and LASSO regression analysis42, and the risk model was established based on specific immune-related genes, as displayed below:

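$$
\begin{aligned}
\mathrm{RiskScore} ={}& -0.325652748 \times \mathrm{EDNRA} - 0.312268258 \times \mathrm{HSPA1A} + 0.17460672 \times \mathrm{S100A9} \\
&- 1.128026913 \times \mathrm{PI15} - 0.199258031 \times \mathrm{EDNRB} - 1.690737959 \times \mathrm{GALR2} \\
&+ 0.367374589 \times \mathrm{NR1D1} + 0.184640626 \times \mathrm{FGF14} + 0.258161826 \times \mathrm{ELN} \\
&+ 0.081069744 \times \mathrm{IL13RA2} + 0.172446326 \times \mathrm{MET} - 0.342300085 \times \mathrm{HDGF} \\
&- 0.863180168 \times \mathrm{GRAP2} - 0.138403709 \times \mathrm{CRLF1}
\end{aligned}
\tag{1}
$$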

Afterwards, related gene expression patterns were selected based on training and test sets, which were then substituted into the constructed model to calculate the RiskScore values in each sample. The median RiskScore value was utilized as the threshold to classify samples as belonging to the high- (Risk-H) or low-risk (Risk-L) group. Finally, the accuracy, stability, and efficiency of the model to predict and classify GBM prognosis were evaluated through gene clustering, ROC, and KM analyses.
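The scoring and grouping steps above can be sketched in a few lines. The coefficients are copied from Eq. (1); everything else is an assumption made for illustration: the expression values are invented, the helper names are hypothetical, and samples exactly at the median are assigned to Risk-L because the text does not specify how such ties were handled.

```python
from statistics import median

# Coefficients of the 14-gene model, copied from Eq. (1).
COEFFICIENTS = {
    "EDNRA": -0.325652748, "HSPA1A": -0.312268258, "S100A9": 0.17460672,
    "PI15": -1.128026913, "EDNRB": -0.199258031, "GALR2": -1.690737959,
    "NR1D1": 0.367374589, "FGF14": 0.184640626, "ELN": 0.258161826,
    "IL13RA2": 0.081069744, "MET": 0.172446326, "HDGF": -0.342300085,
    "GRAP2": -0.863180168, "CRLF1": -0.138403709,
}


def risk_score(expression):
    """Weighted sum of the 14 expression values; `expression` maps gene -> level."""
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())


def split_by_median(scores):
    """Label samples above the cohort median Risk-H and all others Risk-L
    (tie handling at the median is an assumption)."""
    cutoff = median(scores.values())
    return {s: "Risk-H" if v > cutoff else "Risk-L" for s, v in scores.items()}


# Hypothetical expression profiles (made-up numbers, not patient data).
baseline = {gene: 1.0 for gene in COEFFICIENTS}
profiles = {
    "P1": baseline,
    "P2": {**baseline, "S100A9": 5.0},  # raises a positively weighted gene
    "P3": {**baseline, "GALR2": 3.0},   # raises a negatively weighted gene
}

scores = {name: risk_score(expr) for name, expr in profiles.items()}
groups = split_by_median(scores)
print(groups)
```

Genes with positive coefficients push a sample toward Risk-H as their expression rises, while genes with negative coefficients push it toward Risk-L.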

Signaling pathway enrichment and functional annotations for immune-related genes specific to prognosis

Finally, 14 genes were screened and the corresponding gene families were annotated in accordance with the human gene classification in the HGNC database43. Moreover, GO enrichment analyses were carried out on these 14 prognosis-specific immune-related genes using the R package clusterProfiler44.

Relationships of RiskScore with the signaling pathways and clinical characteristics of samples

At first, the R package GSVA45 ssGSEA function was utilized to evaluate the score of KEGG enrichment analysis. At the same time, the relationship of RiskScore was computed, and later, clustering analysis was performed based on the pathway enrichment score for all samples. Then, the relationships of related factors (like neoadjuvant, sex, and age) with the RiskScore were determined. Finally, the nomogram model was established, and related clinical characteristics and RiskScore values were used to draw the forest plot, and the relationships between RiskScore and clinical characteristics with patient survival were examined.

Phenotyping of peripheral T cells and IHC staining for GBM tissue microarray analysis

Peripheral blood samples from 24 GBM patients undergoing curative resection with informed consent between 2017 and 2019 at Hefei Cancer Hospital, Chinese Academy of Sciences (Anhui, China), were stained with the following sets of monoclonal antibodies (BD Biosciences; San Jose, CA, USA): CD3-PE (clone SP34), CD4-APC-Cy7 (clone SK3), CD8-PerCP (clone SK1), and CD25-FITC (clone MA251), and analyzed on a Cytomics FC500 Flow Cytometer CXP with the CXP analysis software (Beckman Coulter Inc.). Twenty-four GBM tissues were placed on a tissue microarray, stained with anti-PD-L1 (clone E1L3N) and anti-PD-L2 (clone D7U8C) antibodies (Cell Signaling Technology; Danvers, MA, USA), and visualized using the KF-PRO Digital Slide Scanning System (Kongfong Biotech International Co., LTD; Ningbo, China).

Statistical methods

The TCGA dataset was randomly divided into training and test cohorts in a 1:1 ratio. Samples in the training set were analyzed to identify potential prognosis-predicting genes, which were then validated in both the test set and the whole dataset. First, the relationships between the expression of immune-related genes and patient OS were evaluated using univariate Cox proportional hazards regression analysis. Genes with p < 0.05 in the log-rank test were selected as candidate variables. The number of candidate genes was then reduced using the LASSO-Cox method, and the immune-related genes showing the greatest significance were chosen to construct the RiskScore model for predicting prognosis. The RiskScore was calculated as follows:

$$\mathrm{RiskScore} = \sum_{i=0}^{n} \beta_i \times \chi_i \qquad (2)$$

where βi indicates the coefficient and χi represents the expression level (FPKM) of each gene. The RiskScore was calculated for every patient, and patients were then divided into low- or high-risk groups according to the median RiskScore value in the training set. Patients in the low-risk group had a lower risk of death, while those in the high-risk group had a higher risk of death. The difference in OS between the two groups was then assessed using Kaplan–Meier survival curves. The specificity and sensitivity of the model in diagnosis and prognosis prediction were evaluated according to the area under the ROC curve. A two-tailed p < 0.05 was deemed to indicate statistical significance. Bioconductor and R software (version 3.5.0) were used for all statistical analyses.
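The steps described above (1:1 random split, RiskScore as a weighted sum of expression values, grouping by the training-set median, and Kaplan–Meier estimation) can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the authors' R code: the gene names, coefficients, expression values, and function names are all hypothetical.

```python
import random
from statistics import median


def split_cohort(sample_ids, seed=0):
    """Randomly split sample IDs 1:1 into training and test cohorts."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)
    half = len(ids) // 2
    return ids[:half], ids[half:]


def risk_score(coefficients, expression):
    """RiskScore = sum of beta_i * x_i over the signature genes."""
    return sum(beta * expression[gene] for gene, beta in coefficients.items())


def assign_groups(scores, cutoff):
    """Label a sample Risk-H if its score exceeds the cutoff, else Risk-L."""
    return {s: ("Risk-H" if v > cutoff else "Risk-L") for s, v in scores.items()}


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    events[i] is 1 (or True) for an observed death, 0 for censoring.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical model coefficients and FPKM-like expression values.
coefficients = {"GENE_A": 0.5, "GENE_B": -0.25}
expression = {
    "S1": {"GENE_A": 2.0, "GENE_B": 1.0},
    "S2": {"GENE_A": 4.0, "GENE_B": 0.0},
    "S3": {"GENE_A": 1.0, "GENE_B": 5.0},
    "S4": {"GENE_A": 6.0, "GENE_B": 5.0},
}

scores = {s: risk_score(coefficients, expr) for s, expr in expression.items()}
train_ids, test_ids = split_cohort(expression, seed=0)
cutoff = median(scores[s] for s in train_ids)   # median of the training set only
groups = assign_groups(scores, cutoff)
print(groups)
```

The cutoff is taken from the training cohort and then applied unchanged to the test cohort, so the grouping of test samples does not depend on their own scores.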

Ethics approval and consent to participate

This study was reviewed and approved by the Institutional Review Board of the Cancer Hospital of Hefei Institutes of Physical Science, CAS, and written informed consent was obtained from patients in accordance with the Declaration of Helsinki.

Supplementary information

Supplementary figures. (581.3KB, pdf)
Supplementary tables. (6.5MB, xls)
Supplementary legends. (14KB, docx)

Abbreviations

AUC: Area under the curve
GO: Gene ontology
GBM: Glioblastoma
HRs: Hazard ratios
HGNC: HUGO Gene Nomenclature Committee
IDH1: Isocitrate dehydrogenase (NADP(+)) 1
KEGG: Kyoto Encyclopedia of Genes and Genomes
LASSO: Least absolute shrinkage and selection operator
MET: MET proto-oncogene, receptor tyrosine kinase
OS: Overall survival
ROC: Receiver operating characteristic
TGF: Transforming growth factor

Author contributions

X.R.C. and Z.Y.F.: conceived and designed the experiments. C.G.Z., Z.Y.Z., L.Z.H. and D.L.W.: collected the data and prepared Figs. 1, 2 and 3. X.R.C. and X.Q.F.: performed the phenotyping of peripheral T cells and IHC staining for GBM tissue microarray analysis, performed the analysis, and prepared Figs. 4, 5 and 6. X.R.C., R.T.W. and Z.Y.F.: participated in the discussion of the algorithm. X.R.C., X.Q.F. and C.G.Z.: prepared and edited the manuscript. All authors have read and approved the final manuscript.

Funding

This research was supported by the National Natural Science Foundation of China (81872066, 31571433, 81773131 and 81972635), the Innovative Program of Development Foundation of Hefei Center for Physical Science and Technology (2018CXFX004 and 2017FXCX008), and Youth Innovation Promotion Association of Chinese Academy of Sciences (2018487).

Competing interests

The authors declare no competing interests.

Footnotes

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

These authors contributed equally: Xueran Chen and Xiaoqing Fan.

Supplementary information is available for this paper at 10.1038/s41598-020-72488-4.

References

  • 1. Wen PY, Kesari S. Malignant gliomas in adults. N. Engl. J. Med. 2008;359(5):492–507. doi: 10.1056/NEJMra0708126.
  • 2. Ostrom QT, Gittleman H, Farah P, Ondracek A, Chen Y, Wolinsky Y, Stroup NE, Kruchko C, Barnholtz-Sloan JS. CBTRUS statistical report: Primary brain and central nervous system tumors diagnosed in the United States in 2006–2010. Neuro Oncol. 2013;15 Suppl 2:ii1–ii56. doi: 10.1093/neuonc/not151.
  • 3. Nabors LB, Portnow J, Ammirati M, Baehring J, Brem H, Butowski N, Fenstermaker RA, Forsyth P, Hattangadi-Gluth J, Holdhoff M, Howard S, Junck L, Kaley T, Kumthekar P, Loeffler JS, Moots PL, Mrugala MM, Nagpal S, Pandey M, Parney I, Peters K, Puduvalli VK, Ragsdale J, Rockhill J, Rogers L, Rusthoven C, Shonka N, Shrieve DC, Sills AK, Swinnen LJ, Tsien C, Weiss S, Wen PY, Willmarth N, Bergman MA, Engh A. NCCN guidelines insights: central nervous system cancers, version 1.2017. J. Natl. Compr. Canc. Netw. 2017;15(11):1331–1345. doi: 10.6004/jnccn.2017.0166.
  • 4. Cheng W, Ren X, Zhang C, Cai J, Liu Y, Han S, Wu A. Bioinformatic profiling identifies an immune-related risk signature for glioblastoma. Neurology. 2016;86(24):2226–2234. doi: 10.1212/WNL.0000000000002770.
  • 5. Verhaak RG, Hoadley KA, Purdom E, Wang V, Qi Y, Wilkerson MD, Miller CR, Ding L, Golub T, Mesirov JP, Alexe G, Lawrence M, O'Kelly M, Tamayo P, Weir BA, Gabriel S, Winckler W, Gupta S, Jakkula L, Feiler HS, Hodgson JG, James CD, Sarkaria JN, Brennan C, Kahn A, Spellman PT, Wilson RK, Speed TP, Gray JW, Meyerson M, Getz G, Perou CM, Hayes DN. Integrated genomic analysis identifies clinically relevant subtypes of glioblastoma characterized by abnormalities in PDGFRA, IDH1, EGFR, and NF1. Cancer Cell. 2010;17(1):98–110. doi: 10.1016/j.ccr.2009.12.020.
  • 6. Kim R, Emi M, Tanabe K. Cancer immunoediting from immune surveillance to immune escape. Immunology. 2007;121(1):1–14. doi: 10.1111/j.1365-2567.2007.02587.x.
  • 7. Silver DJ, Sinyuk M, Vogelbaum MA, Ahluwalia MS, Lathia JD. The intersection of cancer, cancer stem cells, and the immune system: therapeutic opportunities. Neuro Oncol. 2016;18(2):153–159. doi: 10.1093/neuonc/nov157.
  • 8. Finocchiaro G, Pellegatta S. Immunotherapy for glioma: getting closer to the clinical arena? Curr. Opin. Neurol. 2011;24(6):641–647. doi: 10.1097/WCO.0b013e32834cbb17.
  • 9. Han S, Zhang C, Li Q, Dong J, Liu Y, Huang Y, Jiang T, Wu A. Tumour-infiltrating CD4(+) and CD8(+) lymphocytes as predictors of clinical outcome in glioma. Br. J. Cancer. 2014;110(10):2560–2568. doi: 10.1038/bjc.2014.162.
  • 10. Han S, Liu Y, Li Q, Li Z, Hou H, Wu A. Pre-treatment neutrophil-to-lymphocyte ratio is associated with neutrophil and T-cell infiltration and predicts clinical outcome in patients with glioblastoma. BMC Cancer. 2015;15:617. doi: 10.1186/s12885-015-1629-7.
  • 11. Aldape K, Zadeh G, Mansouri S, Reifenberger G, von Deimling A. Glioblastoma: pathology, molecular mechanisms and markers. Acta Neuropathol. 2015;129(6):829–848. doi: 10.1007/s00401-015-1432-1.
  • 12. Szopa W, Burley TA, Kramer-Marek G, Kaspera W. Diagnostic and therapeutic biomarkers in glioblastoma: current status and future perspectives. Biomed. Res. Int. 2017;2017:8013575. doi: 10.1155/2017/8013575.
  • 13. Bao ZS, Li MY, Wang JY, Zhang CB, Wang HJ, Yan W, Liu YW, Zhang W, Chen L, Jiang T. Prognostic value of a nine-gene signature in glioma patients based on mRNA expression profiling. CNS Neurosci. Ther. 2014;20(2):112–118. doi: 10.1111/cns.12171.
  • 14. Gajewski TF. Identifying and overcoming immune resistance mechanisms in the melanoma tumor microenvironment. Clin. Cancer Res. 2006;12(7 Pt 2):2326s–2330s. doi: 10.1158/1078-0432.ccr-05-2517.
  • 15. Shevach EM. CD4+ CD25+ suppressor T cells: more questions than answers. Nat. Rev. Immunol. 2002;2(6):389–400. doi: 10.1038/nri821.
  • 16. Cunha M, Maldaun MVC. Metastasis from glioblastoma multiforme: a meta-analysis. Rev. Assoc. Med. Bras. (1992). 2019;65(3):424–433. doi: 10.1590/1806-9282.65.3.424.
  • 17. Abdul KU, Houweling M, Svensson F, Narayan RS, Cornelissen FMG, Kucukosmanoglu A, Metzakopian E, Watts C, Bailey D, Wurdinger T, Westerman BA. WINDOW consortium: a path towards increased therapy efficacy against glioblastoma. Drug Resist. Updates. 2018;40:17–24. doi: 10.1016/j.drup.2018.10.001.
  • 18. Harrison RA, de Groot JF. Treatment of glioblastoma in the elderly. Drugs Aging. 2018;35(8):707–718. doi: 10.1007/s40266-018-0568-9.
  • 19. Huang J, Liu F, Liu Z, Tang H, Wu H, Gong Q, Chen J. Immune checkpoint in glioblastoma: promising and challenging. Front. Pharmacol. 2017;8:242. doi: 10.3389/fphar.2017.00242.
  • 20. Ayoub Z, Geara F, Najjar M, Comair Y, Khoueiry-Zgheib N, Khoueiry P, Mahfouz R, Boulos FI, Kamar FG, Andraos T, Saadeh F, Kreidieh F, Abboud M, Skaf G, Assi HI. Prognostic significance of O6-methylguanine-DNA-methyltransferase (MGMT) promoter methylation and isocitrate dehydrogenase-1 (IDH-1) mutation in glioblastoma multiforme patients: a single-center experience in the Middle East region. Clin. Neurol. Neurosurg. 2019;182:92–97. doi: 10.1016/j.clineuro.2019.04.008.
  • 21. Chamberlain MC, Sanson M. Combined analysis of TERT, EGFR, and IDH status defines distinct prognostic glioblastoma classes. Neurology. 2015;84(19):2007. doi: 10.1212/WNL.0000000000001625.
  • 22. Ma H, Zhao C, Zhao Z, Hu L, Ye F, Wang H, Fang Z, Wu Y, Chen X. Specific glioblastoma multiforme prognostic-subtype distinctions based on DNA methylation patterns. Cancer Gene Ther. 2019. doi: 10.1038/s41417-019-0142-6.
  • 23. Tang Y, Qing C, Wang J, Zeng Z. DNA methylation-based diagnostic and prognostic biomarkers for glioblastoma. Cell Transplant. 2020;29:963689720933241. doi: 10.1177/0963689720933241.
  • 24. Martinez-Lage M, Lynch TM, Bi Y, Cocito C, Way GP, Pal S, Haller J, Yan RE, Ziober A, Nguyen A, Kandpal M, O'Rourke DM, Greenfield JP, Greene CS, Davuluri RV, Dahmane N. Immune landscapes associated with different glioblastoma molecular subtypes. Acta Neuropathol. Commun. 2019;7(1):203. doi: 10.1186/s40478-019-0803-6.
  • 25. Zhang C, Li J, Wang H, Song SW. Identification of a five B cell-associated gene prognostic and predictive signature for advanced glioma patients harboring immunosuppressive subtype preference. Oncotarget. 2016;7(45):73971–73983. doi: 10.18632/oncotarget.12605.
  • 26. Arimappamagan A, Somasundaram K, Thennarasu K, Peddagangannagari S, Srinivasan H, Shailaja BC, Samuel C, Patric IR, Shukla S, Thota B, Prasanna KV, Pandey P, Balasubramaniam A, Santosh V, Chandramouli BA, Hegde AS, Kondaiah P, Sathyanarayana Rao MR. A fourteen gene GBM prognostic signature identifies association of immune response pathway and mesenchymal subtype with high risk group. PLoS ONE. 2013;8(4):e62042. doi: 10.1371/journal.pone.0062042.
  • 27. Zhang J, Xiao X, Zhang X, Hua W. Tumor microenvironment characterization in glioblastoma identifies prognostic and immunotherapeutically relevant gene signatures. J. Mol. Neurosci. 2020;70(5):738–750. doi: 10.1007/s12031-020-01484-0.
  • 28. Zhang M, Wang X, Chen X, Zhang Q, Hong J. Novel immune-related gene signature for risk stratification and prognosis of survival in lower-grade glioma. Front. Genet. 2020;11:363. doi: 10.3389/fgene.2020.00363.
  • 29. Zhang B, Shen R, Cheng S, Feng L. Immune microenvironments differ in immune characteristics and outcome of glioblastoma multiforme. Cancer Med. 2019;8(6):2897–2907. doi: 10.1002/cam4.2192.
  • 30. Meshalkina DA, Shevtsov MA, Dobrodumov AV, Komarova EY, Voronkina IV, Lazarev VF, Margulis BA, Guzhova IV. Knock-down of Hdj2/DNAJA1 co-chaperone results in an unexpected burst of tumorigenicity of C6 glioblastoma cells. Oncotarget. 2016;7(16):22050–22063. doi: 10.18632/oncotarget.7872.
  • 31. Liu Y, Ye F, Yamada K, Tso JL, Zhang Y, Nguyen DH, Dong Q, Soto H, Choe J, Dembo A, Wheeler H, Eskin A, Schmid I, Yong WH, Mischel PS, Cloughesy TF, Kornblum HI, Nelson SF, Liau LM, Tso CL. Autocrine endothelin-3/endothelin receptor B signaling maintains cellular and molecular properties of glioblastoma stem cells. Mol. Cancer Res. 2011;9(12):1668–1685. doi: 10.1158/1541-7786.MCR-10-0563.
  • 32. Han J, Puri RK. Analysis of the cancer genome atlas (TCGA) database identifies an inverse relationship between interleukin-13 receptor alpha1 and alpha2 gene expression and poor prognosis and drug resistance in subjects with glioblastoma multiforme. J. Neurooncol. 2018;136(3):463–474. doi: 10.1007/s11060-017-2680-9.
  • 33. Shin J, Shim HG, Hwang T, Kim H, Kang SH, Dho YS, Park SH, Kim SJ, Park CK. Restoration of miR-29b exerts anti-cancer effects on glioblastoma. Cancer Cell Int. 2017;17:104. doi: 10.1186/s12935-017-0476-9.
  • 34. Thirant C, Galan-Moya EM, Dubois LG, Pinte S, Chafey P, Broussard C, Varlet P, Devaux B, Soncin F, Gavard J, Junier MP, Chneiweiss H. Differential proteomic analysis of human glioblastoma and neural stem cells reveals HDGF as a novel angiogenic secreted factor. Stem Cells. 2012;30(5):845–853. doi: 10.1002/stem.1062.
  • 35. De Bacco F, Casanova E, Medico E, Pellegatta S, Orzan F, Albano R, Luraghi P, Reato G, D'Ambrosio A, Porrati P, Patane M, Maderna E, Pollo B, Comoglio PM, Finocchiaro G, Boccaccio C. The MET oncogene is a functional marker of a glioblastoma stem cell subtype. Cancer Res. 2012;72(17):4537–4550. doi: 10.1158/0008-5472.CAN-11-3490.
  • 36. Sims NA. Cardiotrophin-like cytokine factor 1 (CLCF1) and neuropoietin (NP) signalling and their roles in development, adulthood, cancer and degenerative disorders. Cytokine Growth Factor Rev. 2015;26(5):517–522. doi: 10.1016/j.cytogfr.2015.07.014.
  • 37. Yu ST, Zhong Q, Chen RH, Han P, Li SB, Zhang H, Yuan L, Xia TL, Zeng MS, Huang XM. CRLF1 promotes malignant phenotypes of papillary thyroid carcinoma by activating the MAPK/ERK and PI3K/AKT pathways. Cell Death Dis. 2018;9(3):371. doi: 10.1038/s41419-018-0352-0.
  • 38. Lee I, Yeom SY, Lee SJ, Kang WK, Park C. A novel senescence-evasion mechanism involving Grap2 and Cyclin D interacting protein inactivation by Ras associated with diabetes in cancer cells under doxorubicin treatment. Cancer Res. 2010;70(11):4357–4365. doi: 10.1158/0008-5472.CAN-09-3791.
  • 39. Chen KY, Chen CC, Tseng YL, Chang YC, Chang MC. GCIP functions as a tumor suppressor in non-small cell lung cancer by suppressing Id1-mediated tumor promotion. Oncotarget. 2014;5(13):5017–5028. doi: 10.18632/oncotarget.2075.
  • 40. Bhattacharya S, Dunn P, Thomas CG, Smith B, Schaefer H, Chen J, Hu Z, Zalocusky KA, Shankar RD, Shen-Orr SS, Thomson E, Wiser J, Butte AJ. ImmPort, toward repurposing of open access immunological assay data for translational and clinical research. Sci. Data. 2018;5:180015. doi: 10.1038/sdata.2018.15.
  • 41. Liang R, Wang M, Zheng G, Zhu H, Zhi Y, Sun Z. A comprehensive analysis of prognosis prediction models based on pathway-level, gene-level and clinical information for glioblastoma. Int. J. Mol. Med. 2018;42(4):1837–1846. doi: 10.3892/ijmm.2018.3765.
  • 42. Hou JY, Wang YG, Ma SJ, Yang BY, Li QP. Identification of a prognostic 5-Gene expression signature for gastric cancer. J. Cancer Res. Clin. Oncol. 2017;143(4):619–629. doi: 10.1007/s00432-016-2324-z.
  • 43. Braschi B, Denny P, Gray K, Jones T, Seal R, Tweedie S, Yates B, Bruford E. Genenames.org: the HGNC and VGNC resources in 2019. Nucleic Acids Res. 2019;47(D1):D786–D792. doi: 10.1093/nar/gky930.
  • 44. Xu Z, Wang C, Xiang X, Li J, Huang J. Characterization of mRNA expression and endogenous RNA profiles in bladder cancer based on the cancer genome atlas (TCGA) database. Med. Sci. Monit. 2019;25:3041–3060. doi: 10.12659/MSM.915487.
  • 45. Hanzelmann S, Castelo R, Guinney J. GSVA: gene set variation analysis for microarray and RNA-seq data. BMC Bioinform. 2013;14:7. doi: 10.1186/1471-2105-14-7.


Articles from Scientific Reports are provided here courtesy of Nature Publishing Group
