Abstract
Molecular prognostic signatures are critical for treatment decision-making in esophageal squamous cell carcinoma (ESCC), but the robustness of these signatures is limited. An aberrant DNA damage response (DDR) pathway may lead to the accumulation of mutations and thus accelerate tumor progression in ESCC. We therefore applied LASSO Cox regression to the transcriptomic data of DDR genes and constructed a prognostic DDR-related gene expression signature (DRGS) consisting of ten genes: PARP3, POLB, XRCC5, MLH1, DMC1, GTF2H3, PER1, SMC5, TCEA1, and HERC2. The DRGS was independently associated with overall survival in both training and validation cohorts and achieved higher accuracy than six previously reported multigene signatures for the prediction of prognosis in comparable cohorts. Furthermore, a nomogram incorporating the DRGS and clinicopathological features showed improved predictive performance. Taken together, the DRGS was identified as a novel, robust, and effective prognostic indicator, which may refine risk stratification and management in ESCC patients.
1. Introduction
Esophageal cancer, with 604,100 new cases and 544,076 deaths in 2020, ranked fifth among the most common deadly gastrointestinal carcinomas [1]. Based on histopathological manifestations, esophageal cancer is classified into two main subtypes, esophageal adenocarcinoma (EAC) and esophageal squamous cell carcinoma (ESCC). EAC has a higher prevalence in Western countries, while ESCC is more common in the developing world, including Asia [2]. This geographical disparity is suspected to be related to environmental differences and genetic factors. Irrespective of the heterogeneity of patients with esophageal cancer, the five-year relative survival rate for all stages combined remains less than 20%, which places this cancer among those with the worst prognostic outcomes [3].
Although persistent efforts have been made to determine the prognostic factors of esophageal cancer, prognostic analyses based on clinical characteristics and assays of traditional serum biomarkers, such as squamous cell carcinoma antigen (SCC Ag) [4], cytokeratin 19 fragments (CYFRA 21-1) [5], and vascular endothelial growth factors [6], have exhibited limited predictive power. In recent years, the development of next-generation sequencing technologies and open-source cancer gene expression profiles, such as The Cancer Genome Atlas (TCGA) project, has dramatically advanced data mining-based studies to identify novel prognostic biomarkers. Recent studies have accordingly explored prognostic biomarkers for esophageal cancer ranging from single-gene markers to multigene panels based on mRNA [7, 8], long noncoding RNA (lncRNA) [9], and competing endogenous RNA (ceRNA) networks [10]. In addition to genes such as FAM46A, RAB15, SLC20A1, IL1A, and ACSL1, which have been found to be associated with the overall survival (OS) or relapse-free survival (RFS) of EAC patients [11], several autophagy-related genes [12] and glycolysis-related genes [7] have also been identified as potential prognostic biomarkers of EAC progression. Moreover, a ceRNA network-derived eight-gene panel [10] and a four-gene panel [13] have been established to predict the OS of EAC patients. Furthermore, several gene panels have been developed based on tumor microenvironment-associated oncogenes [14, 15], flavoproteins [16], histone modifications [17], actin cytoskeletal proteins [18], and other heterogeneous pathways [19–21] to predict ESCC prognosis.
However, a substantial fraction of these models, especially those for ESCC, exhibited unsatisfactory, moderate, or uneven prediction performance without lateral comparisons, which further warrants developing novel biomarkers with higher precision.
Among the various esophageal cancer-related genes, those involved in the DNA damage response (DDR) pathway have shown potential prognostic values. The DDR signaling is critical for initiating DNA damage repair processes to maintain the genomic integrity, which, if otherwise compromised, may lead to the accumulation of aberrant genetic changes and can transform normal cells into cancer cells by activating oncogenes [22]. Earlier studies have developed DDR-related gene panels for the prognostic analysis in different cancers, including ovarian cancer [23], glioblastoma [24], and low-grade gliomas [25]. However, the number and function of genes vary widely across different models, indicating the involvement of a broad spectrum of DDR-related genes in the prognosis of various tumors. Given that ESCC is primarily associated with DDR dysfunction and carcinogen-specific genetic mutations, we hypothesized that there might be a potential correlation between modulation of the DDR-related gene expression and the prognosis in ESCC patients. This correlation could be regarded as the basis for predicting novel signatures for ESCC prognosis, which might exhibit better performance than the existing models.
In this study, we aimed to build a prognostic signature for ESCC based on the mRNA expression of DDR genes. A 10-gene prognostic model was established and validated in an integrated cohort from the Gene Expression Omnibus (GEO) and TCGA databases.
2. Materials and Methods
2.1. Data Curation
mRNA expression profiles of tumor tissue samples from 2 publicly available ESCC cohorts, including a gene microarray data set (GSE53625 cohort) from the GEO database and an RNA sequencing data set from the TCGA-ESCC cohort, were retrieved. Only patients with pathologically confirmed ESCC diagnoses were included in this study [26].
For microarray data from the GEO database, the normalized matrix file was directly downloaded. The RNA sequencing data file (count values) of TCGA was downloaded from the Genomic Data Commons (GDC, https://portal.gdc.cancer.gov/) using the R package GDCRNATools [27]. The trimmed mean of M values (TMM) algorithm was used to normalize the count values, and the resultant values were then transformed to log2 counts per million (logCPM) [28]. The batch effect caused by nonbiological technical biases was reduced using the “ComBat” algorithm [29]. All data sets were column-combined and then randomly split into three balanced subgroups: meta-training, meta-testing, and meta-validation data sets.
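As a rough illustration of the count-to-logCPM transformation described above, the following Python sketch converts the raw counts of one sample to log2 counts per million. It is a simplification under stated assumptions: the library size is not rescaled by a TMM factor, and the function name `log_cpm`, the pseudo-count `prior`, and the example counts are hypothetical choices made for illustration, not the procedure of the cited R workflow.

```python
import math

def log_cpm(counts, prior=0.5):
    """Convert one sample's raw gene counts to log2 counts per million.

    Simplified: uses the raw library size (no TMM scaling factor) and a
    small pseudo-count `prior` so that zero counts stay finite.
    """
    library_size = sum(counts.values())
    return {
        gene: math.log2((count + prior) / (library_size + 1.0) * 1e6)
        for gene, count in counts.items()
    }

# Hypothetical counts for three genes in a single sample.
sample = {"PARP3": 150, "POLB": 0, "XRCC5": 849}
for gene, value in log_cpm(sample).items():
    print(gene, round(value, 3))
```

In a full pipeline, the library size would first be multiplied by a per-sample normalization factor before this transformation is applied.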
Transcriptomic data sets and clinical information for 28 cancers in TCGA and three immunotherapy cohorts, including metastatic urothelial carcinoma (IMvigor210) [30] treated with atezolizumab and metastatic melanoma (Liu2019 and GSE78220) [31, 32] treated with pembrolizumab and nivolumab, were analyzed to determine the immunotherapy prognostic value of the DDR-related gene expression signature (DRGS) model.
2.2. Generation of DDR-Related Gene Expression Signature (DRGS) Model
A prognostic signature model based on the previously reported 276 DDR genes was constructed [33]. Briefly, a prognostic signature was developed using the meta-training set. To minimize the risk of overfitting, best subset regression was integrated into the LASSO Cox analysis to identify a panel of genes, and a multigene prognostic signature for predicting OS was then constructed in the meta-training set. Subsequently, based on these prognostic genes, the DRG score of each patient was defined as the sum, over all signature genes, of each gene's coefficient multiplied by that gene's expression level.
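The score definition above is a weighted sum, which can be sketched in a few lines of Python. The gene names, coefficients, and expression values below are hypothetical placeholders chosen for illustration; they are not the fitted coefficients of the signature.

```python
def drg_score(expression, coefficients):
    """Return sum(coefficient * expression) over the signature genes."""
    missing = [gene for gene in coefficients if gene not in expression]
    if missing:
        raise KeyError(f"missing expression values for: {missing}")
    return sum(coef * expression[gene] for gene, coef in coefficients.items())

# Hypothetical coefficients and expression values, for illustration only.
example_coefficients = {"GENE_A": 0.40, "GENE_B": -0.25, "GENE_C": 0.10}
example_patient = {"GENE_A": 5.0, "GENE_B": 2.0, "GENE_C": 10.0}

score = drg_score(example_patient, example_coefficients)
print(round(score, 2))
```

A gene with a positive coefficient raises the score as its expression increases, whereas a gene with a negative coefficient lowers it.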
2.3. Validation of the Prognostic Signature
To validate the classification effect of the signature, we applied the DRG score in the meta-training set, meta-testing set, and meta-validation set, respectively. Patients in these three sets were grouped into the low- and high-risk groups using the cutoff value obtained from the meta-training set. The performance of the DRG score was evaluated by the Kaplan–Meier (KM) survival analysis, the area under the curve (AUC) of the receiver operating characteristic (ROC), and the C-index.
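The stratification and concordance steps described above can be sketched as follows. This is a minimal pure-Python illustration of Harrell's C-index, not the R implementation used in the study; the toy follow-up data, the cutoff value, and the convention that a score strictly above the cutoff is labeled high risk are assumptions made for the example.

```python
def assign_risk_group(score, cutoff):
    """Label a patient high risk when the score exceeds the cutoff (assumed rule)."""
    return "high" if score > cutoff else "low"

def harrell_c_index(times, events, scores):
    """Harrell's concordance index for right-censored survival data.

    A pair is comparable when the patient with the shorter follow-up time
    had an event. It is concordant when that patient also has the higher
    risk score; tied scores count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if times[i] < times[j] and events[i] == 1:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Toy data: follow-up in months, event flag (1 = death, 0 = censored), score.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_scores = [2.0, 1.5, 0.9, 1.0]
hypothetical_cutoff = 1.2  # illustrative value, not the study cutoff

groups = [assign_risk_group(s, hypothetical_cutoff) for s in toy_scores]
c_index = harrell_c_index(toy_times, toy_events, toy_scores)
print(groups, c_index)
```

A C-index of 0.5 corresponds to random ordering and 1.0 to perfect agreement between risk scores and observed survival order.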
The independent prognostic value of the DRG score was assessed in the GSE53625 and TCGA-ESCC cohorts using univariate and multivariate analyses. All available clinicopathological variables, such as age, sex, smoking habit, tumor grade, cancer stage, tumor location, and frequency of alcohol consumption, were included.
We further built a nomogram integrating the independent prognostic factors, including the DRG score and clinical factors in the multivariate analyses, for prognostic prediction for patients with ESCC. The prediction efficiency of the nomogram was evaluated by C-index and calibration curves. Decision curve analysis (DCA) was employed to determine the clinical value of the nomogram.
Finally, the predictive accuracy of the DRG score was compared with previously published multigene signatures in both C-index and AUC.
2.4. Gene Set Enrichment Analyses (GSEAs)
To evaluate the function of the DRG score, GSEA was performed on the Molecular Signatures Database v.7.2 using GSEA v.4.1.0 to identify pathways that were enriched in the high- or low-risk group.
2.5. Statistical Analysis
Wilcoxon's rank-sum test and the chi-square test were applied for continuous variables and categorical variables, respectively. The createDataPartition function in the caret package was used to create balanced splits of target data. The LASSO Cox regression and the best subset regression were performed with the glmnet package and the leaps package, respectively [34, 35]. Genes were selected using the method described by Zhou et al. [36]. The log-rank test was applied to compare the survival curves of two or more groups. Univariable and multivariable analyses were performed using the Cox proportional hazards model. Restricted mean survival (RMS) times were compared between survival distributions using the survRM2 R package. All data preprocessing, statistical analyses, and graphics were performed in R software v4.0.2. P < 0.05 was regarded as statistically significant.
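To make the restricted mean survival quantity concrete, the sketch below computes a Kaplan–Meier curve and the area under it up to a truncation time `tau` in plain Python. It illustrates the quantity only and does not reproduce the interface or the inference of the survRM2 package; the function names and toy data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time."""
    curve = []
    survival = 1.0
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        at_risk = sum(1 for ti in times if ti >= t)
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
    return curve

def restricted_mean_survival(times, events, tau):
    """Area under the Kaplan-Meier step curve from 0 to tau."""
    area = 0.0
    last_time, last_survival = 0.0, 1.0
    for t, s in kaplan_meier(times, events):
        if t >= tau:
            break
        area += last_survival * (t - last_time)
        last_time, last_survival = t, s
    area += last_survival * (tau - last_time)
    return area

# Toy data: follow-up in years with event flags (1 = death, 0 = censored).
toy_times = [1, 2, 3, 4]
toy_events = [1, 1, 1, 1]
print(restricted_mean_survival(toy_times, toy_events, tau=4))
```

Comparing this area between two groups over the same `tau` gives a difference in average event-free time that remains interpretable when hazards are not proportional.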
3. Results
3.1. Generation of DDR-Related Gene Signature
A total of 259 primary ESCC tumors were divided into three groups: a meta-training set (n = 104), a meta-testing set (n = 78), and a meta-validation data set (n = 77). The baseline characteristics are summarized in Table S1. The LASSO Cox analysis was performed in the meta-training set and selected ten DDR genes when the best lambda of 0.072 was chosen. As shown in Figure S1, a DRGS consisting of 10 genes (PARP3, POLB, XRCC5, MLH1, DMC1, GTF2H3, PER1, SMC5, TCEA1, and HERC2) from six pathways was designed using the best subset regression analysis. The DRG scores were calculated as follows:
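$$\text{DRG score} = \sum_{i=1}^{10} \beta_i \times \mathrm{Expr}_i \qquad (1)$$
where $\beta_i$ is the coefficient and $\mathrm{Expr}_i$ is the expression level of the $i$-th signature gene.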
A score of 1.32 was used as a cutoff value based on the DRG score of the meta-training set and was applied to all subsequent stratifications.
3.2. Association between DRGS and Survival across Different Data sets
First, the prognostic prediction performance of the DRGS was estimated in the meta-training and meta-testing sets. In the meta-training set, there were 43 patients in the high-risk group. The AUCs for the 1-, 3-, and 5-year OS of the meta-training set were 0.758, 0.816, and 0.786, respectively (Figure 1(a)). Patients in the high-risk group exhibited shorter OS than the ones in the low-risk group (P < 0.001; HR = 4.3; 95% confidence interval (CI) = 2.4–8.0, Figure 1(b)). Similar results were also observed in the meta-testing set, where the AUCs for 1-, 3-, and 5-year OS were 0.614, 0.632, and 0.604, respectively (Figure 1(c)). The OS was also shorter in the patients with higher risk than those with lower risk (P=0.032; HR = 2.0; 95% CI = 1.0–3.7, Figure 1(d)).
Figure 1.
Performance of the DRGS in predicting OS in the training, testing, and validation sets. (a) Time-dependent ROC analyses at 1-, 3-, and 5-year survival rates of the DRGS in the meta-training set. (b) KM analysis of the DRGS in the meta-training set. (c) Time-dependent ROC analyses at 1-, 3-, and 5-year survival rates of the DRGS in the meta-testing set. (d) KM analysis of the DRGS in the meta-testing set. KM analysis of the DRGS in (e) the meta-validation set and (f) the whole meta-data sets.
To further confirm the prognostic value of the DRGS, we validated the DRG score in other data sets. Using the preestablished cutoff of the risk score from the meta-training set, 29 patients were identified as high risk in the meta-validation set, and these patients had shorter OS than the low-risk patients (P = 0.03; HR = 1.9; 95% CI = 1.1–3.5, Figure 1(e)). The AUCs for 1-, 3-, and 5-year OS in the meta-validation set were 0.536, 0.588, and 0.626, respectively. Considering the sample size, we further explored the association between DRGS and survival in the merged meta-cohort, and similar results were obtained (Figure 1(f)). Analyses conducted separately in the GSE53625 and TCGA-ESCC cohorts also yielded similar results (Figure S2).
3.3. Independence, Subgroup, and Comparative Analysis
We then verified the independent prognostic performance of the DRGS using multivariable Cox regression analysis, adjusting for the clinicopathological factors mentioned previously. In the multivariable analysis including the variables with significant results in the univariable analysis (P < 0.05), the DRGS was identified as an independent prognostic factor for OS (P < 0.001; HR = 2.67; 95% CI = 1.80–3.94, Table 1). The American Joint Committee on Cancer (AJCC) tumor, node, metastasis (TNM) staging system, tumor location, and tumor grade are regarded as standard factors for predicting the prognostic outcomes of esophageal cancer. Hence, we also included grade and location in the multivariable Cox regression analysis, which again showed a significant association between DRGS and OS (P < 0.001; HR = 2.52; 95% CI = 1.69–3.75, Table 1).
Table 1.
Univariable and multivariable Cox regression analyses to identify independent prognostic predictors in the GSE53625 cohorts.
Characteristics | Size | Uni-Cox HR (95% CI) | Uni-Cox P value | Multi-Cox (a) HR (95% CI) | Multi-Cox (a) P value | Multi-Cox (b) HR (95% CI) | Multi-Cox (b) P value |
---|---|---|---|---|---|---|---|
Age | | | | | | | |
≥60 vs. <60 | 179 | 1.58 (1.07–2.31) | 0.0240 | 1.43 (0.97–2.10) | 0.0709 | 1.57 (1.06–2.34) | 0.0235 |
Sex | | | | | | | |
Male vs. female | 179 | 0.78 (0.49–1.25) | 0.3070 | | | | |
Grade | | | | | | | |
Moderately vs. well | 130 | 1.01 (0.59–1.75) | 0.9620 | | | 0.78 (0.44–1.37) | 0.3810 |
Poorly vs. well | 81 | 1.65 (0.93–2.96) | 0.0900 | | | 1.09 (0.60–1.99) | 0.7750 |
Stage | | | | | | | |
II vs. I | 179 | 2.15 (1.45–3.21) | 0.00015 | 2.26 (1.51–3.38) | <0.0001 | 2.30 (1.51–3.51) | 0.00011 |
Location | | | | | | | |
Middle vs. upper | 117 | 0.68 (0.39–1.20) | 0.1850 | | | 0.64 (0.36–1.16) | 0.1400 |
Lower vs. upper | 82 | 0.60 (0.33–1.11) | 0.1010 | | | 0.49 (0.26–0.95) | 0.0338 |
Tobacco | | | | | | | |
Yes vs. no | 179 | 0.75 (0.51–1.10) | 0.1450 | | | | |
Alcohol | | | | | | | |
Yes vs. no | 179 | 0.86 (0.59–1.27) | 0.4550 | | | | |
DRGS | | | | | | | |
High vs. low | 179 | 2.57 (1.75–3.77) | <0.0001 | 2.67 (1.80–3.94) | <0.0001 | 2.52 (1.69–3.75) | <0.0001 |
(a) Variables in the multi-Cox analysis were selected by P < 0.05; (b) variables in the multi-Cox analysis were selected by P < 0.05 and clinical expertise. HR: hazard ratio; CI: confidence interval; DRGS: DDR-related gene expression signature.
We further performed subgroup analyses to investigate potential confounding factors (sex, tobacco, alcohol, grade, and location). Compared with the low-risk group, high-risk patients had lower OS rates irrespective of their sex, smoking habits, and drinking habits. In addition, high-risk patients with well- or moderately differentiated tumors and those with middle or lower thoracic ESCC also exhibited shorter OS (Figure 2(a)). According to the TNM staging system for ESCC, patients were separated into two subgroups: early stage (TNM stages I and II) and late stage (TNM stages III and IV). We found equivalent predictive efficacy in patients in both early and late stages (Figures 2(b)–2(c)). We then specifically examined the prognostic ability of the DRGS combined with residual tumor status in patients with ESCC. We found that patients with low DRGS and R0 resection margin status had significant survival advantages, while patients with high DRGS, even with R0 resection margin status, had a worse OS compared with those without R0 resection margin status (Figure 2(d)).
Figure 2.
Performance of the DRGS in predicting OS among clinical factors. (a) Subgroup analyses estimating the prognostic value of DRGS in different clinical factors. (b) KM analysis of the DRGS in the early-stage (I/II) ESCC patients. (c) KM analysis of the DRGS in the advanced-stage (III/IV) ESCC patients. (d) KM survival curves of OS among four patient groups stratified by the DRGS and residual tumor.
Furthermore, we compared the DRGS with several previously published multigene signatures for predicting ESCC prognosis using AUC and C-index. The results demonstrate that the DRGS showed a higher C-index and AUC for OS prediction than the other six signatures (Figure 3, Table S2).
Figure 3.
Performance comparison between the DRGS and six previous models. Comparison of the DRGS with previously published signatures using restricted mean survival (RMS) time (a) and AUC for predicting 1-year (b), 3-year (c), and 5-year (d) survival rates.
3.4. Association between DRGS and Cancer Hallmarks
To identify the biological significance of the DRGS, GSEA was conducted to compare the high-risk group with the low-risk group in the TCGA data set. As indicated in Figure 4 and Table S3, cell cycle, G2M checkpoint, E2F targets, mitotic spindle, and homologous recombination pathways were significantly enriched in the low DRGS group, while patients with high DRGS scores showed enrichment of genes involved in the metabolic processes, ribosome synthesis, cardiac muscle contraction, and ABC transporter expression.
Figure 4.
Gene set enrichment analyses between the high- and low-risk groups. Representative hallmarks in (a) the high‐risk group and (b) the low-risk group.
3.5. Construction and Verification of the Predictive Nomogram
Next, a nomogram integrating the DRGS and two clinical factors, tumor location and tumor stage, was constructed to predict the 1-, 3-, and 5-year OS rates in ESCC patients (Figure 5(a)). Calibration plots indicate that the nomogram might overestimate the 3- or 5-year survival rate (Figure 5(b)). The C-indexes for tumor stage, tumor location, DRGS, and the nomogram were 0.61 (95% CI = 0.56–0.66), 0.51 (95% CI = 0.47–0.56), 0.61 (95% CI = 0.57–0.65), and 0.67 (95% CI = 0.62–0.72), respectively. The DCA curves demonstrate that the nomogram showed a greater net benefit than the other factors (Figure 5(c)).
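For readers unfamiliar with decision curve analysis, the net benefit plotted on a DCA curve can be sketched as below. This is a simplified binary-outcome version that ignores censoring, so it is not the survival-adapted calculation a nomogram analysis would require; the function names, predicted risks, and outcomes are hypothetical.

```python
def net_benefit(predicted_risks, outcomes, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold.

    outcomes: 1 if the event occurred by the horizon, else 0 (censoring is
    ignored in this simplified binary version).
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(outcomes)
    pairs = list(zip(predicted_risks, outcomes))
    true_pos = sum(1 for r, y in pairs if r >= threshold and y == 1)
    false_pos = sum(1 for r, y in pairs if r >= threshold and y == 0)
    return true_pos / n - (false_pos / n) * threshold / (1.0 - threshold)

def treat_all_net_benefit(outcomes, threshold):
    """Reference strategy: every patient is treated regardless of risk."""
    prevalence = sum(outcomes) / len(outcomes)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)

# Hypothetical predicted risks and observed outcomes for four patients.
toy_risks = [0.9, 0.8, 0.3, 0.2]
toy_outcomes = [1, 1, 0, 0]
print(net_benefit(toy_risks, toy_outcomes, 0.5))
print(treat_all_net_benefit(toy_outcomes, 0.5))
```

A model is considered useful at a given threshold when its net benefit exceeds both the treat-all reference and the treat-none reference, whose net benefit is zero.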
Figure 5.
Construction and validation of a nomogram for predicting OS. (a) Nomogram predicting OS for ESCC patients at 1, 3, and 5 years. (b) Calibration plot for predicting 1-, 3-, and 5-year OS. (c) The DCA curves of the nomograms in ESCC.
3.6. The DRGS in the Prediction of TCGA Pan-Cancer and Immune Checkpoint Inhibition (ICI) Cohorts
To further examine the utilization of the DRGS, we calculated the DRGS score in the TCGA pan-cancer data set to verify its prognostic value. Although there was heterogeneity among different tumors, the DRGS was supported as a favorable prognostic signature in TCGA pan-cancer (Figure 6(a)).
Figure 6.
DRGS is a prognostic biomarker and predicts immunotherapy efficacy. Subgroup analyses estimating the prognostic value of DRGS in (a) pan-cancers from TCGA data sets. The Kaplan–Meier survival curves of overall survival in (b) the Liu2019 cohort, (e) the IMvigor210 cohort, and (h) the GSE78220 cohort. Rate of CR/PR and SD/PD to anti-PD-L1 immunotherapy in the high or low group in (c) the Liu2019 cohort, (f) the IMvigor210 cohort, and (i) the GSE78220 cohort. Distribution of DRGS scores with different anti-PD-L1 clinical responses in (d) the Liu2019 cohort, (g) the IMvigor210 cohort, and (j) the GSE78220 cohort. The values represent the mean value. The differences among groups were compared using the Kruskal–Wallis test.
We next investigated the prognostic value of the DRGS in three immunotherapy cohorts. In both metastatic melanoma (Liu2019) and metastatic urothelial carcinoma (IMvigor210), the patients with low DRGS scores exhibited favorable clinical benefits and longer survival (Liu2019, P < 0.001, HR = 3.0, 95% CI = 1.80–5.00, Figures 6(b)–6(d); IMvigor210, P = 0.049, HR = 1.5, 95% CI = 1.00–2.20, Figures 6(e)–6(g)). No significant difference was observed in the GSE78220 cohort (P = 0.166, HR = 2.3, 95% CI = 0.69–7.50, Figures 6(h)–6(j)), which could be due to the small sample size of this cohort (n = 25).
4. Discussion
In this study, we developed and validated a DRGS based on the expression of DDR-related genes to predict the prognostic outcomes of patients with ESCC. A nomogram based on the DRGS score and clinical variables was further built for prognostic prediction. Lastly, our study suggested that the DRGS score was correlated with survival in most tumors beyond ESCC, further supporting the potential utility of the DRGS in clinical settings. However, the precise clinical application of the DRGS score needs to be further examined in larger cohorts.
Prognostic factors inform therapeutic decision-making, the assessment of curative efficacy, and prognostic prediction in clinical practice. Conventional prognostic indicators mainly include clinicopathological risk factors, pathological grade, and TNM stage, which show uneven prediction efficiencies. In this study, the DRGS score displayed robust performance in the prediction of ESCC prognosis, which remained stable in the subgroup analysis and multivariate regression analysis. In addition, it is noteworthy that R0 resection showed significant survival advantages among patients with low DRGS, while R0 resection did not exhibit enhanced survival benefits among patients with high DRGS, which suggests that alternatives to surgery alone may deserve consideration in the latter group. Considering the poor prognosis of patients with high DRGS, postsurgical adjuvant therapies might be helpful and should be actively considered. The therapeutic benefits of other treatments, such as adjuvant chemotherapy and immunotherapy, need to be explored in patients with high DRGS. In addition, the DRGS is an independent prognostic factor for ESCC, and thus, we constructed an integrated model using the DRGS in combination with clinicopathological features, which further improved on the predictive performance of the individual factors and supported the clinical utility of the DRGS.
Previously published studies have investigated the potential of DDR genes in the prognostic prediction of other cancer types. For example, Pang et al. developed a DDR-related gene-based prognostic model for low-grade gliomas and subsequently discovered that mutations in the isocitrate dehydrogenase (IDH) gene might affect prognosis through the regulation of DDR pathways [25]. In another study by Sun et al., a prognostic signature was constructed for OS prediction in patients with ovarian cancer, which might also serve as a potential therapeutic target in ovarian cancer [23]. Besides, co-mutations in specific DDR pathway-associated genes have been identified as predictors of survival outcomes in response to immune checkpoint blockade, which has inspired their clinical use in selecting patients for immunotherapy [37]. In this study, the 10 DDR-related genes included in the DRGS by LASSO had either positive or negative coefficients, suggesting that these genes might act differently, and even with opposite functions, in ESCC pathology. As the key interacting partner of RAD51, DNA meiotic recombinase 1 (DMC1) has been reported to promote the proliferation of ESCC cells through the interaction between RAD51 and checkpoint kinase 1 (CHK1) [38]. The downregulation of period circadian regulator 1 (PER1) gene expression has been found to enhance the tumorigenicity and proliferation of oral squamous cell carcinoma cells [39]. Furthermore, inhibited expression of X-ray repair cross-complementing 5 (XRCC5) in ESCC cells has been linked to reduced malignant behaviors of tumor cells, such as proliferation, clonal progression, and apoptosis escape [40]. Besides, our GSEA results also showed significant enrichment of genes involved in cell cycle and mitotic spindle regulation in the low DRGS group and of those involved in metabolic process-related pathways in the high DRGS group.
Taken together, these findings suggest potential associations between DDR-related genes and regulation of cell division and metabolism, which may affect the occurrence and development of ESCC.
Furthermore, we assessed the application of the DRGS model in pan-cancer cohorts and investigated the potential prognostic value of the DDR-related genes in three immunotherapy cohorts. The results showed that patients with low DRGS scores exhibited favorable clinical benefits and longer survival after immunotherapy in both metastatic melanoma and metastatic urothelial carcinoma. A possible explanation for this observation is that mutations in DDR-associated genes cause a deficiency in DNA repair capacity, which might, in turn, increase the neoantigen burden and subsequently improve the response to immunotherapy. These results could provide novel insights into promising biomarkers involved in DNA repair pathways for the prediction of responses to immune checkpoint blockade therapies.
However, the limitations of this study should not be ignored. This is a retrospective study based on public databases, which might limit the strength of evidence, and therefore, the results should be regarded as hypothesis-generating rather than conclusive. Besides, the DRGS was only validated in one cohort, which might introduce potential bias, and validation in further cohorts is recommended.
5. Conclusions
In summary, a DRGS score consisting of 10 DDR-related genes was designed for prognostic prediction in patients with ESCC and validated in two meta-data sets and pan-cancers. Furthermore, a nomogram combining the DRGS score, tumor location, and tumor stage was built, which exhibited great potential in predicting OS and immunotherapy efficacy. Additionally, the DRGS score may contribute to clinical decision-making for treatment and holds promise for clinical practice in the future. Retrospective studies in larger cohorts and prospective studies are warranted to investigate the mechanisms and clinical utility of the DRGS score.
Acknowledgments
The authors would like to acknowledge the TCGA and the GEO for providing data. This work was supported by the 2020 Guangdong Provincial Special Project for Popularization of Science and Technology Innovation (2020A1414070007) and the Science and Technology Program of Guangzhou, China (201704020107).
Abbreviations
- EAC: Esophageal adenocarcinoma
- ESCC: Esophageal squamous cell carcinoma
- SCC Ag: Squamous cell carcinoma antigen
- TCGA: The Cancer Genome Atlas
- OS: Overall survival
- RFS: Relapse-free survival
- DDR: DNA damage response
- GEO: Gene Expression Omnibus
- GDC: Genomic Data Commons
- TMM: Trimmed mean of M values
- logCPM: Log2 counts per million
- LASSO: Least absolute shrinkage and selection operator
- KM: Kaplan–Meier
- ROC: Receiver operating characteristic
- C-index: Concordance index
- DCA: Decision curve analysis
- AUC: Area under the curve
- GSEA: Gene set enrichment analyses
- SE: Standard error
- DRGS: DDR-related gene expression signature
- AJCC: American Joint Committee on Cancer.
Contributor Information
Rixin Chen, Email: chenrixin@gdph.org.cn.
Guibin Qiao, Email: guibinqiao@126.com.
Data Availability
The datasets generated and analyzed during the current study are available in the public data repositories, TCGA-GDC: https://portal.gdc.cancer.gov/repository and GEO: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE53625.
Disclosure
Weitao Zhuang, Xiaosong Ben, and Zihao Zhou share the first authorship.
Conflicts of Interest
Y. Liao, Q. Zhou, J. Zhao, G. Wang, Y. Xu, Y. Zhang, X. Wen, and S. Cai are employees of Burning Rock Biotech. The remaining authors declare no conflicts of interest.
Authors' Contributions
G. Qiao and R. Chen designed this work. W. Zhuang, X. Ben, Z. Zhou, and Y. Liao integrated and analyzed the data. W. Zhuang and Q. Zhou wrote this manuscript. Y. Ding, Y. Tang, S. Huang, C. Deng, J. Zhao, G. Wang, Y. Xu, X. Wen, Y. Zhang, and S. Cai edited and revised the manuscript. All the authors approved this manuscript. Weitao Zhuang, Xiaosong Ben, and Zihao Zhou have contributed equally to this work.
Supplementary Materials
Supplementary Table S1. Patient characteristics in the meta-data sets. Supplementary Table S2. Comparison of the DRGS with six other previously reported signatures. Supplementary Table S3. GSEA on the TCGA cohort to explore mechanisms underlying the DRGS. Supplementary Figure S1. (a) LASSO deviance profiles of the cancer-associated DDR genes in ESCC; (b) LASSO coefficient profiles of the cancer-associated DDR genes in ESCC; (c) the optimal subset selection using best subset regression model; (d) chord diagram for the relationship between signature-based genes and DDR pathways. Supplementary Figure S2. Kaplan–Meier analysis of the DRGS in (a) the GSE53625 cohort and (b) the TCGA-ESCC cohort.
References
- 1.Sung H., Ferlay J., Siegel R. L. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA: A Cancer Journal for Clinicians . 2021;71 doi: 10.3322/caac.21660. [DOI] [PubMed] [Google Scholar]
- 2.Arnold M., Soerjomataram I., Ferlay J., Forman D. Global incidence of oesophageal cancer by histological subtype in 2012. Gut . 2015;64(3):381–387. doi: 10.1136/gutjnl-2014-308124. [DOI] [PubMed] [Google Scholar]
- 3.Siegel R. L., Miller K. D., Jemal A. Cancer statistics, 2020. CA: A Cancer Journal for Clinicians. 2020;70(1):7–30. doi: 10.3322/caac.21590.
- 4.Kanie Y., Okamura A., Maruyama S. Clinical significance of serum squamous cell carcinoma antigen for patients with recurrent esophageal squamous cell carcinoma. Annals of Surgical Oncology. 2021;28. doi: 10.1245/s10434-021-09945-5.
- 5.Qiao Y., Chen C., Yue J., Yu Z. Tumor marker index based on preoperative SCC and CYFRA 21-1 is a significant prognostic factor for patients with resectable esophageal squamous cell carcinoma. Cancer Biomarkers. 2019;25(3):243–250. doi: 10.3233/cbm-190058.
- 6.Kozlowski M., Kowalczuk O., Milewski R., Chyczewski L., Niklinski J., Laudański J. Serum vascular endothelial growth factors C and D in patients with oesophageal cancer. European Journal of Cardio-Thoracic Surgery. 2010;38(3):260–267. doi: 10.1016/j.ejcts.2010.01.061.
- 7.Kang H., Wang N., Wang X., et al. A glycolysis-related gene signature predicts prognosis of patients with esophageal adenocarcinoma. Aging. 2020;12(24):25828–25844. doi: 10.18632/aging.104206.
- 8.Wang L., Wei Q., Zhang M. Identification of the prognostic value of immune gene signature and infiltrating immune cells for esophageal cancer patients. International Immunopharmacology. 2020;87:p. 106795. doi: 10.1016/j.intimp.2020.106795.
- 9.Yu M., Zhanzhao F., Yunjie Z. A seven-lncRNA signature predicts overall survival in esophageal squamous cell carcinoma. Scientific Reports. 2018;8:p. 8823. doi: 10.1038/s41598-018-27307-2.
- 10.Wang Y., Liang N., Xue Z., Xue X. Identifying an eight-gene signature to optimize overall survival prediction of esophageal adenocarcinoma using bioinformatics analysis of ceRNA network. OncoTargets and Therapy. 2020;13:13041–13054. doi: 10.2147/OTT.S287084.
- 11.Dong Z., Wang J., Zhan T., Xu S. Identification of prognostic risk factors for esophageal adenocarcinoma using bioinformatics analysis. OncoTargets and Therapy. 2018;11:4327–4337. doi: 10.2147/ott.s156716.
- 12.Zhu L., Dong L., Feng M., et al. Profiles of autophagy-related genes in esophageal adenocarcinoma. BMC Cancer. 2020;20(1):p. 943. doi: 10.1186/s12885-020-07416-w.
- 13.Zhao M., Wang J., Yuan M., Ma Z., Bao Y., Hui Z. Multivariate gene expression-based survival predictor model in esophageal adenocarcinoma. Thoracic Cancer. 2020;11(10):2896–2908. doi: 10.1111/1759-7714.13626.
- 14.Zhang C., Luo Y., Zhang Z. Identification of a prognostic immune signature for esophageal squamous cell carcinoma to predict survival and inflammatory landscapes. Frontiers in Cell and Developmental Biology. 2020;8. doi: 10.3389/fcell.2020.580005.
- 15.Zhang D., Qian C., Wei H., Qian X. Identification of the prognostic value of tumor microenvironment-related genes in esophageal squamous cell carcinoma. Frontiers in Molecular Biosciences. 2020;7:p. 599475. doi: 10.3389/fmolb.2020.599475.
- 16.Peng L., Guo J. C., Long L., et al. A novel clinical six-flavoprotein-gene signature predicts prognosis in esophageal squamous cell carcinoma. BioMed Research International. 2019;2019:p. 3869825. doi: 10.1155/2019/3869825.
- 17.Sun L. L., Wu J. Y., Wu Z. Y. A three-gene signature and clinical outcome in esophageal squamous cell carcinoma. International Journal of Cancer. 2015;136:E569–E577. doi: 10.1002/ijc.29211.
- 18.Zhan X. H., Jiao J. W., Zhang H. F. A three-gene signature from protein-protein interaction network of LOXL2- and actin-related proteins for esophageal squamous cell carcinoma prognosis. Cancer Medicine. 2017;6. doi: 10.1002/cam4.1096.
- 19.He W., Yan Q., Fu L., Han Y. A five-gene signature to predict the overall survival time of patients with esophageal squamous cell carcinoma. Oncology Letters. 2019;18:1381–1387. doi: 10.3892/ol.2019.10449.
- 20.Li M., Zhao J., Li X., et al. HiFreSP: a novel high-frequency sub-pathway mining approach to identify robust prognostic gene signatures. Briefings in Bioinformatics. 2020;21(4):1411–1424. doi: 10.1093/bib/bbz078.
- 21.Li Y., Lu Z., Che Y., et al. Immune signature profiling identified predictive and prognostic factors for esophageal squamous cell carcinoma. OncoImmunology. 2017;6(11):p. e1356147. doi: 10.1080/2162402x.2017.1356147.
- 22.Minchom A., Aversa C., Lopez J. Dancing with the DNA damage response: next-generation anti-cancer therapeutic strategies. Therapeutic Advances in Medical Oncology. 2018;10:p. 1758835918786658. doi: 10.1177/1758835918786658.
- 23.Sun H., Cao D., Ma X. Identification of a prognostic signature associated with DNA repair genes in ovarian cancer. Frontiers in Genetics. 2019;10:p. 839. doi: 10.3389/fgene.2019.00839.
- 24.Gobin M., Nazarov P. V., Warta R., et al. A DNA repair and cell-cycle gene expression signature in primary and recurrent glioblastoma: prognostic value and clinical implications. Cancer Research. 2019;79(6):1226–1238. doi: 10.1158/0008-5472.can-18-2076.
- 25.Pang F.-M., Yan H., Mo J.-L., et al. Integrative analyses identify a DNA damage repair gene signature for prognosis prediction in lower grade gliomas. Future Oncology. 2020;16(8):367–382. doi: 10.2217/fon-2019-0764.
- 26.Li J., Chen Z., Tian L., et al. LncRNA profile study reveals a three-lncRNA signature associated with the survival of patients with oesophageal squamous cell carcinoma. Gut. 2014;63(11):1700–1710. doi: 10.1136/gutjnl-2013-305806.
- 27.Li R., Qu H., Wang S., et al. GDCRNATools: an R/Bioconductor package for integrative analysis of lncRNA, miRNA and mRNA data in GDC. Bioinformatics. 2018;34(14):2515–2517. doi: 10.1093/bioinformatics/bty124.
- 28.Robinson M. D., McCarthy D. J., Smyth G. K. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics. 2010;26(1):139–140. doi: 10.1093/bioinformatics/btp616.
- 29.Leek J. T., Johnson W. E., Parker H. S., Jaffe A. E., Storey J. D. The sva package for removing batch effects and other unwanted variation in high-throughput experiments. Bioinformatics. 2012;28(6):882–883. doi: 10.1093/bioinformatics/bts034.
- 30.Necchi A., Joseph R. W., Loriot Y., et al. Atezolizumab in platinum-treated locally advanced or metastatic urothelial carcinoma: post-progression outcomes from the phase II IMvigor210 study. Annals of Oncology. 2017;28(12):3044–3050. doi: 10.1093/annonc/mdx518.
- 31.Hugo W., Zaretsky J. M., Sun L., et al. Genomic and transcriptomic features of response to anti-PD-1 therapy in metastatic melanoma. Cell. 2016;165(1):35–44. doi: 10.1016/j.cell.2016.02.065.
- 32.Liu D., Schilling B., Liu D., et al. Integrative molecular and clinical modeling of clinical outcomes to PD1 blockade in patients with metastatic melanoma. Nature Medicine. 2019;25(12):1916–1927. doi: 10.1038/s41591-019-0654-5.
- 33.Knijnenburg T. A., Wang L., Zimmermann M. T. Genomic and molecular landscape of DNA damage repair deficiency across The Cancer Genome Atlas. Cell Reports. 2018;23:239–e6. doi: 10.1016/j.celrep.2018.03.076.
- 34.Farkas O., Héberger K. Comparison of ridge regression, partial least-squares, pairwise correlation, forward- and best subset selection methods for prediction of retention indices for aliphatic alcohols. Journal of Chemical Information and Modeling. 2005;45(2):339–346. doi: 10.1021/ci049827t.
- 35.Friedman J., Hastie T., Tibshirani R. Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software. 2010;33(1):1–22. doi: 10.18637/jss.v033.i01.
- 36.Zhou T., Cai Z., Ma N. A novel ten-gene signature predicting prognosis in hepatocellular carcinoma. Frontiers in Cell and Developmental Biology. 2020;8:p. 629. doi: 10.3389/fcell.2020.00629.
- 37.Wang Z., Zhao J., Wang G., et al. Comutations in DNA damage response pathways serve as potential biomarkers for immune checkpoint blockade. Cancer Research. 2018;78(22):6486–6496. doi: 10.1158/0008-5472.can-18-1814.
- 38.Zhu X., Pan Q., Huang N. RAD51 regulates CHK1 stability via autophagy to promote cell growth in esophageal squamous carcinoma cells. Tumour Biology. 2016;148. doi: 10.1007/s13277-016-5455-6.
- 39.Fu X. J., Li H. X., Yang K., Chen D., Tang H. The important tumor suppressor role of PER1 in regulating the cyclin-CDK-CKI network in SCC15 human oral squamous cell carcinoma cells. OncoTargets and Therapy. 2016;9:2237–2245. doi: 10.2147/OTT.S100952. [Retracted]
- 40.Wang S., Xi J., Lin Z., et al. Clinical values of Ku80 upregulation in superficial esophageal squamous cell carcinoma. Cancer Medicine. 2018;7(4):1006–1018. doi: 10.1002/cam4.1314.
Supplementary Materials
Supplementary Table S1. Patient characteristics in the meta-data sets. Supplementary Table S2. Comparison of the DRGS with eight other previously reported signatures. Supplementary Table S3. GSEA on the TCGA cohort to explore mechanisms underlying the DRGS. Supplementary Figure S1. (a) LASSO deviance profiles of the cancer-associated DDR genes in ESCC; (b) LASSO coefficient profiles of the cancer-associated DDR genes in ESCC; (c) the optimal subset selection using best subset regression model; (d) chord diagram for the relationship between signature-based genes and DDR pathways. Supplementary Figure S2. Kaplan–Meier analysis of the DRGS in (a) the GSE53625 cohort and (b) the TCGA-ESCC cohort.
Data Availability Statement
The datasets generated and analyzed during the current study are available in public data repositories: TCGA via the GDC data portal (https://portal.gdc.cancer.gov/repository) and GEO (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE53625).