Abstract
Background:
Breast cancer patients who are positive for hormone receptor typically exhibit a favorable prognosis. It is controversial whether chemotherapy is necessary for them after surgery. Our study aimed to establish a multigene model to predict the relapse of hormone receptor-positive early-stage Chinese breast cancer after surgery and direct individualized application of chemotherapy in breast cancer patients after surgery.
Methods:
In this study, differentially expressed genes (DEGs) were identified between relapse and nonrelapse breast cancer groups based on RNA sequencing. Gene set enrichment analysis (GSEA) was performed to identify potential relapse-relevant pathways. CIBERSORT and Microenvironment Cell Populations-counter algorithms were used to analyze immune infiltration. Least absolute shrinkage and selection operator (LASSO) regression, log-rank tests, and multiple Cox regression were performed to identify prognostic signatures. A predictive model was developed and validated using Kaplan–Meier analysis and receiver operating characteristic (ROC) curves.
Results:
A total of 234 out of 487 patients were enrolled in this study, and 1588 DEGs were identified between the relapse and nonrelapse groups. GSEA results showed that immune-related pathways were enriched in the nonrelapse group, whereas cell cycle- and metabolism-relevant pathways were enriched in the relapse group. A predictive model was developed using three genes (CKMT1B, SMR3B, and OR11M1P) generated from the LASSO regression. The model stratified breast cancer patients into high- and low-risk subgroups with significantly different prognostic statuses, and our model was independent of other clinical factors. Time-dependent ROC showed high predictive performance of the model.
Conclusions:
A multigene model was established from RNA-sequencing data to direct risk classification and predict relapse of hormone receptor-positive breast cancer in Chinese patients. Utilization of the model could provide individualized evaluation of chemotherapy after surgery for breast cancer patients.
Keywords: Breast neoplasms, CKMT1B, OR11M1P, Predictive model, Prognosis, Risk score, SMR3B
Introduction
Breast cancer is one of the most common cancers and the second leading cause of cancer death in women, and its socioeconomic burden is increasing rapidly in China.[1,2] Hormone receptor (HR)-positive, early-stage breast cancers account for 66% of breast cancers, and after treatment with endocrine therapy combined with chemotherapy, the 5-year survival rates are greater than 90%.[3] However, among those receiving endocrine therapy alone, fewer than 15% of HR-positive patients relapse, indicating that more than 85% of these patients might be over-treated if given chemotherapy.[4] Identifying which patients do not need chemotherapy is therefore important; however, no standardized approach has been established.
Various multigene prediction models have been developed to date and are recommended in the National Comprehensive Cancer Network (NCCN) guidelines for predicting prognosis and guiding postoperative chemotherapy.[5] The 21-gene assay Oncotype DX (Genomic Health Inc., Redwood City, California, USA) was developed based on American patients to predict prognosis and direct chemotherapy.[4,6] The MammaPrint 70-gene model (Agendia Inc., Amsterdam, The Netherlands) was developed for European breast cancer patients after surgery to predict clinical prognosis[7] and chemotherapy benefits.[8] In addition, the Prediction Analysis of Microarray (PAM) 50 model (Nanostring Technologies Inc., Seattle, Washington, USA)[9,10] and the EndoPredict (EP) model (Myriad Genetics Inc., Salt Lake City, Utah, USA)[11,12] are also commonly used. Despite the availability of these models, they may not be directly applicable to Chinese patients because of substantial differences in genetic background between Chinese and Western breast cancer patients. For instance, Lang et al[13] found a higher frequency of TP53 and NF1 mutations in luminal patients from Fudan cohorts than in patients from The Cancer Genome Atlas (TCGA) database (founded by The National Cancer Institute and The National Human Genome Research Institute) and the Memorial Sloan Kettering Cancer Center database (founded by Memorial Sloan Kettering Cancer Center). These findings raised concerns about the generalizability of the previous models.
Our study explored prognostic signatures through transcriptomic analysis on a cohort of Chinese HR-positive breast cancer patients, aiming to establish a multigene model for the prediction of the disease relapse in Chinese patients and direct individualized application of chemotherapy in breast cancer patients after surgery.
Methods
Patient information and tissue selection
Patients diagnosed with breast cancer from Jan. 2007 to Dec. 2014 at two centers, the National Cancer Center (NCC) and the Fourth Hospital of Hebei Medical University (HBMU), were included in this study. The inclusion criteria were as follows: (1) female patients diagnosed with primary breast cancer by pathological examination, (2) either estrogen receptor (ER) or progesterone receptor (PR) was positive, (3) patients who had undergone breast surgery, (4) patients with stage T1-3N0M0 disease, and (5) patients without a history of other cancers. Those with metastatic breast cancer were excluded from this study. Information on age, tumor size, tumor grade, ER, PR, Ki67, human epidermal growth factor receptor 2 (HER2), and treatment after surgery was collected from the medical records. Patients were followed up for a mean of more than five years to determine their relapse status. The tissue sections were collected from tumors that had been surgically resected and preserved as formalin-fixed paraffin-embedded (FFPE) blocks. Sections with <5% cancer cells were removed. The bioinformatics quality control criteria to identify samples with a low level of degradation were as follows: ribosomal RNA (rRNA) reads <5%, mapping rate >75%, intron reads <70%, and exon reads >5 million. The study was conducted in accordance with the Helsinki Declaration and approved by the institutional review boards of the NCC (Approval No. 20/468-2664). The requirement for informed consent was waived because it was a retrospective study.
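The four bioinformatics quality control thresholds listed above can be expressed as a single predicate. The following is a minimal Python sketch, not the pipeline used in the study; the class name `SampleQC`, its field names, and the example values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class SampleQC:
    """Per-sample alignment metrics (hypothetical field names)."""
    rrna_read_fraction: float    # fraction of reads assigned to rRNA
    mapping_rate: float          # fraction of reads mapped to the genome
    intron_read_fraction: float  # fraction of reads falling in introns
    exon_reads: int              # number of reads falling in exons


def passes_qc(sample: SampleQC) -> bool:
    """Return True only when all four thresholds stated in the text are met."""
    return (
        sample.rrna_read_fraction < 0.05
        and sample.mapping_rate > 0.75
        and sample.intron_read_fraction < 0.70
        and sample.exon_reads > 5_000_000
    )


# Invented example values: one well-preserved and one degraded sample.
good = SampleQC(0.02, 0.88, 0.45, 12_000_000)
degraded = SampleQC(0.02, 0.88, 0.45, 3_000_000)
print(passes_qc(good), passes_qc(degraded))
```

All comparisons are strict, mirroring the "<" and ">" signs in the criteria, so a sample sitting exactly on a threshold is rejected.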
RNA sequencing (RNA-seq)
FFPE samples were reviewed to determine whether there was sufficient tumor tissue. RNA was extracted from 10-μm unstained sections using the QIAGEN RNeasy FFPE Kit (Cat. No. 73504, QIAGEN, Dusseldorf, Germany) and measured by a NanoDrop One spectrophotometer (Thermo Scientific, Waltham, Massachusetts, USA) to examine total RNA content and DNA contamination. Ribosomal RNA was removed using the NEBNext rRNA Depletion Kit (Cat. E6310X, New England Biolabs, Los Angeles, California, USA) before mRNA library construction. Gene expression profiling was performed using next-generation sequencing with a NovaSeq 6000 instrument (Illumina, San Diego, California, USA).
Differentially expressed gene (DEG) analysis and gene set enrichment analysis (GSEA)
DEGs between the relapse and nonrelapse groups were identified with the “DESeq2” package in R software (https://www.r-project.org) using the read count matrix of the 173 of the 234 cases that were sequenced first (the first cohort). The threshold for DEGs was set as false discovery rate (FDR) <0.05 and |log (Fold Change)| >1. GSEA was performed with the “clusterProfiler” package in R software to identify pathways enriched in the relapse and nonrelapse groups. Significant pathways were defined as those with |normalized enrichment score (NES)| >1 and q value <0.05.
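The DEG threshold can be illustrated with a short filter over per-gene test results. This is a Python sketch under stated assumptions, not the DESeq2 workflow itself: the `GeneResult` record, the `select_degs` helper, and the toy gene names and values are hypothetical, and positive log fold change is assumed to mean higher expression in the relapse group.

```python
from typing import List, NamedTuple, Tuple


class GeneResult(NamedTuple):
    gene: str
    log_fc: float  # log fold change, relapse vs. nonrelapse (sign convention assumed)
    fdr: float     # multiple-testing-adjusted P-value


def select_degs(
    results: List[GeneResult],
    fdr_cutoff: float = 0.05,
    log_fc_cutoff: float = 1.0,
) -> Tuple[List[str], List[str]]:
    """Split genes passing FDR < cutoff and |log FC| > cutoff by direction."""
    up_in_relapse, up_in_nonrelapse = [], []
    for r in results:
        if r.fdr < fdr_cutoff and abs(r.log_fc) > log_fc_cutoff:
            if r.log_fc > 0:
                up_in_relapse.append(r.gene)
            else:
                up_in_nonrelapse.append(r.gene)
    return up_in_relapse, up_in_nonrelapse


# Toy input with invented gene names and values.
toy = [
    GeneResult("GENE_A", 2.1, 0.001),   # passes, higher in relapse
    GeneResult("GENE_B", -1.6, 0.020),  # passes, higher in nonrelapse
    GeneResult("GENE_C", 0.4, 0.001),   # fails the fold-change cutoff
    GeneResult("GENE_D", 3.0, 0.200),   # fails the FDR cutoff
]
up, down = select_degs(toy)
print(up, down)
```

Both conditions must hold at once, so a gene with a large fold change but a weak adjusted P-value (GENE_D) is excluded, as is a highly significant gene with a small effect (GENE_C).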
Comparison of the relative and absolute abundance of tumor-infiltrating immune cells
Anti-tumor immunity might play an important role in the relapse of the disease, so we used the transcriptome sequencing data to compare immune cell populations between groups. To detect differences in the relative fraction of tumor-infiltrating immune cells, the CIBERSORT algorithm with 22 immune signatures was run online (https://cibersortx.stanford.edu/). Then, Microenvironment Cell Populations-counter (MCP-counter) analysis was performed with the “MCPcounter” package (https://github.com/ebecht/MCPcounter) in R software to explore differences in the absolute abundance of immune cells between the relapse and nonrelapse groups. P-value <0.05 was considered statistically significant.
Selection of prognostic signatures correlated with relapse
DEGs were selected with the threshold of P-value <0.05 through least absolute shrinkage and selection operator (LASSO) regression analysis by “glmnet” package from R software. Then, each significant gene was tested by the log-rank test to detect its prognostic value, and genes with P-values <0.05 were retained. Multiple Cox regression analysis was then performed on those genes, and significant genes were finally enrolled in the model. The risk score was calculated using coefficients of those genes obtained from multiple Cox regression and their expression levels.
Validation of the risk model
After the model was established, patients were divided into training group (patients from HBMU cohort) and testing group (patients from NCC cohort) based on their affiliation. Patients in both the training and testing groups were divided into high- and low-risk groups according to their median risk score. Kaplan–Meier (K–M) analysis was performed on each cohort to detect the association between risk score and relapse. Then, the risk score combined with other clinical characteristics was tested using univariate and multiple Cox analyses in data of the training and testing groups to determine whether the risk score was an independent factor associated with relapse. The predictive accuracy of the risk model was tested using receiver operating characteristic (ROC) curves in the training and testing groups. The area under the curves (AUCs) at three and five years were calculated to detect the predictive performance.
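The median-based stratification described above amounts to a one-line rule per cohort. The sketch below is illustrative only; the function name `stratify_by_median` and the patient identifiers are hypothetical, and assigning scores equal to the median to the low-risk group is an assumption, since the text does not say how ties were handled.

```python
from statistics import median
from typing import Dict


def stratify_by_median(scores: Dict[str, float]) -> Dict[str, str]:
    """Label each patient 'high' or 'low' relative to the cohort median.

    Assumption: a score exactly equal to the median is labelled 'low'.
    """
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Invented risk scores for four hypothetical patients.
toy_scores = {"P1": 0.2, "P2": 1.4, "P3": 0.9, "P4": 2.3}
groups = stratify_by_median(toy_scores)
print(groups)
```

Because the cutoff is computed within each cohort, the training and testing groups would each be split at their own median rather than at a shared value.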
Correlation analysis between risk score and DEGs
The expression levels of the model genes were compared between the relapse and nonrelapse groups. To validate the biofunctional value of the model, Pearson's correlation analysis was performed between the risk score and DEGs. Genes with |r| ≥0.3 were enriched in the Metascape database online (http://metascape.org).
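The correlation filter can be sketched as follows: compute Pearson's r between the risk score and each gene's expression across patients, then keep genes with |r| ≥ 0.3 and separate them by sign. This is a plain-Python illustration rather than the analysis code of the study; `pearson_r`, `split_by_correlation`, and the toy data are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sx * sy)


def split_by_correlation(
    risk: Sequence[float],
    expression: Dict[str, Sequence[float]],
    cutoff: float = 0.3,
) -> Tuple[List[str], List[str]]:
    """Return (positively, negatively) correlated genes with |r| >= cutoff."""
    positive, negative = [], []
    for gene, values in expression.items():
        r = pearson_r(risk, values)
        if r >= cutoff:
            positive.append(gene)
        elif r <= -cutoff:
            negative.append(gene)
    return positive, negative


# Invented risk scores and expression values for four hypothetical patients.
risk = [0.1, 0.5, 0.9, 1.3]
expr = {
    "GENE_UP": [1.0, 2.0, 3.0, 4.0],
    "GENE_DOWN": [4.0, 3.1, 2.2, 0.9],
    "GENE_FLAT": [2.0, 1.0, 1.0, 2.0],
}
positive, negative = split_by_correlation(risk, expr)
print(positive, negative)
```

The two resulting gene lists correspond to the positive and negative clusters that would then be submitted separately for enrichment analysis.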
Software packages and statistical analysis
RNA-seq data were aligned to the human genome using Spliced Transcripts Alignment to a Reference (STAR, version 2.7.9a, https://www.encodeproject.org), and the read counts of each sample were generated using HTSeq (version 0.13.5, https://pypi.python.org/pypi/HTSeq). Adapter sequences were trimmed from the raw sequencing reads, and reads containing >30% low-quality bases (base quality <30) and >60% N bases were discarded. Low-quality bases at the end of sequencing reads were also trimmed. DEGs between relapse and nonrelapse groups were analyzed using R software version 4.0.3. The “survival” package in R software was used to perform Cox regression model analysis. K–M analysis was performed to determine survival outcomes with the “survminer” package in R software, and statistical significance was evaluated with a log-rank test. ROC analysis was performed using the “timeROC” package. Categorical variables are presented as number (percentage) and were analyzed using the chi-squared test. P <0.05 was considered statistically significant.
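For readers unfamiliar with K–M analysis, the product-limit estimate that underlies such curves can be written in a few lines. The sketch below is a plain-Python illustration of the estimator, not the “survminer” workflow used in the study; the function name `kaplan_meier` and the toy follow-up data are hypothetical.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(
    times: Sequence[float], events: Sequence[int]
) -> List[Tuple[float, float]]:
    """Product-limit estimate of relapse-free probability.

    times  : follow-up time for each patient
    events : 1 if relapse was observed at that time, 0 if censored
    Returns (time, estimated probability) at each time with >= 1 event.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving  # events and censored patients both leave the risk set
    return curve


# Invented follow-up data (years) for five hypothetical patients.
toy_times = [1, 2, 2, 3, 4]
toy_events = [1, 0, 1, 1, 0]
curve = kaplan_meier(toy_times, toy_events)
print(curve)
```

Censored patients reduce the number at risk without lowering the curve, which is why a patient censored at year 2 still contributes to the estimate up to that point.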
Results
Study design and patient information
We first selected 487 patients who underwent surgery from Jan. 2007 to Dec. 2014. HR-positive patients without lymph node metastasis were selected. Several quality control criteria were then applied to the RNA-seq data to filter out samples with severe RNA degradation. Ultimately, 234 patients were included in the study [Figure 1], comprising 173 in the first cohort, which was sequenced first, and 61 in the second cohort, which was sequenced later. The first cohort (107 from HBMU and 66 from NCC) was used to identify DEGs. Among all 234 cases, those from HBMU (n = 107) were used as the training set to develop the risk model. After removing cases without expression of the model genes (n = 15), cases from NCC (n = 112) were used as the testing set to validate the model. The average follow-up time was 5.64 years in the training set and 6.65 years in the testing set. The age, menopausal status, tumor size, tumor grade, ER status, PR status, HER2 status, Ki67 status, lymphovascular invasion (LVI), and perineural invasion (PNI) of these patients are presented along with information on the endocrine therapy, chemotherapy, and radiotherapy they received [Supplementary Table 1].
Comparison of gene expression patterns, enriched pathways, and immune infiltration features
In the first cohort, a total of 173 cases were divided into a relapse group (n = 68) and a nonrelapse group (n = 105). We identified 1588 significant DEGs between the two groups [Figure 2A]. Among them, 882 genes were up-regulated in the relapse group while 706 were up-regulated in the nonrelapse group. DEGs in the nonrelapse group, such as CCL21 (related to immunoregulation and inflammation), FCER2 (regulating B-cell growth and differentiation), and IGHD (encoding a protein which is a constant region of immunoglobulin heavy chains), were primarily associated with immunoregulation. DEGs in the relapse group, such as TMPRSS4 (related to endopeptidase activity), GDA (related to purine metabolism), and HOXA11 (related to transcriptional regulation), were primarily associated with the regulation of metabolism [Figure 2B]. GSEA pathway analysis was performed using DEGs to identify significant pathways altered between the two groups. In the nonrelapse group, a series of immune-relevant pathways were enriched. Among them, the primary immunodeficiency pathway was the most significant. In addition, T-cell and B-cell receptor signaling pathways and natural killer cell-mediated cytotoxicity pathways were also enriched [Figure 2C]. However, in the relapse group, metabolic reprogramming and cell cycle-relevant pathways, such as the E2F target pathway and cytochrome P450-relevant pathways, were more enriched [Figure 2C].
Because GSEA showed several enriched immune-relevant pathways in the nonrelapse group, and the immune response might play an important role in tumor relapse, we assessed the abundance of immune cell infiltration in the two groups using CIBERSORT and MCP-counter analysis. The CIBERSORT algorithm revealed that naive B cells, CD4+ naive T cells, and memory B cells were more prevalent in the nonrelapse group, whereas CD4+ memory resting T cells and activated mast cells were more prevalent in the relapse group [Figure 2D]. MCP-counter analysis revealed that myeloid dendritic cells, CD8+ T cells, B lineage cells, endothelial cells, and fibroblasts showed higher infiltration in nonrelapse patients [Figure 2E].
Identification of gene markers through LASSO regression analysis
To identify the prognostic genes, we performed LASSO regression analysis on the DEGs, and 14 genes were retained [Supplementary Figure 1A]. Then, the log-rank test was performed on each of the 14 genes, and S100Z (Entrez ID: 170591, located on chromosome 5q13.3) was removed [Supplementary Table 2]. Multiple Cox regression analysis was then performed on the 173 patients using the 13 genes. Finally, CKMT1B (Entrez ID:1159, located on chromosome 15q15), OR11M1P (Entrez ID:121270, located on chromosome 12q13), and SMR3B (Entrez ID:10879, located on chromosome 4q13) showed significant prognostic values. The risk model was then established based on the three genes and their coefficients were obtained from multiple Cox regression analysis [Supplementary Figure 1B].
Validation of the prediction performance of the model
The formula of the risk score of patients was as follows:
Risk score=0.60875×CKMT1B+0.80868×OR11M1P+0.46315×SMR3B
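As a worked illustration of the formula, the score is a weighted sum of the three genes' expression values. The coefficients below are taken from the formula above; the function name `risk_score` and the example expression values are hypothetical, and the expression scale (for example, how values were normalized) is not specified here.

```python
from typing import Dict

# Coefficients from the multiple Cox regression, as given in the formula.
COEFFICIENTS: Dict[str, float] = {
    "CKMT1B": 0.60875,
    "OR11M1P": 0.80868,
    "SMR3B": 0.46315,
}


def risk_score(expression: Dict[str, float]) -> float:
    """Weighted sum of the three model genes' expression values."""
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())


# Invented expression values for one hypothetical patient.
example = {"CKMT1B": 1.0, "OR11M1P": 2.0, "SMR3B": 0.5}
score = risk_score(example)
# 0.60875*1.0 + 0.80868*2.0 + 0.46315*0.5 = 2.457685
print(round(score, 6))
```

All three coefficients are positive, so higher expression of any of the three genes raises the score.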
According to the median risk score, patients in the training (n = 107) and testing groups (n = 112) were divided into high- and low-risk groups. K–M analysis was performed, and the log-rank test showed significant prognostic differences in each group (P <0.001 for both) [Figure 3A and 3B]. Then, the model combined with clinical factors was tested using univariate and multivariate Cox analyses. In both the training and testing cohorts, the risk model showed high statistical significance and was an independent prognostic factor (P <0.001 for both training and testing data) [Table 1].
Table 1.
 | Univariate analysis | | | Multivariate analysis | | |
Factor | HR | 95% CI | P-value | HR | 95% CI | P-value |
Training group | ||||||
Ki67 (≥15% vs. <15%) | 2.996 | 1.304–6.882 | 0.010 | 1.667 | 0.686–4.054 | 0.260 |
Age (≥60 years vs. <60 years) | 1.104 | 0.445–2.743 | 0.831 | – | – | – |
ER (positive vs. negative) | 0.545 | 0.071–4.192 | 0.560 | – | – | – |
PR (positive vs. negative) | 0.808 | 0.378–1.727 | 0.583 | – | – | – |
Tumor size (≥2 cm vs. <2 cm) | 2.287 | 0.967–5.412 | 0.060 | – | – | – |
Menopause (yes vs. no) | 1.966 | 0.881–4.390 | 0.099 | – | – | – |
Chemotherapy (no vs. yes) | 2.769 | 1.117–6.863 | 0.028 | 1.522 | 0.577–4.019 | 0.396 |
Radiotherapy (no vs. yes) | 1.685 | 0.582–4.875 | 0.336 | – | – | – |
Risk model (high risk vs. low risk) | 2.957 | 2.153–4.061 | <0.001 | 2.690 | 1.946–3.718 | <0.001 |
Testing group | ||||||
Ki67 (≥15% vs. <15%) | 1.101 | 0.580–2.089 | 0.768 | – | – | – |
Age (≥60 years vs. <60 years) | 1.219 | 0.516–2.883 | 0.652 | – | – | – |
ER (positive vs. negative) | 0.692 | 0.245–1.959 | 0.488 | – | – | – |
PR (positive vs. negative) | 0.397 | 0.223–0.705 | 0.002 | 0.590 | 0.317–1.098 | 0.096 |
HER-2 (positive vs. negative) | 0.246 | 0.088–0.690 | 0.008 | 0.355 | 0.117–0.954 | 0.041 |
Tumor size (≥2 cm vs. <2 cm) | 1.037 | 0.539–1.726 | 0.902 | – | – | – |
Menopause (yes vs. no) | 1.174 | 0.649–2.123 | 0.597 | – | – | – |
Chemotherapy (no vs. yes) | 1.088 | 0.539–2.196 | 0.814 | – | – | – |
Radiotherapy (no vs. yes) | 0.496 | 0.228–1.082 | 0.078 | – | – | – |
Risk model (high risk vs. low risk) | 2.301 | 1.840–2.878 | <0.001 | 2.107 | 1.655–2.682 | <0.001 |
CI: Confidence interval; ER: Estrogen receptor; HER-2: Human epidermal growth factor receptor 2; HR: Hazard ratio; PR: Progesterone receptor; –: Not applicable.
Time-dependent ROC curve of the model
To validate the accuracy of the model, 3-year and 5-year ROC curves of the training and testing groups were generated. The 3-year AUCs were 86.06% and 87.38% in the training and testing groups, respectively [Figure 4A]; while the 5-year AUCs were 87.60% and 90.31% in the two groups, respectively [Figure 4B]. The ROC curves confirmed the high predictive accuracy of the risk model in our study.
Further exploration of the biological functions of the three markers
To further explore the correlation between the three genes and relapse, the expression levels of the three genes were compared between the high- and the low-risk groups. Consistently, in both the training and testing sets, CKMT1B, OR11M1P, and SMR3B were highly expressed in the relapse group [Figure 5A]. To determine the correlation between the risk score and the protein-coding genes among the 1588 DEGs, we performed Pearson's correlation analysis and identified positively and negatively correlated clusters [Figure 5B]. In the positive cluster, the cell cycle regulatory pathway, centromere complex assembly pathway, and chromosome segregation pathway were significantly enriched [Figure 5C]. In the negative cluster, pathways related to the immune response, such as the regulation of immune effector processes, negative regulation of immune system processes, and lymphocyte activation, were enriched [Figure 5C]. The results were consistent with the GSEA data above, supporting the biological plausibility of the model.
Discussion
The socioeconomic burden of breast cancer in China has increased rapidly in the past decade, and one of the main reasons is the overuse of chemotherapy,[14] which not only causes physical harm to patients, such as severe liver toxicity, but also induces negative emotions, such as anxiety and depression, severely affecting patients’ well-being.[15,16] Accurately predicting the relapse risk of breast cancer in order to spare patients chemotherapy after surgery has been a major research focus for many years.
To date, many multigene models have been established and validated.[17–20] Using a reverse-transcriptase polymerase chain reaction assay of 21 genes, including 16 cancer-related genes and 5 reference genes, the 21-gene model categorizes patients into low-, intermediate-, and high-risk groups with 10-year distant recurrence rates of 6.8%, 14.3%, and 30.5%, respectively.[4] The 21-gene model changed the chemotherapy decision in over 30% of HR-positive, HER2-negative breast cancer patients in a prospective clinical study.[6] In a recent external validation study of the MammaPrint 70-gene model, which includes genes correlated with cell cycle, invasion, metastasis, and angiogenesis, the distant recurrence-free survival at ten years was 91.3% in the low-risk group and 84.8% in the high-risk group.[21] The PAM 50 model was first used to divide breast cancers into intrinsic subtypes,[9] and later to predict late and distant recurrence in patients with early-stage HR-positive breast cancer.[10] The EP model based on Australian patients offers prognostic information for the distant recurrence of ER-positive, HER2-negative breast cancer in patients receiving endocrine therapy,[11] and provides more accurate information on 10-year distant recurrence rates than the 21-gene model.[12] Different risk models nevertheless behave differently. For example, the 21-gene model tends to reduce the use of chemotherapy, whereas the 70-gene model tends to do the opposite.
In addition, studies have shown poor consistency when different models are applied to the same patients. In a comparative study, the Oncotype DX model classified 82.1% of patients into a low-risk group, whereas the corresponding proportion with MammaPrint was 61.4%.[22] When applied to Chinese patients, those who were categorized into the low-risk group using the 21-gene model still had a 10-year distant recurrence rate of 42.1%, which made the application of the model controversial.[23] Meanwhile, models such as the MammaPrint model were established by microarray on specific genes instead of RNA sequencing, which might neglect important genes. Moreover, gene mutation profiles vary among different populations, which might reduce the predictive value of those models when they are applied to Asian patients.[13] Therefore, it is of great significance to identify predictive signatures and establish a specific model for Chinese HR-positive breast cancer patients.
In our study, we analyzed RNA-sequencing data from patients with and without relapse and identified 1588 DEGs. GSEA based on these DEGs revealed several immune-related pathways that were highly activated in the nonrelapse group, which was consistent with previous conclusions regarding the relationship between immune infiltration and tumor prognosis.
Immune infiltration features are closely related to the prognosis of breast cancer.[24,25] Tumor-infiltrating lymphocyte (TIL) levels vary in different subtypes of breast cancer.[26,27] Studies have shown that a high level of TILs is correlated with a poor prognosis in luminal breast cancer.[28] Luminal patients with a higher level of tumor-infiltrating FOXP3+ immunosuppressive regulatory T (Treg) cells often had shorter overall survival, and the number of FOXP3+ Treg cells could be used to identify the risk of late recurrence in HR-positive patients.[29]
Our study suggested that the nonrelapse group might have stronger antitumor immune cell infiltration. This finding was supported by elevated immune cell fractions using CIBERSORT and MCP-counter algorithms. Significantly increased immune cell infiltration could be observed in the nonrelapse group, especially myeloid dendritic cells, CD8+ T cells, B lineage cells, endothelial cells, and fibroblasts. In the relapse group, several pathways related to the cell cycle and cell metabolism were enriched. The E2F target pathway, cytochrome P450-relevant pathways, drug metabolism pathways, and G2/M checkpoint pathway were all enriched, suggesting that tumors in the relapse group might have excessive proliferation, strong drug resistance, and abnormal cell cycle processes.
On this foundation, we tried to identify hub genes to predict relapse in patients. Through a series of statistical screenings, we found three genes that are strongly associated with tumor relapse: CKMT1B, OR11M1P and SMR3B.
The three hub genes have not been included in any other prediction model for HR-positive breast cancer, but studies have shown that the expression of these genes is related to the prognosis of cancer. CKMT1B is a protein-coding gene that is responsible for the transfer of high-energy phosphate from mitochondria to the cytoplasm. This gene is upregulated in breast cancer tissues of patients with relapse and can be used as an independent prognostic marker,[30] which was consistent with our study results. SMR3B is also a protein-coding gene that is predicted to regulate the activity of endopeptidase inhibitors and is involved in the cellular response to lipopolysaccharide (http://www.genecards.org/). The SMR3B protein is located in extracellular exosomes and has been identified as part of a signature to predict the prognosis of triple-negative breast cancer.[31] OR11M1P is a pseudogene that belongs to the olfactory receptor family. Olfactory receptors, which regulate the sense of smell, have been found to be associated with tumors.[32] For example, olfactory receptor 5B21 (OR5B21) has been identified as an oncogene that drives breast cancer metastasis. However, OR11M1P has not been reported in this context, and its mechanism of promoting breast cancer relapse needs further investigation.
Based on the three signatures above, a multigene model was built. Patients were divided into low- and high-risk groups according to a risk score obtained from the expression of the three genes and their coefficients. The K–M curves showed significant prognostic differences between the low- and high-risk groups in both the training and testing datasets.
To validate the predictive accuracy of the model, time-dependent ROC curves were generated. The 3-year and 5-year AUCs of the training set were 0.8606 and 0.8760, respectively. The 3-year and 5-year AUCs of the testing set were 0.8738 and 0.9031, respectively. The model was then analyzed using univariate and multiple Cox regression tests and was demonstrated to be an independent prognostic factor.
In addition, we analyzed the prognostic model from the perspective of biological function. We found that all three genes were overexpressed in the high-risk group compared with the low-risk group. To explore how the three signatures might influence breast cancer relapse, Pearson's correlation analysis was performed between the risk score and DEGs. Higher risk scores were associated with positive regulation of the cell cycle, whereas lower risk scores were associated with an enhanced immune response. This finding was consistent with previous GSEA.
This study still has some limitations. First, RNA degradation is a common problem in FFPE samples. In the current study, we used genes expressed in >90% of samples to reduce the effect of RNA degradation and applied the internal housekeeping gene calibration method to reduce the expression differences caused by different degradation levels among samples, thus making genes more comparable. Second, the small sample size and the involvement of only two central institutions may have introduced some biases into the conclusions. Thus, multicenter studies involving more patients should be performed in the future. Finally, although three prognostic signatures were identified in the study, the specific mechanism of these genes in tumor recurrence still requires further exploration.
In summary, we identified three prognosis-related signatures of luminal breast cancers and established a predictive model to predict the risk of relapse in Chinese patients. Compared with existing models, our model might better fit Chinese breast cancer patients. The study may help clinical physicians to optimize the choice of chemotherapy after surgery. In the future, a multicenter prospective study will be performed to determine the cutoff value for the model to forecast the accurate risk of relapse in Chinese HR-positive breast cancer patients.
Funding
The study was supported by the National Key Research and Development Program of China (2019YFE0110000), National Natural Science Foundation of China (82072097), CAMS Innovation Fund for Medical Science (CIFMS) (2020-I2M-C&T-B-069, 2021-I2M-1-014), and Beijing Hope Run Special Fund of Cancer Foundation of China (LC2020A18).
Conflicts of interest
None.
Footnotes
How to cite this article: Liu J, Zhao S, Yang C, Ma L, Wu Q, Meng X, Zheng B, Guo C, Feng K, Shang Q, Liu J, Wang J, Zhang J, Shan G, Xu B, Liu Y, Ying J, Wang X, Wang X. Establishment and validation of a multigene model to predict the risk of relapse in hormone receptor-positive early-stage Chinese breast cancer patients. Chin Med J 2023;136:184–193. doi: 10.1097/CM9.0000000000002411
Jiaxiang Liu and Shuangtao Zhao contributed equally to this work.
Supplemental digital content is available for this article.
References
- 1. Siegel RL, Miller KD, Jemal A. Cancer statistics, 2020. CA Cancer J Clin 2020; 70:7–30. doi: 10.3322/caac.21590.
- 2. Luo CY, Li N, Lu B, Cai J, Lu M, Zhang YH, et al. Global and regional trends in incidence and mortality of female breast cancer and associated factors at national level in 2000 to 2019. Chin Med J 2021; 135:42–51. doi: 10.1097/cm9.0000000000001814.
- 3. Miller KD, Nogueira L, Mariotto AB, Rowland JH, Yabroff KR, Alfano CM, et al. Cancer treatment and survivorship statistics, 2019. CA Cancer J Clin 2019; 69:363–385. doi: 10.3322/caac.21565.
- 4. Paik S, Shak S, Tang G, Kim C, Baker J, Cronin M, et al. A multigene assay to predict recurrence of tamoxifen-treated, node-negative breast cancer. N Engl J Med 2004; 351:2817–2826. doi: 10.1056/NEJMoa041588.
- 5. Gradishar WJ, Anderson BO, Abraham J, Aft R, Agnese D, Allison KH, et al. Breast cancer, version 3.2020, NCCN Clinical Practice Guidelines in Oncology. J Natl Compr Canc Netw 2020; 18:452–478. doi: 10.6004/jnccn.2020.0016.
- 6. Dieci MV, Guarneri V, Zustovich F, Mion M, Morandi P, Bria E, et al. Impact of 21-gene breast cancer assay on treatment decision for patients with T1-T3, N0-N1, estrogen receptor-positive/human epidermal growth receptor 2-negative breast cancer: final results of the prospective multicenter ROXANE study. Oncologist 2019; 24:1424–1431. doi: 10.1634/theoncologist.2019-0103.
- 7. van’t Veer LJ, Dai H, van de Vijver MJ, He YD, Hart AA, Mao M, et al. Gene expression profiling predicts clinical outcome of breast cancer. Nature 2002; 415:530–536. doi: 10.1038/415530a.
- 8. Knauer M, Mook S, Rutgers EJT, Bender RA, Hauptmann M, van de Vijver MJ, et al. The predictive value of the 70-gene signature for adjuvant chemotherapy in early breast cancer. Breast Cancer Res Treat 2010; 120:655–661. doi: 10.1007/s10549-010-0814-2.
- 9. Parker JS, Mullins M, Cheang MC, Leung S, Voduc D, Vickery T, et al. Supervised risk predictor of breast cancer based on intrinsic subtypes. J Clin Oncol 2009; 27:1160–1167. doi: 10.1200/jco.2008.18.1370.
- 10. Filipits M, Nielsen TO, Rudas M, Greil R, Stöger H, Jakesz R, et al. The PAM50 risk-of-recurrence score predicts risk for late distant recurrence after endocrine therapy in postmenopausal women with endocrine-responsive early breast cancer. Clin Cancer Res 2014; 20:1298–1305. doi: 10.1158/1078-0432.Ccr-13-1845.
- 11. Filipits M, Rudas M, Jakesz R, Dubsky P, Fitzal F, Singer CF, et al. A new molecular predictor of distant recurrence in ER-positive, HER2-negative breast cancer adds independent information to conventional clinical risk factors. Clin Cancer Res 2011; 17:6012–6020. doi: 10.1158/1078-0432.Ccr-11-0926.
- 12. Buus R, Sestak I, Kronenwett R, Denkert C, Dubsky P, Krappmann K, et al. Comparison of EndoPredict and EPclin with oncotype DX recurrence score for prediction of risk of distant recurrence after endocrine therapy. J Natl Cancer Inst 2016; 108:djw149. doi: 10.1093/jnci/djw149.
- 13. Lang GT, Jiang YZ, Shi JX, Yang F, Li XG, Pei YC, et al. Characterization of the genomic landscape and actionable mutations in Chinese breast cancers by clinical sequencing. Nat Commun 2020; 11:5679. doi: 10.1038/s41467-020-19342-3.
- 14. Xia CF, Dong XS, Li H, Cao MM, Sun DQ, He SY, et al. Cancer statistics in China and United States, 2022: profiles, trends, and determinants. Chin Med J 2022; 135:584–590. doi: 10.1097/cm9.0000000000002108.
- 15. Moezian GSA, Javadinia SA, Sales SS, Fanipakdel A, Elyasi S, Karimi G. Oral silymarin formulation efficacy in management of AC-T protocol induced hepatotoxicity in breast cancer patients: a randomized, triple blind, placebo-controlled clinical trial. J Oncol Pharm Pract 2022; 28:827–835. doi: 10.1177/10781552211006182.
- 16. Salek R, Dehghani M, Mohajeri SA, Talaei A, Fanipakdel A, Javadinia SA. Amelioration of anxiety, depression, and chemotherapy related toxicity after crocin administration during chemotherapy of breast cancer: a double blind, randomized clinical trial. Phytother Res 2021; 35:5143–5153. doi: 10.1002/ptr.7180.
- 17. Filipits M, Dubsky P, Rudas M, Greil R, Balic M, Bago-Horvath Z, et al. Prediction of distant recurrence using EndoPredict among women with ER+, HER2(−) node-positive and node-negative breast cancer treated with endocrine therapy only. Clin Cancer Res 2019; 25:3865–3872. doi: 10.1158/1078-0432.Ccr-19-0376.
- 18.Sparano JA, Gray RJ, Makower DF, Pritchard KI, Albain KS, Hayes DF, et al. Adjuvant chemotherapy guided by a 21-gene expression assay in breast cancer. N Engl J Med 2018; 379:111–121. doi: 10.1056/NEJMoa1804710. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19.Cardoso F, van’t Veer LJ, Bogaerts J, Slaets L, Viale G, Delaloge S, et al. 70-gene signature as an aid to treatment decisions in early-stage breast cancer. N Engl J Med 2016; 375:717–729. doi: 10.1056/NEJMoa1602253. [DOI] [PubMed] [Google Scholar]
- 20.Sparano JA, Gray RJ, Makower DF, Pritchard KI, Albain KS, Hayes DF, et al. Prospective validation of a 21-gene expression assay in breast cancer. N Engl J Med 2015; 373:2005–2014. doi: 10.1056/NEJMoa1510764. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 21.Dubsky P, Van’t Veer L, Gnant M, Rudas M, Bago-Horvath Z, Greil R, et al. A clinical validation study of MammaPrint in hormone receptor-positive breast cancer from the Austrian Breast and Colorectal Cancer Study Group 8 (ABCSG-8) biomarker cohort. ESMO Open 2021; 6:100006.doi: 10.1016/j.esmoop.2020.100006. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 22.Bartlett JM, Bayani J, Marshall A, Dunn JA, Campbell A, Cunningham C, et al. Comparing breast cancer multiparameter tests in the OPTIMA prelim trial: no test is more equal than the others. J Natl Cancer Inst 2016; 108:djw050.doi: 10.1093/jnci/djw050. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 23.Sun B, Zhang F, Wu SK, Guo X, Zhang LL, Jiang ZF, et al. Gene expression profiling for breast cancer prognosis in Chinese populations. Breast J 2011; 17:172–179. doi: 10.1111/j.1524-4741.2010.01049.x. [DOI] [PubMed] [Google Scholar]
- 24.Wang L, Simons DL, Lu X, Tu TY, Avalos C, Chang AY, et al. Breast cancer induces systemic immune changes on cytokine signaling in peripheral blood monocytes and lymphocytes. EBioMedicine 2020; 52:102631.doi: 10.1016/j.ebiom.2020.102631. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 25.Wang L, Simons DL, Lu X, Tu TY, Solomon S, Wang R, et al. Connecting blood and intratumoral Treg cell activity in predicting future relapse in breast cancer. Nat Immunol 2019; 20:1220–1230. doi: 10.1038/s41590-019-0429-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 26.Desmedt C, Salgado R, Fornili M, Pruneri G, Van den Eynden G, Zoppoli G, et al. Immune infiltration in invasive lobular breast cancer. J Natl Cancer Inst 2018; 110:768–776. doi: 10.1093/jnci/djx268. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 27.Sobral-Leite M, Salomon I, Opdam M, Kruger DT, Beelen KJ, van der Noort V, et al. Cancer-immune interactions in ER-positive breast cancers: PI3K pathway alterations and tumor-infiltrating lymphocytes. Breast Cancer Res 2019; 21:90.doi: 10.1186/s13058-019-1176-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 28.Montagna E, Vingiani A, Maisonneuve P, Cancello G, Contaldo F, Pruneri G, et al. Unfavorable prognostic role of tumor-infiltrating lymphocytes in hormone-receptor positive, HER2 negative metastatic breast cancer treated with metronomic chemotherapy. Breast 2017; 34:83–88. doi: 10.1016/j.breast.2017.05.009. [DOI] [PubMed] [Google Scholar]
- 29.Bates GJ, Fox SB, Han C, Leek RD, Garcia JF, Harris AL, et al. Quantification of regulatory T cells enables the identification of high-risk breast cancer patients and those at risk of late relapse. J Clin Oncol 2006; 24:5373–5380. doi: 10.1200/jco.2006.05.9584. [DOI] [PubMed] [Google Scholar]
- 30.Cimino D, Fuso L, Sfiligoi C, Biglia N, Ponzone R, Maggiorotto F, et al. Identification of new genes associated with breast cancer progression by gene expression analysis of predefined sets of neoplastic tissues. Int J Cancer 2008; 123:1327–1338. doi: 10.1002/ijc.23660. [DOI] [PubMed] [Google Scholar]
- 31.Lv X, He M, Zhao Y, Zhang L, Zhu W, Jiang L, et al. Identification of potential key genes and pathways predicting pathogenesis and prognosis for triple-negative breast cancer. Cancer Cell Int 2019; 19:172.doi: 10.1186/s12935-019-0884-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 32.Weber L, Maßberg D, Becker C, Altmüller J, Ubrig B, Bonatz G, et al. Olfactory receptors as biomarkers in human breast carcinoma tissues. Front Oncol 2018; 8:33.doi: 10.3389/fonc.2018.00033. [DOI] [PMC free article] [PubMed] [Google Scholar]
Associated Data