Cancer Biomarkers: Section A of Disease Markers. 2021 Apr 9;30(4):417–428. doi: 10.3233/CBM-201684

An mRNA characterization model predicting survival in patients with invasive breast cancer based on The Cancer Genome Atlas database

Huayao Li a, Chundi Gao a, Jing Zhuang b, Lijuan Liu b, Jing Yang b, Cun Liu a, Chao Zhou a, Fubin Feng b, Ruijuan Liu b, Changgang Sun a,b,*
PMCID: PMC12499990  PMID: 33492284

Abstract

BACKGROUND:

Invasive breast cancer is a highly heterogeneous tumor. Although many methods have been proposed for predicting invasive breast cancer risk, their predictive performance remains unsatisfactory. There is an urgent need for a more accurate method to predict the prognosis of patients with invasive breast cancer.

OBJECTIVE:

To identify candidate mRNAs and construct a risk prediction model for invasive breast cancer using bioinformatics.

METHODS:

In this study, we investigated the differences in mRNA expression profiles between invasive breast cancer and normal breast samples, and constructed a risk model for the prediction of prognosis of invasive breast cancer with univariate and multivariate Cox analyses.

RESULTS:

We constructed a risk model comprising 8 mRNAs (PAX7, ZIC2, APOA5, TP53AIP1, MYBPH, USP41, DACT2, and POU3F2) for the prediction of invasive breast cancer prognosis. We used the 8-mRNA risk prediction model to divide 1076 samples into high-risk and low-risk groups. The Kaplan-Meier curve showed that the high-risk group was closely associated with poor overall survival in patients with invasive breast cancer. The receiver operating characteristic curve revealed an area under the curve of 0.773 for the 8-mRNA model at 3-year overall survival, indicating that this model showed good specificity and sensitivity for predicting the prognosis of invasive breast cancer.

CONCLUSIONS:

The study provides an effective bioinformatic analysis for the better understanding of the molecular pathogenesis and prognosis risk assessment of invasive breast cancer.

Keywords: Invasive breast cancer, univariate and multivariate Cox analyses, bioinformatic analysis, 8-mRNA model

Abbreviations

lncRNAs: Long non-coding RNAs
TCGA: The Cancer Genome Atlas
OS: Overall survival
GO: Gene ontology
KEGG: Kyoto Encyclopedia of Genes and Genomes
ROC: Receiver operating characteristic
PI: Prognostic index
AUC: Area under curve
KIF18A: Kinesin family member 18A
BAP1: BRCA1-associated protein-1
PAX7: Paired box gene 7 protein
ZIC: Zinc finger of the cerebellum
TP53AIP1: p53-regulated apoptosis-inducing protein 1
MYBPH: Myosin-binding protein H
TTF-1: Thyroid transcription factor 1
RLC: Regulatory light chain
ROCK1: Rho-associated protein kinase 1
DACT2: Dishevelled binding antagonist of beta catenin 2
GSK-3: Glycogen synthase kinase 3
POU3F2: Transcription factor POU class 3 homeobox 2
tNOX: Tumor-associated NADH oxidase

1. Background

Breast cancer is the most commonly diagnosed cancer and the leading cause of cancer-related death (11.6% of the total cancer deaths up to 2018) in women [1]. Although many tumor biomarkers and target genes have been associated with breast cancer, the incidence is increasing and prognosis still remains unfavorable, mainly owing to the late diagnosis and limited management options [2]. The available clinical information has limited predictive power because of the complex molecular mechanisms of tumor regulation. Most studies have used a single biomarker for the prediction of prognosis of patients with breast cancer. Therefore, it is necessary to develop a new model based on several biomarkers for the more accurate prediction of the survival of patients with breast cancer.

Next-generation sequencing technology has been widely used in the diagnosis, classification, targeted therapy, and prognosis of cancers [3, 4]. Recent advances in oncology bioinformatics have led to the development of drugs that target signaling pathways in cancer cells and the stroma and vasculature of tumor tissues, greatly extending patient survival [5]. In the post-genome era, the main challenge is to use systems biology methods to mine and verify new tumor-associated biomarkers from multi-study data, thereby leveraging large amounts of genomic data to improve the clinical treatment of tumors. Long et al. constructed a prognostic model of liver cancer using four genes (CENPA, SPP1, MAGEB6, and HOXD9) through the integrated analysis of RNA sequencing data. Further analysis revealed the independent prognostic ability of this model in association with other clinical features [6]. Shi et al. identified the expression of 31 long non-coding RNAs (lncRNAs) in tumor tissues as a risk indicator for lung cancer treatment and validated the specificity and sensitivity of the model. This exploratory analysis provides new insights into the identification of potential prognostic factors [7], demonstrating that next-generation sequencing approaches and data may reveal clinical prognostic biomarkers for cancer. Therefore, we used high-throughput sequencing data to construct biomarker models for predicting the prognosis of patients with breast cancer.

Here, we explored the differences in mRNA expression profiles between invasive breast cancer and normal breast tissue. RNA sequencing is gradually replacing microarrays as the preferred transcriptomic platform [8], and The Cancer Genome Atlas (TCGA) is a cancer database based on RNA sequencing. We analyzed the RNA expression profiles of 1,096 invasive breast cancer tissues and 112 normal breast tissues. Furthermore, the functions of the differentially expressed mRNAs were determined, and a new model was constructed from several candidate mRNAs to predict the survival of patients with invasive breast cancer.

2. Methods

2.1. Data processing

The mRNA expression profile data and corresponding clinical information from patients with invasive breast cancer were obtained from TCGA database. The mRNA expression profile data were derived from 1,096 invasive breast cancer tissue samples and 112 normal tissue samples. After filtering and excluding incomplete clinical data as well as the data showing no correlation between expression profiles and overall survival (OS), a total of 1,076 invasive breast cancer samples were retained for the construction of the prognostic risk model. Data from TCGA database are publicly available. Therefore, no local ethics committee approval is required.

2.2. Screening of differentially expressed mRNAs between invasive breast cancer tissues and non-cancerous tissues

We obtained raw mRNA expression profile data associated with invasive breast cancer from TCGA database. The downloaded mRNA data were normalized and analyzed for differential expression with the edgeR package, and differentially expressed mRNAs were selected by screening on log2 fold change and the corresponding P value.
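The screening step can be illustrated with a minimal Python sketch. It is not the authors' edgeR pipeline: it only shows how cut-offs on log2 fold change and FDR (the values reported in Section 3.1) partition genes into upregulated and downregulated sets. The gene names and numbers are hypothetical, and treating the fold-change cut-off as inclusive (>= 2) is an assumption.

```python
# Hypothetical per-gene results: (gene, log2 fold change, FDR).
# These values are illustrative only and are not taken from the study.
results = [
    ("GENE_A", 3.1, 0.0004),
    ("GENE_B", -2.6, 0.0020),
    ("GENE_C", 1.2, 0.0001),   # fails the fold-change cut-off
    ("GENE_D", -4.0, 0.0500),  # fails the FDR cut-off
]

LOG2FC_CUTOFF = 2.0  # assumed inclusive: |log2 fold change| >= 2
FDR_CUTOFF = 0.01    # FDR < 0.01


def screen(rows, lfc=LOG2FC_CUTOFF, fdr=FDR_CUTOFF):
    """Split genes that pass both cut-offs into up- and downregulated lists."""
    up, down = [], []
    for gene, log2fc, q in rows:
        if abs(log2fc) >= lfc and q < fdr:
            (up if log2fc > 0 else down).append(gene)
    return up, down


up, down = screen(results)
print("upregulated:", up)
print("downregulated:", down)
```

A gene must pass both conditions; a large fold change with a weak FDR (GENE_D) is rejected just like a small fold change with a strong FDR (GENE_C).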

2.3. Gene ontology (GO) and pathway enrichment analysis of differentially expressed mRNAs

To gain insight into the biological functions of these differentially expressed mRNAs, the Database for Annotation, Visualization and Integrated Discovery (DAVID) v6.8 (http://david.abcc.ncifcrf.gov/) [9] was used to perform GO and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis.
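As a rough illustration of what an enrichment P value measures, the sketch below computes a one-sided hypergeometric tail probability for a single term. This is a simplification: DAVID reports a modified Fisher's exact test (the EASE score), which is more conservative than the plain tail used here. The function name and the toy counts are assumptions for illustration.

```python
from math import comb


def hypergeom_enrichment_p(background, annotated, list_size, hits):
    """One-sided P(X >= hits) for X ~ Hypergeometric.

    background: number of genes in the background set
    annotated:  background genes annotated to the term
    list_size:  number of genes in the submitted list
    hits:       submitted genes annotated to the term
    """
    upper = min(annotated, list_size)
    total = comb(background, list_size)
    tail = sum(
        comb(annotated, i) * comb(background - annotated, list_size - i)
        for i in range(hits, upper + 1)
    )
    return tail / total


# Toy example: 10 background genes, 4 annotated to the term,
# 3 genes submitted, 2 of which carry the annotation.
p = hypergeom_enrichment_p(10, 4, 3, 2)
print(round(p, 4))
```

The "Count" column in an enrichment table corresponds to `hits`; a small P value means that many hits would be unlikely if the submitted genes were drawn at random from the background.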

2.4. Construction of risk assessment model

Univariate and multivariate Cox regression analyses were used to study the correlation between patient OS and gene expression level. A value of P < 0.001 was considered significant in the univariate Cox regression analysis. Multivariate Cox regression analysis was performed to assess the contribution of each gene as an independent prognostic factor of patient survival [10]. A stepwise approach was used to further select the best model. A prognostic risk score was established based on the linear combination of the expression levels and their regression coefficients from the multivariate Cox regression model (β).

$$\text{Prognostic index} = \sum_{i=1}^{N} \mathrm{Exp}_i \times \beta_i$$
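For context, the prognostic index is the linear predictor of the Cox proportional hazards model. In standard notation, where the baseline hazard h_0(t) is left unspecified (the symbols h_0 and HR_i are standard conventions, not taken from the paper), the relation can be written as:

```latex
h(t \mid \mathrm{Exp}_1, \dots, \mathrm{Exp}_N)
  = h_0(t)\, \exp\!\Big( \sum_{i=1}^{N} \mathrm{Exp}_i \times \beta_i \Big)
  = h_0(t)\, \exp(\text{Prognostic index}),
\qquad
\mathrm{HR}_i = \exp(\beta_i)
```

A positive coefficient therefore corresponds to a hazard ratio above 1 (higher expression, higher risk), and a negative coefficient to a hazard ratio below 1.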

Using the median prognostic index as the threshold, 1,076 patients with breast cancer were divided into low-risk and high-risk groups [11, 12]. The Kaplan-Meier survival curve method was used to assess OS in the high-risk and low-risk groups. Time-dependent receiver operating characteristic (ROC) curve analysis was used to assess the predictive power of the model. We applied the 8-mRNA model to patients with stage I invasive breast cancer to test the validity of the model for survival prediction. In addition, we compared the predictive performance of the 8-mRNA model with that of traditional clinical risk factors (including age, TNM classification, and stage) by univariate and multivariate Cox analysis. First, univariate Cox analysis was used to identify factors closely related to patient prognosis. Multivariate Cox analysis was then used to assess the effects of these factors on survival time simultaneously and to identify independent prognostic factors for evaluating patient survival. P < 0.05 was used as the significance cut-off when assessing the prognostic ability and sensitivity of the model.
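The Kaplan-Meier method mentioned above can be sketched in a few lines of Python. This is a minimal product-limit estimator under the usual convention that censored patients leave the risk set after deaths at the same time point; the function name and the toy follow-up data are hypothetical, and the paper does not state which software produced its curves.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one death.

    times:  follow-up time for each patient
    events: 1 if death was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Toy data: five patients, three observed deaths, two censored.
curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
for t, s in curve:
    print(t, round(s, 3))
```

Running the estimator separately on the high-risk and low-risk groups gives the two curves that a log-rank test would then compare.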

3. Results

3.1. Downloading of TCGA data and differential expression analysis

In this study, 1208 samples were downloaded from TCGA database and used to identify differentially expressed mRNAs in breast cancer patients. The baseline clinical characteristics of the 1076 breast cancer patients are presented in Table 1 (Supplementary material 1). A total of 2,138 differentially expressed mRNAs (Supplementary material 2), including 1,375 upregulated and 763 downregulated mRNAs, were identified using |log2 fold change| ≥ 2 and FDR < 0.01 as cut-off conditions (Supplementary material 3). The volcano plot of the differentially expressed mRNAs is shown in Fig. 1.

Table 1.

Specific baseline clinical characteristics of 1076 invasive breast cancer patients

Characteristic Patients (n = 1076)
Age
 < 60 years 572
 ≥ 60 years 504
Stage
 I 180
 II 610
 III 244
 IV 19
 Unknown 23
Pathologic T stage
 T1-2 897
 T3-4 176
 Unknown 3
Pathologic N stage
 N0-1 862
 N2-3 194
 Unknown 20
Pathologic M stage
 M0 896
 M1 21
 Unknown 159
Estrogen receptor
 Positive 790
 Negative 237
 Unknown 49
Progesterone receptor
 Positive 683
 Negative 341
 Unknown 52
HER2
 Positive 161
 Negative 554
 Unknown 361
Survival time
 ≤ 1 year 185
 1 to 3 years 482
 3 to 5 years 167
 > 5 years 242

Figure 1.

Volcano plot of differentially expressed mRNAs between invasive breast cancer tissue and normal tissue samples. Red dots represent upregulated mRNAs and green dots represent downregulated mRNAs.

3.2. Functional enrichment and pathway analyses of differentially expressed mRNAs

To understand the functional role of the differentially expressed mRNAs in invasive breast cancer, we performed GO and KEGG pathway enrichment analysis of these mRNAs using the DAVID online tool. The results indicate that the differentially expressed mRNAs were enriched not only in multiple KEGG pathways but also in molecular functions, biological processes, and cellular components. Pathway analysis revealed that these genes were mainly enriched in the phosphatidylinositol-4,5-bisphosphate 3-kinase (PI3K)-protein kinase B (Akt) signaling pathway, cytokine-cytokine receptor interaction, calcium signaling pathway, cell cycle, and focal adhesion (Fig. 2). In addition, GO analysis highlighted the enrichment of these genes in biological processes such as transcription from RNA polymerase II promoter, positive regulation of cell proliferation, cell-cell signaling, and cell adhesion; molecular functions such as sequence-specific DNA binding, calcium ion binding, protein heterodimerization activity, and structural molecule activity; and cellular components such as plasma membrane, integral component of plasma membrane, proteinaceous extracellular matrix, and extracellular matrix (Table 2).

Figure 2.

Pathway enrichment map of differentially expressed mRNAs. KEGG terms were selected according to count > 20 and P value < 0.05. Count: the number of enriched genes in each term.

Table 2.

Gene ontology analysis of the differentially expressed mRNAs in invasive breast cancer. Top ten terms were selected according to count and P-value < 0.05. Count: the number of enriched genes in each term

Category and term Count P value
Biological process
 Positive regulation of transcription from RNA polymerase II promoter 139 3.38E-04
 Negative regulation of transcription from RNA polymerase II promoter 100 4.51E-03
 Transcription from RNA polymerase II promoter 82 1.78E-04
 Positive regulation of cell proliferation 79 3.27E-05
 Cell adhesion 73 4.82E-04
 Cell differentiation 73 5.87E-04
 Proteolysis 71 1.02E-02
 Response to drug 62 7.26E-07
 Cell division 58 7.23E-04
 Cell-cell signaling 55 4.76E-07
Molecular function
 Calcium ion binding 128 7.97E-10
 Protein heterodimerization activity 104 4.66E-14
 Sequence-specific DNA binding 79 4.75E-04
 RNA polymerase II core promoter proximal region sequence-specific DNA binding 57 9.55E-04
 Structural molecule activity 56 3.26E-08
 Transcriptional activator activity, RNA polymerase II core promoter proximal region sequence-specific binding 49 3.75E-06
 Actin binding 49 2.84E-04
 Receptor binding 49 3.11E-02
 Heparin binding 44 3.03E-09
 Serine-type endopeptidase activity 38 2.26E-02
Cellular component
 Plasma membrane 481 1.27E-02
 Extracellular region 355 3.44E-44
 Extracellular exosome 342 4.29E-03
 Extracellular space 288 6.92E-33
 Integral component of plasma membrane 205 3.18E-06
 Proteinaceous extracellular matrix 75 6.45E-15
 Cell surface 75 1.60E-02
 Cell junction 65 1.58E-02
 Nucleosome 57 8.65E-31
 Extracellular matrix 57 1.48E-05

3.3. Construction and analysis of the prognosis risk assessment model of differentially expressed mRNAs in invasive breast cancer

We performed univariate Cox regression analysis to study the association between the differentially expressed mRNAs and OS in patients with invasive breast cancer and identified 11 mRNAs significantly associated with OS at P < 0.001 (Table 3). We then performed a stepwise multivariate Cox regression analysis and selected 8 mRNAs to establish the prediction model: PAX7, ZIC2, APOA5, TP53AIP1, MYBPH, USP41, DACT2, and POU3F2 (Fig. 3). The relationship between these 8 mRNAs and the prognosis of invasive breast cancer can be seen from how their expression trends with prognostic risk. In the 8-mRNA prediction model, the expression of PAX7, APOA5, POU3F2, USP41, and ZIC2 is positively correlated with prognostic risk in patients with invasive breast cancer, whereas the expression of TP53AIP1, MYBPH, and DACT2 is negatively correlated with prognostic risk. The relationship between the 8 mRNAs and the survival of patients with invasive breast cancer is presented in Fig. 4. Their coefficients in the multivariate Cox regression were calculated as follows:

Table 3.

Eleven prognosis-related genes identified by univariate Cox regression analysis (P < 0.001)

Gene HR z P value
USP41 1.215104 4.581436 0.00000462
TP53AIP1 0.802320 -4.31668 0.0000158
PIGR 0.913720 -3.70993 0.000207
POU3F2 1.166322 3.682195 0.000231
PAX7 1.083225 3.514932 0.00044
APOA5 1.217597 3.510181 0.000448
PAK5 0.878070 -3.50953 0.000449
ZIC2 1.092654 3.37917 0.000727
MYBPH 0.846367 -3.35099 0.000805
DACT2 0.866960 -3.3245 0.000886
FREM1 0.875902 -3.29877 0.000971

Figure 3.

Heatmap of the 8 independent prognostic mRNAs related to invasive breast cancer. The color from green to red indicates a trend from low to high expression. Taking PAX7 as an example, as the expression of PAX7 increases, the prognostic risk of patients shifts from low to high, indicating that the prognostic risk of patients may be positively correlated with the expression of PAX7.

Figure 4.

The relationship between 8 mRNAs and the survival of patients with invasive breast cancer.

Prognostic index (PI) = (0.0454 × expression level of PAX7) + (0.0482 × expression level of ZIC2) + (0.2221 × expression level of APOA5) + (-0.1463 × expression level of TP53AIP1) + (-0.1605 × expression level of MYBPH) + (0.1106 × expression level of USP41) + (-0.0964 × expression level of DACT2) + (0.1418 × expression level of POU3F2).
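As a worked illustration of the formula, the following Python sketch computes the PI for one patient and assigns a risk group using the median cut-off of 0.97 reported for this cohort. The coefficients are those given above; the patient's expression values are hypothetical, and the expression scale (normalization) expected by the model is an assumption, since it is not restated here.

```python
# Coefficients from the multivariate Cox model reported in the text.
COEFFICIENTS = {
    "PAX7": 0.0454,
    "ZIC2": 0.0482,
    "APOA5": 0.2221,
    "TP53AIP1": -0.1463,
    "MYBPH": -0.1605,
    "USP41": 0.1106,
    "DACT2": -0.0964,
    "POU3F2": 0.1418,
}
MEDIAN_PI = 0.97  # cohort median PI reported in the text


def prognostic_index(expression):
    """Sum of coefficient x expression level over the 8 model genes."""
    return sum(beta * expression[gene] for gene, beta in COEFFICIENTS.items())


def risk_group(pi, cutoff=MEDIAN_PI):
    """Scores above the cut-off are high risk; the rest are low risk."""
    return "high" if pi > cutoff else "low"


# Hypothetical expression levels for one patient (illustrative only).
patient = {
    "PAX7": 2.0, "ZIC2": 5.0, "APOA5": 1.0, "TP53AIP1": 3.0,
    "MYBPH": 2.0, "USP41": 4.0, "DACT2": 3.0, "POU3F2": 6.0,
}
pi = prognostic_index(patient)
print(round(pi, 4), risk_group(pi))
```

Genes with negative coefficients (TP53AIP1, MYBPH, DACT2) lower the score as their expression rises, which is why they act as protective factors in the model.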

With the median PI (value = 0.97) as the cut-off threshold, 538 of the 1,076 samples with matched clinical follow-up in the invasive breast cancer data were classified into the high-risk group because their risk scores were above the threshold. The remaining 538 samples, with risk scores below the threshold, were classified into the low-risk group. The 5-year survival rate of the low-risk group was 90.2% (95% confidence interval: 86.4%, 94.1%), and the 10-year survival rate of the low-risk group was 70.5% (95% confidence interval: 60.6%, 82.0%). Kaplan-Meier survival curve analysis based on the prognostic risk model constructed using these 8 genes showed that the OS rate was lower in the high-risk group, and the difference between the two groups was statistically significant (Fig. 5). In addition, the prognostic ability of the 8-gene signature was evaluated by calculating the area under the curve (AUC) of the time-dependent ROC curve. For 1-, 2-, 3-, and 5-year survival, the AUC values of the 8-mRNA prognostic model were 0.76, 0.672, 0.731, and 0.736, respectively, indicating that the predictive model had good sensitivity and specificity (Fig. 6).
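To make the AUC figures more concrete, the sketch below computes a simplified AUC at a fixed time horizon. It is an assumption-laden simplification, not the estimator used in the study: patients who died by the horizon are cases, patients followed beyond the horizon are controls, and patients censored before the horizon are simply dropped, whereas proper time-dependent ROC methods reweight for censoring. All names and data are hypothetical.

```python
def auc_at_horizon(scores, times, events, horizon):
    """Simplified AUC: probability that a case outranks a control.

    Cases:    death observed (event == 1) at or before the horizon.
    Controls: follow-up longer than the horizon.
    Patients censored before the horizon are excluded (a simplification).
    """
    rows = list(zip(scores, times, events))
    cases = [s for s, t, e in rows if e == 1 and t <= horizon]
    controls = [s for s, t, e in rows if t > horizon]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    concordant = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                concordant += 1.0
            elif case_score == control_score:
                concordant += 0.5  # ties count as half
    return concordant / (len(cases) * len(controls))


# Toy data: risk scores, follow-up in years, death indicator.
scores = [2.0, 0.8, 0.5, 1.0, 0.2]
times = [1.0, 2.0, 5.0, 6.0, 1.5]
events = [1, 1, 0, 1, 0]
auc = auc_at_horizon(scores, times, events, horizon=3.0)
print(auc)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives a scale for reading the reported values of 0.672 to 0.76.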

Figure 5.

Assessment of prognostic risk in invasive breast cancer patients using an 8-mRNA model based on risk cut-off points.

Figure 6.

Time-dependent ROC curve analysis of the 8-mRNA model for survival prediction of invasive breast cancer patients. (A) At 1 year of OS (AUC = 0.76), (B) at 2 years of OS (AUC = 0.672), (C) at 3 years of OS (AUC = 0.731), (D) at 5 years of OS (AUC = 0.736).

3.4. Comprehensive assessment of model predictive performance and routine clinical risk factors

We compared the predictive performance of the 8-mRNA model with that of conventional clinical risk factors, including age, stage, and TNM classification. Univariate analysis showed that age, stage, TNM classification, and the 8-mRNA model risk score were closely related to prognosis (Fig. 7A). Multivariate analysis further showed that age and the 8-mRNA model could serve as independent prognostic factors for assessing patient outcomes (Fig. 7B).

Figure 7.

Univariate (A) and multivariate (B) analysis of clinicopathologic factors for overall survival of invasive breast cancer patients from TCGA.

3.5. Validation of the 8-mRNA model to predict survival time

To confirm the validity and sensitivity of the 8-mRNA model for predicting survival, we applied this model to patients with stage I invasive breast cancer for survival risk assessment. The baseline clinical characteristics of the 180 patients with stage I invasive breast cancer are presented in Table 4. We used the median risk score (value = 0.97) to classify patients into high-risk and low-risk groups. The Kaplan-Meier curves showed that the high-risk group was closely associated with poor OS, and the ROC curve indicated that the AUC value of the 8-mRNA model was 0.773 at 3 years of OS (Fig. 8). These results indicate that the 8-mRNA model we constructed had high specificity and sensitivity in predicting OS time in patients with invasive breast cancer.

Table 4.

Specific baseline clinical characteristics of 180 patients with stage I invasive breast cancer

Characteristic Validation samples (n = 180)
Age
 < 60 years 96
 ≥ 60 years 84
Pathologic T stage
 T1 180
Pathologic N stage
 N0-1 177
 Unknown 3
Pathologic M stage
 M0 162
 Unknown 18
Survival status
 Alive 164
 Dead 16
Survival time
 ≤ 1 year 19
 1 to 3 years 71
 3 to 5 years 35
 > 5 years 55

Figure 8.

Evaluation of stage I invasive breast cancer patients using the 8-mRNA model. (A) Prognostic risk assessment. (B) Time-dependent ROC curve analysis (the AUC was 0.773 at 3 years of OS).

4. Discussion

Breast cancer remains one of the deadliest malignancies in the world, owing to cellular heterogeneity and complex molecular regulatory mechanisms. Therefore, bioinformatic study of breast cancer may provide clinicians with new tools to predict disease prognosis and identify potentially valuable mRNAs to improve clinical outcomes in patients with breast cancer. In this study, based on a large sample of patients with invasive breast cancer from TCGA database, we identified 8 mRNAs, namely PAX7, ZIC2, APOA5, TP53AIP1, MYBPH, USP41, DACT2, and POU3F2. The expression patterns of these 8 mRNAs were significantly associated with OS in patients with invasive breast cancer. Patient survival was predicted using the combined 8-mRNA model, which performs better than single mRNAs and other predictive models. The AUCs of the ROC curves for predicting 1-, 2-, 3-, and 5-year survival of patients with invasive breast cancer were 0.76, 0.672, 0.731, and 0.736, respectively, indicating that the 8-mRNA prediction model performs well in survival prediction. The mRNA-based prognostic model for breast cancer may be applied in the clinic, where clinicians may classify patients into high-risk and low-risk groups based on the predicted outcomes. For patients in the high-risk group, frequent monitoring of tumor indicators should be used, including regular detection of tumor markers and regular chest and abdominal computed tomography examinations for the early prevention and diagnosis of breast cancer recurrence, so that the risk prediction model can fulfill its role. In addition, the average expression value of each single mRNA in the model is correlated with the prognosis of invasive breast cancer, and each may act as a tumor biomarker for invasive breast cancer.

The rapid development of high-throughput sequencing technology and bioinformatic tools has improved our understanding of the molecular regulation mechanisms and characteristics of breast cancer [13]. Alfarsi et al. used the METABRIC dataset to assess kinesin family member 18A (KIF18A) expression at the genomic level and found that high KIF18A expression has prognostic implications for predicting poor endocrine therapy outcomes in patients with estrogen receptor-positive breast cancer [14], showing the potential of mRNAs to inform the clinical treatment of breast cancer. mRNAs may act as tumor suppressors or oncogenes involved in cancer progression and metastasis and may be used as potential biomarkers for cancer. These molecules offer significant advantages as biomarkers for diagnosis and prognosis [15, 16]. We have developed a prognostic model for breast cancer that includes 8 mRNAs closely related to the OS of patients with breast cancer; some of these mRNAs were previously shown to be potential biomarkers.

Paired box gene 7 protein (PAX7) is a DNA-binding transcription factor that plays a role in muscle formation by regulating the proliferation of muscle precursor cells. In addition to its established use as a marker of skeletal muscle differentiation in rhabdomyosarcoma [17], PAX7 has recently been used as a highly sensitive marker for Ewing sarcoma [18, 19]. Breast cancer progression is related to systemic effects, including restricted function and sarcopenia. Wang et al. found that breast cancer progression is associated with the expression of the skeletal muscle stem/satellite-specific transcription factor PAX7. The cytokine-inducible transcription factor NF-κB shows carcinogenic function and is a component of the Pax7:MyoD:Pgc-1β:miR-486 muscle axis [20]. Therefore, high expression of PAX7 may affect the prognosis of breast cancer patients.

ZIC2, a member of the human zinc finger of the cerebellum (ZIC) gene family, acts as a transcriptional activator or repressor and promotes cell proliferation and migration. ZIC2 dysregulation contributes to the unlimited growth of cancer cells, and studies have shown that it can be involved in the pathogenesis of a variety of malignant tumors. The level of ZIC2 complex may be used for the diagnosis and prognosis of patients with hepatocellular carcinoma [21]. ZIC2 plays an indispensable role in the regulation of cell proliferation and apoptosis during the development of pancreatic ductal adenocarcinoma [22]. In addition, ZIC2 acts as a regulatory target for microRNAs. Wang et al. found that miR-129-5p may inhibit cervical cancer by targeting ZIC2 to prevent angiogenesis and suppress cell migration and invasion [23]. Zhang et al. used the luciferase reporter gene assay, reverse-transcription qPCR, and western blotting to show that miR-1284 overexpression inhibits ZIC2 protein expression in breast cancer cells. In addition, ZIC2 knockdown inhibits the proliferation, migration, and invasion of breast cancer cells [24]. Therefore, ZIC2 may be an effective therapeutic target for breast cancer. ZIC2 may also be used as a prognostic biomarker of invasive breast cancer, as high expression of ZIC2 is related to poor prognosis of breast cancer.

The p53-regulated apoptosis-inducing protein 1 (TP53AIP1) gene is a TP53 target and may play an important role in mediating p53/TP53-dependent apoptosis [25]. In cutaneous malignant melanoma, the TP53AIP1 gene plays a key role by inducing apoptosis in response to UV-mediated DNA damage. Truncating TP53AIP1 mutations tend to cause cutaneous malignant melanoma [26]. Existing studies have shown that a variety of drugs inhibit breast cancer cell invasion through the p53 signaling pathway and may enhance the sensitivity of breast cancer cells to drugs [27, 28]. TP53AIP1 may be involved in the p53 signaling pathway and cause cell cycle arrest and apoptosis in breast cancer cells, and low expression of TP53AIP1 may lead to poor prognosis of invasive breast cancer.

Myosin-binding protein H (MYBPH) binds to myosin and may be involved in the interaction with thick filaments in the A band. Cell migration driven by actomyosin assembly is a critical step in tumor invasion and metastasis. Hosono et al. found that MYBPH is directly transactivated in lung adenocarcinoma by the thyroid transcription factor 1 (TTF-1) and through the direct inhibition of the non-muscle myosin IIA via phosphorylation of the myosin regulatory light chain (RLC) [29]. MYBPH may inhibit Rho-associated protein kinase 1 (ROCK1) expression and negatively regulate actomyosin organization; this effect may reduce single cell motility and increase collective cell migration, resulting in reduced cancer invasion and metastasis [30]. Therefore, the expression of MYBPH may be related to the invasion, migration, and metastasis of breast cancer, and low expression of MYBPH is associated with poor prognosis of invasive breast cancer.

Dishevelled binding antagonist of beta catenin 2 (DACT2), involved in the regulation of intracellular signaling pathways during development, may act as a tumor suppressor in various tumors and is often downregulated by hypermethylation. DACT2 is involved in the molecular regulation of a variety of tumors, such as colorectal cancer [31], head and neck squamous cell carcinoma [32], non-small cell lung cancer (2014), and liver cancer [33]. Li et al. found that DACT2 inhibits breast cancer cell growth by blocking the G1/S phase transition [34]. In addition, Guo et al. found that hypermethylation of the DACT2 gene promoter contributes to gene loss in breast cancer, consistent with its tumor suppressor role [35]. Xiang et al. found that the ectopic expression of DACT2 induces apoptosis of breast cells in vitro and further inhibits breast cancer cell proliferation, migration, and epithelial to mesenchymal transition by antagonizing Wnt/β-catenin and Akt/glycogen synthase kinase 3 (GSK-3) signaling [36]. Therefore, the expression of DACT2 is associated with the proliferation and migration of breast cancer, and low expression of DACT2 is involved in poor prognosis of invasive breast cancer.

The transcription factor POU class 3 homeobox 2 (POU3F2) plays a key role in neuronal differentiation (by similarity), and its expression has been detected in both glioblastoma and melanoma. POU3F2 is involved in tumorigenesis, migration, and other carcinogenic features [37, 38]. Chen et al. found that POU3F2 expression is positively correlated with tumor-associated NADH oxidase (tNOX) protein expression, and the overexpression of POU3F2 (with the corresponding upregulation of tNOX expression) enhances the proliferation, migration, and invasion of human gastric cancer cells, indicative of the involvement of POU3F2 in tumorigenesis through the transcriptional regulation of tNOX expression [39]. High expression of POU3F2 may induce the proliferation, migration, and invasion of breast cancer cells.

Tumor staging is an important parameter, but patients at the same tumor stage may have different clinical outcomes, indicating that the prognosis of patients with breast cancer may not be effectively predicted by stage alone. To validate the predictive performance of the survival model, we tested the model in patients with stage I invasive breast cancer. The predictive model successfully classified patients with stage I invasive breast cancer into high-risk and low-risk groups, and the results were statistically significant. The results suggest that patients with invasive breast cancer may benefit from the predictive model. Therefore, further experimental research on the clinical use of the predictive model could provide new insights into OS prediction for patients with invasive breast cancer.

Within the prediction model we constructed, the role of four of the gene biomarkers in breast cancer has not previously been studied; these may provide clinical indications and insights for the identification of prognostic factors for breast cancer. In this study, we obtained an invasive breast cancer risk prediction model based on real clinical samples and statistical analysis of high-throughput data, and identified invasive breast cancer biomarkers not reported in existing studies. This research provides ideas for subsequent laboratory and clinical research and can assist the clinical treatment of invasive breast cancer.

The existing clinical prognosis evaluation system predicts the prognosis of patients based on their clinical and pathological characteristics. The 8-mRNA prediction model can predict the prognosis of patients with invasive breast cancer in advance based on the model's risk score. The patient's genomic risk score has important predictive power and can be used in combination with clinical information, so that the prognosis of patients at all pathological stages can be better evaluated and more suitable and accurate treatment can be provided. Our prediction method reduces sequencing costs, making targeted sequencing of specific genes more cost-effective and routine. However, the current research still has some limitations. Although the quality of the samples in TCGA database is very high, the sample size is small. Therefore, our predictive model still needs to be validated using large-scale clinical data, and multiple regression modeling methods should be used to further improve the prediction accuracy of this model.

5. Conclusion

In summary, we used TCGA database of invasive breast cancer samples to develop an OS prediction model for patients with invasive breast cancer, thereby providing new ideas for the prognostic prediction of patients with invasive breast cancer. This may allow clinicians to further improve the prognosis of patients with invasive breast cancer through personalized treatment.

Funding

This work was supported by grants from the National Natural Science Foundation of China (81673799) and the National Natural Science Foundation of China Youth Fund (81703915). Changgang Sun conceived and designed the study, and Lijuan Liu performed the data analysis.

Conflict of interest

The authors declare that they have no competing interests.

Articles from Cancer Biomarkers: Section A of Disease Markers are provided here courtesy of SAGE Publications
