Abstract
Immune-related genes have great potential as prognostic markers in many types of cancer. We therefore sought to develop immune-related gene markers to improve prognosis prediction in breast cancer (BC). Gene expression data and clinical follow-up information for 1559 breast cancer samples were downloaded from TCGA and GEO and divided into a training set, a test set, and a validation set. In the training set, gene pairs were constructed from the relative expression levels of 320 immune genes, prognosis-related gene pairs were screened, and Lasso was used for feature selection to identify robust biomarkers. A prognostic model based on immune-related gene pairs (IRGPs) was then established and verified. Sixty-six IRGPs were obtained, and a 17-IRGPs signature was established. The 17-IRGPs signature was an independent prognostic indicator for BC patients and stratified risk in the training and testing sets, with a five-year survival AUC greater than 0.7; it showed better classification performance in patients with advanced BC. In addition, we compared the 17-IRGPs signature with four reported breast cancer prognostic signatures and with clinical stage; the 17-IRGPs signature achieved the highest average C-index (0.7, P < 0.05). Functional analysis suggested that a dysregulated immune environment may underlie the observed difference in survival between the patient groups defined by our signature. The 17-IRGPs signature was thus constructed as a new prognostic indicator for predicting the survival of BC patients.
Impact statement
Breast cancer is among the most prevalent malignant tumors worldwide, with a low survival rate. Immune-related genes have great potential as prognostic indicators in many types of tumors. We therefore sought to develop immune-related gene markers to improve prognosis prediction in breast cancer. A 17-IRGPs signature was constructed as a new prognostic indicator to predict the survival of BC patients.
Keywords: Bioinformatics, immune genes, prognostic markers, TCGA, IRGPs
Introduction
Breast cancer (BC) is the leading cause of cancer-related morbidity and mortality among women. Even though patients with early-stage BC can be treated with surgery, the risk of relapse is quite high. BC genotype and tumor grade are the two most important characteristics and the strongest prognostic indicators in BC.1–5 The TNM staging system of the American Joint Committee on Cancer is presently the only prognostic grading system available in clinical practice for selecting patients for adjuvant chemotherapy.6–11 Nevertheless, in many patients the TNM staging system cannot accurately predict relapse after radical surgery. Gene expression profiling based on microarrays has been successfully applied in clinical cancer research to classify cancers, predict prognosis, or assess treatment response.12–14 However, only a few of these studies have shown clear prognostic significance. To date, only BRCA1/2 mutation has been considered a predictor for BC in clinical practice.15–19 This implies that identifying robust genetic signatures remains a challenge and that more cohorts are required to confirm signatures.
The availability of shared large-scale gene expression datasets offers a chance to identify more promising BC molecular biomarkers. However, the complexity of the data is a formidable obstacle to using all of this information meaningfully. Classic methods that use gene expression levels require proper standardization, which is a formidable task because of the potential biological heterogeneity between datasets and the technical bias of measurement platforms. In contrast, methods based on the relative ranking of gene expression levels remove the need for data pre-processing, such as scaling and normalization, and have been shown to yield robust results in a variety of applications, including cancer classification.
An active immune response is crucial for controlling tumor metastasis and progression. Substantial evidence links tumor-infiltrating lymphocytes (TILs) to good outcomes in diverse tumors,20–23 and a wide variety of elements of the immune system are deciding factors during cancer occurrence and progression. Escape from immune destruction has been considered a hallmark of carcinoma.23 Immunotherapy, such as programmed death-1 (PD-1)/programmed death ligand 1 (PD-L1) inhibitors or tumor vaccines, is being developed as a beneficial new treatment for many cancers. Immunotherapy has been reported to produce significant and long-lasting responses in BC. For example, TILs are a predictor in triple-negative breast cancer and predict the benefit of trastuzumab in early BC,24 tumor-associated lymphocytes are independent predictors of the response to neoadjuvant chemotherapy in BC,25 and CD8+ T cell infiltration is associated with BC survival.26 However, the molecular characterization of tumor–immune interactions still needs to be fully investigated in terms of its prognostic potential in BC.
In this work, to reliably identify prognosis-associated immune gene markers for BC, we introduced a systematic pipeline for detecting BC-associated immune gene markers. Gene expression profiling data of BC patients were obtained from large datasets in the TCGA and GEO databases to explore and validate personalized prognostic features of BC based on immune-related gene pairs (IRGPs). We found that the 17-IRGPs signature is involved in vital biological processes and pathways in BC. The ssGSEA analysis gave analogous results, suggesting that the 17-IRGPs signature can contribute strongly to predicting the prognostic risk of patients with BC and offers a basis for better understanding the underlying molecular mechanisms of BC prognosis.
Materials and methodologies
Data collection and analysis
RNA-seq FPKM data were collected from TCGA using the GDC API and comprised 1222 samples, including 1109 tumor tissue samples and 113 normal samples. Two chip datasets were downloaded from GEO: GSE20685 (Affymetrix Human Genome U133 Plus 2.0 Array platform; 327 samples)27 and GSE7390 (Affymetrix Human Genome U133A Array platform; 198 samples).28,29 All patients underwent surgery with negative margins and did not receive adjuvant or neoadjuvant treatment; gene expression data and survival data were openly accessible and were downloaded on 5 April 2019. Furthermore, we downloaded all the genes (320 in total) related to four immune pathways, namely M13664 (immune system process), M19817 (immune response), M14818 (immune effector process), and M3457 (immune system development), from the Molecular Signatures Database v4.0 as the immune-related gene set.30
For the TCGA RNA-seq data, we selected 1038 tumor samples with follow-up information and OS greater than 0, extracted the expression profile of the immune-related gene set, and removed genes with an expression level of 0 in 50% of the samples. For the chip datasets, we selected samples with follow-up information and OS greater than 0 and mapped probes to genes; probes matching multiple genes were removed, and when several probes mapped to a single gene, their median value was taken as the gene expression level. The expression profiles of the immune gene set were then extracted from the resulting gene expression profiles. The final statistics of each dataset are shown in Table 1, and the study design and workflow are shown in Figure 1.
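The probe-to-gene step for the chip datasets can be illustrated with a minimal Python sketch. It assumes that probes annotated to more than one gene (or to none) are discarded and that the remaining probes of a gene are summarized by their median; the function name, the probe identifiers, and the annotation dictionary are hypothetical and serve only as an illustration, not as the pipeline actually used in this study.

```python
from collections import defaultdict
from statistics import median


def collapse_probes_to_genes(probe_expr, probe_to_genes):
    """Collapse probe-level values of one sample to gene-level values.

    probe_expr:     dict probe_id -> expression value
    probe_to_genes: dict probe_id -> list of gene symbols (hypothetical annotation)
    """
    per_gene = defaultdict(list)
    for probe, value in probe_expr.items():
        genes = probe_to_genes.get(probe, [])
        if len(genes) != 1:
            # Assumption: unannotated probes and probes matching several genes are dropped.
            continue
        per_gene[genes[0]].append(value)
    # Each gene is represented by the median of its remaining probes.
    return {gene: median(values) for gene, values in per_gene.items()}


# Toy sample with made-up probe identifiers and values.
sample = {"p1": 5.0, "p2": 7.0, "p3": 9.0, "p4": 3.0}
annotation = {"p1": ["LCK"], "p2": ["LCK"], "p3": ["LCK", "CTSE"], "p4": ["GBP2"]}
gene_expr = collapse_probes_to_genes(sample, annotation)
print(gene_expr)
```

In this toy input, probe p3 is discarded because it matches two genes, and LCK is summarized from p1 and p2 only.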
Table 1.
Characteristic | TCGA (n=1038) | GSE20685 (n=325) | GSE7390 (n=196) |
---|---|---|---|
Survival status | |||
Alive | 891 | 242 | 141 |
Dead | 147 | 83 | 55 |
pathologic_T | |||
T1 | 277 | 101 | |
T2 | 593 | 186 | |
T3 | 129 | 26 | |
T4 | 36 | 12 | |
TX | 3 | ||
pathologic_N | |||
N0 | 485 | 137 | |
N1 | 353 | 86 | |
N2 | 110 | 62 | |
N3 | 73 | 40 | |
NX | 17 | ||
pathologic_M | |||
M0 | 855 | 317 | |
M1 | 21 | 8 | |
MX | 162 | ||
Tumor stage | |||
I | 180 | ||
II | 587 | ||
III | 230 | ||
IV | 19 | ||
X | 22 | ||
Age | |||
≤50 | 289 | 201 | 130 |
>50 | 749 | 124 | 66 |
Gender | |||
Female | 1026 | ||
Male | 12 |
Construction of immune gene pairs
First, we constructed a pair from every two of the 320 immune genes by traversing all the genes, obtaining a total of 51,040 immune-related gene pairs (IRGPs). For two genes i and j in any sample, the IRGP value is defined by comparing their expression levels within that sample, where IRG denotes the expression level of a gene. We calculated the IRGP values for all samples and then filtered out the IRGPs with a standard deviation of 0.
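The pair construction described above can be sketched in a few lines of Python. The direction of the comparison is an assumption made for illustration (the value is set to 1 when gene i is expressed below gene j, and to 0 otherwise), and the function names and toy expression values are hypothetical. Only the within-sample comparison and the removal of pairs with zero standard deviation are taken from the text.

```python
from itertools import combinations


def build_irgps(samples, genes):
    """Return {pair_name: [0/1 value per sample]} for every unordered gene pair."""
    irgps = {}
    for gene_i, gene_j in combinations(genes, 2):
        # ASSUMPTION: 1 if gene_i is expressed below gene_j in the sample, else 0.
        values = [1 if s[gene_i] < s[gene_j] else 0 for s in samples]
        irgps[f"{gene_i}_vs_{gene_j}"] = values
    return irgps


def drop_constant(irgps):
    """Remove pairs that take the same value in every sample (standard deviation 0)."""
    return {name: v for name, v in irgps.items() if len(set(v)) > 1}


genes = ["LCK", "CTSE", "GBP2"]
samples = [
    {"LCK": 2.0, "CTSE": 5.0, "GBP2": 9.0},  # toy values, not real data
    {"LCK": 6.0, "CTSE": 1.0, "GBP2": 8.0},
]
irgps = build_irgps(samples, genes)
informative = drop_constant(irgps)

# Number of unordered pairs among 320 genes, matching the 51,040 pairs in the text.
n_pairs_320 = 320 * 319 // 2
print(informative, n_pairs_320)
```

Because each value depends only on the ordering of two genes within one sample, any monotone rescaling of a sample's expression values leaves the IRGPs unchanged, which is why no cross-platform normalization is needed.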
Sample grouping
Since IRGPs are discrete values based on the relative ranks of genes and are independent of the data platform, we combined the TCGA and GSE20685 data, a total of 1363 samples, and divided them into two groups. The groups were formed so that age distribution, clinical stage, follow-up time, and the percentage of patients who died were similar in the two groups, and so that the numbers of samples in the two classes obtained by clustering the gene expression profiles were close between the groups. One group was used as the training set (n = 681) and the other as the testing set (n = 682), and GSE7390 was used as an external validation set. The sample characteristics of each group are shown in Table 2.
Table 2.
Clinical features | Overall | Training set | Testing set | Independent set |
---|---|---|---|---|
Stage_T | ||||
T1 | 378 | 197 | 181 | |
T2 | 779 | 383 | 396 | |
T3 | 155 | 71 | 84 | |
T4 | 48 | 27 | 21 | |
TX | 3 | 3 | 0 | |
Stage_N | ||||
N0 | 622 | 300 | 322 | |
N1 | 439 | 240 | 199 | |
N2 | 172 | 72 | 100 | |
N3 | 113 | 58 | 55 | |
NX | 17 | 11 | 6 | |
Stage_M | ||||
M0 | 1172 | 601 | 571 | |
M1 | 29 | 11 | 18 | |
MX | 162 | 69 | 93 | |
Age | ||||
0–40 | 141 | 73 | 68 | |
40–50 | 349 | 167 | 182 | |
50–60 | 345 | 171 | 174 | |
60–70 | 305 | 149 | 156 | |
70–100 | 223 | 121 | 102 | |
Status | ||||
0 | 1133 | 574 | 559 | 141 |
1 | 230 | 107 | 123 | 55 |
Construction of prognostic immune gene signature
LASSO is a popular regression modeling approach that can handle a large number of candidate prognostic features, because it performs automatic feature selection and usually yields signatures with good prognostic performance.31 The LASSO method has been extended to the Cox model for survival analysis and has been successfully applied to build sparse signatures for survival prognosis in many areas, such as oncology.32–34 We carried out a univariate Cox proportional hazards regression analysis for every IRGP using the samples of the training set, with a log-rank P < 0.05 as the cutoff for identifying prognostic IRGPs. The R software package glmnet35 was then used to select robust prognostic features, and 10-fold cross-validation was employed to determine the optimal features. Multivariate Cox regression analysis was further carried out with a stepwise regression method, and the following risk scoring model was established:
$$\mathrm{RiskScore} = \sum_{k=1}^{n} \beta_k \times \mathrm{IRGP}_k$$
where $n$ is the number of prognostic IRGPs, $\mathrm{IRGP}_k$ is the value of the $k$-th prognostic IRGP, and $\beta_k$ is the estimated regression coefficient of that IRGP.
Functional enrichment analyses
Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis and Gene Ontology (GO) enrichment analysis were carried out with the R package clusterProfiler36 to identify GO terms (biological process, molecular function, and cellular component) and KEGG pathways. An FDR < 0.05 was considered significant.
Single sample gene set enrichment analysis (ssGSEA) was conducted by the R package GSVA37 via the MSigDB38 C2 Canonical pathways gene set collection, which contains 1320 gene sets.
Statistical analysis
Kaplan–Meier (KM) curves were drawn using the mean risk score in each dataset as the threshold for comparing survival between the high- and low-risk groups. Multivariate Cox regression analysis was carried out to examine whether the IRGPs signature was an independent prognostic factor. Significance was defined as P < 0.05, and all tests were two-sided. ROC analysis was performed with the R package pROC,39 heat maps were drawn with the R package pheatmap,40 and the C-index was calculated with the R package rms.41 Unless otherwise specified, default parameters were used, and all analyses were run in R 3.4.3.
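The grouping by mean risk score and the Kaplan–Meier estimate can be illustrated with the following minimal Python sketch. It is a plain re-implementation for illustration, not the R code used in this study; the function names and toy numbers are hypothetical, and assigning a score strictly above the mean to the high-risk group is an assumption.

```python
def split_by_mean(risk_scores):
    """Label each sample 'high' or 'low' using the mean risk score as threshold."""
    threshold = sum(risk_scores) / len(risk_scores)
    # ASSUMPTION: scores strictly above the mean form the high-risk group.
    return ["high" if r > threshold else "low" for r in risk_scores]


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    events: 1 = death observed, 0 = censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather all subjects sharing this time point.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


groups = split_by_mean([1.0, -1.0, 3.0, -3.0])       # toy risk scores
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])     # toy follow-up data
print(groups, curve)
```

In practice the estimator would be applied separately to the high- and low-risk groups and the two curves compared with a log-rank test.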
Results
The expression profiles of immune-related genes in BC samples were highly correlated
For the GEO and TCGA datasets, we analyzed the distribution of between-sample correlations of immune gene expression (Figure 2(a)). In the GEO dataset, the correlation was higher (mean correlation > 0.85) and the standard deviation of immune gene expression among samples was lower, whereas in the TCGA dataset the correlation was lower (mean correlation > 0.6) and the standard deviation was higher, which may be related to the multi-center source of the TCGA samples. Overall, the immune gene expression profiles were highly consistent among these samples, although there were differences between platforms. We then computed the IRGPs of each sample and examined the IRGP correlation between samples (Figure 2(b)); both the GEO and TCGA datasets showed high correlation (average correlation > 0.55), and notably the correlation distributions of the two datasets tended to be consistent. These results suggest that IRGPs can effectively reduce the differences caused by different data platforms.
Identification of a 17-IRGPs signature for BC survival
The relationship between IRGPs and prognosis was analyzed by univariate survival analysis; the hazard ratio (HR) and significance of each IRGP are shown in a volcano plot (Figure 2(c)), and 570 IRGPs were significantly associated with prognosis. Lasso was then used for dimensionality reduction with 10-fold cross-validation; the error rate was lowest at λ = 0.0695 (Figure 3(a) and (b)), and a total of 66 IRGPs were obtained. Furthermore, stepwise multivariate regression was used to select the smallest set of IRGPs with a sufficient degree of fit, and 17 IRGPs were finally identified. The distribution of these 17 IRGPs in each sample (Figure 3(c)) showed that 8 of the 17 IRGPs are protective factors and 9 are risk factors. The HRs of these 17 IRGPs are shown in Table 3, and the risk formula is as follows
Table 3.
IRGPs | Coef | P value | HR | Low.95.CI. | High.95.CI. |
---|---|---|---|---|---|
LCK_vs_CTSE | −1.34145 | 3.02E-05 | 0.261466 | 0.13923 | 0.491015 |
GBP2_vs_MBP | −2.34651 | 0.000115 | 0.095702 | 0.029046 | 0.315325 |
COLEC12_vs_TAZ | −0.66023 | 0.001135 | 0.516734 | 0.347216 | 0.769012 |
THY1_vs_CD83 | 1.620052 | 0.001321 | 5.053355 | 1.880053 | 13.58281 |
INHBA_vs_HRH2 | −2.58346 | 0.00272 | 0.075512 | 0.013946 | 0.40888 |
CCR8_vs_AZU1 | 0.577867 | 0.008057 | 1.782233 | 1.16232 | 2.732772 |
SYK_vs_CST7 | 0.627731 | 0.009513 | 1.873354 | 1.165629 | 3.010785 |
ERAP2_vs_ZBTB16 | 1.033219 | 0.013751 | 2.810096 | 1.235229 | 6.392854 |
ELF4_vs_AIM2 | −1.30961 | 0.014578 | 0.269926 | 0.094382 | 0.771971 |
GBP2_vs_CHUK | −0.6816 | 0.01545 | 0.505806 | 0.291342 | 0.878143 |
TPD52_vs_CXCL13 | 0.776274 | 0.037776 | 2.173359 | 1.044815 | 4.520883 |
SIRPG_vs_CALCA | 0.776367 | 0.073936 | 2.173562 | 0.927614 | 5.093037 |
MNX1_vs_CARTPT | 0.41659 | 0.111214 | 1.51678 | 0.908425 | 2.532539 |
TNFAIP1_vs_CDK6 | −0.69727 | 0.1651 | 0.497942 | 0.186041 | 1.332749 |
ERAP2_vs_LAT | 0.838926 | 0.198932 | 2.313881 | 0.643357 | 8.322049 |
LAX1_vs_DMBT1 | 0.657293 | 0.206221 | 1.929562 | 0.69636 | 5.346674 |
IL27RA_vs_FCN1 | −0.36307 | 0.293667 | 0.695535 | 0.353198 | 1.369683 |
RiskScore = −1.3414526 × LCK_vs_CTSE − 2.3465119 × GBP2_vs_MBP − 0.6602279 × COLEC12_vs_TAZ + 1.6200523 × THY1_vs_CD83 − 2.5834641 × INHBA_vs_HRH2 + 0.5778671 × CCR8_vs_AZU1 + 0.6277307 × SYK_vs_CST7 + 1.0332187 × ERAP2_vs_ZBTB16 − 1.3096065 × ELF4_vs_AIM2 − 0.6816013 × GBP2_vs_CHUK + 0.7762737 × TPD52_vs_CXCL13 + 0.7763675 × SIRPG_vs_CALCA + 0.4165898 × MNX1_vs_CARTPT − 0.6972717 × TNFAIP1_vs_CDK6 + 0.8389264 × ERAP2_vs_LAT + 0.6572929 × LAX1_vs_DMBT1 − 0.3630734 × IL27RA_vs_FCN1.
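As a worked illustration of the formula above, the following Python sketch evaluates the risk score for one patient. The coefficients are copied from the formula; the function name and the example patient are hypothetical, and the sketch assumes that each IRGP takes the value 0 or 1.

```python
# Coefficients of the 17-IRGPs signature, copied from the risk formula.
COEFFICIENTS = {
    "LCK_vs_CTSE": -1.3414526,
    "GBP2_vs_MBP": -2.3465119,
    "COLEC12_vs_TAZ": -0.6602279,
    "THY1_vs_CD83": 1.6200523,
    "INHBA_vs_HRH2": -2.5834641,
    "CCR8_vs_AZU1": 0.5778671,
    "SYK_vs_CST7": 0.6277307,
    "ERAP2_vs_ZBTB16": 1.0332187,
    "ELF4_vs_AIM2": -1.3096065,
    "GBP2_vs_CHUK": -0.6816013,
    "TPD52_vs_CXCL13": 0.7762737,
    "SIRPG_vs_CALCA": 0.7763675,
    "MNX1_vs_CARTPT": 0.4165898,
    "TNFAIP1_vs_CDK6": -0.6972717,
    "ERAP2_vs_LAT": 0.8389264,
    "LAX1_vs_DMBT1": 0.6572929,
    "IL27RA_vs_FCN1": -0.3630734,
}


def risk_score(irgp_values):
    """Weighted sum of IRGP values; irgp_values maps pair name -> 0 or 1 (assumed binary)."""
    return sum(coef * irgp_values[name] for name, coef in COEFFICIENTS.items())


# Hypothetical patient: only two pairs take the value 1.
example = {name: 0 for name in COEFFICIENTS}
example["LCK_vs_CTSE"] = 1
example["THY1_vs_CD83"] = 1
print(risk_score(example))  # sum of the two corresponding coefficients
```

For this hypothetical patient the protective pair LCK_vs_CTSE and the risk pair THY1_vs_CD83 partly offset each other, giving a score of about 0.279.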
The prognostic role of 17-IRGPs signature was verified
The 17-IRGPs signature separated patients into high- and low-risk groups in the training set, and the prognosis of the high-risk group was significantly worse than that of the low-risk group (Figure 4(a)). A prognostic difference was also observed in the testing set (Figure 4(b)), and the same result was found in the external validation set (Figure 4(c)). The prognosis of the high-risk groups in the TCGA and GSE20685 data was also significantly worse than that of the low-risk groups (Figure 4(d) and (e)). ROC analysis of the 17-IRGPs signature in the training set, testing set, external validation set, TCGA, and GSE20685 showed that the average AUC at 3, 5, and 10 years was > 0.7 (Figure 4(f) to (j)).
Subgroup and sensitivity analysis of 17-IRGPs signature
To assess the stability of the model across clinical subgroups, we analyzed its classification performance in TNBC and non-TNBC samples separately (Figure 5(a) and (b)); the high- and low-risk samples showed significant prognostic differences in both types of samples. The model was then applied to patients at different stages: there was no prognostic difference between the high- and low-risk groups in Stage I samples (Figure 5(c)), whereas there were significant differences in the Stage II and Stage III+IV groups (Figure 5(d) and (e)), with the most significant difference in the Stage III+IV group. This suggests that the 17-IRGPs signature may be more suitable for risk stratification in patients with advanced disease. The model was further applied to the different PAM50 molecular subtypes to observe its prognostic classification performance; as expected, the difference was most significant in the basal-like group, which has a poor prognosis (Figure 5(f) to (i)). To assess the robustness of the 17-IRGPs signature, we randomly resampled 1000 samples from the different datasets in different combinations and classified the resampled samples; most of the P values were below 1e-5 under resampling at different proportions (Figure 5(j)), which shows its stable predictive power.
Potentially related regulatory pathways for 17-IRGPs
To analyze the function of the 17-IRGPs signature, we first computed pathway enrichment scores for each sample in the TCGA dataset using ssGSEA, then calculated the correlation between the 17-IRGPs signature and each pathway, with FDR < 0.05 as the threshold. In total, 73 significantly correlated pathways were identified, 44 positively and 29 negatively correlated, of which 35 had correlations > 0.2 (P < 0.05) (Figure 6(a)). Most of the negatively correlated pathways are related to immunity and metabolism, such as CYTOKINE CYTOKINE RECEPTOR INTERACTION, PRIMARY IMMUNODEFICIENCY, ANTIGEN PROCESSING AND PRESENTATION, ARACHIDONIC ACID METABOLISM, LINOLEIC ACID METABOLISM, and T CELL RECEPTOR SIGNALING PATHWAY, whereas positively correlated pathways such as NUCLEOTIDE EXCISION REPAIR, STEROID BIOSYNTHESIS, and TIGHT JUNCTION are closely related to the cell cycle. These results indicate that abnormalities in metabolic, immune, and cell cycle-related pathways in the high-risk group are potential targets for breast cancer prognosis. Further, an R software package was used to compute the difference in immune microenvironment scores between high- and low-risk samples, and we observed that high-risk samples had a lower immune microenvironment score (Figure 6(b)).
17-IRGPs signature in comparison to other signatures and clinical features
We compared the predictive accuracy of the 17-IRGPs model with that of four reported breast cancer prognostic signatures: the 76-gene signature (Wang),42 the 64-gene expression signature (Pawitan),43 the cell cycle pathway signature (CCPS),44 and the 92-gene predictor (Chang).45 To make the models comparable, we computed the risk score of every BC patient in TCGA from the corresponding genes of each of the four models using the same method, evaluated the ROC of the four models, classified the samples into high- and low-risk groups based on the median risk score, and computed the difference in OS between the two groups. The 76-gene signature (Figure 7(a)), 64-gene expression signature (Figure 7(b)), and 92-gene predictor (Figure 7(c)) had better ROC and could effectively classify samples as high or low risk, while the ROC of the cell cycle pathway signature (Figure 7(d)) was relatively poor. However, their AUCs at three and five years were lower than that of the 17-IRGPs model. Furthermore, we calculated the C-index of these four models, of age, T, N, and M, and of the 17-IRGPs model; the 17-IRGPs model had the highest C-index (Figure 7(e)). Restricted mean survival (RMS) was used to evaluate the predictive performance of the five models at different time points (Figure 7(f)), and the five models showed some crossover at 110 months. Before 110 months, the 76-gene signature, 64-gene expression signature, cell cycle pathway signature, and 17-IRGPs risk model performed better than the 92-gene predictor. This suggests that our risk model is more suitable for predicting survival within 10 years.
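For readers unfamiliar with the C-index used in this comparison, a minimal Python sketch of Harrell's concordance index is given here. It is a plain illustration under stated assumptions (ties in risk score count as 0.5; a pair is comparable only when the patient with the shorter time had an observed event) and is not the rms implementation used in this study, which may handle ties differently; the function name and toy data are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: fraction of comparable pairs in which the patient
    who died earlier has the higher risk score (risk ties count 0.5)."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Comparable pair: i had an observed event strictly before j's time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: follow-up in months, event flag (1 = death), and risk scores.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
perfect = concordance_index(times, events, [0.9, 0.7, 0.1, 0.3])
partial = concordance_index(times, events, [0.9, 0.2, 0.1, 0.3])
print(perfect, partial)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, so a C-index of about 0.7 indicates that the higher-risk patient is correctly identified in roughly 70% of comparable pairs.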
Nomogram predicts OS probability
Considering that T, N, M, and age are prognostic factors for breast cancer, we integrated T, N, M, age, and the 17-IRGPs signature to establish a new nomogram that combines the important independent prognostic predictors (Figure 8). According to this model, the 17-IRGPs signature contributed the most to OS, followed by N stage, age, M stage, and T stage. By calculating the total score, oncologists can easily obtain the nomogram-predicted OS probability for an individual patient.
Discussion
BC is a highly heterogeneous disease in terms of prognosis, because BC patients with identical TNM stages have different survival times. As more and more BC is detected and treated at early stages, traditional clinicopathological criteria such as TNM staging can hardly meet the current need to predict individual outcomes, particularly for risk stratification, because no “one size fits all” approach to treatment has proven successful.46,47 Therefore, identifying prognostic markers that adequately describe the biological characteristics of tumors is essential for the individualized management of breast cancer patients. Prognostic biomarkers are key to risk stratification and treatment decisions in BC. In this work, we examined the expression profiles of 1559 breast cancer samples and identified a robust 17-IRGPs signature associated with OS, which was validated in two independent cohorts. The 17-IRGPs signature could divide BC samples into groups with different clinical and biological outcomes and was more accurate than many existing biomarkers. We then combined the 17-IRGPs signature with clinical factors and showed that it has higher accuracy in advanced BC. Finally, the signature was combined with clinical staging to establish a nomogram to help clinicians predict prognosis and make personalized treatment decisions for BC patients.
Given the inherent heterogeneity of tumors and the technical biases introduced by sequencing or microarray platforms, classical prognostic risk models require appropriate standardization of gene expression profiles, which is a bottleneck. To obtain robust BC predictions, we used a method that is robust to technical differences between platforms.48 Our proposed signature depends on the relative ranking of gene expression values and involves only pairwise comparisons within the gene expression profile of each sample; thus, no data pre-processing (e.g. scaling and normalization) is required, and this approach has produced reliable results in various studies.49–51 Therefore, our prognostic signature can be used to estimate BC survival for precision therapy and can be easily translated to clinical application.
Prognostic biomarkers associated with the tumor immune microenvironment may have good prospects for evaluating new molecular targets of immunotherapy and improving patient management. Hida et al.52 found that the proliferation and spread of tumor-infiltrating lymphocytes are a hallmark of prognosis and chemotherapy outcome in triple-negative breast cancer, and Hill et al.53 found that the accumulation of stromal cells in the tumor microenvironment accelerates the metastasis and expansion of BC. An abnormal immune microenvironment is strongly associated with the invasion and metastasis of breast cancer. Most of the genes included in the immune signature are cytokines and cytokine receptors, which play pivotal roles in chemotaxis, angiogenesis, and inflammation. An enhanced inflammatory microenvironment has been shown to be a consistent element of tumor processes. Unlike apoptosis, which is immunologically and inflammatorily silent, necrosis releases proinflammatory intracellular contents into the tumor microenvironment and triggers inflammatory responses affecting a variety of immune cells. Furthermore, tumor-associated neutrophils have been shown to be associated with prognosis in multiple cancer types. We found that the genes of this immune-related signature are mainly enriched in T cell activation, cell–cell adhesion, T cell receptor signaling pathway, Th17 cell differentiation, cytosolic DNA-sensing pathway, and other biological processes (Figure S1). CYTOKINE CYTOKINE RECEPTOR INTERACTION, PRIMARY IMMUNODEFICIENCY, T CELL RECEPTOR SIGNALING PATHWAY, and ANTIGEN PROCESSING AND PRESENTATION were consistently and significantly inhibited in the high-risk group of the TCGA dataset. Based on the above findings, a dysregulated immune environment may be the cause of the observed difference in survival between the patient groups defined by our signature.
Although we identified candidate genes for tumor prognosis in large samples using bioinformatics tools, some limitations of this study should be noted. First, some clinical follow-up information was missing for the samples, so we could not account for factors such as patients' other health conditions when identifying biomarkers. Second, results obtained through bioinformatics analysis alone are insufficient, and experimental verification is needed to substantiate them. Therefore, genetic and laboratory studies with substantially larger sample sizes are also necessary.
Conclusions
In summary, we developed a 17-IRGPs prognostic stratification system that has a promising AUC in both the training set and the validation set and is independent of clinical features; compared with clinical features, the gene classifier provides better survival risk prediction. Therefore, we propose using this classifier as a molecular diagnostic test to help estimate the prognostic risk of breast cancer patients.
Supplemental Material
Supplemental material, sj-pdf-1-ebm-10.1177_1535370220936964 for Development and verification of a personalized immune prognostic feature in breast cancer by HongLei Wang, Li Wu and HongTao Wang in Experimental Biology and Medicine
Authors’ contributions
HLW conceived and guided the research; LW and HTW analyzed the data; HLW wrote the manuscript and identified the research and editorial manuscript. All authors read and confirmed the manuscript.
DECLARATION OF CONFLICTING INTERESTS
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
FUNDING
The author(s) received no financial support for the research, authorship, and/or publication of this article.
ORCID iD
HongLei Wang https://orcid.org/0000-0002-5979-8331
SUPPLEMENTAL MATERIAL
Supplemental material for this article is available online.
References
- 1. Rakha EA, Teoh TK, Lee AH, Nolan CC, Ellis IO, Green AR. Further evidence that E-cadherin is not a tumour suppressor gene in invasive ductal carcinoma of the breast: an immunohistochemical study. Histopathology 2013; 62:695–701
- 2. Ch’ng ES, Tuan Sharif SE, Jaafar H. In human invasive breast ductal carcinoma, tumor stromal macrophages and tumor nest macrophages have distinct relationships with clinicopathological parameters and tumor angiogenesis. Virchows Arch 2013; 462:257–67
- 3. Baretta Z, Guindalini RS, Khramtsova G, Olopade OI. Resistance to trastuzumab in HER2-positive mucinous invasive ductal breast carcinoma. Clin Breast Cancer 2013; 13:156–8
- 4. Zheng J, Alsaadi T, Blaichman J, Xie X, Omeroglu A, Meterissian S, Mesurolle B. Invasive ductal carcinoma of the breast: correlation between tumor grade determined by ultrasound-guided core biopsy and surgical pathology. AJR Am J Roentgenol 2013; 200:W71–4
- 5. Currie MJ, Beardsley BE, Harris GC, Gunningham SP, Dachs GU, Dijkstra B, Morrin HR, Wells JE, Robinson BA. Immunohistochemical analysis of cancer stem cell markers in invasive breast carcinoma and associated ductal carcinoma in situ: relationships with markers of tumor hypoxia and microvascularity. Hum Pathol 2013; 44:402–11
- 6. Kim JH, Baek TH, Yim HS, Kim KH, Jeong SH, Kang HB, Oh SS, Lee HG, Kim JW, Kim KD. Collagen triple helix repeat containing-1 (CTHRC1) expression in invasive ductal carcinoma of the breast: the impact on prognosis and correlation to clinicopathologic features. Pathol Oncol Res 2013; 19:731–7
- 7. Knudsen ES, Dervishaj O, Kleer CG, Pajak T, Schwartz GF, Witkiewicz AK. EZH2 and ALDH1 expression in ductal carcinoma in situ: complex association with recurrence and progression to invasive breast cancer. Cell Cycle 2013; 12:2042–50
- 8. Hoefgen HR, Merritt DF. Invasive ductal carcinoma in a 46,XY partial androgen insensitivity syndrome patient on hormone therapy. J Pediatr Adolesc Gynecol 2015; 28:e95–e97
- 9. Sui K, Niguma T, Yamada M, Kojima T, Mimura T. [A case of a patient who underwent resection of the remnant pancreatic cancer following a distal pancreatectomy for invasive ductal carcinoma]. Gan to Kagaku Ryoho 2014; 41:2163–5
- 10. Andjelic-Dekic N, Bozovic-Spasojevic I, Milosevic S, Matijasevic M, Karadzic K. A rare case of isolated adrenal metastasis of invasive ductal breast carcinoma. Srp Arh Celok Lek 2014; 142:597–601
- 11. Ouldamer L, Lechaux E, Arbion F, Body G, Vilde A. What should be the width of radiological margin to optimize resection of non-palpable invasive or in situ ductal carcinoma? Breast 2014; 23:889–93
- 12. Huang KT, Tan D, Chen KE, Walker AM. Blockade of estrogen-stimulated proliferation by a constitutively-active prolactin receptor having lower expression in invasive ductal carcinoma. Cancer Lett 2015; 358:152–60
- 13. Ko ES, Han BK, Kim RB, Cho EY, Ahn S, Nam SJ, Ko EY, Shin JH, Hahn SY. Apparent diffusion coefficient in estrogen receptor-positive invasive ductal breast carcinoma: correlations with tumor-stroma ratio. Radiology 2014; 271:30–7
- 14. Otomi Y, Otsuka H, Terazawa K, Nose H, Kubo M, Matsuzaki K, Ikushima H, Bando Y, Harada M. Comparing the performance of visual estimation and standard uptake value of F-18 fluorodeoxyglucose positron emission tomography/computed tomography for detecting malignancy in pancreatic tumors other than invasive ductal carcinoma. J Med Invest 2014; 61:171–9
- 15. Suciu C, Muresan A, Cornea R, Suciu O, Dema A, Raica M. Semi-automated evaluation of Ki-67 index in invasive ductal carcinoma of the breast. Oncol Lett 2014; 7:107–14
- 16. Tan D, Chen KE, Deng C, Tang P, Huang J, Mansour T, Luben RA, Walker AM. An N-terminal splice variant of human Stat5a that interacts with different transcription factors is the dominant form expressed in invasive ductal carcinoma. Cancer Lett 2014; 346:148–57
- 17. Wahler J, So JY, Kim YC, Liu F, Maehr H, Uskokovic M, Suh N. Inhibition of the transition of ductal carcinoma in situ to invasive ductal carcinoma by a gemini vitamin D analog. Cancer Prev Res 2014; 7:617–26
- 18. Wan Abdul Rahman WF, Fauzi MH, Jaafar H. Expression of DNA methylation marker of paired-like homeodomain transcription factor 2 and growth receptors in invasive ductal carcinoma of the breast. Asian Pac J Cancer Prev 2014; 15:8441–5
- 19. Yang J, Zhu J, He K, Zhao LY, Liu LY, Song TS, Huang C. Proteomic profiling of invasive ductal carcinoma (IDC) using magnetic beads-based serum fractionation and MALDI-TOF MS. J Clin Lab Anal 2015; 29:321–7
- 20. Colozza M, de Azambuja E, Personeni N, Lebrun F, Piccart MJ, Cardoso F. Achievements in systemic therapies in the pregenomic era in metastatic breast cancer. Oncologist 2007; 12:253–70
- 21. Perou CM, Sorlie T, Eisen MB, van de Rijn M, Jeffrey SS, Rees CA, Pollack JR, Ross DT, Johnsen H, Akslen LA, Fluge O, Pergamenschikov A, Williams C, Zhu SX, Lonning PE, Borresen-Dale AL, Brown PO, Botstein D. Molecular portraits of human breast tumours. Nature 2000; 406:747–52
- 22. Sorlie T, Perou CM, Tibshirani R, Aas T, Geisler S, Johnsen H, Hastie T, Eisen MB, van de Rijn M, Jeffrey SS, Thorsen T, Quist H, Matese JC, Brown PO, Botstein D, Lonning PE, Borresen-Dale AL. Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proc Natl Acad Sci U S A 2001; 98:10869–74
- 23.Ahmadzadeh M, Johnson LA, Heemskerk B, Wunderlich JR, Dudley ME, White DE, Rosenberg SA. Tumor antigen-specific CD8 T cells infiltrating the tumor express high levels of PD-1 and are functionally impaired. Blood 2009; 114:1537–44 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 24.Loi S, Michiels S, Salgado R, Sirtaine N, Jose V, Fumagalli D, Kellokumpu-Lehtinen PL, Bono P, Kataja V, Desmedt C, Piccart MJ, Loibl S, Denkert C, Smyth MJ, Joensuu H, Sotiriou C. Tumor infiltrating lymphocytes are prognostic in triple negative breast cancer and predictive for trastuzumab benefit in early breast cancer: results from the FinHER trial. Ann Oncol 2014; 25:1544–50 [DOI] [PubMed] [Google Scholar]
- 25.Denkert C, Loibl S, Noske A, Roller M, Muller BM, Komor M, Budczies J, Darb-Esfahani S, Kronenwett R, Hanusch C, von Torne C, Weichert W, Engels K, Solbach C, Schrader I, Dietel M, von Minckwitz G. Tumor-associated lymphocytes as an independent predictor of response to neoadjuvant chemotherapy in breast cancer. J Clin Oncol 2010; 28:105–13 [DOI] [PubMed] [Google Scholar]
- 26.Ali HR, Provenzano E, Dawson SJ, Blows FM, Liu B, Shah M, Earl HM, Poole CJ, Hiller L, Dunn JA, Bowden SJ, Twelves C, Bartlett JM, Mahmoud SM, Rakha E, Ellis IO, Liu S, Gao D, Nielsen TO, Pharoah PD, Caldas C. Association between CD8+ T-cell infiltration and breast cancer survival in 12,439 patients. Ann Oncol 2014; 25:1536–43 [DOI] [PubMed] [Google Scholar]
- 27.Kao KJ, Chang KM, Hsu HC, Huang AT. Correlation of microarray-based breast cancer molecular subtypes and clinical outcomes: implications for treatment optimization. BMC Cancer 2011; 11:1–15 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 28.Patil P, Bachant-Winner PO, Haibe-Kains B, Leek JT. Test set bias affects reproducibility of gene signatures. Bioinformatics 2015; 31:2318–23 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 29.Desmedt C, Piette F, Loi S, Wang Y, Lallemand F, Haibe-Kains B, Viale G, Delorenzi M, Zhang Y, d’Assignies MS, Bergh J, Lidereau R, Ellis P, Harris AL, Klijn JG, Foekens JA, Cardoso F, Piccart MJ, Buyse M, Sotiriou C, Consortium T. Strong time dependence of the 76-gene prognostic signature for node-negative breast cancer patients in the TRANSBIG multicenter independent validation series. Clin Cancer Res 2007; 13:3207–14 [DOI] [PubMed] [Google Scholar]
- 30.Liberzon A, Birger C, Thorvaldsdottir H, Ghandi M, Mesirov JP, Tamayo P. The molecular signatures database (MSigDB) hallmark gene set collection. Cell Syst 2015; 1:417–25 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 31.Kostareli E, Hielscher T, Zucknick M, Baboci L, Wichmann G, Holzinger D, Mucke O, Pawlita M, Del Mistro A, Boscolo-Rizzo P, Da Mosto MC, Tirelli G, Plinkert P, Dietz A, Plass C, Weichenhan D, Hess J. Gene promoter methylation signature predicts survival of head and neck squamous cell carcinoma patients. Epigenetics 2016; 11:61–73 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 32.Zhang JX, Song W, Chen ZH, Wei JH, Liao YJ, Lei J, Hu M, Chen GZ, Liao B, Lu J, Zhao HW, Chen W, He YL, Wang HY, Xie D, Luo JH. Prognostic and predictive value of a microRNA signature in stage II Colon cancer: a microRNA expression analysis. Lancet Oncol 2013; 14:1295–306 [DOI] [PubMed] [Google Scholar]
- 33.Papaemmanuil E, Gerstung M, Malcovati L, Tauro S, Gundem G, Van Loo P, Yoon CJ, Ellis P, Wedge DC, Pellagatti A, Shlien A, Groves MJ, Forbes SA, Raine K, Hinton J, Mudie LJ, McLaren S, Hardy C, Latimer C, Della Porta MG, O’Meara S, Ambaglio I, Galli A, Butler AP, Walldin G, Teague JW, Quek L, Sternberg A, Gambacorti-Passerini C, Cross NC, Green AR, Boultwood J, Vyas P, Hellstrom-Lindberg E, Bowen D, Cazzola M, Stratton MR, Campbell PJ; Chronic Myeloid Disorders Working Group of the International Cancer Genome C. Clinical and biological implications of driver mutations in myelodysplastic syndromes. Blood 2013; 122:3616–27; quiz 99 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 34.Yuan Y, Van Allen EM, Omberg L, Wagle N, Amin-Mansour A, Sokolov A, Byers LA, Xu Y, Hess KR, Diao L, Han L, Huang X, Lawrence MS, Weinstein JN, Stuart JM, Mills GB, Garraway LA, Margolin AA, Getz G, Liang H. Assessing the clinical utility of cancer genomic and proteomic data across tumor types. Nat Biotechnol 2014; 32:644–52 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 35.Friedman J, Hastie T, Tibshirani R. Regularization paths for generalized linear models via coordinate descent. J Stat Softw 2010; 33:1–22 [PMC free article] [PubMed] [Google Scholar]
- 36.Yu G, Wang LG, Han Y, He QY. clusterProfiler: an R package for comparing biological themes among gene clusters. Omics 2012; 16:284–7 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 37.Hanzelmann S, Castelo R, Guinney J. GSVA: gene set variation analysis for microarray and RNA-seq data. BMC Bioinformatics 2013; 14:7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 38.Liberzon A, Subramanian A, Pinchback R, Thorvaldsdottir H, Tamayo P, Mesirov JP. Molecular signatures database (MSigDB) 3.0. Bioinformatics 2011; 27:1739–40 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 39.Robin X, Turck N, Hainard A, Tiberti N, Lisacek F, Sanchez J-C, Müller M. pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinformatics 2011; 12:77. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 40.Yang Q, Guo B, Sun H, Zhang J, Liu S, Hexige S, Yu X, Wang X. Identification of the key genes implicated in the transformation of OLP to OSCC using RNA-sequencing. Oncol Rep 2017; 37:2355–65 [DOI] [PubMed] [Google Scholar]
- 41.Wang Y, Li J, Xia Y, Gong R, Wang K, Yan Z, Wan X, Liu G, Wu D, Shi L, Lau W, Wu M, Shen F. Prognostic nomogram for intrahepatic cholangiocarcinoma after partial hepatectomy. J Clin Oncol 2012; 31:1188–95 [DOI] [PubMed] [Google Scholar]
- 42.Wang Y, Klijn JG, Zhang Y, Sieuwerts AM, Look MP, Yang F, Talantov D, Timmermans M, Meijer-van Gelder ME, Yu J, Jatkoe T, Berns EM, Atkins D, Foekens JA. Gene-expression profiles to predict distant metastasis of lymph-node-negative primary breast cancer. Lancet 2005; 365:671–9 [DOI] [PubMed] [Google Scholar]
- 43.Pawitan Y, Bjohle J, Amler L, Borg AL, Egyhazi S, Hall P, Han X, Holmberg L, Huang F, Klaar S, Liu ET, Miller L, Nordgren H, Ploner A, Sandelin K, Shaw PM, Smeds J, Skoog L, Wedren S, Bergh J. Gene expression profiling spares early breast cancer patients from adjuvant therapy: derived and validated in two population-based cohorts. Breast Cancer Res 2005; 7:R953–64 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 44.Cao B, Wang Q, Zhang H, Zhu G, Lang J. Two immune-enhanced molecular subtypes differ in inflammation, checkpoint signaling and outcome of advanced head and neck squamous cell carcinoma. Oncoimmunology 2018; 7:1–13 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 45.Chang JC, Wooten EC, Tsimelzon A, Hilsenbeck SG, Gutierrez MC, Elledge R, Mohsin S, Osborne CK, Chamness GC, Allred DC, O’Connell P. Gene expression profiling for the prediction of therapeutic response to docetaxel in patients with breast cancer. Lancet 2003; 362:362–9 [DOI] [PubMed] [Google Scholar]
- 46.Llovet JM, Ricci S, Mazzaferro V, Hilgard P, Gane E, Blanc JF, de Oliveira AC, Santoro A, Raoul JL, Forner A, Schwartz M, Porta C, Zeuzem S, Bolondi L, Greten TF, Galle PR, Seitz JF, Borbath I, Haussinger D, Giannaris T, Shan M, Moscovici M, Voliotis D, Bruix J, Group S. Sorafenib in advanced hepatocellular carcinoma. N Engl J Med 2008; 359:378–90 [DOI] [PubMed] [Google Scholar]
- 47.Cheng AL, Kang YK, Chen Z, Tsao CJ, Qin S, Kim JS, Luo R, Feng J, Ye S, Yang TS, Xu J, Sun Y, Liang H, Liu J, Wang J, Tak WY, Pan H, Burock K, Zou J, Voliotis D, Guan Z. Efficacy and safety of sorafenib in patients in the Asia-Pacific region with advanced hepatocellular carcinoma: a phase III randomised, double-blind, placebo-controlled trial. Lancet Oncol 2009; 10:25–34 [DOI] [PubMed] [Google Scholar]
- 48.Eddy JA, Sung J, Geman D, Price ND. Relative expression analysis for molecular cancer diagnosis and prognosis. Technol Cancer Res Treat 2010; 9:149–59 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 49.Shu P, Wu J, Tong Y, Xu C, Zhang X. Gene pair based prognostic signature for colorectal colon cancer. Medicine 2018; 97:1–7 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 50.Li B, Cui Y, Diehn M, Li R. Development and validation of an individualized immune prognostic signature in early-stage nonsquamous non-small cell lung cancer. JAMA Oncol 2017; 3:1529–37 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 51.Peng PL, Zhou XY, Yi GD, Chen PF, Wang F, Dong WG. Identification of a novel gene pairs signature in the prognosis of gastric cancer. Cancer Med 2018; 7:344–50 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 52.Hida AI, Watanabe T, Sagara Y, Kashiwaba M, Sagara Y, Aogi K, Ohi Y, Tanimoto A. Diffuse distribution of tumor-infiltrating lymphocytes is a marker for better prognosis and chemotherapeutic effect in triple-negative breast cancer. Breast Cancer Res Treat 2019; 178:283–94 [DOI] [PubMed] [Google Scholar]
- 53.Hill BS, Sarnella A, D’Avino G, Zannetti A. Recruitment of stromal cells into tumour microenvironment promote the metastatic spread of breast cancer. Semin Cancer Biol 2019; 60:202–13 [DOI] [PubMed] [Google Scholar]
Associated Data
Supplementary Materials
Supplemental material, sj-pdf-1-ebm-10.1177_1535370220936964 for Development and verification of a personalized immune prognostic feature in breast cancer by HongLei Wang, Li Wu and HongTao Wang in Experimental Biology and Medicine