Abstract
Estrogen receptor (ER) positivity by immunohistochemistry has long been a main selection criterium for breast cancer patients to be treated with endocrine therapy. However, ER positivity might not directly correlate with activated ER signaling activity, which is a better predictor for endocrine therapy responsiveness. In this study, we investigated if a deep learning method using whole-slide H&E-stained images could predict ER signaling activity. First, ER signaling activity score was determined using RNAseq data available from each of the 1082 breast cancer samples in the TCGA Pan-Cancer dataset based on the Hallmark Estrogen Response Early gene set from the Molecular Signature Database (MSigDB). Then the processed H&E-stained images and ER signaling activity scores from a training cohort were fed into ResNet101 with three additional fully connected layers to generate a predicted ER activity score. The trained models were subsequently applied to an independent testing cohort. The result demonstrated that ER + /HER2- breast cancer patients with a higher predicted ER activity score had longer progression-free survival (p = 0.0368) than those with lower predicted ER activity score. In conclusion, a convolutional deep neural network can predict prognosis and endocrine therapy response in breast cancer patients based on whole-slide H&E-stained images. The trained models were found to robustly predict the prognosis of ER + /HER2- patients. This information is valuable for patient management, as it does not require RNA-seq or microarray data analyses. Thus, these models can reduce the cost of the diagnosis workflow if such information is required.
Subject terms: Breast cancer, Cancer screening, Cancer genomics
Introduction
Breast cancers can be categorized into three major subtypes on the basis of the hormone receptors: estrogen receptor (ER) positivity, progesterone receptor (PR) positivity and human epidermal growth factor 2 (HER2, also known as ERBB2) positivity1–3. ER + /HER2- breast cancer is the most common subtype, constituting about 70% of the cases; the ER-/HER2 + subtype constitutes about 15–20%, and the ER-/PR-/HER2- subtype (triple-negative breast cancer [TNBC]) constitutes about 15%4,5. ER + breast cancers depend on ER signaling for proliferation, while HER2 + breast cancers depend on the HER2 signaling pathway.
Endocrine therapy reduces the recurrence rate of ER + breast cancers by about 50% and improves survival time by targeting ER and thus its downstream signaling. The most commonly used endocrine therapy drugs are tamoxifen, letrozole, and fulvestrant6. Although these drugs have relatively mild side effects, mainly post-menopausal symptoms, about 40% of patients do not complete the full 5-year treatment7; there is a need to identify patients who will experience a response to endocrine therapy to improve quality of life for the patients by avoiding unnecessary treatment.
The coupling of ER expression and ER pathway activity is essential for cell proliferation through ER signaling and endocrine therapy responsiveness8–10. Since ER + /HER2- cancers are commonly treated with endocrine therapy, recent study has found that ER pathway activity was significantly associated with survival duration in patients with ER + /HER2- breast cancer9. This implies that ER expression and elevated levels of ER pathway activity are related to endocrine therapy responsiveness. The ability to predict the efficacy of endocrine therapy in ER + /HER2- patients would facilitate the decision-making process for both physicians and patients. However, unlike hematoxylin-and-eosin (H&E) staining and ER/HER2/PR immunohistochemistry staining of tumor samples, the determination of ER pathway activity requires transcriptomic analyses that are not part of standard diagnostic procedures; thus, these data are usually unavailable.
Deep learning models for the prediction of ER/PR/HER2 status with H&E whole-slide images have been proposed in different studies11,12. However, there is also a need to predict ER signaling activity to determine prognosis and endocrine therapy effectiveness as no such study has been done. In this study, we determined whether a convolutional deep neural network and whole-slide H&E-stained images of The Cancer Genome Atlas (TCGA) breast cancer tumor samples could be used to predict ER signaling pathway activity to determine prognosis in patients with ER + /HER2- breast cancer. We measured ER pathway activity using the Early Estrogen Response Enrichment Score (EERES), determined by a Gene Set Variance Analysis (GSVA) with the Molecular Signature Database (MSigDB) hallmark Estrogen Response Early gene set13–15. We hypothesized that the predicted ER activity score from the deep learning model could determine endocrine therapy response in ER + /HER2- breast cancer patients. In line with our hypothesis, our results demonstrated that the predicted ER activity scores from whole-slide H&E-stained images were significantly associated with progression-free survival (PFS) in ER + /HER- breast cancer patients.
Results
ER + /HER2- breast cancer patient survival based on the EERES
The EERES of level 3 gene expression data (RNAseq) on 1082 TCGA Pan-Cancer breast cancer patient tumor samples from cBioPortal were determined by GSVA with the Estrogen Response Early gene set from MsigDB as described in the method section. The determined EERES scores for each sample are provided in Supplemental Table S1. The correlation between ESR1 gene expression and EERES for ER + /HER2- breast cancer and TNBC tumor samples are shown in Fig. 1a. The PFS and overall survival (OS) durations of ER + /HER2- and TNBC breast cancer patients are shown in Fig. 1b and c, respectively. The OS durations (p = 0.0394) of ER + /HER2- breast cancer patients were significantly longer than those of TNBC patients.
We next determined the association between the EERES and ER + /HER2- breast cancer patient survival (Fig. 1d and e, respectively). The PFS (p = 0.00466) and OS (p = 0.0189) durations were significantly longer for patients with an EERES > 0.2. As expected, the PFS/OS durations of TNBC patients were not statistically significant according to the EERES (data not shown).
Model training and evaluation
Examples of the image preparation and model architecture are shown in Fig. 2. The training data was labeled 0 or 1 according to the EERES at the cut-off quantile of 0.5 (label 0 if EERES ≤ − 0.0376; label 1 if EERES > − 0.0376). The metadata for the 1077 images are provided in Supplemental Table S2. The model was trained for 30 epochs.
The predicted ER activity scores of the testing cohort (n = 811, one sample without known hormonal receptor status was ignored) were determined by the trained model and are provided in Supplemental Table S3. The correlation of the predicted ER activity scores of ER + /HER2- with the actual EERES was determined with Pearson correlation test and the scatter plot with the statistics are shown in Fig. 3a (r = 0.135, p = 0.0147). The ER + /HER2- patients with higher and lower predicted ER activity score were compared for their survival using Kaplan–Meier curve (Fig. 3b). We found that ER + /HER2- patients with higher predicted score (quantile 0.2, predicted ER activity score threshold at 0.48964) had significantly longer response time (p = 0. 0368), indicating that higher predicted ER activity score was related to a better endocrine therapy response.
Moreover, to consider other clinical data (diagnostic age and stage) might impact treatment response (PFS), multivariant analysis of predicted ER activity score, age and stage of the testing samples were fitted into a cox proportional hazard (CoxPH) model with PFS duration. As shown in Fig. 4a, the predicted ER activity had an extreme low hazard ratio (HR) (HR < 0.001, p < 0.005), which implies a much lower risk of tumor progression with a higher predicted ER activity score. On the other hand, a higher stage (HR = 1.2, p < 0.005) and an older diagnostic age (HR = 1.01, p < 0.005) might have an increased risk of tumor progression. Thus, we fitted a CoxPH using data including predicted ER activity score, diagnostic age, and stage with PFS duration data from the training cohort to generate a risk score and applied the model to predict a risk score for the testing cohort. The resulting Kaplan–Meier curves for the PFS and OS duration between high and low predicted risk ER + /HER2- patients are shown in Fig. 4b and c, respectively. The PFS (p = 6.38e-6) and OS (p = 4.73e-10) durations were significantly longer for patients with a lower predicted risk at quantile 0.9.
The predicted risks of the ER + /HER2- patients were further evaluated by comparing their survival durations with those of TNBC patients (Fig. 4d–g). The PFS (p = 0.0206) and OS (p = 0.0178) durations of the testing ER + /HER2- patients with lower predicted risks were significantly longer than those of TNBC patients. On the other hand, the PFS and OS (p = 0.0152 and p = 0.0177, respectively) of ER + /HER2- patients with higher predicted risks had even shorter survival than TNBC patients.
Discussion
In this study, we presented a novel model based on ESR1 signaling activity score that can predict the prognosis and endocrine therapy response of ER + /HER2- breast cancer patients from whole-slide tumor H&E staining using a convolution deep neural network. The trained models were found to robustly predict the prognosis of ER + /HER2- patients; this information is valuable for patient management, as it does not require RNA-seq or microarray data analyses. Thus, these models can reduce the cost of the diagnosis workflow if such information is required. Using a CoxPH model based on the predicted ER activity score, age and stage, we were able to identify ER + /HER2- breast cancer patients from a testing cohort with high risk of progressive disease and even had poorer survival than TNBC patients. These high-risk patients could be spared from unnecessary endocrine therapy.
The model demonstrated its ability to distinguish patients with different prognoses according to the level of the predicted ER activity score from whole-slide tumor H&E staining for ER + /HER2- disease. In addition, the statistical difference between patients with ER + /HER2- breast cancer with a higher predicted ER activity score (but not those with a lower predicted ER activity score) and patients with TNBC confirmed that our model is predictive of endocrine therapy response in those with ER + /HER2- disease but not in those without ESR1 expression. This model could be developed as a medical device to evaluate treatment but will need further development16. Moreover, as EERES is related to the lower American Joint Committee on Cancer stage and the lower pathological grade, this model could also supplement this information9.
Other models have been extensively developed using convolutional neural networks and H&E images to identify mutation and receptor expression status, such as for the prediction of the expression of hormonal receptors, PD-L1, and BRCA mutations11,12,17–20. Cancer prognosis can also be predicted using histological information from tumor H&E images, together with other omics data21. However, predicting signaling activity to determine the prognosis of cancer patients using H&E images is novel and suggests that the histology of tumors reflects their signaling activities. Since reports have shown that the signaling pathway activity of different cancers cannot be reflected simply by their mutational status22–24, the results of this study might encourage more research using imaging data to predict drug sensitivity according to targeted pathway activity.
This study employed transfer learning by adopting the pre-trained convolutional neural network ResNet101. However, the model could be further improved for accuracy and interpretability using the attention method in deep learning25. It is also noteworthy that because of the randomness of annotation in the training process, the trained models could be slightly different each time.
In conclusion, our study provides a novel machine learning model for cancer prognosis and theragnosis by predicting ER signaling activity in patients with ER + /HER2- breast cancer from whole-slide tumor H&E staining.
Methods
Gene expression data, H&E-stained images, and EERES
Level 3 gene expression profiles of 1082 breast cancer patients in TCGA Pan-Cancer were downloaded from cBioportal (https://cbioportal.org)26,27. The H&E-stained tumor images of 1077 (some of the patients included in gene expression data were not available for their tumor image) non-duplicated patients and their histological and pathological statuses were downloaded from the Genomic Data Commons Data Portal (https://portal.gdc.cancer.gov). The EERES for each sample were determined by GSVA with the gene set Hallmark Estrogen Response Early from MSigDB13. To elaborate, the gene expression matrix of all the samples and the list of the gene set Hallmark Estrogen Response Early from MSigDB (https://www.gsea-msigdb.org/gsea/msigdb/human/geneset/HALLMARK_ESTROGEN_RESPONSE_EARLY.html) with its corresponding 200 gene symbols were loaded into R package GSVA to determine the ER pathway activity score EERES. The gene expression matrix for all samples was generated from the gene expression level (TPM, transcript per million) of each gene in each sample using log2(TPM + 1) values.
Data preparation
Out of the 1082 samples with gene expression profiles, 1077 samples had both gene expression profiles and whole-slide H&E-stained images. These 1077 samples were from four institutions. The dataset was split into training cohort (n = 265) using samples from three institutions—Harvard Medical School (Center code = 02), Lawrence Berkeley National Laboratory (Center code = 03), and Memorial Sloan-Kettering Cancer Center (Center code = 04). The trained model was then applied to the testing cohort (n = 812) from a single institution—Broad Institute of MIT (Massachusetts Institute of Technology) (Center code = 01). The downloaded images were cropped to square images using the length of the shorter side if their widths and lengths were not the same. The original images were then flipped horizontally or vertically according to their width or length and cropped to the same length as another square. The images were resized to 1024 × 1024 pixels. The images were labeled as 0 if the EERES was lower or equal to quantile 0.5 and 1 if it was higher than 0.5 (EERES threshold = -0.0376).
Survival and statistical analyses
The Kaplan–Meier survival curves were plotted, and the log-rank statistics were determined using the Python package kaplanmeier. The optimal threshold for separating the survival time of the breast cancer patients was identified by searching for the most significant log-rank p value of PFS or OS durations using the predicted ER activity score in the range from quantile 0.1 to 0.9 (Table S4). Pearson statistics were determined using the Python library SciPy28. p value lower than 0.05 is considered significant. CoxPH method was implemented using Python package lifelines29 to generate a predicted risk score incorporating predicted ER activity score, age, and stage. A CoxPH model was fitted with data from the training cohort. The trained model was then used to predict the risk of the testing cohort using their predicted ER activity score, diagnostic age and stage. The most significant log-rank p-value was found by comparing the patients with predicted lower and higher risk of tumor progression using different quantile. The log-rank p-values between testing patients with low and high risk using different quantile are provided in Table S5.
Model training and evaluation
The models were constructed and trained with the Python framework PyTorch30. A pre-trained ResNet101 with default parameters was employed31. The model was followed by fully connected layers with 128, 32, and 2 perceptrons, with a 0.2 dropout rate. The two perceptrons were followed by SoftMax for the probability of 0 and 1 classes. The loss was determined by CrossEntropyLoss, and the model was optimized by SGD (Stochastic Gradient Descent), with a 0.001 learning rate and a 0.9 momentum. The learning rate decay factor was 0.1 for every seven epochs. The training data were shuffled with random horizontal, vertical flip, and 360-degree rotation, with batch size 3. For each epoch, the model was evaluated with training data without augmentation.
The samples (n = 1077) were first separated into training (n = 812) and testing (n = 265) cohort first by their center code. A model was then trained with 30 epochs. The trained model was then used to predict the ER activity score for the testing cohort and determine the prognostic value of the predicted score.
Ethical approval
This study used open-access TCGA data that were de-identified and accordance with informed consent documents by the NCBI and were thus exempted from IRB approval (https://www.cancer.gov/about-nci/organization/ccg/research/structural-genomics/tcga/history/policies/tcga-human-subjects-data-policies.pdf [accessed Mar-9–2023]).
Supplementary Information
Acknowledgements
The results shown here are, in whole or in part, based on data generated by the TCGA Research Network: https://www.cancer.gov/tcga. The study was partly supported by the Ovarian Cancer Moon Shot at MD Anderson Cancer Center. We thank Ashli Ann M Sutton, Scientific Editor at the Research Medical Library of the University of Texas MD Anderson Cancer Center for editing this article.
Author contributions
C.N.W. proposed the idea; curated, analyzed, and interpreted the data; and prepared the manuscript. K.K.W. initiated the concept and participated in writing, critical review, commentary, and revision of the manuscript. All authors read and approved the final manuscript.
Code availability
A html file of the Jupyter notebook used to generate the results is available in the supplementary information.
Data availability
Processed data are available from the corresponding author upon reasonable request.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
The online version contains supplementary material available at 10.1038/s41598-023-48830-x.
References
- 1.Burstein HJ. Systemic therapy for estrogen receptor-positive, HER2-negative breast cancer. N. Engl. J. Med. 2020;383:2557–2570. doi: 10.1056/NEJMra1307118. [DOI] [PubMed] [Google Scholar]
- 2.Jeselsohn RM. The evolving use of SERDs in estrogen receptor-positive, HER2-negative metastatic breast cancer. Clin. Adv. Hematol. Oncol. 2021;19:428–431. [PubMed] [Google Scholar]
- 3.McAndrew NP, Finn RS. Clinical review on the management of hormone receptor-positive metastatic breast cancer. JCO Oncol. Pract. 2022;18:319–327. doi: 10.1200/OP.21.00384. [DOI] [PubMed] [Google Scholar]
- 4.Nounou MI, et al. Breast cancer: Conventional diagnosis and treatment modalities and recent patents and technologies. Breast Cancer (Auckl) 2015;9:17–34. doi: 10.4137/BCBCR.S29420. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5.Kondov B, et al. Presentation of the molecular subtypes of breast cancer detected by immunohistochemistry in surgically treated patients. Open Access Maced. J. Med. Sci. 2018;6:961–967. doi: 10.3889/oamjms.2018.231. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6.Burstein HJ, et al. Endocrine treatment and targeted therapy for hormone receptor-positive, human epidermal growth factor receptor 2-negative metastatic breast cancer: ASCO guideline update. J. Clin. Oncol. 2021;39:3959–3977. doi: 10.1200/JCO.21.01392. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7.Milata JL, Otte JL, Carpenter JS. Oral endocrine therapy nonadherence, adverse effects, decisional support, and decisional needs in women with breast cancer. Cancer Nurs. 2018;41:E9–E18. doi: 10.1097/NCC.0000000000000430. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.Fuentes N, Silveyra P. Estrogen receptor signaling mechanisms. Adv. Protein Chem. Struct. Biol. 2019;116:135–170. doi: 10.1016/bs.apcsb.2019.01.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 9.Oshi M, et al. Degree of early estrogen response predict survival after endocrine therapy in primary and metastatic ER-positive breast cancer. Cancers. 2020 doi: 10.3390/cancers12123557. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Yao J, Deng K, Huang J, Zeng R, Zuo J. Progress in the understanding of the mechanism of tamoxifen resistance in breast cancer. Front. Pharmacol. 2020;11:592912. doi: 10.3389/fphar.2020.592912. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11.Farahmand S, et al. Deep learning trained on hematoxylin and eosin tumor region of Interest predicts HER2 status and trastuzumab treatment response in HER2+ breast cancer. Mod. Pathol. 2022;35:44–51. doi: 10.1038/s41379-021-00911-w. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12.Naik N, et al. Deep learning-enabled breast cancer hormonal receptor status determination from base-level H&E stains. Nat. Commun. 2020;11:5727. doi: 10.1038/s41467-020-19334-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 13.Hanzelmann S, Castelo R, Guinney J. GSVA: Gene set variation analysis for microarray and RNA-seq data. BMC Bioinform. 2013;14:7. doi: 10.1186/1471-2105-14-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 14.Liberzon A, et al. The molecular signatures database (MSigDB) hallmark gene set collection. Cell Syst. 2015;1:417–425. doi: 10.1016/j.cels.2015.12.004. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15.Subramanian A, et al. Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles. Proc. Natl. Acad. Sci. U. S. A. 2005;102:15545–15550. doi: 10.1073/pnas.0506580102. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16.Ngiam KY, Khor IW. Big data and machine learning algorithms for health-care delivery. Lancet Oncol. 2019;20:e262–e273. doi: 10.1016/S1470-2045(19)30149-4. [DOI] [PubMed] [Google Scholar]
- 17.Rawat RR, et al. Deep learned tissue "fingerprints" classify breast cancers by ER/PR/Her2 status from H&E images. Sci. Rep. 2020;10:7275. doi: 10.1038/s41598-020-64156-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 18.Shamai G, et al. Deep learning-based image analysis predicts PD-L1 status from H&E-stained histopathology images in breast cancer. Nat. Commun. 2022;13:6753. doi: 10.1038/s41467-022-34275-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19.Nero C, et al. Deep-learning to predict BRCA mutation and survival from digital H&E slides of epithelial ovarian cancer. Int. J. Mol. Sci. 2022 doi: 10.3390/ijms231911326. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 20.Gamble P, et al. Determining breast cancer biomarker status and associated morphological features using deep learning. Commun. Med. (Lond) 2021;1:14. doi: 10.1038/s43856-021-00013-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 21.Schneider L, et al. Integration of deep learning-based image analysis and genomic data in cancer pathology: A systematic review. Eur. J. Cancer. 2022;160:80–91. doi: 10.1016/j.ejca.2021.10.007. [DOI] [PubMed] [Google Scholar]
- 22.Way GP, et al. Machine learning detects pan-cancer Ras pathway activation in the cancer genome atlas. Cell Rep. 2018;23(172–180):e173. doi: 10.1016/j.celrep.2018.03.046. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 23.Li X, Li S, Wang Y, Zhang S, Wong KC. Identification of pan-cancer Ras pathway activation with deep learning. Brief Bioinform. 2021 doi: 10.1093/bib/bbaa258. [DOI] [PubMed] [Google Scholar]
- 24.Tang YC, Gottlieb A. Explainable drug sensitivity prediction through cancer pathway enrichment. Sci. Rep. 2021;11:3128. doi: 10.1038/s41598-021-82612-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 25.Zhu H, Xie C, Fei Y, Tao H. Attention mechanisms in CNN-based single image super-resolution: A brief review and a new perspective. Electronics. 2021;10:1187. doi: 10.3390/electronics10101187. [DOI] [Google Scholar]
- 26.Cerami E, et al. The cBio cancer genomics portal: An open platform for exploring multidimensional cancer genomics data. Cancer Discov. 2012;2:401–404. doi: 10.1158/2159-8290.CD-12-0095. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 27.Gao J, et al. Integrative analysis of complex cancer genomics and clinical profiles using the cBioPortal. Sci. Signal. 2013;6:pl1. doi: 10.1126/scisignal.2004088. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 28.Virtanen P, et al. SciPy 1.0: fundamental algorithms for scientific computing in Python. Nat Methods. 2020;17:261–272. doi: 10.1038/s41592-019-0686-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 29.Davidson-Pilon C. lifelines: Survival analysis in Python. J. Open Source Softw. 2019 doi: 10.21105/joss.01317. [DOI] [Google Scholar]
- 30.Paszke, A. et al. PyTorch: An imperative style, high-performance deep learning library. Adv Neur In32 (2019).
- 31.He KM, Zhang XY, Ren SQ, Sun J. Deep residual learning for image recognition. Proc. Cvpr IEEE. 2016 doi: 10.1109/Cvpr.2016.90. [DOI] [Google Scholar]
Associated Data
This section collects any data citations, data availability statements, or supplementary materials included in this article.
Supplementary Materials
Data Availability Statement
A html file of the Jupyter notebook used to generate the results is available in the supplementary information.
Processed data are available from the corresponding author upon reasonable request.