Cell Reports Medicine. 2022 Dec 20;3(12):100873. doi: 10.1016/j.xcrm.2022.100873

HRD-related morphology discovery in breast cancer by controlling for confounding factors

Yoni Schirris 1,2, Hugo Mark Horlings 1,
PMCID: PMC9798077  PMID: 36543118

Abstract

Lazard et al.1 predict homologous recombination deficiency from hematoxylin and eosin-stained slides of breast cancer tissue using deep learning. By controlling for technical artifacts on a curated dataset, the model puts forward novel HRD-related morphologies in luminal breast cancers.


Worldwide, the prognosis of most cancers is improving, in part due to more advanced biomarker testing that enables personalized therapies. Two such therapies are poly (ADP-ribose) polymerase inhibitors (PARPi), which induce double-strand breaks, and platinum-based therapies, which induce interstrand crosslinks. Both are effective against prostate, pancreatic, ovarian, and breast tumors that have a deficient homologous recombination repair (HRR) pathway.2 Whereas an HR-proficient (HRP) cell repairs the induced damage, an HR-deficient (HRD) cancer cell is unable to do so and will die. Therefore, it is important to identify HRD clinically, as this allows for targeted treatment planning.2

The main causes of HRD are mutations in, or epigenetic modifications of, BRCA1/2 or other HRR-supporting genes (e.g., ATM, PALB2, RAD51). These genetic changes result in a functionally deficient HRR phenotype with specific genomic scars.2 The majority of FDA-approved tests focus on these mutations, epigenetic changes, or genomic scars, as they may be indicative of HRD, yet the tests yield diverging results.2 Additionally, these genetic tests have a long turn-around time and are generally considered expensive. Therefore, it is key to find quick and robust tests. Lazard et al.1 investigate the performance of artificial intelligence (AI) methods to predict HRD from hematoxylin and eosin-stained (H&E) whole-slide images (WSIs) of breast cancer (BC) tissue and describe novel HRD-related morphological patterns in clinically relevant luminal BC patients (Figure 1).

Figure 1. Interplay of PARPi response, genotype and genomic phenotype, confounding factors, and resulting morphology

The confounders may be correlated with the genotype and used by the AI model. Lazard et al.1 partly block the technical confounders by focusing on a single-center dataset and controlling for technical artifacts. Additionally, they partly block the possibility of using biological confounders by modeling only luminal BC samples. The resulting model is forced to focus on HRD-related morphologies. By analyzing the model predictions, they conclude that the morphologies described in the “Possible HRD-related morphologies” box are indicative of HRD. The figure draws heavily on Figure 2 from Stewart et al.2 The spurious correlation flow is inspired by Figure S2 in Ilse et al.3

Recently, AI has made its way into computational pathology, where models can predict a plethora of genomic mutations and signatures across cancer types from H&E WSIs.4 Notably, the scoring of microsatellite instability, a biomarker for immune therapy response in, e.g., colorectal cancer, has reached clinical-level performance5 and is available through commercial products. Since the genomic scar resulting from HRD leads to downstream changes in cellular and tumor microenvironmental morphology in, for example, ovarian cancer,6 there is an evident genotype-phenotype relation that an AI model can learn. This is supported by Valieris et al.,7 who show that an AI-based HRD classifier trained on The Cancer Genome Atlas (TCGA) could be validated on an independent test set. Building upon this method, we showed the benefit of self-supervised pre-training and tumor heterogeneity-aware multiple instance learning for the prediction of HRD in BC within TCGA, and we showed tumorous tissue with high pleomorphism, high heterogeneity, and necrosis to be HRD related.8 Finally, Wang et al.9 show that AI can predict BRCA1/2 mutation status in BC. Together, these studies provide evidence that the mutational causes and genomic scar consequences of HRD can indeed be predicted, yet the exact morphologies are not fully understood.

Although AI models can find HRD-related morphological patterns, they might pick up patterns that are spuriously correlated with HRD. For example, a model might learn that a slide’s pen marking originates from a medical center that mainly treats high-grade tumors and uses this pattern to predict the patient’s poor prognosis. Such models could not be used to successfully discover novel morphologies related to a good prognosis, or for clinical applications, since the model does not generalize to other data distributions. Guidelines to reduce the model’s abuse of spurious features have been previously described.10

Lazard et al.1 show that, besides the known site-related technical artifacts, spurious correlation manifests in TCGA through the correlation between molecular subtype and HRD, such that subtype-specific morphologies unrelated to HRD may be learned. Even their curated private dataset contains artifacts related to a change in fixation and impregnation protocols over the collection period, which can indicate the time of sample collection and be correlated with a changing patient demographic. To circumvent these confounding factors, they train solely on H&E WSIs of luminal BC from a single institution while controlling for the changed protocols using strategic sampling. This forces the model to focus more on HRD-related features, as visualized in Figure 1.
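To make the idea of sampling against a confounder concrete, the following is a minimal sketch of one possible balancing scheme; it is not necessarily the procedure used by Lazard et al.1 Each slide is weighted by the inverse size of its (HRD label, protocol) cell, so that every combination of label and protocol carries the same total sampling mass and the protocol no longer predicts the label in the sampled batches. The function names, the toy cohort, and the "old"/"new" protocol labels are all hypothetical.

```python
import random
from collections import Counter


def balancing_weights(labels, confounders):
    """Weight each slide by the inverse size of its (label, confounder) cell.

    Every (label, confounder) combination then has a total weight of 1,
    so sampled batches show no association between the two variables.
    """
    cell_counts = Counter(zip(labels, confounders))
    return [1.0 / cell_counts[(y, c)] for y, c in zip(labels, confounders)]


def sample_indices(labels, confounders, k, seed=0):
    """Draw k slide indices (with replacement) using the balancing weights."""
    rng = random.Random(seed)
    weights = balancing_weights(labels, confounders)
    return rng.choices(range(len(labels)), weights=weights, k=k)


# Hypothetical toy cohort: HRD is over-represented under the old protocol.
labels = ["HRD"] * 8 + ["HRP"] * 2 + ["HRD"] * 2 + ["HRP"] * 8
protocols = ["old"] * 10 + ["new"] * 10

weights = balancing_weights(labels, protocols)
batch = sample_indices(labels, protocols, k=16)
```

In a training loop, such weights could be passed to whatever weighted sampler the framework provides; the point of the sketch is only that the weighting, rather than the raw cohort, determines how often each label and protocol combination is seen.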

The two-dimensional representation of the latent feature vectors of this model’s high and low HRD-scoring tiles confirms that TILs, necrosis, and high atypia are HRD-related morphologies. In contrast, low tumor cell density and clear space surrounding (apocrine) cell nests are related to HRP tumors. The predictive value of TIL density, nuclear grade, and atypia grade is validated through pathologist scoring, which shows evident correlations between the pathologist scores and HRD status. Furthermore, a visualization that walks varying paths from low- to high-scoring tiles in the two-dimensional feature space shows a continuous degree of HRDness along the dimensions of nucleus size, lymphocytic infiltration, or necrotic features, reminiscent of latent feature interpolation in generative networks. Finally, they find tiles enriched with carcinomatous cells with clear cytoplasm and intra-tumoral laminated fibrosis as novel HRD-related morphologies, hypothesizing an alteration of specific metabolic processes.
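The path visualization described above can be illustrated with a small sketch. It assumes that each tile already has a two-dimensional embedding, and it simply picks the tile nearest to each of several evenly spaced points on a straight line between a low-scoring and a high-scoring tile. This is an illustrative simplification, not the authors' procedure; the function name `walk_path` and the toy coordinates are hypothetical.

```python
import math


def walk_path(embeddings, start, end, steps=5):
    """Return indices of the tiles nearest to evenly spaced points on start -> end.

    embeddings: list of (x, y) tile coordinates in the 2D feature space.
    start, end: (x, y) coordinates of the low- and high-scoring endpoints.
    steps: number of points sampled along the segment (at least 2).
    """
    path = []
    for i in range(steps):
        t = i / (steps - 1)
        point = (
            start[0] + t * (end[0] - start[0]),
            start[1] + t * (end[1] - start[1]),
        )
        nearest = min(
            range(len(embeddings)),
            key=lambda j: math.dist(embeddings[j], point),
        )
        # Skip consecutive duplicates so each tile appears once per visit.
        if not path or path[-1] != nearest:
            path.append(nearest)
    return path


# Hypothetical 2D embeddings; tile 0 is low-scoring, tile 3 is high-scoring.
tiles = [(0.0, 0.0), (1.0, 0.1), (2.0, -0.1), (3.0, 0.0), (1.5, 2.0)]
path = walk_path(tiles, tiles[0], tiles[3], steps=4)
```

Displaying the tiles in `path` in order would give a sequence of images along which the morphology can be inspected for gradual change; tile 4 lies off the line and is never selected.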

In conclusion, downstream cellular and tumor microenvironmental phenotypes caused by upstream intracellular functional changes, like HRD, are observable in H&E WSIs of BC tissue. AI can learn to correlate such morphologies with the genotype without morphological prior knowledge. Although technical or biological confounding factors may be present, methods like strategic sampling and careful data curation can reduce the model’s capacity to exploit them. Lazard et al.1 exemplify these methods, shed new light on morphologies that are indicative of HRD in luminal BC, and set the stage for further work to validate the clinical applicability of H&E WSI-based deep learning methods as a complementary diagnostic tool for HRD pre-screening.

Acknowledgments

The collaboration project is co-funded by the PPP Allowance made available by Health Holland, Top Sector Life Sciences & Health, to stimulate public-private partnerships (https://www.health-holland.com).

References

  • 1. Lazard T., Bataillon G., Naylor P., Popova T., Bidard F.-C., Stoppa-Lyonnet D., Stern M.-H., Decencière E., Walter T., Vincent-Salomon A. Deep Learning identifies new morphological patterns of Homologous Recombination Deficiency in luminal breast cancers from whole slide images. Cell Rep. Med. 2022;3:100872. doi: 10.1016/j.xcrm.2022.100872.
  • 2. Stewart M.D., Merino Vega D., Arend R.C., Baden J.F., Barbash O., Beaubier N., Collins G., French T., Ghahramani N., Hinson P., et al. Homologous recombination deficiency: Concepts, Definitions, and Assays. Oncol. 2022;27:167–174. doi: 10.1093/oncolo/oyab053.
  • 3. Ilse M., Tomczak J.M., Louizos C., Welling M. DIVA: Domain Invariant Variational Autoencoders. Proc. Machine Learning Res. 2021;121:322–348.
  • 4. Kather J.N., Heij L.R., Grabsch H.I., Loeffler C., Echle A., Muti H.S., Krause J., Niehues J.M., Sommer K.A.J., Bankhead P., et al. Pan-cancer image-based detection of clinically actionable genetic alterations. Nature Cancer. 2020;1:789–799. doi: 10.1038/s43018-020-00149-6.
  • 5. Echle A., Grabsch H.I., Quirke P., van den Brandt P.A., West N.P., Hutchins G.G., Heij L.R., Tan X., Richman S.D., Krause J., et al. Clinical-grade detection of microsatellite instability in colorectal tumors by deep learning. Gastroenterology. 2020;159:1406–1416.e11. doi: 10.1053/j.gastro.2020.06.021.
  • 6. Soslow R.A., Han G., Park K.J., Garg K., Olvera N., Spriggs D.R., Kauff N.D., Levine D.A. Morphologic patterns associated with BRCA1 and BRCA2 genotype in ovarian carcinoma. Mod. Pathol. 2012;25:625–636. doi: 10.1038/modpathol.2011.183.
  • 7. Valieris R., Amaro L., Osório C.A. B. d T., Bueno A.P., Rosales Mitrowsky R.A., Carraro D.M., Nunes D.N., Dias-Neto E., Silva I.T. Deep learning predicts Underlying features on pathology images with Therapeutic relevance for breast and Gastric cancer. Cancers. 2020;12:3687. doi: 10.3390/cancers12123687.
  • 8. Schirris Y., Gavves E., Nederlof I., Horlings H.M., Teuwen J. DeepSMILE: Contrastive self-supervised pre-training benefits MSI and HRD classification directly from H&E whole-slide images in colorectal and breast cancer. Med. Image Anal. 2022;79:102464. doi: 10.1016/j.media.2022.102464.
  • 9. Wang X., Zou C., Zhang Y., Li X., Wang C., Ke F., Chen J., Wang W., Wang D., Xu X., Xie L., Zhang Y. Prediction of BRCA gene mutation in breast cancer based on deep learning and Histopathology images. Front. Genetics. 2021;12:661109. doi: 10.3389/fgene.2021.661109.
  • 10. Javed S.A., Juyal D., Shanis Z., Chakraborty S., Pokkalla H., Prakash A. Rethinking Machine Learning Model Evaluation in Pathology. Preprint at arXiv. 2022 doi: 10.48550/arXiv.2204.05205.

Articles from Cell Reports Medicine are provided here courtesy of Elsevier
