Abstract
PURPOSE
The proven efficacy of human epidermal growth factor receptor 2 (HER2) antibody-drug conjugate therapy for treating HER2-low breast cancers necessitates more accurate and reproducible HER2 immunohistochemistry (IHC) scoring. We aimed to validate the performance and utility of a fully automated artificial intelligence (AI) solution for interpreting HER2 IHC in breast carcinoma.
MATERIALS AND METHODS
A two-arm multireader study of 120 HER2 IHC whole-slide images from four sites assessed HER2 scoring by four surgical pathologists without and with the aid of an AI HER2 solution. Both arms were compared with high-confidence ground truth (GT) established by agreement of at least four of five breast pathology subspecialists according to ASCO/College of American Pathologists (CAP) 2018/2023 guidelines.
RESULTS
The mean interobserver agreement among GT pathologists across all HER2 scores was 72.4% (N = 120). The AI solution demonstrated high accuracy for HER2 scoring, with 92.1% agreement on slides with high-confidence GT (n = 92). The use of the AI tool improved reader performance: interobserver agreement increased from 75.0% for the digital manual read to 83.7% for the AI-assisted review, and scoring accuracy improved from 85.3% to 88.0%. For the distinction of HER2 0 from 1+ cases (n = 58), pathologists supported by AI showed significantly higher interobserver agreement (69.8% without AI v 87.4% with AI) and accuracy (81.9% without AI v 88.8% with AI).
CONCLUSION
This study demonstrated the utility of a fully automated AI solution for scoring HER2 IHC accurately according to ASCO/CAP 2018/2023 guidelines. Pathologists supported by AI showed improvements in HER2 IHC scoring consistency and accuracy, especially for distinguishing HER2 0 from 1+ cases. This AI solution could be used by pathologists as a decision support tool for enhancing the reproducibility and consistency of HER2 scoring, particularly for identifying HER2-low breast cancers.
INTRODUCTION
The standard-of-care evaluation of human epidermal growth factor receptor 2 (HER2) in breast cancer includes immunohistochemistry (IHC) to assess protein overexpression and in situ hybridization (ISH) to determine gene amplification. ASCO and College of American Pathologists (CAP) published guidelines for HER2 testing—first in 2007 and updated in 2013, 2018, and 2023—that enhanced standardization of HER2 testing in clinical practice.1 Since the available HER2-targeted therapy was beneficial only to patients with HER2-positive disease, the testing guidelines provided recommendations for clearly distinguishing a negative from a positive result. The categorization of the HER2 testing result until now has therefore been essentially binary.
CONTEXT
Key Objective
Can a fully automated artificial intelligence (AI)–based human epidermal growth factor receptor 2 (HER2) immunohistochemistry (IHC) scoring solution aid general surgical pathologists in achieving consistent and accurate HER2 scoring of breast cancer, compared with manual digital scores provided by expert breast pathologists?
Knowledge Generated
The HER2 AI solution could be applied irrespective of the laboratory performing HER2 IHC, the antibody, or the scanner used to generate whole-slide images of HER2 IHC slides. The AI solution demonstrated a standalone accuracy of 92.1% in comparison with the HER2 scores of breast experts. Use of the HER2 AI solution by surgical pathologists significantly improved interobserver agreement not only across all HER2 scores but particularly for the distinction of HER2 0 from HER2 1+ cases.
Relevance
The performance of the HER2 AI solution supports its consideration as a decision support tool for pathologists to improve HER2 scoring in routine clinical practice, especially for optimal identification of HER2-low breast cancers.
The results of the DESTINY-Breast 04 clinical trial reported by Modi et al2 showed the need to identify patients with low levels of HER2 protein expression and to distinguish 0 from 1+ scores. In that trial, patients with metastatic breast cancer that was HER2-negative, but with HER2 IHC results of 1+ or 2+ with negative ISH results, referred to as HER2-low breast cancer, showed significant improvement in survival after treatment with the antibody-drug conjugate fam-trastuzumab deruxtecan-nxki. The favorable results of the trial led to the drug's approval by the US Food and Drug Administration for the treatment of patients with HER2-low breast cancer. The drug's approval was also followed by the premarket approval of a HER2 IHC semiquantitative assay (Ventana PATHWAY anti-HER2/neu 4B5 rabbit monoclonal antibody on the BenchMark ULTRA instrument) for optimal identification of these patients.3 The most recent ASCO/CAP update of HER2 testing guidelines provides best practice recommendations for the distinction of HER2 0 from 1+, including evaluation of HER2 IHC at high-power magnification (×40) and seeking consensus review when needed. The subjectivity of manual interpretation of either light microscopic examination or digital whole-slide images (WSIs) of HER2 IHC and the challenges faced by the pathologists for recognition of breast cancers with low levels of HER2 protein overexpression are well recognized.4
The adoption of digital pathology has grown significantly in recent years, enabling the implementation of artificial intelligence (AI) tools to support triage of cases, primary diagnosis, and biomarker quantification.5-8 Specifically, there is currently great interest in exploring computational image analysis using deep learning–based algorithms for objective categorization of HER2 IHC results, particularly to facilitate the identification of HER2-low breast cancers. We sought to evaluate the performance of a fully automated AI-based solution and to assess its potential utility in improving concordance of HER2 scoring and identification of HER2-low breast cancers in an international multicenter reader study.
MATERIALS AND METHODS
Study Cohort
The study cohort included hematoxylin and eosin (H&E)–stained and HER2 IHC slides of 120 patients with breast cancer from four pathology laboratories in three geographic regions, that is, the United States (1), France (1), and Israel (2). The cohort included randomly selected retrospective cases from 2021 to 2022, and the required sample size for the study was calculated (Data Supplement, Methods). Each laboratory processed the slides on the basis of its institutional staining protocol using one of three different HER2 antibodies: 4B5 (Roche), HercepTest (Dako), and EP3 (Cell Marque). The HER2 IHC and corresponding H&E slides were scanned at 40× magnification on Philips UFS and Hamamatsu C13220 scanners and were fully anonymized. WSIs were obtained under ethics approval at each site by the local ethics committee or institutional review board with waiver of informed consent from the patients. The study cohort included invasive ductal carcinoma-not otherwise specified type (45%) and special types, including invasive lobular carcinoma (23%), tubular (6.7%), mucinous (6.7%), apocrine (6.7%), metaplastic (5%), adenoid cystic (3.3%), cribriform (2.5%), and secretory carcinomas (0.8%; Table 1). The distribution of HER2 scores in the study cases, as reported in standard-of-care practice in the laboratories, was HER2 0, 40 (33%); HER2 1+, 38 (32%); HER2 2+, 24 (20%); and HER2 3+, 18 (15%) slides.
TABLE 1.
Study Cohort for Evaluation of the Fully Automated Artificial Intelligence Solution for Aiding Human Epidermal Growth Factor Receptor 2 Immunohistochemical Scoring
| Category | No. of Slides | Slides, % |
|---|---|---|
| Cancer type/subtype | | |
| IC-NST (IDC) | 54 | 45 |
| Special subtypesᵃ | 32 | 26.7 |
| ILCᵇ | 27 | 22.5 |
| IDC + ILC | 1 | 0.8 |
| Metaplastic | 6 | 5 |

| Laboratory | Geography | Scanner | Antibody | No. of Slides | Slides, % |
|---|---|---|---|---|---|
| Laboratory 1 | US | Philips UFS | EP3 | 16 | 13 |
| Laboratory 2 | EU | Philips UFS | 4B5/HercepTest | 36 | 30 |
| Laboratory 3 | Israel | Philips UFS | 4B5 | 43 | 36 |
| Laboratory 4 | Israel | Hamamatsu C13220 | 4B5 | 25 | 21 |

Abbreviations: EU, European Union; IC-NST, invasive carcinoma-not otherwise specified type; IDC, infiltrating ductal carcinoma; ILC, infiltrating lobular carcinoma.
ᵃSpecial subtypes included eight mucinous, eight apocrine, eight tubular, four adenoid cystic, three cribriform, and one secretory cancers.
ᵇThe ILC also included four ILC of pleomorphic type.
Algorithm Development
The algorithm (Galen Breast HER2, Ibex Medical Analytics) was designed to support the interpretation and quantification of HER2 IHC on WSIs. The algorithm receives as input WSIs of HER2 IHC–stained tissue sections and runs six computational steps on each WSI (Fig 1). It first detects the tissue fragments and then identifies the on-slide control (if present). In the third step, an AI algorithm identifies areas of interest—namely, invasive tumor regions. Within those regions, another AI model is used to detect the individual tumor cells and classify their HER2 IHC staining pattern according to membrane staining intensity and completeness (not stained, moderate incomplete, intense complete, etc). Finally, the algorithm calculates the slide-level HER2 score (0, 1+, 2+, or 3+) according to the cell counts and the ASCO/CAP 2018/2023 guidelines1 and generates contours and cell overlays to visualize its results, which are displayed to the user in the Galen slide viewer. The two main AI models—for invasive cancer detection and for cell classification—are based on multilayered convolutional neural networks that were specifically developed for image classification and object detection tasks, respectively. Additional details on the AI algorithm development are included in the Data Supplement.
FIG 1.

Overview of the AI algorithm. The HER2 IHC whole-slide images are uploaded to the system, tissue is then detected using the tissue detection algorithm (step 1), and on-slide control (if present) is located and excluded (step 2). Then, the region classification AI model identifies the invasive cancer (step 3), and within it, tumor cells are detected and classified using the cell detection AI model (step 4). Finally, a slide-level HER2 score is calculated, and visualization of the previous steps is prepared and displayed to the user in the Galen slide viewer (steps 5 and 6). AI, artificial intelligence; AOI, area of interest; CAP, College of American Pathologists; HER2, human epidermal growth factor receptor 2; IHC, immunohistochemistry.
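The final scoring step described above maps cell-level staining categories to a slide-level HER2 score per the ASCO/CAP guidelines. The commercial algorithm's exact cell categories and decision logic are not published; the following is a minimal illustrative sketch of the published ASCO/CAP 2018 thresholds only, with hypothetical parameter names:

```python
def her2_score(pct_complete_intense, pct_complete_weak_moderate,
               pct_incomplete_faint):
    """Map tumor-cell staining percentages to a slide-level HER2 IHC score.

    Thresholds follow the ASCO/CAP 2018/2023 guideline (>10% cutoffs); this
    is a simplification for illustration, not the vendor's implementation.
    """
    # 3+: complete, intense membrane staining in >10% of tumor cells
    if pct_complete_intense > 10:
        return "3+"
    # 2+: weak/moderate complete staining in >10%, or intense complete in <=10%
    if pct_complete_weak_moderate > 10 or 0 < pct_complete_intense <= 10:
        return "2+"
    # 1+: faint/barely perceptible incomplete staining in >10%
    if pct_incomplete_faint > 10:
        return "1+"
    # 0: no staining, or faint incomplete staining in <=10%
    return "0"
```

A 2+ result would then be triaged to ISH, as in routine practice.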
Study Design
The prospective multireader study with crossover design included four general surgical pathologists as readers whose performance in the digital review of HER2 IHC on the WSIs, without and with the aid of AI, was compared with the ground truth (GT) provided by five expert breast pathologists (Data Supplement, Fig S1). The readers and GT experts reviewed HER2 IHC digitally and interpreted HER2 IHC scores according to the ASCO/CAP 2018/2023 guidelines. The study comprised two arms: in arm A (digital manual read), WSIs of HER2 IHC were reviewed and scored manually using a digital viewer; in arm B (AI-supported read), pathologists reviewed and scored the same WSIs of HER2 IHC using the AI solution.
HER2 IHC slides were assigned in random order to each of the study reading pathologists (a pool of board-certified pathologists participating in the study with a range of 5-20 years of experience) and to the GT breast experts.
GT was established by a team of five international expert breast pathologists from the United States (S.K. and S.J.S.) and Europe (E.C., R.C.-M., and A.V.-S.). The HER2 IHC scores reported by the experts constituted the GT for evaluating both the standalone performance of the AI algorithm and the utility of the AI solution as an ancillary aid to the four general pathologists. High-confidence GT was defined as the agreement of HER2 IHC scores by at least four of the five breast experts and was used for the analysis of the results of the study. Rates of agreement in HER2 scoring between each arm and GT were compared.
All the study slides were assessed with the two modalities (arms) by the same reader, with a washout period of 2 weeks between the two reads. The readers assessed all the study slides in both arms and were blinded to the results of the other arm and to each other. The GT expert pathologists were blinded to each other's results and to the readers' scores.
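The high-confidence GT rule described above (agreement of at least four of the five experts) can be sketched as a simple consensus function; the function name and return convention are illustrative, not from the study:

```python
from collections import Counter

def high_confidence_gt(expert_scores, min_agree=4):
    """Return the consensus HER2 score if at least `min_agree` experts
    gave the same score; otherwise return None (slide excluded from
    high-confidence analyses). Sketch of the study's GT rule."""
    score, count = Counter(expert_scores).most_common(1)[0]
    return score if count >= min_agree else None
```

Applied to the cohort, this rule yielded 92 of 120 slides with high-confidence GT.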
Statistical Analysis
The agreement between the GT breast experts for each HER2 IHC score was determined, including the 95% CI. Similarly, the agreement between the readers, that is, interobserver concordance, for each HER2 IHC score without and with the aid of AI was determined, including the 95% CI, for the entire cohort. In addition, interobserver concordance for HER2 IHC scores 0 and 1+ according to GT was also established. The mean agreement of the readers with the high-confidence GT for all scores and for distinction of HER2 0 from 1+ scores was evaluated. The standalone performance of the AI algorithm for each HER2 IHC score was determined by comparing the AI scores with the high-confidence GT scores. The study was not powered for HER2 0 versus 1+ or for calculating the accuracy per scanner or per antibody. Statistical analyses were performed using SAS v9.4 (SAS Institute, Cary, NC). Continuous variables were summarized using mean and standard deviation, and categorical variables by count and percentage. The required significance level of findings was P < .05.
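The interobserver concordance reported throughout the Results is an average of pairwise percentage agreement across readers. A minimal sketch of that metric (the study's CI computation in SAS is not reproduced here, and the function name is hypothetical):

```python
from itertools import combinations

def mean_pairwise_agreement(scores_by_reader):
    """Average percentage agreement over all reader pairs.

    `scores_by_reader` is a list of equal-length HER2-score lists, one per
    reader (e.g., ["0", "1+", "2+", ...]). Illustrative sketch only.
    """
    pair_rates = []
    for a, b in combinations(scores_by_reader, 2):
        matches = sum(x == y for x, y in zip(a, b))  # slides where the pair agrees
        pair_rates.append(100.0 * matches / len(a))
    return sum(pair_rates) / len(pair_rates)
```

With four readers this averages over six pairs; with the five GT experts, over ten.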
RESULTS
GT
The overall interobserver agreement among the five expert breast pathologists who reviewed and scored the HER2 IHC WSIs was 72.4% (intraclass correlation coefficient, 0.86; 95% CI, 0.82 to 0.89), with complete agreement among all five for 53 of the 120 slides (44.2%). Four of the five GT breast pathologists agreed for another 39 (32.5%) slides (Fig 2A). Because of the high variability in expert agreement on HER2 scores, using a simple majority might have led to unreliable GT (eg, for cases with three v two agreement), and thus, we decided to use high-confidence GT for most of the analyses. For HER2 scores 0, 1+, 2+, and 3+, experts' agreement (average of all GT pair agreement rates for each HER2 score, N = 120) was 80.1%, 65.9%, 69.2%, and 96.4%, respectively. The distribution of the HER2 IHC scores of the 92 (76.7%) slides for which at least four of the five expert breast pathologists agreed, defined as high-confidence GT, included 27 (29.3%) slides scored as HER2 0, 31 (33.7%) scored as 1+, 16 (17.4%) scored as 2+, and 18 (19.6%) scored as 3+ (Fig 2B). The HER2 IHC scores with high-confidence GT were used to determine the performance of the AI algorithm and that of the readers without and with the aid of the AI tool. For the other 27 (22.5%) slides, only three of the five breast pathologists agreed, and these slides were included only in part of the analyses. For these 27 slides with a weak majority agreement on HER2 score (low-confidence GT), the discrepancies were between 0 versus 1+ in 12 (44.4%) slides, 1+ versus 2+ in 12 (44.4%) slides, and 2+ versus 3+ in one (3.7%) slide (Fig 2C). One case had no majority agreement on GT and was excluded. The percentage of HER2 0, 1+, 2+, and 3+ cases scored by each of the five experts is shown in Figure 3A.
FIG 2.

GT analysis. (A) Slide distribution per HER2 score as determined by the majority agreement of five breast pathology subspecialists performing GT (N = 120). (B) HER2 scores for slides with high-confidence GT (n = 92); (C) HER2 scores for slides with low-confidence GT (majority of 3 of 5; n = 27)—mainly 0/1+ (ie, slides that were given a score of 0 by three breast experts and 1+ by two breast experts or vice versa) and 1+/2+ slides. GT, ground truth; HER2, human epidermal growth factor receptor 2.
FIG 3.

Percentages of HER2 scores by expert GT and reader pathologists. Percentages per HER2 score for the entire study cohort (N = 120 slides) for (A) GT expert breast pathologists, (B) reader pathologists without the AI algorithm (n = 119, one slide with no GT was excluded) average pair agreement, and (C) reader pathologists with AI (n = 119). AI, artificial intelligence; GT, ground truth; HER2, human epidermal growth factor receptor 2.
Readers' Performance
The interobserver agreement of the four general surgical pathologists was variable for each HER2 IHC score and changed between the two arms. In arm A (digital manual read), the interobserver agreement was 75.3%, 61.6%, 63.0%, and 94.4% for HER2 0, 1+, 2+, and 3+, respectively, whereas in arm B (AI-supported read), the interobserver agreement was 92.5%, 72.5%, 53.6%, and 97.2%, respectively. Thus, the interobserver agreement increased from arm A to arm B for HER2 0, 1+, and 3+ scores, whereas it decreased for HER2 2+. The highest and statistically significant improvement was noted for HER2 IHC score 0 and for HER2 1+ slides. The percentage of HER2 0, 1+, 2+, and 3+ cases scored by each of the four readers without and with the ancillary HER2 AI tool is shown in Figures 3B and 3C and the Data Supplement (Table S1). The smallest discordances in the percentage of cases were observed in the HER2 3+ category without and with AI (6.7% and 0.8%), followed by HER2 0 (9.2% and 6.7%). For HER2 1+ and 2+ cases, the percentage discordances were high (28.6% and 26.1% without AI; 23.5% and 24.4% with AI).
The overall interobserver agreement of the four readers for the 119 study cases was 69.7% in arm A and 77.2% with the help of the HER2 AI tool in arm B (Fig 4A). The interobserver agreement of the four reader pathologists for the slides with high-confidence GT (n = 92) was 75% in arm A and 83.7% in arm B (P < .05; Fig 4B). The pathologists' interobserver agreement significantly improved for the distinction of HER2 0 from 1+ cases with a high-confidence GT (n = 58) from 69.8% in arm A to 87.4% with the help of the AI in arm B (Fig 4C).
FIG 4.

Reader pathologists' performance for all HER2 scores. (A) Interobserver agreement for pathologists, with and without the AI algorithm, for all slides with GT (n = 119). (B) Interobserver agreement for pathologists with and without AI for high-confidence GT slides (n = 92). (C) Interobserver agreement for pathologists with and without AI for HER2 0 and 1+ slides with high-confidence GT (n = 58). (D) Reader pathologists' accuracy (ie, agreement with GT) with and without AI for slides with high-confidence GT (n = 92). (E) Reader pathologists' accuracy with and without AI for HER2 0 and 1+ slides with high-confidence GT (n = 58); Average percentage agreement and 95% CI are presented as bar with error graphs. AI, artificial intelligence; GT, ground truth; HER2, human epidermal growth factor receptor 2; STD, standard deviation.
The mean overall accuracy of the four readers compared with the high-confidence GT HER2 IHC scores (n = 92 slides) was 85.3% in arm A (digital manual read) and 88% in arm B (with AI ancillary aid; Fig 4D). The readers' accuracy for the distinction of HER2 0 from 1+ cases (n = 58 slides with high-confidence GT) increased from 81.9% in arm A to 88.8% with the support of the HER2 AI tool in arm B, but the difference did not achieve statistical significance (Fig 4E).
Two examples in which the interobserver concordance improved with the aid of the AI solution are illustrated in Figure 5.
FIG 5.

Examples of the AI algorithm's effect on pathologists' review. Shown are slides with borderline HER2 IHC scores of 0/1+ (A) without AI and (B) with AI (total percentage of stained cells is 3.6% per the AI), both at 18× magnification (0.56 μm/pixel) with a higher-magnification 40× (0.25 μm/pixel) inset so that the cell overlays provided by the AI solution can be visualized, and slides scored HER2 1+/2+ (C) without AI and (D) with AI (percentage of faintly stained cells is 23.2% and moderate complete and incomplete 2% per the AI) at 18× (0.56 μm/pixel) with a higher-magnification 40× (0.25 μm/pixel) inset; (B and D) red contour line marks the invasive tumor area detected by the AI, with (D) DCIS correctly excluded by the AI from the area of interest; (B) HER2 score and detected % for different cell staining patterns are provided in the AI slide report, with cell overlay (small colored circles) indicating different HER2 cell staining patterns detected by the AI (eg, empty blue—not stained, green—faintly stained, etc). AI, artificial intelligence; DCIS, ductal carcinoma in situ; HER2, human epidermal growth factor receptor 2; IHC, immunohistochemistry.
HER2 AI Standalone Performance
The standalone performance of the HER2 AI algorithm was determined for each HER2 IHC score comparing the automatic AI output scores with the high-confidence GT scores (n = 92). The accuracy of the HER2 AI tool was 100% for the HER2 3+ slides, followed by 92.6% for HER2 0 slides and 90.3% for HER2 1+. The lowest agreement was 87.5% for HER2 2+ slides. Overall, the accuracy of the HER2 AI tool was 92.1% (Data Supplement, Fig S2).
DISCUSSION
Here, we report the performance of a fully automated AI solution for HER2 IHC analysis that helped improve interobserver concordance and accuracy of HER2 scoring among general surgical pathologists, measured by agreement with expert breast pathologists. The AI solution was applicable across different laboratories, antibody clones, and scanners. Importantly, the solution significantly increased the interobserver agreement and accuracy of HER2 0 and 1+ scores, thereby demonstrating its potential role in improved identification of HER2-low breast cancers.
The fully automated AI solution for HER2 IHC scoring reached a standalone accuracy of 92.1% compared with high-confidence GT scores. The AI solution was able to recognize invasive tumors with high precision, detect the HER2 expression pattern in individual invasive tumor cells, and provide HER2 IHC scores on the basis of the ASCO/CAP 2018/2023 guidelines. The AI tool helped the four general surgical pathologists to achieve significantly improved interobserver agreement and some improvements in agreement with high-confidence GT scores established by experts. The latter is of high importance because HER2 IHC interpretation can be very subjective, leading to relatively high intra- and interpathologist variability, even among experts.9-12 Thus, to reach more reliable GT and results in the current study, we used five GT experts who reviewed all the cases and defined high-confidence GT as a majority agreement of four of the five experts. Similarly, each case was reviewed by four reader pathologists, making the performance statistics in each arm more robust (greater power despite the modest cohort size).
The AI tool provided ancillary support that resulted in significantly improved interobserver agreement for 0 and 1+ HER2 scores, which, in previous studies, showed poor-to-moderate interobserver agreement.9-12 A distinct advantage of the study, and evidence of the AI solution's robustness, was its diversity: slides were stained in different laboratories with different HER2 antibodies, and two different digital scanners were used to generate the WSIs of HER2 IHC.
Image analysis–based decision support tools for HER2 IHC scoring have been available for routine pathology practice in the past decade.13-19 In addition, the CAP has published guidelines for safe incorporation of digital image analysis–based HER2 quantification.20,21 Most of these tools were trained and validated to aid pathologists for accurate identification of HER2-positive (3+) breast cancers and for optimal identification of cases with 2+ scores to facilitate triaging for HER2 ISH. The utility of these digital aids to facilitate identification of HER2-low breast cancers is very limited as most of these tools were not developed to differentiate between HER2 0 and 1+ scores. In a recent study by Sode et al,22 the concordance between digital image reading and algorithm-assisted reading for identification of HER2-low cancers was only moderate.
In recent years, AI solutions were developed to aid pathologists in the optimal categorization of HER2 0 and 1+ scores. Some of these AI algorithms were developed to provide real-time decision support to pathologists as an augmented reality module attached to a light microscope.23,24 These aids were designed to assist in HER2 scoring of regions of interest selected by a pathologist on light microscopic images and not for use on WSIs. Most of the reported AI solutions for HER2 scoring were developed to aid pathologists in the interpretation of digital WSIs of HER2 IHC, similar to our study.25-29 Of note, unlike our study, previous AI solutions for HER2 scoring were not fully automated, requiring human intervention for selecting the region of interest. For example, the studies reported by Palm et al and Frey et al (in abstract form) required manual annotation of regions of interest in the invasive tumor for application of the AI algorithm.27,28 While the report of an AI-powered HER2 analyzer by Jung et al26 (in abstract form) in their study of 209 HER2 WSIs suggests using the entire tumor, it is not clear whether the AI tool recognized the invasive tumor for evaluation or required initial annotation. Similar to the AI solution validated in our study, Frey et al28 reported that their AI-based HER2 IHC quantifier software could be used across four institutions and five scanners, using different HER2 antibody clones at each site. No other reported AI solution for HER2 scoring has demonstrated this advantage.
Establishment of the standalone performance of the AI solution is important to estimate its potential to aid pathologists and to generate trust by pathologists. Standalone performance was reported only in one of the previous reports of AI solutions for HER2 scoring28 and in a previous abstract including an earlier version of the current AI solution.29 In the current study, the performance of the AI tool for HER2 0, 1+, 2+, and 3+ scores was 92.6%, 90.3%, 87.5%, and 100%, respectively, when compared with high-confidence GT HER2 scores.
The two-arm design is another key aspect of the current study that allowed us to directly measure the impact of AI assistance on HER2 scoring by general surgical pathologists. We observed improvement in both interobserver agreement and accuracy after the use of the AI tool. Specifically, for cases whose GT was HER2 0 and 1+, that is, cases around the HER2-low boundary, the readers' accuracy increased from 81.9% to 88.8%.
The current study has several limitations. While the number of cases reviewed by both expert GT and reader pathologists is significant (600 and 960 reads, respectively), which allowed us to reach reliable GT scores and to calculate statistically significant improvements in HER2 scoring by readers, the actual number of patients in the cohort is relatively small. Larger international multicenter validation studies are underway to further substantiate the results of our study and to gain new insights into the performance of the AI tool and its utility across different laboratories and scanners and for various cancer subtypes and patient subpopulations. The utility of the fully automated HER2 AI solution to improve interobserver concordance of HER2 scoring of breast experts was not evaluated in the current study. However, this is a subject of evaluation in our ongoing investigations and the results will be presented in subsequent reports. Future studies may also validate specific steps of the solution separately, for example, how well the cell detection model identifies each staining pattern. Following the results of the current study, an improved version of the AI tool is under development to address specific shortcomings of the algorithm, such as its performance in distinguishing between HER2 2+ and 1+ cases.
This study demonstrated a significant potential of AI tools to improve HER2 scoring, which is essential for determining appropriate systemic therapy for patients with breast cancer. Specifically, in the HER2-low era and in the absence of other ancillary tests for differentiating HER2-low and ultra-low cases, AI solutions could be used as decision support tools for pathologists in standard-of-care pathology practice, enhancing the reproducibility and consistency of HER2 scoring, thus enabling optimal treatment pathways and better patient outcomes.
ACKNOWLEDGMENT
We thank Bryan Tutt, MA, ELS (D), Scientific Editor, Research Medical Library, The University of Texas MD Anderson Cancer Center for editorial assistance.
PRIOR PRESENTATION
Presented in part at the 35th European Congress of Pathology, Dublin, Ireland, September 9-13, 2023.
SUPPORT
Supported by Ibex Medical Analytics.
DATA SHARING STATEMENT
A data sharing statement provided by the authors is available with this article at DOI https://doi.org/10.1200/PO.24.00353. The patient data collected during this study for AI solution validation were obtained under ethics committees' approval and were provided to the researchers through a restricted-access agreement that prevented sharing the data with a third party or publicly. Future access to the data set can be requested through direct application to the corresponding author for data access. Aggregate data are available within the manuscript and its Data Supplement.
AUTHOR CONTRIBUTIONS
Conception and design: Savitri Krishnamurthy, Stuart J. Schnitt, Marina Maklakovski, Yuval Globerson, Maya Grinwald, Chaim Linhart, Judith Sandbank, Manuela Vecsler
Administrative support: Savitri Krishnamurthy, Maya Grinwald, Manuela Vecsler
Provision of study materials or patients: Savitri Krishnamurthy, Anne Vincent-Salomon, Manuela Vecsler
Collection and assembly of data: Savitri Krishnamurthy, Anne Vincent-Salomon, Eugenia Colon, Kanchan Kantekure, Marina Maklakovski, Wilfrid Finck, Yuval Globerson, Giuseppe Mallel, Maya Grinwald, Manuela Vecsler
Data analysis and interpretation: Savitri Krishnamurthy, Stuart J. Schnitt, Anne Vincent-Salomon, Rita Canas-Marques, Eugenia Colon, Kanchan Kantekure, Marina Maklakovski, Jeanne Thomassin, Lilach Bien, Maya Grinwald, Chaim Linhart, Judith Sandbank, Manuela Vecsler
Manuscript writing: All authors
Final approval of manuscript: All authors
Accountable for all aspects of the work: All authors
AUTHORS' DISCLOSURES OF POTENTIAL CONFLICTS OF INTEREST
The following represents disclosure information provided by authors of this manuscript. All relationships are considered compensated unless otherwise noted. Relationships are self-held unless noted. I = Immediate Family Member, Inst = My Institution. Relationships may not relate to the subject matter of this manuscript. For more information about ASCO's conflict of interest policy, please refer to www.asco.org/rwc or ascopubs.org/po/author-center.
Open Payments is a public database containing information reported by companies about payments made to US-licensed physicians (Open Payments).
Savitri Krishnamurthy
Honoraria: AstraZeneca
Consulting or Advisory Role: AstraZeneca
Research Funding: PathomIQ, Caliber ID, Ibex Medical Analytics
Patents, Royalties, Other Intellectual Property: Publication of text book from Elsevier
Stuart J. Schnitt
Consulting or Advisory Role: PathAI, Ibex Medical Analytics
Anne Vincent-Salomon
Stock and Other Ownership Interests: Primaa, Ibex Medical Analytics
Honoraria: Roche, AstraZeneca, MSD Oncology, Daiichi Sankyo/Astra Zeneca, Leica Biosystems, Ibex Medical Analytics
Consulting or Advisory Role: Ibex Medical Analytics, Primaa
Research Funding: NanoString Technologies (Inst), AstraZeneca (Inst), Ibex Medical Analytics (Inst), MSD Avenir (Inst)
Travel, Accommodations, Expenses: NanoString Technologies, Roche/Genentech
Other Relationship: Ibex Medical Analytics
Kanchan Kantekure
Employment: Bristol Myers Squibb
Yuval Globerson
Employment: Ibex Medical Analytics
Stock and Other Ownership Interests: Ibex Medical Analytics
Lilach Bien
Employment: Ibex Medical Analytics
Giuseppe Mallel
Employment: Ibex Medical Analytics
Maya Grinwald
Employment: Ibex Medical Analytics
Research Funding: Ibex Medical Analytics
Travel, Accommodations, Expenses: Ibex Medical Analytics
Chaim Linhart
Employment: Ibex Medical Analytics
Leadership: Ibex Medical Analytics
Stock and Other Ownership Interests: Ibex Medical Analytics
Travel, Accommodations, Expenses: Ibex Medical Analytics
Judith Sandbank
Employment: Ibex Medical Analytics
Leadership: Ibex Medical Analytics
Stock and Other Ownership Interests: Ibex Medical Analytics
Honoraria: Ibex Medical Analytics
Consulting or Advisory Role: Ibex Medical Analytics
Speakers' Bureau: Ibex Medical Analytics
Research Funding: Ibex Medical Analytics
Patents, Royalties, Other Intellectual Property: Ibex Medical Analytics
Expert Testimony: Ibex Medical Analytics
Travel, Accommodations, Expenses: Ibex Medical Analytics
Other Relationship: Ibex Medical Analytics
Uncompensated Relationships: Ibex Medical Analytics
Manuela Vecsler
Employment: Ibex Medical Analytics
Stock and Other Ownership Interests: Ibex Medical Analytics
No other potential conflicts of interest were reported.
REFERENCES
- 1. Wolff AC, Somerfield MR, Dowsett M, et al. Human epidermal growth factor receptor 2 testing in breast cancer: ASCO-College of American Pathologists guideline update. J Clin Oncol. 2023;41:3867–3872. doi: 10.1200/JCO.22.02864.
- 2. Modi S, Jacot W, Yamashita T, et al. Trastuzumab deruxtecan in previously treated HER2-low advanced breast cancer. N Engl J Med. 2022;387:9–20. doi: 10.1056/NEJMoa2203690.
- 3. US Food and Drug Administration. Medical Device Database of premarket approvals. https://www.accessdata.fda.gov/scripts/cdrh/cfdocs/cfpma/pma.cfm?id=p990081s047
- 4. Brevet M, Li Z, Parwani A. Computational pathology in the identification of HER2-low breast cancer: Opportunities and challenges. J Pathol Inform. 2024;15:100343. doi: 10.1016/j.jpi.2023.100343.
- 5. Pantanowitz L, Quiroga-Garza GM, Bien L, et al. An artificial intelligence algorithm for prostate cancer diagnosis in whole slide images of core needle biopsies: A blinded clinical validation and deployment study. Lancet Digit Health. 2020;2:e407–e416. doi: 10.1016/S2589-7500(20)30159-X.
- 6. da Silva LM, Pereira EM, Salles PG, et al. Independent real-world application of a clinical-grade automated prostate cancer detection system. J Pathol. 2021;254:147–158. doi: 10.1002/path.5662.
- 7. Sandbank J, Bataillon G, Nudelman A, et al. Validation and real-world clinical application of an artificial intelligence algorithm for breast cancer detection in biopsies. NPJ Breast Cancer. 2022;8:129. doi: 10.1038/s41523-022-00496-w.
- 8. Eloy C, Marques A, Pinto J, et al. Artificial intelligence-assisted cancer diagnosis improves the efficiency of pathologists in prostatic biopsies. Virchows Arch. 2023;482:595–604. doi: 10.1007/s00428-023-03518-5.
- 9. Lambein K, Van Bockstal M, Vandemaele L, et al. Distinguishing score 0 from score 1+ in HER2 immunohistochemistry-negative breast cancer: Clinical and pathobiological relevance. Am J Clin Pathol. 2013;140:561–566. doi: 10.1309/AJCP4A7KTAYHZSOE.
- 10. Fernandez AI, Liu M, Bellizzi A, et al. Examination of low ERBB2 protein expression in breast cancer tissue. JAMA Oncol. 2022;8:1–4. doi: 10.1001/jamaoncol.2021.7239.
- 11. Schettini F, Chic N, Brasó-Maristany F, et al. Clinical, pathological, and PAM50 gene expression features of HER2-low breast cancer. NPJ Breast Cancer. 2021;7:1. doi: 10.1038/s41523-020-00208-2.
- 12. Karakas C, Tyburski H, Turner BM, et al. Interobserver and interantibody reproducibility of HER2 immunohistochemical scoring in an enriched HER2-low–expressing breast cancer cohort. Am J Clin Pathol. 2023;159:484–491. doi: 10.1093/ajcp/aqac184.
- 13. Dobson L, Conway C, Hanley A, et al. Image analysis as an adjunct to manual HER-2 immunohistochemical review: A diagnostic tool to standardize interpretation. Histopathology. 2010;57:27–38. doi: 10.1111/j.1365-2559.2010.03577.x.
- 14. Helin HO, Tuominen VJ, Ylinen O, et al. Free digital image analysis software helps to resolve equivocal scores in HER2 immunohistochemistry. Virchows Arch. 2016;468:191–198. doi: 10.1007/s00428-015-1868-7.
- 15. Brügmann A, Eld M, Lelkaitis G, et al. Digital image analysis of membrane connectivity is a robust measure of HER2 immunostains. Breast Cancer Res Treat. 2012;132:41–49. doi: 10.1007/s10549-011-1514-2.
- 16. Holten-Rossing H, Møller Talman ML, Kristensson M, et al. Optimizing HER2 assessment in breast cancer: Application of automated image analysis. Breast Cancer Res Treat. 2015;152:367–375. doi: 10.1007/s10549-015-3475-3.
- 17. Yardley DA, Kaufman PA, Huang W, et al. Quantitative measurement of HER2 expression in breast cancers: Comparison with 'real-world' routine HER2 testing in a multicenter Collaborative Biomarker Study and correlation with overall survival. Breast Cancer Res. 2015;17:41. doi: 10.1186/s13058-015-0543-x.
- 18. Yousif M, Huang Y, Sciallis A, et al. Quantitative image analysis as an adjunct to manual scoring of ER, PgR, and HER2 in invasive breast carcinoma. Am J Clin Pathol. 2022;157:899–907. doi: 10.1093/ajcp/aqab206.
- 19. Koopman T, Buikema HJ, Hollema H, et al. What is the added value of digital image analysis of HER2 immunohistochemistry in breast cancer in clinical practice? A study with multiple platforms. Histopathology. 2019;74:917–924. doi: 10.1111/his.13812.
- 20. Bui MM, Riben MW, Allison KH, et al. Quantitative image analysis of human epidermal growth factor receptor 2 immunohistochemistry for breast cancer: Guideline from the College of American Pathologists. Arch Pathol Lab Med. 2019;143:1180–1195. doi: 10.5858/arpa.2018-0378-CP.
- 21. Lara H, Li Z, Abels E, et al. Quantitative image analysis for tissue biomarker use: A white paper from the Digital Pathology Association. Appl Immunohistochem Mol Morphol. 2021;29:479–493. doi: 10.1097/PAI.0000000000000930.
- 22. Sode M, Thagaard J, Eriksen JO, et al. Digital image analysis and assisted reading of the HER2 score display reduced concordance: Pitfalls in the categorisation of HER2-low breast cancer. Histopathology. 2023;82:912–924. doi: 10.1111/his.14877.
- 23. Yue M, Zhang J, Wang X, et al. Can AI-assisted microscope facilitate breast HER2 interpretation? A multi-institutional ring study. Virchows Arch. 2021;479:443–449. doi: 10.1007/s00428-021-03154-x.
- 24. Wu S, Yue M, Zhang J, et al. The role of artificial intelligence in accurate interpretation of HER2 immunohistochemical scores 0 and 1+ in breast cancer. Mod Pathol. 2023;36:100054. doi: 10.1016/j.modpat.2022.100054.
- 25. Khameneh FD, Razavi S, Kamasak M. Automated segmentation of cell membranes to evaluate HER2 status in whole slide images using a modified deep learning network. Comput Biol Med. 2019;110:164–174. doi: 10.1016/j.compbiomed.2019.05.020.
- 26. Jung M, Song SG, Cho SI, et al. Artificial intelligence-powered human epidermal growth factor receptor 2 (HER2) analyzer in breast cancer as an assistance tool for pathologists to reduce interobserver variation. J Clin Oncol. 2022;40(suppl 16):abstr e12543.
- 27. Palm C, Connolly CE, Masser R, et al. Determining HER2 status by artificial intelligence: An investigation of primary, metastatic, and HER2 low breast tumors. Diagnostics (Basel). 2023;13:168. doi: 10.3390/diagnostics13010168.
- 28. Frey P, Mamilos A, Minin E, et al. AI-based HER2-low IHC scoring in breast cancer across multiple sites, clones, and scanners. J Clin Oncol. 2023;41(suppl 16):abstr 516.
- 29. Globerson Y, Bien L, Harel J, et al. Abstract P6-04-05: A fully automatic artificial intelligence system for accurate and reproducible HER2 IHC scoring in breast cancer. Cancer Res. 2023;83(suppl 5):abstr P6-04-05.
Associated Data
Data Availability Statement
A data sharing statement provided by the authors is available with this article at DOI https://doi.org/10.1200/PO.24.00353. The patient data collected during this study for validation of the AI solution were obtained under ethics committee approval and were provided to the researchers under a restricted-access agreement that prohibits sharing the data with third parties or making them publicly available. Future access to the data set can be requested by direct application to the corresponding author. Aggregate data are available within the manuscript and its Data Supplement.
