Abstract
Objectives
To develop and validate machine learning models for distinguishing human epidermal growth factor receptor 2 (HER2)-zero from HER2-low breast cancer using MRI features obtained before neoadjuvant therapy (NAT).
Methods
Five hundred and sixteen breast cancer patients who underwent surgery after NAT were randomly divided into training (n = 362) and internal validation (n = 154) sets for model building and evaluation. MRI features (tumour diameter, enhancement type, background parenchymal enhancement, enhancement pattern, percentage of enhancement, signal enhancement ratio, breast oedema, and apparent diffusion coefficient) were reviewed. Logistic regression (LR), support vector machine (SVM), k-nearest neighbour (KNN), and extreme gradient boosting (XGBoost) models used these MRI characteristics to assess HER2 status in the training and validation datasets. The best-performing model generated a HER2 score, which was subsequently correlated with pathological complete response (pCR) and disease-free survival (DFS).
Results
The XGBoost model outperformed LR, SVM, and KNN, achieving an area under the receiver operating characteristic curve (AUC) of 0.783 (95% CI, 0.733-0.833) in the training dataset and 0.787 (95% CI, 0.709-0.865) in the validation dataset. Its HER2 score for predicting pCR had an AUC of 0.708 in the training dataset and 0.695 in the validation dataset. Additionally, a low HER2 score was significantly associated with shorter DFS in the validation dataset (hazard ratio: 2.748, 95% CI, 1.016-7.432, P = .037).
Conclusions
The XGBoost model could help distinguish HER2-zero and HER2-low breast cancers and has the potential to predict pCR and prognosis in breast cancer patients undergoing NAT.
Advances in knowledge
HER2-low–expressing breast cancer can benefit from HER2-targeted therapy. Prediction of HER2-low expression is crucial for appropriate management. MRI features offer a solution to this clinical issue.
Keywords: breast cancer, HER2-low, HER2-zero, MRI, machine learning, neoadjuvant therapy
Introduction
Pathological complete response (pCR) after neoadjuvant therapy (NAT) can be used as a surrogate endpoint of survival in human epidermal growth factor receptor 2 (HER2)–positive breast cancer.1 Targeted therapies specifically inhibiting the HER2 pathway improved pCR rates and survival outcomes in patients with HER2-positive disease (immunohistochemistry [IHC] score 3+ or IHC2+/in situ hybridization [ISH] positive) but were not applied in traditionally HER2-negative patients (IHC 0, 1+, or 2+/ISH negative).2–5 Recent clinical trials have shown that novel anti-HER2 antibody-drug conjugates (ADCs) can benefit subgroups of breast cancers with low HER2 expression (IHC1+ or IHC2+/ISH negative), which are usually classified as HER2-negative.6–8 Other clinical investigations have indicated that HER2-low breast cancers present with distinctive clinical behaviour, divergent rates of pCR, and varied prognostic outcomes when compared with HER2-zero breast cancers.9,10
The expression of HER2 leads to abnormal activation of downstream signalling pathways, resulting in uncontrolled and unrestricted cell growth.11 It is plausible to hypothesize that increased cell proliferation may manifest as more diverse spatial relationships and voxel intensity characteristics in imaging phenotypes. MRI features have been used to assess the intratumoural heterogeneity of breast cancer, enabling correlation with tumour subtypes, NAT response, and prognosis.12–14 While several studies have demonstrated that MRI radiomic signatures have the potential to distinguish HER2 expression,15–17 these radiomics studies usually require complicated image processing and analysis, posing challenges for clinical use. Simple MRI descriptors have been validated for differentiating HER2-positive and HER2-low breast cancers.18 However, these relatively accessible MRI features have not been sufficiently analysed for their potential to distinguish HER2-zero from HER2-low breast cancers.
Machine learning (ML) has found increased applications in biomedicine, such as biological analysis, image processing, and classification.19 ML methods have proven particularly effective in key feature detection and model construction.19 Therefore, this study aimed to develop and validate ML models that utilize routinely collected pretreatment MRI characteristics to assess the HER2 status in breast cancers before NAT.
Methods
Patient selection and data collection
This study was reviewed and approved by the local institutional review board, which waived the requirement for written informed consent owing to the retrospective nature of the analysis. Medical records of breast cancer patients treated from 1 April 2015 to 30 June 2021 were retrieved from clinical databases for analysis. Patients with pathologically proven invasive breast cancer who underwent surgery after NAT were included in the study. Exclusion criteria were HER2-positive cancer or unknown HER2 status, unavailable MRI before NAT, poor MRI quality, recurrent breast cancer and/or multiple primary neoplasms, distant metastasis at the initial diagnosis, and bilateral breast cancer. Finally, 516 patients were included for further analysis. The flowchart of the study is shown in Figure 1.
Figure 1.
Flowchart of the study population. Abbreviations: HER2 = human epidermal growth factor receptor 2; NAT=neoadjuvant therapy.
Clinical and pathological characteristics were collected for analysis, including age, menopausal status, histologic type, histologic grade, HER2 status, Ki-67 index, hormone receptor (HR) status, NAT regimen, breast surgery, and adjuvant therapy. The tumour type and receptor status were confirmed through IHC and ISH analysis of ultrasound-guided core biopsies. HER2 status was evaluated following the 2018 ASCO/CAP guidelines.3 Tumours with an IHC score of 0 were classified as HER2-zero. Tumours with an IHC score of 1+ or 2+/ISH negative were classified as HER2-low.4,20 Ki-67 expression was evaluated as either low or high, with <30% considered low expression and ≥30% considered high expression.21 Hormone receptor status was defined as positive if ≥1% of tumour cells exhibited nuclear staining for oestrogen receptor or progesterone receptor.22
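The categorical definitions above can be written as simple rules. The following is a minimal Python sketch, not the authors' code; the function names and the string encoding of IHC scores are assumptions made for illustration. The HER2-positive branch follows the definition given in the Introduction (IHC 3+ or IHC 2+/ISH positive).

```python
from typing import Optional


def her2_category(ihc: str, ish_positive: Optional[bool] = None) -> str:
    """Map an IHC score (plus the ISH result for IHC 2+) to a HER2 category."""
    if ihc == "0":
        return "HER2-zero"
    if ihc == "1+":
        return "HER2-low"
    if ihc == "2+":
        # IHC 2+ is equivocal and needs ISH to be resolved.
        if ish_positive is None:
            raise ValueError("IHC 2+ requires an ISH result")
        return "HER2-positive" if ish_positive else "HER2-low"
    if ihc == "3+":
        return "HER2-positive"
    raise ValueError(f"unknown IHC score: {ihc!r}")


def ki67_level(percent: float) -> str:
    """Ki-67 is 'high' at >=30% and 'low' below that cut-off."""
    return "high" if percent >= 30 else "low"


def hr_positive(er_percent: float, pr_percent: float) -> bool:
    """HR-positive if >=1% nuclear staining for either ER or PR."""
    return er_percent >= 1 or pr_percent >= 1


# Hypothetical patient: IHC 2+, ISH negative, Ki-67 35%, ER 0%, PR 5%.
example = (her2_category("2+", ish_positive=False), ki67_level(35), hr_positive(0, 5))
```

Under these rules the hypothetical patient is HER2-low, Ki-67 high, and HR-positive, so she would be eligible for this study, whereas an IHC 3+ tumour would be excluded as HER2-positive.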
Pathological complete response was defined as a complete absence of invasive tumour cells both in the breast and axillary lymph nodes (ypT0/N0 or ypTis/N0),1 based on the pathological examination of the surgical specimen. Disease-free survival (DFS) was defined as the time in months between the date of surgery and the occurrence of events including invasive local, regional, and distant recurrences; contralateral breast cancer; second nonbreast primary cancer; death from any cause; and in situ cancer events.23 The cut-off date for follow-up was 30 June 2023.
MRI acquisition and analysis
MRI was conducted using a 1.5-T GE Optima MR360 MRI scanner (GE Healthcare) and a 1.5-T Philips Achieva MRI scanner (Philips). The detailed parameters of breast MRI acquisition are shown in Table S1. Two board-certified radiologists with 7 years of experience in interpreting breast MRI studies, blinded to the histopathologic results, reviewed the MRI independently. To resolve discordant image interpretations, a senior radiologist with 18 years of expertise in breast imaging was consulted to facilitate a consensus through discussion. Breast MRI examinations were evaluated according to the 2013 Breast Imaging-Reporting and Data System (BI-RADS) MRI lexicon.24 The morphologic and kinetic features of findings were described using the BI-RADS lexicon. MRI features of the breast tumour were assessed in terms of tumour diameter, lesion enhancement type (mass, nonmass enhancement), background parenchymal enhancement, enhancement pattern (homogeneous, heterogeneous, and rim enhancement), percentage of enhancement (PE),25 signal enhancement ratio (SER),25 breast oedema,26 and apparent diffusion coefficient (ADC) value.
Tumour diameter was defined as the maximum diameter at the largest cross section of a transverse contrast-enhanced T1-weighted MRI scan. On dynamic contrast-enhanced (DCE) imaging, lesions were categorized as exhibiting either mass or nonmass enhancement, and further distinguished based on their homogeneity or heterogeneity, as well as the presence or absence of rim enhancement. These classifications were made in accordance with the 2013 BI-RADS lexicon. Background parenchymal enhancement was classified as minimal or mild versus moderate or marked. For analysis of lesion enhancement kinetics, a 2D region of interest (ROI) was manually drawn around the entire lesion on the largest cross-sectional slice, avoiding necrotic and cystic areas as much as possible. The mean signal intensity was computed across all voxels within the ROI. Tumour contrast enhancement shown on the MRI was evaluated quantitatively by 2 parameters as previously described25: (1) PE and (2) SER. PE reflects the degree of signal enhancement within the tumour at 120 s after contrast delivery and was calculated as PE = (S1 − S0)/S0, where S0 is the baseline signal intensity before contrast administration and S1 is the signal intensity measured at 120 s after contrast delivery. SER is a unitless index that reflects the rate of contrast washout in the tumour between 120 and 300 s after contrast delivery and was calculated as SER = (S1 − S0)/(S2 − S0), where S2 is the signal intensity at 300 s after contrast delivery. Breast oedema was divided into 4 grades based on T2WI: no oedema, peritumoural oedema, prepectoral oedema, and subcutaneous oedema.26,27 The mean and minimum ADC values (ADCmean and ADCmin) of the tumour area were measured, corresponding to the lesion identified on DCE imaging.
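The two kinetic parameters follow directly from the mean ROI signal at the three time points. Below is a minimal Python sketch of the PE and SER formulas; the function names and the signal intensities are hypothetical, chosen only so that the results land near the cohort medians in Table 1.

```python
from statistics import mean


def percentage_enhancement(s0: float, s1: float) -> float:
    """PE = (S1 - S0) / S0, early enhancement relative to baseline."""
    if s0 <= 0:
        raise ValueError("baseline signal S0 must be positive")
    return (s1 - s0) / s0


def signal_enhancement_ratio(s0: float, s1: float, s2: float) -> float:
    """SER = (S1 - S0) / (S2 - S0), early versus late enhancement."""
    late_enhancement = s2 - s0
    if late_enhancement == 0:
        raise ValueError("S2 equals S0, so SER is undefined")
    return (s1 - s0) / late_enhancement


# Hypothetical voxel intensities inside one ROI at baseline, 120 s, and 300 s.
roi_baseline = [98.0, 100.0, 102.0]
roi_120s = [258.0, 262.0, 266.0]
roi_300s = [252.0, 256.0, 260.0]

s0, s1, s2 = mean(roi_baseline), mean(roi_120s), mean(roi_300s)
pe = percentage_enhancement(s0, s1)          # (262 - 100) / 100 = 1.62
ser = signal_enhancement_ratio(s0, s1, s2)   # 162 / 156, slightly above 1
```

An SER above 1 means the signal at 300 s has fallen below the 120 s value (washout), while an SER below 1 means the lesion is still enhancing at 300 s.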
Model construction, validation, and statistical analysis
Patients were randomly divided into training and internal validation sets for model building and evaluation. Four models were constructed with 10-fold cross-validation: logistic regression (LR), support vector machine (SVM) classifier, k-nearest neighbour (KNN), and extreme gradient boosting (XGBoost). The discriminative ability of all developed models was then evaluated on the internal validation set. The overall performance of each model was assessed using the area under the curve (AUC), sensitivity, specificity, accuracy, positive predictive value, and negative predictive value. CIs were generated, and pairwise comparisons of the AUC between models were conducted using the method described by DeLong et al.28 The Hosmer-Lemeshow goodness-of-fit test was used to test model calibration. The Shapley additive explanation (SHAP) method was employed to quantify the importance of features in the ML model.29 The HER2 score was calculated from the best-performing model among the 4 models and used to predict pCR. The optimal threshold of the HER2 score, calculated by maximally selected rank statistics, divided the patients into 2 groups: high risk and low risk for DFS.
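As an illustration of the primary performance measure, the AUC can be computed as the probability that a randomly chosen HER2-low case receives a higher model score than a randomly chosen HER2-zero case, with ties counted as one half. This rank-based (Mann-Whitney) form is also the quantity on which the DeLong comparison is built. The sketch below is in Python rather than the R and SPSS software used in the study, and the scores and labels are made up.

```python
from typing import Sequence


def auc_rank(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels use 1 for the positive class (here HER2-low) and 0 for the
    negative class (HER2-zero); tied scores count as half a correct pair.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Hypothetical model scores for three HER2-low and three HER2-zero tumours.
toy_scores = [0.90, 0.80, 0.35, 0.60, 0.40, 0.20]
toy_labels = [1, 1, 1, 0, 0, 0]
toy_auc = auc_rank(toy_scores, toy_labels)  # 7 of 9 pairs correct
```

The pairwise loop is quadratic in the number of patients, which is acceptable for a cohort of a few hundred; sorting-based implementations give the same value faster.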
All statistical analyses were performed using R software (version 4.1.3; https://www.r-project.org/) and IBM SPSS Statistics 23 (IBM Corporation, Armonk, NY, USA). Baseline clinicopathological characteristics were compared with the Wilcoxon rank-sum test for continuous variables and the chi-square test or Fisher’s exact test for categorical variables. All variables associated with HER2 expression at the P < .05 level in univariable analysis were included in multivariable LR analysis. We evaluated the interobserver agreement for subjective MRI characteristics using the intraclass correlation coefficient (ICC). A two-sided P < .05 was considered statistically significant.
Results
Clinicopathological and MRI characteristics of study patients
A total of 516 breast cancer patients met the inclusion criteria and were included in the analysis. HER2-low status was confirmed in 68.8% (249 out of 362) of patients in the training dataset and 64.9% (100 out of 154) of patients in the validation dataset. The interobserver agreement for the MRI characteristics ranged from an ICC of 0.523 to 0.920. The clinicopathological and MRI characteristics of both the training dataset and the validation dataset are summarized in Table 1. Compared with the training dataset, the internal validation dataset had a higher rate of invasive carcinoma of no special type (98.7% [152 of 154] vs 94.8% [343 of 362]; P = .038). No significant difference was observed in other clinicopathological and MRI findings between the training and validation datasets (Table 1).
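The reported difference in histologic type can be checked from the counts given above. The following Python sketch applies a Pearson chi-square test without continuity correction to the 2 × 2 table; that the authors used exactly this variant is an assumption, although it reproduces the reported P-value. For 1 degree of freedom the upper-tail probability equals erfc(sqrt(chi2 / 2)), which avoids any dependency beyond the standard library.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    observed = ((a, b), (c, d))
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    n = a + b + c + d
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows: training, validation. Columns: no special type, other invasive histology.
chi2, p_value = chi_square_2x2(343, 19, 152, 2)
# chi2 is about 4.32 and p_value rounds to .038, matching the text and Table 1.
```

The smallest expected cell count here is about 6.3, so the chi-square approximation is reasonable; with smaller expected counts Fisher’s exact test would be the usual choice.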
Table 1.
Clinicopathological and MRI characteristics of all patients.
Characteristic | Total patients (n = 516) | Training dataset (n = 362) | Validation dataset (n = 154) | P-value |
---|---|---|---|---|
Age (years) | 49 (23-87) | 49 (24-77) | 48 (23-87) | .752 |
Menopausal status | | | | .958 |
Premenopause | 304 (58.9%) | 213 (58.8%) | 91 (59.1%) | |
Postmenopause | 212 (41.1%) | 149 (41.2%) | 63 (40.9%) | |
Location | | | | .274 |
Left | 277 (53.7%) | 200 (55.2%) | 77 (50.0%) | |
Right | 239 (46.3%) | 162 (44.8%) | 77 (50.0%) | |
Clinical T stage | | | | .247 |
T1-T2 | 388 (75.2%) | 267 (73.8%) | 121 (78.6%) | |
T3-T4 | 128 (24.8%) | 95 (26.2%) | 33 (21.4%) | |
Clinical N stage | | | | .690 |
N0 | 130 (25.2%) | 93 (25.7%) | 37 (24.0%) | |
N1-3 | 386 (74.8%) | 269 (74.3%) | 117 (76.0%) | |
Histologic type | | | | .038 |
Invasive carcinoma of no special type | 495 (95.9%) | 343 (94.8%) | 152 (98.7%) | |
Other invasive histology | 21 (4.1%) | 19 (5.2%) | 2 (1.3%) | |
Histologic grade | | | | .529 |
1 or 2 | 289 (56.0%) | 206 (56.9%) | 83 (53.9%) | |
3 | 227 (44.0%) | 156 (43.1%) | 71 (46.1%) | |
HER2 status | | | | .392 |
Zero | 167 (32.4%) | 113 (31.2%) | 54 (35.1%) | |
Low | 349 (67.6%) | 249 (68.8%) | 100 (64.9%) | |
HR status | | | | .332 |
Negative | 132 (25.6%) | 97 (26.8%) | 35 (22.7%) | |
Positive | 384 (74.4%) | 265 (73.2%) | 119 (77.3%) | |
Ki-67 expression | | | | .483 |
Low (<30%) | 145 (28.1%) | 105 (29.0%) | 40 (26.0%) | |
High (≥30%) | 371 (71.9%) | 257 (71.0%) | 114 (74.0%) | |
NAT regimen | | | | .175 |
Anthracycline-based | 20 (3.9%) | 12 (3.3%) | 8 (5.2%) | |
Taxane-based | 149 (28.9%) | 102 (28.2%) | 47 (30.5%) | |
Anthracycline and taxane-based | 297 (57.6%) | 215 (59.4%) | 82 (53.2%) | |
Other chemotherapy | 10 (1.9%) | 4 (1.1%) | 6 (3.9%) | |
Endocrine therapy | 40 (7.8%) | 29 (8.0%) | 11 (7.1%) | |
Breast surgery | | | | .597 |
Conserving surgery | 113 (21.9%) | 77 (21.3%) | 36 (23.4%) | |
Total mastectomy | 403 (78.1%) | 285 (78.7%) | 118 (76.6%) | |
Adjuvant chemotherapy | | | | .153 |
Yes | 276 (53.5%) | 188 (51.9%) | 88 (57.1%) | |
No | 227 (44.0%) | 162 (44.8%) | 65 (42.2%) | |
Missing | 13 (2.5%) | 12 (3.3%) | 1 (0.6%) | |
Adjuvant radiotherapy | | | | .966 |
Yes | 324 (62.8%) | 226 (62.4%) | 98 (63.6%) | |
No | 158 (30.6%) | 112 (30.9%) | 46 (29.9%) | |
Missing | 34 (6.6%) | 24 (6.6%) | 10 (6.5%) | |
pCR | | | | .536 |
Negative | 401 (77.7%) | 284 (78.5%) | 117 (76.0%) | |
Positive | 115 (22.3%) | 78 (21.5%) | 37 (24.0%) | |
Enhancement type | | | | .739 |
Mass | 420 (81.4%) | 296 (81.8%) | 124 (80.5%) | |
Nonmass | 96 (18.6%) | 66 (18.2%) | 30 (19.5%) | |
BPE | | | | .498 |
Minimal or mild | 392 (76.0%) | 272 (75.1%) | 120 (77.9%) | |
Moderate or marked | 124 (24.0%) | 90 (24.9%) | 34 (22.1%) | |
Enhancement pattern | | | | .990 |
Homogeneous | 29 (5.6%) | 20 (5.5%) | 9 (5.8%) | |
Heterogeneous | 450 (87.2%) | 316 (87.3%) | 134 (87.0%) | |
Rim enhancement | 37 (7.2%) | 26 (7.2%) | 11 (7.1%) | |
Oedema | | | | .470 |
No oedema | 122 (23.6%) | 91 (25.1%) | 31 (20.1%) | |
Peritumoural oedema | 122 (23.6%) | 82 (22.7%) | 40 (26.0%) | |
Prepectoral oedema | 139 (26.9%) | 100 (27.6%) | 39 (25.3%) | |
Subcutaneous oedema | 133 (25.8%) | 89 (24.6%) | 44 (28.6%) | |
ADCmin (×10⁻³ mm²/s) | 0.57 (0.14-1.53) | 0.57 (0.14-1.53) | 0.57 (0.25-1.16) | .482 |
ADCmean (×10⁻³ mm²/s) | 0.91 (0.43-1.75) | 0.92 (0.43-1.75) | 0.90 (0.54-1.64) | .505 |
Tumour diameter (mm) | 30 (9-121) | 30 (9-121) | 28 (9-75) | .104 |
PE | 1.62 (0.46-3.14) | 1.64 (0.46-3.13) | 1.59 (0.55-3.14) | .196 |
SER | 1.04 (0.56-1.37) | 1.04 (0.56-1.37) | 1.05 (0.67-1.34) | .971 |
Data are presented as n (%) or median (range).
Abbreviations: ADCmin = the minimum value of apparent diffusion coefficient; ADCmean = the mean value of apparent diffusion coefficient; BPE = background parenchymal enhancement; HR = hormone receptor; HER2 = human epidermal growth factor receptor 2; NAT = neoadjuvant therapy; pCR = pathological complete response; PE = percentage of enhancement; SER = signal enhancement ratio.
The differences between patients with HER2-zero and HER2-low status in the training dataset are summarized in Table 2. Regarding clinicopathological factors, differences were observed between the HER2-zero group and the HER2-low group for the following variables: (1) histologic grade (histologic grade 3 in HER2-zero vs HER2-low: 56.6% vs 36.9%; P < .001); (2) HR-positive status (HER2-zero vs HER2-low: 57.5% vs 80.3%; P < .001); (3) high Ki-67 expression (HER2-zero vs HER2-low: 80.5% vs 66.7%; P = .007). As for MRI characteristics, ADCmean, PE, and SER were significantly different between the HER2-zero and HER2-low groups. None of the remaining clinicopathological and MRI characteristics differed between the HER2-zero and HER2-low groups in the training dataset. In multivariable analysis adjusting for the other variables that were unevenly distributed between the groups (histologic grade, Ki-67 status, and PE), HR status, ADCmean, and SER remained statistically significant (Table S2). Table S3 shows the relationship between pCR and clinicopathological characteristics of all patients. Compared with the non-pCR group, the pCR group had a lower clinical stage, higher histologic grade, higher Ki-67 expression, and more frequent HER2-zero and HR-negative status. Patients who achieved pCR had more often received taxane-based therapy and less often endocrine therapy.
Table 2.
Clinicopathological and MRI characteristics of HER2-zero and HER2-low patients in the training dataset.
Characteristic | HER2-zero (n = 113) | HER2-low (n = 249) | P-value |
---|---|---|---|
Age (years) | 48 (32-71) | 49 (24-77) | .199 |
Menopausal status | | | .418 |
Premenopause | 70 (61.9%) | 143 (57.4%) | |
Postmenopause | 43 (38.1%) | 106 (42.6%) | |
Location | | | .215 |
Left | 57 (50.4%) | 143 (57.4%) | |
Right | 56 (49.6%) | 106 (42.6%) | |
Clinical T stage | | | .145 |
T1-T2 | 89 (78.8%) | 178 (71.5%) | |
T3-T4 | 24 (21.2%) | 71 (28.5%) | |
Clinical N stage | | | .789 |
N0 | 28 (24.8%) | 65 (26.1%) | |
N1-3 | 85 (75.2%) | 184 (73.9%) | |
Histologic type | | | .326 |
Invasive carcinoma of no special type | 109 (96.5%) | 234 (94.0%) | |
Other invasive histology | 4 (3.5%) | 15 (6.0%) | |
Histologic grade | | | <.001 |
1 or 2 | 49 (43.4%) | 157 (63.1%) | |
3 | 64 (56.6%) | 92 (36.9%) | |
HR status | | | <.001 |
Negative | 48 (42.5%) | 49 (19.7%) | |
Positive | 65 (57.5%) | 200 (80.3%) | |
Ki-67 expression | | | .007 |
Low (<30%) | 22 (19.5%) | 83 (33.3%) | |
High (≥30%) | 91 (80.5%) | 166 (66.7%) | |
NAT regimen | | | .063 |
Anthracycline-based | 5 (4.4%) | 7 (2.8%) | |
Taxane-based | 35 (31.0%) | 67 (26.9%) | |
Anthracycline and taxane-based | 68 (60.2%) | 147 (59.0%) | |
Other chemotherapy | 2 (1.8%) | 2 (0.8%) | |
Endocrine therapy | 3 (2.7%) | 26 (10.4%) | |
Breast surgery | | | .411 |
Conserving surgery | 27 (23.9%) | 50 (20.1%) | |
Total mastectomy | 86 (76.1%) | 199 (79.9%) | |
Adjuvant chemotherapy | | | .872 |
Yes | 57 (50.4%) | 131 (52.6%) | |
No | 52 (46.0%) | 110 (44.2%) | |
Missing | 4 (3.5%) | 8 (3.2%) | |
Adjuvant radiotherapy | | | .232 |
Yes | 77 (68.1%) | 149 (59.8%) | |
No | 28 (24.8%) | 84 (33.7%) | |
Missing | 8 (7.1%) | 16 (6.4%) | |
pCR | | | .066 |
Negative | 82 (72.6%) | 202 (81.1%) | |
Positive | 31 (27.4%) | 47 (18.9%) | |
Enhancement type | | | .638 |
Mass | 94 (83.2%) | 202 (81.1%) | |
Nonmass | 19 (16.8%) | 47 (18.9%) | |
BPE | | | .812 |
Minimal or mild | 84 (74.3%) | 188 (75.5%) | |
Moderate or marked | 29 (25.7%) | 61 (24.5%) | |
Enhancement pattern | | | .778 |
Homogeneous | 5 (4.4%) | 15 (6.0%) | |
Heterogeneous | 99 (87.6%) | 217 (87.1%) | |
Rim enhancement | 9 (8.0%) | 17 (6.8%) | |
Oedema | | | .641 |
No oedema | 24 (21.2%) | 67 (26.9%) | |
Peritumoural oedema | 29 (25.7%) | 53 (21.3%) | |
Prepectoral oedema | 32 (28.3%) | 68 (27.3%) | |
Subcutaneous oedema | 28 (24.8%) | 61 (24.5%) | |
ADCmin (×10⁻³ mm²/s) | 0.55 (0.20-0.96) | 0.59 (0.14-1.53) | .066 |
ADCmean (×10⁻³ mm²/s) | 0.87 (0.51-1.41) | 0.94 (0.43-1.75) | .009 |
Tumour diameter (mm) | 31 (9-114) | 30 (10-121) | .777 |
PE | 1.55 (0.46-3.08) | 1.67 (0.50-3.13) | .043 |
SER | 1.01 (0.69-1.27) | 1.05 (0.56-1.37) | .003 |
Data are presented as n (%) or median (range).
Abbreviations: ADCmin = the minimum value of apparent diffusion coefficient; ADCmean = the mean value of apparent diffusion coefficient; BPE = background parenchymal enhancement; HR = hormone receptor; HER2 = human epidermal growth factor receptor 2; NAT = neoadjuvant therapy; pCR = pathological complete response; PE = percentage of enhancement; SER = signal enhancement ratio.
Predictive performance of ML algorithms
Using patient characteristics collected before NAT, supervised ML models were trained to identify patients with low levels of HER2 expression. Table 3 summarizes the predictive performance of all 4 classification algorithms in both datasets. Among these algorithms, XGBoost exhibited the highest performance for assessing HER2 status, with AUCs of 0.783 (95% CI, 0.733-0.833) and 0.787 (95% CI, 0.709-0.865) in the training and validation datasets, respectively, compared with LR (0.622, 95% CI, 0.562-0.682; 0.536, 95% CI, 0.442-0.630), the SVM classifier (0.745, 95% CI, 0.691-0.800; 0.776, 95% CI, 0.699-0.853), and KNN (0.738, 95% CI, 0.688-0.788; 0.756, 95% CI, 0.682-0.830; Figure 2). The AUC of the XGBoost model was significantly higher than that of the LR model in both the training (P < .0001) and validation (P = .0006) datasets. In contrast, the AUCs of the SVM and KNN models did not differ significantly from that of the XGBoost model in either dataset (Table 3). Figure 3 illustrates the ranking of feature importance for the input variables in the XGBoost model. ADCmean, PE, and SER were the important MRI features (Figure 4). After incorporating the clinicopathological variables (HR status, histologic grade, and Ki-67 status), the XGBoost model attained AUC values of 0.835 and 0.827 in the training and validation datasets, respectively (Figure S1). Finally, the predicted probabilities of the XGBoost model closely aligned with the observed probabilities and showed good calibration (Hosmer-Lemeshow test: P = .108 and P = .168 in the training and validation datasets, respectively; Figure S2).
Table 3.
Performance of machine learning models in the training and validation datasets.
Metric | LR (training) | SVM (training) | KNN (training) | XGBoost (training) | LR (validation) | SVM (validation) | KNN (validation) | XGBoost (validation) |
---|---|---|---|---|---|---|---|---|
AUC | 0.622 (0.562-0.682) | 0.745 (0.691-0.800) | 0.738 (0.688-0.788) | 0.783 (0.733-0.833) | 0.536 (0.442-0.630) | 0.776 (0.699-0.853) | 0.756 (0.682-0.830) | 0.787 (0.709-0.865) |
Accuracy, % | 54.7 | 71.8 | 66.6 | 71.0 | 47.4 | 73.4 | 67.5 | 72.1 |
Sensitivity, % | 45.0 | 69.9 | 62.2 | 70.3 | 26.0 | 68.0 | 61.0 | 68.0 |
Specificity, % | 76.1 | 76.1 | 76.1 | 72.6 | 87.0 | 83.3 | 79.6 | 79.6 |
PPV, % | 80.6 | 86.6 | 85.2 | 85.0 | 78.8 | 88.3 | 84.7 | 86.1 |
NPV, % | 38.6 | 53.4 | 47.8 | 52.6 | 38.8 | 58.4 | 52.4 | 57.3 |
P-value | <.0001 | .219 | .104 | Ref | .0006 | .836 | .514 | Ref |
Abbreviations: AUC = area under the curve; LR = logistic regression; KNN = k-nearest neighbour; NPV = negative predictive value; PPV = positive predictive value; ref = reference; SVM = support vector machine; XGBoost = extreme gradient boosting.
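The threshold-based metrics in Table 3 all derive from a single confusion matrix. The Python sketch below shows the arithmetic for the XGBoost model in the validation dataset, treating HER2-low as the positive class. The four counts are not reported in the source; they are inferred here from the reported sensitivity (68.0% of 100 HER2-low patients) and specificity (79.6% of 54 HER2-zero patients), and should be read as an assumption.

```python
def classification_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Return accuracy, sensitivity, specificity, PPV, and NPV in percent."""
    total = tp + fn + tn + fp
    return {
        "accuracy": 100 * (tp + tn) / total,
        "sensitivity": 100 * tp / (tp + fn),
        "specificity": 100 * tn / (tn + fp),
        "ppv": 100 * tp / (tp + fp),
        "npv": 100 * tn / (tn + fn),
    }


# Inferred counts (assumption): 68 of 100 HER2-low and 43 of 54 HER2-zero
# patients classified correctly by XGBoost in the validation dataset.
validation_xgb = classification_metrics(tp=68, fn=32, tn=43, fp=11)
rounded = {name: round(value, 1) for name, value in validation_xgb.items()}
# rounded reproduces the XGBoost validation column of Table 3:
# accuracy 72.1, sensitivity 68.0, specificity 79.6, PPV 86.1, NPV 57.3
```

Because about two-thirds of the cohort is HER2-low, PPV is high and NPV is modest even at balanced sensitivity and specificity; both predictive values would shift in a population with a different HER2-low prevalence.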
Figure 2.
Receiver operating characteristic curves for all 4 models. (A) The training dataset; (B) the validation dataset. Abbreviations: AUC = area under the receiver operating characteristic curve; KNN = k-nearest neighbour; LR = logistic regression; SVM = support vector machine; XGB = extreme gradient boosting.
Figure 3.
Feature importance plot for the XGBoost model in the training dataset. The yellow and purple in each row indicate participants with low to high values of the particular variable, while the X-axis gives the SHAP values for the impact on the model. Abbreviations: SHAP = Shapley additive explanation values; PE = percentage of enhancement; SER = signal enhancement ratio.
Figure 4.
Percentage of enhancement (PE), signal enhancement ratio (SER), and the mean value of apparent diffusion coefficient (ADCmean) in human epidermal growth factor receptor 2 (HER2)-low and HER2-zero patients. (A-D) HER2-low patient, PE = 1.92; SER = 1.06; ADCmean = 1.002. (E-H) HER2-zero patient, PE = 0.47; SER = 0.72; ADCmean = 0.83. (A, E) S0, the baseline signal intensity; (B, F) S1, the signal intensity measured at 120 s after contrast delivery; (C, G) S2, the signal intensity measured at 300 s after contrast delivery; (D, H): ADC. PE = (S1−S0)/S0; SER=(S1−S0)/(S2−S0).
Pathological complete response and disease-free survival analysis
The HER2 score established by the XGBoost model achieved an AUC of 0.708 in the training dataset and 0.695 in the validation dataset for predicting pCR (Figure 5A and B). In exploratory survival analysis, with a median follow-up of 39.9 months (IQR, 33.8-46.0 months), 70 patients had recurrence events in the whole dataset. The HER2 score was further used to predict DFS in this cohort of breast cancer patients undergoing neoadjuvant chemotherapy. The Kaplan-Meier survival curves demonstrated that a low HER2 score was significantly associated with shorter DFS in the validation dataset (hazard ratio: 2.748, 95% CI: 1.016-7.432, P = .037; Figure 6B). However, no significant difference was found in the training dataset (hazard ratio: 1.561, 95% CI: 0.811-3.003, P = .18; Figure 6A).
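The reported hazard ratios and confidence intervals can be checked for internal consistency. The Python sketch below assumes the intervals are Wald-type intervals constructed on the log scale, as is usual for Cox models; the source does not state how they were computed. Under that assumption the point estimate should equal the geometric mean of the two limits, and the standard error of the log hazard ratio can be recovered from the interval width.

```python
import math

Z_975 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_scale_summary(lower: float, upper: float) -> tuple:
    """Recover the implied point estimate and log-scale SE from a 95% CI."""
    point = math.sqrt(lower * upper)  # geometric mean of the limits
    se_log = (math.log(upper) - math.log(lower)) / (2 * Z_975)
    return point, se_log


# Limits as reported in the text for the validation and training datasets.
validation_point, validation_se = log_scale_summary(1.016, 7.432)
training_point, training_se = log_scale_summary(0.811, 3.003)
# Both implied point estimates agree with the reported hazard ratios
# (2.748 and 1.561) to about three decimal places.
```

The wider interval in the validation dataset corresponds to a larger standard error (about 0.51 versus 0.33 on the log scale), which is consistent with its smaller sample size.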
Figure 5.
Receiver operating characteristic curves of the HER2 score for predicting pathological complete response. (A) The training dataset; (B) the validation dataset. Abbreviation: AUC = area under the receiver operating characteristic curve.
Figure 6.
Kaplan-Meier curve of disease-free survival in a cohort of breast cancer patients undergoing neoadjuvant chemotherapy. (A) The training dataset; (B) the validation dataset. Abbreviation: HER2 = human epidermal growth factor receptor 2.
Discussion
This study investigated the performance of an MRI-based model in differentiating between HER2-low and HER2-zero breast cancers. The XGBoost model obtained the best classification in both the training and internal validation cohorts. The MRI features derived from the tumour were able to distinguish between HER2-low and HER2-zero breast cancers (internal validation cohort AUC, 0.787). Our results revealed that the MRI characteristics PE, ADCmean, and SER contributed significantly to the differentiation between HER2-zero and HER2-low patients. Furthermore, the model showed predictive ability for pCR status and, in the validation cohort, for DFS.
HER2-low expression has now been recognized as targetable with the novel anti-HER2 ADCs, which have been shown to improve progression-free survival and overall survival in the phase III DESTINY-Breast04 trial.6 Previous studies used DCE-MRI to detect and characterize associations between imaging phenotypes and HER2 expression.30–33 In our study, we observed that the kinetic characteristics obtained from DCE-MRI, namely PE and SER, were higher in the HER2-low breast cancer cohort than in the HER2-zero cohort. This observation suggests a plausible association between HER2 expression and the up-regulation of vascular endothelial growth factor in human breast cancer cells.34,35 Additionally, we noted that the HER2-low group demonstrated a higher ADC value than the HER2-zero group. The correlation between HER2 expression and increased angiogenesis potentially led to enhanced perfusion, resulting in elevated ADC values.36,37 Furthermore, our findings showed that the HER2-low cohort was more likely to be HR-positive, had lower histologic grade, and exhibited lower Ki-67 expression, which aligns with the findings of several other studies.9,38,39
In this study, the XGBoost model, built using MRI characteristics, demonstrated superior performance in HER2 status assessment compared to other models such as KNN, SVM, and LR. As a gradient-boosted tree-based ML algorithm, XGBoost has shown strong performance across diverse clinical tasks, surpassing conventional algorithms like random forest, LR, and SVM.40 Notably, the XGBoost model employs a gain value to quantify the extent to which a variable enhances the accuracy of predictions, with higher gain values indicating greater improvement. Farrokhian et al also employed the XGBoost model to attain performance superior to that of conventional algorithms in clinical research.41 In our study, PE emerged as the most influential factor in the XGBoost model. SER derived from DCE-MRI and the ADC value also played a crucial role in the predictive model, enabling effective differentiation between these 2 subgroups. A recent study found that the combination of selected features from T1-DCE and ADC achieved the best performance in discriminating HER2 status.15 Furthermore, we observed that the incorporation of clinicopathological variables enhanced the discriminative performance of the model (internal validation cohort AUC, 0.827).
In our research, the routinely collected pretreatment MRI characteristics, combined through ML algorithms, proved effective in differentiating between HER2-zero and HER2-low breast cancer. While radiomics studies may offer valuable insights for HER2 status assessment,15,16 their complex analytical process often limits their application in clinical practice. The objective MRI characteristics employed in our model are easily obtained and have the potential to streamline clinical workflows. Furthermore, the HER2 score derived from the XGBoost model showed moderate ability to predict pCR (validation AUC, 0.695), which may help inform patient management. This HER2 score was also associated with DFS in the validation dataset of patients undergoing neoadjuvant chemotherapy, where patients with lower HER2 scores tended to have worse DFS, consistent with previous research.9,38 This difference in DFS was not observed in the training dataset. Because the prior studies relied on large sample sizes, future investigations with larger cohorts are needed to establish whether the model genuinely predicts distinct prognoses for HER2-low and HER2-zero patients.
Limitations
Our study has several limitations. First, it was a retrospective single-institution analysis, which makes it difficult to control all variables and prevent bias. Multicentre investigations involving larger cohorts are needed to validate the discriminative performance of the model. Second, kinetic features and ADC values were assessed using a 2D region of interest, whereas a 3D volume of interest would provide more comprehensive information.
Conclusion
In conclusion, an XGBoost model based on pretreatment breast MRI characteristics could aid in identifying suitable candidates for HER2-targeted therapy. Furthermore, this model shows the potential to predict pCR status and prognosis in breast cancer patients treated with NAT.
Supplementary Material
Contributor Information
Xu Huang, Department of Radiology, Guangdong Provincial People’s Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, Guangzhou 510080, China; Guangdong Provincial Key Laboratory of Artificial Intelligence in Medical Image Analysis and Application, Guangzhou 510080, China.
Lei Wu, Department of Radiology, Guangdong Provincial People’s Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, Guangzhou 510080, China; Guangdong Provincial Key Laboratory of Artificial Intelligence in Medical Image Analysis and Application, Guangzhou 510080, China.
Yu Liu, Guangdong Provincial Key Laboratory of Artificial Intelligence in Medical Image Analysis and Application, Guangzhou 510080, China; Department of Ultrasound, Guangdong Provincial People’s Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, Guangzhou 510080, China.
Zeyan Xu, Department of Radiology, Guangdong Provincial People’s Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, Guangzhou 510080, China; Guangdong Provincial Key Laboratory of Artificial Intelligence in Medical Image Analysis and Application, Guangzhou 510080, China.
Chunling Liu, Department of Radiology, Guangdong Provincial People’s Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, Guangzhou 510080, China; Guangdong Provincial Key Laboratory of Artificial Intelligence in Medical Image Analysis and Application, Guangzhou 510080, China.
Zaiyi Liu, Department of Radiology, Guangdong Provincial People’s Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, Guangzhou 510080, China; Guangdong Provincial Key Laboratory of Artificial Intelligence in Medical Image Analysis and Application, Guangzhou 510080, China.
Changhong Liang, Department of Radiology, Guangdong Provincial People’s Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, Guangzhou 510080, China; Guangdong Provincial Key Laboratory of Artificial Intelligence in Medical Image Analysis and Application, Guangzhou 510080, China.
Author contributions
Drs X. Huang, L. Wu, and Y. Liu contributed equally and share first authorship. Drs Z. Liu and C. Liang contributed equally to this work as co-corresponding authors.
Supplementary material
Supplementary material is available at BJR online.
Funding
This work was funded by the Key-Area Research and Development Program of Guangdong Province (No. 2021B0101420006); National Natural Science Foundation of China (No. 82071892, 82271941, 82272088); Guangdong Provincial Key Laboratory of Artificial Intelligence in Medical Image Analysis and Application (No. 2022B1212010011); the National Science Foundation for Young Scientists of China (No. 82102019); Medical Scientific Research Foundation of Guangdong Province (No. A2024403).
Conflicts of interest
The authors declare that they have no conflict of interest.
References
- 1. von Minckwitz G, Untch M, Blohmer J-U, et al. Definition and impact of pathologic complete response on prognosis after neoadjuvant chemotherapy in various intrinsic breast cancer subtypes. J Clin Oncol. 2012;30(15):1796-1804. 10.1200/JCO.2011.38.8595
- 2. Chumsri S, Li Z, Serie DJ, et al. Incidence of late relapses in patients with HER2-positive breast cancer receiving adjuvant trastuzumab: combined analysis of NCCTG N9831 (Alliance) and NRG Oncology/NSABP B-31. J Clin Oncol. 2019;37(35):3425-3435. 10.1200/JCO.19.00443
- 3. Wolff AC, Hammond MEH, Allison KH, et al. Human epidermal growth factor receptor 2 testing in breast cancer: American Society of Clinical Oncology/College of American Pathologists Clinical Practice Guideline Focused Update. J Clin Oncol. 2018;36(20):2105-2122. 10.1200/JCO.2018.77.8738
- 4. Tarantino P, Hamilton E, Tolaney SM, et al. HER2-low breast cancer: pathological and clinical landscape. J Clin Oncol. 2020;38(17):1951-1962. 10.1200/JCO.19.02488
- 5. Gianni L, Eiermann W, Semiglazov V, et al. Neoadjuvant chemotherapy with trastuzumab followed by adjuvant trastuzumab versus neoadjuvant chemotherapy alone, in patients with HER2-positive locally advanced breast cancer (the NOAH trial): a randomised controlled superiority trial with a parallel HER2-negative cohort. Lancet. 2010;375(9712):377-384. 10.1016/S0140-6736(09)61964-4
- 6. Modi S, Jacot W, Yamashita T, et al.; DESTINY-Breast04 Trial Investigators. Trastuzumab deruxtecan in previously treated HER2-low advanced breast cancer. N Engl J Med. 2022;387(1):9-20. 10.1056/NEJMoa2203690
- 7. Modi S, Park H, Murthy RK, et al. Antitumor activity and safety of trastuzumab deruxtecan in patients with HER2-low-expressing advanced breast cancer: results from a phase Ib study. J Clin Oncol. 2020;38(17):1887-1896. 10.1200/JCO.19.02318
- 8. Banerji U, van Herpen CML, Saura C, et al. Trastuzumab duocarmazine in locally advanced and metastatic solid tumours and HER2-expressing breast cancer: a phase 1 dose-escalation and dose-expansion study. Lancet Oncol. 2019;20(8):1124-1135. 10.1016/S1470-2045(19)30328-6
- 9. Denkert C, Seither F, Schneeweiss A, et al. Clinical and molecular characteristics of HER2-low-positive breast cancer: pooled analysis of individual patient data from four prospective, neoadjuvant clinical trials. Lancet Oncol. 2021;22(8):1151-1161. 10.1016/S1470-2045(21)00301-6
- 10. Agostinetto E, Rediti M, Fimereli D, et al. HER2-low breast cancer: molecular characteristics and prognosis. Cancers (Basel). 2021;13(11):2824. 10.3390/cancers13112824
- 11. Wulfkuhle JD, Berg D, Wolff C, et al.; I-SPY 1 TRIAL Investigators. Molecular analysis of HER2 signaling in human breast cancer by functional protein pathway activation mapping. Clin Cancer Res. 2012;18(23):6426-6435. 10.1158/1078-0432.CCR-12-0452
- 12. Kim JJ, Kim JY, Suh HB, et al. Characterization of breast cancer subtypes based on quantitative assessment of intratumoral heterogeneity using dynamic contrast-enhanced and diffusion-weighted magnetic resonance imaging. Eur Radiol. 2022;32(2):822-833. 10.1007/s00330-021-08166-4
- 13. Tsai WC, Chang KM, Kao KJ. Dynamic contrast enhanced MRI and intravoxel incoherent motion to identify molecular subtypes of breast cancer with different vascular normalization gene expression. Korean J Radiol. 2021;22(7):1021-1033. 10.3348/kjr.2020.0760
- 14. Mann RM, Cho N, Moy L. Breast MRI: state of the art. Radiology. 2019;292(3):520-536. 10.1148/radiol.2019182947
- 15. Bian X, Du S, Yue Z, et al. Potential antihuman epidermal growth factor receptor 2 target therapy beneficiaries: the role of MRI-based radiomics in distinguishing human epidermal growth factor receptor 2-low status of breast cancer. J Magn Reson Imaging. 2023;58(5):1603-1614. 10.1002/jmri.28628
- 16. Ramtohul T, Djerroudi L, Lissavalid E, et al. Multiparametric MRI and radiomics for the prediction of HER2-zero, -low, and -positive breast cancers. Radiology. 2023;308(2):e222646. 10.1148/radiol.222646
- 17. Guo Y, Xie X, Tang W, et al. Noninvasive identification of HER2-low-positive status by MRI-based deep learning radiomics predicts the disease-free survival of patients with breast cancer. Eur Radiol. 2024;34(2):899-913. 10.1007/s00330-023-09990-6
- 18. Mao C, Hu L, Jiang W, et al. Discrimination between human epidermal growth factor receptor 2 (HER2)-low-expressing and HER2-overexpressing breast cancers: a comparative study of four MRI diffusion models. Eur Radiol. 2023;34(4):2546-2559. 10.1007/s00330-023-10198-x
- 19. Greener JG, Kandathil SM, Moffat L, Jones DT. A guide to machine learning for biologists. Nat Rev Mol Cell Biol. 2022;23(1):40-55. 10.1038/s41580-021-00407-0
- 20. Tarantino P, Viale G, Press MF, et al. ESMO expert consensus statements (ECS) on the definition, diagnosis, and management of HER2-low breast cancer. Ann Oncol. 2023;34(8):645-659. 10.1016/j.annonc.2023.05.008
- 21. Nielsen TO, Leung SCY, Rimm DL, et al. Assessment of Ki67 in breast cancer: updated recommendations from the International Ki67 in Breast Cancer Working Group. J Natl Cancer Inst. 2021;113(7):808-819. 10.1093/jnci/djaa201
- 22. Allison KH, Hammond MEH, Dowsett M, et al. Estrogen and progesterone receptor testing in breast cancer: ASCO/CAP guideline update. J Clin Oncol. 2020;38(12):1346-1366. 10.1200/JCO.19.02309
- 23. Tolaney SM, Garrett-Mayer E, White J, et al. Updated standardized definitions for efficacy end points (STEEP) in adjuvant breast cancer clinical trials: STEEP version 2.0. J Clin Oncol. 2021;39(24):2720-2731. 10.1200/JCO.20.03613
- 24. BI-RADS Atlas Reporting System for Breast MRI. 2013. https://www.acr.org/Clinical-Resources/Reporting-and-Data-Systems/Bi-Rads.
- 25. Xiao J, Rahbar H, Hippe DS, et al. Dynamic contrast-enhanced breast MRI features correlate with invasive breast cancer angiogenesis. NPJ Breast Cancer. 2021;7(1):42. 10.1038/s41523-021-00247-3
- 26. Harada TL, Uematsu T, Nakashima K, et al. Evaluation of breast edema findings at T2-weighted breast MRI is useful for diagnosing occult inflammatory breast cancer and can predict prognosis after neoadjuvant chemotherapy. Radiology. 2021;299(1):53-62. 10.1148/radiol.2021202604
- 27. Xu Z, Ding Y, Zhao K, et al. MRI characteristics of breast edema for assessing axillary lymph node burden in early-stage breast cancer: a retrospective bicentric study. Eur Radiol. 2022;32(12):8213-8225. 10.1007/s00330-022-08896-z
- 28. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988;44(3):837-845.
- 29. Al'Aref SJ, Maliakal G, Singh G, et al. Machine learning of clinical variables and coronary artery calcium scoring for the prediction of obstructive coronary artery disease on coronary computed tomography angiography: analysis from the CONFIRM registry. Eur Heart J. 2020;41(3):359-367. 10.1093/eurheartj/ehz565
- 30. Zhou J, Tan H, Li W, et al. Radiomics signatures based on multiparametric MRI for the preoperative prediction of the HER2 status of patients with breast cancer. Acad Radiol. 2021;28(10):1352-1360. 10.1016/j.acra.2020.05.040
- 31. Bitencourt AGV, Gibbs P, Rossi Saccarelli C, et al. MRI-based machine learning radiomics can predict HER2 expression level and pathologic response after neoadjuvant therapy in HER2 overexpressing breast cancer. EBioMedicine. 2020;61:103042. 10.1016/j.ebiom.2020.103042
- 32. Blaschke E, Abe H. MRI phenotype of breast cancer: kinetic assessment for molecular subtypes. J Magn Reson Imaging. 2015;42(4):920-924. 10.1002/jmri.24884
- 33. Ming WL, Li FY, Zhu YH, et al. Unsupervised analysis based on DCE-MRI radiomics features revealed three novel breast cancer subtypes with distinct clinical outcomes and biological characteristics. Cancers (Basel). 2022;14(22):5507. 10.3390/cancers14225507
- 34. Kumar R, Yarmand-Bagheri R. The role of HER2 in angiogenesis. Semin Oncol. 2001;28(5 Suppl 16):27-32. 10.1016/s0093-7754(01)90279-9
- 35. Konecny GE, Meng YG, Untch M, et al. Association between HER-2/neu and vascular endothelial growth factor expression predicts clinical outcome in primary breast cancer patients. Clin Cancer Res. 2004;10(5):1706-1716. 10.1158/1078-0432.ccr-0951-3
- 36. Park SH, Choi HY, Hahn SY. Correlations between apparent diffusion coefficient values of invasive ductal carcinoma and pathologic factors on diffusion-weighted MRI at 3.0 Tesla. J Magn Reson Imaging. 2015;41(1):175-182. 10.1002/jmri.24519
- 37. Makkat S, Luypaert R, Stadnik T, et al. Deconvolution-based dynamic contrast-enhanced MR imaging of breast tumors: correlation of tumor blood flow with human epidermal growth factor receptor 2 status and clinicopathologic findings—preliminary results. Radiology. 2008;249(2):471-482. 10.1148/radiol.2492071147
- 38. Kang S, Lee SH, Lee HJ, et al. Pathological complete response, long-term outcomes, and recurrence patterns in HER2-low versus HER2-zero breast cancer after neoadjuvant chemotherapy. Eur J Cancer. 2022;176:30-40. 10.1016/j.ejca.2022.08.031
- 39. Tarantino P, Jin Q, Tayob N, et al. Prognostic and biologic significance of ERBB2-low expression in early-stage breast cancer. JAMA Oncol. 2022;8(8):1177-1183. 10.1001/jamaoncol.2022.2286
- 40. Nwanosike EM, Conway BR, Merchant HA, Hasan SS. Potential applications and performance of machine learning techniques and algorithms in clinical practice: a systematic review. Int J Med Inform. 2022;159:104679. 10.1016/j.ijmedinf.2021.104679
- 41. Farrokhian N, Holcomb AJ, Dimon E, et al. Development and validation of machine learning models for predicting occult nodal metastasis in early-stage oral cavity squamous cell carcinoma. JAMA Netw Open. 2022;5(4):e227226. 10.1001/jamanetworkopen.2022.7226