Chinese Journal of Cancer Research
. 2023 Aug 30;35(4):408–423. doi: 10.21147/j.issn.1000-9604.2023.04.07

Detection and classification of breast lesions using multiple information on contrast-enhanced mammography by a multiprocess deep-learning system: A multicenter study

Yuqian Chen 1, Zhen Hua 1,*, Fan Lin 2, Tiantian Zheng 3, Heng Zhou 1, Shijie Zhang 2, Jing Gao 2, Zhongyi Wang 2, Huafei Shao 2, Wenjuan Li 2, Fengjie Liu 2, Simin Wang 4, Yan Zhang 5, Feng Zhao 6, Hao Liu 7, Haizhu Xie 2, Heng Ma 2, Haicheng Zhang 2,8,*, Ning Mao 2,8,*
PMCID: PMC10485921  PMID: 37691895

Abstract

Objective

Accurate detection and classification of breast lesions at an early stage are crucial for the timely formulation of effective treatments. We aimed to develop a fully automatic system to detect and classify breast lesions using multiple contrast-enhanced mammography (CEM) images.

Methods

In this study, a total of 1,903 females who underwent CEM examination at three hospitals were enrolled and allocated to the training set, internal testing set, pooled external testing set and prospective testing set. We developed a CEM-based multiprocess detection and classification system (MDCS) to perform the tasks of detection and classification of breast lesions. In this system, we introduced an innovative auxiliary feature fusion (AFF) algorithm that intelligently incorporates multiple types of information from CEM images. The average free-response receiver operating characteristic score (AFROC-Score) was used to validate the system's detection performance, and classification performance was evaluated by the area under the receiver operating characteristic curve (AUC). Furthermore, we assessed the diagnostic value of MDCS through visual analysis of disputed cases, compared its performance and efficiency with those of radiologists, and explored whether it could augment radiologists' performance.

Results

On the pooled external and prospective testing sets, MDCS maintained a high standalone performance, with AFROC-Scores of 0.953 and 0.963 for the detection task, and AUCs for classification of 0.909 [95% confidence interval (95% CI): 0.822−0.996] and 0.912 (95% CI: 0.840−0.985), respectively. It also achieved higher sensitivity than all senior radiologists and higher specificity than all junior radiologists on the pooled external and prospective testing sets. Moreover, MDCS demonstrated superior diagnostic efficiency, with an average reading time of 5 seconds versus the radiologists' average of 3.2 min. With MDCS assistance, the average performance of all radiologists improved to varying degrees.

Conclusions

MDCS demonstrated excellent performance in the detection and classification of breast lesions, and greatly enhanced the overall performance of radiologists.

Keywords: Deep learning, contrast-enhanced mammography, breast lesions, detection, classification

Introduction

Breast cancer continues to be the primary cancer-related cause of disease burden for women (1). In 2020, approximately 2.3 million women were newly diagnosed with breast cancer and 685,000 breast cancer-related deaths occurred worldwide (2). However, it is also one of the most treatable malignancies if detected early (3). Therefore, accurate detection and diagnosis of breast lesions at an early stage are crucial to enabling more effective treatments and improving survival for breast cancer patients (4).

Clinically, many imaging modalities such as mammography, ultrasound, and magnetic resonance imaging have been applied to screen for and diagnose breast cancer (5). Mammography is the most commonly used tool owing to its low cost and quick examination time, and it has been confirmed to effectively reduce breast cancer mortality (6). Nevertheless, it is associated with a high false-negative rate and limited accuracy in dense breasts (7). Recently, contrast-enhanced mammography (CEM), which combines the benefits of mammography with contrast-enhanced imaging techniques (8,9), has emerged as an effective method for breast cancer diagnosis. Several studies have demonstrated the superior sensitivity and low false-positive (FP) rates of CEM, which benefit mainly from the complementary information in the recombined (RC) and low-energy (LE) images (10-12). Despite these advantages, interpreting breast CEM images remains a time-consuming and experience-dependent task for radiologists. Therefore, there is increasing interest in developing time-efficient and accessible tools to assist radiologists in CEM diagnosis.

Currently, artificial intelligence is increasingly widespread in medicine (13-15), and there is growing interest in applying deep learning to breast cancer diagnosis using imaging data (16-20). Nevertheless, the exploration of deep learning on CEM is still at an initial phase. Most existing CEM studies depended on manually created regions of interest (ROI) (21-24), which may increase the workload of radiologists. Additionally, effective utilization of the multiple types of CEM information could improve model performance (25,26). Multiview images [craniocaudal (CC) and medio-lateral-oblique (MLO) views] capture the breast from different perspectives. Specifically, localization of breast lesions on CEM relies primarily on RC images, with correlative and supporting anatomic information obtained from LE images. Classification relies more on richer information, such as the type and distribution of glandular tissue and calcifications reflected in LE images, with the outline and enhancement of lesions in RC images as reference. A well-designed feature fusion method may therefore further improve the detection and classification capability of networks. Yet some existing works (25,26) simply combined LE and RC images without giving more attention to task-related features during feature fusion. Moreover, no CEM-based multiprocess deep-learning system has been built to detect and classify breast lesions thus far.

To address this situation, we developed a multiprocess detection and classification system (MDCS) that applies an innovative feature fusion method to achieve fast detection and classification of breast lesions using multiple types of CEM information. The clinical performance of MDCS was further assessed by external and prospective validation, as well as by comparing its breast lesion classification performance with that of radiologists.

Materials and methods

This study was approved by the Institutional Ethics Committee of the Yantai Yuhuangding Hospital (retrospective study approval number: 2022-180; prospective study approval number: 2022-176) and prospectively registered at ClinicalTrials.gov (No. ChiCTR2200065169). The requirement for informed consent of participants in the retrospective study was waived, and all participants in the prospective cohort provided written informed consent.

Patients and datasets

Initially, 3,240 females were retrospectively reviewed, of whom 1,802 examined between September 2018 and April 2022 were finally enrolled for analysis. Among them, 1,707 patients from Yantai Yuhuangding Hospital (Institution 1), retrieved from September 2018 to April 2022, were randomly divided into a training set and an internal testing set (at a ratio of 8:2) to preliminarily train and validate MDCS. The generalization and robustness of MDCS were further evaluated on a pooled external testing set collected from Fudan University Cancer Center (Institution 2) and Guangdong Maternal and Child Health Hospital (Institution 3) between January 2022 and March 2022. Later, from November 2022 to December 2022, 101 patients from Yantai Yuhuangding Hospital were included in the prospective testing set to further verify the practicability of MDCS. All datasets were independent without overlap, and four images, comprising LE and RC images in the CC and MLO views of the unilateral breast with lesions, were obtained from each patient. The inclusion and exclusion criteria of the retrospective and prospective study cohorts are described in Supplementary materials, and the recruitment workflow is shown in Figure 1.

Figure 1.


Patient recruitment workflow. CEM, contrast-enhanced mammography.

Image annotation

For image annotation, we invited two independent radiologists (one with 10 years and one with 14 years of experience) to participate in this work. Using ITK-SNAP (Version 3.6; www.itksnap.org), a rectangular box was drawn around the breast lesion in the RC images of both CC and MLO views for most patients. For cases of carcinoma in situ, lesions were hardly observable in RC images because of obscure enhancement; in such cases, labeling was performed on LE images instead. Before officially labeling the images, the two radiologists underwent a training process until they achieved 95% agreement when labeling the same 35 patients. In cases of inconsistent opinions, a breast specialist with 20 years of experience performed further analysis. The annotations created by the radiologists served as the reference standard for tumor detection, and the pathological diagnosis served as the reference standard for tumor classification. More details of image acquisition can be found in Supplementary materials.

Image preprocessing

Before employing MDCS, we normalized and converted the original Digital Imaging and Communications in Medicine (DICOM) images into 8-bit (0−255) JPEG images. Next, to generate inputs compatible with our network, we stacked each original single-channel LE image into three channels, yielding a three-channel RGB LE image, and applied the same process to obtain a three-channel RC image. The entire breast region was then located using the Otsu (27) algorithm and cropped from the original image. Finally, the cropped images were uniformly resized to 512×1,024 using linear interpolation.
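As a rough illustration, this preprocessing pipeline might be sketched as follows. This is a minimal NumPy sketch under our own assumptions, not the authors' code; nearest-neighbor index mapping stands in for the linear interpolation used in the paper.

```python
import numpy as np

def to_8bit(img):
    """Min-max normalize a raw pixel array (e.g., from DICOM) to 8-bit (0-255)."""
    img = img.astype(np.float64)
    img = (img - img.min()) / max(img.max() - img.min(), 1e-8)
    return (img * 255).astype(np.uint8)

def otsu_threshold(img):
    """Otsu's method: pick the threshold maximizing between-class variance."""
    hist = np.bincount(img.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    best_t, best_var = 0, -1.0
    for t in range(1, 256):
        w0, w1 = prob[:t].sum(), prob[t:].sum()
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (np.arange(t) * prob[:t]).sum() / w0
        mu1 = (np.arange(t, 256) * prob[t:]).sum() / w1
        var = w0 * w1 * (mu0 - mu1) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t

def crop_breast(img):
    """Crop to the bounding box of the foreground (breast) region."""
    mask = img > otsu_threshold(img)
    ys, xs = np.where(mask)
    return img[ys.min():ys.max() + 1, xs.min():xs.max() + 1]

def resize_nn(img, out_h=1024, out_w=512):
    """Resize by index mapping (nearest neighbor; the paper uses linear interp)."""
    h, w = img.shape
    rows = (np.arange(out_h) * h / out_h).astype(int)
    cols = (np.arange(out_w) * w / out_w).astype(int)
    return img[rows][:, cols]

def preprocess(raw):
    img = to_8bit(raw)
    img = crop_breast(img)
    img = resize_nn(img)
    return np.stack([img] * 3, axis=-1)  # replicate single channel to RGB
```

Each LE or RC frame would pass through this pipeline before being fed to the network.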

Workflow of MDCS

The whole workflow of MDCS is depicted in Figure 2. To achieve a fully automatic diagnostic process for breast cancer, we propose MDCS, which comprises one detector and one classifier. The deep-learning detector first takes the RC/LE image pairs from one case as inputs and directly predicts the spatial coordinates and confidence score (reflecting the confidence level of the prediction) of localized lesions. The cropped lesions identified by the detector in four images (LE/RC image pairs in the CC and MLO views) are then fed into the classifier of MDCS, which produces a final probability determining whether they are benign or malignant. The source code of MDCS is publicly available at GitHub (https://github.com/yyyhd/MDCS).

Figure 2.


Workflow of MDCS for detection and classification of breast lesions. (A) Detector of MDCS. It took the RC/LE image pairs from one case as inputs and directly predicted the spatial coordinates and confidence score of localized lesions; (B) Classifier of MDCS. The cropped regions of breast lesions in four images (LE/RC image pairs in CC and MLO views) were fed into two pathways; each pathway output one view-level probability of a malignant tumor, and the final patient-level result was calculated by averaging the scores of the two pathways. MDCS, multiprocess detection and classification system; RC, recombined; LE, low-energy; CC, craniocaudal; MLO, medio-lateral-oblique.

Double-branch deep network detector based on view-level detection

During the detection stage, we built a double-branch deep network detector (Figure 2A) with RetinaNet (28) as the backbone, after comparing several classic deep-learning detectors [RetinaNet, Faster RCNN (29), SSD (30) and YOLO V7]. The RC/LE image pairs from one case are taken as inputs to directly predict the spatial coordinates and confidence scores of breast lesions. The primary framework of our detector consists of two components: a double-branch backbone with a feature pyramid network (FPN) (31) that extracts fused features from the LE and RC images, and multiscale detection heads used for classification and regression to generate the final detection result. The detailed network structure is shown in Supplementary Figure S1.

Figure S1.


Detailed structure of detector of MDCS. (A) Structure of deep-learning detector; (B) Structure of detection head in this detector. MDCS, multiprocess detection and classification system; AFF, auxiliary feature fusion; DH, detection head; ECA, efficient channel attention; SPA, spatial attention; RELU, rectified linear unit.

We consider that RC images might provide more useful information for tumor detection. Therefore, to learn more useful information from CEM images, an auxiliary feature fusion (AFF) strategy (Supplementary Figure S2), inspired by the interactive learning method (32), is used in the backbone to fuse the features of RC and LE images, treating the RC branch as the main line to obtain more comprehensive information. Supplementary materials describe the network structure in more detail. In addition, a non-local module (33), which captures context information by computing interactions between any two pixel positions, is added to the middle-layer feature extraction branch. In the detection head, an efficient channel attention module (34) and a spatial attention module (35) are introduced separately, allowing the subnets to pay more attention to the channels related to classification (here, the background and foreground classes) and to assign more weight to features carrying location information.
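The full AFF structure is given in Supplementary Figure S2; as a loose, hypothetical illustration of the general idea only (one branch kept as the main line, the other injected after channel re-weighting) and not the authors' implementation, the fusion step could look like:

```python
import numpy as np

def channel_gate(feat):
    """Per-channel gate from global average pooling (a crude stand-in for the
    ECA-style channel attention described in the paper)."""
    pooled = feat.mean(axis=(1, 2))        # (C,): one summary value per channel
    return 1.0 / (1.0 + np.exp(-pooled))   # sigmoid -> weights in (0, 1)

def auxiliary_fuse(main_feat, aux_feat):
    """Keep the main-branch feature map (C, H, W) intact and add the auxiliary
    branch after channel re-weighting, so the main line dominates the fusion.
    In the detector the RC branch is the main line; in the classifier, LE."""
    w = channel_gate(aux_feat)             # (C,)
    return main_feat + w[:, None, None] * aux_feat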

Figure S2.


Framework of AFF algorithm. Detailed structure of AFF algorithm (A), ECA module (B), and SPA module (C). AFF, auxiliary feature fusion; DCA, double convolutional attention; ECA, efficient channel attention; SPA, spatial attention.

All of the above-mentioned modules and the three feature fusion methods were verified one by one through a series of ablation experiments on the training set and internal testing set. Further details of the ablation analysis are given in Supplementary materials.

Siamese classifier network based on combination of view-level results

For the classifier of MDCS, we employed a Siamese framework (Figure 2B) in which the first pathway processes the image patch extracted from the detected tumor region in the RC-CC/LE-CC images, and the second processes the patch extracted from the RC-MLO/LE-MLO images. The two pathways share the same basic structure, DenseNet121 (36). Classifiers using ResNet50 (37), ResNet101 (37), or Inception V4 as the baseline were also trained separately for comparison. Similar to the detector of MDCS, patch pairs from the RC and LE images are fed separately into the two branches of each pathway, and their final feature maps are fused using the auxiliary fusion method; correspondingly, the LE branch serves as the main line in the classifier. Each view-level pathway outputs the probability of a malignant tumor, and the patient-level result is calculated by averaging the scores of the two pathways. Additionally, the loss function of the classifier was changed to focal loss (28) to address class imbalance in our data. We also explored the performance of classifiers with single views and with other feature fusion methods; the comparative experiments are further described in Supplementary materials.
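The view-level-to-patient-level aggregation described above is simply an average of the two pathway outputs. In code (a trivial sketch; the 0.5 decision threshold here is an assumption, since the paper selects its cut-off to preserve maximum sensitivity):

```python
def patient_level_probability(p_cc, p_mlo):
    """Average the malignancy probabilities from the CC and MLO pathways."""
    return (p_cc + p_mlo) / 2.0

def classify_patient(p_cc, p_mlo, threshold=0.5):
    """Patient-level benign/malignant call from the two view-level scores."""
    return "malignant" if patient_level_probability(p_cc, p_mlo) >= threshold else "benign"
```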

Evaluation of tumor localization

After a lesion is detected, the intersection over union (IoU), defined as the ratio of the overlap area to the union area of the predicted tumor bounding box and the ground-truth box, is calculated. If IoU≥0.5, the prediction is counted as a true positive (TP); otherwise (IoU<0.5), it is an FP. Based on this, the detection performance of MDCS is evaluated using free-response receiver operating characteristic (FROC) analysis. The FROC curve plots sensitivity (the fraction of all lesions correctly localized) against the number of FPs per image as the threshold varies. The average FROC score (38), computed by averaging the sensitivity over a defined range of FP rates, is then used to evaluate the overall performance of detectors.
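The TP/FP rule above amounts to the standard IoU computation, e.g. (a minimal sketch with boxes given as (x1, y1, x2, y2) corners):

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(ix2 - ix1, 0) * max(iy2 - iy1, 0)   # overlap area (0 if disjoint)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def is_true_positive(pred_box, gt_box, thr=0.5):
    """A detection counts as TP when its IoU with the ground truth is >= thr."""
    return iou(pred_box, gt_box) >= thr
```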

Training process of MDCS

During the detection stage, we trained all detectors for 100 epochs with the Adam optimizer, with a momentum of 0.9 and a weight decay of 0.0001. The learning rate started at 0.001 and followed a cosine annealing schedule. After lesion detection, the detected lesion coordinates were expanded by 15 pixels on each side, and the lesion was then cropped from the original image according to the new coordinates. The cropped images were resized to 224×224 pixels and fed into the classifier. The SGD optimizer with a batch size of 8 and a weight decay of 0.0005 was used to train each classifier for 150 epochs. Both detectors and classifiers used focal loss as the loss function. A five-fold cross-validation strategy was employed to train MDCS, and the system was initialized from models pretrained on ImageNet. This study was implemented in Python 3.8 with the open-source PyTorch 1.8.2 library. Experiments were performed on a GPU-optimized workstation with four NVIDIA TITAN RTX cards (24 GB, Turing architecture).
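For reference, the two training ingredients named above (binary focal loss and a cosine-annealed learning rate) can be written out as follows. This is a stdlib sketch, not the authors' PyTorch code; α=0.25 and γ=2 are the defaults from the focal loss paper and are assumptions here.

```python
import math

def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Binary focal loss for predicted probability p and label y in {0, 1}.
    Down-weights easy examples to counter class imbalance; reduces to
    alpha-weighted cross-entropy when gamma = 0."""
    pt = p if y == 1 else 1.0 - p        # probability assigned to the true class
    a = alpha if y == 1 else 1.0 - alpha
    return -a * (1.0 - pt) ** gamma * math.log(pt)

def cosine_annealed_lr(epoch, total_epochs=100, base_lr=1e-3):
    """Cosine annealing from base_lr at epoch 0 toward 0 at the final epoch."""
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * epoch / total_epochs))
```

In PyTorch the schedule would typically be handled by `torch.optim.lr_scheduler.CosineAnnealingLR` rather than by hand.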

Radiologists and MDCS comparison and two-round reader study

To investigate the clinical practicability of MDCS, six radiologists with different levels of experience in breast imaging (three junior radiologists with 4−8 years of experience; three senior radiologists with 10−15 years of experience) participated in this reader study. All radiologists were provided with CEM images and corresponding clinical information (including age, tumor diameter, etc.) and performed breast cancer diagnosis twice. In the first round, blinded to all pathological information and to the output of MDCS, the radiologists reviewed all CEM images in the pooled external and prospective testing sets and classified each patient as malignant or benign based on their expertise and experience. In the second round, the radiologists could adjust their initial diagnoses according to the classification probability given by MDCS for each patient. The time each radiologist spent analyzing one patient's CEM images was recorded with an external clock, and the reading time of MDCS, including the detection and classification time, was recorded automatically by the computer's internal timing system. This process was performed three times, after which we calculated and compared the average reading times of the radiologists and of MDCS. Finally, the diagnostic outcomes of radiologists working individually and of MDCS alone were compared in both the senior and junior groups, and a comprehensive investigation was conducted to further explore MDCS-assisted performance.

MDCS’s visual explanation and analysis for representative cases

To understand MDCS's potential utility as a decision support tool, gradient-weighted class activation mapping (Grad-CAM) (39) was applied to the final convolutional layer of the classifier of MDCS. This allowed us to generate heatmaps highlighting the regions of the CEM images most relevant to the predicted results. We also selected several representative cases from the pooled external testing set for further analysis in combination with their heatmaps.

Statistical analysis

In this study, all statistical analyses and graphing were performed using R software (Version 4.1.2; R Foundation for Statistical Computing, Vienna, Austria, http://www.Rproject.org) and Python (Version 3.8.8; https://www.python.org/). For clinical characteristics, continuous variables were described as mean±standard deviation and compared with the two-sample t-test, and categorical variables were described as number and percentage. Detection performance was assessed by the FROC curve and the average FROC score. The ROC curve and the area under the curve (AUC) were used to compare the diagnostic performance of different classification models. The value that allowed MDCS to maintain maximum sensitivity was selected as the cut-off threshold. Sensitivity, specificity and accuracy with 95% confidence intervals (95% CIs), calculated by bootstrapping with 1,000 resamples, are reported for both MDCS and radiologists. The DeLong test was used to compare AUCs. Improvement in the predictive performance of the models was evaluated by calculating the integrated discrimination improvement (IDI) and the net reclassification improvement (NRI) using the R package "PredictABEL". The radiologists' performance with and without deep-learning assistance was compared using McNemar's χ2 test and the paired-sample t-test. In addition, we analyzed the NRI of all radiologists with and without MDCS assistance using the R package "nricens". P<0.05 was considered statistically significant. Interobserver agreement was assessed with Cohen's Kappa and Fleiss' Kappa.
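The bootstrap interval for a metric such as accuracy can be reproduced in a few lines (a generic percentile-bootstrap sketch using 1,000 resamples as in the paper; not the authors' R code):

```python
import random

def bootstrap_ci(correct_flags, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for a proportion (e.g. accuracy), given
    per-case 0/1 correctness flags. Resamples cases with replacement."""
    rng = random.Random(seed)
    n = len(correct_flags)
    stats = sorted(
        sum(rng.choice(correct_flags) for _ in range(n)) / n
        for _ in range(n_boot)
    )
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[min(int((1 - alpha / 2) * n_boot), n_boot - 1)]
    return lo, hi
```

The same resampling scheme applies to sensitivity and specificity by restricting the flags to the positive or negative cases, respectively.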

Results

Clinical characteristics

The clinical features of all datasets were obtained from the medical records, and the baseline characteristics are provided in Table 1. Altogether, 1,903 patients were enrolled in this study: the training set included 1,355 patients (1,022 with breast cancer, 333 with benign lesions), the internal testing set 352 patients (253 with breast cancer, 99 with benign lesions), the pooled external testing set 95 patients (66 with breast cancer, 29 with benign lesions), and the prospective testing set 101 patients (72 with breast cancer, 29 with benign lesions). The average age of patients was 58.04±11.68 (range, 19−74) years in the training set, 52.30±11.74 (range, 20−78) years in the internal testing set, 49.32±10.23 (range, 21−78) years in the pooled external testing set and 51.97±12.18 (range, 19−77) years in the prospective testing set.

Table 1. Clinical characteristics of patients.

| Characteristics | Training set (n=1,355) [n (%)] | Internal testing set (n=352) [n (%)] | P | Pooled external testing set (n=95) [n (%)] | P | Prospective testing set (n=101) [n (%)] | P |
|---|---|---|---|---|---|---|---|
| Age (year) (mean±SD) | 58.04±11.68 | 52.30±11.74 | 0.754 | 49.32±10.23 | 0.014 | 51.97±12.18 | 0.805 |
| Diameter (cm) (mean±SD) | 2.30±1.15 | 2.29±1.28 | 0.965 | 2.71±1.29 | 0.006 | 2.60±1.41 | 0.048 |
|  <1 | 120 (8.86) | 39 (11.08) | | 2 (2.11) | | 5 (4.95) | |
|  1−2 | 539 (39.77) | 136 (38.63) | | 31 (32.63) | | 35 (34.65) | |
|  >2 | 696 (51.37) | 177 (50.29) | | 62 (65.26) | | 61 (60.40) | |
| Benign lesions | 333 (24.58) | 99 (28.13) | | 29 (30.53) | | 29 (28.71) | |
|  Fibroadenoma | 159 (47.75) | 45 (45.45) | | 14 (48.27) | | 13 (44.83) | |
|  Adenosis | 79 (23.72) | 26 (26.27) | | 13 (44.83) | | 10 (34.48) | |
|  Intraductal papilloma | 34 (10.21) | 12 (12.12) | | 2 (6.90) | | 4 (13.79) | |
|  Inflammation | 8 (2.40) | 2 (2.02) | | 0 (0) | | 1 (3.45) | |
|  Fibrocystic disease | 10 (3.00) | 5 (5.05) | | 0 (0) | | 1 (3.45) | |
|  Phyllodes tumor | 19 (5.71) | 4 (4.04) | | 0 (0) | | 0 (0) | |
|  Unknown/other | 24 (7.21) | 5 (5.05) | | 0 (0) | | 0 (0) | |
| Malignant lesions | 1,022 (75.42) | 253 (71.87) | | 66 (69.47) | | 72 (71.29) | |
|  Invasive ductal carcinoma | 897 (87.77) | 223 (88.14) | | 46 (69.69) | | 55 (76.39) | |
|  Ductal carcinoma in situ | 18 (1.76) | 6 (2.37) | | 8 (12.12) | | 7 (9.72) | |
|  Invasive lobular carcinoma | 24 (2.35) | 6 (2.37) | | 1 (1.52) | | 3 (4.17) | |
|  Papillary carcinoma | 10 (0.98) | 7 (2.77) | | 0 (0) | | 1 (1.39) | |
|  Mucinous adenocarcinoma | 14 (1.37) | 6 (2.37) | | 0 (0) | | 5 (6.94) | |
|  Unknown/other | 59 (5.77) | 5 (1.98) | | 11 (16.67) | | 1 (1.39) | |

Detection performance of MDCS

Figure 3A shows that MDCS yielded AFROC-Scores of 0.953 and 0.963 in the pooled external and prospective testing sets, respectively. Figure 4 displays the detection performance of MDCS for two CEM cases. MDCS outperformed all classic deep-learning detectors; among the four classic detectors, RetinaNet performed best, with AFROC-Scores of 0.921 and 0.876 when taking RC images and LE images as input, respectively (Supplementary Figure S3). We also found that all classic detectors achieved a higher AFROC-Score when taking RC images as input. Likewise, when LE images were treated as the dominant factor in feature fusion, the AFROC-Score was only 0.803 (Supplementary Table S1), and simple concatenation obtained an AFROC-Score of 0.942, which was 0.043 lower than the fusion strategy treating RC images as the dominant factor. These results accord with the fact that lesion detection on CEM relies mainly on RC images.

Figure 3.


Detection and classification performance of MDCS in all datasets. (A) FROC curve which shows the performance of lesion detection for MDCS in all datasets; (B) ROC curve which shows the performance of lesion classification for MDCS in all datasets. MDCS, multiprocess detection and classification system; FROC, free-response receiver operating characteristic; ROC, receiver operating characteristic; AFROC-Score, average free-response receiver operating characteristic score; AUC, area under the receiver operating characteristic curve.

Figure 4.


Detection performance for two CEM cases in two views. The CC-view RC image (A) and LE image (B), and the MLO-view RC image (C) and LE image (D), respectively. Yellow bounding boxes mark lesion areas labeled by radiologists; red bounding boxes were predicted by MDCS. (I) Images of a 36-year-old female diagnosed with invasive ductal carcinoma; the breast mass measures 1.3 cm. MDCS correctly detected the lesion with confidence scores of 92.03% and 90.55% in the CC and MLO views, respectively; (II) Images of a 55-year-old female with fibroadenoma; the breast mass measures 0.8 cm. MDCS correctly detected the lesion with confidence scores of 86.49% and 72.60% in the CC and MLO views, respectively. CEM, contrast-enhanced mammography; CC, craniocaudal; RC, recombined; LE, low-energy; MLO, medio-lateral-oblique; MDCS, multiprocess detection and classification system.

Figure S3.


Contrast of performance between MDCS and four classic deep-learning detectors in the internal testing set. (A) ROC curve and AUC of classic detectors with single-RC input and detector of MDCS; (B) ROC curve and AUC of detectors with single-LE input and detector of MDCS. MDCS, multiprocess detection and classification system; ROC, receiver operating characteristic; AUC, area under the receiver operating characteristic curve; RC, recombined; LE, low-energy.

Table S1. Results of the ablation analysis of the detector in the internal testing set.

Module AFROC-Score
Non-local ECA SPA concatenate auxiliary fusion (LE) auxiliary fusion (RC)
ECA, efficient channel attention; SPA, spatial attention; LE, low-energy; RC, recombined; AFROC-Score, average free-response receiver operating characteristic score.
0.940
0.957
0.964
0.960
0.977
0.974
0.969
0.942
0.803
0.985

Classification performance of MDCS

MDCS also performed outstandingly in the classification task. Figure 3B and Supplementary Table S2 show that MDCS generalizes well and achieves excellent results on all testing sets, with AUCs of 0.932 (95% CI: 0.909−0.955) in the internal testing set, 0.909 (95% CI: 0.822−0.996) in the pooled external testing set, and 0.912 (95% CI: 0.840−0.985) in the prospective testing set. As shown in Supplementary Figure S4, MDCS far exceeded the three models using other baselines (ResNet50, ResNet101 and Inception V4). The single-LE classifier performed significantly better than the single-RC classifier in the internal testing set [AUC: 0.917 (95% CI: 0.883−0.934) vs. 0.837 (95% CI: 0.813−0.889), P=0.02]. When the LE image was treated as the dominant factor in feature fusion, MDCS obtained the best performance among the three strategies on the training set and internal testing set. These results all indicate that LE images provide more useful information for classification than RC images. We also observed better classification results with double views than with a single view, underscoring the importance of multiview information. Moreover, all IDI and NRI values in the comparisons between MDCS and the other classifiers were positive, and the majority were statistically significant (Supplementary Figure S4), signifying that the added modules effectively enhanced model performance.

Table S2. Classification results of MDCS in the training set, internal testing set, pooled external testing set and prospective testing set.

| Variables | Training set | Internal testing set | Pooled external testing set | Prospective testing set |
|---|---|---|---|---|
| AUC (95% CI) | 0.984 (0.979−0.989) | 0.932 (0.909−0.955) | 0.909 (0.822−0.996) | 0.912 (0.840−0.985) |
| Sensitivity (95% CI) | 0.955 (0.926−0.949) | 0.941 (0.900−0.967) | 0.954 (0.864−0.988) | 0.947 (0.845−0.986) |
| Specificity (95% CI) | 0.916 (0.894−0.934) | 0.790 (0.727−0.841) | 0.655 (0.457−0.841) | 0.714 (0.511−0.860) |
| Accuracy (95% CI) | 0.938 (0.926−0.949) | 0.868 (0.832−0.898) | 0.852 (0.765−0.917) | 0.871 (0.780−0.934) |

MDCS, multiprocess detection and classification system; AUC, area under the receiver operating characteristic curve; 95% CI, 95% confidence interval.

Figure S4.


Results of the comparative experiments for classifiers in the internal testing set. (A) Contrast of several classifiers with different baseline networks; (B) Contrast of performance of several classifiers with different ways of feature fusion; (C) Contrast of performance of the CC-view classifier, MLO-view classifier and the classifier with double views; (D) Contrast of performance of the LE classifier, RC classifier and the classifier with both LE and RC images; (E) IDI and NRI from comparisons of the above classifiers with MDCS. CC, craniocaudal; MLO, medio-lateral-oblique; LE, low-energy; RC, recombined; IDI, integrated discrimination improvement; NRI, net reclassification improvement; MDCS, multiprocess detection and classification system. *, P<0.05; **, P<0.001.

Comparison between MDCS and six radiologists

The classification performance of the radiologists and MDCS is reported in Table 2 and Supplementary Table S3. MDCS yielded higher performance than the three junior radiologists and a level comparable to the three senior radiologists on the pooled external testing set. On the prospective testing set, the diagnostic performance of MDCS [sensitivity: 0.947 (95% CI: 0.845−0.986), specificity: 0.714 (95% CI: 0.511−0.860), accuracy: 0.871 (95% CI: 0.780−0.934)] exceeded the average diagnostic performance of both the junior [sensitivity: 0.936 (95% CI: 0.849−1.000), specificity: 0.345 (95% CI: 0.118−0.571), accuracy: 0.766 (95% CI: 0.691−0.840)] and senior [sensitivity: 0.908 (95% CI: 0.855−0.961), specificity: 0.529 (95% CI: 0.267−0.791), accuracy: 0.798 (95% CI: 0.761−0.837)] radiologists. Additionally, radiologists took approximately 3.2 min on average to complete a breast CEM diagnostic task, whereas MDCS took only 5 seconds.

Table 2. Diagnostic performance for MDCS and radiologists with and without MDCS assistance in pooled external and prospective testing sets.

| Variables | Sensitivity (95% CI), pooled external | Specificity (95% CI), pooled external | Accuracy (95% CI), pooled external | Fleiss's Kappa | Sensitivity (95% CI), prospective | Specificity (95% CI), prospective | Accuracy (95% CI), prospective | Fleiss's Kappa |
|---|---|---|---|---|---|---|---|---|
| MDCS | 0.954 (0.864−0.988) | 0.655 (0.457−0.841) | 0.852 (0.765−0.917) | | 0.947 (0.845−0.986) | 0.714 (0.511−0.860) | 0.871 (0.780−0.934) | |
| Radiologists without MDCS assistance | | | | | | | | |
| Junior R1 | 0.908 (0.803−0.962)* | 0.400 (0.232−0.592)** | 0.747 (0.648−0.831) | | 0.903 (0.804−0.957) | 0.310 (0.160−0.510)** | 0.733 (0.635−0.816)** | |
| Junior R2 | 0.984 (0.906−1.000) | 0.333 (0.179−0.529)** | 0.779 (0.682−0.858) | | 0.972 (0.894−0.995) | 0.276 (0.134−0.475)** | 0.772 (0.678−0.850)** | |
| Junior R3 | 0.939 (0.844−0.980) | 0.552 (0.360−0.730) | 0.821 (0.729−0.892) | | 0.931 (0.839−0.974) | 0.448 (0.270−0.640) | 0.792 (0.670−0.866) | |
| Average junior | 0.943 (0.849−1.000) | 0.428 (0.150−0.707) | 0.782 (0.690−0.875) | 0.188 | 0.936 (0.849−1.000) | 0.345 (0.118−0.571) | 0.766 (0.691−0.840) | 0.123 |
| Senior R4 | 0.924 (0.825−0.972) | 0.621 (0.424−0.787) | 0.832 (0.741−0.901) | | 0.889 (0.787−0.947) | 0.621 (0.424−0.787) | 0.812 (0.722−0.883) | |
| Senior R5 | 0.924 (0.825−0.972) | 0.759 (0.561−0.890) | 0.874 (0.790−0.933) | | 0.903 (0.804−0.957)* | 0.552 (0.360−0.730) | 0.802 (0.711−0.875) | |
| Senior R6 | 0.939 (0.778−0.925) | 0.690 (0.490−0.840) | 0.863 (0.778−0.925) | | 0.931 (0.839−0.974) | 0.414 (0.241−0.609)** | 0.782 (0.689−0.858) | |
| Average senior | 0.929 (0.907−0.951) | 0.700 (0.519−0.861) | 0.856 (0.802−0.910) | 0.525 | 0.908 (0.855−0.961) | 0.529 (0.267−0.791) | 0.798 (0.761−0.837) | 0.328 |
| Radiologists with MDCS assistance | | | | | | | | |
| Junior R1 | 0.939 (0.842−0.980) | 0.533 (0.346−0.712)# | 0.811 (0.717−0.884)## | | 0.944 (0.857−0.982) | 0.379 (0.213−0.576) | 0.782 (0.689−0.858)## | |
| Junior R2 | 0.984 (0.906−1.000) | 0.400 (0.232−0.592)# | 0.800 (0.705−0.875) | | 0.958 (0.875−0.989) | 0.379 (0.213−0.576) | 0.792 (0.670−0.866) | |
| Junior R3 | 0.955 (0.864−0.988) | 0.552 (0.360−0.730) | 0.832 (0.741−0.901) | | 0.958 (0.875−0.989) | 0.483 (0.299−0.671) | 0.822 (0.733−0.891) | |
| Average junior | 0.959 (0.903−1.000) | 0.495 (0.290−0.701) | 0.814 (0.774−0.855)# | 0.353 | 0.953 (0.933−0.973) | 0.414 (0.265−0.563) | 0.799 (0.747−0.850)# | 0.231 |
| Senior R4 | 0.955 (0.864−0.988) | 0.655 (0.457−0.841) | 0.863 (0.777−0.925) | | 0.917 (0.821−0.966) | 0.655 (0.457−0.814) | 0.842 (0.756−0.907) | |

MDCS, multiprocess detection and classification system; R, radiologist; 95% CI, 95% confidence interval; *, indicates a statistically significant difference between MDCS and radiologists without MDCS assistance (*, P<0.05; **, P<0.001); #, indicates a statistically significant difference between radiologists without and with MDCS assistance (#, P<0.05; ##, P<0.001).
Senior R5 0.955
(0.864−0.988)
0.724
(0.525−0.866)
0.884
(0.802−0.941)
0.931
(0.839−0.974)
0.586
(0.391−0.759)
0.832
(0.744−0.899)
Senior R6 0.970
(0.885−0.995)
0.759
(0.561−0.890)
0.905
(0.828−0.956)
0.958
(0.875−0.989)
0.448
(0.270−0.640)
0.821
(0.722−0.883)
Average senior 0.960
(0.938−0.982)##
0.713
(0.581−0.844)
0.884
(0.832−0.936)#
0.659 0.935
(0.884−0.987)
0.563
(0.301−0.825)
0.832
(0.806−0.857)#
0.438

Table S3. NRI of comparisons of MDCS and radiologists in pooled external and prospective testing sets.

| Radiologist | Pooled external testing set: IDI (95% CI) | P | Prospective testing set: NRI (95% CI) | P |
|---|---|---|---|---|
| R1 | 0.312 (0.188−0.406) | 0.041 | 0.456 (0.335−0.491) | <0.001 |
| R2 | 0.302 (0.196−0.411) | 0.030 | 0.397 (0.139−0.370) | 0.370 |
| R3 | 0.246 (0.138−0.359) | 0.122 | 0.237 (0.175−0.331) | 0.026 |
| R4 | 0.197 (−0.015−0.318) | 0.076 | 0.294 (0.198−0.353) | <0.001 |
| R5 | 0.045 (−0.003−0.202) | 0.286 | 0.166 (0.082−0.246) | 0.004 |
| R6 | 0.164 (−0.005−0.339) | 0.009 | 0.218 (0.151−0.281) | <0.001 |

NRI, net reclassification index; MDCS, multiprocess detection and classification system; IDI, integrated discrimination improvement; 95% CI, 95% confidence interval; R, radiologist.
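For binary (benign/malignant) calls, the categorical NRI summarized in Tables S3 and S4 counts how often a reader's classification moves toward, or away from, the correct label between two reading conditions. The authors' statistical code is not published; the sketch below is a minimal illustration of the standard formula, NRI = (up − down)/events + (down − up)/non-events, with hypothetical labels.

```python
def net_reclassification_index(truth, old_pred, new_pred):
    """Categorical NRI for binary predictions (1 = malignant, 0 = benign).

    "Up" means reclassified toward malignant, "down" toward benign.
    Positive values indicate net improvement of new_pred over old_pred.
    """
    ev_up = ev_down = ne_up = ne_down = n_ev = n_ne = 0
    for y, old, new in zip(truth, old_pred, new_pred):
        if y == 1:                      # event (malignant lesion)
            n_ev += 1
            ev_up += new > old
            ev_down += new < old
        else:                           # non-event (benign lesion)
            n_ne += 1
            ne_up += new > old
            ne_down += new < old
    return (ev_up - ev_down) / n_ev + (ne_down - ne_up) / n_ne
```

With two malignant and two benign cases where the new reading fixes one error of each kind, the index reaches its maximum of 1.0.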

Performance of radiologists with MDCS assistance

Figure 5 and Table 2 compare radiologists' ability to classify breast lesions on CEM images with and without MDCS assistance. The average sensitivity, specificity and accuracy of both junior and senior radiologists improved to varying degrees with MDCS assistance on the pooled external and prospective testing sets. In particular, the average sensitivity of senior radiologists in the pooled external testing set improved significantly [0.929 (95% CI: 0.907−0.951) vs. 0.960 (95% CI: 0.938−0.982), P<0.001], and the average specificity of junior radiologists in the prospective testing set improved by 0.069 (P=0.450) with the help of MDCS. In Supplementary Table S4, all radiologists achieved positive NRI values, indicating a positive improvement with MDCS assistance; notably, junior radiologists attained the greater clinical benefit on both the pooled external and prospective testing sets. Moreover, the Fleiss's Kappa values of junior and senior radiologists, which reflect inter-observer agreement, increased by 0.165 and 0.134, respectively, on the pooled external testing set, and by 0.108 and 0.110, respectively, on the prospective testing set when MDCS assisted. The agreement between all pairs of radiologists also improved with MDCS assistance (Figure 6).
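Fleiss' kappa, the agreement statistic reported in Table 2, generalizes inter-rater agreement to more than two readers by comparing observed per-case agreement with chance agreement. As an illustrative sketch (not the authors' code), the standard formula can be computed directly from a cases-by-categories count matrix:

```python
def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of rows, one per case.

    ratings[i][j] = number of raters assigning case i to category j;
    every row must sum to the same number of raters n.
    """
    N = len(ratings)                    # number of cases
    n = sum(ratings[0])                 # raters per case
    k = len(ratings[0])                 # number of categories
    # overall proportion of assignments falling in each category
    p_j = [sum(row[j] for row in ratings) / (N * n) for j in range(k)]
    # observed agreement for each case
    P_i = [(sum(c * c for c in row) - n) / (n * (n - 1)) for row in ratings]
    P_bar = sum(P_i) / N
    Pe_bar = sum(p * p for p in p_j)    # chance agreement
    return (P_bar - Pe_bar) / (1 - Pe_bar)
```

Three raters who agree unanimously on every case yield kappa = 1; values near 0 indicate chance-level agreement, which is why the increases reported above imply more consistent reading with MDCS assistance.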

Figure 5.

ROC curve of MDCS and performance of radiologists with and without MDCS assistance in the pooled external and prospective testing sets. The curve for each radiologist is formed by connecting the three points (0, 0), (1−specificity, sensitivity) and (1, 1). (A,C) Performance of lesion classification for MDCS and three junior radiologists with and without MDCS assistance in the pooled external (A) and prospective (C) testing sets; (B,D) performance of lesion classification for MDCS and three senior radiologists with and without MDCS assistance in the pooled external (B) and prospective (D) testing sets. In the junior group, Radiologist 1 had 5 years of experience, Radiologist 2 had 4 years and Radiologist 3 had 8 years; in the senior group, Radiologist 4 had 10 years of experience, Radiologist 5 had 16 years and Radiologist 6 had 18 years. ROC, receiver operating characteristic; MDCS, multiprocess detection and classification system.

Table S4. NRI of comparisons of radiologists with and without MDCS assistance in pooled external and prospective testing sets.

| Radiologist | Pooled external testing set: IDI (95% CI) | P | Prospective testing set: NRI (95% CI) | P |
|---|---|---|---|---|
| R1 | 0.164 (0.049−0.252) | 0.041 | 0.211 (0.122−0.252) | 0.006 |
| R2 | 0.166 (0.066−0.209) | 0.026 | 0.041 (−0.065−0.154) | <0.001 |
| R3 | 0.045 (−0.032−0.089) | 0.048 | 0.134 (0.095−0.245) | 0.076 |
| R4 | 0.112 (0.015−0.234) | <0.001 | 0.082 (0.027−0.110) | 0.019 |
| R5 | 0.031 (0.019−0.072) | 0.107 | 0.014 (−0.068−0.094) | 0.033 |
| R6 | 0.097 (0.014−0.206) | <0.001 | 0.040 (−0.004−0.136) | 0.162 |

NRI, net reclassification index; MDCS, multiprocess detection and classification system; IDI, integrated discrimination improvement; 95% CI, 95% confidence interval; R, radiologist.

Figure 6.

Agreement degree of pairs of radiologists without and with MDCS assistance in the pooled external and prospective testing sets. The first row displays the agreement degree of pairs of the six radiologists without (A) and with (B) MDCS assistance in the pooled external testing set, and the second row displays the agreement degree of the same pairs without (C) and with (D) MDCS assistance in the prospective testing set. Numbers 1−6 denote the six radiologists. MDCS, multiprocess detection and classification system.

Visual explanation of MDCS and analysis of representative cases

In the heatmaps, red and yellow regions carry higher predictive significance than green and blue regions. Figure 7A,B show that, for correctly diagnosed cases, MDCS attends to the lesion area just as radiologists do. In Figure 7C, both a radiologist and MDCS judged a lesion with invasive ductal carcinoma to be benign. From MDCS's perspective, the lesion is obscured by gland tissue in the LE images and exhibits heterogeneous enhancement; consequently, the MDCS classifier that takes LE images as the main line cannot accurately focus on the entire lesion area and fails to recognize the malignant characteristics of the lesion. Several similar cases misdiagnosed by MDCS were found during testing; therefore, in actual clinical settings, radiologists should further examine this type of lesion after the MDCS diagnosis. The radiologist's misreading may be attributed to the small size and insignificant enhancement of the lesion; this junior radiologist's limited experience and expertise in CEM diagnosis may also be a contributing factor. In Figure 7D, owing to disordered gland structure, uneven enhancement and the irregular shape of the lesion, five of the six radiologists incorrectly judged the lesion as suspicious for malignancy, whereas our system identified it correctly; during the second reading round, four radiologists corrected their final decisions with MDCS assistance. This suggests that MDCS can learn detailed features that are difficult to observe visually and has the potential to complement radiologists in breast cancer diagnosis.
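The heatmaps in Figure 7 are produced with Grad-CAM (reference 39): gradients of the malignancy score with respect to a convolutional layer are pooled into per-channel weights, the weighted feature maps are summed, and a ReLU keeps only positively contributing regions. The framework-free sketch below illustrates that computation on pre-extracted activations and gradients; it is not the authors' implementation, and in practice both arrays would come from a backward pass through the trained network.

```python
import numpy as np

def grad_cam(feature_maps, gradients):
    """Grad-CAM heatmap from one conv layer.

    feature_maps: (C, H, W) activations of the chosen layer.
    gradients:    (C, H, W) gradients of the target score w.r.t. them.
    Returns an (H, W) map normalized to [0, 1].
    """
    # global-average-pool the gradients to get one weight per channel
    weights = gradients.mean(axis=(1, 2))              # (C,)
    cam = np.tensordot(weights, feature_maps, axes=1)  # (H, W)
    cam = np.maximum(cam, 0)                           # ReLU: keep positive evidence
    if cam.max() > 0:
        cam /= cam.max()                               # normalize for display
    return cam
```

The resulting map is upsampled to image resolution and overlaid as a color map, with values near 1 rendered red/yellow and values near 0 rendered blue, matching the reading convention described above.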

Figure 7.

CEM images and heatmaps of four CC-view breast lesions. The heatmaps obtained by MDCS from the CEM images visualize the most indicative regions; red and yellow regions have higher predictive significance than green and blue regions. (A) Images of a 58-year-old female with invasive ductal carcinoma. CEM images show a 2.5 cm mass, BI-RADS 4C. MDCS and all radiologists correctly predicted the malignant lesion. (B) Images of a 36-year-old female with fibroadenoma. CEM images show a 6.4 cm mass. MDCS and all radiologists correctly predicted the benign lesion. (C) Images of a 48-year-old female with invasive ductal carcinoma. CEM images show a 1.5 cm mass, BI-RADS 4B. MDCS and one radiologist erroneously predicted the malignant lesion as benign. (D) Images of a 50-year-old female. CEM images show a 1.2 cm mass, BI-RADS 4A. Five radiologists erroneously predicted it as malignant, whereas MDCS and a senior radiologist made the correct prediction. CEM, contrast-enhanced mammography; CC, craniocaudal; MDCS, multiprocess detection and classification system; BI-RADS, Breast Imaging Reporting and Data System.

Discussion

In this work, we built a fully automatic multiprocess deep-learning system capable of rapidly detecting and classifying breast lesions on CEM images. A novel auxiliary feature fusion algorithm was applied in the system to learn richer features from CEM images. Evaluated on a multicenter external dataset and a prospective cohort, the generalization and repeatability of MDCS were fully confirmed. Furthermore, a two-round reader study showed that MDCS not only has latent clinical practicability to streamline the clinical workflow but also improves the performance of radiologists of different seniority levels.

Our study has the following advantages. The first strength is the automated capability of MDCS. Assessing CEM images is an iterative and labor-intensive process: during the actual review, suspicious breast abnormalities must first be identified and then further characterized for diagnosis. A multiprocess deep-learning system with an automatic detector can greatly streamline this workflow. Skarping et al. (40) used a continuous system to detect breast tumors on digital mammograms and predict the response to neoadjuvant chemotherapy in breast cancer patients. Zhang et al. (41) implemented a fully automatic deep-learning method using Mask R-CNN, providing a feasible way to search breast MRI and to localize and segment lesions. Yet most previous work on CEM [such as the studies of Dominique et al. (23) and Perek et al. (42)] still depended on hand-crafted ROIs. To our knowledge, we are the first to present a CEM-based multiprocess system that includes an automatic lesion detector.

Second, we fully utilized the diverse information in CEM images and proposed an innovative AFF algorithm. To leverage the advantages of LE and RC images, Song et al. (25) proposed a fusion model to generate synthetic LE and RC images; later, the same team proposed a multi-view multimodal network named MVMM-Net (26), which extracts the features of multi-view and multimodal images simultaneously. Gao et al. (24) adopted two Siamese networks to extract the features of LE and RC images and then summed the feature maps from the final layer of the two pathways element by element. However, these methods did not recognize that task-related features should be prioritized in their respective tasks. Depending on the task, we regarded different images as the main-line input in our innovative AFF algorithm. The core idea is inspired by a method of interactive learning (32), which supplements the missing content of the main branch by introducing task-related features from the auxiliary branch. In addition, some deep-learning networks improve their performance by applying a convolutional block attention module (CBAM) (43) to assign higher weights to task-related features. Building on this approach, we designed a new attention mechanism for the AFF module by integrating the convolutional block attention idea into a dual attention module, enabling the adaptive capture of contextual feature dependencies in both the spatial and channel dimensions. Moreover, Yan et al. (44) presented a multitasking Siamese deep model that combines CC and MLO mammograms to enhance breast mass detection. Taking inspiration from this and MVMM-Net, we also adopted a Siamese classification network to combine the results from the MLO and CC views, which produced excellent performance compared with the single-view network.
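The AFF architecture itself is not published as code; the toy numpy sketch below illustrates only the general CBAM-style pattern referenced above: derive channel weights from pooled global descriptors, derive spatial weights from pooled cross-channel statistics, and add the attention-refined auxiliary-branch features to the main branch. All function names are illustrative, and the small shared MLP of a real CBAM is omitted to keep the sketch framework-free.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(x):
    """Scale each channel of a (C, H, W) feature map by a squashed
    global descriptor (avg-pool + max-pool), CBAM-style."""
    desc = x.mean(axis=(1, 2)) + x.max(axis=(1, 2))    # (C,)
    return x * sigmoid(desc)[:, None, None]

def spatial_attention(x):
    """Scale each spatial location by pooled cross-channel statistics."""
    desc = x.mean(axis=0) + x.max(axis=0)              # (H, W)
    return x * sigmoid(desc)[None, :, :]

def aff_fuse(main, aux):
    """Toy auxiliary-feature fusion: refine auxiliary-branch features with
    channel then spatial attention, and add them to the main branch."""
    return main + spatial_attention(channel_attention(aux))
```

In the actual system, the "main" branch would carry the task-prioritized image type (e.g. LE images for one classifier) while the refined auxiliary features supplement its missing content, as described for the interactive-learning idea above.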

Finally, MDCS has exceptional clinical value. Most previous studies were designed neither to demonstrate actual clinical application nor to offer potential guidance to clinicians in making clinical decisions (24,25). Our study showed that the diagnostic performance and diagnostic consistency of both junior and senior radiologists improved to varying degrees when MDCS was used, and the potential complementary role of MDCS was further confirmed by visual analysis. These results suggest that MDCS can be leveraged both in developed areas where medical experts are overloaded with demands and in remote regions with suboptimal medical resources.

Although our model achieved remarkable results, some limitations must still be addressed. First, we discovered that some lesions obscured by gland tissue in LE images could not be accurately identified by MDCS; further examination by radiologists is therefore required for this type of lesion following the MDCS diagnosis. Second, the CEM images were exclusively collected in China, so the model's ability to diagnose breast cancer in other ethnicities merits investigation. Third, although MDCS showed promising performance on the pooled external and prospective testing sets, its performance may decrease in certain specific scenarios because it was developed in a single center; going forward, we will forge collaborations with more centers and expand the data sample size to train our models and enhance their generalization. Fourth, because one-to-one pathological findings were lacking for multiple lesions, all included cases were unilateral breast images with a single lesion. However, multiple lesions in bilateral breasts are common in real life, and radiologists sometimes refer to the contralateral breast when diagnosing suspicious areas, so we should consider more situations in our future studies.

Conclusions

In brief, we built MDCS, a fully automatic system based on CEM images, which shows outstanding performance in the detection and classification of breast lesions. The system also shows potential to assist breast diagnosis on CEM, thereby improving efficiency and quality for radiologists, especially in settings where subspecialists are unavailable. In the future, deeper research is warranted to improve the generalization of deep-learning systems by utilizing richer datasets.

Acknowledgements

The study was supported by the National Natural Science Foundation of China (No. 82001775, 82371933); the Natural Science Foundation of Shandong Province of China (No. ZR2021MH120); the Special Fund for Breast Disease Research of Shandong Medical Association (No. YXH2021ZX055) and the Taishan Scholar Foundation of Shandong Province of China (No. tsgn202211378).

Contributor Information

Zhen Hua, Email: HuaZhen.edu@hotmail.com.

Haicheng Zhang, Email: haicheng92@126.com.

Ning Mao, Email: maoning@pku.edu.cn.

References

  • 1.Britt KL, Cuzick J, Phillips KA Key steps for effective breast cancer prevention. Nat Rev Cancer. 2020;20:417–36. doi: 10.1038/s41568-020-0266-x. [DOI] [PubMed] [Google Scholar]
  • 2.Lei S, Zheng R, Zhang S, et al Global patterns of breast cancer incidence and mortality: A population-based cancer registry data analysis from 2000 to 2020. Cancer Commun. 2021;41:1183–94. doi: 10.1002/cac2.12207. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3.Rosenquist CJ, Lindfors KK Screening mammography beginning at age 40 years: a reappraisal of cost-effectiveness. Cancer. 1998;82:2235–40. doi: 10.1002/(sici)1097-0142(19980601)82:11<2235::aid-cncr19>3.0.co;2-v. [DOI] [PubMed] [Google Scholar]
  • 4.Migowski A Early detection of breast cancer and the interpretation of results of survival studies. Cien Saude Colet (Article in Portuguese) 2015;20:1309. doi: 10.1590/1413-81232015204.17772014. [DOI] [PubMed] [Google Scholar]
  • 5.Chen HL, Zhou JQ, Chen Q, et al Comparison of the sensitivity of mammography, ultrasound, magnetic resonance imaging and combinations of these imaging modalities for the detection of small (≤2 cm) breast cancer. Medicine. 2021;100:E26531. doi: 10.1097/MD.0000000000026531. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6.Loberg M, Lousdal ML, Bretthauer M, et al Benefits and harms of mammography screening. Breast Cancer Res. 2015;17:63. doi: 10.1186/s13058-015-0525-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.Majid AS, de Paredes ES, Doherty RD, et al Missed breast carcinoma: pitfalls and pearls. Radiographics. 2003;23:881–95. doi: 10.1148/rg.234025083. [DOI] [PubMed] [Google Scholar]
  • 8.Rudnicki W, Heinze S, Niemiec J, et al Correlation between quantitative assessment of contrast enhancement in contrast-enhanced spectral mammography (CESM) and histopathology — preliminary results. Eur Radiol. 2019;29:6220–6. doi: 10.1007/s00330-019-06232-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Patel BK, Ranjbar S, Wu T, et al Computer-aided diagnosis of contrast-enhanced spectral mammography: A feasibility study. Eur J Radiol. 2018;98:207–13. doi: 10.1016/j.ejrad.2017.11.024. [DOI] [PubMed] [Google Scholar]
  • 10.Perry H, Phillips J, Dialani V, et al Contrast-enhanced mammography: A systematic guide to interpretation and reporting. AJR Am J Roentgenol. 2019;212:222–31. doi: 10.2214/AJR.17.19265. [DOI] [PubMed] [Google Scholar]
  • 11.Cheung YC, Lin YC, Wan YL, et al Diagnostic performance of dual-energy contrast-enhanced subtracted mammography in dense breasts compared to mammography alone: interobserver blind-reading analysis. Eur Radiol. 2014;24:2394–403. doi: 10.1007/s00330-014-3271-1. [DOI] [PubMed] [Google Scholar]
  • 12.Jochelson MS, Dershaw DD, Sung JS, et al Bilateral contrast-enhanced dual-energy digital mammography: feasibility and comparison with conventional digital mammography and MR imaging in women with known breast carcinoma. Radiology. 2013;266:743–51. doi: 10.1148/radiol.12121084. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.Tong T, Gu J, Xu D, et al Deep learning radiomics based on contrast-enhanced ultrasound images for assisted diagnosis of pancreatic ductal adenocarcinoma and chronic pancreatitis. BMC Med. 2022;20:74. doi: 10.1186/s12916-022-02258-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Wang Z, Wang Y, Li X, et al Correlation between imaging features on computed tomography and combined positive score of PD-L1 expression in patients with gastric cancer. Chin J Cancer Res. 2022;34:510–8. doi: 10.21147/j.issn.1000-9604.2022.05.10. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15.Huang Y, He L, Li Z, et al Coupling radiomics analysis of CT image with diversification of tumor ecosystem: A new insight to overall survival in stage I-III colorectal cancer. Chin J Cancer Res. 2022;34:40–52. doi: 10.21147/j.issn.1000-9604.2022.01.04. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16.Shen Y, Shamout FE, Oliver JR, et al Artificial intelligence system reduces false-positive findings in the interpretation of breast ultrasound exams. Nat Commun. 2021;12:5645. doi: 10.1038/s41467-021-26023-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17.Li H, Chen D, Nailon WH, et al Dual convolutional neural networks for breast mass segmentation and diagnosis in mammography. IEEE Trans Med Imaging. 2022;41:3–13. doi: 10.1109/TMI.2021.3102622. [DOI] [PubMed] [Google Scholar]
  • 18.Witowski J, Heacock L, Reig B, et al Improving breast cancer diagnostics with deep learning for MRI. Sci Transl Med. 2022;14:eabo4802. doi: 10.1126/scitranslmed.abo4802. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Jiang Y, Edwards AV, Newstead GM Artificial intelligence applied to breast MRI for improved diagnosis. Radiology. 2021;298:38–46. doi: 10.1148/radiol.2020200292. [DOI] [PubMed] [Google Scholar]
  • 20.Leibig C, Brehmer M, Bunk S, et al Combining the strengths of radiologists and AI for breast cancer screening: a retrospective analysis. Lancet Digit Health. 2022;4:e507–19. doi: 10.1016/S2589-7500(22)00070-X. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21.Mao N, Zhang H, Dai Y, et al Attention-based deep learning for breast lesions classification on contrast enhanced spectral mammography: a multicentre study. Br J Cancer. 2023;128:793–804. doi: 10.1038/s41416-022-02092-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22.Hsu W, Hippe DS, Nakhaei N, et al External validation of an ensemble model for automated mammography interpretation by artificial intelligence. JAMA Netw Open. 2022;5:e2242343. doi: 10.1001/jamanetworkopen.2022.42343. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23.Dominique C, Callonnec F, Berghian A, et al Deep learning analysis of contrast-enhanced spectral mammography to determine histoprognostic factors of malignant breast tumours. Eur Radiol. 2022;32:4834–44. doi: 10.1007/s00330-022-08538-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24.Gao F, Wu T, Li J, et al SD-CNN: A shallow-deep CNN for improved breast cancer diagnosis. Comput Med Imaging Graph. 2018;70:53–62. doi: 10.1016/j.compmedimag.2018.09.004. [DOI] [PubMed] [Google Scholar]
  • 25.Song J, Zheng Y, Xu C, et al Improving the classification ability of network utilizing fusion technique in contrast-enhanced spectral mammography. Med Phys. 2022;49:966–77. doi: 10.1002/mp.15390. [DOI] [PubMed] [Google Scholar]
  • 26.Song J, Zheng Y, Zakir Ullah M, et al Multiview multimodal network for breast cancer diagnosis in contrast-enhanced spectral mammography images. Int J Comput Assist Radiol Surg. 2021;16:979–88. doi: 10.1007/s11548-021-02391-4. [DOI] [PubMed] [Google Scholar]
  • 27.Otsu N A threshold selection method from gray-level histograms. IEEE Trans Syst Man Cybern. 1979;SMC-9:62–6. doi: 10.1109/TSMC.1979.4310076. [DOI] [Google Scholar]
  • 28.Lin TY, Goyal P, Girshick R, et al Focal loss for dense object detection. IEEE Trans Pattern Anal Mach Intell. 2020;42:318–27. doi: 10.1109/TPAMI.2018.2858826. [DOI] [PubMed] [Google Scholar]
  • 29.Ren S, He K, Girshick R, et al Faster R-CNN: Towards real-time object detection with region proposal networks. IEEE Trans Pattern Anal Mach Intell. 2017;39:1137–49. doi: 10.1109/TPAMI.2016.2577031. [DOI] [PubMed] [Google Scholar]
  • 30.Liu W, Anguelov D, Erhan D, et al. SSD: Single Shot MultiBox Detector. 2016 European Conference on Computer Vision (ECCV) 2016;9905.
  • 31.Lin TY, Dollár P, Girshick R, et al. Feature pyramid networks for object detection. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017:936-44.
  • 32.Wang R, Chen S, Ji C, et al Boundary-aware context neural network for medical image segmentation. Med Image Anal. 2022;78:102395. doi: 10.1016/j.media.2022.102395. [DOI] [PubMed] [Google Scholar]
  • 33.Wang X, Girshick R, Gupta A, et al. Non-local neural networks. 2018 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2018: 7794-803.
  • 34.Wang Q, Wu B, Zhu P, et al. ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks. 2020 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2020:11531-9.
  • 35.Jaderberg M, Simonyan K, Zisserman A, et al. Spatial Transformer Networks. 2015 International Conference on Neural Information Processing Systems (NIPS) 2015:2017-25.
  • 36.Huang G, Liu Z, Maaten LVD, et al. Densely Connected Convolutional Networks. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017:2261-9.
  • 37.He K, Zhang X, Ren S, et al. Deep Residual Learning for Image Recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2016:770-8.
  • 38.Ding J, Li A, Hu Z, et al. Accurate pulmonary nodule detection in computed tomography images using deep convolutional neural networks. 2017 Medical Image Computing and Computer Assisted Intervention (MICCAI 2017). Quebec City: MICCAI, 2017:559-67.
  • 39.Selvaraju RR, Cogswell M, Das A, et al. Grad-CAM: Visual explanations from deep networks via gradient-based localization. 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy. 2017:618-26.
  • 40.Skarping I, Larsson M, Förnvik D Analysis of mammograms using artificial intelligence to predict response to neoadjuvant chemotherapy in breast cancer patients: proof of concept. Eur Radiol. 2022;32:3131–41. doi: 10.1007/s00330-021-08306-w. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 41.Zhang Y, Chan S, Park VY, et al. Automatic detection and segmentation of breast cancer on MRI using mask R-CNN trained on non-fat-sat images and tested on fat-sat images. Acad Radiol 2022;29 Suppl 1(Suppl 1): S135-44.
  • 42.Perek S, Kiryati N, Zimmerman-Moreno G, et al Classification of contrast-enhanced spectral mammography (CESM) images. Int J Comput Assist Radiol Surg. 2019;14:249–57. doi: 10.1007/s11548-018-1876-6. [DOI] [PubMed] [Google Scholar]
  • 43.Woo S, Park J, Lee JY, et al. CBAM: Convolutional Block Attention Module. In: Ferrari, V, Hebert, M, Sminchisescu, C, et al. (eds) Computer Vision — ECCV 2018. Lecture Notes in Computer Science vol 11211. Switzerland: Springer Cham, 2018.
  • 44.Yan Y, Conze PH, Lamard M, et al Towards improved breast mass detection using dual-view mammogram matching. Med Image Anal. 2021;71:102083. doi: 10.1016/j.media.2021.102083. [DOI] [PubMed] [Google Scholar]

Articles from Chinese Journal of Cancer Research are provided here courtesy of Beijing Institute for Cancer Research
