Abstract
Breast cancer (BC) remains a significant global health issue, necessitating innovative methodologies to improve early detection and diagnosis. Despite the existence of intelligent deep learning models, their efficacy is often limited by the oversight of small-sized masses, leading to false positive and false negative outcomes. This research introduces a novel segmentation-guided classification model developed to increase BC detection accuracy. The designed model unfolds in two critical phases, each contributing to a comprehensive BC diagnostic pipeline. In Phase I, the Attention U-Net model is utilized for BC segmentation. The encoder extracts hierarchical features, while the decoder, supported by attention mechanisms, refines the segmentation, focusing on suspicious regions. In Phase II, a novel ensemble approach is introduced for BC classification, involving various feature extraction methods, base classifiers, and a meta-classifier. An ensemble of base classifiers, including support vector machine, decision tree, k-nearest neighbor and artificial neural network, captures diverse patterns within these features. The Random Forest meta-classifier amalgamates their outputs, leveraging their collective strengths. The proposed integrated model accurately identifies different breast tumor classes: malignant, benign, and normal. The precise region-of-interest analysis from the segmentation phase significantly boosted the classification performance of the ensemble meta-classifier. The model accomplished an overall accuracy of 99.57% together with high segmentation performance (95% F1-score), illustrating its high discriminative power in detecting malignant, benign, and normal cases within the ultrasound image dataset. This research contributes to reducing breast tumor morbidity and mortality by facilitating early detection and timely intervention, ultimately supporting better patient outcomes.
Supplementary Information
The online version contains supplementary material available at 10.1007/s13534-024-00435-7.
Keywords: BC, Segmentation, Attention U-Net, Classification, Ensemble classifiers, Random forest meta-classifier, Breast Ultrasound images
Introduction
Breast cancer (BC), a formidable adversary in healthcare, stands as a leading cause of death among women worldwide [1, 2]. It knows no boundaries, affecting individuals from diverse backgrounds and imposing a significant burden on both patients and healthcare systems [3]. The relentless nature of this disease underscores the critical importance of early detection, where timely intervention can be the difference between life and death [4]. In its early stages, BC often remains asymptomatic, silently infiltrating breast tissue and evading detection through conventional means [5]. Misclassification of images at test time stems from factors such as intra-class variability, overlapping features, image quality, small lesion size and class imbalance. The prime motive for applying a segmentation approach before classification is to enhance the robustness and accuracy of BC prediction [6]. The key benefits are: (i) highlighted target regions, (ii) enhanced feature extraction, (iii) reduced computational complexity, (iv) variability reduction and (v) higher classifier performance [41]. Notable architectures include U-Net, utilizing a symmetrical encoder-decoder framework with skip connections, and Attention U-Net, incorporating an attention strategy to concentrate on salient features [7, 8]. These deep learning models have significantly improved segmentation accuracy but are not without challenges [9–11].
Hossain et al. [12] introduced a fine-tuned U-Net for extracting cancer areas from ultrasound samples; however, its superior evaluation results demonstrated segmentation ability only on small-scale datasets. Tekin et al. [13] presented Tubule-U-Net, a deep model for tubule segmentation from breast tumor images. Among the three network variants tested, EfficientNetB3-U accomplished the most effective segmentation results, delineating very complex tubule structures with high precision, although the training process was complicated. Iqbal and Sharif [14] introduced a semi-supervised segmentation module, PDF-UNet, for breast tumor segmentation; however, its accuracy degraded when the tumor region was very small. A weighted multi-modal U-Net technique was introduced by Misra et al. [39] for automated lesion segmentation. This technique assigns ideal weight values to each imaging modality to highlight its importance. Another work by Misra et al. [40] was a bi-modal transfer learning approach that trains dual deep models, ResNet and AlexNet, but it requires large computational resources for task execution. The work in [37] introduces two architectures, Attention3 U-Net and Attention4 U-Net, for accurate segmentation supporting optimal classification. The dual-attention-enabled U-Net introduced in [38] uses two influential attention blocks to capture contextual data and spatial patterns within local features. These segmentation-guided classification approaches couple the descriptive spatial data captured during segmentation with the compelling pattern recognition capacity of the classifier. The growing importance and efficacy of ensemble approaches for predictive biomedical modeling will be pivotal in advancing clinical practices.
Following segmentation, classification methods are employed to determine whether detected lesions are benign or malignant [15, 16]. Obayya et al. [17] presented an Aquila-optimized Bayesian Neural Network (AOBNN) to detect BCs from ultrasound samples; its validation accuracy exceeded its training accuracy, indicating a higher chance of overfitting. Wang and Yao [18] jointly introduced a binary classification model for breast tumor detection using an anchor-free neural network (AFNN) with a segmentation-based enhancement procedure. Despite using segmentation, the model obtained an accuracy of only 88.8% for detecting malignant lesions. Vigil et al. [19] demonstrated a convolutional deep encoder-assisted random forest model for screening breast lesions in ultrasound samples; it necessitates the incorporation of clinical information to achieve reliable prediction. A swin transformer-based (BTS-ST) model was introduced by Iqbal and Sharif [20]; the analysis proved that a high performance rate was achieved for ultrasound imaging, whereas for other modalities the performance was slightly lower. Yan et al. [21] presented a breast tumor segmentation concept using an attention-boosted U-Net with hybrid dilated convolution, termed 'AE U-net with HDC', to detect breast lesions in ultrasound image samples with a high degree of accuracy; its limitation was the irreplaceable pooling and up-sampling operations imposed by limited Graphics Processing Unit capacity. Jabeen et al. [22] demonstrated a probability-based optimized deep feature fusion approach for breast tumor identification using ultrasound samples; however, no preprocessing steps were executed to remove noise. A grey-wolf-optimized wavelet neural network (GWO-WNN) method was introduced by Bourouis et al. [23] for diagnosing diverse stages of BC; its drawback was that prediction performance was analyzed using only a small-scale dataset. Liu et al. [24] developed a grid-based deep mechanism for diagnosing breast tumors using ultrasound samples, but its limitation was high computational time.
A pretrained dual deep CNN with transfer learning was illustrated by Saba et al. [25] for first-stage breast tumor diagnosis; however, poor training accuracy was its limitation. Sahu et al. [26] demonstrated a deep hybrid CNN for breast tumor diagnosis; this method does not consider any optimization algorithm for tuning hyperparameters. An automated breast tumor screening method was suggested for clinical practice by Vijayakumar et al. [27], but the accuracy obtained was 94%, which was not adequate. Li et al. [28] demonstrated a breast cancer segmentation approach to increase prediction accuracy; meanwhile, this strategy attained a Dice similarity score of only 77.3% and an IoU of 66%. Although these figures were higher than those of the compared methods, they were not sufficient for accurate tumor detection. Inan et al. [29] suggested a lightweight CNN model, a segmentation-guided classifier, to identify breast tumors from ultrasonography images; despite integrating multiple concepts, the precision achieved remained ≤ 80%. Lu et al. [30] presented a spatial attention-deep CNN module for precise identification of BCs using ultrasound samples. This method provided effective classification accuracy on small-scale datasets, but its efficiency was not examined on large-scale datasets. In light of these challenges and the critical need for precise BC detection methods, our research endeavors to develop an integrated approach that addresses these pressing concerns. The key contributions of the research are presented as concise points:
Developing a two-phase breast tumor classification approach, utilizing Attention U-Net for segmentation and an ensemble meta-classifier for classification of different stages of breast tumor images.
In Phase I, the research employs ‘Attention U-Net’ for segmentation. The architecture excels in capturing fine-grained details while maintaining a broader contextual understanding of ultrasound images.
Phase II introduces an ensemble approach for BC classification, comprising diverse classifiers, including decision tree, k-nearest neighbor (KNN), support vector machine (SVM), and artificial neural network (ANN). This ensemble captures a wide range of patterns and features within extracted regions.
Adapting a Random Forest meta-classifier to combine the outputs of the individual classifiers, harnessing their collective strength. This approach enhances classification accuracy and robustness.
The remainder of the paper is organized as follows: Sect. 2 describes the proposed segmentation-guided BC classification approach. Section 3 provides the experimental findings of the two phases to highlight the model's worth in recognizing and discriminating diverse breast cancer cases. Section 4 presents the discussion and, finally, the summary of the research is presented in Sect. 5 together with future plans.
Materials and methods
Data source collection
"Breast Ultrasound Images Dataset" [31, 32] comprises images acquired from 600 female patients, all within age group of 25–75 years. The dataset includes three distinct classes- normal, benign, and malignant images. In total, it has 830 images. The images within the dataset are uniformly formatted in PNG. The image samples in the breast ultrasound dataset exhibit an average pixel dimension of 500 × 500. Table 1 presents the dataset distribution details with sample images. To tackle the potential complication of data loss and guarantee training, validation, and testing independence, a patient-level split is implemented. It means that each sample from a subject is limited to a single set (train, test and validate). It prevents the system from being learned from homogeneous image samples that belong to the same subject, assuring practical and demanding analysis.
Table 1.
Dataset distribution details with sample images
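The paper does not state the tooling used for the patient-level split, so the following Python sketch is only illustrative: it assumes scikit-learn and a hypothetical `patient_ids` array (one patient identifier per image) to keep all images of a patient in a single partition.

```python
from sklearn.model_selection import GroupShuffleSplit

def patient_level_split(X, y, patient_ids, test_size=0.1, seed=42):
    """Split so that every image of a given patient lands in exactly one set."""
    gss = GroupShuffleSplit(n_splits=1, test_size=test_size, random_state=seed)
    train_idx, test_idx = next(gss.split(X, y, groups=patient_ids))
    return train_idx, test_idx

# Example: a 90:10 split that never separates one patient's images.
# X, y, patient_ids = load_busi_dataset()   # hypothetical loader
# train_idx, test_idx = patient_level_split(X, y, patient_ids, test_size=0.1)
```

The same grouped splitter can be applied again on the training indices to carve out the validation portion described in the experiments.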
Data preprocessing
Image Resizing: To maintain uniformity and expedite processing, all images within the dataset are resized to a consistent dimension. This resizing step ensures that each image adheres to the average dimensions of 500 × 500 pixels, simplifying subsequent analysis.
Normalization: Image normalization is employed to standardize pixel values across the dataset. Normalization enhances the comparability of images and ensures that pixel intensity variations do not unduly influence model training.
Augmentation Techniques: Data augmentation techniques, such as rotation (45°, 90°, 135°, 180°), flipping (Horizontal, Vertical), and zooming [0.85, 1.15], are thoughtfully applied to diversify the dataset. Augmentation enriches the dataset by creating variations of existing images, which boosts model generalization and robustness.
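As an illustration of the augmentation settings listed above, the following Python sketch generates the rotated, flipped, and zoomed variants of one image. Pillow is an assumption for illustration; the paper's experiments were run in MATLAB.

```python
from PIL import Image

ROTATIONS = (45, 90, 135, 180)   # rotation angles from the text
ZOOMS = (0.85, 1.15)             # zoom range endpoints from the text

def augment(img: Image.Image) -> list:
    """Return rotated, flipped, and zoomed variants of one ultrasound image."""
    variants = [img.rotate(a, resample=Image.BILINEAR) for a in ROTATIONS]
    variants.append(img.transpose(Image.FLIP_LEFT_RIGHT))   # horizontal flip
    variants.append(img.transpose(Image.FLIP_TOP_BOTTOM))   # vertical flip
    w, h = img.size
    for z in ZOOMS:
        zoomed = img.resize((int(w * z), int(h * z)), Image.BILINEAR)
        canvas = Image.new(img.mode, (w, h))  # pad or crop back to 500 x 500
        canvas.paste(zoomed, ((w - zoomed.width) // 2, (h - zoomed.height) // 2))
        variants.append(canvas)
    return variants
```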
Proposed segmentation guided classification approach
An integrated approach combining precise segmentation and robust classification is designed to enhance breast tumor identification. Figure 1 portrays the structure of the designed BC identification framework.
Fig. 1.
Designed BC detection framework
Attention U-Net module
Attention U-Net represents a pinnacle in image segmentation, adept at capturing both significant and fine details from an image's context while maintaining complete contextual information.
A. Encoder
- Input layer: The encoder starts with an input layer that accepts breast ultrasound images, typically of dimensions compatible with the dataset, such as $W \times H \times C$, where $W$ is the width, $H$ is the height, and $C$ is the number of channels.
- Convolutional Layers (Feature extractor): The core of the encoder comprises a series of convolutional layers. Each convolutional layer applies a set of filters to the feature maps from the previous layer. The operation for a convolutional layer can be represented as:

$$F_l = \sigma\left(W_l * F_{l-1} + b_l\right) \tag{1}$$

where $F_l$ represents the feature map at layer $l$, $\sigma$ is the ReLU (Rectified Linear Unit) activation function, $W_l$ denotes the learnable weights specific to the layer, $F_{l-1}$ is the feature map from the previous layer and $b_l$ represents the bias term. Convolutional layers are designed to recognize shallow features like corners, edges, and basic shapes. As data traverses these layers, it undergoes multi-level feature extraction.
- Pooling Layers (Downsampling): After each convolutional layer, max-pooling layers are applied to downsample the feature maps. Max-pooling reduces the spatial dimensions of the feature maps while preserving essential features. The operation for max-pooling is given as:

$$P_l = \operatorname{maxpool}\left(F_l\right) \tag{2}$$

where $P_l$ is the pooled feature map produced by the pooling layer.
- High-Level Representations: As data progresses through the encoder, each convolutional layer captures increasingly abstract and high-level features. The network learns to recognize low-level features like edges and basic shapes in the early layers, gradually aggregating them into more complex representations. This hierarchical feature extraction is essential for retaining essential details while reducing the spatial dimensions of the data.
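A minimal PyTorch sketch of one encoder stage implied by Eqs. (1)-(2) and Table 2 follows; PyTorch is an assumption for illustration, since the paper's implementation is in MATLAB.

```python
import torch.nn as nn

class EncoderBlock(nn.Module):
    """One encoder stage: two 3x3 conv + ReLU layers (Eq. 1), then 2x2 max-pool (Eq. 2)."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1), nn.ReLU(inplace=True),
        )
        self.pool = nn.MaxPool2d(kernel_size=2)

    def forward(self, x):
        feat = self.conv(x)            # F_l: features kept for the skip connection
        return self.pool(feat), feat   # P_l: downsampled map passed to the next stage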
B. Decoder
- Upsampling layer: The decoder begins with upsampling layers that enlarge the spatial size of the feature maps. It is given by

$$U_l = \operatorname{upsample}\left(D_{l-1}\right) \tag{3}$$

where $U_l$ signifies the upsampled feature map produced by the upsampling layer and $D_{l-1}$ implies the feature map from the previous decoder layer.
- Skip connections: Skip connections from the respective encoder layers are concatenated with the decoder feature maps. These skip connections facilitate the flow of low-level feature representations from the encoder to the decoder, aiding precise segmentation. The operation of concatenation is represented as

$$C_l = \operatorname{concat}\left(U_l, E_l\right) \tag{4}$$

where $C_l$ signifies the concatenated feature vector at decoder layer $l$, $U_l$ implies the upsampled feature vector and $E_l$ is the corresponding encoder feature map.
- Convolutional layers (Upsampling): Convolutional layers in the decoder further process the concatenated feature maps, refining the segmentation. The operation of convolution is represented as

$$D_l = \sigma\left(W_l * C_l + b_l\right) \tag{5}$$

where $D_l$ depicts the segmented feature map generated through the convolutional layer and $C_l$ depicts the concatenated feature map.
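Continuing the PyTorch sketch under the same assumption, one decoder stage combining Eqs. (3)-(5) might look as follows; transpose convolution is used for upsampling, as noted later under 'Parameter setting'.

```python
import torch
import torch.nn as nn

class DecoderBlock(nn.Module):
    """One decoder stage: upsample (Eq. 3), concatenate skip features (Eq. 4), convolve (Eq. 5)."""
    def __init__(self, in_ch: int, skip_ch: int, out_ch: int):
        super().__init__()
        self.up = nn.ConvTranspose2d(in_ch, out_ch, kernel_size=2, stride=2)
        self.conv = nn.Sequential(
            nn.Conv2d(out_ch + skip_ch, out_ch, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1), nn.ReLU(inplace=True),
        )

    def forward(self, x, skip):
        u = self.up(x)                    # U_l, Eq. (3)
        c = torch.cat([u, skip], dim=1)   # C_l, Eq. (4)
        return self.conv(c)               # D_l, Eq. (5)
```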
C. Attention Mechanisms: The model incorporates an attention strategy within the decoder.
Attention gates: The attention mechanisms give rise to attention gates, which are dynamic weights applied to the feature maps. These gates adaptively regulate data flow from different parts of the network, allowing the model to concentrate on vital image areas. The attention gate operation is represented by

$$A_l = \sigma_g\left(W_g * C_l + b_g\right) \tag{6}$$

$$G_l = A_l \odot C_l \tag{7}$$

where $A_l$ represents the attention gate for decoder level $l$, $W_g$ represents the learnable weights for the gate operation and $C_l$ implies the concatenation of the feature map from the decoder and the upsampled feature map from the corresponding encoder level. $G_l$ indicates the gated feature map, which is the result of the Hadamard product between the actual feature map and the attention gate $A_l$.
D. Output layer: The final layer of the decoder block utilizes a sigmoid activation function to produce pixel-wise probability maps, depicting the likelihood of each pixel belonging to a breast lesion or the background. The output size matches the input image dimensions. The sigmoid operation is presented as

$$P = \operatorname{sigmoid}\left(G\right) \tag{8}$$

where $P$ is the probability map representing the likelihood of breast lesions in the image and $G$ implies the feature vector generated via the attention mechanism. The Attention U-Net model's structural framework adapted for segmenting breast tumors is portrayed in Fig. 2.
Fig. 2.
Architecture of Attention U-Net model
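The gating of Eqs. (6)-(7) can be sketched as a small PyTorch module. Note that this simplified single-input form follows the paper's notation (a gate computed from the concatenated features and applied as a Hadamard product), whereas canonical Attention U-Net gates process the gating and skip signals separately.

```python
import torch
import torch.nn as nn

class AttentionGate(nn.Module):
    """Sigmoid-activated gate (Eq. 6) applied as a Hadamard product (Eq. 7)."""
    def __init__(self, channels: int):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=1),  # learnable weights W_g
            nn.Sigmoid(),                                  # coefficients A_l in [0, 1]
        )

    def forward(self, c):     # c: concatenated decoder/encoder features C_l
        a = self.gate(c)      # A_l, Eq. (6)
        return a * c          # G_l = A_l ⊙ C_l, Eq. (7)
```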
Training procedure and loss function estimation
The training process of Attention U-Net relies on labeled breast ultrasound images, where ground truth images are carefully paired with the original ultrasound images.
Loss function: During training, the model minimizes a loss function based on the Dice similarity coefficient (DSC), defined as

$$\mathrm{DSC} = \frac{2\,|X \cap Y|}{|X| + |Y|} \tag{9}$$

where $|X \cap Y|$ represents the total overlapping pixels between the predicted segmentation $X$ and the ground truth $Y$, $|X|$ indicates the total pixels in the predicted segmentation and $|Y|$ depicts the total pixels in the ground truth.
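A PyTorch form of the corresponding training loss is sketched below; it takes $1 - \mathrm{DSC}$ so that maximizing overlap minimizes the loss. This standard soft-Dice formulation is an assumption for illustration rather than the paper's exact code.

```python
import torch

def dice_loss(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Soft Dice loss from Eq. (9): 1 - 2|X∩Y| / (|X|+|Y|), averaged over the batch."""
    pred = pred.reshape(pred.size(0), -1)        # flatten predicted probabilities per sample
    target = target.reshape(target.size(0), -1)  # flatten binary ground-truth masks
    inter = (pred * target).sum(dim=1)           # |X ∩ Y|
    dsc = (2 * inter + eps) / (pred.sum(dim=1) + target.sum(dim=1) + eps)
    return 1.0 - dsc.mean()
```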
Phase II: BC classification using random forest ensemble
In Phase II, the focus shifts from BC segmentation to classification, aiming to achieve greater accuracy in distinguishing between malignant, normal and benign cases in the breast ultrasound dataset. For BC classification, an ensemble of classifiers is employed. These classifiers are modeled to capture a wide variety of features from the segmented breast ultrasound images, allowing for robust classification.
A. Feature Extraction: The feature extractors are applied to the segmented breast ultrasound images. These methods include texture analysis, shape analysis, and intensity-based features.
- Texture Analysis: Texture features are extracted to capture textural patterns within images. A commonly used family is the Haralick texture features, which quantify sample texture by analyzing the spatial distribution of pixel intensity levels. A formula for computing a Haralick texture feature, such as contrast, is

$$\text{Contrast} = \sum_{i=0}^{G-1} \sum_{j=0}^{G-1} (i-j)^2\, p(i,j) \tag{10}$$

where the total grey levels are represented as $G$ and the normalized co-occurrence matrix of pixel intensities is indicated by $p(i,j)$.
- Shape Analysis: Geometric features are extracted to characterize the shape of segmented regions. These features include area, perimeter, circularity, and eccentricity. The area of a segmented region is calculated by

$$\text{Area} = \sum_{x=1}^{M} \sum_{y=1}^{N} B(x,y) \tag{11}$$

where $M$ and $N$ are the dimensions of the segmented region, and $B(x,y)$ represents a binary mask indicating the region of interest.
- Intensity-Based Features: Intensity-based features represent pixel intensity statistics within segmented regions. The mean and standard deviation of pixel intensity within a region are computed as follows:

$$\mu = \frac{\sum_{x=1}^{M} \sum_{y=1}^{N} B(x,y)\, I(x,y)}{\sum_{x=1}^{M} \sum_{y=1}^{N} B(x,y)} \tag{12}$$

$$\sigma = \sqrt{\frac{\sum_{x=1}^{M} \sum_{y=1}^{N} B(x,y)\,\big(I(x,y) - \mu\big)^2}{\sum_{x=1}^{M} \sum_{y=1}^{N} B(x,y)}} \tag{13}$$

where $M$ and $N$ signify the dimensions of the segmented region, $B(x,y)$ is the binary mask, and $I(x,y)$ represents the pixel intensities within the region. Here, 13 features are extracted via texture analysis, 5 via shape analysis, and 2 via intensity statistics, yielding 20 different features per segmented image.
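The paper does not enumerate the exact 20 features, so the following scikit-image sketch is illustrative only: it computes a subset of GLCM texture statistics (cf. Eq. 10), the listed shape descriptors (cf. Eq. 11), and the intensity statistics of Eqs. (12)-(13) for one segmented ROI.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops
from skimage.measure import label, regionprops

def extract_roi_features(image: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Texture, shape, and intensity features from one segmented ultrasound ROI."""
    roi = (image * (mask > 0)).astype(np.uint8)

    # Texture: grey-level co-occurrence statistics; 'contrast' matches Eq. (10).
    glcm = graycomatrix(roi, distances=[1], angles=[0],
                        levels=256, symmetric=True, normed=True)
    texture = [graycoprops(glcm, p)[0, 0]
               for p in ("contrast", "dissimilarity", "homogeneity",
                         "energy", "correlation", "ASM")]

    # Shape: area (Eq. 11), perimeter, circularity, eccentricity,
    # taken from the first (assumed single) segmented region.
    r = regionprops(label(mask > 0))[0]
    circularity = 4 * np.pi * r.area / (r.perimeter ** 2 + 1e-6)
    shape = [r.area, r.perimeter, circularity, r.eccentricity]

    # Intensity: mean and standard deviation inside the mask (Eqs. 12-13).
    vals = image[mask > 0]
    return np.array(texture + shape + [vals.mean(), vals.std()])
```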
B. Classifier models: A set of classifier models is trained on the extracted features to perform BC classification. These classifiers specialize in capturing specific patterns within the feature space. They include ANN, SVM, decision trees and KNN.
- Decision Tree: Decision trees are versatile classifiers that capture both linear and nonlinear relationships in the feature space. They divide the feature space into regions according to feature values and assign class labels to each region.
- Support Vector Machine (SVM): SVMs are powerful predictive models that find the hyperplane maximizing the margin between classes in the feature space. They are robust for handling high-dimensional data.
- k-Nearest Neighbor (k-NN): k-NN is a distribution-free technique that assigns class labels to data points based on the majority class among their closest neighbors in the feature space.
- Artificial Neural Network (ANN): Neural networks, including deep models, can learn complex patterns in the samples through layers of neurons, capturing intricate relationships between features.
By combining the detection results of each decision tree, the random forest algorithm minimizes over-fitting and maximizes the model's generalization ability. It produces a robust ensemble decision that enhances the reliability and accuracy of BC classification. Figure 3 represents the visual layout of the designed ensemble technique.
C. Implementation of the Random Forest Meta-Classifier: The random forest algorithm is selected as the meta-classifier that combines the outcomes of the individual models into a robust ensemble decision.
To make the final ensemble decision, the predictions from each decision tree are aggregated using a weighted average: each decision tree's prediction is given a weight, and the final classification is the weighted sum of these predictions:

$$\hat{y} = \sum_{t=1}^{T} w_t\, h_t(x) \tag{14}$$

where $\hat{y}$ indicates the final prediction, $T$ implies the number of decision trees in the ensemble, $h_t(x)$ depicts the prediction of the $t$-th decision tree and $w_t$ is the weight assigned to that tree's prediction.
Fig. 3.

Schematic diagram of the proposed ensemble classifier
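A scikit-learn analogue of the stacked ensemble is sketched below as an assumption for illustration; the paper's models were built and Adam-tuned in MATLAB. It wires the four base classifiers, with the hyperparameters of Table 3, into a Random Forest meta-classifier that aggregates their outputs (cf. Eq. 14).

```python
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

base_learners = [
    ("svm", SVC(kernel="rbf", C=1.0, gamma=0.5, probability=True)),
    ("knn", KNeighborsClassifier(n_neighbors=5, leaf_size=30)),
    ("dt",  DecisionTreeClassifier(max_depth=10)),
    ("ann", MLPClassifier(hidden_layer_sizes=(5,), learning_rate_init=0.01)),
]

# Random Forest meta-classifier combines the base outputs (cf. Eq. 14).
ensemble = StackingClassifier(
    estimators=base_learners,
    final_estimator=RandomForestClassifier(n_estimators=50),
    stack_method="predict_proba",  # feed class probabilities to the meta-model
)

# X: 20-dimensional feature vectors from segmented ROIs;
# y: labels in {normal, benign, malignant}.
# ensemble.fit(X_train, y_train); y_pred = ensemble.predict(X_test)
```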
Experimental results
Here, we showcase the findings of the integrated approach to detecting different types of BCs from ultrasound images, detailing the simulation settings, performance indicators and simulation results. To guarantee robust analysis and avoid overfitting, the collected dataset is partitioned into three major divisions: train, validation and test. Four train:test sample ratios are evaluated (90:10, 80:20, 70:30 and 60:40), with a specific percentage of the training set reserved for validation in each case. Case (i) holds 70% for Train_data (training), 20% for valid_data (validation) and 10% for test_data (testing). Case (ii) holds 60% for Train_data, 20% for valid_data and 20% for test_data. Case (iii) holds 50% for Train_data, 20% for valid_data and 30% for test_data. Case (iv) holds 40% for Train_data, 20% for valid_data and 40% for test_data. The empirical analysis conducted using the different test cases is provided in the Supplementary document (refer to part A, Table S1). The analysis, carried out in terms of various metrics, aids in envisaging the efficacy of the designed approach over others.
Experimental setup
The whole experimental operation is conducted in MATLAB R2021a. The simulation system holds the following configuration: Intel(R) Core(TM) i7-11700 processor, 16 GB RAM and an NVIDIA Titan X 12 GB GPU.
Parameter setting
Tables 2 and 3 present the structure parameters of the Attention U-Net model and the parameter configuration of the ensemble classifier, respectively. The upsampling process in the Attention U-Net applies transpose convolution to update weights and generate high-resolution feature maps. The ensemble classifiers (SVM, ANN, kNN, decision tree and random forest) are tuned utilizing the Adam optimizer, chosen for its adaptable learning rate and capability to effectively tune the parameters. This ensures high convergence speed and enhanced overall performance.
Table 2.
Structure parameters of Attention U-Net model
| Layers | Output dimension | Operation |
|---|---|---|
| Conv_1 | 500 × 500 | [3 × 3, 64] × 2 |
| Pool-Conv_2 | 250 × 250 | 2 × 2 max-pool; [3 × 3, 128] × 2 |
| Pool-Conv_3 | 125 × 125 | 2 × 2 max-pool; [3 × 3, 256] × 2 |
| Pool-Conv_4 | 62 × 62 | 2 × 2 max-pool; [3 × 3, 512] × 2 |
| Pool-Conv_5 | 31 × 31 | 2 × 2 max-pool; [3 × 3, 1024] × 2 |
| Upsamp-Conv_6 | 62 × 62 | 2 × 2 upsamp; [3 × 3, 512] × 2 |
| Upsamp-Conv_7 | 125 × 125 | 2 × 2 upsamp; [3 × 3, 256] × 2 |
| Upsamp-Conv_8 | 250 × 250 | 2 × 2 upsamp; [3 × 3, 128] × 2 |
| Upsamp-Conv_9 | 500 × 500 | 2 × 2 upsamp; [3 × 3, 64] × 2; [3 × 3, 1] × 2 |
Table 3.
Parameter configuration of ensemble classifier
| Models | Parameters | Ranges |
|---|---|---|
| SVM | Kernel function | Radial basis function |
| Regularization coefficient | 1.0 | |
| Kernel parameter | 0.5 | |
| ANN | Learning rate | 0.01 |
| Number of hidden neurons | 5 | |
| kNN | Number of neighbors | 5 |
| Leaf size | 30 | |
| Decision tree | Learning rate | 0.01 |
| Maximum depth | 10 | |
| Number of estimators | 100 | |
| Random forest | Number of trees | 50 |
| Sample proportion | 0.5 | |
| Weight | 1.5 |
Segmentation results
This section enumerates the results obtained with the 90:10 data split, which proved most effective among the evaluated splits. The comparison approaches include U-Net (method A) [17], U-Net3+ (method B) [33], SegNet (method C) [34], AutoSegNet (method D) [23], DeepLab3+ (method E) [28], PSPNet (pyramid scene parsing network, method F) [35], Hybrid U-Net (method G) [36] and Modified UNet (method H) [29].
Figure 4 presents the graphical representation of the segmentation performance analysis for the different methods. The Attention U-Net model employed in our study obtains a higher DSC of about 0.92, while the compared segmentation methods A–H obtained lower DSC values. The IoU measured for the Attention U-Net model is 0.94, whereas the ranges for the other techniques are slightly lower.
Fig. 4.

Comparative analysis results of segmentation approach
The Attention U-Net model showed high sensitivity and specificity values of 94.75% and 92.92%, respectively, and achieved a high F1-score of 94.75%. Table S2 represents the performance values of the segmentation results (refer to supplementary part B, Table S2). Table 4 visualizes the segmentation outputs of current state-of-the-art approaches for diagnosing BC.
Table 4.
Visualization of segmentation results
From Table 5, it is justified that the designed model, with the assistance of the segmentation module, gains a higher performance level in terms of the evaluated measures: accuracy, precision, recall and F1-score. This also highlights the need for the segmentation component before classification, meaning that identifying specific target regions is crucial for precise classification of severities.
Table 5.
Performance analysis with and without segmentation
| Modules | Performance measures in % | |||
|---|---|---|---|---|
| Accuracy | Precision | Recall | F1-score | |
| With segmentation | 99.57 | 99.62 | 99.25 | 99.05 |
| Without segmentation | 92.29 | 90.02 | 92.76 | 92.39 |
BC classification results
For fair evaluation, breast tumor detection methods that used segmentation techniques to achieve improved detection results are compared. These techniques include AOBNN [17], Anchor-free NN [18], BTS-ST [20], GWO-WNN [23], CAM-DLS [28] and lightweight CNN [29]. The proposed random forest ensemble classifier uses the segmented outputs of the Attention U-Net to detect breast tumor cases: malignant, benign and normal. Figure 5 shows the confusion matrix of the Attention U-Net-based segmentation-guided random forest ensemble classifier.
Fig. 5.

Confusion matrix
Table 6 presents the classifier performance analysis results in individual and ensemble manner. The ensemble approach integrating multiple models surpasses the individual classifiers in classification performance. The increased performance is due to the combined strengths of the models: robust decision boundaries from the SVM, simplicity and easy adaptability from k-NN, learned complex relationships from the ANN, better interpretability from the decision tree and reduced over-fitting from the random forest. Figure 6 shows the classifier's training and testing performance.
Table 6.
Classifier performance analysis by individual and ensemble manner
| Models | Performance measures (%) | |||
|---|---|---|---|---|
| Accuracy | Precision | Recall | F1-score | |
| SVM | 83.2 | 82.4 | 81.8 | 82.1 |
| ANN | 81.2 | 80.3 | 79.9 | 80.2 |
| k-NN | 76.5 | 74.8 | 74.2 | 74.5 |
| Decision tree | 78.3 | 76.4 | 77.5 | 76.9 |
| Random forest | 82.7 | 81.9 | 81.2 | 81.5 |
| Ensemble (Proposed) | 99.57 | 99.62 | 99.25 | 99.05 |
Fig. 6.
Training and testing analysis: a Accuracy and b Loss
The AUROC analysis performed to investigate the model's ability to differentiate the diverse classes within the dataset is provided in Fig. 7. The graphical representation of the designed method, investigated in terms of common classification evaluation measures and compared with existing techniques, is shown in Fig. 8. The designed model gives better results of 99.57% accuracy, 99.62% precision, 99.25% recall, and 99.05% F1-score for the identification of breast tumors. The proposed model improves the accuracy rate by 0.34%, 9.32%, 8.82%, 1.57%, and 9.57% compared to AOBNN, AFNN, BTS-ST, GWO-WNN, and CAM-DLS, respectively. The performance values of Fig. 8 are represented in the supplementary document (refer to part C, Table S3). Figure 9 presents a graphical depiction of the AUROC analysis with respect to the different methods. Figure 10 visualizes the extracted features from the ultrasound images using a t-distributed stochastic neighbor embedding (t-SNE) scatter plot.
Fig. 7.

AUROC analysis
Fig. 8.

Classification performance evaluation
Fig. 9.

Classification efficiency analysis
Fig. 10.

t-SNE scatter plot for feature visualization
Discussion
It is obvious that attention network-guided ensemble classifiers have been introduced in different areas of application because of their ability to address multi-label classification issues. Though segmentation-guided classification methods have steadily improved in diverse aspects, particularly for breast ultrasound analysis, they remain open to research improvements. Traditional methods using optimization-enabled neural network models, although showing satisfactory training performance, suffer from issues such as over-fitting, time complexity, unsuitability to large-scale datasets, poor training accuracy, and high false positive and false negative rates. These problems are tackled in our proposed approach by infusing a two-phase detection procedure of segmentation and classification. Before injecting breast ultrasound images into the proposed model, the pre-processing operations of resizing, normalization and augmentation are performed, rescaling the images to uniform dimensions, normalizing variations in pixel intensity and augmenting the data to increase diversity. After pre-processing, the attention U-Net mechanism employed for segmentation separates the breast tumor region from the background to simplify the process and increase computational efficiency. After segmentation, the features most contributive to tumor detection, namely shape, intensity and texture, are extracted via feature extraction procedures. This aids the model through efficient utilization of computational resources, boosting learning speed and enhancing diagnostic accuracy. Also, the ensemble classifier, formed by leveraging the combined benefits of decision tree, SVM, k-NN and ANN, predicts tumor stages via each model's own learning ability, and their prediction results are combined and trained using the random forest algorithm with different subsets of data and features to enhance the reliability and accuracy of classification. These procedures, as a combined algorithm, make the diagnosis of breast cancer more consistent and reliable than existing methods. The discussion on methodology and research findings is provided in the supplementary document (refer to part D).
Conclusion
This research presents an integrated approach that combines precise segmentation and robust classification to enhance detection performance on the Breast Ultrasound Images Dataset. In Phase I, we employed the state-of-the-art Attention U-Net for tumor segmentation; its ability to capture fine-grained details while maintaining a broader contextual understanding of ultrasound images proved instrumental in this endeavor. Phase II introduced a novel ensemble approach for breast tumor discrimination, in which the Random Forest meta-classifier integrated the base classifiers' outputs, harnessing their collective strength to make robust classification decisions. This integrated approach capitalizes on the strengths of both segmentation and classification, addressing the intricacies of ultrasound images and improving diagnostic accuracy. The extensive evaluation, including metrics such as precision (99.62%), F1-score (99.05%), recall (99.25%), accuracy (99.57%) and AUROC (99.2%), highlighted the effectiveness of our integrated approach. Meanwhile, it was noticed that on continuous training for segmentation, the Attention U-Net model loses its ability to emphasize salient features, leading to information loss. To sustain its capacity to segment distinct feature characteristics of breast tumor images, a hybrid convolutional network is essential. So, as future work, an advanced segmentation technique incorporating a hybrid convolutional network with an attention mechanism will be developed. In addition, it is planned to incorporate multiple datasets to guarantee efficiency in diverse clinical settings and to expand the model's usage to other imaging modalities to improve reliability, including other ultrasound applications such as abdominal, cardiac, or musculoskeletal imaging.
Supplementary Information
Below is the link to the electronic supplementary material.
Author contributions
All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by MB and P. The first draft of the manuscript was written by MB and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.
Funding
The authors declare that no funds, grants, or other support were received during the preparation of this manuscript.
Declarations
Conflict of interest
The authors have no relevant financial or non-financial interests to disclose.
Ethics approval
Not applicable.
Consent to participate
Not applicable.
Consent to publish
Not applicable.
Footnotes
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
- 1.Qi X, Yi F, Zhang L, Chen Y, Pi Y, Chen Y, Guo J, Wang J, Guo Q, Li J, Chen Y. Computer-aided diagnosis of BC in ultrasonography images by deep learning. Neurocomputing. 2022;472:152–65. 10.1016/j.neucom.2021.11.047. [Google Scholar]
- 2.Raza A, Ullah N, Khan JA, Assam M, Guzzo A, Aljuaid H. DeepBreastCancerNet: a novel deep learning model for BC detection using ultrasound images. Appl Sci. 2023;13(4):2082. 10.3390/app13042082. [Google Scholar]
- 3.Jahwar AF, Abdulazeez AM. Segmentation and classification for BC ultrasound images using deep learning techniques: a review. In 2022 IEEE 18th International Colloquium on Signal Processing & Applications. 2022; 225–230. 10.1109/CSPA55076.2022.9781824
- 4.Xu S, Liu L, Zhao Z. DTFTCNet: radar modulation recognition with deep time-frequency transformation. IEEE Trans Cogn Commun Netw. 2023;9(5):1200–10. 10.1109/TCCN.2023.3280949. [Google Scholar]
- 5.Uysal F, Köse MM. Classification of BC ultrasound images with deep learning-based models. Eng Proc. 2022;31(1):8. 10.3390/ASEC2022-13791. [Google Scholar]
- 6.Ragab M, Albukhari A, Alyami J, Mansour RF. Ensemble deep-learning-enabled clinical decision support system for BC diagnosis and classification on ultrasound images. Biology. 2022;11(3):439. 10.3390/biology11030439. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7.Zhang S, Liao M, Wang J, Zhu Y, Zhang Y, Zhang J, Zheng R, Lv L, Zhu D, Chen H, Wang W. Fully automatic tumor segmentation of breast ultrasound images with deep learning. J Appl Clin Med Phys. 2023;24(1):13863. 10.1002/acm2.13863. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.Li Y, Gu H, Wang H, Qin P, Wang J. BUSnet: a deep learning model of breast tumor lesion detection for ultrasound images. Front Oncol. 2022;12:848271. 10.3389/fonc.2022.848271. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 9.Luo Y, Huang Q, Li X. Segmentation information with attention integration for classification of breast tumor in ultrasound image. Pattern Recogn. 2022;124:108427. 10.1016/j.patcog.2021.108427. [Google Scholar]
- 10.Dar RA, Rasool M, Assad A. BC detection using deep learning: Datasets, methods, and challenges ahead. Comput Biol Med 2022; 106073. 10.1016/j.compbiomed.2022.106073 [DOI] [PubMed]
- 11.Michael E, Ma H, Li H, Qi S. An optimized framework for BC classification using machine learning. BioMed Res Int. 2022;2022. 10.1155/2022/8482022 [DOI] [PMC free article] [PubMed]
- 12.Hossain S, Azam S, Montaha S, Karim A, Chowa SS, Mondol C, Hasan MZ, Jonkman M. Automated breast tumor ultrasound image segmentation with hybrid UNet and classification using fine-tuned CNN model. Heliyon. 2023;9(11). 10.1016/j.heliyon.2023.e21369 [DOI] [PMC free article] [PubMed]
- 13.Tekin E, Yazıcı Ç, Kusetogullari H, Tokat F, Yavariabdi A, Iheme LO, Çayır S, Bozaba E, Solmaz G, Darbaz B, Özsoy G. Tubule-U-Net: a novel dataset and deep learning-based tubule segmentation framework in whole slide images of BC. Sci Rep. 2023;13(1):128. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 14.Iqbal A, Sharif M. PDF-UNet: a semi-supervised method for segmentation of breast tumor images using a U-shaped pyramid-dilated network. Expert Syst Appl. 2023;221:119718. 10.1016/j.eswa.2023.119718. [Google Scholar]
- 15.Hossain AA, Nisha JK, Johora F. BC classification from ultrasound images using VGG16 model based transfer learning. Int J Image Graph Signal Process. 2023;13:12. [Google Scholar]
- 16.Cruz-Ramos C, García-Avila O, Almaraz-Damian JA, Ponomaryov V, Reyes-Reyes R, Sadovnychiy S. Benign and malignant breast tumor classification in ultrasound and mammography images via fusion of deep learning and handcraft features. Entropy. 2023;25(7):991. 10.3390/e25070991. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 17.Obayya M, Haj Hassine SB, Alazwari SK, Nour M, Mohamed A, Motwakel A, Yaseen I, Sarwar Zamani A, Abdelmageed AA, Mohammed GP. Aquila optimizer with Bayesian neural network for BC detection on ultrasound images. Appl Sci. 2022;12(17):8679. 10.3390/app12178679. [Google Scholar]
- 18.Wang Y, Yao Y. Breast lesion detection using an anchor-free network from ultrasound images with segmentation-based enhancement. Sci Rep. 2022;12(1):14720. 10.1038/s41598-022-18747-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19.Vigil N, Barry M, Amini A, Akhloufi M, Maldague XP, Ma L, Ren L, Yousefi B. Dual-intended deep learning model for BC diagnosis in ultrasound imaging. Cancers. 2022;14(11):2663. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 20.Iqbal A, Sharif M. BTS-ST: Swin transformer network for segmentation and classification of multimodality BC images. Knowl-Based Syst. 2023;267:110393. 10.1016/j.knosys.2023.110393. [Google Scholar]
- 21.Yan Y, Liu Y, Wu Y, Zhang H, Zhang Y, Meng L. Accurate segmentation of breast tumors using AE U-net with HDC model in ultrasound images. Biomed Signal Process Control. 2022;72:103299. 10.1016/j.bspc.2021.103299. [Google Scholar]
- 22.Jabeen K, Khan MA, Alhaisoni M, Tariq U, Zhang YD, Hamza A, Mickus A, Damaševičius R. BC classification from ultrasound images using probability-based optimal deep learning feature fusion. Sensors. 2022;22(3):807. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 23.Bourouis S, Band SS, Mosavi A, Agrawal S, Hamdi M. Meta-heuristic algorithm-tuned neural network for BC diagnosis using ultrasound images. Front Oncol. 2022;12:834028. 10.3389/fonc.2022.834028. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 24.Liu H, Cui G, Luo Y, Guo Y, Zhao L, Wang Y, Subasi A, Dogan S, Tuncer T. Artificial intelligence-based BC diagnosis using ultrasound images and grid-based deep feature generator. Int J Gen Med, 2022; 2271–2282. [DOI] [PMC free article] [PubMed]
- 25.Saba T, Abunadi I, Sadad T, Khan AR, Bahaj SA. Optimizing the transfer-learning with pretrained deep convolutional neural networks for first stage breast tumor diagnosis using breast ultrasound visual images. Microsc Res Tech. 2022;85(4):1444–1453. 10.1002/jemt.24008 [DOI] [PubMed]
- 26.Sahu A, Das PK, Meher S. High accuracy hybrid CNN classifiers for BC detection using mammogram and ultrasound datasets. Biomed Signal Process Control. 2023;80:104292. 10.1016/j.bspc.2022.104292. [Google Scholar]
- 27.Vijayakumar K, Rajinikanth V, Kirubakaran MK. Automatic detection of BC in ultrasound images using Mayfly algorithm optimized handcrafted features. J Xray Sci Technol. 2022;30(4):751–66. 10.3233/XST-221136. [DOI] [PubMed] [Google Scholar]
- 28.Li Y, Liu Y, Huang L, Wang Z, Luo J. Deep weakly-supervised breast tumor segmentation in ultrasound images with explicit anatomical constraints. Med Image Anal. 2022;76:102315. [DOI] [PubMed] [Google Scholar]
- 29.Inan MSK, Alam FI, Hasan R. Deep integrated pipeline of segmentation guided classification of BC from ultrasound images. Biomed Signal Process Control. 2022;75:103553. 10.1016/j.bspc.2022.103553. [Google Scholar]
- 30.Lu SY, Wang SH, Zhang YD. SAFNet: a deep spatial attention network with classifier fusion for BC detection. Comput Biol Med. 2022;148:105812. 10.1016/j.compbiomed.2022.105812. [DOI] [PubMed] [Google Scholar]
- 31.https://www.kaggle.com/datasets/aryashah2k/breast-ultrasound-images-dataset
- 32.Al-Dhabyani W, Gomaa M, Khaled H, Fahmy A. Dataset of breast ultrasound images. Data Brief. 2020;28:104863. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 33.Alam T, Shia WC, Hsu FR, Hassan T. Improving BC detection and diagnosis through semantic segmentation using the Unet3+ deep learning framework. Biomedicines. 2023;11(6):1536. 10.3390/biomedicines11061536. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 34.Badrinarayanan V, Kendall A, Cipolla R. Segnet: a deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans Pattern Anal Mach Intell. 2017;39(12):2481–95. 10.1109/TPAMI.2016.2644615. [DOI] [PubMed] [Google Scholar]
- 35.Zhang Z, Gao S, Huang Z. An automatic glioma segmentation system using a multilevel attention pyramid scene parsing network. Curr Med Imag. 2021;17(6):751–61. 10.2174/1573405616666201231100623. [DOI] [PubMed] [Google Scholar]
- 36.Ben Ahmed I, Ouarda W, Ben Amar C. Hybrid UNET Model Segmentation for an Early BC Detection Using Ultrasound Images. In International Conference on Computational Collective Intelligence. 2022;464–476. 10.1007/978-3-031-16014-1_37
- 37.Laghmati S, Hicham K, Cherradi B, Hamida S, Tmiri A. Segmentation of BC on Ultrasound Images using Attention U-Net Model. Int J Adv Comput Sci Appl. 2023;14(8).
- 38.Pramanik P, Roy A, Cuevas E, Perez-Cisneros M, Sarkar R. DAU-Net: Dual attention-aided U-Net for segmenting tumor in breast ultrasound images. PLoS ONE. 2024;19(5):e0303670. 10.1371/journal.pone.0303670. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 39.Misra S, Yoon C, Kim KJ Managuli R, Barr RG, Baek J, Kim C. Deep learning‐based multimodal fusion network for segmentation and classification of BCs using B‐mode and elastography ultrasound images. Bioeng Transl Med, 2023;8(6):e10480. 10.1002/btm2.10480 [DOI] [PMC free article] [PubMed]
- 40.Misra S, Jeon S, Managuli R, Lee S, Kim G, Yoon C, Lee S, Barr RG, Kim C. Bi-modal transfer learning for classifying BCs via combined b-mode and ultrasound strain imaging. IEEE Trans Ultrason Ferroelectr Freq Control. 2021;69(1):222–32. 10.1109/TUFFC.2021.3119251. [DOI] [PubMed] [Google Scholar]
- 41.Sivamurugan J, Sureshkumar G. Applying dual models on optimized LSTM with U-net segmentation for breast cancer diagnosis using mammogram images. Artif Intell Med. 2023;143:102626. [DOI] [PubMed] [Google Scholar]