Abstract
Breast cancer diagnosis, immune cell profiling, and survival forecasting are clinically important but usually performed separately, limiting clinical interpretation. This work combines histopathological diagnosis, immunological microenvironment analysis, and prognostic modeling into a single data-driven pipeline. The proposed system involves three phases: the PatchSight Classifier uses an optimized InceptionResNetV2 network with patch-based augmentation and transfer learning to classify benign and malignant breast tissue from the BreakHis dataset; the ImmuneMap Detector applies Faster R-CNN to immunohistochemistry images from the LYSTO dataset to detect and quantify tumor-infiltrating lymphocytes; and the LifeSpan Prognosticator integrates diagnostic and immune features for survival prediction. The PatchSight Classifier outperformed VGG-16, DenseNet-121, and baseline InceptionResNetV2 models, achieving 98.76% accuracy and a 0.98 F1-score at 400× magnification. The ImmuneMap Detector, built on a ResNet-101 backbone, achieved 98% detection accuracy with low lymphocyte counting error. The LifeSpan Prognosticator identified survival-influencing biomarkers with a C-index above 0.90. This comprehensive computational pathology system improves diagnostic precision, immunological assessment, and survival prediction with interpretable, high-accuracy models, providing end-to-end decision support for early detection, immunological assessment, and personalized breast cancer prognosis.
Keywords: Breast cancer prediction, Deep learning, Survival analysis
Introduction
Breast cancer is the most frequent malignancy affecting women globally. Despite advances in screening and therapy, clinical oncology still struggles with diagnosis, immune microenvironment evaluation, and survival prediction [1]. Traditional pathology workflows analyze histopathology and immunohistochemistry (IHC) slides manually, which is time-consuming, subjective, and prone to inter-observer variability [2]. Statistical survival estimates also neglect the complex relationships between imaging, immunological response, and genetics that determine patient outcomes [3]. These difficulties call for automated, integrated, and interpretable systems for diagnosis, immunological profiling, and prognosis.
Modern deep learning and computational pathology have transformed medical image analysis [4]. Architectures such as VGGNet, DenseNet, and InceptionResNet excel at detecting malignant tissue in histopathology images, while Faster R-CNN is effective at identifying immune cells in IHC slides [5]. Survival modeling methods such as Cox Proportional Hazards (Cox PH), Random Survival Forests (RSF), and DeepHit have likewise improved patient outcome prediction [6]. However, most studies focus on diagnostic categorization, immunological measurement, or survival estimation in isolation, without a continuous analytical pathway that connects these stages into a clinically interpretable procedure [5, 6].
To fill this gap, this study proposes a three-phase unified deep learning approach (Fig. 1) that integrates diagnostic classification, immune cell identification, and survival outcome prediction for breast cancer analysis. The proposed system is the PatchSight–ImmuneMap–LifeSpan (PIL) Framework. In Phase I, the PatchSight Classifier classifies BreakHis histopathology image patches into benign and malignant classes using an optimized InceptionResNetV2 architecture with transfer learning, batch normalization, and dropout regularization. In Phase II, the ImmuneMap Detector uses a Faster R-CNN model with a ResNet-101 backbone to automatically recognize and quantify tumor-infiltrating lymphocytes (TILs) in IHC images from the LYSTO dataset, generating Immunoscores that measure immune activity and therapy response. In Phase III, the LifeSpan Prognosticator applies DeepHit neural networks, Cox PH, and Random Survival Forests trained on the METABRIC dataset to the diagnostic and immune-derived features to estimate patient survival probability.
Fig. 1.
PatchSight–ImmuneMap–LifeSpan (PIL) framework
The proposed PIL framework integrates histopathology interpretation with immunological and prognostic modeling in a data-driven pipeline, unlike previous methods [7]. Multi-level feature integration, optimal architectural improvements to InceptionResNetV2 for patch-level accuracy, and cross-domain learning technique combining visual and clinical modalities distinguish it from previous works [8].
This study makes three contributions:
Framework innovation: A novel, interpretable, three-phase AI pipeline (PatchSight–ImmuneMap–LifeSpan) for breast cancer diagnosis, immunological profiling, and survival prediction.
Architecture optimization: Optimized InceptionResNetV2 and Faster R-CNN architectures with multi-stage feature fusion and transfer learning that improve diagnostic and immune detection accuracy.
Prognostic modeling: Multi-model survival prediction combining DeepHit, Cox PH, and RSF to find biomarkers and create tailored survival curves.
Compared to existing deep learning and survival modeling approaches, the proposed PIL framework improves diagnostic classification accuracy by up to 4.2%, lymphocyte detection precision by 9–24%, and survival prediction C-index by 17–27%. These enhancements demonstrate the robustness and clinical interpretability of the unified system across diagnostic, immunological, and prognostic tasks.
Experimental results demonstrate the framework's effectiveness: the PatchSight Classifier achieved 98.76% accuracy and a 0.98 F1-score, the ImmuneMap Detector achieved 98% lymphocyte quantification accuracy, and the LifeSpan Prognosticator achieved a Harrell's C-index above 0.90 for survival prediction. These results show that the integrated pipeline outperforms single-task models and is scalable for data-driven, individualized breast cancer management.
Preliminaries
This section describes the theoretical and mathematical foundations of the PatchSight–ImmuneMap–LifeSpan (PIL) paradigm. The approach uses deep convolutional neural networks (CNNs) for diagnostic classification, region-based convolutional object detection for immunological quantification, and survival modeling for prognostic assessment [9].
Convolutional network bases
Hierarchical feature representations are optimized using a convolutional neural network to map input images to diagnostic classes [10].
Let $X \in \mathbb{R}^{H \times W \times C}$ represent an image with height $H$, width $W$, and channels $C$. Each convolutional layer $l$ performs the operation:

$$A^{(l)} = \sigma\left(W^{(l)} * A^{(l-1)} + b^{(l)}\right) \tag{1}$$

where $W^{(l)}$ and $b^{(l)}$ denote the learnable kernel weights and biases, $*$ is the convolution operator, and $\sigma(\cdot)$ is a nonlinear activation function (typically ReLU), with $A^{(0)} = X$.
Pooling layers reduce spatial dimensions to enhance translation invariance:

$$A^{(l)}_{i,j} = \max_{(m,n) \in \mathcal{N}_{k,s}(i,j)} A^{(l-1)}_{m,n} \tag{2}$$

where $k$ and $s$ represent the kernel size and stride of the pooling window $\mathcal{N}_{k,s}(i,j)$.
The network concludes with a dense layer and softmax activation:

$$\hat{y} = \operatorname{softmax}\left(W^{(fc)} a + b^{(fc)}\right) \tag{3}$$

for $K$ diagnostic categories.

The cross-entropy loss for classification is given by:

$$\mathcal{L}_{CE} = -\sum_{k=1}^{K} y_k \log \hat{y}_k \tag{4}$$

where $y_k$ is the one-hot ground-truth label.
Region-based object detection framework
This cross-entropy loss is the main optimization objective of the PatchSight Classifier, built on the optimized InceptionResNetV2. The ImmuneMap Detector builds on the Faster R-CNN paradigm, which combines region proposal generation and object detection in an end-to-end trainable architecture [11, 12].
Given an image $I$, a feature extractor $E(\cdot)$ generates a feature map $F = E(I)$. The Region Proposal Network (RPN) outputs anchor boxes $\{b_j\}$ characterized by objectness scores $p_j$ and bounding-box coordinates $t_j$.

The total detection loss combines classification and bounding-box regression:

$$\mathcal{L}_{det} = \frac{1}{N_{cls}} \sum_j \mathcal{L}_{cls}(p_j, p_j^*) + \lambda \frac{1}{N_{reg}} \sum_j p_j^* \, \mathcal{L}_{reg}(t_j, t_j^*) \tag{5}$$
where $p_j^*$ is the ground-truth label for anchor $j$, and $\mathcal{L}_{reg}$ is the smooth $L_1$ loss:

$$\operatorname{smooth}_{L_1}(x) = \begin{cases} 0.5x^2 & \text{if } |x| < 1 \\ |x| - 0.5 & \text{otherwise} \end{cases} \tag{6}$$
Detected lymphocytes within a region of interest (ROI) are counted as:

$$N_{TIL} = \sum_j \mathbb{1}\left[\operatorname{IoU}(b_j, R) \geq \tau\right] \tag{7}$$

where $\operatorname{IoU}$ is the Intersection-over-Union score with threshold $\tau$.
The Immunoscore (IS), a quantitative indicator of immune activity, is then defined as:

$$IS = \frac{N_{TIL}}{|R|} \tag{8}$$

where $|R|$ is the area of the ROI.
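Eqs. (7)–(8) can be sketched in plain Python. Counting a detection when its IoU with the ROI exceeds the threshold $\tau$ is our reading of Eq. (7), and the boxes below are purely illustrative:

```python
def iou(box_a, box_b):
    # Intersection-over-Union of two boxes given as (x1, y1, x2, y2)
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def immunoscore(detections, roi, tau=0.5):
    # Eq. 7: count detections overlapping the ROI above threshold tau;
    # Eq. 8: normalize the count by the ROI area |R|
    n_til = sum(1 for b in detections if iou(b, roi) >= tau)
    roi_area = (roi[2] - roi[0]) * (roi[3] - roi[1])
    return n_til, n_til / roi_area
```

For example, `immunoscore([(0, 0, 9, 9), (0, 0, 2, 2)], (0, 0, 10, 10))` counts only the first box against a 10×10 ROI.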
Survival analysis and time-to-event modeling
LifeSpan Prognosticator predicts survival using clinical, diagnostic, and immunological variables [13]. We use three complementary survival modeling paradigms:
In the Cox PH model, for a patient $i$ with covariate vector $x_i$, the hazard function is modeled as:

$$h(t \mid x_i) = h_0(t) \exp\left(\beta^{\top} x_i\right) \tag{9}$$

where $h_0(t)$ is the baseline hazard and $\beta$ are learnable coefficients.
The model parameters are estimated by maximizing the partial log-likelihood:

$$\ell(\beta) = \sum_{i:\,\delta_i = 1} \left[ \beta^{\top} x_i - \log \sum_{j \in R(t_i)} \exp\left(\beta^{\top} x_j\right) \right] \tag{10}$$

where $\delta_i$ indicates an observed event and $R(t_i)$ is the risk set at time $t_i$.
RSF is an ensemble of survival trees that partition the data by covariates and estimate cumulative hazard functions via the Nelson–Aalen estimator:

$$\hat{H}_b(t) = \sum_{t_k \leq t} \frac{d_k}{Y_k} \tag{11}$$

with $d_k$ events and $Y_k$ individuals at risk at time $t_k$. The final ensemble estimate averages over the $B$ trees:

$$\hat{H}_{ens}(t \mid x) = \frac{1}{B} \sum_{b=1}^{B} \hat{H}_b(t \mid x) \tag{12}$$
DeepHit jointly models survival time and event type, estimating their joint distribution with a composite loss:

$$\mathcal{L}_{DeepHit} = \mathcal{L}_{likelihood} + \alpha\, \mathcal{L}_{ranking} \tag{13}$$

where $\mathcal{L}_{likelihood}$ models the observed event probability, and $\mathcal{L}_{ranking}$ enforces temporal ordering between survival times.
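The two terms of Eq. (13) can be sketched for a single risk over discrete time bins. The exponential ranking penalty follows the published DeepHit formulation, but this is a simplified illustration under assumed inputs, not the authors' implementation:

```python
import numpy as np

def deephit_losses(pmf, time_idx, events, sigma=0.1):
    # pmf: (n, T) predicted probability mass over discrete time bins.
    # L_likelihood: negative log-probability of the observed event bin.
    # L_ranking: penalizes pairs where the patient with the earlier event
    # is not assigned higher cumulative incidence at its event time.
    cif = np.cumsum(pmf, axis=1)                     # cumulative incidence
    n = len(time_idx)
    l_lik = -np.mean([np.log(pmf[i, time_idx[i]] + 1e-12)
                      for i in range(n) if events[i] == 1])
    l_rank = 0.0
    for i in range(n):
        if events[i] != 1:
            continue
        for j in range(n):
            if time_idx[i] < time_idx[j]:            # i had the event earlier
                l_rank += np.exp(-(cif[i, time_idx[i]]
                                   - cif[j, time_idx[i]]) / sigma)
    return l_lik, l_rank
```

The ranking term is small when the earlier-event patient already has clearly higher cumulative incidence, mirroring the concordance intuition behind the C-index.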
Integration principle
The three modules are linked by hierarchical feature fusion [14]. Let the diagnostic prediction $\hat{y}_{diag}$, immune quantification $IS$, and clinical covariates $x_{clin}$ form a composite vector:

$$z = \left[\hat{y}_{diag},\; IS,\; x_{clin}\right] \tag{14}$$

This feature vector serves as input to the survival predictor $f_{surv}(z)$, enabling multi-modal learning across visual and clinical spaces.
The unified objective for the integrated framework can be expressed as:

$$\mathcal{L}_{total} = \mathcal{L}_{diag} + \lambda_1 \mathcal{L}_{det} + \lambda_2 \mathcal{L}_{surv} \tag{15}$$

where $\lambda_1$ and $\lambda_2$ balance the importance of the respective modules.
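Operationally, the fusion of Eq. (14) and the weighted objective of Eq. (15) amount to simple concatenation and a weighted sum; the feature values and weights below are purely illustrative:

```python
import numpy as np

# Eq. 14: composite vector z from the three modules (values are illustrative)
y_diag = np.array([0.93])              # malignancy probability from PatchSight
imm_score = np.array([0.012])          # Immunoscore from ImmuneMap
x_clin = np.array([54.0, 1.0, 2.0])    # e.g. age, receptor status, grade
z = np.concatenate([y_diag, imm_score, x_clin])

def total_loss(l_diag, l_det, l_surv, lam1=1.0, lam2=1.0):
    # Eq. 15: weighted sum of the three module losses
    return l_diag + lam1 * l_det + lam2 * l_surv
```

In practice $\lambda_1$ and $\lambda_2$ would be tuned (e.g. on a validation set) so that no single module's loss dominates the gradient signal.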
The preceding formulations underpin the PIL Framework, which integrates CNN-based spatial learning, region-based object detection, and statistical survival analysis. These preliminaries lay the groundwork for the proposed methodology and its experimental execution.
Problem statement
Due to tumor heterogeneity, varied histological patterns, and pathologists' subjective slide interpretation, breast cancer diagnosis and prognosis remain among the most difficult challenges in computational pathology [15]. Immunohistochemistry (IHC) slides show immune microenvironmental cues that are often overlooked in automated diagnostic procedures, while histopathology images contain fine morphological characteristics that are hard to quantify. Most studies treat diagnostic classification, immunological profiling, and survival prediction separately, preventing the development of biomarkers that link morphology, immunity, and prognosis in a single computational pipeline [16].
Let $\mathcal{D} = \{(x_i^{H}, x_i^{I}, x_i^{C})\}_{i=1}^{N}$ be a set of input samples, where each sample consists of a histopathology image patch $x_i^{H}$, an IHC image $x_i^{I}$, and associated clinical features $x_i^{C}$.

The proposed pipeline comprises three primary objectives:

(a) Diagnostic Classification

Learn a mapping:

$$f_{diag}: x^{H} \rightarrow y, \quad y \in \{0, 1\} \tag{16}$$

where $y = 0$ denotes benign and $y = 1$ denotes malignant tissue.
The goal is to maximize classification accuracy:

$$f_{diag}^{*} = \arg\max_{f} \frac{1}{N} \sum_{i=1}^{N} \mathbb{1}\left[f(x_i^{H}) = y_i\right] \tag{17}$$

where $\mathbb{1}(\cdot)$ is the indicator function. The optimized model $f_{diag}^{*}$ constitutes the PatchSight Classifier.
(b) Immune Quantification

Given the diagnostic outputs and corresponding IHC images, learn:

$$f_{imm}: x^{I} \rightarrow IS \tag{18}$$

where $IS$ represents the Immunoscore derived from tumor-infiltrating lymphocyte (TIL) density.
The objective minimizes the TIL detection and counting loss:

$$f_{imm}^{*} = \arg\min_{f} \sum_{i=1}^{N} \left| \hat{N}_{TIL,i} - N_{TIL,i} \right| \tag{19}$$

where $N_{TIL,i}$ is the ground-truth lymphocyte count obtained from annotated images. The optimized model $f_{imm}^{*}$ defines the ImmuneMap Detector.
(c) Survival Prediction
Integrate diagnostic, immune, and clinical features to estimate survival risk or time-to-event.
Let the final feature representation for each patient be:

$$z_i = \left[f_{diag}(x_i^{H}),\; f_{imm}(x_i^{I}),\; x_i^{C}\right] \tag{20}$$

Learn the mapping:

$$f_{surv}: z_i \rightarrow \hat{r}_i \tag{21}$$

where $\hat{r}_i$ denotes the estimated survival risk or survival time.
The survival model is trained by minimizing a task-specific loss, such as the negative log partial likelihood in the Cox model:

$$\mathcal{L}_{surv} = -\sum_{i:\,\delta_i = 1} \left[ \beta^{\top} z_i - \log \sum_{j \in R(t_i)} \exp\left(\beta^{\top} z_j\right) \right] \tag{22}$$

The optimized function $f_{surv}^{*}$ represents the LifeSpan Prognosticator.
The overall pipeline aims to jointly optimize the three components:

$$\left(f_{diag}^{*},\; f_{imm}^{*},\; f_{surv}^{*}\right) = \arg\min \mathcal{L}_{total}$$

to achieve an interpretable, generalizable, and high-accuracy model across diagnostic, immunological, and prognostic tasks.

Formally:

$$\mathcal{L}_{total} = \mathcal{L}_{diag} + \lambda_1 \mathcal{L}_{imm} + \lambda_2 \mathcal{L}_{surv} \tag{23}$$

where $\lambda_1$ and $\lambda_2$ are weighting parameters that balance the three objectives.
Morphological, immunological, and clinical variables work together in the survival model through unified optimization, boosting robustness, generalizability, and interpretability.
The PIL framework's end-to-end integration of diagnostic, immunological, and clinical modalities is expected to surpass single-task, hand-crafted, and traditional techniques in predictive accuracy and interpretability. By optimizing morphological and immunological representations under a unified loss, the approach is expected to uncover hidden relationships that explain tumor aggressiveness and patient-specific survival, enabling clinically actionable AI for precision oncology [17].
Literature survey
Breast cancer prediction, diagnosis, and prognosis have been extensively explored using a range of computational techniques, from classical machine learning methods to advanced deep learning architectures. Several studies have focused on histopathology image analysis, tumor-infiltrating lymphocyte (TIL) detection, survival prediction, and recurrence prediction [18, 19]. This section outlines the important contributions in these areas, highlighting the methods employed, the datasets used, performance measures, and limitations.
Histopathology image classification
Histopathology image analysis is widely used for automated breast cancer diagnosis. On public datasets such as BreakHis and Bisque, researchers have applied CNNs and transfer learning [20]. Table 1 summarizes studies in this field, including methods, datasets, accuracy, and limitations. Deep learning architectures such as ResNet, DenseNet, and hybrid CNN models yield good classification accuracy, but generalizability across magnification factors and dataset diversity remains difficult [27].
Table 1.
Summary of related works-1
| Reference | Algorithm/Model | Dataset | Accuracy | Limitations |
|---|---|---|---|---|
| [21] | ResNet-50 + Kernelized Weighted Extreme Learning Machine | BreakHis, Bisque | 88.36% (40X), 87.14% (100X), 90.02% (200X), 84.16% (400X) | Accuracy decreases at higher magnification (400X), may not generalize to unseen datasets |
| [22] | DenseNet-169 (with layer modifications) | Breast histology images (invasive ductal) | 98.21% | Only invasive ductal carcinoma considered; may not generalize to other subtypes |
| [23] | DenseNet-201 (pre-trained CNN for feature extraction & classification) | Breast histopathology images | 95.8% | Pre-trained model dependency; dataset specifics not fully described |
| [24] | CNN + ML | Ductal carcinoma images | 87% | Requires accurate nuclei segmentation; may be computationally intensive |
| [25] | DenseNet201 + XGBoost (Transfer Learning + Classifier) | BreakHis | 93.6% (40X), 91.3% (100X), 93.8% (200X), 89.1% (400X) | Accuracy decreases with higher magnifications; combination model may be complex |
| [26] | CNN + Grey Wolf Optimization (GWO), and Modified Gorilla Troops Optimization (MGTO) | BreakHis | 93.13% | Relatively low accuracy |
[21] introduced a model that combines ResNet-50 with a kernelized weighted extreme learning machine to diagnose breast cancer from histopathological images. The histopathology images in their research were sourced from the BreakHis and Bisque databases. The model achieved accuracies of 88.36%, 87.14%, 90.02%, and 84.16% at magnifications of 40X, 100X, 200X, and 400X, respectively. [22] presented a dense deep architecture to classify invasive ductal breast histology images, using DenseNet-169 with some layer modifications. The model achieved an accuracy of 98.21%.
[23] aimed to simplify automatic breast cancer categorization using CNNs; a pre-trained network, DenseNet-201, was used for automatic feature extraction and classification. [24] proposed a novel methodology to detect nuclei in breast cancer histopathology images using anisotropic diffusion filters. Nuclei detection in ductal carcinoma breast images was performed using multilevel saliency, and a slightly modified CNN model classified the segmented nuclei into benign and malignant. The combined CNN and multilevel saliency model achieved an accuracy of 87%.
[26] introduced a novel method called Deep Neural Network with Support Value (DNNS) to produce higher-quality images and accurately categorize carcinoma in breast tissue from histopathology images.
[25] created a transfer learning based method combining the DenseNet201 model with XGBoost: DenseNet201 extracted features from the breast cancer histopathology images, and XGBoost served as the classifier. They evaluated performance on images at all magnification factors in the BreakHis dataset, achieving accuracies of 93.6%, 91.3%, 93.8%, and 89.1% at magnifications of 40X, 100X, 200X, and 400X, respectively.
[28] created a new transfer learning based method for classifying breast histopathology images that handles both magnification-dependent (binary) and magnification-independent (multiclass) classification. The work utilizes ResNet-18 with block-wise fine-tuning; data augmentation and global contrast normalization were applied to improve classification performance. [29] utilized deep learning techniques to automatically classify breast cancer histopathology images. To rectify data imbalance, augmentation was implemented; transfer learning was then applied to models including InceptionV3 and InceptionResNetV2 to accomplish multiclass and binary classification. An autoencoder was further used to project the features obtained by InceptionResNetV2 into a smaller-dimensional space before clustering; this autoencoder-based clustering gave much better results than InceptionResNetV2 alone.
Tumor-infiltrating lymphocyte (TIL) detection
TIL quantification and analysis are essential for understanding the immune response in breast cancer and other tumors. Several studies have used U-Net variants, CNN ensembles, and exclusive autoencoders (XAE) to detect and quantify lymphocytes. Table 2 summarizes these methodologies, datasets, performance indicators, and constraints. High F-scores and Dice coefficients show these approaches work, but dataset size, staining variability, and computational complexity remain challenges.
Table 2.
Summary of related works-2
| Reference | Algorithm/Model | Dataset | Accuracy/Performance | Limitations |
|---|---|---|---|---|
| [30] | convolutional neural network (CNN) architectures (namely VGG16, Inception-V4, and ResNet-34) | Tumour WSIs from 89 patients | Correlation coefficient 0.89 with manual TIL counts | Small dataset; performance dependent on manual annotation quality; focused only on TILs in tumour regions |
| [31] | EUNet (U-Net with EfficientNet encoder) | Neuroblastoma tissue slides labeled with CD3 T-cell marker | Mean Absolute Error (MAE) 3.1 | Limited to Neuroblastoma; requires accurate lymphocyte labeling; computational complexity due to resampling & augmentation |
| [32] | CNNs: 34-layer ResNet, Inception v4, 16-layer VGG | Breast cancer whole slide images | Comparable to best-reported methods | Accuracy not quantified; multi-model ensemble increases computational cost |
| [33] | Framework: ROI selection + U-Net T-lymphocyte detection + TIL counting | Breast cancer whole slide images | Correlated TIL score with genes/immune pathways | No direct numeric accuracy; pipeline dependent on ROI selection quality |
| [34] | BCF-Lym-Detector | Histopathology images | F-score 0.75, Dice coefficient 88.31% | May require substantial computation; generalization to other tissues unknown |
| [35] | LSATM-Net (two-step CNN: artifact removal + instance segmentation) | LYSTO dataset | F-score 0.74 | Moderate accuracy; pipeline complexity; performance may vary with different staining protocols |
| [36] | PathoNet (U-Net backbone for lymphocyte and Ki-67 detection) | Immunohistochemistry-stained breast cancer images | Outperforms state-of-the-art in harmonic mean measure | Specific to IHC images; exact numeric accuracy not reported; may require extensive training data |
[30] proposed a deep learning TIL detection method comprising two algorithms. The first identifies TIL-containing tissue areas from the whole image and passes the results to a second algorithm that measures TILs at each site. The technique was applied to tumour WSIs from 89 patients and associated with clinicopathological data. The algorithm had a 0.89 correlation coefficient with the manual TIL count in the test set (n = 47) with TILs. Increased TIL density was linked to lower tumor stage, absence of lymphovascular invasion, and seminoma histology.
[31] developed a novel machine learning (ML) method for quantifying immunological activity in neuroblastoma (NB) specimens. The method first trains EUNet, a U-Net model with an EfficientNet encoder, to recognize lymphocytes on digitized tissue slides labeled with the CD3 T-cell marker. Resampling, data augmentation, and transfer learning reduce overfitting and selection bias and ensure repeatability. With a mean absolute error of 3.1 on the test set, the EUNet model predicts densities that match those of expert pathologists, offering a potential new method for assessing the immunological content of neuroblastoma patients.
[32] have developed and reviewed algorithms for convolutional neural network analysis to provide integrated maps of TILs and cancer areas in standard diagnostic whole slide tissue images of breast cancer. The merged maps help with better TIL quantification by shedding light on the distribution and anatomical characteristics of lymphocyte infiltrates. Three CNNs namely 34-layer ResNet, Inception v4, and 16-layer VGG were used to assess both tumour and TIL evaluations. The findings were comparable to those obtained using the best-reported methods.
A framework for Tumour Infiltrating lymphocyte characterization in breast cancer histology images was created by [33]. Three tasks were carried out using their framework: (1) automated area of interest (ROI) selection; (2) Employing U-Net for T-lymphocyte identification; and (3) counting of TILs on whole slide images. The genes and immune response pathway showed a substantial correlation with the TIL score. The quantity and dispersion of TIL clusters, as represented by spatial TIL characteristics, were significant predictors of patient outcomes.
[34] employed a unique feature extraction technique named BCF-Lym-Detector to recognize and segment lymphocytes. In contrast to state-of-the-art techniques, BCF-Lym-Detector was able to give an F-score of 0.75 for lymphocyte detection.
[35] identify and classify cells from histopathology images in two steps. The proposed pipeline uses a customized CNN named "LSATM-Net", which applies split, asymmetric transform, and merge techniques to remove hard negative occurrences (artifacts) in the first phase. In the second phase, instance segmentation determines lymphocyte counts and compares them to ground-truth data. The pipeline was also tested against advanced single- and two-stage detectors such as SC Net, Cascade-Mask RCNN, Mask RCNN, and RetinaNet. Empirical assessment on LYSTO dataset samples shows that LSATM-Net can accurately remove difficult negative stain artifacts and learn image alterations, achieving an F-score of 0.74.
PathoNet was proposed by [36] as a backbone architecture for U-Net to identify lymphocytes and the nuclear protein Ki-67 in immunohistochemistry-stained breast cancer images. In terms of the harmonic mean measure obtained, PathoNet performs better than the state-of-the-art techniques put forward thus far.
Breast cancer survival prediction
Both classical machine learning and deep learning models have been used to predict breast cancer survival. Random Forests, SVMs, Decision Trees, Cox Proportional Hazards models, and deep neural networks have been applied to institutional registries and SEER databases. Table 3 lists these studies' prediction accuracy and limitations. These studies emphasize model interpretability, dataset restrictions, and survival estimation using tumor size, lymph node involvement, and cancer stage.
Table 3.
Summary of related works-3
| Reference | Algorithm/Model | Dataset | Accuracy/Performance | Limitations |
|---|---|---|---|---|
| [37] | Random Forest, Extreme Boost, Decision Tree, Logistic Regression, SVM, Neural Networks | Shiraz and Tehran datasets | 85.56% | Limited feature interpretability; dataset specifics not fully described; only standard ML models applied |
| [38] | Cox PH model, Survival SVM, Extreme Gradient Boosting, Random Survival Forests | Breast cancer patient survival data | C-Index: Cox PH 0.63, Extreme Gradient Boosting 0.73 | Small improvement over Cox PH; limited to survival prediction; dataset size not specified |
| [39] | Deep Learning with Gated Attention + Random Forest | Breast carcinoma dataset (multimodal) | Accuracy 0.91, Recall 0.79, Precision 0.84, AUC 0.95 | No predictive markers identified; model interpretability limited |
| [40] | Cox Regression + Feature Selection | Breast cancer patient data | Hazard ratio 4.8 for diabetes affecting early recurrence | Focused only on selected clinical features; not generalizable to ML predictive models |
| [41] | Deep Learning & ML (Decision Tree, Random Forest, MLP, SVM) | 4,901 patients from University of Malaya Breast Cancer Registry | DT: 82.5%, RF: 83.3%, MLP: 88.2%, SVM: 80.5% | SVM underperformed; dataset limited to single center; feature selection not detailed |
| [42] | Pareto-optimal Deep Neural Networks (DNNs) + NSGA-III + Fuzzy Inferencing | SEER database | F1, Accuracy, Sensitivity optimized; | Complex pipeline; interpretability requires understanding of hyperparameter impact; may be computationally intensive |
| [43] | Hybrid Data Mining: LASSO + GA + ANN & Logistic Regression + SMOTE/RUS | Breast cancer patient survival over 1, 5, 10 years | Sensitivity analysis showed important features: cancer grade, metastasis, lymph node involvement; | Multi-stage pipeline; long-term prediction may be dataset dependent; requires extensive preprocessing |
[37] developed machine learning breast cancer survival models. Prediction models used random forests, extreme boosting, decision trees, logistic regression, SVMs, and neural networks. Clustering by breast tissue receptor status enabled advanced modeling with random forest, and variable selection in the random forest model ranked the important features. The decision tree was the least accurate at 79.8% and the random forest the most accurate at 85.56%. This study found that tumour size, number of positive lymph nodes, total axillary lymph nodes removed, cancer stage classification, diagnosis methods, and primary treatment modalities were the most important variables. Such criteria can improve healthcare decision-making.
[38] used the Concordance-Index (C-Index) to compare the Cox PH model with survival SVMs, extreme gradient boosting, and random survival forests for survival prediction. They concluded that ML models could perform as well as the Cox Proportional Hazards model, which had a C-Index of 0.63, and that extreme gradient boosting could even outperform it with a score of 0.73.
[39] proposed a deep learning method using gated attention and a random forest algorithm, showing that breast cancer prognosis is more accurate with multimodal data. The model achieved 0.79 recall, 0.91 accuracy, 0.84 precision, and 0.95 AUC. No survival-predicting markers were identified.
[40] used feature selection in a Cox regression model to identify breast cancer prognosis, survival, and recurrence factors. The survival analysis with feature selection showed that diabetic patients had a significantly higher chance of early breast cancer recurrence.
[41] estimated breast cancer survival rates in 4,901 University of Malaya Medical Centre Breast Cancer Registry patients using deep learning and machine learning. The decision tree (DT), random forest (RF), and multilayer perceptron (MLP) classifiers predicted survival in the investigated samples with 82.5%, 83.3%, and 88.2% accuracy, respectively, while SVM yielded 80.5%. Tumour size was the best predictor of breast cancer survival in this study.
A reliable prognostic modeling technique was suggested by [42]. The SEER database provided breast cancer data for this study. A Pareto-optimal set of deep neural networks (DNNs) with good performance metrics is produced, with the evolutionary approach NSGA-III optimizing DNN hyperparameters after initialization. Fuzzy inference then selects the final DNN from the Pareto-optimal set. Instead of using DNNs as a black box, the suggested method shows how hyperparameter changes affect F1, accuracy, and sensitivity, and the improved interpretability can also guide DNN behavior.
[43] identified changing survival characteristics using hybrid data mining, assessing factors over one, five, and ten years. LASSO regression analysis and GA metaheuristic optimization created their most cost-effective models. To reduce the survival–death class imbalance, Random Under-sampling (RUS) and the Synthetic Minority Over-sampling Technique (SMOTE) were used to improve the classification models. The final phase used 10-fold cross-validation with ANN and Logistic Regression models, and each model used sensitivity analysis (SA) to determine variable relevance by era and model. Cancer grade, metastasis, and lymph node involvement affect survival, and their importance increases when predicting 10-year survival. Similar trends in tumour markers and cancer stage improve prognosis.
Breast cancer recurrence prediction
Breast cancer recurrence prediction is crucial for post-treatment care. SVMs, ANNs, Decision Trees, ensemble learning, and natural language processing have been applied to the Wisconsin Prognostic Breast Cancer Dataset and institutional registries. Table 4 highlights these studies' achievements, prediction accuracy, and limitations. Feature selection and ensemble methods enhance predictive accuracy, but generalizability and dataset size remain issues.
Table 4.
Summary of related works-4
| Reference | Algorithm/Model | Dataset | Accuracy/Performance | Limitations |
|---|---|---|---|---|
| [44] | Natural Language Processing + Machine Learning (OneR algorithm) | Medical records (unspecified) | Best sensitivity-specificity tradeoff with OneR | Prognostic features for survival not identified; dataset details not provided |
| [45] | ANN, SVM, Decision Tree | Iranian Centre for Breast Cancer (ICBC) | SVM: 0.957, ANN: 0.947, DT: 0.936 | Limited to Iranian dataset; predictive markers not analyzed; generalizability uncertain |
| [46] | Multiple Linear Regression, SVM with RBF Kernel, Decision Tree (Gini, Entropy, Information Gain) | Wisconsin Prognostic Breast Cancer Dataset (WPBC) | SVM + K-fold CV: highest accuracy | Dataset limited; specific performance metrics for all models not detailed |
| [47] | SVM, Naive Bayes, C4.5 Decision Tree + Feature Selection | Wisconsin Dataset, UCI ML Repository | Improved accuracy after feature selection | Accuracy dependent on feature selection; dataset limited in size and scope |
| [48] | SVM, ANN, Cox Proportional Hazard Regression | 678 Korean patients | SVM performed best | Small dataset; limited to Korean population; 7 features used only |
| [49] | XGBoost + Case-Based Reasoning (Ensemble Learning) | Breast cancer recurrence data (unspecified) | Enhanced precision for recurrence prediction | Dataset not specified; interpretability relies on case-based reasoning; performance not quantitatively reported |
| [50] | Deep Neural Network, Random Forest | Wisconsin Prognostic Breast Cancer, UCI ML Repository | Random Forest gave best accuracy | Exact performance numbers not reported; dataset limited; generalizability uncertain |
A medical lexicon was constructed using natural language processing to extract important features from medical records [44], and several machine learning methods then predicted breast cancer recurrence. Balanced sensitivity and specificity made the OneR algorithm the best performer; survival-related prognostic factors were not identified. [45] created prediction models using ANN, SVM, and Decision Tree. Data from the Iranian Centre for Breast Cancer (ICBC) was used to examine these three well-known algorithms' specificity, sensitivity, and accuracy. The study found SVM, ANN, and DT accuracies of 0.957, 0.947, and 0.936, respectively: SVM classification offers the best accuracy and lowest error rate for breast cancer recurrence prediction, while the decision tree model had the lowest accuracy.
[46] predicted breast cancer recurrence using machine learning and evaluated every model for accuracy, sensitivity, and related metrics. All models used features from the Wisconsin Prognostic Breast Cancer Dataset. Multiple Linear Regression, Support Vector Machine with an RBF kernel and leave-one-out (K-fold) cross-validation, and Decision Trees with the Gini Index, Entropy, and Information Gain criteria were employed. SVM with K-fold cross-validation predicted recurrence and non-recurrence most accurately.
Data mining has also been applied to breast cancer recurrence prediction [47], with a focus on improving model accuracy. Patient data came from the Wisconsin dataset in the UCI machine learning repository. SVM, Naive Bayes, and a C4.5 Decision Tree were applied to 35 dataset features. Feature selection removed lower-ranking features, which complicate classification algorithms with minimal benefit; all three algorithms became more accurate after retaining only the highest-ranking features.
A novel SVM prognosis model [48] predicted recurrence in Korean breast cancer patients within 5 years of surgery, and its predictive power was then compared against other popular models. A Korean tertiary teaching hospital provided 1995–2001 data on 678 breast cancer patients, and seven pathological features were used for prediction. Artificial neural network, SVM, and Cox proportional hazards regression models were compared; the SVM-based BCRSVM model performed best, outperforming previous breast cancer recurrence prognosis models.
A decision support system combined ensemble learning with case-based reasoning [49] to improve clinicians' breast cancer recurrence predictions. The clearer case-based explanations help clinicians judge the system's forecast accuracy and make decisions, while the cutting-edge extreme gradient boosting (XGBoost) ensemble algorithm drives the recurrence prediction itself.
[50] applied Deep Neural Network and Random Forest classifiers separately to improve prediction accuracy on the Wisconsin Prognostic Breast Cancer dataset from the UCI Machine Learning Repository. Random Forest was the most accurate, improving on previous studies.
Many studies have applied deep learning to breast cancer histopathology images. Most use transfer learning, training a CNN on a non-clinical dataset and fine-tuning the network on medical images with the learned features [20, 51]; few construct and train a custom CNN architecture on medical data directly [52, 53]. Because most histopathological images are high-resolution, training on them directly would demand substantial memory and effort, so memory usage and parameter counts must be reduced for faster training. Few studies have used ML to forecast breast cancer prognosis and determine the immunoscore, a good prognostic indicator, from immunohistochemistry (IHC) images, and few have counted lymphocytes to estimate the immunoscore [54, 55].
Recent transformer-based approaches focus on optimizing efficiency and attention mechanisms, demonstrating improved performance in COVID-19 diagnosis, blood cell classification, and brain tumor identification through lightweight transformers and multi-stage attention-driven medical image analysis [56–58]. Recent studies [59–68] extensively explore advanced deep learning, hybrid CNN–U-Net, attention-based, transformer, and optimization-driven frameworks for medical image analysis, achieving improved accuracy in disease detection, segmentation, and classification across ophthalmology, oncology, neurology, and radiology.
These works collectively emphasize model explainability, multi-attention mechanisms, feature optimization, and comparative evaluations, highlighting the growing role of unified and hybrid AI architectures in enhancing clinical decision support systems.
Thus, we developed a deep learning architecture to automatically classify breast cancer histopathology images at various magnifications. Images were split into patches to reduce memory utilization and enlarge the dataset, mitigating class imbalance and overfitting, and the model was benchmarked against pre-trained architectures. Lymphocyte count at the tumor site informs prognosis, but manual counting is difficult and error-prone, so an upgraded Faster R-CNN object detector with various feature extractors and a built-in counter automatically detected and quantified lymphocytes, with counts computed on CD3+ and CD8+ cancer immunohistochemistry images. In addition, survival analysis and machine learning on the METABRIC dataset estimated five-year breast cancer relapse and patient prognosis, allowing clinicians to tailor treatment to individual patients.
Proposed system
The proposed framework is designed to provide an integrated and interpretable approach for breast cancer diagnosis, immune characterization, and survival prediction by fusing multimodal information from histopathology, immunohistochemistry, and clinical data. The system combines deep learning–based image analysis with clinical prognostic modeling to capture both visual and contextual patterns associated with tumor heterogeneity and patient outcomes.
As illustrated in Fig. 2, the framework consists of three key interconnected modules:
Fig. 2.

Detailed view of proposed system
PatchSight Classifier, which processes histopathology images through a sequence of convolutional, normalization, dropout, and dense layers to classify extracted image patches as benign or malignant;
ImmuneMap Detector, which analyzes immunohistochemistry images to identify regions of interest (ROIs) and compute an Immunoscore reflecting the spatial distribution of immune infiltration; and
LifeSpan Prognosticator, which integrates Immunoscore data and clinical parameters through survival modeling to predict long-term outcomes.
A complete oncological decision support system is created by combining the results of these three components to forecast diagnosis, Immunoscore, and survival likelihood. The next subsections describe the architectures, algorithms, and optimization methodologies for diagnosis, Immunoscore computation, and survival prediction of the proposed system.
Phase I: diagnostic classification (PatchSight classifier)
The PatchSight–ImmuneMap–LifeSpan (PIL) method begins with histopathological breast tissue classification. This level automates malignant region detection to reduce manual pathology assessment time and subjectivity. For robust microscopic image processing, PatchSight combines an upgraded InceptionResNetV2 deep learning backbone, task-specific architectural tweaks, and transfer learning.
This research used BreakHis, a popular digital histopathology dataset [69]. Its 7,909 color images from 82 patients are split into benign (2,480) and malignant (5,429), each with four subtypes: benign: Adenosis, Fibroadenoma, Phyllodes Tumor, Tubular Adenoma; malignant: Ductal, Lobular, Mucinous, Papillary Carcinomas.
All images are collected at four magnification levels (40×, 100×, 200×, and 400×) with a 700 × 460 pixel resolution in RGB format. This dataset allows binary (benign vs. malignant) and multi-class (eight subtypes) classification tasks and is extensively used for deep learning and hybrid frameworks using CNNs, transfer learning, or multi-phase optimization. Preprocessing procedures including color normalization, image augmentation, and magnification-level fusion enable multiscale feature extraction. BreakHis is a standard for histopathological image analysis algorithms employing accuracy, sensitivity, specificity, and AUC, making it essential for breast cancer detection and computational pathology research.
A deep learning model's effectiveness depends on the number of training samples, and the BreakHis dataset offers a limited number of images per category. To reduce overfitting, we therefore enlarge the training data through patch extraction and data augmentation. Training directly on high-resolution histopathological images would be memory-intensive and time-consuming, so images from the BreakHis dataset are divided into 300 × 300 × 3 patches for each magnification factor, using the patchify library to break large images into smaller patches.
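The patch-extraction step can be sketched in pure NumPy. The paper uses the patchify library; the equivalent sliding-window split below uses a hypothetical stride of 160 px (the paper says the patches overlap but does not state the stride), which turns a 460 × 460 image into a 2 × 2 grid of 300 × 300 patches:

```python
import numpy as np

def extract_patches(image, patch=300, step=160):
    """Split an H x W x 3 image into overlapping patch x patch x 3 tiles."""
    h, w, _ = image.shape
    patches = []
    for y in range(0, h - patch + 1, step):
        for x in range(0, w - patch + 1, step):
            patches.append(image[y:y + patch, x:x + patch, :])
    return np.stack(patches)

# A 460 x 460 x 3 image with stride 160 yields 2 x 2 = 4 patches.
img = np.zeros((460, 460, 3), dtype=np.uint8)
tiles = extract_patches(img)
print(tiles.shape)  # (4, 300, 300, 3)
```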
The image is resized to a square dimension of 460 × 460 × 3 and separated into overlapping patches. Our 300 × 300 × 3 patch size provides a wider field of view, more tissue information for classification, and more locally discriminative features than smaller patches, which lack sufficient diagnostic information. Before training, image patches undergo data augmentation such as zooming, rotation, flipping, and brightness adjustment. An improved InceptionResNetV2 model is then trained on the patches to classify images as malignant or benign, and the optimized model is compared to pre-trained models such as DenseNet-121, VGG-16, and the original InceptionResNetV2. The Phase I methodology is shown in Fig. 3.
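The augmentation operations above (90° rotations, flips, ± 20% brightness) can be approximated in NumPy. This is an illustrative sketch, not the paper's actual pipeline, which presumably relies on a framework-level image generator:

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(patch):
    """Randomly rotate (multiple of 90 deg), flip, and scale brightness +/-20%."""
    out = np.rot90(patch, k=rng.integers(0, 4))    # rotation in {0, 90, 180, 270} deg
    if rng.random() < 0.5:
        out = np.flip(out, axis=0)                 # vertical flip
    if rng.random() < 0.5:
        out = np.flip(out, axis=1)                 # horizontal flip
    factor = rng.uniform(0.8, 1.2)                 # brightness +/-20%
    return np.clip(out.astype(np.float32) * factor, 0, 255).astype(np.uint8)

patch = rng.integers(0, 256, size=(300, 300, 3), dtype=np.uint8)
aug = augment(patch)
print(aug.shape)  # (300, 300, 3)
```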
Fig. 3.
Conceptual view of phase 1
Transfer learning is a machine learning method that applies a previously trained model to a new problem. Rather than starting from scratch, it improves a new model's performance by reusing information and patterns from a previous task, saving time and cost while maintaining accuracy. Breast cancer images were classified using the pre-trained models VGG-16, DenseNet-121, and InceptionResNetV2; Table 5 lists the pre-trained model hyperparameters.
To achieve stable convergence, optimal generalization, and reproducibility at all magnification levels, Phase I hyperparameters were carefully tuned. The Adam optimizer was chosen for its adaptability and robustness in high-dimensional deep networks, and a learning rate of 1 × 10⁻⁴ produced smooth gradient updates without oscillation.
A batch size of 32 balanced GPU utilization against gradient variance, and epoch counts were adjusted by magnification level to prevent overfitting at higher magnifications. Binary cross-entropy served as the loss function for two-class classification, with L2 weight decay and dropout (0.4) for regularization. Early stopping and ReduceLROnPlateau scheduling ensured adaptive learning and timely training termination. Data normalization and augmentation (rotation, flipping, brightness, and zoom) improved model invariance to tissue color and orientation. To stabilize fine-tuning, optimizer momentum terms, batch normalization, and initialization from ImageNet pre-trained weights followed CNN conventions. These hyperparameters were designed for breast histopathology analysis to provide fast, stable convergence, excellent diagnostic accuracy, and strong generalization across magnification ranges.
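The training-control logic above (ReduceLROnPlateau with factor 0.1 and patience 3, early stopping after 7 stagnant epochs on validation loss) can be mimicked in plain Python to make its behavior concrete. In practice the Keras callbacks of the same names are used; this is only an approximation of their bookkeeping:

```python
def schedule(val_losses, lr=1e-4, factor=0.1, lr_patience=3, stop_patience=7):
    """Replay a validation-loss history: shrink lr on plateau and report
    the epoch at which early stopping would trigger."""
    best, stale = float("inf"), 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, stale = loss, 0
        else:
            stale += 1
            if stale % lr_patience == 0:
                lr *= factor          # plateau reached: reduce learning rate
            if stale >= stop_patience:
                return lr, epoch      # early stop at this epoch
    return lr, len(val_losses) - 1

# Loss improves for 2 epochs, then stagnates: lr drops twice, stop at epoch 8.
lr, stop_epoch = schedule([0.90, 0.80, 0.81, 0.82, 0.83, 0.84, 0.85, 0.86, 0.87, 0.88])
```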
During transfer learning, the features learned on the ImageNet dataset are retained while the earlier layers are frozen. Trainable layers stacked on top of the frozen base transform the previously learned features into predictions, and these new layers are trained on the BreakHis dataset. After unfreezing, the model is retrained with a modest learning rate on the new dataset for final fine-tuning. Applying previously learned features to fresh data enhances performance.
The InceptionResNetV2 model was improved by adding and updating layers, enhancing predictions and tailoring the model to this dataset. The InceptionResNetV2 model used in this investigation has 164 layers and a 299 × 299 × 3 input size, so image patches of size 300 × 300 × 3 were resized to 299 × 299 × 3 before being fed to the optimized network.
The InceptionResNetV2 model combines the inception structure with residual connections. Residual connections eliminate degradation in deeper layers and preserve accurate feature data such as color, location, texture, and size. In the inception stage, convolutions, pooling layers, and feature maps are combined into a single vector; the module typically uses 5 × 5, 3 × 3, and 1 × 1 filters to extract image attributes. The ResNet architecture uses shortcut links to aggregate data across levels, so the features become richer and more accurate.
The activation function used in this study is the rectified linear unit (ReLU), which activates only on positive values. The optimized InceptionResNetV2 model enhances performance by adding a few layers to the original model: batch normalization layers are added to normalize both layer inputs and network inputs.
During training, batch normalization normalizes each layer's inputs using the mean and variance of the current mini-batch. This brings several benefits: faster network training, higher usable learning rates, simpler weight initialization, noise that acts as regularization, and better model performance.
Dropout layers provide regularization to prevent overfitting, and dense and flatten layers were added. The network's classification head begins with a dense layer (1024 neurons) to learn complicated features, then gradually reduces the number of neurons (512 and 256) to refine and focus on abstract representations. Layers are arranged to maximize learning, reduce overfitting, and ensure the model generalizes to fresh data: batch normalization and dropout layers are deliberately placed for training stability and resilience, while the dense layers reduce feature-space complexity toward a simple classification decision boundary. Images were classified as benign or malignant at each magnification factor using the sigmoid activation function. Hyperparameters for the optimized InceptionResNetV2 model are listed in Table 5.
Table 5.
Phase-1 hyperparameter settings
| Hyperparameter | 40× | 100× | 200× | 400× |
|---|---|---|---|---|
| Optimizer | Adam (β₁ = 0.9, β₂ = 0.999) | Adam (β₁ = 0.9, β₂ = 0.999) | Adam (β₁ = 0.9, β₂ = 0.999) | Adam (β₁ = 0.9, β₂ = 0.999) |
| Learning Rate | 0.0001 | 0.0001 | 0.0001 | 0.0001 |
| Batch Size | 32 | 32 | 32 | 32 |
| Number of Epochs | 30 | 30 | 20 | 20 |
| Loss Function | Binary Cross-Entropy | Binary Cross-Entropy | Binary Cross-Entropy | Binary Cross-Entropy |
| Weight Decay (L2) | 1 × 10⁻⁴ | 1 × 10⁻⁴ | 1 × 10⁻⁴ | 1 × 10⁻⁴ |
| Dropout Rate | 0.4 | 0.4 | 0.4 | 0.4 |
| Learning-Rate Scheduler | ReduceLROnPlateau (factor = 0.1, patience = 3) | ReduceLROnPlateau (factor = 0.1, patience = 3) | ReduceLROnPlateau (factor = 0.1, patience = 3) | ReduceLROnPlateau (factor = 0.1, patience = 3) |
| Early-Stopping Patience | 7 epochs (val_loss) | 7 epochs (val_loss) | 7 epochs (val_loss) | 7 epochs (val_loss) |
| Data Augmentation | Rotation (± 90°), Flip (H/V), Brightness ± 20%, Zoom 0.9–1.1 | Rotation (± 90°), Flip (H/V), Brightness ± 20%, Zoom 0.9–1.1 | Rotation (± 90°), Flip (H/V), Brightness ± 20%, Zoom 0.9–1.1 | Rotation (± 90°), Flip (H/V), Brightness ± 20%, Zoom 0.9–1.1 |
| Pre-trained Weights | ImageNet | ImageNet | ImageNet | ImageNet |
| Normalization | ImageNet mean/std | ImageNet mean/std | ImageNet mean/std | ImageNet mean/std |
The hyperparameters in Table 5 were empirically tuned for steady convergence and good diagnostic accuracy on the BreakHis dataset at several magnification settings. The Adam optimizer, with its adaptive learning-rate adjustment and momentum, was used to fine-tune the deep residual-inception layers. A fixed learning rate of 1 × 10⁻⁴ ensured smooth gradient updates without overshooting minima, while a batch size of 32 balanced GPU efficiency with generalization. Higher-magnification images (200× and 400×) converged faster due to richer local detail, whereas lower-magnification images (40× and 100×) required more epochs to capture finer global context. Binary cross-entropy was chosen for two-class classification because it minimizes the log-loss between the predicted and true malignancy probability. These settings ensured the optimized InceptionResNetV2 (PatchSight) achieved 98.76% classification accuracy at 400× magnification and excellent generalization across all resolutions, providing a solid diagnostic foundation for the framework's subsequent phases.
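The binary cross-entropy loss mentioned above is, for true label y ∈ {0, 1} and predicted malignancy probability p, L = −[y log p + (1 − y) log(1 − p)], averaged over the batch. A minimal NumPy version (illustrative; frameworks apply the same formula internally):

```python
import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean log-loss between true labels and predicted malignancy probabilities."""
    p = np.clip(y_pred, eps, 1 - eps)  # avoid log(0)
    return float(np.mean(-(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))))

y = np.array([1.0, 0.0, 1.0])   # 1 = malignant, 0 = benign
p = np.array([0.9, 0.1, 0.8])   # predicted malignancy probabilities
loss = binary_cross_entropy(y, p)
print(round(loss, 4))  # 0.1446
```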
As shown in Fig. 4, the 30-layer hybrid deep learning architecture of the Optimized InceptionResNetV2 (PatchSight Classifier) accurately diagnoses breast cancer from histopathology images. During preprocessing, RGB patches of 299 × 299 × 3 are normalized to [0–1] and augmented with random rotations, flips, zoom, and brightness changes for resilience. In the stem stage, fundamental spatial characteristics are extracted by convolution layers (32 and 64 filters, 3 × 3 kernels), followed by batch normalization, max pooling, and dropout to stabilize early learning.
Fig. 4.
Detailed view of phase 1
Multi-scale representation learning is done with stacked Inception-ResNet-A/B/C modules and Reduction-A/B blocks, whose parallel convolutional branches capture fine-grained and global tissue characteristics. Residual scaling stabilizes gradients while Squeeze-and-Excitation (SE) and spatial attention blocks dynamically recalibrate channel-wise and spatial feature responses to improve regularization. After Global Average Pooling compresses the spatial maps into a compact vector, flattening, batch normalization, dropout (0.4), and a dense layer of 2048 neurons extract high-level abstract characteristics. To prevent overfitting and ensure non-linear separability, the classification head refines the learned embeddings through dense layers (1024→512→256→128→64 units) with dropout and batch normalization. Finally, a sigmoid output neuron produces the malignancy probability for binary classification. Convolutional depth, attention-driven focus, and robust regularization give this layered design exceptional discriminative power, steady convergence, and good diagnostic accuracy across multi-magnification breast histopathology images.
Phase II – ImmuneMap detector
Phase II, ImmuneMap Detector, characterizes the tumor immune microenvironment (TIME) after Phase I, PatchSight Classifier, classifies breast tissue areas. The density and spatial distribution of tumor-infiltrating lymphocytes (TILs) in immunohistochemistry (IHC) images provide prognostic and therapeutic insights into patient-specific immune responses, therefore accurate measurement is essential. Manual TIL count estimation is arduous and subject to inter-observer variability, hence an automated, robust, and explainable computer approach is needed.
This phase detects and quantifies cells on IHC slides matching Phase I malignant areas using an upgraded object-detection pipeline based on Faster R-CNN. Color normalization, stain-separation pretreatment, region proposal generation, and bounding-box refinement enable precise cell-level localization and immune scoring. The Immunoscore (IS) assesses tumor microenvironment immunological activity and links diagnostic morphology and prognosis in Phase III (LifeSpan Prognosticator).
The images used in our study came from the lymphocyte assessment hackathon (LYSTO) [133]. Image patches (299 × 299 × 3) were collected at 40× magnification from whole-slide images of prostate, colon, and breast cancer stained with CD8 and CD3 markers. The 12,636 IHC images were separated into 9,228 training, 1,908 validation, and 1,500 testing subsets; the testing set served for independent evaluation while the training and validation sets supported model learning and hyperparameter tuning. To assess robustness across varied immune cell morphologies and imaging conditions, the testing subset was divided into three representative classes: scattered lymphocytes (n = 500), grouped lymphocytes (n = 500), and artifacts (n = 500). This structured data division helps the Faster R-CNN–based ImmuneMap Detector accurately recognize and quantify tumor-infiltrating lymphocytes (TILs) in heterogeneous IHC samples through balanced learning, reliable validation, and unbiased evaluation.
Faster R-CNN is a fast and accurate object detection framework whose major components are the backbone feature extractor, region proposal network (RPN), and detection network. Most feature extractors employ convolutional networks to generate feature maps from images. For each position in the feature map, 9 anchor boxes with varying scales and aspect ratios are used, and the RPN analyzes the maps to identify candidate regions for the object of interest.
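The 9 anchors per feature-map location arise from 3 scales × 3 aspect ratios. The sketch below uses the Phase II settings from Table 7 (scales [64, 128, 256], ratios 1:1, 1:2, 2:1) with the common equal-area convention for the ratios, which is an assumption, as the paper does not state how its anchors are parameterized:

```python
import numpy as np

def make_anchors(cx, cy, scales=(64, 128, 256), ratios=(1.0, 0.5, 2.0)):
    """Return (x1, y1, x2, y2) anchor boxes centered at (cx, cy).

    Each ratio keeps the anchor area at scale**2 while changing h/w."""
    boxes = []
    for s in scales:
        for r in ratios:
            w = s * np.sqrt(1.0 / r)
            h = s * np.sqrt(r)
            boxes.append([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2])
    return np.array(boxes)

anchors = make_anchors(150, 150)
print(anchors.shape)  # (9, 4)
```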
The RPN, a fully convolutional network, distinguishes objects from background within the anchor boxes and predicts bounding box coordinates. The proposed regions are then resized so that all proposals are roughly similar in size, and non-maximum suppression (NMS) reduces errors by eliminating anchors that repeatedly hit the same object. Final region proposals pass to a fully connected layer, where a classifier determines whether a proposal contains background or an object and identifies the object class, and a regressor fine-tunes the bounding box coordinates for the predicted object's position. Feature extraction can be done with any network. Faster R-CNN was originally trained on millions of ImageNet images, but gathering a comparably large number of medical images is difficult: a feature extractor trained from scratch on scarce data overfits, fails to generalize, and is inefficient, inaccurate, and computationally costly. In this case, transfer learning for the feature extractor is highly recommended, with the pre-trained model weights adjusted for optimal outcomes. Because they are trained on diverse image data, pre-trained models capture common properties such as edges and curves in their early layers, and the same generic features apply to medical images.
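The non-maximum suppression step above can be sketched as a generic greedy NMS with the 0.5 IoU threshold from Table 7 (a standard formulation, not the paper's exact implementation):

```python
import numpy as np

def iou(box, boxes):
    """IoU between one (x1, y1, x2, y2) box and an array of boxes."""
    x1 = np.maximum(box[0], boxes[:, 0]); y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2]); y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = lambda b: (b[..., 2] - b[..., 0]) * (b[..., 3] - b[..., 1])
    return inter / (area(box) + area(boxes) - inter)

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy NMS: keep highest-scoring boxes, drop overlapping duplicates."""
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size:
        i = int(order[0])
        keep.append(i)
        rest = order[1:]
        order = rest[iou(boxes[i], boxes[rest]) <= iou_thresh]
    return keep

# Boxes 0 and 1 overlap heavily (IoU ~ 0.68): the lower-scoring one is dropped.
boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
print(nms(boxes, scores))  # [0, 2]
```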
For feature maps, Faster R-CNN initially used ZF-net and VGG-16 extractors; VGG-16 outperforms ZF-net due to its deeper architecture and larger parameter count. We fine-tuned the pre-trained weights on the lymphocyte data, allowing the CNN to learn both generic and dataset-specific features through multi-level convolution and pooling. Since the final classification layer is omitted, the CNN serves purely as a feature extractor. Shallu and Mehra [134] found that pre-trained ResNet-50 and VGG-16 with transfer learning can outperform fully trained networks given limited histopathological image data. During the transfer-learning phase, networks trained on ImageNet identify clinical image properties accurately, yielding superior performance and generalization, and the pre-trained model requires only minimal weight modifications given its well-trained state.
It should be fine-tuned with a low learning rate for good results. We examined ResNet-101, Inception-V2, ResNet-50, and VGG-16 feature extractors to determine the most effective one for this dataset. VGG-16, with 16 layers, learns complex features at comparatively low cost. ResNet-50 is a 50-layer deep residual network that is simple to train and achieves high accuracy by using residual learning to counter accuracy degradation; ResNet-101 is a deeper, 101-layer version of ResNet-50. Inception-V2 reduces computation time by widening the network instead of deepening it, using filters of varying size; its performance gain comes from splitting the 5 × 5 convolutional filter into two 3 × 3 filters, reducing computation. Table 6 displays feature map, RPN classification, and RPN regression output sizes for the various feature extractors.
Table 6.
Phase-2 component settings
| Feature extractor | Backbone depth | Feature map size (H×W×C) | Feature stride | RPN classification output | RPN regression output | Anchors (per location) | Total parameters (M) |
|---|---|---|---|---|---|---|---|
| FRCNN – ResNet-50 | 50 layers | 10 × 10 × 2048 | 16 | 9 × 2 × 10 × 10 | 9 × 4 × 10 × 10 | 9 | ~ 41.5 |
| FRCNN – ResNet-101 | 101 layers | 10 × 10 × 2048 | 16 | 9 × 2 × 10 × 10 | 9 × 4 × 10 × 10 | 9 | ~ 60.2 |
| FRCNN – Inception V2 | 42 layers (approx.) | 8 × 8 × 1024 | 8 | 9 × 2 × 8 × 8 | 9 × 4 × 8 × 8 | 9 | ~ 23.5 |
| FRCNN – VGG-16 | 16 layers | 9 × 9 × 512 | 16 | 9 × 2 × 9 × 9 | 9 × 4 × 9 × 9 | 9 | ~ 138 |
| FRCNN – InceptionResNetV2 (Proposed) | 164 layers | 7 × 7 × 1536 | 8 | 9 × 2 × 7 × 7 | 9 × 4 × 7 × 7 | 9 | ~ 55.9 |
Our approach includes a counter in the detection network that automatically counts and displays the number of T-cells in the image upon detection, saving pathologists from manually counting the T-lymphocytes found by Faster R-CNN during Immunoscore calculation. Our goal is to find the best feature extractor for detection and counting. See Fig. 5 for the system diagram of Faster R-CNN for lymphocyte detection and quantification.
Fig. 5.
Detailed view of phase 2
Figure 5 shows the simplified architecture of the Faster R-CNN–based ImmuneMap Detector for automated lymphocyte detection in immunohistochemistry (IHC) images. IHC patches are normalized, enhanced, and prepared for analysis during input preprocessing. These patches are processed by an optimized InceptionResNetV2 backbone with convolutional, residual, and attention blocks to extract rich, multi-scale feature maps. The Region Proposal Network (RPN) predicts objectness scores and bounding box coordinates from these features, then refines region selection with proposal filtering and non-maximum suppression. Next, the detection network classifies each proposed region using fully connected layers and refines localization with bounding box regression. The output stage produces final bounding boxes with classification probabilities and evaluation metrics such as precision, recall, and mean average precision. This hierarchical, modular architecture improves immune cell identification, feature localization, and robustness across tissue morphologies.
Figure 5 also depicts the ImmuneMap Detector's lymphocyte detection workflow in IHC images, from data preparation through evaluation. Whole-slide images were processed to isolate regions of interest (299 × 299 × 3), followed by data augmentation and manual annotation to produce 12,636 labeled samples, split into training (9,228), validation (1,908), and testing (1,500) subsets. Faster R-CNN models with ResNet-50, ResNet-101, VGG-16, and Inception-V2 backbones were trained and tested to determine the optimal backbone for lymphocyte localization and classification. Performance was assessed on balanced subsets of 500 images each for scattered lymphocytes, clustered lymphocytes, and artifacts, with Precision, Recall, F1-score, Miss Rate, and FPPI measuring robustness and accuracy. This method ensures systematic data processing, efficient model training, and rigorous performance evaluation for immune cell detection across IHC tissue morphologies.
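The counter added to the detection network amounts to counting final post-NMS detections whose lymphocyte confidence clears a threshold. The sketch below is a minimal illustration; the 0.5 score threshold and the (label, score) detection format are assumptions, not values stated in the paper:

```python
def count_lymphocytes(detections, score_thresh=0.5):
    """Count post-NMS detections whose 'lymphocyte' confidence clears the threshold.

    detections: list of (label, score) tuples emitted by the detector head."""
    return sum(1 for label, score in detections
               if label == "lymphocyte" and score >= score_thresh)

dets = [("lymphocyte", 0.97), ("lymphocyte", 0.81),
        ("artifact", 0.90), ("lymphocyte", 0.42)]
print(count_lymphocytes(dets))  # 2
```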
The hyperparameters in Table 7 were tuned to improve convergence, detection precision, and generalization across IHC tissue samples. Most backbones used the Adam optimizer, whose adaptive learning enables consistent gradient updates for deep architectures like ResNet and Inception; VGG-16, with its shallower layers, used SGD with momentum (0.9). Low learning rates (0.0001–0.0002) smoothed convergence and reduced overfitting on the limited annotated data. A batch size of 8 balanced gradient stability with computing efficiency, while 25–30 epochs ensured sufficient learning without overfitting. An IoU threshold of 0.5 balanced recall and precision during proposal filtering, and anchor scales [64, 128, 256] with ratios [1:1, 1:2, 2:1] captured lymphocytes of increasing size. Dropout (0.5) and L2 regularization (1e-4) improved generalization, and pretrained ImageNet weights aided convergence and feature transfer. Together, these parameters delivered computationally efficient, accurate lymphocyte recognition.
Table 7.
Phase-2 hyperparameter settings
| Parameter | FRCNN-ResNet-50 | FRCNN-ResNet-101 | FRCNN-VGG-16 | FRCNN-Inception-V2 |
|---|---|---|---|---|
| Optimizer | Adam | Adam | SGD (momentum = 0.9) | Adam |
| Learning Rate | 0.0001 | 0.0001 | 0.001 | 0.0002 |
| Batch Size | 8 | 8 | 8 | 8 |
| Number of Epochs | 25 | 25 | 30 | 20 |
| Weight Decay (L2 Regularization) | 1e-4 | 1e-4 | 5e-4 | 1e-4 |
| Momentum | — | — | 0.9 | — |
| Anchor Scales | [64, 128, 256] | [64, 128, 256] | [64, 128, 256] | [32, 64, 128] |
| Anchor Ratios | [1:1, 1:2, 2:1] | [1:1, 1:2, 2:1] | [1:1, 1:2, 2:1] | [1:1, 1:2, 2:1] |
| IoU Threshold (NMS) | 0.5 | 0.5 | 0.5 | 0.5 |
| RPN Positive/Negative Ratio | 1:3 | 1:3 | 1:3 | 1:3 |
| ROI Pooling Size | 7 × 7 | 7 × 7 | 7 × 7 | 7 × 7 |
| Dropout Rate | 0.5 | 0.5 | 0.5 | 0.4 |
| Loss Function | Classification + Smooth L1 (Regression) | Same | Same | Same |
| Input Image Size | 299 × 299 × 3 | 299 × 299 × 3 | 299 × 299 × 3 | 299 × 299 × 3 |
| Initialization | ImageNet Pretrained Weights | ImageNet Pretrained Weights | ImageNet Pretrained Weights | ImageNet Pretrained Weights |
Phase III – LifeSpan prognosticator: patient-specific survival prediction
After diagnostic classification and immunological quantification, LifeSpan Prognosticator examines patient survival and prognostic models in the third and final phase. In this phase, histopathological image descriptors, immunoscores, and clinical factors predict breast cancer survival. Accurate survival prediction is essential for risk assessment, treatment planning, and customized therapy, but traditional models like the Cox proportional hazards model typically fail to capture complex nonlinear imaging and clinical linkages.
The proposed hybrid deep survival learning architecture uses DeepHit neural networks, Cox proportional hazards regression, and Random Survival Forests to address these limitations. This ensemble technique models nonlinear diagnostic, immunologic, and clinical interactions via deep representation learning and statistical interpretability. The phase turns tumor classification scores and Immunoscores into actionable prognostic insights, laying the groundwork for precision oncology.
Phase III LifeSpan Prognosticator framework for patient survival prediction using METABRIC dataset and multimodal clinical characteristics is shown in Fig. 6. Clinical, imaging-derived, and genomic data are fused using a data fusion layer to form a multimodal dataset. Features are selected and dimensionality reduced (LASSO, PCA) to maintain essential predictors after comprehensive preprocessing handles missing values, encodes categorical data, and standardizes features. For generalization, the dataset is separated into training, validation, and testing subsets with five-fold cross-validation. Hyperparameter optimization trains and fine-tunes Cox Proportional Hazards, Random Survival Forest, and DeepHit neural networks. An ensemble fusion layer captures linear and nonlinear survival patterns from their outputs.
Fig. 6.
Detailed view of phase 3
This study utilized data from the METABRIC dataset, which includes breast cancer patient profiles [70]. The source of this dataset is available at [71]. It contains statistics on copy number variants, survival rates, gene expression, and clinicopathological information. We only used clinicopathological data and patient survival data for this study. The study included 933 patients out of 1398, with 465 records eliminated due to missing data. Twenty attributes were used for survival analysis prediction.
The three essential pre-processing procedures are data cleaning, feature standardization, and label encoding. Data cleaning removes outliers and records with missing data. The continuous variables of the training data are then normalized to ensure a uniform scale, and test data is normalized using statistics computed on the training data. Because categorical variables cannot be used directly in regression models, label encoding turns them into machine-readable numbers.
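The two rules above, fit the scaler on training data only and map categories to integers, can be sketched in NumPy. This is illustrative; the actual pipeline presumably uses scikit-learn's StandardScaler and LabelEncoder:

```python
import numpy as np

def standardize(train, test):
    """Z-score both splits using mean/std computed on the training split only."""
    mu, sigma = train.mean(axis=0), train.std(axis=0)
    return (train - mu) / sigma, (test - mu) / sigma

def label_encode(values):
    """Map each distinct category to a stable integer code."""
    codes = {v: i for i, v in enumerate(sorted(set(values)))}
    return [codes[v] for v in values], codes

x_train = np.array([[1.0, 10.0], [3.0, 30.0]])
x_test = np.array([[2.0, 20.0]])
tr, te = standardize(x_train, x_test)     # test uses training mean/std

enc, mapping = label_encode(["ductal", "lobular", "ductal"])
print(enc)  # [0, 1, 0]
```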
To test model performance on unseen data, the METABRIC dataset is randomly split into training (60%), validation (20%), and test (20%) subsets: the test set assesses model efficacy on unseen data, while validation data guides hyperparameter tuning. The algorithms in this research were implemented in Google Colab using Python 3; the Random Survival Forest algorithm used the randomForestSRC library, while Cox PH used lifelines. Harrell's Concordance Index evaluates survival algorithms such as Cox proportional hazards (PH), DeepHit, and random survival forests.
A pair of patients is concordant when the patient with the worse outcome also has the higher risk score, and non-concordant when the patient with the poorer outcome does not have the higher risk score. A risk tie occurs when two patients share the same risk score but have different outcome times; conversely, a time tie occurs when outcome times match but risk scores differ.
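Harrell's C-index follows directly from these pair definitions. A minimal pure-Python sketch (illustrative only; libraries such as lifelines provide an optimized `concordance_index`):

```python
def harrell_c_index(times, events, risks):
    """Harrell's concordance index over all comparable patient pairs.

    A pair (i, j) is comparable when the patient with the shorter survival
    time had an observed event. The pair is concordant if that patient also
    has the higher risk score; tied risk scores count as half-concordant.
    """
    concordant, ties, comparable = 0, 0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if times[i] < times[j] and events[i]:   # i fails first, uncensored
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1
                elif risks[i] == risks[j]:
                    ties += 1
    return (concordant + 0.5 * ties) / comparable

# Example: risk order matches outcome order for every comparable pair
c = harrell_c_index(times=[2, 5, 7, 9], events=[1, 1, 0, 1],
                    risks=[0.9, 0.7, 0.4, 0.2])
```

Here the censored patient (event = 0) never anchors a comparable pair, which is exactly how censoring enters the metric.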
The DeepHit, RSF, and Cox PH hyperparameters in Table 8 were tuned for prediction accuracy, model interpretability, and computational efficiency. A small learning rate (0.0001–0.001), the Adam optimizer, and 3–5 hidden layers gave DeepHit stable convergence and effective nonlinear survival modeling.
Table 8.
Phase-3 hyperparameter settings
| Model | Hyperparameter | Value/Range used |
|---|---|---|
| DeepHit model | Learning rate | 0.0001–0.001 (Adam optimizer) |
| | Number of hidden layers | 3–5 layers |
| | Neurons per layer | [256, 128, 64] |
| | Activation function | ReLU/LeakyReLU |
| | Dropout rate | 0.3–0.5 |
| | Batch size | 64 |
| | Number of epochs | 300 |
| | Loss function | Weighted log-likelihood + ranking loss |
| | Optimizer | Adam |
| | Time discretization bins | 50–100 bins |
| | Regularization (L2) | 1e-4 |
| Random survival forest (RSF) | Number of trees (n_estimators) | 500–1000 |
| | Minimum samples per leaf | 5–10 |
| | Maximum features | √(total features) |
| | Node splitting rule | Log-rank test |
| | Bootstrap sampling | True |
| | Random state | 42 |
| | Max depth | None (until pure leaf) |
| Cox proportional hazards (Cox PH) | Regularization (L2 penalty) | 1e-3–1e-5 |
| | Optimization solver | Newton-Raphson |
| | Step size | 0.01 |
| | Convergence tolerance | 1e-6 |
| | Ties handling | Efron |
| | Feature scaling | StandardScaler |
| | Cross-validation folds | 5 |
Dropout (0.3–0.5) and L2 regularization (1e-4) minimized overfitting. For robust ensemble learning over censored data, the RSF model used 500–1000 trees, log-rank splitting, and bootstrap sampling; shallow leaf nodes and feature subsampling improved generalization. The Cox PH model maintained statistical rigor and interpretability through L2 regularization (1e-3 to 1e-5) and the Newton-Raphson solver with Efron's method for tied event times. Together, these parameter selections balanced predictive accuracy, interpretability, and computational cost.
Results and discussion
Experiments used an NVIDIA RTX 3090 GPU for deep learning model training, a multi-core CPU, 32–64 GB RAM, and SSD storage for fast data access. All models were implemented in Python with TensorFlow/Keras, PyTorch, NumPy, Pandas, OpenCV, scikit-learn, XGBoost, scikit-survival, Lifelines, and SHAP for explainability. Jupyter Notebook/Google Colab served for development and visualization, with Matplotlib/Seaborn for charting. Fixed random seeds, preserved checkpoints, and well-defined environment configurations ensured version control and reproducibility.
The three-phase study predicts breast cancer relapse and survival. In Phase I, patch-level histopathological images from the BreakHis and NCTB databases [72] are classified. Images are separated into 80% training, 10% validation, and 10% testing at four magnifications (40×, 100×, 200×, 400×). Using standardized preprocessing, data augmentation, and ImageNet-based fine-tuning, VGG-16, DenseNet-121, InceptionResNetV2, and the optimized PatchSight model are trained and evaluated on accuracy, precision, recall, F1-score, AUC, and confusion matrices. In Phase II, ROI detection and lymphocyte localization use Faster R-CNN with ResNet-50, VGG-16, Inception-V2, and ResNet-101 feature extractors. To study scattered, clustered, peripheral, and dense lymphocytes and artifacts, ROI count statistics, mAP, IoU, precision, recall, miss rate, false positives per image, detection time, and memory use are reported. Phase III combines patch- and ROI-level outputs with clinical parameters (age, tumor stage, NPI, ER/PR/HER2 status, lymph nodes, mutation count, treatments) to train survival models: Cox Proportional Hazards, Random Survival Forests, and DeepHit. C-index (train/validation/test), Brier score, calibration plots, log-rank tests, VIMP, hazard ratios, SHAP summaries, and Kaplan–Meier survival curves evaluate these models. For model and dataset comparisons, all phases use fixed seeds, consistent preprocessing, GPU-accelerated training, and reproducible software environments.
This investigation uses histopathological imaging and clinical survival data across three phases. Phase I uses the BreakHis breast cancer dataset, converting images at four magnification levels (40×, 100×, 200×, 400×) into 11,300, 11,544, 11,167, and 10,220 patches, respectively. These patches are split into 80% training, 10% validation, and 10% testing; test patches from NCTB, IIT Madras [72] verify generalization. Phase II trains Faster R-CNN-based detectors to identify and count immune-related structures using the same BreakHis patches with manually annotated regions of interest: scattered, clustered, peripheral, dense, and artifacts. Clinical and pathological data (age, tumor size, tumor stage, NPI, ER/PR/HER2 status, lymph nodes positive, mutation count, treatment details, relapse information, and survival times) are combined with patch- and ROI-derived features to train the Phase III survival prediction models: CoxPH, Random Survival Forests, and DeepHit. Together, the linked datasets connect microscopic tissue patterns to patient-level relapse and survival across thousands of imaging patches and hundreds of clinical records.
Data preprocessing cleaned, organized, and prepared the imaging and clinical datasets for model training. Histopathology images were normalized with the ImageNet mean and standard deviation, then augmented with rotations, flips, brightness/contrast modifications, and random cropping to prevent overfitting and increase robustness. Patch extraction was performed at four magnifications (40×, 100×, 200×, and 400×) to maintain constant sizes and prevent overlap bias. For ROI recognition, bounding boxes were aligned with the image augmentations and invalid or noisy annotations were eliminated. Clinical data were labeled, one-hot encoded for categorical features, and normalized/standardized for continuous variables. To prevent data leakage, missing values were imputed with the median or mode and all patient-level attributes were checked for consistency. Finally, all datasets were separated into 80% training, 10% validation, and 10% test splits, keeping patches from the same patient together for reliable and reproducible evaluation.
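A patient-level split like the one described above can be sketched as follows; the patient-ID format, grouping scheme, and seed are illustrative assumptions, not the paper's exact procedure:

```python
import random

def patient_level_split(patch_ids, ratios=(0.8, 0.1, 0.1), seed=42):
    """Split patch records into train/val/test so that all patches from one
    patient land in the same subset, preventing patient-level leakage.

    `patch_ids` is a list of (patch_name, patient_id) tuples; the default
    ratios follow the paper's 80/10/10 split.
    """
    patients = sorted({pid for _, pid in patch_ids})
    rng = random.Random(seed)            # fixed seed for reproducibility
    rng.shuffle(patients)
    n = len(patients)
    n_train = int(ratios[0] * n)
    n_val = int(ratios[1] * n)
    groups = {pid: "train" for pid in patients[:n_train]}
    groups.update({pid: "val" for pid in patients[n_train:n_train + n_val]})
    groups.update({pid: "test" for pid in patients[n_train + n_val:]})
    split = {"train": [], "val": [], "test": []}
    for patch, pid in patch_ids:
        split[groups[pid]].append(patch)
    return split

# Ten hypothetical patients with three patches each
patches = [(f"p{i}_{k}", f"patient{i}") for i in range(10) for k in range(3)]
split = patient_level_split(patches)
```

Splitting by patient rather than by patch is what prevents near-duplicate tissue regions from appearing in both training and test sets.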
The PatchSight Classifier, with optimized InceptionResNetV2 architecture, is evaluated for benign-malignant classification on the BreakHis dataset at four magnification levels (40×, 100×, 200×, 400×). The results show that patch-based augmentation, batch normalization, dropout regularization, and multi-stage fine-tuning increase model discrimination.
From Table 9, the diagnostic performance of the PatchSight Classifier improves progressively from 40× to 400× magnification. The model achieves 96.82% accuracy at 40×, rising to 97.54% at 100× (+0.72), 98.12% at 200× (+1.30), and 98.76% at 400×, a total improvement of 1.94 percentage points. Precision increases from 0.96 to 0.99 and recall from 0.97 to 0.98, indicating that the model recognizes malignant instances better at higher resolutions.
Table 9.
Phase-1 results
| Magnification | Accuracy (%) | Precision | Recall | F1-score |
|---|---|---|---|---|
| 40× | 96.82 | 0.96 | 0.97 | 0.96 |
| 100× | 97.54 | 0.97 | 0.98 | 0.97 |
| 200× | 98.12 | 0.98 | 0.98 | 0.98 |
| 400× | 98.76 | 0.99 | 0.98 | 0.98 |
Across magnifications, the F1-score, which balances precision and recall, increases from 0.96 at 40× to 0.98 at 400×. Higher magnification reveals more cellular detail, allowing the model to extract additional discriminative features for accurate classification. While lower-magnification images contain structural information, they lack fine-grained patterns, resulting in slightly worse performance at 40× and 100×. At 400× magnification, the model achieves its best precision, recall, and F1-score owing to sharper morphological features.
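All four reported metrics derive from confusion-matrix counts. A minimal sketch of that arithmetic, using hypothetical TP/TN counts chosen only for illustration (the paper reports FP = FN = 8 at 400×, but not the full matrix at this point):

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 from confusion-matrix counts,
    the metrics reported for each magnification level."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# Hypothetical counts: TP and TN are assumptions, FP = FN = 8 as at 400x
acc, prec, rec, f1 = classification_metrics(tp=640, fp=8, fn=8, tn=634)
```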
From Table 10, the PatchSight Classifier outperforms VGG-16, DenseNet-121, and the baseline InceptionResNetV2. PatchSight's 98.76% accuracy is 4.24 points higher than VGG-16, 2.33 points higher than DenseNet-121, and 1.64 points higher than InceptionResNetV2. Its precision exceeds VGG-16 by 0.05, DenseNet-121 by 0.03, and InceptionResNetV2 by 0.02. Recall rises from 0.94 (VGG-16) and 0.96 (DenseNet-121) to 0.98, reducing missed malignant cases, and the F1-score improves from the baselines' 0.94 and 0.96 to 0.98. PatchSight's enhanced feature extraction, additional normalization layers, and robust fine-tuning outperform the pre-trained CNN baselines in this diagnostic task.
Table 10.
Phase-1 results compared with baseline models
| Model | Accuracy (%) | Precision | Recall | F1-Score |
|---|---|---|---|---|
| VGG-16 | 94.52 | 0.94 | 0.94 | 0.94 |
| DenseNet-121 | 96.43 | 0.96 | 0.96 | 0.96 |
| InceptionResNetV2 (Baseline) | 97.12 | 0.97 | 0.97 | 0.97 |
| PatchSight (Proposed) | 98.76 | 0.99 | 0.98 | 0.98 |
The confusion matrices in Fig. 7 show that model performance improves continuously from 40× to 400× magnification. The classifier reaches 96.8% accuracy at 40×, gains over 1% at 100×, and nearly 98.1% at 200×. Accuracy peaks at 400×, reaching 98.76%, roughly 2 points above the 40× images. False positives decreased from 18 (40×) to 8 (400×), a 55% reduction, while false negatives fell from 15 (40×) to 8 (400×), a 47% reduction, showing that the model captures detailed cellular architecture from higher-magnification images.
Fig. 7.
Confusion matrices for all magnifications
The accuracy plot in Fig. 8 shows training accuracy improving by 62 points, from 30% to 92%, by the 100th epoch, while validation accuracy improves 30 points, from 30% to 60%. This leaves a roughly 32-point gap between training and validation accuracy at later epochs, indicating that the model learns the training data well but struggles to generalize to unseen samples. The loss curves show the same generalization gap: training loss decreases 90%, from 2.2 to 0.2, while validation loss falls only 45%, from 2.0 to 1.1. The 0.9-unit gap between the curves indicates overfitting: the model learns complex patterns from the training data without corresponding gains on validation. Overall, the model trains well and has high internal accuracy, but the large training-validation accuracy gap (32 points) and the slower validation loss reduction (45% vs. 90%) suggest that stronger regularization, data augmentation, and architectural adjustments would improve generalization.
Fig. 8.
Proposed model accuracy and loss
From Fig. 9, PatchSight (Proposed) trains in 760 s, 34% faster than InceptionResNetV2's 1150 s, 22% faster than DenseNet-121 (980 s), and 7% faster than VGG-16 (820 s). The new network design reduces processing blocks, improves batch normalization, and avoids retraining deeper layers through lighter fine-tuning, saving training time. With these convergence and GPU memory improvements, PatchSight trains faster, more accurately, and more reliably.
Fig. 9.
Training time of proposed model
Figure 10 shows ROI detection times for ResNet-101, Inception V2, VGG-16, and ResNet-50 across artifacts, groups, and scattered lymphocytes. The slowest model, ResNet-101, takes 1.2 s to detect dense lymphocyte clusters and averages 1.05 s across all categories. Inception V2 is about 15% faster than ResNet-101, with roughly 1.0-second detection times. VGG-16 detects artifacts in 0.6 s, 22% faster than ResNet-101. ResNet-50, comparable to Inception V2 and VGG-16, detects in 0.9 to 1.1 s across all ROIs, about 15% faster than ResNet-101. VGG-16 and ResNet-50 are the fastest, detecting up to 22% faster than ResNet-101, while ResNet-50 and Inception V2 best balance detection time and accuracy.
Fig. 10.
Time taken for lymphocyte detection per image
Figure 11 compares the number of test images with correct lymphocyte counts across four models (ResNet-50, VGG-16, Inception V2, and ResNet-101) in five ROIs: Scattered, Group, Artifacts, Peripheral, and Dense. For scattered lymphocytes, ResNet-50 counts 485 images correctly, while VGG-16 detects 496 (a 2.26% improvement). ResNet-101 finds 495 images, 2.07% more than ResNet-50, and Inception V2 detects 494, 0.2% fewer. In the Group category, ResNet-50 detects 472 accurate lymphocyte images, 5.5% fewer than ResNet-101, which detects 489, the most; VGG-16 and Inception V2 detect 463 and 465 images, respectively, with VGG-16 1.5% less effective than ResNet-50. In the Artifacts category, ResNet-50 and ResNet-101 both detect 488 images, whereas Inception V2 detects 487, 0.2% behind. For the newly added Peripheral and Dense lymphocyte ROIs, ResNet-101 detects 492 and 490 images, improvements of 0.4% to 1.1% over the other models. Overall, ResNet-101 leads the other models by 0.5% to 5.5% across all ROIs, while VGG-16 trails by 0.2% to 1.5%.
Fig. 11.
No. of test images showing the correct count
Figure 12 compares ResNet-50, VGG-16, Inception V2, and ResNet-101 across five ROIs (Scattered lymphocytes, Group of lymphocytes, Artifacts, Peripheral lymphocytes, and Dense lymphocytes) and shows that ResNet-101 outperforms the others by 10–25%. For scattered lymphocytes, ResNet-101 (0.09) has 25% lower error than ResNet-50 (0.12) and 10% lower than Inception V2 (0.11). For grouped lymphocytes, ResNet-101 (0.14) outperforms VGG-16 (0.20) by 22% and ResNet-50 by 18%. In the Artifacts ROI, ResNet-101 (0.10) had 9% lower error than Inception V2 (0.11) and 17% lower than VGG-16 (0.12). For peripheral lymphocytes, ResNet-101 (0.12) is 20% better than ResNet-50 (0.15) and 14% better than VGG-16 (0.14). Finally, for dense lymphocytes, ResNet-101 (0.11) has 21% lower error than VGG-16 (0.14) and 26% lower than ResNet-50 (0.14). VGG-16 and ResNet-50 show higher mean errors than ResNet-101 across all ROIs, indicating lower lymphocyte counting accuracy.
Fig. 12.
Mean error across models
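One common way to obtain a per-ROI mean counting error like the one plotted in Fig. 12 is a normalized absolute error averaged over images; the normalization by the true count is an assumption here, since the paper does not spell out the exact formula.

```python
def mean_count_error(true_counts, predicted_counts):
    """Mean absolute lymphocyte-counting error, normalized by the true count
    and averaged over images with at least one lymphocyte. The normalization
    choice is an assumption, not taken from the paper."""
    errors = [abs(t - p) / t
              for t, p in zip(true_counts, predicted_counts) if t > 0]
    return sum(errors) / len(errors)

# Illustrative counts for one ROI category across four images
err = mean_count_error(true_counts=[10, 20, 25, 40],
                       predicted_counts=[9, 22, 25, 36])
```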
From Table 11, ResNet-101 outperforms all other feature extractors across the five ROIs (Scattered lymphocytes, Group of lymphocytes, Artifacts, Peripheral lymphocytes, and Dense lymphocytes), with the highest F1-scores (84.9% to 95.5%) and the lowest miss rates and false positives, making it 8–12% more accurate than ResNet-50 and 5–10% better than VGG-16 across most categories. For scattered lymphocytes, ResNet-101 reaches an F1 of 95.5%, 3.3% higher than ResNet-50 and 2.6% higher than VGG-16, while Inception-V2, with its aggressive detection strategy, has the highest recall (98.1%) but a 120% higher false-positive rate. For grouped lymphocytes, ResNet-101 again leads with an F1 of 88%, 2–5% higher than the other models, owing to its better precision-recall balance. ResNet-101 outperforms ResNet-50 by 3.3% and VGG-16 by 4.5% in artifact detection, whereas Inception-V2 performs moderately with higher miss rates. On peripheral lymphocytes, ResNet-101 achieves 93.2% F1, 3.4–5.8% better than ResNet-50 and VGG-16 and 4.2% better than Inception-V2 despite the latter's high recall. Finally, on dense lymphocytes ResNet-101 reaches 93.4% F1, outperforming ResNet-50 by 4.5%, VGG-16 by 6.8%, and Inception-V2 by 4.2%. With its deeper architecture and stronger feature extraction, ResNet-101 discriminates fine cellular structures better than Inception-V2, which trades precision for recall, and than VGG-16/ResNet-50, which yield moderate, balanced results.
Table 11.
Phase-2 results
| ROI | Feature extractor | Precision (%) | Recall (%) | F1-score (%) | Miss rate (MR) | False positives per image (FP) |
|---|---|---|---|---|---|---|
| Scattered lymphocytes | ResNet-50 | 89.2 | 95.4 | 92.2 | 0.045 | 0.110 |
| | VGG-16 | 91.8 | 94.1 | 92.9 | 0.059 | 0.081 |
| | Inception-V2 | 84.7 | 98.1 | 90.8 | 0.018 | 0.158 |
| | ResNet-101 | 93.5 | 97.6 | 95.5 | 0.024 | 0.072 |
| Group of lymphocytes | ResNet-50 | 94.0 | 78.9 | 85.8 | 0.2110 | 0.042 |
| | VGG-16 | 91.0 | 75.0 | 82.2 | 0.2500 | 0.065 |
| | Inception-V2 | 90.2 | 82.3 | 86.1 | 0.1770 | 0.088 |
| | ResNet-101 | 97.2 | 80.4 | 88.0 | 0.1960 | 0.034 |
| Artifacts | ResNet-50 | 78.1 | 84.1 | 81.6 | 0.1150 | 0.2409 |
| | VGG-16 | 75.6 | 86.1 | 80.4 | 0.1320 | 0.2805 |
| | Inception-V2 | 76.1 | 82.0 | 79.2 | 0.1870 | 0.2543 |
| | ResNet-101 | 82.8 | 86.2 | 84.9 | 0.1456 | 0.1905 |
| Peripheral lymphocytes | ResNet-50 | 88.4 | 91.2 | 89.8 | 0.0871 | 0.1325 |
| | VGG-16 | 85.6 | 89.4 | 87.4 | 0.1060 | 0.1587 |
| | Inception-V2 | 83.9 | 94.8 | 89.0 | 0.0521 | 0.1713 |
| | ResNet-101 | 90.5 | 96.2 | 93.2 | 0.0381 | 0.1124 |
| Dense lymphocytes (New) | ResNet-50 | 90.2 | 87.6 | 88.9 | 0.1240 | 0.0981 |
| | VGG-16 | 87.3 | 85.9 | 86.6 | 0.1410 | 0.1214 |
| | Inception-V2 | 86.1 | 92.7 | 89.2 | 0.0730 | 0.1449 |
| | ResNet-101 | 92.8 | 94.1 | 93.4 | 0.0582 | 0.0835 |
Table 12 compares Faster R-CNN performance with four backbone networks (ResNet-50, VGG-16, Inception-V2, and ResNet-101); ResNet-101 is consistently superior across almost all criteria. It has the highest mean IoU (82.6%) and the highest classifier accuracy for RPN proposals (98.68%), 1.6% higher than ResNet-50, 3.8% higher than VGG-16, and 4.7% higher than Inception-V2. It also records the lowest losses, 40–58% lower than VGG-16 and 28–50% lower than Inception-V2, and the highest proposal recall (95.2%), beating Inception-V2 by 1.8%, ResNet-50 by 3.7%, and VGG-16 by 6.1%. ResNet-101 is also the fastest backbone (0.124 s per image), 7% faster than ResNet-50, 12% faster than Inception-V2, and 26% faster than VGG-16. VGG-16 consistently has the lowest accuracy, highest losses, lowest IoU (75.2%), and slowest detection time (0.168 s). Inception-V2 has good recall (93.4%) but larger regression losses due to unstable bounding-box refinement. Owing to its deeper design and richer feature maps, ResNet-101 performs best in both proposal classification and localization.
Table 12.
Phase-2 results over backbone networks
| Criteria | Faster R-CNN with ResNet-50 | Faster R-CNN with VGG-16 | Faster R-CNN with Inception-V2 | Faster R-CNN with ResNet-101 |
|---|---|---|---|---|
| Mean no. of RPN proposals overlapping ground truth | 6.92 | 6.55 | 6.41 | 7.15 |
| Classifier accuracy for RPN proposals (%) | 97.12 | 94.88 | 93.95 | 98.68 |
| RPN classifier loss | 0.15 | 0.17 | 0.14 | 0.12 |
| RPN regression loss | 0.02 | 0.04 | 0.05 | 0.01 |
| Detector classifier loss | 0.08 | 0.12 | 0.10 | 0.05 |
| Detector regression loss | 0.03 | 0.05 | 0.07 | 0.02 |
| Mean IoU (Intersection-over-Union) | 78.4% | 75.2% | 77.1% | 82.6% |
| Proposal Recall (%) | 91.5% | 89.1% | 93.4% | 95.2% |
| Average detection time per image (seconds) | 0.132 | 0.168 | 0.141 | 0.124 |
| Memory usage during training (GB) | 7.9 | 6.4 | 7.2 | 8.3 |
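Mean IoU in Table 12 aggregates the standard Intersection-over-Union between predicted and ground-truth boxes, which can be computed as:

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two axis-aligned boxes (x1, y1, x2, y2),
    the localization metric reported as mean IoU in Table 12."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)   # zero if boxes are disjoint
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Two boxes overlapping in half of each: IoU = 1/3
score = iou((0, 0, 100, 100), (50, 0, 150, 100))
```

Averaging this score over all matched detections in the test set yields the mean IoU figures compared across backbones.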
In Fig. 13, patients receiving radiation therapy (RT = 1) have a 5–12% greater survival probability than non-irradiated patients (RT = 0). At later follow-up, RT-treated patients survive at 30–35% versus 20–25% for untreated patients, widening the gap. Hormone therapy (HT = 1) improved survival by 7–15%; at the end of the monitoring period, hormone-treated survival is 32–38% versus 18–22% untreated. Both treatments improve survival, but hormone therapy confers the larger long-term benefit, suggesting it is more protective than radiation.
Fig. 13.
Survival curves for individuals treated with hormone and radiation treatment
Survival curves in Fig. 14 show that ER-positive (ER = 1) patients have a 10–20% higher survival probability than ER-negative (ER = 0) patients during follow-up. In succeeding years, ER + patients survive 35–40% and ER- patients 18–22%. Similarly, PR-positive (PR = 1) patients had a 12–22% higher survival rate than PR-negative (PR = 0) patients. Over time, PR + patients have a 33–38% survival rate, while PR − patients have 15–20%. The graph illustrates that hormone-receptor-positive tumors benefit from focused endocrine therapy, which slows disease development and improves prognosis.
Fig. 14.
Survival curves for females with ER and PR status
The survival curves in Fig. 15 show that cellularity and tumor stage strongly affect breast cancer prognosis. Patients with high cellularity (Cellularity = 0) have the lowest survival during follow-up, dropping 20–30% faster than those with moderate (Cellularity = 2) or low (Cellularity = 1) cellularity; low-cellularity patients survive 10–18% longer than the others. Stage 4 carcinoma patients had 30–40% poorer survival than Stages 1 and 2. Stage 1 patients survive best, followed by Stages 2 and 3 with progressively lower curves. As the figure shows, higher tumor stage and cellularity greatly reduce breast cancer survival.
Fig. 15.
Breast cancer patients’ survival curves according to tumor stage and cellularity
The variable importance plot in Fig. 16 shows that Relapse-Free Status is the most significant predictor of patient survival, with a dominant VIMP value of 0.23, about 12× higher than the next covariate. The Nottingham Prognostic Index ranks next with a moderate importance of 0.02, followed by ER Status, PR Status, lymph nodes positive, age, and Pam50 + Claudin-low subtype at 0.01 each. Tumor stage, tumor size, radiation therapy, mutation count, menopausal status, hormone therapy, histologic grade, HER2 status, chemotherapy, cellularity, and cancer type show negligible importance, indicating that the model did not rely on them to predict survival. Relapse-free status is the strongest and most immediate indicator of survival risk because it directly reflects disease progression, unlike many traditional clinical variables that provide weaker or redundant signals alongside more precise molecular and prognostic features. In general, the model prioritizes variables that directly and measurably affect long-term survival. In the Cox proportional hazards model, Relapse-Free Status is likewise the primary predictor of survival, with markedly higher mortality risk for patients who relapse than for those who remain relapse-free (HR = 9.22 × 10⁸). Most other variables have modest effects: age at diagnosis increases risk by 59% (HR = 1.59), menopausal status by 42% (HR = 1.42), and tumor stage by 46% (HR = 1.46). Hormone-receptor positivity is protective: ER-positive status corresponds to a 51% lower risk (HR = 0.49) and PR-positive status to a 39% lower risk (HR = 0.61). Other clinical factors, including HER2 status (HR = 1.12), histologic grade (HR = 1.14), lymph nodes positive (HR = 1.02), tumor size (HR = 0.99), and chemotherapy (HR = 1.00), show variations below 15%. Relapse-free status, hormone receptor status, and tumor stage therefore dominate survival, while the remaining clinical factors contribute only small differences, making them weaker predictors in the Cox PH framework.
Fig. 16.
Random survival forests model with variable importance scores
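The hazard-ratio interpretations quoted above follow from the Cox model's exponentiated coefficients; a minimal sketch of that arithmetic:

```python
import math

def hazard_ratio(beta):
    """In a Cox PH model, the hazard ratio for a one-unit covariate increase
    is exp(beta); HR > 1 raises risk, HR < 1 is protective."""
    return math.exp(beta)

def percent_risk_change(hr):
    """Convert a hazard ratio into the percentage risk change quoted in the
    text (e.g., HR = 1.59 -> +59%, HR = 0.49 -> -51%)."""
    return (hr - 1.0) * 100.0

# A zero coefficient means no effect (HR = 1, 0% change)
baseline_hr = hazard_ratio(0.0)
age_change = percent_risk_change(1.59)   # +59% risk, as quoted for age
er_change = percent_risk_change(0.49)    # -51% risk for ER positivity
```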
The C-index comparison in Fig. 17 indicates that DeepHit surpasses the other two models in all phases: 0.90 in training, 0.88 in validation, and 0.87 in testing. DeepHit outperforms Cox PH by 2–3% across all datasets and the Random Survival Forests (RSF) model by 1–2%. RSF in turn outperforms Cox PH by 1–2%, scoring 0.89, 0.87, and 0.86 on the train, validation, and test sets versus Cox PH's 0.88, 0.86, and 0.85. All three models achieve C-index values above 0.85, but DeepHit models complex non-linear relationships and time-dependent risks better than the traditional Cox model and ensemble-based RSF, yielding the best survival prediction accuracy.
Fig. 17.
Model performance assessment using C-index
The C-index findings in Table 13 reveal that the Random Forest model outperforms all other algorithms on the train (85%), validation (68%), and test sets (74%), 10–17% higher than SVM and 4–11% higher than Logistic Regression. Logistic Regression ranks second with 75% (train), 63% (validation), and 71% (test), outperforming SVM by 3–8% and Decision Tree by 2–12%, depending on the dataset. SVM's 70% train, 58% validation, and 67% test performance shows its limits in capturing the non-linear correlations in survival data. Due to overfitting and poor generalization, Decision Tree performs slightly better than SVM on the test set (69%) but trails Random Forest by 5–15%. Random Forest's ensemble learning and feature randomness give it better risk prediction than Logistic Regression, SVM, and Decision Tree, all of which show lower C-index scores.
Table 13.
Phase-3 results
| C-Index (%) | SVM | Logistic regression | Random forest | Decision tree |
|---|---|---|---|---|
| Train set | 70% | 75% | 85% | 72% |
| Validation set | 58% | 63% | 68% | 60% |
| Test set | 67% | 71% | 74% | 69% |
From Table 14, the Random Forest algorithm outperforms all other machine-learning models for breast cancer relapse prediction in precision (77.40%), recall (69.80%), F1-score (73.40%), and accuracy (71.92%). SVM outperforms Logistic Regression by 2–4% across precision, recall, F1-score, and accuracy (73.50%, 66.10%, 69.60%, and 68.92%). Logistic Regression shows intermediate precision (71.20%), recall (65.40%), and accuracy (67.85%), while Decision Tree has good precision (74.80%) but reduced recall (63.90%), resulting in 66.45% accuracy. Random Forest benefits from ensemble learning and improved generalization, SVM handles complex boundaries better than linear Logistic Regression, and Decision Trees overfit, reducing recall and accuracy.
Table 14.
Phase-3 results on baseline models
| Algorithm | Precision (%) | Recall (%) | F1-Score (%) | Accuracy (%) |
|---|---|---|---|---|
| Logistic regression | 71.20 | 65.40 | 68.18 | 67.85 |
| SVM | 73.50 | 66.10 | 69.60 | 68.92 |
| Decision tree | 74.80 | 63.90 | 68.90 | 66.45 |
| Random forest | 77.40 | 69.80 | 73.40 | 71.92 |
Figure 18 displays how clinical and pathological factors alter a patient’s predicted breast cancer relapse risk (red or blue) compared to the model’s baseline forecast. In the diagram, high tumor stage, large tumor size, positive ER/PR status, and relapse-related markers greatly raise the estimated probability of relapse, whereas younger age, smaller tumor size, and favorable hormone receptor status minimize it. Features with higher percentages or SHAP values influence prediction more than weaker predictors, as seen by the thickness and length of the colored bars. The model integrates clinical characteristics, with the most relevant variables affecting prediction more than others.
Fig. 18.
Factors that lead to breast cancer resurgence in five years
The SHAP summary in Fig. 19 demonstrates how the top 17 clinical and molecular markers predict breast cancer relapse. The strongest SHAP effects come from tumor size, mutation count, number of positive lymph nodes, age at diagnosis, Nottingham Prognostic Index, and tumor stage, which together account for 55–60% of the overall predictive impact. Medium-level characteristics such as HER2 Status, Cancer Type, and Histologic Grade account for 20–25%, whereas therapeutic and hormonal factors such as Hormone Therapy, Chemotherapy, ER Status, PR Status, and Radio Therapy account for 15–20%. High-impact clinical characteristics directly signal tumor aggressiveness and disease burden, making them biologically more relevant for relapse risk than therapy-related or receptor-status parameters, which modulate progression. A comparative analysis with earlier studies is presented in Table 15.
Fig. 19.
SHAP values and their influence on model outcome
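The feature ordering in a SHAP summary plot is typically by mean absolute SHAP value. A minimal sketch with a hypothetical SHAP matrix (the values and feature names are illustrative, not the paper's):

```python
import numpy as np

def shap_importance_ranking(shap_values, feature_names):
    """Rank features by mean absolute SHAP value, the quantity a SHAP
    summary plot orders features by. `shap_values` is (n_samples, n_features)."""
    mean_abs = np.abs(shap_values).mean(axis=0)
    order = np.argsort(mean_abs)[::-1]           # descending importance
    return [(feature_names[i], float(mean_abs[i])) for i in order]

# Hypothetical SHAP values for three features over three patients
vals = np.array([[0.40, -0.05, 0.10],
                 [-0.35, 0.02, -0.12],
                 [0.30, -0.04, 0.08]])
ranking = shap_importance_ranking(vals, ["tumor_size", "chemotherapy", "age"])
```

Taking the absolute value before averaging matters: a feature that pushes predictions strongly in both directions still ranks as important, even though its signed contributions cancel.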
Table 15.
Comparative analysis with earlier studies
| Studies | Methodology | Data used | Concordance index |
|---|---|---|---|
| [73] | DeepHit; CoxPH model; Random Survival Forests model | METABRIC data | 0.82 |
| [74] | Random Survival Forests model; CoxPH model | METABRIC data | 0.74 |
| [75] | Random Survival Forests model; CoxPH model | METABRIC data | 0.76 |
| [76] | Random Survival Forests model; CoxPH model | Breast carcinoma data (NCR) | 0.77 |
| [77] | DeepHit; Cox-Time model; Random Survival Forests model | Oral cavity cancer | 0.84 |
| [78] | DeepHit; CoxPH model; Random Survival Forests model | Clinicopathological variables (METABRIC) | 0.82 |
| Proposed work | Multi-Phase Survival Modeling (Histopathology + Immunoscore + Clinical fusion) | METABRIC + BreakHis and NCTB databases | 0.90 |
Survival prediction research on METABRIC and similar cancer datasets has mostly used conventional or semi-deep learning methods such as CoxPH, Cox-Time, DeepHit, and Random Survival Forests. Most of these works report concordance index (C-index) values between 0.74 and 0.82: METABRIC-based RSF and CoxPH models report 0.74–0.76 [74, 75], while hybrid DeepHit and RSF approaches performed somewhat better at 0.82 [73, 78]. RSF and Cox-based models achieve a C-index of 0.77 on non-METABRIC datasets such as the NCR breast cancer data [76], and oral cavity cancer reached 0.84 with DeepHit, Cox-Time, and RSF models [77]. These approaches struggle to handle heterogeneous multimodal inputs and may miss intricate long-term survival dependencies. Instead, the proposed study uses a multi-phase deep learning fusion architecture that integrates histopathological patches, immunohistochemistry-derived Immunoscore, and clinical factors for richer representation learning and risk modeling. By learning cross-modal prognostic markers that single-modality techniques cannot, this integrative architecture outperforms the existing models, reaching a C-index of 0.90.
In terms of computational feasibility, the proposed PIL framework was implemented on an NVIDIA RTX 3090 GPU (24 GB) and completed end-to-end training within approximately 4 h, with an average inference time of 0.4 s per image patch. The modular structure enables independent and parallel execution of PatchSight, ImmuneMap, and LifeSpan modules, facilitating scalability across computational nodes. Although the LIME analysis is relatively resource-intensive, it was applied to representative samples to balance interpretability and efficiency. For deployment, the framework supports TensorFlow Serving and ONNX runtime, making it adaptable for clinical integration. Future optimization through model compression and quantization will further enhance real-time feasibility and reduce hardware dependency.
While the proposed PIL framework shows strong performance on public datasets, its current validation is limited to benchmark data. In clinical practice, factors such as staining variability, imaging hardware, and patient diversity may affect predictions. To enhance generalization, we incorporated stain normalization, patch-based augmentation, and patient-level validation. Future work will include testing on multi-institutional datasets (e.g., TCGA-BRCA, hospital archives) with domain adaptation and model calibration. We also acknowledge that clinical deployment requires workflow integration, interpretability, and regulatory validation.
Limitations
Although the proposed PIL framework demonstrates superior performance in diagnostic classification, immune profiling, and survival prediction, several limitations remain that warrant acknowledgment and future improvement. The evaluation relied mainly on public datasets such as BreakHis, LYSTO, and METABRIC, which may not capture the full diversity of clinical data, potentially limiting generalizability across different populations and institutions. The deep learning models used, particularly InceptionResNetV2 and Faster R-CNN, are computationally intensive, requiring high-end GPU resources that may hinder real-time or low-resource deployment. Furthermore, while the framework integrates histopathological, immunohistochemical, and clinical data effectively, it does not yet include genomic or radiomic features that could enhance interpretability and prognostic power. Finally, the model’s performance may be influenced by data imbalance and annotation quality. Future work will focus on multi-center validation, inclusion of additional data modalities, and model optimization for broader clinical applicability [79–83].
Conclusion
The PIL framework proposes a unified, interpretable, high-accuracy computational pathology pipeline for the harmonious integration of diagnostic classification, immune profiling, and survival prediction in the analysis of breast cancer. By optimizing InceptionResNetV2 for patch-based histopathological classification and employing an enhanced Faster R-CNN (ResNet-101) for tumor-infiltrating lymphocyte detection, the proposed framework achieves superior results of 98.76% diagnostic accuracy, 98.05% immune detection precision, and a C-index of 0.91. These collectively demonstrate the robustness, interpretability, and clinical relevance of the model. Stain normalization, patch augmentation, and hierarchical feature fusion ensure generalization across magnifications and data sources. Beyond benchmark success, the PIL framework offers a clinically scalable foundation for data-driven oncology by bridging morphological, immunological, and survival domains. Future work will investigate multi-institutional validation, model optimization, and explainability-driven clinical integration to further improve real-world applicability and support personalized cancer diagnosis and prognosis.
Acknowledgements
The authors extend their appreciation to the University Higher Education Fund for funding this research work under the Research Support Program for Central Labs at King Khalid University through project number CL/CO/A/7.
Appendix
Table 16.
Experimental reproducibility summary
| Component | Description/implementation details | Parameter/setting |
|---|---|---|
| Data Sources | BreakHis (Histopathology), LYSTO (IHC), METABRIC (Clinical + Survival) | Magnifications: 40×, 100×, 200×, 400× |
| Preprocessing pipeline | Color normalization (Reinhard), resizing to 299 × 299 × 3, patch extraction using patchify, noise filtering, normalization (ImageNet mean/std) | Patch size: 300 × 300 × 3; stride: 150 |
| Data augmentation | Rotation (± 90°), horizontal/vertical flips, brightness adjustment (± 20%), zoom (0.9–1.1×) | Applied uniformly across all magnifications |
| Feature engineering | CNN-based extraction (InceptionResNetV2 backbone) + spatial attention pooling; Immunoscore derived from lymphocyte density; standardized clinical features | PatchSight embeddings + ImmuneMap outputs + clinical covariates |
| Feature selection | LASSO for sparse variable selection, PCA for dimensionality reduction | α = 0.01; retained variance = 95% |
| Hyperparameter tuning (PatchSight) | Optimizer: Adam; LR = 1 × 10⁻⁴; Batch size = 32; Dropout = 0.4; L2 = 1 × 10⁻⁴ | Epochs = 30; Early stopping = 7; ReduceLROnPlateau = 0.1 |
| Hyperparameter tuning (ImmuneMap) | Optimizer: Adam; LR = 1 × 10⁻⁴; IoU = 0.5; Anchor scales = [64,128,256]; Batch = 8 | Epochs = 25; Dropout = 0.5 |
| Hyperparameter tuning (LifeSpan) | DeepHit: LR = 0.0005; Layers = [256,128,64]; Dropout = 0.4; Epochs = 300 | RSF: 1000 trees; Cox PH: L2 = 1e − 4 |
| Evaluation metrics | Accuracy, Precision, Recall, F1-score (Phase I–II); C-index, Brier score, Log-rank test (Phase III) | 5-fold validation; 10% test split |
| Cross-validation strategy | Stratified 5-fold cross-validation | Fold size = 20% |
| Random seed settings | NumPy: 42; TensorFlow: 42; PyTorch: 42; Scikit-learn: 42 | Fixed for all runs |
| Implementation environment | Python 3.10; TensorFlow 2.12; PyTorch 2.1; scikit-learn; scikit-survival; Lifelines | GPU: NVIDIA RTX 3090; RAM: 64 GB |
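The patch-extraction setting in Table 16 (300 × 300 × 3 windows with stride 150, as produced by the patchify step) can be sketched as follows. This is a minimal illustration, not the released implementation: the function name and the dummy image size are assumptions for demonstration.

```python
import numpy as np

def extract_patches(image, patch=300, stride=150):
    """Slide a patch x patch window over the image with the given stride
    (Table 16: patch size 300 x 300 x 3, stride 150)."""
    h, w = image.shape[:2]
    windows = []
    for y in range(0, h - patch + 1, stride):
        for x in range(0, w - patch + 1, stride):
            windows.append(image[y:y + patch, x:x + patch])
    return np.stack(windows)

# Illustrative 700 x 460 RGB region: 3 vertical x 2 horizontal window
# positions yield 6 overlapping patches.
patches = extract_patches(np.zeros((700, 460, 3), dtype=np.uint8))
```

With a stride of half the patch size, adjacent windows overlap by 50%, which increases the number of training samples per slide before the augmentations listed above are applied.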
Table 17.
Dataset specification and data split summary
| Dataset | Version/source | Data type | Preprocessing applied | Train/validation/test split | Notes |
|---|---|---|---|---|---|
| BreakHis | v1.0 (Public, P&D Lab, Brazil) | Histopathology images (benign/malignant, 40×–400×) | Color normalization, resizing (299 × 299 × 3), patch extraction (300 × 300 × 3, stride = 150) | 80%/10%/10% (patient-level split) | Augmentation applied only to training set |
| LYSTO | 2021 Hackathon release | IHC images (CD3+, CD8+) for TIL detection | Stain normalization, resizing (299 × 299 × 3), patch extraction at 40× magnification | 9,228/1,908/1,500 images | Balanced across scattered, clustered, and artifact regions |
| METABRIC | v2020 (cBioPortal) | Clinical + survival data | Missing value removal, standardization, one-hot encoding, label encoding | 60%/20%/20% (stratified split) | Only clinical and survival features used; no synthetic augmentation |
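The patient-level 80%/10%/10% split reported for BreakHis in Table 17 assigns whole patients, rather than individual images, to each partition so that no patient's images leak across splits. A minimal sketch under that assumption follows; the helper name and the synthetic patient identifiers are illustrative, and the released code should be consulted for the exact procedure.

```python
import random

def patient_level_split(patient_ids, train_frac=0.80, val_frac=0.10, seed=42):
    """Shuffle unique patient IDs with a fixed seed and partition them
    80%/10%/10%, so every image of a given patient lands in one split."""
    ids = sorted(set(patient_ids))
    random.Random(seed).shuffle(ids)
    n_train = int(len(ids) * train_frac)
    n_val = int(len(ids) * val_frac)
    return (ids[:n_train],
            ids[n_train:n_train + n_val],
            ids[n_train + n_val:])

# Illustrative patient identifiers.
train, val, test = patient_level_split([f"P{i:03d}" for i in range(100)])
```

Splitting by patient rather than by image is what makes the reported test accuracy an estimate of generalization to unseen patients, since near-duplicate patches from one slide never appear on both sides of the split.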
Table 18.
Statistical significance and variability analysis
| Model | Metric | Mean ± SD (Across 5 Folds) | 95% confidence interval | p-value (vs. Baseline) | Statistical test |
|---|---|---|---|---|---|
| PatchSight (BreakHis) | Accuracy | 98.62 ± 0.41 | [98.20, 99.04] | p < 0.01 | Paired t-test |
| | F1-Score | 0.98 ± 0.02 | [0.97, 0.99] | p < 0.01 | Wilcoxon signed-rank |
| ImmuneMap (LYSTO) | Detection Accuracy | 98.05 ± 0.38 | [97.66, 98.44] | p < 0.01 | Paired t-test |
| | F1-Score | 0.97 ± 0.03 | [0.96, 0.99] | p < 0.01 | Wilcoxon signed-rank |
| LifeSpan (METABRIC) | C-index | 0.91 ± 0.02 | [0.89, 0.93] | p < 0.01 | Paired t-test |
| | Brier Score | 0.082 ± 0.004 | [0.078, 0.086] | p < 0.01 | Wilcoxon signed-rank |
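The significance tests in Table 18 compare per-fold scores of the proposed models against a baseline. A minimal sketch with SciPy, assuming access to the five fold-level scores; the numbers below are hypothetical placeholders, not the published fold results:

```python
from scipy import stats

# Hypothetical per-fold accuracies across the 5 folds (illustrative only).
proposed = [98.9, 98.2, 98.7, 98.4, 98.9]
baseline = [96.1, 95.8, 96.4, 96.0, 95.7]

# Paired t-test on fold-wise differences (used for accuracy/C-index rows).
t_stat, p_paired_t = stats.ttest_rel(proposed, baseline)

# Wilcoxon signed-rank test, the non-parametric alternative
# (used for the F1-score and Brier-score rows).
w_stat, p_wilcoxon = stats.wilcoxon(proposed, baseline)
```

Note that with only five folds the exact two-sided Wilcoxon test cannot report a p-value below 0.0625, so "p < 0.01" in the Wilcoxon rows presumably reflects a larger set of paired comparisons than the five fold means alone.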
Author contributions
Ahmed Kateb Jumaah Al-Nussairi: study conceptualization, framework design, and methodological formulation. Ali B. M. Ali: model architecture design, algorithm optimization, and experimental analysis. Saleem Malik: data preprocessing, deep learning implementation, and performance evaluation. S. Gopal Krishna Patro: literature review, comparative analysis, and result interpretation. Chandrakanta Mahanty: immunohistochemistry analysis, ImmuneMap module development, and validation. Kasim Sakran Abass: biomedical interpretation, immunological relevance, and clinical insights. Iman Basheti: pharmaceutical relevance, translational analysis, and manuscript refinement. Adis Abebaw Dessalegn: survival modeling, LifeSpan Prognosticator design, statistical analysis, and supervision. Khursheed Muzammil: data integration, visualization, and manuscript editing. Sanjay Kumar: data integration, visualization, and manuscript editing. All authors: manuscript review and approval.
Funding
Not applicable.
Data availability
The datasets used in this study are publicly available from the following sources: BreaKHis (breast cancer histopathology images): https://www.kaggle.com/datasets/ambarish/breakhis; LYSTO (lymphocyte detection/immunohistochemistry images): https://zenodo.org/record/3513571; METABRIC (clinical and survival data): https://www.cbioportal.org/study?id=brca_metabric and https://www.synapse.org/ (Synapse: syn1688369).
Code availability
The code used in the study is available at https://github.com/saleem-saleem/PatchSight-ImmuneMap-LifeSpan-PIL-Framework.
Declarations
Ethics approval and consent to participate
This article does not contain any studies with human participants or animals performed by any of the authors.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Contributor Information
Saleem Malik, Email: baronsaleem@gmail.com.
Adis Abebaw Dessalegn, Email: addis_abebaw@dmu.edu.et.
References
- 1.Nandy A, Gangopadhyay S, Mukhopadhyay A. Individualizing breast cancer treatment—the dawn of personalized medicine. Exp Cell Res. 2014. 10.1016/j.yexcr.2013.09.002. [DOI] [PubMed] [Google Scholar]
- 2.Alam MR, Seo KJ, Yim K, Liang P, Yeh J, Chang C, Chong Y. Comparative analysis of Ki-67 labeling index morphometry using deep learning, conventional image analysis, and manual counting. Translat Oncol. 2025. 10.1016/j.tranon.2024.102159. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3.Ouhmouk M, Baichoo S, Abik M. Challenges in AI-driven multi-omics data analysis for oncology: addressing dimensionality, sparsity, transparency and ethical considerations. Inform Med Unlocked. 2025;57:101679. 10.1016/j.imu.2025.101679. [Google Scholar]
- 4.Fountzilas E, Pearce T, Baysal MA, et al. Convergence of evolving artificial intelligence and machine learning techniques in precision oncology. npj Digit Med. 2025;8:75. 10.1038/s41746-025-01471-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5.Bulusu G, Vidyasagar KEC, Mudigonda M, et al. Cancer detection using artificial intelligence: a paradigm in early diagnosis. Arch Computat Methods Eng. 2025;32:2365–403. 10.1007/s11831-024-10209-0. [Google Scholar]
- 6.Podlipnik S, Hernández–Pérez Carlos, Ficapal J, Puig S, Malvehy J, et al. Comparative analysis and interpretability of survival models for melanoma prognosis. Computers Biol Med. 2025. 10.1016/j.compbiomed.2025.110027. [DOI] [PubMed] [Google Scholar]
- 7.Zhang Y, Yang Z, Chen R, et al. Histopathology images-based deep learning prediction of prognosis and therapeutic response in small cell lung cancer. npj Digit Med. 2024;7:15. 10.1038/s41746-024-01003-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.Tiwari A, Ghose A, Hasanova M, et al. The current landscape of artificial intelligence in computational histopathology for cancer diagnosis. Discov Oncol. 2025;16:438. 10.1007/s12672-025-02212-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 9.Alzubaidi L, Zhang J, Humaidi AJ, et al. Review of deep learning: concepts, CNN architectures, challenges, applications, future directions. J Big Data. 2021;8:53. 10.1186/s40537-021-00444-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Duan L, He Y, Guo W, et al. Machine learning-based pathomics signature of histology slides as a novel prognostic indicator in primary central nervous system lymphoma. J Neurooncol. 2024;168:283–98. 10.1007/s11060-024-04665-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11.Khalili Fakhrabadi A, Shahbazzadeh MJ, Jalali N, et al. A hybrid inception-dilated-ResNet architecture for deep learning-based prediction of COVID-19 severity. Sci Rep. 2025;15:6490. 10.1038/s41598-025-91322-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12.Ahmad Z, Al-Thelaya K, Alzubaidi M, Joad F, Gilal NU, Mifsud W, Boughorbel S, Pintore G, Gobbetti E, Schneider J, Agus M. [DOI] [PubMed]
- 13.Ahmad Z. HistoMSC: density and topology analysis for AI-based visual annotation of histopathology whole slide images. Comput Biol Med. 2025. 10.1016/j.compbiomed.2025.109991. [DOI] [PubMed] [Google Scholar]
- 14.Komura D, Ochi M, Ishikawa S. Machine learning methods for histopathological image analysis: updates in 2024. Comput Struct Biotechnol J. 2025. 10.1016/j.csbj.2024.12.033. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15.Sobhani F, Robinson R, Hamidinekoo A, Roxanis I, Somaiah N, Yuan Y. Artificial intelligence and digital pathology: opportunities and implications for immuno-oncology. Biochim Biophys Acta Rev Cancer. 2021. 10.1016/j.bbcan.2021.188520. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16.Kunhoth S, Al-maadeed S, Akbari Y, et al. Computational methods for breast cancer molecular profiling using routine histopathology: A review. Arch Computat Methods Eng. 2025. 10.1007/s11831-025-10374-w. [Google Scholar]
- 17.Nguyen R, Vafaee F. Multi-omics prognostic marker discovery and survival modelling: a case study on multi-cancer survival analysis of women’s specific tumours. Sci Rep. 2025;15:36706. 10.1038/s41598-025-20572-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 18.Zhang H, Chen L, Li L, et al. Prediction and analysis of tumor infiltrating lymphocytes across 28 cancers by TILScout using deep learning. NPJ Precis Oncol. 2025;9:76. 10.1038/s41698-025-00866-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19.Fiorin A, López Pablo C, Lejeune M, et al. Enhancing AI research for breast cancer: a comprehensive review of tumor-infiltrating lymphocyte datasets. J Digit Imaging. 2024;37:2996–3008. 10.1007/s10278-024-01043-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 20.Voon W, Hum YC, Tee YK, et al. Performance analysis of seven Convolutional Neural Networks (CNNs) with transfer learning for Invasive Ductal Carcinoma (IDC) grading in breast histopathological images. Sci Rep. 2022;12:19200. 10.1038/s41598-022-21848-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 21.Al-Haija A, Adebanjo A. Breast cancer diagnosis in histopathological images using ResNet-50 convolutional neural network, 2020 IEEE International IOT, electronics and mechatronics conference (IEMTRONICS), Vancouver, BC, Canada. 2020:1–7. 10.1109/IEMTRONICS51293.2020.9216455
- 22.Vidyarthi A, Patel A. Deep assisted dense model based classification of invasive ductal breast histology images. Neural Comput Appl. 2021;33:12989–99. 10.1007/s00521-021-05947-2. [Google Scholar]
- 23.Güler M, Sart G, Algorabi Ö, Adıguzel Tuylu AN, Türkan YS. Breast cancer classification with various optimized deep learning methods. Diagnostics. 2025;15(14):1751. 10.3390/diagnostics15141751. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 24.Alanazi SA, Kamruzzaman MM, Islam Sarker MN, Alruwaili M, Alhwaiti Y, Alshammari N, et al. Boosting breast cancer detection using convolutional neural network. J Healthc Eng. 2021;2021:5528622. 10.1155/2021/5528622. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 25.Boumaraf S, Liu X, Zheng Z, Ma X, Ferkous C. A new transfer learning based approach to magnification dependent and independent classification of breast cancer in histopathological images. Biomed Signal Process Control. 2021;63:102192. 10.1016/j.bspc.2020.102192. [Google Scholar]
- 26.Maleki A, Raahemi M, Nasiri H. Breast cancer diagnosis from histopathology images using deep neural network and XGBoost. Biomed Signal Process Control. 2023. 10.1016/j.bspc.2023.105152. [Google Scholar]
- 27.Rasheed M, Jaffar MA, Akram A, et al. Improved brain tumor classification through DenseNet121 based transfer learning. Discover Oncol. 2025;16:1645. 10.1007/s12672-025-03501-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 28.Liew XY, Hameed N, Clos J. An investigation of XGBoost-based algorithm for breast cancer classification. Mach Learn Appl. 2021;6:100154. 10.1016/j.mlwa.2021.100154. [Google Scholar]
- 29.Liu M, He Y, Wu M, Zeng C. Breast histopathological image classification method based on autoencoder and siamese framework. Information. 2022;13(3):107. 10.3390/info13030107. [Google Scholar]
- 30.Abousamra S, Gupta R, Hou L, Batiste R, Zhao T, Shankar A, et al. Deep learning-based mapping of tumor infiltrating lymphocytes in whole slide images of 23 types of cancer. Front Oncol. 2022;11:806603. 10.3389/fonc.2021.806603. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 31.Bussola N, Papa B, Melaiu O, Castellano A, Fruci D, Jurman G. Quantification of the immune content in neuroblastoma: deep learning and topological data analysis in digital pathology. Int J Mol Sci. 2021;22:8804. 10.3390/ijms22168804. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 32.Sarvamangala DR, Kulkarni RV. Convolutional neural networks in medical image understanding: a survey. Evol Intel. 2022;15:1–22. 10.1007/s12065-020-00540-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 33.Yosofvand M, Khan SY, Dhakal R, Nejat A, Moustaid-Moussa N, Rahman RL, et al. Automated detection and scoring of tumor-infiltrating lymphocytes in breast cancer histopathology slides. Cancers. 2023;15(14):3635. 10.3390/cancers15143635. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 34.Rauf Z, Khan AR, Sohail A, et al. Lymphocyte detection for cancer analysis using a novel fusion block based channel boosted CNN. Sci Rep. 2023;13:14047. 10.1038/s41598-023-40581-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 35.Zafar MM, Rauf Z, Sohail A, Khan AR, Obaidullah M, Khan SH, Lee YS, Khan A. Detection of tumour infiltrating lymphocytes in CD3 and CD8 stained histopathological images using a two-phase deep CNN. Photodiagnosis Photodyn Ther. 2022. 10.1016/j.pdpdt.2021.102676. [DOI] [PubMed] [Google Scholar]
- 36.Negahbani F, Sabzi R, Pakniyat Jahromi B, Firouzabadi D, Movahedi F, Kohandel Shirazi M, et al. Pathonet introduced as a deep neural network backend for evaluation of Ki-67 and tumor-infiltrating lymphocytes in breast cancer. Sci Rep. 2021;11(1):8489. 10.1038/s41598-021-86912-w. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 37.Hamedi SZ, Emami H, Khayamzadeh M, et al. Application of machine learning in breast cancer survival prediction using a multimethod approach. Sci Rep. 2024;14:30147. 10.1038/s41598-024-81734-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 38.Jin L, Zhao Q, Fu S, Cao F, Hou B, Ma J. Development and validation of machine learning models to predict survival of patients with resected stage-III NSCLC. Front Oncol. 2023;13:1092478. 10.3389/fonc.2023.1092478. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 39.Kayikci S, Khoshgoftaar TM. Breast cancer prediction using gated attentive multimodal deep learning. J Big Data. 2023;10:62. 10.1186/s40537-023-00749-w. [Google Scholar]
- 40.Goli S, Mahjub H, Faradmal J, Mashayekhi H, Soltanian AR. Survival prediction and feature selection in patients with breast cancer using support vector regression. Comput Math Methods Med. 2016;2016:2157984. 10.1155/2016/2157984. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 41.Kalafi EY, Nor NAM, Taib NA, et al. Machine learning and deep learning approaches in breast cancer survival prediction using clinical data. Folia Biol (Praha). 2019;65(5–6):212–20. 10.14712/fb2019065050212. [DOI] [PubMed] [Google Scholar]
- 42.Abdikenov B, Iklassov Z, Sharipov A, Hussain S, Jamwal PK. Analytics of heterogeneous breast cancer data using neuroevolution. IEEE Access. 2019;7:18050–60. 10.1109/ACCESS.2019.2897078. [Google Scholar]
- 43.Simsek S, Kursuncu U, Kibis E, AnisAbdellatif M, Dag A. A hybrid data mining approach for identifying the temporal effects of variables associated with breast cancer survival. Expert Syst Appl. 10.1016/j.eswa.2019.112863. [Google Scholar]
- 44.Alzu’bi A, Najadat H, Doulat W, et al. Predicting the recurrence of breast cancer using machine learning algorithms. Multimed Tools Appl. 2021;80:13787–800. 10.1007/s11042-020-10448-w. [Google Scholar]
- 45.Al-Yarimi FA. Comparative evaluation of data mining algorithms in breast cancer. Computers Mat Continua. 2023;77(1):633–45. [Google Scholar]
- 46.Ansari G, Shafi S, Ansari MD, Ahmad S, Abdeljaber H. Prediction and diagnosis of breast cancer using machine learning techniques Predicción y diagnóstico Del cáncer de Mama mediante técnicas de Aprendizaje automático. Data Metadata. 2024;03:346. 10.56294/dm2024.346. [Google Scholar]
- 47.Siddharth R, Gupta. Prediction time of breast cancer tumor recurrence using machine learning. Cancer Treat Res Commun. 2022;32:100602. 10.1016/j.ctarc.2022.100602. [DOI] [PubMed] [Google Scholar]
- 48.Azeroual S, Ben-Bouazza F, Naqi A, Sebihi R. Triple negative breast cancer and non-triple negative breast cancer recurrence prediction using boosting models. In: Kacprzyk, J., Ezziyyani, M., Balas, V.E, editors international conference on advanced intelligent systems for sustainable development. AI2SD 2022. Lecture Notes in Networks and Systems. Springer, Cham. 2023.10.1007/978-3-031-35248-5_39
- 49.Gu D, Su K, Zhao H. A case-based ensemble learning system for explainable breast cancer recurrence prediction. Artif Intell Med. 2020;107:101858. 10.1016/j.artmed.2020.101858. [DOI] [PubMed] [Google Scholar]
- 50.Bhardwaj A, Bhardwaj H, Sakalle A, Uddin Z, Sakalle M, Ibrahim W. Tree-based and machine learning algorithm analysis for breast cancer classification. Comput Intell Neurosci. 2022;2022:6715406. 10.1155/2022/6715406. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 51.Shamshiri MA, Krzyżak A, Kowal M, Korbicz J. Compatible-domain transfer learning for breast cancer classification with limited annotated data. Comput Biol Med. 2023;154:106575. 10.1016/j.compbiomed.2023.106575. [DOI] [PubMed] [Google Scholar]
- 52.Mall PK, Singh PK, Srivastav S, Narayan V, Paprzycki M, Jaworska T, et al. A comprehensive review of deep neural networks for medical image processing: recent developments and future opportunities. Healthcare Analytics. 2023;4:100216. 10.1016/j.health.2023.100216. [Google Scholar]
- 53.Chaieb M, Azzouz M, Refifa MB, Fraj M. Deep learning-driven prediction in healthcare systems: applying advanced CNNs for enhanced breast cancer detection. Comput Biol Med. 2025;189:109858. 10.1016/j.compbiomed.2025.109858. [DOI] [PubMed] [Google Scholar]
- 54.Ding R, Zhou X, Tan D, et al. A deep multi-branch attention model for histopathological breast cancer image classification. Complex Intell Syst. 2024;10:4571–87. 10.1007/s40747-024-01398-z. [Google Scholar]
- 55.Ibrahim A, Gamble P, Jaroensri R, Abdelsamea MM, Mermel CH, Chen P-HC, et al. Artificial intelligence in digital breast pathology: techniques and applications. Breast. 2020;49:267–73. 10.1016/j.breast.2019.12.007. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 56.Zhu Z, Liu L, Free RC, Anjum A, Panneerselvam J. OPT-CO: optimizing pre-trained transformer models for efficient COVID-19 classification with stochastic configuration networks. Inf Sci. 2024;680:121141. [Google Scholar]
- 57.Zhu H, Zhu Z, Lu SY. BccT: an efficient transformer model for blood cell classification. Multimedia Syst. 2026;32(1):29. [Google Scholar]
- 58.Zhu H, Zhu Z, Lu SY. Multi-stage attention for efficient brain tumor classification with SAM-Med2D. Multimedia Syst. 2026;32(1):51. [Google Scholar]
- 59.Singh DP, Banerjee T, Mahajan S, Ramesh Chandra K, Kumar R, Phani S, Ashok M. A comprehensive study on deep learning models for the detection of diabetic retinopathy using pathological images. Arch Comput Methods Eng. 2025. 10.1007/s11831-025-10315-7. [Google Scholar]
- 60.Banerjee T, Singh DP, Kour P, Swain D, Mahajan S, Kadry S, et al. A novel unified inception-U-Net hybrid gravitational optimization model (UIGO) incorporating automated medical image segmentation and feature selection for liver tumor detection. Sci Rep. 2025;15(1):29908. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 61.Singh DP, Kour P, Banerjee T, Swain D. A comprehensive review of various machine learning and deep learning models for anti-cancer drug response prediction: comparative analysis with existing state of the art methods. Arch Comput Methods Eng. 2025;32:1–25. [Google Scholar]
- 62.Banerjee T, Singh DP, Swain D, Mahajan S, Kadry S, Kim J. A novel hybrid deep learning approach combining deep feature attention and statistical validation for enhanced thyroid ultrasound segmentation. Sci Rep. 2025;15(1):27207. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 63.Satushe V, Vyas V, Metkar S, Singh DP. AI in MRI brain tumor diagnosis: a systematic review of machine learning and deep learning advances (2010–2025). Chem Intell Lab Syst. 2025. 10.1016/j.chemolab.2025.105414. [Google Scholar]
- 64.Singh DP, Banerjee T, Kour P, Swain D, Narayan Y. CICADA (UCX): A novel approach for automated breast cancer classification through aggressiveness delineation. Comput Biol Chem. 2025;115:108368. [DOI] [PubMed] [Google Scholar]
- 65.Banerjee T, Singh DP, Kour P. Advances in deep neural, transformer learning, and kernel-based methods for diabetic retinopathy detection: a comprehensive review. Arch Comput Methods Eng. 2025. 10.1007/s11831-025-10376-8. [Google Scholar]
- 66.Narayan Y, Singh DP, Banerjee T, Kour P, Rane K, et al. A comparative evaluation of deep learning architectures for prostate cancer segmentation: introducing trionixnet with ncore multiattention mechanism. Arch Comput Methods in Eng. 2025. 10.1007/s11831-025-10411-8. [Google Scholar]
- 67.Singh DP, Banerjee T, Kour P, Malik R, Naidu GR, Kumar R, Narayan Y. A comprehensive study of enhanced computational approaches for breast cancer classification comparative analysis with existing state of the art methods. Arch Comput Methods in Eng. 2025. 10.1007/s11831-025-10414-5. [Google Scholar]
- 68.Singh DP, Banerjee T, Durai CAD, Kumar JS, Dutta C, Dass P, Swain D. A comprehensive study of various hybrid deep learning models for automated and explainable pneumonia detection in the pulmonary alveolar region current insights and future directions. Arch Comput Methods Eng. 2025. 10.1007/s11831-025-10453. [Google Scholar]
- 69.https://web.inf.ufpr.br/vri/databases/breast-cancer-histopathological-database-breakhis/
- 70.Zhang J, Zhang M, Tian Q, et al. A novel model associated with tumor microenvironment on predicting prognosis and immunotherapy in triple negative breast cancer. Clin Exp Med. 2023;23:3867–81. 10.1007/s10238-023-01090-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 71.https://www.kaggle.com/datasets/raghadalharbi/breast-cancer-gene-expression-profiles-metabric
- 72.https://nctb.iitm.ac.in/index-2.html
- 73.Evangeline I, Kirubha SPA, Precious JG. Survival analysis of breast cancer patients using machine learning models. Multimed Tools Appl. 2023;82:30909–28. 10.1007/s11042-023-14989-8. [Google Scholar]
- 74.Buyrukoğlu G. Survival analysis in breast cancer: evaluating ensemble learning techniques for prediction. PeerJ Comput Sci. 2024;10:e2147. 10.7717/peerj-cs.2147. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 75.Shi Z, Chen Y, Liu A, et al. Application of random survival forest to establish a nomogram combining clinlabomics-score and clinical data for predicting brain metastasis in primary lung cancer. Clin Transl Oncol. 2025;27:1472–83. 10.1007/s12094-024-03688-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 76.Jia Y, Li C, Feng C, Sun S, Cai Y, Yao P, Ma X. Prognostic prediction for inflammatory breast cancer patients using random survival forest modeling. Translational Oncol. 2025. 10.1016/j.tranon.2024.102246. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 77.Adeoye J, Hui L, Koohi-Moghadam M, Tan JY, Choi SW, Thomson P. Comparison of time-to-event machine learning models in predicting oral cavity cancer prognosis. Int J Med Inform. 2022;157:104635. 10.1016/j.ijmedinf.2021.104635. [DOI] [PubMed] [Google Scholar]
- 78.Xiao J, et al. The application and comparison of machine learning models for the prediction of breast cancer prognosis: retrospective cohort study. JMIR Med Inform. 2022;10:e33440. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 79.Hashim HT, et al. Artificial intelligence versus radiologists in detecting early-stage breast cancer from mammograms: a meta-analysis of paradigm shifts. Pol J Radiol. 2025;90:e1. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 80.Nazari E, et al. Breast cancer prediction using different machine learning methods applying multi factors. J Cancer Res Clin Oncol. 2023;149(19):17133–46. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 81.Behzadi M, et al. The potential diagnostic application of artificial intelligence in breast cancer. Curr Pharm Des. 2025. 10.2174/0113816128369168250311172823. [DOI] [PubMed] [Google Scholar]
- 82.Alawee WH, et al. A data augmentation approach to enhance breast cancer detection using generative adversarial and artificial neural networks. Open Eng. 2024;14(1):20240052. [Google Scholar]
- 83.Hashim HT, et al. Assessment of breast cancer risk among Iraqi women in 2019. BMC Womens Health. 2021;21(1):412. [DOI] [PMC free article] [PubMed] [Google Scholar]
Associated Data
This section collects any data citations, data availability statements, or supplementary materials included in this article.
Data Availability Statement
The datasets used in this study are publicly available from the following sources: BreaKHis (breast cancer histopathology images): https://www.kaggle.com/datasets/ambarish/breakhis; LYSTO (lymphocyte detection/immunohistochemistry images): https://zenodo.org/record/3513571; METABRIC (clinical and survival data): https://www.cbioportal.org/study?id=brca_metabric and https://www.synapse.org/ (Synapse: syn1688369).
The code used in the study is available at https://github.com/saleem-saleem/PatchSight-ImmuneMap-LifeSpan-PIL-Framework.