Abstract
Breast cancer is the most frequently diagnosed cancer among women and persists as a societal problem worldwide. It remains a leading cause of cancer-associated morbidity and mortality, particularly in low- and middle-income countries where access to timely diagnosis and treatment is often limited. This study compares survival and classical machine learning models for predicting breast cancer survival in Ethiopia, with the goal of identifying approaches that balance predictive accuracy with interpretability. The study utilized retrospective data from 1164 women treated at Tikur Anbesa Specialized Hospital and Hiwot Fana Specialized University Hospital between 2019 and 2024. Kaplan–Meier estimation, Cox proportional hazards, random survival forests (RSF), DeepSurv, and classical machine learning classifiers (SVM, XGBoost, LGBM, and RF) were applied and evaluated with metrics such as AUC, C-index, and Integrated Brier Score (IBS). The Shapley Additive Explanations (SHAP) approach was used to ensure the interpretability of results from the RSF, DeepSurv, and random forest (RF) models, allowing the identification of important predictors of breast cancer outcome that were consistent across models. The findings demonstrated that the random survival forest and random forest achieved the highest performance (C-index: 0.754; IBS: 0.091) and (0.729 ± 0.006), respectively, outperforming the other models under consideration. The SHAP analysis for the RSF model identified age, tumour size, metastasis, stage, comorbidities, and marital status as the most important predictors of breast cancer survival. Furthermore, the SHAP analysis for the RF model indicated that the higher age category (45 and above), metastasis status (M1), stage four, and larger tumour size exert a strong influence on predictions. Among the machine learning models, the random forest algorithm effectively identifies the key predictors of breast cancer outcomes.
For the survival analysis methods, the RSF offers robust capabilities for handling time-to-event data and censoring, making it well-suited for accurate survival prediction. By combining these approaches, we were able to gain clearer insights and better identify the key factors influencing breast cancer prognosis. This study highlights the value of data-driven methods in helping healthcare professionals identify high-risk patients with greater precision and take timely, informed actions to support their care.
Supplementary Information
The online version contains supplementary material available at 10.1038/s41598-026-40565-9.
Keywords: Breast, Cancer, SHAP, DeepSurv, C-index
Subject terms: Cancer, Computational biology and bioinformatics, Diseases, Medical research, Oncology, Risk factors
Introduction
Worldwide, breast cancer is the most frequently diagnosed cancer among women and persists as a societal problem. In 2022, an estimated 2.3 million women were diagnosed with breast cancer globally, and an estimated 670,000 deaths were registered1. Although developments in screening, diagnosis, and treatment have improved the survival of women in high-income countries, inequalities persist, particularly in low- and middle-income regions such as Sub-Saharan Africa. Likewise, breast cancer survival rates differ meaningfully across geographical locations2. It is the second leading cause of cancer-associated deaths in sub-Saharan Africa, where most breast cancer patients are women3,4. About 97% of all breast cancer cases in Africa occur in women4. The worldwide cancer burden is estimated to reach 28.4 million by 2040, about a 47% increase over the burden in 2020. In sub-Saharan Africa, almost 80% of breast cancer cases are detected at late stages, compared to 15% in developed countries5.
Likewise, breast cancer incidence is increasing in Ethiopia, where it accounts for 34% of all cancer cases and is the second leading cause of all cancer-related mortality6,7. Most women in Ethiopia are diagnosed at an advanced stage due to limited screening infrastructure, lack of awareness, and resource limitations within the health sector8. Only about one in five women with breast cancer receives a diagnosis. Some of the risk factors for the development of breast cancer are family history, lack of breastfeeding, high cholesterol, and the use of processed foods9. These contextual problems underline the urgent need for a detailed understanding of risk factors and for accurate, transparent predictive tools that can guide customized treatment and resource allocation.
Consequently, precise survival prediction is vital for stratifying patient risk and planning treatment in clinical decision-making. However, classical clinical predictive models often depend on a limited set of predictors such as tumour size, stage, and age at diagnosis. Though informative, these methods may not fully capture the heterogeneity of breast cancer or the intricate interactions among demographic, clinical, and molecular factors10. This constraint is especially evident in low-resource countries, where diverse genetic backgrounds, late diagnosis, and different epidemiological characteristics complicate survival prediction9,11,12. To overcome such limitations, there is a rising demand for advanced analytical approaches that can accommodate nonlinearities, interactions, and high-dimensional data structures.
Recently, in addition to classical survival data analysis models, machine learning techniques have been developed as promising tools for event prediction in oncology. Methods such as Support Vector Machine (SVM), Random Forest (RF), Extreme Gradient Boosting (XGBoost), and Light Gradient Boosting Machine (LGBM) have reliably shown strong predictive performance across different public health applications8,13–15. Such techniques are especially valuable for breast cancer prediction, since they can handle large-scale, heterogeneous data while uncovering hidden patterns that classical methods may overlook. For instance, support vector machines perform well on high-dimensional biomedical datasets, and ensemble tree-based techniques leverage bagging or boosting to improve generalizability and minimize overfitting15. In resource-limited contexts such as Ethiopia, where genomic data may be scarce, machine learning methods trained on readily available demographic and clinical factors can provide practical and useful survival predictions.
Survival analysis is vital in modeling time-to-event outcomes in breast cancer, where censoring is common. The most widely used model in survival analysis is the Cox proportional hazards model, preferred for its interpretability and its ability to estimate covariate effects on hazard rates16. However, in heterogeneous populations the assumptions of proportional hazards and linear covariate effects are often violated. Such violations are common in developing regions such as Sub-Saharan Africa, where late-stage diagnosis and comorbidities complicate survival dynamics17. Models such as the random survival forest (RSF) address these restrictions by capturing nonlinearities and interactions without proportional hazards assumptions13,18. Although more accurate, RSFs are criticized for limited interpretability and clinical applicability19. More recently, deep learning models such as DeepSurv (deep neural networks for survival analysis) have been developed to integrate complex molecular and clinical characteristics while retaining hazard-based outputs20,21. Notwithstanding their potential, concerns remain around data demands, generalizability, and the “black-box” nature of such models22–24. As a whole, though newer methods extend beyond the Cox proportional hazards model, harmonizing predictive accuracy with interpretability and reasonable validation remains a key challenge in breast cancer survival analysis. Furthermore, classical machine learning (ML) models face constraints in survival analysis because they cannot correctly account for censoring and time-to-event data, which results in biased predictions. To overcome these problems, survival-specific techniques such as Cox proportional hazards (CPH), RSF, and DeepSurv were developed. Specifically, the RSF extends random forests to accommodate censoring.
Similarly, DeepSurv integrates the CPH model into a deep learning setting to accommodate complex and nonlinear relationships in public health data. This is where a data-driven approach becomes essential for guiding practitioners in selecting appropriate models. Recent studies have systematically benchmarked both traditional and machine learning models for breast cancer survival prediction25. One such study compared Cox proportional hazards (CPH), random survival forests (RSF), DeepSurv, and several classical machine learning models, including random forests (RF), XGBoost, support vector machines (SVM), and LightGBM, on SEER breast cancer data, evaluating performance using the concordance index (C-index), integrated Brier score (IBS), and area under the curve (AUC)25. Similar comprehensive comparisons have been conducted on other oncological datasets15,26,27. The present study extends this methodological framework to a critically understudied population and clinical context. Although this approach follows the established comparative paradigm, its primary contribution lies in its application within a unique, resource-limited setting. To the best of the authors’ knowledge, this is the first comprehensive survival modeling benchmark for breast cancer using a multi-center Ethiopian cohort. The findings provide evidence-based guidance on prognostic modeling tailored to the specific epidemiological, clinical, and infrastructural characteristics of low-resource health systems. Building on this, the current study aimed to evaluate both classical and survival machine learning techniques for predicting breast cancer outcomes, with the aim of improving predictive accuracy and supporting precision in treatment.
Materials and methods
Source of data
The study was carried out at Tikur Anbesa Specialized Hospital (TASH) in Addis Ababa (Finfinnee) and Hiwot Fana Specialized University Hospital (HFSUH) in Harar, Harari Regional State. The target population comprised women with a diagnosis of breast cancer who were admitted and treated between September 2019 and April 2024. Secondary data from patient cards at both hospitals were used in this investigation. The dataset consists of 1,164 instances with 13 attributes; the target variable is patient status. To improve the reliability of the analysis, patients with missing values for key variables such as tumor size, metastasis, lymph node, and stage were omitted, and patients younger than 15 years of age were excluded from the study. The main focus of the analysis was survival time, expressed in months, using the survival-months variable as the time-to-event metric and the status variable to indicate the occurrence of the event of interest (death) or censoring. In this study, the status variable indicates whether a patient died during the follow-up period (event = 1) or whether their data were censored (event = 0). Tumor size, cancer stage, metastasis, node, age, number of children, marital status, habits, residence, prior surgery, ECOG status, and breastfeeding status were the predictors taken into account. This study was conducted in accordance with all relevant hospital guidelines and regulations. Retrospective data collection was approved by the College of Health and Medical Science at Haramaya University and by Tikur Anbesa Specialized Hospital. A detailed description of the predictor variables is presented in Table 1.
Table 1.
Predictors of breast cancer and their description.
| Variables | Description |
|---|---|
| Tumor size | Indicates the size of the tumor and is categorized into five groups (T0, T1, T2, T3, T4) |
| Metastasis | It indicates the spread of cancer from its original site to a different part of the body (M0- no distant metastasis, M1- expanded to other parts of the body) |
| Lymph Node | N0-no cancer in the lymph nodes, N1-spread to 1 to 3 lymph nodes, N2-spread to 4 to 9 lymph nodes, N3-spread to 10 or more lymph nodes |
| Age of the patient | It is categorized into four: 15-24.9, 25-34.9, 35-44.9, 45 and above. |
| Number of children | It is categorized into three: 1 for 0–2, 2 for 3–5, and 3 for 6 and above. |
| Marital Status | Marital status of the patient is categorized into: 1 for single, 2 for married, 3 for divorced, and 4 for widowed. |
| Habits | Numbered 0 to 4, 0 for none, 1 for smoking, 2 for alcohol, 3 for khat, 4 for others |
| Residence | Classified as 1 for urban, and 2 for rural |
| Previous Surgery | 1 for yes and 0 for no |
| ECOG | 1 for I, and 2 for II |
| Breastfeeding | Classified as 1 for yes, and 2 for no |
| Comorbid illness | Categorized as 1 for yes, and 2 for no |
| Cancer stage | Numbered 1 to 5, 1 for stage 0, 2 for stage I, 3 for stage II, 4 for stage III, and 5 for stage IV |
Methods of data analysis
The current study used various survival and machine learning techniques to investigate the predictors of breast cancer survival in patients. Relevant predictors were identified through an assessment of prior research to validate their suitability for predictive survival and ML modeling. The study employed different survival and ML models: Kaplan-Meier (KM), CPH, RSF, DeepSurv, SVM, XGBoost, LGBM, and RF.
Kaplan-Meier (KM): KM is a non-parametric survival method used as an alternative to parametric methods when the distribution of the data is in doubt. KM provides insight into the shape of the survival function for each group and is therefore used in this paper to estimate the survival function. The KM survival curves indicate the survivorship pattern of one category relative to another: a category with a lower curve has worse survival than a category with a higher curve. Global log-rank tests were performed to compare survival distributions between groups, and pairwise log-rank tests with Bonferroni-adjusted p-values were used to identify specific differences between groups. However, KM may not detect a difference when survival curves overlap. The KM estimator for the survival function is given as follows.
$$\hat{S}(t) = \prod_{t_i \le t}\left(1 - \frac{d_i}{n_i}\right) \tag{1}$$

where $d_i$ is the number of breast cancer patients who experience the event at time $t_i$ and $n_i$ is the number of individuals still at risk (neither having experienced the event nor been censored) just before time $t_i$.
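As an illustration, the product-limit update in Eq. (1) can be sketched in a few lines of Python (a minimal didactic version using NumPy, not the analysis code used in this study):

```python
import numpy as np

def kaplan_meier(times, events):
    """Kaplan-Meier estimate of S(t) at each distinct observed event time.

    times  : follow-up time for each patient (e.g., months)
    events : 1 if the death was observed, 0 if the record is censored
    Returns (event_times, survival_probabilities).
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)
    event_times = np.unique(times[events == 1])
    probs, s = [], 1.0
    for t in event_times:
        n_i = np.sum(times >= t)                    # at risk just before t
        d_i = np.sum((times == t) & (events == 1))  # deaths at t
        s *= 1.0 - d_i / n_i                        # product-limit update
        probs.append(s)
    return event_times, np.array(probs)
```

For example, with times [1, 2, 3, 4, 5] and events [1, 1, 0, 1, 1], the censored patient at t = 3 leaves the risk set without contributing a death, so the curve only drops at t = 1, 2, 4, and 5.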
Cox proportional hazard model (CPH): Next, Cox proportional hazards regression was used to evaluate the effect of multiple covariates on breast cancer survival. CPH, developed by16, is the most common survival model used for time-to-event data. It is a semiparametric model in which covariates act on an unspecified, strictly positive baseline hazard, and it accounts for the effects of censored observations. The Cox model is used to test whether survival times differ among two or more groups. The semi-parametric hazard model is defined as
$$h(t \mid X_i) = h_0(t)\exp\left(\beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_p x_{ip}\right) \tag{2}$$

where $h(t \mid X_i)$ is the hazard function for individual $i$ at time $t$ conditional on predictors $X_i = (x_{i1}, \ldots, x_{ip})$, $h_0(t)$ is an unspecified baseline hazard function, $\beta_1, \ldots, \beta_p$ are the unknown regression parameters to be estimated, and $p$ is the number of predictors. Because the model is semi-parametric, it can flexibly estimate the baseline hazard while simultaneously measuring the impact of covariates, making it well-suited for analyzing predictors of breast cancer survival risk. The partial likelihood function is used to estimate the unknown regression parameters.
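The partial likelihood can be made concrete with a short sketch: the function below evaluates the negative log partial likelihood for the linear predictor in Eq. (2), assuming no tied event times (an illustrative simplification; real fits handle ties and optimize over beta).

```python
import numpy as np

def cox_neg_log_partial_likelihood(beta, X, times, events):
    """Negative log partial likelihood of the Cox model (no tied event times).

    beta   : (p,) regression coefficients
    X      : (n, p) covariate matrix
    times  : (n,) observed follow-up times
    events : (n,) 1 = death observed, 0 = censored
    """
    X = np.asarray(X, dtype=float)
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)
    risk = X @ np.asarray(beta, dtype=float)   # linear predictor beta' x_i
    nll = 0.0
    for i in np.where(events == 1)[0]:
        in_risk_set = times >= times[i]        # patients still at risk at t_i
        nll -= risk[i] - np.log(np.sum(np.exp(risk[in_risk_set])))
    return nll
```

Minimizing this function over beta reproduces the usual Cox fit; with beta = 0, each event simply contributes the log of its risk-set size.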
Random survival forests (RSF): RSF is employed to predict right-censored survival data13. It is an extension of the random forest: user-friendly, fairly robust, and requiring only that the number of randomly selected candidate predictors, the number of trees grown, and the splitting rule be set. Furthermore, RSF is highly data-adaptive and virtually assumption-free13. Most classical models, like CPH, depend on limiting assumptions, yet it is often of interest to determine when interactions between risk factors and non-linear effects are needed to model the association between predictors and outcome properly. Such drawbacks are handled naturally in the random survival forest setting28. The following steps fit a random survival forest, as described in13.
1. Draw $B$ bootstrap samples from the original dataset, where $B$ is the number of resamples. On average, each bootstrap sample excludes about 37% of the observations; these are the out-of-bag (OOB) data, which can be used for an unbiased assessment of model performance on unseen data.
2. For each bootstrap sample, grow a survival tree. At each node, randomly select a subset of predictor variables as candidate splitters, and split the node on the candidate that maximizes the survival difference between daughter nodes. The most common splitting rules are the log-rank and log-rank-score rules (with variants such as conservation-of-events or random log-rank also used); these rules evaluate splits by how well they separate individuals with different survival experiences.
3. Grow each tree to full size under the given stopping conditions.
4. For each tree, estimate a cumulative hazard function (CHF) and a survival function (SF). Averaging the tree-level cumulative hazard or survival estimates over all trees produces the ensemble CHF, which provides one prediction per person in the data.
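The roughly 37% out-of-bag fraction quoted in step 1 follows from $(1 - 1/n)^n \to 1/e$; a quick simulation (hypothetical data, fixed seed) confirms it:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 10_000                                    # number of observations
bootstrap = rng.integers(0, n, size=n)        # draw n indices with replacement
oob = np.setdiff1d(np.arange(n), bootstrap)   # indices never drawn: out-of-bag
oob_fraction = len(oob) / n                   # expected: (1 - 1/n)^n ≈ 1/e ≈ 0.368
```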
DeepSurv: The Cox proportional hazards model depends on a linear predictor function, which limits its ability to capture the nonlinear and high-dimensional relationships common in biomedical data. DeepSurv was developed by20 as an extension of the CPH model to handle these limitations. It replaces the linear log-risk function with a nonlinear neural network, flexibly modeling the covariates while preserving the Cox partial-likelihood setting. The model is trained using a loss function derived from the partial likelihood of the Cox model, allowing it to estimate the hazard function directly from the data, and it handles censored data well, a common feature of survival analysis20. In DeepSurv, the partial likelihood of the Cox model, given a set of censored event times, is defined as:

$$L(\theta) = \prod_{i : E_i = 1} \frac{\exp\left(\hat{h}_\theta(x_i)\right)}{\sum_{j \in R(T_i)} \exp\left(\hat{h}_\theta(x_j)\right)} \tag{3}$$

where $R(T_i)$ indicates the set of patients at risk at each event time $T_i$, and $E_i$ represents whether an event was observed for patient $i$. The loss function in DeepSurv is derived by taking the negative log of the partial likelihood:

$$\ell(\theta) = -\frac{1}{N_{E=1}} \sum_{i : E_i = 1} \left( \hat{h}_\theta(x_i) - \log \sum_{j \in R(T_i)} e^{\hat{h}_\theta(x_j)} \right) + \lambda \lVert \theta \rVert_2^2 \tag{4}$$

where $\lambda$ is the regularization parameter, balancing fit and prevention of overfitting by penalizing large parameter values.
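A minimal NumPy sketch of the loss in Eq. (4), taking the network outputs $\hat{h}_\theta(x_i)$ as given (the neural network itself is elided; `theta` stands in for the flattened weights being penalized):

```python
import numpy as np

def deepsurv_loss(risk, times, events, theta, lam=1e-3):
    """Average negative log partial likelihood plus an L2 penalty (Eq. 4).

    risk   : (n,) network outputs h_theta(x_i), one risk score per patient
    times  : (n,) observed follow-up times
    events : (n,) 1 = death observed, 0 = censored
    theta  : flattened network weights, penalized by lam * ||theta||^2
    """
    risk = np.asarray(risk, dtype=float)
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)
    n_events = max(int(events.sum()), 1)       # N_{E=1} in Eq. (4)
    nll = 0.0
    for i in np.where(events == 1)[0]:
        at_risk = times >= times[i]            # risk set R(T_i)
        nll -= risk[i] - np.log(np.sum(np.exp(risk[at_risk])))
    return nll / n_events + lam * np.sum(np.asarray(theta, dtype=float) ** 2)
```

In training, this scalar would be minimized with a gradient-based optimizer over the network weights.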
Random forest (RF): RF is an ML method widely used to overcome the limitations of decision trees by improving prediction performance and reducing instability. It works by developing multiple decision trees and aggregating their outputs via bagging29. Given a training dataset $D = \{(x_i, y_i)\}_{i=1}^{n}$, random forests generate $B$ bootstrap samples $D_1, \ldots, D_B$ (sampled with replacement) and train a decision tree $T_b$ on each. In regression, the final prediction is obtained by averaging:

$$\hat{f}(x) = \frac{1}{B} \sum_{b=1}^{B} T_b(x) \tag{5}$$

A random subset of predictors is picked at each split to reduce correlation between trees and introduce additional randomness. A key drawback of RF is its loss of interpretability despite its strong predictive performance. Variable importance, which measures how often a predictor is used in the trees, is one common approach to increase interpretability30.
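Eq. (5) is simply an average over trees grown on bootstrap resamples. The toy sketch below uses depth-1 regression stumps as stand-in base learners (illustrative only; `fit_stump`, `bagged_predict`, and the data are hypothetical, not the forests used in this study):

```python
import numpy as np

def fit_stump(x, y):
    """Fit a depth-1 regression tree (one split) on a 1-D feature."""
    xs = np.unique(x)
    if len(xs) < 2:                            # degenerate resample: constant fit
        m = float(y.mean())
        return lambda q: np.full_like(np.asarray(q, dtype=float), m)
    best = None
    for s in xs[:-1]:
        left, right = y[x <= s], y[x > s]
        err = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if best is None or err < best[0]:
            best = (err, s, left.mean(), right.mean())
    _, s, lo, hi = best
    return lambda q: np.where(np.asarray(q, dtype=float) <= s, lo, hi)

def bagged_predict(x, y, query, B=25, seed=0):
    """Eq. (5): average the predictions of B bootstrap-trained trees."""
    rng = np.random.default_rng(seed)
    preds = np.zeros(len(query), dtype=float)
    for _ in range(B):
        idx = rng.integers(0, len(x), size=len(x))   # bootstrap resample
        preds += fit_stump(x[idx], y[idx])(query)
    return preds / B
```

A full random forest additionally samples a random subset of predictors at every split, which these 1-D stumps cannot show.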
Extreme gradient boosting machines (XGBM): XGBoost is a highly efficient and scalable ML implementation of gradient-boosted decision trees31. In contrast to RF, which builds trees independently and averages their predictions, XGBoost constructs trees sequentially, where each new tree attempts to correct the errors of the prior ensemble. Given a dataset $D = \{(x_i, y_i)\}$, the model prediction at iteration $t$ is

$$\hat{y}_i^{(t)} = \hat{y}_i^{(t-1)} + f_t(x_i), \qquad f_t \in \mathcal{F} \tag{6}$$

where $\mathcal{F}$ is the space of regression trees. The objective function to be minimized combines a differentiable loss function $l$ with a regularization term $\Omega(f_t)$ that penalizes model complexity:

$$\mathcal{L}^{(t)} = \sum_{i} l\left(y_i, \hat{y}_i^{(t-1)} + f_t(x_i)\right) + \Omega(f_t), \qquad \Omega(f) = \gamma T + \frac{1}{2} \lambda \lVert w \rVert^2 \tag{7}$$

where $T$ is the number of leaves, $w$ is the vector of leaf weights, $\gamma$ penalizes the number of leaves, and $\lambda$ controls L2 regularization. To efficiently find an optimal tree structure, a second-order Taylor expansion of the loss function is used. Feature importance is used for model interpretability.
LightGBM (LGBM): LGBM is a highly efficient gradient boosting framework that improves upon traditional boosting methods such as XGBoost by introducing optimized, leaf-wise tree growth strategies and efficient handling of large-scale data32. Like XGBoost, LGBM builds trees sequentially, where each new learner minimizes the residual errors of the previous ensemble33. The model prediction at iteration $t$ is:

$$\hat{y}_i^{(t)} = \hat{y}_i^{(t-1)} + f_t(x_i) \tag{8}$$

with the objective function $\mathcal{L}^{(t)} = \sum_i l\left(y_i, \hat{y}_i^{(t)}\right) + \Omega(f_t)$, where $l$ is a differentiable loss function and $\Omega(f_t)$ is a regularization term penalizing tree complexity. Leaf-wise growth allows deeper, more complex trees in regions with higher error, which can improve accuracy over the level-wise growth of traditional XGBoost.
Support vector machines (SVMs): SVMs are a class of supervised ML methods commonly employed for classification and robust in handling high-dimensional datasets34. An SVM finds an optimal hyperplane that separates data points of different classes with the maximum margin. Given a dataset $\{(x_i, y_i)\}_{i=1}^{n}$ with $x_i \in \mathbb{R}^p$ and class labels $y_i \in \{-1, +1\}$, the primal optimization problem can be formulated as

$$\min_{w, b, \xi} \; \frac{1}{2}\lVert w \rVert^2 + C \sum_{i=1}^{n} \xi_i \tag{9}$$

subject to $y_i(w^\top x_i + b) \ge 1 - \xi_i$ and $\xi_i \ge 0$, where $w$ is the normal vector to the hyperplane, $b$ is the bias, $\xi_i$ are slack variables allowing misclassifications, and $C$ is a regularization parameter controlling the trade-off between maximizing the margin and minimizing classification errors35. The SVM employs the kernel trick to map the input space into a higher-dimensional feature space for non-linearly separable data. The decision function for a new point $x$ is given by:

$$f(x) = \operatorname{sign}\left(\sum_{i=1}^{n} \alpha_i y_i K(x_i, x) + b\right) \tag{10}$$

where $\alpha_i$ are Lagrange multipliers obtained from the dual problem, and $K$ is a kernel function such as the linear, polynomial, or radial basis function (RBF) kernel36.
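Eq. (10) in code: the decision function with an RBF kernel. The support vectors, labels, and $\alpha_i$ below are hand-picked purely for illustration; in practice they come from solving the dual of problem (9):

```python
import numpy as np

def rbf_kernel(a, b, gamma=1.0):
    """K(a, b) = exp(-gamma * ||a - b||^2), the radial basis function kernel."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.exp(-gamma * np.sum((a - b) ** 2)))

def svm_decision(x, support_vectors, labels, alphas, b=0.0, gamma=1.0):
    """Eq. (10): sign(sum_i alpha_i * y_i * K(x_i, x) + b)."""
    score = sum(a_i * y_i * rbf_kernel(sv, x, gamma)
                for a_i, y_i, sv in zip(alphas, labels, support_vectors))
    return 1 if score + b >= 0 else -1
```

A query point is classified by whichever support vectors dominate the kernel-weighted vote.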
Model tuning and validation
In this study, the 1164 records were split into a training dataset (80%) and a test dataset (20%) for final model testing. For DeepSurv, 20% of the training dataset was held out for validation and tuning; this split was used to adjust hyperparameters and to monitor performance for early stopping. The model was further trained by adjusting options such as the number of layers, dropout rate, and learning rate, using Optuna37 for optimization. To increase model reliability, 10-fold cross-validation was conducted on the training dataset. For the CPH and RSF models, 5-fold cross-validation was employed on the training dataset to evaluate performance and tune parameters. To find the best RSF setting, the number of trees, the number of predictors considered at each split, and the minimum node size were tested, and the log-rank and log-rank-score splitting rules were compared. Similarly, the other ML models were tuned using grid search or random search with 5-fold cross-validation. For RSF, we used random_state = 42, enabled parallel computation, and tuned hyperparameters using random search over the following ranges: maximum tree depth (1–9), minimum samples per leaf (25–50), minimum samples per split (50–98, step size 2), number of trees (50, 100, 150, 500, 1,000), and the log-rank splitting rule. XGBoost/LightGBM models were trained with a fixed random seed and parallel computation enabled. Hyperparameters were tuned over wide ranges to avoid under-constraining the models, including: maximum tree depth (4–39), learning rate (0.001–0.99), number of estimators (100–190), minimum child weight (1–10), subsampling ratio (0.1–1.0), column subsampling ratio (0.1–1.0), and L1/L2 regularization (reg_lambda: 0–1.0). For the DeepSurv model, a fully connected neural network with two hidden layers of 32 units each, ReLU activations, and a single linear output was used.
The Adam optimizer (learning rate of 0.01), batch size of 16, 50 epochs, and no dropout, regularization, or early stopping were used. Finally, all models were tested on a separate test dataset to check their reliability and how well they might work on new data.
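The random-search tuning described above can be sketched as repeated sampling of configurations from the stated RSF ranges (a pure-Python stand-in using the ranges quoted in this section, not the actual tuning code):

```python
import random

# Search space mirroring the RSF ranges described above.
SPACE = {
    "n_estimators":      [50, 100, 150, 500, 1000],
    "max_depth":         list(range(1, 10)),      # 1-9
    "min_samples_leaf":  list(range(25, 51)),     # 25-50
    "min_samples_split": list(range(50, 99, 2)),  # 50-98, step size 2
}

def sample_config(rng):
    """Draw one candidate hyperparameter configuration at random."""
    return {name: rng.choice(values) for name, values in SPACE.items()}

rng = random.Random(42)                            # fixed seed for reproducibility
trials = [sample_config(rng) for _ in range(20)]   # 20 random-search trials
```

Each sampled configuration would then be scored with 5-fold cross-validation, keeping the best-performing one.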
Model performance evaluation metrics
In this study, the Integrated Brier Score (IBS) and C-index were used to measure the prediction accuracy of all methods. For patient $i$, let $T_i$ denote the event time, $C_i$ the censoring time, and $\eta_i$ the predicted risk score from a model; the observed time is $Y_i = \min(T_i, C_i)$, and $\delta_i$ indicates whether the event was observed ($\delta_i = 1$) or right-censored ($\delta_i = 0$)38,39. For a randomly selected pair of patients $(i, j)$, the concordance probability is $P(\eta_i > \eta_j \mid Y_i < Y_j)$, and the concordance index is given by

$$C = \frac{\sum_{i \ne j} \mathbb{1}(\eta_i > \eta_j)\, \mathbb{1}(Y_i < Y_j)\, \delta_i}{\sum_{i \ne j} \mathbb{1}(Y_i < Y_j)\, \delta_i} \tag{11}$$

where a C-index of 0.5 or lower indicates the model is predicting an outcome no better than random chance, while a higher C-index corresponds to a model with higher prediction accuracy38.
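A direct pairwise implementation of Eq. (11), treating a pair as comparable only when the earlier observed time ends in an event (illustrative; production code would use a faster routine):

```python
import numpy as np

def concordance_index(times, events, risk):
    """C-index: fraction of comparable pairs ranked correctly by risk score.

    times  : observed follow-up times
    events : 1 = death observed, 0 = censored
    risk   : predicted risk scores (higher = worse prognosis)
    Ties in risk count as half-concordant.
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)
    risk = np.asarray(risk, dtype=float)
    num = den = 0.0
    for i in range(len(times)):
        if events[i] != 1:
            continue                          # censored i: pairs not comparable
        for j in range(len(times)):
            if times[i] < times[j]:           # i failed before j's observed time
                den += 1
                if risk[i] > risk[j]:
                    num += 1.0                # concordant: earlier death, higher risk
                elif risk[i] == risk[j]:
                    num += 0.5                # tied risk scores
    return num / den
```

A perfectly anti-ranked model scores 0, random scoring hovers around 0.5, and a perfect ranking scores 1.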
In addition to the C-index, the IBS is used to assess the prediction performance of the different models. For binary prediction algorithms, the Brier score is the mean squared prediction error (MSPE). For time-to-event data, the Brier score at a given time point $t$ is the mean squared difference between the observed event status and the model-based prediction of survival at $t$; the smaller the IBS value, the better the prediction. The IBS is the cumulative Brier score over time and is given by:

$$IBS = \frac{1}{t_{\max}} \int_{0}^{t_{\max}} \frac{1}{N} \sum_{i=1}^{N} \left( \mathbb{1}(T_i > t) - \hat{S}(t \mid Z_i) \right)^2 dt \tag{12}$$

where $\mathbb{1}(T_i > t)$ is the individual survival status at time $t$, and $\hat{S}(t \mid Z_i)$ is the predicted survival probability from the model with covariates $Z_i$. The IBS is interpreted as a prediction error rate. An IBS value of 0.5 or greater shows the model’s predictive performance is no better than chance, whereas a lower IBS indicates higher predictive power40.
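A simplified sketch of Eq. (12): the Brier score at each grid time, averaged over the grid. For clarity this version omits the inverse-probability-of-censoring weights used in practice:

```python
import numpy as np

def brier_score(t, times, surv_prob):
    """BS(t): mean squared gap between survival status 1(T_i > t) and the
    model's predicted survival probability at t (censoring weights omitted)."""
    alive = (np.asarray(times, dtype=float) > t).astype(float)
    return float(np.mean((alive - np.asarray(surv_prob, dtype=float)) ** 2))

def integrated_brier_score(grid, times, surv_fn):
    """Eq. (12), simplified: average BS(t) over the evaluation time grid.

    surv_fn(t) must return the predicted survival probabilities at time t."""
    return float(np.mean([brier_score(t, times, surv_fn(t)) for t in grid]))
```

A model that predicts each patient's survival status perfectly scores 0; lower is better.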
Shapley additive explanations (SHAP): SHAP values provide a consistent approach to quantify how much each feature contributes to a model’s prediction. Drawing on Shapley values from cooperative game theory, SHAP distributes the model’s output among individual features in a fair and interpretable way41. The contribution of feature $j$ is defined as

$$\phi_j(x) = \frac{1}{|S|} \sum_{x' \in S} \left( f(x) - f\left(x^{(j \leftarrow x')}\right) \right) \tag{13}$$

where $\phi_j(x)$ is a SHAP value measuring the contribution of feature $j$, $f(x)$ is the model prediction for the sample $x$, $x^{(j \leftarrow x')}$ is a modified version of $x$ in which feature $j$ is replaced with the value from another sample $x'$, and $S$ is the set of samples used for replacing feature $j$. In practice, for each sample $x$, feature $j$ is substituted with values from other samples to create the modified inputs. The model predictions on these modified samples are compared with the original prediction, and the differences are averaged. This average difference gives $\phi_j(x)$, which quantifies the importance of feature $j$42.
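The replacement procedure behind Eq. (13) is easy to sketch: swap feature $j$'s value with values from background samples and average the change in prediction (a single-feature approximation, not a full Shapley computation; `model` and the data are hypothetical):

```python
import numpy as np

def feature_contribution(model, x, j, background):
    """Approximate Eq. (13): mean change in model output when feature j of
    sample x is replaced by the j-th values of the background samples."""
    x = np.asarray(x, dtype=float)
    diffs = []
    for x_prime in np.asarray(background, dtype=float):
        x_mod = x.copy()
        x_mod[j] = x_prime[j]                 # replace only feature j
        diffs.append(model(x) - model(x_mod))
    return float(np.mean(diffs))
```

For an additive model this matches the exact Shapley value relative to the background mean; for models with interactions it is only an approximation, which is why libraries use tree-specific or kernel-based estimators.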
Results
Table 2 shows the baseline characteristics of breast cancer patients at Tikur Anbesa Specialized Hospital and Hiwot Fana Specialized University Hospital. From Table 2, the chi-square test of association revealed that comorbidity, habits, age, tumour size, stage, and metastasis are significantly associated with patient survival status. The table shows that 53.4% of the deceased group had metastasis (M1) at diagnosis, compared to 17.2% of the censored category. Comorbidity was more frequent among deceased patients (24.1%) than among censored patients (12.3%). Most of the deceased patients were 45 or older (67.7%). Likewise, the largest tumour size (T4) was more common among deceased patients (54.9%) than among the censored category (17.7%). Stage IV was the most frequent stage among deceased patients (51.9%). Additionally, N3 nodal status was more common in deceased patients (54.9%).
Table 2.
Baseline characteristics of breast cancer Patients.
| Factors | Overall(N = 1164) | Censored (N = 1031) (%) | Dead (N = 133) (%) | p-value |
|---|---|---|---|---|
| Marital Status | ||||
| Single | 21 (1.8) | 17 (1.6) | 4 (3.0) | 0.627 |
| Married | 1026 (88.1) | 908 (88.1) | 118 (88.7) | |
| Divorced | 99 (8.5) | 90 (8.7) | 9 (6.8) | |
| Widowed | 18 (1.5) | 16 (1.6) | 2 (1.5) | |
| Previous Surgery | ||||
| No | 960 (82.5) | 851 (82.5) | 109 (82.0) | 0.963 |
| Yes | 204 (17.5) | 180 (17.5) | 24 (18.0) | |
| Residence | ||||
| Rural | 483 (41.5) | 425 (41.2) | 58 (43.6) | 0.666 |
| Urban | 681 (58.5) | 606 (58.8) | 75 (56.4) | |
| Metastasis | ||||
| M0 | 916 (78.7) | 854 (82.8) | 62 (46.6) | < 0.001 |
| M1 | 248 (21.3) | 177 (17.2) | 71 (53.4) | |
| Breast feeding | ||||
| No | 202 (17.4) | 178 (17.3) | 24 (18.0) | 0.919 |
| Yes | 962 (82.6) | 853 (82.7) | 109 (82.0) | |
| Comorbidity | ||||
| No | 1005 (86.3) | 904 (87.7) | 101 (75.9) | < 0.001 |
| Yes | 159 (13.7) | 127 (12.3) | 32 (24.1) | |
| Habits | ||||
| None | 795 (68.3) | 698 (67.7) | 97 (72.9) | 0.037 |
| Alcohol | 23 (2.0) | 20 (1.9) | 3 (2.3) | |
| Khat | 191 (16.4) | 166 (16.1) | 25 (18.8) | |
| Smoking | 155 (13.3) | 147 (14.3) | 8 (6.0) | |
| ECOG | ||||
| I | 1106 (95.0) | 984 (95.4) | 122 (91.7) | 0.101 |
| II | 58 (5.0) | 47 (4.6) | 11 (8.3) | |
| Age category | ||||
| 15-24.9 | 26 (2.2) | 23 (2.2) | 3 (2.3) | < 0.001 |
| 25-34.9 | 233 (20.0) | 220 (21.3) | 13 (9.8) | |
| 35-44.9 | 375 (32.2) | 348 (33.8) | 27 (20.3) | |
| 45 and above | 530 (45.5) | 440 (42.7) | 90 (67.7) | |
| Tumour size | ||||
| T0 (non-invasive) | 38 (3.3) | 38 (3.7) | 0 (0.0) | < 0.001 |
| T1 (≤ 2 cm) | 100 (8.6) | 96 (9.3) | 4 (3.0) | |
| T2 (3–5 cm) | 380 (32.6) | 357 (34.6) | 23 (17.3) | |
| T3 (≥ 5 cm) | 390 (33.5) | 357 (34.6) | 33 (24.8) | |
| T4 (any size) | 256 (22.0) | 183 (17.7) | 73 (54.9) | |
| Stage | ||||
| Stage I | 137 (11.8) | 133 (12.9) | 4 (3.0) | < 0.001 |
| Stage II | 378 (32.5) | 356 (34.5) | 22 (16.5) | |
| Stage III | 403 (34.6) | 365 (35.4) | 38 (28.6) | |
| Stage IV | 246 (21.1) | 177 (17.2) | 69 (51.9) | |
| Node | ||||
| N0 | 38 (3.3) | 38 (3.7) | 0 (0.0) | < 0.001 |
| N1 (1–3 nodes) | 104 (8.9) | 100 (9.7) | 4 (3.0) | |
| N2 (4–9 nodes) | 763 (65.5) | 707 (68.6) | 56 (42.1) | |
| N3 (≥ 10 nodes) | 259 (22.3) | 186 (18.0) | 73 (54.9) | |
| Number of children | ||||
| 0–2 | 603 (51.8) | 532 (51.6) | 71 (53.4) | 0.92 |
| 3–5 | 438 (37.6) | 390 (37.8) | 48 (36.1) | |
| ≥ 6 | 123 (10.6) | 109 (10.6) | 14 (10.5) | |
Figure 1 (top) shows the overall survival curve of women with breast cancer. The graph shows that about 50% of the breast cancer patients survive up to 45 months, and about 40–50% remain alive at 5 years. Figure 1 (bottom) shows KM survival curves stratified by cancer stage. The graph shows that the survival probability of breast cancer patients decreases with advancing cancer stage. Patients with stage 1 show better survival as compared to patients with stage 4, which shows the poorest survival. Differences in survival across stages were statistically significant (log-rank test, p < 0.0001).
Fig. 1.
KM overall survival curve of women with breast cancer (top). KM survival curves stratified by cancer stage (bottom).
Figure 2 reveals the KM survival curves stratified by lymph node and comorbidity (top) and ECOG and tumour size (bottom). The graph shows that patients without lymph node involvement (N0) have the best survival, while N2 and N3 show progressively poorer survival, with N3 the poorest (p < 0.0001). Similarly, patients with comorbidities live shorter lives than patients without comorbidities (p < 0.0001). ECOG status is also among the factors that affect the survival of breast cancer patients (p = 0.0032): patients with ECOG II have shorter survival than patients with ECOG I. Tumour size is another strong predictor of breast cancer survival. From Fig. 2 (bottom), patients with non-invasive tumours (T0) show the best survival of all patient categories. In addition, patients with small tumours (T1 and T2) show relatively better outcomes than those with larger or invasive tumours (T3 and T4), which are associated with poor survival (p < 0.0001).
Fig. 2.
KM survival curves stratified by lymph node and ECOG (top) and comorbidity and tumour size (bottom).
Figure 3 presents the KM survival curves stratified by number of children and age category (top) and habits and metastasis (bottom). The number of children does not significantly affect survival (p = 0.98). In contrast, lifestyle habits are a statistically significant predictor (p = 0.037): patients with no habits have better survival, patients who smoke, drink alcohol, or chew khat show poorer outcomes, and smoking in particular appears to be associated with the worst survival. Likewise, age is a strong predictor of outcome, with the youngest patients (15–24 years) showing the best survival; survival decreases progressively with increasing age, and patients aged 45 and above show the worst outcomes (p < 0.0001). Another important predictor is metastasis status: patients with metastasis (M1) show decreasing survival over time compared to patients without metastasis (p < 0.0001).
Fig. 3.
KM survival curves stratified by number of children and age category (top) and habits and metastasis (bottom).
Figure 4 shows the KM survival curves stratified by marital status and residence (top) and breastfeeding and previous surgery (bottom). Marital status is a significant predictor of breast cancer outcome (p = 0.034): married patients tend to have better survival than patients in the other categories.
Fig. 4.
KM survival curves stratified by marital status and residence (top) and breastfeeding and previous surgery (bottom).
Figure 5 presents a comparison of the prediction error curves of the KM, RSF, and CPH models. The graph plots prediction error against time, where a lower curve indicates better accuracy. The RSF model shows superior predictive performance over the CPH and KM baselines. As expected, the KM curve, a non-parametric baseline that does not use individual predictor variables, shows the highest error. The CPH model, the classical regression method for survival analysis, performs better than this baseline but is consistently outperformed by the RSF model across the timeline. This indicates that RSF is the most accurate model for this dataset and suggests that the ML-based RSF approach better captures the underlying complex relationships in the data than the standard CPH model.
Fig. 5.
Comparison of Prediction Error Curves.
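The prediction error plotted against time in such curves is the time-dependent Brier score. A simplified sketch (ignoring the inverse-probability-of-censoring weights used in the full curves; all inputs below are hypothetical) is:

```python
import numpy as np

def brier_at(t, time, event, surv_prob):
    """Simplified time-dependent Brier score at horizon t (no IPCW weights).
    surv_prob[i] = predicted probability that patient i survives past t."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=int)
    surv_prob = np.asarray(surv_prob, dtype=float)
    # status at t is known if the patient is still followed past t, or has died
    known = (time > t) | (event == 1)
    alive = (time[known] > t).astype(float)   # 1 if alive at t, else 0
    return float(np.mean((alive - surv_prob[known]) ** 2))

time  = [10, 20, 35, 50, 70]
event = [1, 1, 0, 1, 0]
# hypothetical predicted survival probabilities at t = 30 months
s_hat = [0.2, 0.3, 0.8, 0.7, 0.9]
bs30 = brier_at(30, time, event, s_hat)
```

Averaging the Brier score over a grid of time points spanning the follow-up window yields the Integrated Brier Score used later to compare the survival models.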
Next, we compare the survival and ML models using different evaluation metrics. The Integrated Brier Score (IBS) is reported in Table 3 only for the survival models (Cox PH, RSF, DeepSurv) because it is a time-dependent measure designed to assess the accuracy of predicted survival probabilities across the entire observed period. Table 3 compares the performance of the machine learning and survival models in terms of mean accuracy, AUC/C-index, and IBS.

Among the ML models, LightGBM (0.857 ± 0.014) achieves higher classification accuracy than XGBoost (0.844 ± 0.014), SVM (0.841 ± 0.000), and RF (0.794 ± 0.012). However, accuracy alone is not a sufficient criterion for survival prediction or for an imbalanced dataset, so the AUC/C-index provides a more informative basis for judgment. RF attains the highest AUC (0.729 ± 0.006), outperforming XGBoost (0.699 ± 0.013), LightGBM (0.697 ± 0.006), and SVM (0.626 ± 0.000). Thus, despite its lower accuracy, RF is more reliable at predicting patient outcomes.

For the survival models, the C-index shows that RSF (0.754) outperforms CPH (0.736) and DeepSurv (0.689). The IBS values, where lower scores indicate better calibration, confirm this trend: RSF has the best IBS (0.091), followed by DeepSurv (0.105) and CPH (0.108). The results indicate a trade-off: ML models such as LightGBM and XGBoost achieve high classification accuracy, while random forest excels in discrimination. Among the survival models, RSF is the most effective, offering the best balance of discrimination (highest C-index) and calibration (lowest IBS). This suggests that while ML classifiers can perform well, survival-specific methods, particularly RSF, are better tailored to time-to-event prediction tasks.
Table 3.
Comparison of performance of machine learning and survival models.
| Category | Model | Mean accuracy (± SD) | Mean AUC/C-index (± SD) | IBS |
|---|---|---|---|---|
| Machine learning | SVM | 0.841 ± 0.000 | 0.626 ± 0.000 | |
| | XGBoost | 0.844 ± 0.014 | 0.699 ± 0.013 | |
| | LGBM | 0.857 ± 0.014 | 0.697 ± 0.006 | |
| | RF | 0.794 ± 0.012 | 0.729 ± 0.006 | |
| Survival models | CPH | | 0.736 | 0.108 |
| | RSF | | 0.754 | 0.091 |
| | DeepSurv | | 0.689 | 0.105 |
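The C-index reported for the survival models follows Harrell's pairwise definition, which a short sketch can make concrete (toy inputs; tied risk scores count as 0.5):

```python
import numpy as np

def harrell_c(time, event, risk):
    """Harrell's C-index: among comparable pairs (where the earlier observed
    time is an event), count how often the patient with the higher predicted
    risk is the one who dies first."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=int)
    risk = np.asarray(risk, dtype=float)
    conc = pairs = 0.0
    n = len(time)
    for i in range(n):
        if event[i] != 1:
            continue  # a pair is comparable only if the earlier time is a death
        for j in range(n):
            if time[j] > time[i]:
                pairs += 1
                if risk[i] > risk[j]:
                    conc += 1.0
                elif risk[i] == risk[j]:
                    conc += 0.5
    return conc / pairs

time  = [5, 10, 15, 20]
event = [1, 1, 0, 1]
risk  = [0.9, 0.3, 0.2, 0.4]   # hypothetical model risk scores
c = harrell_c(time, event, risk)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so RSF's C-index of 0.754 means roughly three out of four comparable patient pairs are ordered correctly.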
Figure 6 shows the ROC curves for the machine learning models. The ROC curve compares the models' ability to balance the true positive rate against the false positive rate across classification thresholds, and the AUC (area under the curve) summarizes each model's discriminative power, with higher values indicating better discrimination. From Fig. 6, RF obtained the highest AUC (0.73), making it the strongest model at distinguishing between classes. XGBoost and LightGBM followed closely, each with an AUC of 0.70, both showing good predictive capability. The SVM performed less well, with an AUC of 0.67, though still better than random guessing. Overall, all four models demonstrated reasonable predictive ability, but RF emerged as the most reliable in this comparison, offering the best balance of sensitivity and specificity.
Fig. 6.
ROC curves for the machine learning models.
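The AUC summarizing each ROC curve equals the Mann–Whitney probability that a randomly chosen positive case is scored above a randomly chosen negative case, which can be verified directly (labels and scores below are hypothetical):

```python
import numpy as np

def auc_mann_whitney(y_true, scores):
    """AUC = P(score of a random positive > score of a random negative);
    tied scores contribute 0.5."""
    y = np.asarray(y_true, dtype=int)
    s = np.asarray(scores, dtype=float)
    pos, neg = s[y == 1], s[y == 0]
    gt = (pos[:, None] > neg[None, :]).sum()   # strictly higher pairs
    eq = (pos[:, None] == neg[None, :]).sum()  # tied pairs
    return (gt + 0.5 * eq) / (len(pos) * len(neg))

y      = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]   # hypothetical predicted probabilities
auc = auc_mann_whitney(y, scores)
```

Under this reading, RF's AUC of 0.73 means the model ranks a patient who died above a patient who survived in about 73% of such pairs.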
Figure 7 shows the random survival forest beeswarm plot (a) and feature importance (b), determined by mean absolute SHAP values. The SHAP values were computed on the test set using the shap.Explainer algorithm. The plots show the SHAP value distribution for each feature and its impact on model predictions, in decreasing order of importance. The SHAP analysis of the RSF shows that age category, tumour size, nodal status, and metastasis status are the most important predictors of breast cancer survival. The beeswarm plot (a) indicates that older age groups, larger tumours, and the presence of metastasis consistently push the model toward higher predicted risk, while younger age, smaller tumours, and the absence of metastasis are associated with lower risk. Figure 7 also shows that stage and comorbidities contribute substantially to the predicted risk: higher stages and the presence of comorbidities are associated with worse survival. The bar plot (b) ranks the features by importance, showing that age, tumour size, nodal status, and metastasis dominate the model's predictions, followed by stage and comorbidities; the remaining variables contribute relatively little predictive power.
Fig. 7.
Random survival forest model beeswarm (a) and feature importance (b).
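The bar-plot importances described above are mean absolute SHAP values aggregated per feature. A minimal sketch of that aggregation (the feature names and SHAP matrix below are illustrative, not the study's actual output):

```python
import numpy as np

# Hypothetical SHAP value matrix (n_samples x n_features), of the kind
# returned by shap.Explainer on a test set; values are illustrative only.
features = ["age_cat", "tumour_size", "metastasis", "stage", "comorbidity"]
shap_values = np.array([
    [ 0.30, -0.10,  0.25,  0.05,  0.02],
    [-0.25,  0.40,  0.00,  0.10, -0.01],
    [ 0.10, -0.30,  0.35, -0.05,  0.03],
])

# Bar-plot importance = mean absolute SHAP value per feature
importance = np.abs(shap_values).mean(axis=0)
# Rank features from most to least important
ranking = [features[i] for i in np.argsort(importance)[::-1]]
```

The sign of each entry drives the beeswarm plot (positive values push the prediction toward higher risk), while the unsigned average drives the bar plot's ranking.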
Figure 8 presents the DeepSurv beeswarm plot (a) and feature importance (b), which demonstrate how different clinical and demographic features influence predicted survival outcomes. Each dot represents a patient; its position on the x-axis shows whether the feature increases or decreases the predicted survival. Figure 8 (a) shows that features with a wide spread, such as age category, metastasis, tumour size, stage, and ECOG score, are the most important predictors. For instance, older age shifts predictions toward poorer survival outcomes, while younger age is associated with better survival. Likewise, the presence of metastasis, larger tumour size, advanced stage, and a poor ECOG performance score strongly decrease predicted survival. In general, the strongest negative predictors of survival are metastasis, large tumour size, advanced stage, poor ECOG score, and comorbidities, whereas positive predictors include younger age, absence of metastasis, smaller tumour size, early disease stage, and history of surgery. Figure 8 (b) shows that the model relies most on metastasis, age, and habits.
Fig. 8.
DeepSurv model beeswarm (a) and feature importance (b).
Figure 9 shows the random forest beeswarm plot (a) and feature importance (b), both determined by mean absolute SHAP values. The SHAP plots demonstrate the relative importance and contribution of the features to the predictive model. The beeswarm plot (a) shows both the direction and the magnitude of each clinical feature's effect on the model output. Risk factors such as the higher age category (AgeCat_4), nodal involvement (Node_3), metastasis status, stage (Stage_4), and larger tumour size (Tsize_4) exert a strong influence on predictions, with higher feature values associated with worse survival outcomes. Similarly, the bar plot (b) quantifies the mean absolute SHAP values, showing that AgeCat_4, metastasis status (M1), nodal involvement, and stage are the most important predictors. Notably, while the top-ranked features are individually important, the model also leverages a wide range of additional predictors to improve accuracy. Together, these results highlight the multifactorial nature of survival outcomes, with age, metastasis, nodal status, and stage as the dominant predictors.
Fig. 9.
Random forest model beeswarm (a) and feature importance (b).
Discussion and conclusions
The objective of this investigation was to enhance understanding and comparison of various machine learning and survival models for predicting the significance of clinical and demographic covariates of breast cancer survival among Ethiopian women. To capture the non-linear relationships among clinical and demographic factors, we used SVM, RF, XGBoost, and LGBM, among which RF performed best at highlighting the important predictors in the breast cancer data. The ML models do not account for survival time or censoring, but evaluation metrics such as accuracy and AUC were used to assess their value in identifying predictive covariates. Conversely, the survival models handle censored data and predict time-to-event outcomes. Among the models tested, RSF and RF achieved the highest predictive performance. SHAP analysis was used to enhance model interpretability; the results revealed age, tumour size, metastasis, stage, and comorbidities as the most influential predictors of survival, alongside social factors such as marital status.
Consistent with previous research, predictors such as tumour size, stage, lymph node involvement, and metastasis were identified as the strongest predictors of breast cancer survival11,43, reflecting their recognized role in breast cancer progression and prognosis. Notably, age and comorbidities contribute to disparities in patient survival, consistent with worldwide evidence that older age and coexisting illnesses negatively affect the survival of breast cancer patients1,2. Also in line with previous studies, marital status was identified as a significant predictor44; it represents a form of social support that positively influences treatment adherence and outcomes.
Furthermore, the results indicate the advantages of survival machine learning models over classical methods. Although the CPH model is the foundation of survival analysis because of its interpretability, its difficulty in handling nonlinearities and time-dependent effects was evident. Importantly, our results indicated the superior predictive performance of RSF, consistent with previous studies that highlight its robustness in heterogeneous settings13,45. While DeepSurv is theoretically strong, it underperformed compared to RSF, possibly because of the limited sample size and the high variability inherent in clinical datasets from low-resource settings, which likely disadvantaged the deep learning method; the performance of DeepSurv critically relies on large sample sizes, and our finding is consistent with46. Similarly, the ML models performed strongly in identifying the non-linear relationships between survival outcome and predictors, with random forest performing best. These results highlight the respective strengths of ML and survival-specific approaches. Even though the interpretability of ML remains challenging, the integration of SHAP values in this study provided meaningful explanations of model predictions, ensuring transparency and enabling clinicians to identify key risk factors at the individual level22. This is particularly valuable in settings like Ethiopia, where precision medicine is still emerging and resource allocation must be optimized.
In conclusion, the objective of this investigation was to compare classical and survival ML methods for predicting survival outcomes, to identify approaches that balance predictive accuracy with interpretability for Ethiopian breast cancer patients. Data from 1,164 women treated at Tikur Anbesa Specialized Hospital and Hiwot Fana Specialized University Hospital between 2019 and 2024 were used. Machine learning approaches (SVM, RF, XGBoost, and LGBM) and survival-specific models (CPH, RSF, and DeepSurv) were evaluated with metrics such as AUC, C-index, and IBS. The results demonstrated that RSF and RF provide the most accurate and clinically relevant predictions of breast cancer survival in Ethiopian patients compared with Cox Proportional Hazards (CPH), DeepSurv, and the classical machine learning models. Both models not only obtained higher predictive performance but also offered improved interpretability through SHAP analysis. The findings from RSF identified age, tumour size, metastasis, stage, marital status, and comorbidities as the dominant predictive variables. Moreover, the SHAP analysis for the RF model revealed that the higher age group (45 and above), metastatic status (M1), stage four, and larger tumour size all have a significant impact on predictions. From a clinical perspective, the results emphasize that integrating survival-specific ML approaches like RSF into patient care can strengthen risk stratification, enable more personalized treatment planning, and support clinicians in making data-informed decisions. In resource-limited settings such as Ethiopia, where late-stage presentation and limited access to advanced therapies remain pressing challenges, the adoption of interpretable ML models could enhance the allocation of scarce resources and improve patient counselling.
Limitations of the study
This study has a few limitations that future research should address. Key predictors of breast cancer progression, such as molecular, genetic, and lifestyle factors, were not included in the dataset, which may limit a comprehensive understanding of the disease. To improve accuracy and ensure validity across hospitals and patient populations in Ethiopia, future research should include these factors alongside imaging and genomic data. Larger datasets will also be needed to support the use of deep learning methods for a more thorough understanding of breast cancer. In addition, the retrospective design may introduce several types of bias, including selection bias and inconsistencies in data reporting, which could compromise the study's validity. Addressing these issues is essential to enhance the clinical usefulness and generalizability of data-driven models for breast cancer prognosis.
Supplementary Information
Below is the link to the electronic supplementary material.
Acknowledgements
This work is partially based upon research supported by the South Africa National Research Foundation (NRF) and South Africa Medical Research Council (SAMRC) (South Africa DST-NRF-SAMRC SARChI Research Chair in Biostatistics, Grant number 114613). Opinions expressed and conclusions arrived at are those of the author and are not necessarily to be attributed to the NRF and SAMRC.
Author contributions
KT and DGC were responsible for the study concept and methods. KT and DGC collaborated on the data analysis and manuscript preparation. Both authors critically reviewed and approved the final version.
Data availability
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.
Code Availability
All data preprocessing, modelling, and visualization code for all models is publicly available at https://github.com/Kasahun100/DeedSurv_Kasahun and provided as a supplementary document.
Declarations
Competing interests
The authors declare no competing interests.
Ethics approval and consent to participate
This study was conducted in accordance with the ethical principles outlined in the Declaration of Helsinki. Informed consent was waived because patient data were anonymized by the College of Health and Medical Science at Haramaya University, Hiwot Fana Comprehensive Specialized Hospital, and Tikur Anbessa Specialized Hospital, as approved by the Institutional Health Research Ethics Review Committee (IHRERC) (Ref: D/R/G/P/05/37/2023).
Footnotes
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
- 1.Bray, F. et al. Global cancer statistics 2022: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. Cancer J. Clin.74(3), 229–263 (2024). [DOI] [PubMed] [Google Scholar]
- 2.Adeoye, P. A. Epidemiology of breast cancer in sub-Saharan Africa. In Breast Cancer Updates (IntechOpen, 2023).
- 3.Sayed, S. et al. Is breast cancer from sub saharan Africa truly receptor poor? Prevalence of ER/PR/HER2 in breast cancer from Kenya. Breast23(5), 591–596 (2014). [DOI] [PubMed] [Google Scholar]
- 4.Olayide, A. et al. Demographic pattern, tumor size and stage of breast cancer in africa: a meta-analysis. Asian Pac. J. Cancer Care. 6(4), 477–492 (2021). [Google Scholar]
- 5.Dandena, F. G., Teklewold, B. T., Darebo, T. D. & Suga, Y. D. Epidemiology and clinical characteristics of breast cancer in ethiopia: a systematic review. BMC Cancer. 24(1), 1102 (2024). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6.Tesema, G. A. et al. Breast cancer in ethiopia: clinical profile and management patterns at Tikur Anbessa specialized hospital. Ethiop. J. Health Sci.30(4), 623–632 (2020).33897223 [Google Scholar]
- 7.Assefa, M. et al. Breast cancer in ethiopia: evidence for action. Ethiop. Med. J.59(1), 3–12 (2021). [Google Scholar]
- 8.Jemal, A. et al. Breast cancer presentation, stage, and survival in ethiopia: results from a population-based registry. J. Global Oncol.4, 1–9 (2018). [Google Scholar]
- 9.Shama, A. T. et al. Breast cancer and its determinants in ethiopia: a systematic review and meta-analysis. BMJ open.14(11), e080080 (2024). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Rakha, E. A. et al. Breast cancer prognostic classification in the molecular era: the role of histological grade. Breast Cancer Res.12(4), 207 (2010). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11.Kantelhardt, E. J. et al. Breast cancer survival in ethiopia: a cohort study of 1,070 women. Int. J. Cancer. 135(3), 702–709 (2015). [DOI] [PubMed] [Google Scholar]
- 12.Sung, H. et al. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. Cancer J. Clin.71(3), 209–249 (2021). [DOI] [PubMed] [Google Scholar]
- 13.Ishwaran, H., Kogalur, U. B., Blackstone, E. H. & Lauer, M. S. Random survival forests. Annals Appl. Stat.2(3), 841–860 (2008). [Google Scholar]
- 14.Kourou, K., Exarchos, T. P., Exarchos, K. P., Karamouzis, M. V. & Fotiadis, D. I. Machine learning applications in cancer prognosis and prediction. Comput. Struct. Biotechnol. J.13, 8–17 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15.Zhu, W. et al. Machine learning for survival analysis in cancer prognosis. Front. Oncol.9, 1073 (2020). [Google Scholar]
- 16.Cox, D. R. Regression models and life-tables. J. Roy. Stat. Soc. B. 34(2), 187–202 (1972). [Google Scholar]
- 17.Jedy-Agba, E., McCormack, V., Adebamowo, C. & Dos-Santos-Silva, I. Stage at diagnosis of breast cancer in sub-Saharan africa: A systematic review and meta-analysis. Lancet Global Health. 4(12), e923–e935 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 18.Wang, H. & Zhou, J. Random survival forest with space extensions for censored data. Artif. Intell. Med.79, 52–61 (2017). [DOI] [PubMed] [Google Scholar]
- 19.Boulesteix, A. L., Porzelius, C. & Daumer, M. Microarray-based classification and clinical predictors: on combined classifiers and additional predictive value. Bioinformatics24(15), 1698–1706 (2012). [DOI] [PubMed] [Google Scholar]
- 20.Katzman, J. L. et al. DeepSurv: personalized treatment recommender system using a Cox proportional hazards deep neural network. BMC Med. Res. Methodol.18, 24 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 21.Lee, C., Zame, W., Yoon, J. & Van Der Schaar, M. Deephit: A deep learning approach to survival analysis with competing risks. In Proceedings of the AAAI conference on artificial intelligence (Vol. 32, No. 1). (2018).
- 22.Luck, M., Sylvain, T., Cardinal, H., Lodi, A. & Bengio, Y. Deep learning for patient-specific kidney graft survival analysis. arXiv preprint arXiv:1705.10245 (2017).
- 23.Ching, T., Zhu, X. & Garmire, L. X. Cox-nnet: an artificial neural network method for prognosis prediction of high-throughput omics data. PLoS Comput. Biol.14(4), e1006076 (2018). [DOI] [PMC free article] [PubMed]
- 24.Rajkomar, A., Dean, J. & Kohane, I. Machine learning in medicine. N. Engl. J. Med.380(14), 1347–1358 (2019). [DOI] [PubMed] [Google Scholar]
- 25.Baidoo, T. G. & Rodrigo, H. Data-driven survival modeling for breast cancer prognostics: A comparative study with machine learning and traditional survival modeling methods. PloS One20(4), e0318167 (2025). [DOI] [PMC free article] [PubMed]
- 26.Islam, M. M. et al. Breast cancer prediction: a comparative study using machine learning techniques. SN Comput. Sci.1(5), 290 (2020). [Google Scholar]
- 27.Rabiei, R., Ayyoubzadeh, S. M., Sohrabei, S., Esmaeili, M. & Atashi, A. Prediction of breast cancer using machine learning approaches. J. Biomedical Phys. Eng.12(3), 297 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 28.Ishwaran, H. & Kogalur, U. B. Random survival forests for R. R News. 7(2), 25–31 (2007). [Google Scholar]
- 29.Breiman, L. Random forests. Mach. Learn.45(1), 5–32 (2001). [Google Scholar]
- 30.Liaw, A. & Wiener, M. Classification and regression by randomforest. R News. 2(3), 18–22 (2002). [Google Scholar]
- 31.Chen, T. & Guestrin, C. Xgboost: A scalable tree boosting system. In Proceedings of the 22nd acm sigkdd international conference on knowledge discovery and data mining 785–794 (2016).
- 32.Ke, G. et al. Lightgbm:A highly efficient gradient boosting decision tree. Adv. Neural Inf. Process. Syst. 30 (2017).
- 33.Aziz, R. M., Baluch, M. F., Patel, S. & Ganie, A. H. LGBM: a machine learning approach for Ethereum fraud detection. Int. J. Inform. Technol.14(7), 3321–3331 (2022). [Google Scholar]
- 34.Kecman, V., Huang, T. M. & Vogt, M. Support vector machines for classification and regression. In Handbook of Computational Intelligence 211–238 (Springer, 2020).
- 35.Hastie, T., Tibshirani, R. & Friedman, J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction 2nd edn (Springer, 2009).
- 36.Huang, S., Liu, Y. & Zhang, Y. Recent advances in support vector machines for big data analytics. IEEE Access.10, 44237–44252 (2022). [Google Scholar]
- 37.Akiba, T., Sano, S., Yanase, T., Ohta, T. & Koyama, M. Optuna: A next-generation hyperparameter optimization framework. In Proceedings of the 25th ACM SIGKDD international conference on knowledge discovery & data mining 2623–2631 (2019).
- 38.Harrell, F. E., Lee, K. L. & Mark, D. B. Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Stat. Med.15(4), 361–387 (1996). [DOI] [PubMed] [Google Scholar]
- 39.Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med.30(10), 1105–1117. 10.1002/sim.4154 (2011). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 40.Steyerberg, E. W. et al. Assessing the performance of prediction models: A framework for traditional and novel measures. Epidemiology21(1), 128–138 (2010). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 41.Lundberg, S. M. & Lee, S. I. A Unified Approach To Interpreting Model Predictions 30 (Advances in Neural Information Processing Systems (NeurIPS), 2017).
- 42.Lundberg, S. M. et al. From local explanations to global Understanding with explainable AI for trees. Nat. Mach. Intell.2(1), 56–67 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 43.Anderson, B. O., Ilbawi, A. M. & El Saghir, N. S. Breast cancer in low- and middle-income countries (LMICs): A shifting tide in global health. Breast J.26(1), 69–71 (2020). [DOI] [PubMed] [Google Scholar]
- 44.Pinquart, M. & Duberstein, P. R. Associations of social networks with cancer mortality: A meta-analysis. Crit. Rev. Oncol. Hematol.75(2), 122–137. 10.1016/j.critrevonc.2009.06.003 (2010). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 45.Hothorn, T., Bühlmann, P., Dudoit, S., Molinaro, A. & Van Der Laan, M. J. Survival ensembles. Biostatistics7(3), 355–373 (2014). [DOI] [PubMed] [Google Scholar]
- 46.Haider, H., Hoehn, B., Davis, S. & Greiner, R. Effective ways to build and evaluate individual survival distributions. J. Mach. Learn. Res.21(85), 1–63 (2020).34305477 [Google Scholar]