PLoS One. 2024 Aug 1;19(8):e0304768. doi: 10.1371/journal.pone.0304768

Table 1. Summary of the considered state-of-the-art works.

| Ref | Techniques Employed | Dataset(s) Employed | Findings | Weaknesses |
|---|---|---|---|---|
| [8] | KNN, SVM, DT, MLP | WBCD and WDBC | Accuracy: 98.12%, Precision: 99.2%, Recall: 97.85% | The authors report a limited set of performance parameters. More should be employed, including the Info Gain, Gain Ratio, and Chi-square tests. |
| [9] | MLP, SVM, DNN, RNN | Breast Cancer Relapse Dataset (BCRD) | Accuracy: 94.53% | The rough neural network yields the lowest accuracy of the methods compared. |
| [10] | ML-DSS, RO | SEER | Accuracy: 86.0%, F-measure: 69.8%, Sensitivity: 67.1%, Specificity: 88.4%, AUC: 0.822 | The model's precision needs to be improved by weighting the relative importance of the attributes through a hybrid of ML algorithms and RO models. |
| [11] | MLP, SVM, SMO | WBCD | Accuracy: 96.99%, Precision: 97%, Recall: 97%, AUC: 0.968 | More ML techniques should be considered to achieve better predictive outcomes. |
| [12] | DT, SVM, MLP, KNN, LR, RF | Coimbra Breast Cancer Dataset (CBCD) | Accuracy: 100%, Precision: 100%, Recall: 100%, F1-score: 100% | The authors should have employed more models and parameters. |
| [13] | MLP, CNN, SVM | EHRs | Precision: 81.47%, Recall: 77.82%, F1-score: 79.42%, AUC: 0.9489 | The heterogeneity problem in clinical narratives should be addressed in the study. |
| [14] | SVM, ANN, NB, LDA | WDBC | Accuracy: 98.82%, Sensitivity: 98.41%, Specificity: 99.07%, AUC: 0.9994 | SVM-LDA and NN-LDA outperform the other ML classifier models, but NN-LDA is not chosen because of its longer computation time. |
| [15] | LR, SVM, KNN, and PCA | UCI sourced | Accuracy: 92.78%, Precision: 96.55%, Sensitivity: 91.07%, Specificity: 95.14% | The study considers few variables for prediction. |
| [16] | LR, KNN, SVM, DT, RF, GBDT, MLP, XGBoost, and Ensemble Learning | NFSC | Accuracy: 91.62%, Recall: 90.28%, F1-score: 89.39% | The authors have only designed the framework of a system and adopted existing methods. |
| [17] | ANN, KNN, SVM, NBC, COX | TCGA | Accuracy: 98.82%, Sensitivity: 100%, Specificity: 100%, PPV: 100%, NPV: 99.08%, AUROC: 99.81% | This research could be extended with a two-level or multi-level model that captures the contextual effects of surgeon and hospital volume on breast cancer recurrence. |
| [18] | LR, NB, KNN, SVM, and PCA | WPBC | Accuracy: 80%, Precision: 80%, Recall: 62%, F1-score: 76%, AUROC: 0.81, AUPRC: 0.62 | SVM is effective on balanced datasets but not very effective on imbalanced ones. |
| [19] | J48 DT, NB, LR, SVM, KNN, MLP, PART, OneR, RF, and TF-IDF | KAUH sourced dataset | Accuracy: 92.25%, Sensitivity: 92.3%, Specificity: 88.7% | The unstructured and variable format of the clinical data stored in the hospital EHR increases the variability and complexity of their extraction. |
| [20] | LR, XGBoost, MLP, NB, RF, KNN, DT | WBCD | Accuracy: 98.3%, AUC: 99.3%, Precision: 96.6%, Recall: 97%, F1-score: 96.7% | The proposed work shows limited performance because of the small, imbalanced dataset, which leads to lower prediction performance than cancer classification on the other two datasets. |
| | | WDBC | Accuracy: 99.2%, AUC: 99.5%, Precision: 97.4%, Recall: 97.4%, F1-score: 97.4% | |
| | | WPBC | Accuracy: 78.6%, AUC: 78.9%, Precision: 77.7%, Recall: 77.2%, F1-score: 78% | |
| [21] | DT, LDA, LR, SVM, ET, PNN, DNN, and RNN | NIH sourced dataset | Accuracy: 98.7%, Precision: 96.7%, Recall: 76.4%, F1-score: 85.2% | The accuracy of the classification techniques for breast cancer prediction would be better confirmed by considering a feature selection technique. |
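
The Findings column reports standard binary-classification metrics (accuracy, precision, recall or sensitivity, specificity, F1-score). As a reading aid, the sketch below shows how these are conventionally derived from confusion-matrix counts. The function name `binary_metrics` and the example counts are hypothetical and are not taken from any of the cited works; individual studies may use averaged or otherwise different variants of these definitions.

```python
def binary_metrics(tp, fp, tn, fn):
    """Conventional binary-classification metrics from confusion-matrix counts.

    tp/fp/tn/fn are true/false positive/negative counts. Any ratio whose
    denominator is zero is reported as 0.0 (an assumption of this sketch).
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0      # also called PPV
    recall = tp / (tp + fn) if (tp + fn) else 0.0         # also called sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts, chosen only for illustration.
metrics = binary_metrics(tp=40, fp=5, tn=50, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because accuracy counts true negatives, it can stay high on an imbalanced dataset even when recall is low, which is why several rows report sensitivity and specificity alongside accuracy.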