2024 Aug 1;19(8):e0304768. doi: 10.1371/journal.pone.0304768

Table 1. Summary of the considered state-of-the-art works.

Ref	Techniques Employed	Dataset(s) Employed	Findings	Weaknesses
[8]	KNN, SVM, DT, MLP	WBCD and WDBC	Accuracy: 98.12%, Precision: 99.2%, Recall: 97.85%	The authors report a limited set of performance parameters. More should be employed, including the Info Gain, Gain Ratio, and Chi-square tests.
[9]	MLP, SVM, DNN, RNN	Breast Cancer Relapse Dataset (BCRD)	Accuracy: 94.53%	The rough neural network yields the lowest accuracy of the compared methods.
[10]	ML-DSS, RO	SEER	Accuracy: 86.0%, F-measure: 69.8%, Sensitivity: 67.1%, Specificity: 88.4%, AUC: 0.822	The model's precision needs to be improved by weighting the relative importance of the attributes through a hybrid of ML algorithms and RO models.
[11]	MLP, SVM, SMO	WBCD	Accuracy: 96.99%, Precision: 97%, Recall: 97%, AUC: 0.968	More ML techniques should be considered to achieve enhanced predictive outcomes.
[12]	DT, SVM, MLP, KNN, LR, RF	Coimbra Breast Cancer Dataset (CBCD)	Accuracy: 100%, Precision: 100%, Recall: 100%, F1-score: 100%	The authors should have employed more models and parameters.
[13]	MLP, CNN, SVM	EHRs	Precision: 81.47%, Recall: 77.82%, F1-score: 79.42%, AUC: 0.9489	The heterogeneity problem in clinical narratives should be addressed in the study.
[14]	SVM, ANN, NB, LDA	WDBC	Accuracy: 98.82%, Sensitivity: 98.41%, Specificity: 99.07%, AUC: 0.9994	SVM-LDA and NN-LDA outperform the other ML classifiers, but NN-LDA is not chosen because of its longer computation time.
[15]	LR, SVM, KNN, and PCA	UCI Sourced	Accuracy: 92.78%, Precision: 96.55%, Sensitivity: 91.07%, Specificity: 95.14%	This work considers few variables in prediction.
[16]	LR, KNN, SVM, DT, RF, GBDT, MLP, XGBoost and Ensemble Learning	NFSC	Accuracy: 91.62%, Recall: 90.28%, F1-score: 89.39%	The authors have only designed the framework of a system and adopted existing methods.
[17]	ANN, KNN, SVM, NBC, COX	TCGA	Accuracy: 98.82%, Sensitivity: 100%, Specificity: 100%, PPV: 100%, NPV: 99.08%, AUROC: 99.81%	This research could be extended with a two-level or multi-level model that captures the contextual effects of surgeon and hospital volume on breast cancer recurrence.
[18]	LR, NB, KNN, SVM and PCA	WPBC	Accuracy: 80%, Precision: 80%, Recall: 62%, F1-score: 76%, AUROC: 0.81, AUPRC: 0.62	SVM is not very effective on imbalanced datasets, whereas it is effective on balanced ones.
[19]	J48 DT, NB, LR, SVM, KNN, MLP, PART, OneR, RF and TF-IDF	KAUH sourced dataset	Accuracy: 92.25%, Sensitivity: 92.3%, Specificity: 88.7%	The unstructured and variable clinical format of the data stored in the hospital EHR increases the variability and complexity of its extraction.
[20]	LR, XGBoost, MLP, NB, RF, KNN, DT	WBCD	Accuracy: 98.3%, AUC: 99.3%, Precision: 96.6%, Recall: 97%, F1-score: 96.7%	The proposed work showed limited performance due to the imbalance and small size of one dataset, which leads to lower prediction performance than on the other two datasets.
		WDBC	Accuracy: 99.2%, AUC: 99.5%, Precision: 97.4%, Recall: 97.4%, F1-score: 97.4%
		WPBC	Accuracy: 78.6%, AUC: 78.9%, Precision: 77.7%, Recall: 77.2%, F1-score: 78%
[21]	DT, LDA, LR, SVM, ET, PNN, DNN, and RNN	NIH sourced dataset	Accuracy: 98.7%, Precision: 96.7%, Recall: 76.4%, F1-score: 85.2%	This work could further confirm the accuracy of the classification techniques for breast cancer prediction by incorporating a feature selection technique.
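
The Findings column of Table 1 mixes accuracy, precision (PPV), recall (sensitivity), specificity, NPV, and F1-score, so it may help to see how these quantities relate to one another. The following is a minimal sketch using the standard textbook definitions for a binary classifier. The `ConfusionCounts` container, the `summarize` helper, and the example counts are hypothetical and are not taken from any of the cited works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts (hypothetical container for illustration)."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def safe_div(num: float, den: float) -> float:
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def summarize(c: ConfusionCounts) -> dict[str, float]:
    """Compute the standard metrics from confusion-matrix counts."""
    precision = safe_div(c.tp, c.tp + c.fp)  # also called PPV
    recall = safe_div(c.tp, c.tp + c.fn)     # also called sensitivity
    return {
        "accuracy": safe_div(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
        "precision": precision,
        "recall": recall,
        "specificity": safe_div(c.tn, c.tn + c.fp),
        "npv": safe_div(c.tn, c.tn + c.fn),
        # F1 is the harmonic mean of precision and recall.
        "f1": safe_div(2 * precision * recall, precision + recall),
    }


if __name__ == "__main__":
    # Made-up counts for illustration only; not from any study in Table 1.
    example = ConfusionCounts(tp=40, fp=10, tn=45, fn=5)
    for name, value in summarize(example).items():
        print(f"{name}: {value:.4f}")
```

Because accuracy alone can look favourable on an imbalanced dataset, reading it alongside recall and specificity, as several rows of the table do, gives a fuller picture.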