Table 3. Performance of classification models on the imbalanced dataset using the ADASYN technique in the validation set.
| Model | ADASYN | Precision | Accuracy | Sensitivity | Specificity | F1-score |
|---|---|---|---|---|---|---|
| XGBoost | 200% | 0.678 (0.667–0.689) | 0.748 (0.740–0.756) | 0.801 (0.778–0.823) | 0.714 (0.686–0.742) | 0.734 (0.724–0.744) |
| | 250% | 0.724 (0.713–0.735) | 0.752 (0.745–0.759) | 0.805 (0.784–0.826) | 0.715 (0.699–0.731) | 0.762 (0.750–0.774) |
| | 300% | 0.753 (0.743–0.764) | 0.754 (0.748–0.760) | 0.824 (0.800–0.848) | 0.697 (0.678–0.715) | 0.787 (0.775–0.798) |
| LR | 200% | 0.643 (0.633–0.652) | 0.726 (0.718–0.733) | 0.774 (0.753–0.794) | 0.701 (0.689–0.714) | 0.702 (0.688–0.716) |
| | 250% | 0.695 (0.685–0.706) | 0.733 (0.727–0.739) | 0.754 (0.741–0.766) | 0.722 (0.714–0.730) | 0.723 (0.716–0.731) |
| | 300% | 0.738 (0.731–0.746) | 0.742 (0.736–0.748) | 0.785 (0.774–0.796) | 0.701 (0.691–0.710) | 0.761 (0.754–0.768) |
| RF | 200% | 0.676 (0.664–0.687) | 0.745 (0.741–0.750) | 0.762 (0.741–0.784) | 0.742 (0.724–0.761) | 0.716 (0.703–0.730) |
| | 250% | 0.713 (0.698–0.728) | 0.738 (0.728–0.747) | 0.797 (0.771–0.823) | 0.692 (0.657–0.726) | 0.752 (0.740–0.764) |
| | 300% | 0.744 (0.737–0.750) | 0.744 (0.739–0.749) | 0.804 (0.776–0.832) | 0.693 (0.661–0.725) | 0.772 (0.760–0.785) |
| CNB | 200% | 0.647 (0.633–0.660) | 0.728 (0.720–0.737) | 0.777 (0.761–0.792) | 0.708 (0.695–0.721) | 0.705 (0.694–0.717) |
| | 250% | 0.692 (0.681–0.704) | 0.728 (0.725–0.732) | 0.779 (0.768–0.789) | 0.688 (0.673–0.703) | 0.733 (0.726–0.740) |
| | 300% | 0.724 (0.714–0.733) | 0.734 (0.727–0.741) | 0.785 (0.772–0.797) | 0.688 (0.668–0.708) | 0.753 (0.744–0.762) |
| SVM | 200% | 0.650 (0.641–0.660) | 0.733 (0.727–0.738) | 0.792 (0.778–0.806) | 0.696 (0.684–0.707) | 0.714 (0.705–0.723) |
| | 250% | 0.665 (0.658–0.673) | 0.718 (0.714–0.722) | 0.808 (0.797–0.820) | 0.645 (0.634–0.655) | 0.730 (0.721–0.738) |
| | 300% | 0.703 (0.695–0.711) | 0.727 (0.719–0.734) | 0.817 (0.785–0.850) | 0.638 (0.605–0.670) | 0.755 (0.740–0.771) |
| kNN | 200% | 0.759 (0.747–0.770) | 0.713 (0.706–0.719) | 0.753 (0.705–0.801) | 0.722 (0.674–0.770) | 0.754 (0.730–0.779) |
| | 250% | 0.776 (0.765–0.787) | 0.706 (0.700–0.712) | 0.761 (0.732–0.790) | 0.714 (0.685–0.742) | 0.768 (0.755–0.780) |
| | 300% | 0.796 (0.789–0.803) | 0.703 (0.698–0.708) | 0.759 (0.727–0.791) | 0.717 (0.691–0.743) | 0.776 (0.758–0.795) |
Data are presented as the estimated value with its 95% confidence interval. ADASYN, adaptive synthetic sampling; XGBoost, Extreme Gradient Boosting; LR, logistic regression; SVM, support vector machine; CNB, Complement Naive Bayes; RF, random forest; kNN, k-nearest neighbors.
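The metrics reported in Table 3 follow the standard confusion-matrix definitions. A minimal sketch of those definitions is shown below; the counts (`tp`, `fp`, `tn`, `fn`) are illustrative placeholders, not values from the study's data.

```python
# Illustrative confusion-matrix counts (hypothetical, not from the study).
tp, fp, tn, fn = 80, 38, 71, 20

precision = tp / (tp + fp)                  # positive predictive value
sensitivity = tp / (tp + fn)                # recall / true-positive rate
specificity = tn / (tn + fp)                # true-negative rate
accuracy = (tp + tn) / (tp + fp + tn + fn)  # overall correct fraction
f1 = 2 * precision * sensitivity / (precision + sensitivity)

for name, value in [("precision", precision), ("accuracy", accuracy),
                    ("sensitivity", sensitivity), ("specificity", specificity),
                    ("f1", f1)]:
    print(f"{name}: {value:.3f}")
```

Note that with an ADASYN-resampled validation split, accuracy can sit below both sensitivity and specificity's midpoint or above it depending on the class ratio, which is why the table reports all five metrics rather than accuracy alone.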