Table 3. Performance of classification models on the imbalanced dataset using the ADASYN technique in the validation set.
| Model | ADASYN | Precision | Accuracy | Sensitivity | Specificity | F1-score |
|---|---|---|---|---|---|---|
| XGBoost | 200% | 0.678 (0.667–0.689) | 0.748 (0.740–0.756) | 0.801 (0.778–0.823) | 0.714 (0.686–0.742) | 0.734 (0.724–0.744) |
| | 250% | 0.724 (0.713–0.735) | 0.752 (0.745–0.759) | 0.805 (0.784–0.826) | 0.715 (0.699–0.731) | 0.762 (0.750–0.774) |
| | 300% | 0.753 (0.743–0.764) | 0.754 (0.748–0.760) | 0.824 (0.800–0.848) | 0.697 (0.678–0.715) | 0.787 (0.775–0.798) |
| LR | 200% | 0.643 (0.633–0.652) | 0.726 (0.718–0.733) | 0.774 (0.753–0.794) | 0.701 (0.689–0.714) | 0.702 (0.688–0.716) |
| | 250% | 0.695 (0.685–0.706) | 0.733 (0.727–0.739) | 0.754 (0.741–0.766) | 0.722 (0.714–0.730) | 0.723 (0.716–0.731) |
| | 300% | 0.738 (0.731–0.746) | 0.742 (0.736–0.748) | 0.785 (0.774–0.796) | 0.701 (0.691–0.710) | 0.761 (0.754–0.768) |
| RF | 200% | 0.676 (0.664–0.687) | 0.745 (0.741–0.750) | 0.762 (0.741–0.784) | 0.742 (0.724–0.761) | 0.716 (0.703–0.730) |
| | 250% | 0.713 (0.698–0.728) | 0.738 (0.728–0.747) | 0.797 (0.771–0.823) | 0.692 (0.657–0.726) | 0.752 (0.740–0.764) |
| | 300% | 0.744 (0.737–0.750) | 0.744 (0.739–0.749) | 0.804 (0.776–0.832) | 0.693 (0.661–0.725) | 0.772 (0.760–0.785) |
| CNB | 200% | 0.647 (0.633–0.660) | 0.728 (0.720–0.737) | 0.777 (0.761–0.792) | 0.708 (0.695–0.721) | 0.705 (0.694–0.717) |
| | 250% | 0.692 (0.681–0.704) | 0.728 (0.725–0.732) | 0.779 (0.768–0.789) | 0.688 (0.673–0.703) | 0.733 (0.726–0.740) |
| | 300% | 0.724 (0.714–0.733) | 0.734 (0.727–0.741) | 0.785 (0.772–0.797) | 0.688 (0.668–0.708) | 0.753 (0.744–0.762) |
| SVM | 200% | 0.650 (0.641–0.660) | 0.733 (0.727–0.738) | 0.792 (0.778–0.806) | 0.696 (0.684–0.707) | 0.714 (0.705–0.723) |
| | 250% | 0.665 (0.658–0.673) | 0.718 (0.714–0.722) | 0.808 (0.797–0.820) | 0.645 (0.634–0.655) | 0.730 (0.721–0.738) |
| | 300% | 0.703 (0.695–0.711) | 0.727 (0.719–0.734) | 0.817 (0.785–0.850) | 0.638 (0.605–0.670) | 0.755 (0.740–0.771) |
| kNN | 200% | 0.759 (0.747–0.770) | 0.713 (0.706–0.719) | 0.753 (0.705–0.801) | 0.722 (0.674–0.770) | 0.754 (0.730–0.779) |
| | 250% | 0.776 (0.765–0.787) | 0.706 (0.700–0.712) | 0.761 (0.732–0.790) | 0.714 (0.685–0.742) | 0.768 (0.755–0.780) |
| | 300% | 0.796 (0.789–0.803) | 0.703 (0.698–0.708) | 0.759 (0.727–0.791) | 0.717 (0.691–0.743) | 0.776 (0.758–0.795) |
Data are presented as the estimated value with its 95% confidence interval. ADASYN, adaptive synthetic sampling; XGBoost, Extreme Gradient Boosting; LR, logistic regression; SVM, support vector machine; CNB, Complement Naive Bayes; RF, random forest; kNN, k-nearest neighbors.
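The metrics reported in Table 3 follow the standard confusion-matrix definitions. A minimal sketch of those definitions is shown below; the counts (`tp`, `fp`, `tn`, `fn`) are illustrative placeholders, not values from the study's data.

```python
# Illustrative confusion-matrix counts (hypothetical, not from the study).
tp, fp, tn, fn = 80, 38, 71, 20

precision = tp / (tp + fp)                  # positive predictive value
sensitivity = tp / (tp + fn)                # recall / true-positive rate
specificity = tn / (tn + fp)                # true-negative rate
accuracy = (tp + tn) / (tp + fp + tn + fn)  # overall correct fraction
f1 = 2 * precision * sensitivity / (precision + sensitivity)

for name, value in [("precision", precision), ("accuracy", accuracy),
                    ("sensitivity", sensitivity), ("specificity", specificity),
                    ("f1", f1)]:
    print(f"{name}: {value:.3f}")
```

Note that with an ADASYN-resampled validation split, accuracy can sit below both sensitivity and specificity's midpoint or above it depending on the class ratio, which is why the table reports all five metrics rather than accuracy alone.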