Table 3.
Comparison of imbalanced data handling techniques by accuracy and area under the curve (AUC)
| Algorithm | Metric | Unbalanced | SMOTE |
|---|---|---|---|
| Logistic Regression | Accuracy (%) | 80.25 | 70.00 |
| | AUC | 0.668 | 0.775 |
| Decision Tree | Accuracy (%) | 66.75 | 75.95 |
| | AUC | 0.557 | 0.760 |
| Random Forest | Accuracy (%) | 79.41 | 84.40 |
| | AUC | 0.659 | 0.924 |
| Gradient Boosting | Accuracy (%) | 79.13 | 74.91 |
| | AUC | 0.682 | 0.824 |
| XGBoost | Accuracy (%) | 77.32 | 82.22 |
| | AUC | 0.641 | 0.898 |
| Extra Tree classifier | Accuracy (%) | 78.74 | 84.93 |
| | AUC | 0.628 | 0.926 |
SMOTE: Synthetic Minority Over-sampling Technique; AUC: area under the curve. Underlined and bold numbers indicate the highest score for each classifier.
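
For Logistic Regression and Gradient Boosting, the table reports lower accuracy but higher AUC after SMOTE. The following minimal sketch shows how the two metrics can move in opposite directions on imbalanced data. The labels and scores are invented toy values, not data from the study, and the helper names `accuracy` and `roc_auc` are hypothetical; AUC is computed here from its pairwise-ranking definition rather than with any particular library.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def threshold(scores, cutoff=0.5):
    """Turn scores into hard 0/1 predictions at the given cutoff."""
    return [1 if s >= cutoff else 0 for s in scores]


# Toy imbalanced labels (assumed for illustration): 8 negatives, 2 positives.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

# Model A ignores the minority class: the same low score for every sample.
scores_a = [0.1] * 10

# Model B ranks both positives above every negative, but pushes three
# negatives over the 0.5 cutoff, which costs accuracy.
scores_b = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.55, 0.3, 0.8, 0.9]

acc_a = accuracy(y_true, threshold(scores_a))
auc_a = roc_auc(y_true, scores_a)
acc_b = accuracy(y_true, threshold(scores_b))
auc_b = roc_auc(y_true, scores_b)

print(f"Model A: accuracy={acc_a:.2f}, AUC={auc_a:.2f}")
print(f"Model B: accuracy={acc_b:.2f}, AUC={auc_b:.2f}")
```

In this toy case, model A scores higher on accuracy only because it always predicts the majority class, while model B separates the classes better and so has the higher AUC. This is one plausible reading of the rows in which accuracy falls after SMOTE.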