Table 2.
Model performance with imputed imbalanced training data set DMI(ITD) and validation data set DMI(VD)
Training in ITD (n = 1029); validation in VD (n = 1029).
Classifier | Training Specificity (TNR) | Training Sensitivity (TPR) | Training AUC | Validation Specificity (TNR) | Validation Sensitivity (TPR) | Validation AUC | Rank |
---|---|---|---|---|---|---|---|
(K = 1) NN | 0.908 | 0.167 | 0.548 | 0.923 | 0.292 | 0.607 | 9 |
(K = 3) NN | 0.975 | 0.094 | 0.601 | 0.979 | 0.125 | 0.627 | 8 |
(K = 5) NN | 0.985 | 0.042 | 0.624 | 0.989 | 0.063 | 0.651 | 6 |
(K = 7) NN | 0.996 | 0.031 | 0.648 | 0.998 | 0.052 | 0.644 | 7 |
(K = 9) NN | 0.999 | 0.031 | 0.660 | 0.999 | 0.042 | 0.665 | 5 |
ANN | 0.945 | 0.198 | 0.694 | 0.953 | 0.177 | 0.676 | 4 |
C4.5 | 0.985 | 0.083 | 0.575 | 0.979 | 0.125 | 0.496 | 12 |
LMT | 0.996 | 0.010 | 0.578 | 0.995 | 0.042 | 0.746 | 1 |
LR | 0.910 | 0.188 | 0.567 | 0.959 | 0.135 | 0.596 | 10 |
NB | 0.810 | 0.438 | 0.697 | 0.833 | 0.500 | 0.737 | 3 |
SVM | 0.966 | 0.156 | 0.561 | 0.976 | 0.146 | 0.561 | 11 |
RF | 0.998 | 0.021 | 0.725 | 0.999 | 0.010 | 0.742 | 2 |
Abbreviations: ANN = artificial neural network; AUC = area under the curve; C4.5 = decision tree; DMI = decision-tree based missing value imputation; ITD = imbalanced training data set; KNN = K-nearest neighbor; LMT = logistic model tree; LR = logistic regression; NB = naïve Bayes; RF = random forest; SVM = support vector machine; TNR = true negative rate; TPR = true positive rate; VD = validation data set.
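
The Rank column appears to follow descending validation AUC. This is an observation from the tabulated values, not a rule stated in the table. The short Python sketch below checks that reading using the validation AUC and Rank values copied from Table 2. The function name `rank_by_auc` and the variable names are illustrative only.

```python
# Validation-set AUC values copied from Table 2.
validation_auc = {
    "(K = 1) NN": 0.607,
    "(K = 3) NN": 0.627,
    "(K = 5) NN": 0.651,
    "(K = 7) NN": 0.644,
    "(K = 9) NN": 0.665,
    "ANN": 0.676,
    "C4.5": 0.496,
    "LMT": 0.746,
    "LR": 0.596,
    "NB": 0.737,
    "SVM": 0.561,
    "RF": 0.742,
}

# Rank column copied from Table 2.
reported_rank = {
    "(K = 1) NN": 9,
    "(K = 3) NN": 8,
    "(K = 5) NN": 6,
    "(K = 7) NN": 7,
    "(K = 9) NN": 5,
    "ANN": 4,
    "C4.5": 12,
    "LMT": 1,
    "LR": 10,
    "NB": 3,
    "SVM": 11,
    "RF": 2,
}


def rank_by_auc(auc_by_model):
    """Assign rank 1 to the highest AUC, rank 2 to the next, and so on.

    Assumption: ties do not occur (true for the values above).
    """
    ordered = sorted(auc_by_model, key=auc_by_model.get, reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}


computed_rank = rank_by_auc(validation_auc)
matches = computed_rank == reported_rank
print("Rank column matches descending validation AUC:", matches)
```

Under this reading, the ranking reflects AUC alone, so a classifier with very low sensitivity (for example RF, with a validation TPR of 0.010) can still rank near the top.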