Table 6. Evaluation of models.ᵃ

| Model | Explainability/interpretabilityᵇ | AUC results over all test samples | Number of positions with AUC > 0.7ᶜ,ᵈ | Average rank over all positions (1 = highest)ᶜ,ᵉ |
|---|---|---|---|---|
| GBM (gradient boosting) | No | 0.730 | 174 | 3.07 |
| RF (random forest) | No | 0.719 | 164 | 3.39 |
| VOBN (variable-order Bayesian networks) | Yes | 0.705 | 145 | 3.78 |
| LR (logistic regression) | Partial | 0.700 | 129 | 4.30 |
| SVM (support vector machine) | No | 0.697 | 100 | 5.16 |
| C4.5 (J48) | Yes | 0.682 | 103 | 5.37 |
| CHAID | Yes | 0.681 | 105 | 5.12 |
| Naive Bayes | Yes | 0.677 | 80 | 5.81 |
| CART | Yes | 0.644 | 7 | 7.92 |
ᵃ Note that both the RF and GBM models and their implementations are generally robust to noisy, high-dimensional datasets, since they base their decisions on multiple permutations of the dataset (see [56], [66], [67], [68], [69]). For the logistic regression and decision tree models, we implemented a feature-selection preprocessing step using information gain analysis (see [70]); a sketch of this screening step appears after these notes. For the SVM model, we used the built-in model implemented in [71], which can deal with high dimensionality by testing different subsets of the data. The VOBN model includes a built-in preprocessing procedure that uses mutual information to identify the high-impact features (see Appendix A for further details).
ᵇ We consider interpretable and non-interpretable models based on the classification presented in [72].
ᶜ These results show the AUC for each position in the organization. The AUC scores were calculated over all the candidates who were recruited and placed in specific positions.

ᵈ Out of 456 positions.

ᵉ For each position, the compared algorithms were ranked by their AUC scores; the values in this column represent the average rank of each algorithm over all positions. A lower rank implies a better average AUC score. A sketch of this ranking procedure is given in the second example below.
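
To illustrate the feature-selection preprocessing mentioned in note ᵃ, the following is a minimal sketch of information-gain-style screening. It is not the implementation used in this study: it assumes a numerically encoded feature matrix `X` and binary label `y` (generated synthetically here), and substitutes scikit-learn's `mutual_info_classif`, a mutual-information estimator closely related to information gain, for the analysis of [70].

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, mutual_info_classif

# Synthetic stand-in for the candidate dataset; all names and sizes
# here are illustrative only, not the study's actual data.
X, y = make_classification(n_samples=500, n_features=40,
                           n_informative=8, random_state=0)

# Score each feature by its estimated mutual information with the label
# (equivalent to information gain for discrete features) and keep the
# top k features before fitting the LR / decision-tree models.
selector = SelectKBest(score_func=mutual_info_classif, k=10)
X_reduced = selector.fit_transform(X, y)

print("reduced shape:", X_reduced.shape)
print("selected feature indices:", selector.get_support(indices=True))
```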
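
Similarly, the per-position evaluation described in notes ᶜ and ᵉ can be expressed in a few lines. The sketch below uses randomly generated placeholder labels and scores rather than the actual candidate data; the substantive steps are computing one AUC per model per position, ranking the negated AUCs so that rank 1 corresponds to the highest AUC, and averaging each model's ranks across positions.

```python
import numpy as np
from scipy.stats import rankdata
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
models = ["GBM", "RF", "LR"]   # illustrative subset of Table 6's models
n_positions = 5                # the paper evaluates 456 positions

ranks_per_model = {m: [] for m in models}
positions_above_07 = {m: 0 for m in models}

for _ in range(n_positions):
    # Placeholder ground truth and model scores for one position's candidates.
    y_true = rng.integers(0, 2, size=100)
    scores = {m: rng.random(100) for m in models}

    aucs = np.array([roc_auc_score(y_true, scores[m]) for m in models])
    for m, rank, auc in zip(models, rankdata(-aucs), aucs):
        ranks_per_model[m].append(rank)     # rank 1 = highest AUC (note e)
        positions_above_07[m] += auc > 0.7  # counted in note d's column

for m in models:
    print(f"{m}: avg rank = {np.mean(ranks_per_model[m]):.2f}, "
          f"positions with AUC > 0.7 = {positions_above_07[m]}")
```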