Table 6. Evaluation of models.ᵃ

| Model | Explainability/interpretabilityᵇ | AUC results over all test samples | Number of positions with AUC > 0.7ᶜ,ᵈ | Average rank over all positions (1 = highest)ᶜ,ᵉ |
|---|---|---|---|---|
| GBM (gradient boosting) | No | 0.730 | 174 | 3.07 |
| RF (random forest) | No | 0.719 | 164 | 3.39 |
| VOBN (variable-order Bayesian networks) | Yes | 0.705 | 145 | 3.78 |
| LR (logistic regression) | Partial | 0.700 | 129 | 4.30 |
| SVM (support vector machine) | No | 0.697 | 100 | 5.16 |
| C4.5 (J48) | Yes | 0.682 | 103 | 5.37 |
| CHAID | Yes | 0.681 | 105 | 5.12 |
| Naive Bayes | Yes | 0.677 | 80 | 5.81 |
| CART | Yes | 0.644 | 7 | 7.92 |
ᵃ Note that both the RF and GBM models and their implementations are generally robust to noisy, high-dimensional datasets, since they base their decisions on multiple permutations of the dataset (see [56], [66], [67], [68], [69]). For the logistic regression and decision tree models, we implemented a feature-selection preprocessing step using information gain analysis (see [70]); a sketch of this screening step appears after these notes. For the SVM model, we used the built-in model implemented in [71], which can deal with high dimensionality by testing different subsets of the data. The VOBN model includes a built-in preprocessing procedure that uses mutual information to identify the high-impact features (see Appendix A for further details).
ᵇ We consider interpretable and non-interpretable models based on the classification presented in [72].
ᶜ These results show the AUC for each position in the organization. The AUC scores were calculated over all the candidates who were recruited and placed in specific positions.

ᵈ Out of 456 positions.

ᵉ For each position, the compared algorithms were ranked by their AUC scores; the values in this column represent the average rank of each algorithm over all positions. A lower rank implies a better average AUC score. A sketch of this ranking procedure is given in the second example below.
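
To illustrate the feature-selection preprocessing mentioned in note ᵃ, the following is a minimal sketch of information-gain-style screening. It is not the implementation used in this study: it assumes a numerically encoded feature matrix `X` and binary label `y` (generated synthetically here), and substitutes scikit-learn's `mutual_info_classif`, a mutual-information estimator closely related to information gain, for the analysis of [70].

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, mutual_info_classif

# Synthetic stand-in for the candidate dataset; all names and sizes
# here are illustrative only, not the study's actual data.
X, y = make_classification(n_samples=500, n_features=40,
                           n_informative=8, random_state=0)

# Score each feature by its estimated mutual information with the label
# (equivalent to information gain for discrete features) and keep the
# top k features before fitting the LR / decision-tree models.
selector = SelectKBest(score_func=mutual_info_classif, k=10)
X_reduced = selector.fit_transform(X, y)

print("reduced shape:", X_reduced.shape)
print("selected feature indices:", selector.get_support(indices=True))
```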
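
Similarly, the per-position evaluation described in notes ᶜ and ᵉ can be expressed in a few lines. The sketch below uses randomly generated placeholder labels and scores rather than the actual candidate data; the substantive steps are computing one AUC per model per position, ranking the negated AUCs so that rank 1 corresponds to the highest AUC, and averaging each model's ranks across positions.

```python
import numpy as np
from scipy.stats import rankdata
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
models = ["GBM", "RF", "LR"]   # illustrative subset of Table 6's models
n_positions = 5                # the paper evaluates 456 positions

ranks_per_model = {m: [] for m in models}
positions_above_07 = {m: 0 for m in models}

for _ in range(n_positions):
    # Placeholder ground truth and model scores for one position's candidates.
    y_true = rng.integers(0, 2, size=100)
    scores = {m: rng.random(100) for m in models}

    aucs = np.array([roc_auc_score(y_true, scores[m]) for m in models])
    for m, rank, auc in zip(models, rankdata(-aucs), aucs):
        ranks_per_model[m].append(rank)     # rank 1 = highest AUC (note e)
        positions_above_07[m] += auc > 0.7  # counted in note d's column

for m in models:
    print(f"{m}: avg rank = {np.mean(ranks_per_model[m]):.2f}, "
          f"positions with AUC > 0.7 = {positions_above_07[m]}")
```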