Table 3.
| Ensemble ML-algorithm | Classifiers | Number of features (ML-model outputs) | Most important ML-classifiers / outer fold | Optimized metric | Hyperparameters | Selected number of features or hyperparameter settings on outer folds 1–5 | Accuracy# [95% CI] | ME | AUC | BS | LL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| vRF | vRF, tRF-BS, tRF-ME, tRF-LL, ELNET, SVM-LK, XGBoost, fastText | 8 × ML-models (findings) | Top 1: vRF-find 1/5, SVM-find 2/5, ELNET-find 1/5, XGBoost-find 1/5. Top 2: XGBoost-find 1/5, tRF-ME-find 2/5, fastText-find 1/5, ELNET-find 1/5 | ME | ntree = 500, mtry = 2, pvarsel = 8 | pvarsel = 8 | 83.5 [77.7–88.3] | 0.17 | 0.83 | 0.29 | 0.47 |
| vRF | vRF, tRF-BS, tRF-ME, tRF-LL, ELNET, SVM-LK, XGBoost, fastText | 8 × ML-models (impressions) | Top 1: fastText-impr 5/5. Top 2: SVM-impr 1/5, XGBoost-impr 2/5, tRF-BS-impr 1/5, ELNET-impr 1/5 | ME | ntree = 500, mtry = 2, pvarsel = 8 | pvarsel = 8 | 89.3 [84.3–93.2] | 0.11 | 0.90 | 0.19 | 0.34 |
| vRF | vRF, tRF-BS, tRF-ME, tRF-LL, ELNET, SVM-LK, XGBoost, fastText | 16 × ML-models (8 × findings & 8 × impressions) | Top 1: fastText-impr 5/5. Top 2: SVM-impr 3/5, tRF-BS-impr 1/5, ELNET-impr 1/5 | ME | ntree = 500, mtry = 4, pvarsel = 16 | pvarsel = 16 | 88.8 [83.7–92.8] | 0.11 | 0.90 | 0.20 | 0.36 |
| XGBoost | vRF, tRF-BS, tRF-ME, tRF-LL, ELNET, SVM-LK, XGBoost, fastText | 16 × ML-models (findings & impressions) | Top 1: fastText-impr 3/5, SVM-impr 2/5. Top 2: fastText-impr 2/5, tRF-BS-impr 2/5, SVM-impr 1/5 | ME | nrounds = [5, 10, 25, 50, 75, 100], max_depth = [3, 5, 6, 8], eta = [0.01, 0.1, 0.3], gamma = [0, 0.001, 0.01, 0.1, 0.5, 1], colsample_bytree = [0.1, 0.25, 0.5, 0.693 (ln2) ~ RF, 1.0], min_child_weight = 1, subsample = 1 | nrounds = [75, 5, 75, 5, 10], max_depth = [3, 6, 5, 3, 5], eta = [0.3, 0.01, 0.1, 0.01, 0.1], gamma = [1, 0.01, 0.1, 0, 0.5], colsample_bytree = [0.1, 0.5, ln2 ~ RF, 0.1, 0.25] | 87.4 [82.0–91.6] | 0.13 | 0.87 | 0.30 | 0.46 |
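The vRF rows above describe a stacking setup: the outer-fold class probabilities of the eight base classifiers serve as the input features of a vanilla random forest meta-learner (ntree = 500, mtry = 2, all pvarsel = 8 features kept). A minimal sketch in Python, using hypothetical stand-in data; the paper's actual pipeline used nested cross-validation on RadLex-term features, which is not reproduced here:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Stand-in for the meta-features: out-of-fold class probabilities produced
# by the 8 base classifiers on the "findings" sections (hypothetical values).
n_reports, n_base_models = 100, 8
meta_X = rng.random((n_reports, n_base_models))
y = np.arange(n_reports) % 3  # hypothetical 3-class report label

# "vRF" meta-learner with the Table 3 settings:
vrf = RandomForestClassifier(
    n_estimators=500,  # ntree = 500
    max_features=2,    # mtry = 2
    random_state=0,
)
vrf.fit(meta_X, y)
probs = vrf.predict_proba(meta_X)  # rows sum to 1, as the Hand-Till AUC requires
```

In a faithful reproduction, `meta_X` would hold out-of-fold predictions so the meta-learner never sees probabilities from models trained on the same reports.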
AUC: multiclass area under the ROC curve after Hand and Till (computable only when the predicted probabilities are scaled to sum to 1); us var.filt: unsupervised variance filtering using the p = 300 most variable RadLex terms; this filtering step was performed before training to prevent information leakage; BS: Brier score; ME: misclassification error; LL: multiclass log loss; vRF and tRF: vanilla and tuned random forests; ELNET: elastic-net penalized multinomial logistic regression; SVM: support vector machine; LK: linear-kernel SVM; n.SV: number of support vectors; XGBoost: extreme gradient boosting using trees as base learners; BT: boosted trees.
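The four performance columns (Accuracy/ME, AUC, BS, LL) can be sketched with standard scikit-learn calls plus a hand-rolled multiclass Brier score; the toy predictions below are hypothetical. scikit-learn's one-vs-one macro-averaged AUC corresponds to the Hand and Till multiclass measure:

```python
import numpy as np
from sklearn.metrics import accuracy_score, log_loss, roc_auc_score

y_true = np.array([0, 1, 2, 0, 1, 2])       # hypothetical true classes
probs = np.array([                           # predicted probabilities,
    [0.7, 0.2, 0.1], [0.2, 0.6, 0.2],        # scaled to sum to 1 per row
    [0.1, 0.2, 0.7], [0.5, 0.3, 0.2],
    [0.3, 0.5, 0.2], [0.2, 0.2, 0.6],
])

y_pred = probs.argmax(axis=1)
me = 1.0 - accuracy_score(y_true, y_pred)    # misclassification error (ME)
ll = log_loss(y_true, probs)                 # multiclass log loss (LL)
auc = roc_auc_score(y_true, probs,           # Hand & Till multiclass AUC
                    multi_class="ovo", average="macro")
# Multiclass Brier score (BS): mean squared distance to the one-hot truth.
onehot = np.eye(3)[y_true]
bs = ((probs - onehot) ** 2).sum(axis=1).mean()
```

Because every toy prediction here is correct and well ranked, ME is 0 and AUC is 1; the table's values come from the held-out outer folds of the nested cross-validation.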