Table 3.
Method | McNemar’s test, χ²₁ | Macro F1-score | Macro unknown recall/sensitivity | Macro unknown precision |
---|---|---|---|---|
Full methods | | 86.24 ± 2.48% | 94.09 ± 2.42% | 79.66 ± 3.24% |
Soft voting of all T2 components | χ²₁ = 332ᵃ, p < 0.00001 | 86.87 ± 3.11% | 88.55 ± 3.58% | 85.36 ± 3.93% |
T2–closed-set Random Forest | χ²₁ = 920, p < 0.00001 | 82.80 ± 3.84% | 83.79 ± 4.86% | 81.97 ± 4.23% |
T2–closed-set SVM | χ²₁ = 1120, p < 0.00001 | 82.68 ± 4.51% | 82.93 ± 5.68% | 82.58 ± 4.60% |
T2–closed-set WDNN | χ²₁ = 905, p < 0.00001 | 81.87 ± 4.53% | 87.34 ± 5.96% | 77.11 ± 3.93% |
Softmax with a threshold | χ²₁ = 12,151, p < 0.00001 | 72.38 ± 4.43% | 61.81 ± 4.31% | 87.43 ± 5.47% |
Open-set re-mapped | χ²₁ = 24,656, p < 0.00001 | 72.72 ± 4.28% | 58.03 ± 5.52% | 98.02 ± 1.98% |
ODIN | χ²₁ = 6414, p < 0.00001 | 49.58 ± 26.02% | 68.87 ± 42.70% | 82.02 ± 5.77% |
ᵃ A high chi-squared (χ²₁) value corresponds to a low p-value, which indicates a statistically significant difference from the full methods.
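
The relationship between the chi-squared statistic and the p-value described in the footnote can be illustrated with a minimal Python sketch. It assumes the standard McNemar formulation on the two discordant counts, here called `b` and `c` (cases where only one of the two compared methods is correct); these names, the optional continuity correction, and the helper functions are illustrative assumptions, since the table does not state which variant of the test was used. The p-value uses the closed form for a chi-squared distribution with one degree of freedom.

```python
import math


def mcnemar_chi2(b: int, c: int, continuity: bool = True) -> float:
    """McNemar chi-squared statistic from the two discordant counts.

    b: cases where method A is correct and method B is wrong (hypothetical name)
    c: cases where method A is wrong and method B is correct (hypothetical name)
    continuity: apply the continuity correction (an assumption, not
        something the table specifies)
    """
    if b + c == 0:
        raise ValueError("no discordant pairs: the statistic is undefined")
    diff = abs(b - c)
    if continuity:
        diff = max(diff - 1, 0)
    return diff ** 2 / (b + c)


def chi2_1df_pvalue(chi2: float) -> float:
    """Upper-tail p-value for a chi-squared statistic with 1 degree of freedom.

    For 1 df, P(X > x) = erfc(sqrt(x / 2)), so no external library is needed.
    """
    return math.erfc(math.sqrt(chi2 / 2.0))


# Illustrative counts only; they are not taken from the study.
stat = mcnemar_chi2(10, 30, continuity=False)
p_value = chi2_1df_pvalue(stat)
```

Because the p-value falls monotonically as the statistic grows, any of the χ²₁ values reported in the table (all above 300) yields a p-value far below 0.00001 under this formulation.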