Table 2:
Experiment | Baseline | PEP | Temp. Scaling | MCD | SWA | Deep Ensembles |
---|---|---|---|---|---|---|
NLL | ||||||
MNIST (MLP) | 0.096 ± 0.01 | 0.079 ± 0.01 | 0.074 ± 0.01 | 0.094 ± 0.00 | 0.067 ± 0.00 | 0.044 ± 0.00 |
MNIST (CNN) | 0.036 ± 0.00 | 0.034 ± 0.00 | 0.032 ± 0.00 | 0.031 ± 0.00 | 0.028 ± 0.00 | 0.021 ± 0.00 |
Fashion MNIST | 0.360 ± 0.01 | 0.275 ± 0.01 | 0.271 ± 0.01 | 0.218 ± 0.01 | 0.277 ± 0.01 | 0.198 ± 0.00 |
CIFAR-10 | 1.063 ± 0.03 | 0.982 ± 0.02 | 0.956 ± 0.02 | 0.798 ± 0.01 | 0.827 ± 0.01 | 0.709 ± 0.00 |
CIFAR-100 | 2.685 ± 0.03 | 2.651 ± 0.03 | 2.606 ± 0.03 | 2.435 ± 0.03 | 2.314 ± 0.02 | 2.159 ± 0.01 |
Brier score | ||||||
MNIST (MLP) | 0.037 ± 0.00 | 0.035 ± 0.00 | 0.035 ± 0.00 | 0.040 ± 0.00 | 0.032 ± 0.00 | 0.020 ± 0.00 |
MNIST (CNN) | 0.016 ± 0.00 | 0.015 ± 0.00 | 0.015 ± 0.00 | 0.014 ± 0.00 | 0.013 ± 0.00 | 0.010 ± 0.00 |
Fashion MNIST | 0.137 ± 0.01 | 0.127 ± 0.01 | 0.126 ± 0.00 | 0.111 ± 0.00 | 0.121 ± 0.00 | 0.096 ± 0.00 |
CIFAR-10 | 0.469 ± 0.01 | 0.450 ± 0.01 | 0.447 ± 0.01 | 0.381 ± 0.01 | 0.373 ± 0.00 | 0.335 ± 0.00 |
CIFAR-100 | 0.795 ± 0.01 | 0.786 ± 0.01 | 0.782 ± 0.01 | 0.768 ± 0.01 | 0.723 ± 0.00 | 0.695 ± 0.00 |
ECE (%) | ||||||
MNIST (MLP) | 1.324 ± 0.16 | 0.528 ± 0.12 | 0.415 ± 0.10 | 2.569 ± 0.17 | 0.536 ± 0.08 | 0.839 ± 0.08 |
MNIST (CNN) | 0.517 ± 0.07 | 0.366 ± 0.08 | 0.259 ± 0.06 | 0.832 ± 0.06 | 0.282 ± 0.04 | 0.287 ± 0.05 |
Fashion MNIST | 5.269 ± 0.22 | 1.784 ± 0.54 | 1.098 ± 0.18 | 1.466 ± 0.30 | 3.988 ± 0.11 | 0.942 ± 0.13 |
CIFAR-10 | 11.718 ± 0.72 | 4.599 ± 0.82 | 1.318 ± 0.26 | 7.109 ± 0.62 | 8.655 ± 0.29 | 8.867 ± 0.23 |
CIFAR-100 | 9.780 ± 0.69 | 5.535 ± 0.50 | 2.012 ± 0.31 | 12.608 ± 0.59 | 7.180 ± 0.48 | 11.954 ± 0.29 |
Classification Error (%) | ||||||
MNIST (MLP) | 2.264 ± 0.22 | 2.286 ± 0.24 | 2.264 ± 0.22 | 2.452 ± 0.14 | 2.082 ± 0.10 | 1.285 ± 0.05 |
MNIST (CNN) | 0.990 ± 0.13 | 0.990 ± 0.12 | 0.990 ± 0.13 | 0.842 ± 0.06 | 0.868 ± 0.06 | 0.659 ± 0.03 |
Fashion MNIST | 8.420 ± 0.32 | 8.522 ± 0.34 | 8.420 ± 0.32 | 7.692 ± 0.34 | 7.734 ± 0.11 | 6.508 ± 0.10 |
CIFAR-10 | 33.023 ± 0.68 | 32.949 ± 0.74 | 33.023 ± 0.68 | 27.207 ± 0.66 | 26.004 ± 0.36 | 22.880 ± 0.21 |
CIFAR-100 | 64.843 ± 0.69 | 64.789 ± 0.69 | 64.843 ± 0.69 | 60.772 ± 0.58 | 58.092 ± 0.42 | 53.917 ± 0.30 |
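
For reference, the four metrics reported in Table 2 can be computed from predicted class probabilities roughly as sketched below. This is a minimal illustration and not the authors' evaluation code: the function names, the 15 equal-width confidence bins for ECE, and the Brier convention (squared error summed over classes, then averaged over samples) are assumptions made here for the example.

```python
import math

def nll(probs, labels, eps=1e-12):
    """Mean negative log-likelihood of the true class."""
    return -sum(math.log(max(p[y], eps)) for p, y in zip(probs, labels)) / len(labels)

def brier(probs, labels):
    """Squared error against the one-hot label, summed over classes and
    averaged over samples (assumed convention)."""
    total = 0.0
    for p, y in zip(probs, labels):
        total += sum((pk - (1.0 if k == y else 0.0)) ** 2 for k, pk in enumerate(p))
    return total / len(labels)

def error_percent(probs, labels):
    """Percentage of samples whose argmax prediction differs from the label."""
    wrong = sum(1 for p, y in zip(probs, labels) if p.index(max(p)) != y)
    return 100.0 * wrong / len(labels)

def ece_percent(probs, labels, n_bins=15):
    """Expected calibration error in percent, using equal-width confidence
    bins (n_bins=15 is an assumed default)."""
    # Per bin: [sample count, sum of confidences, number of correct predictions]
    bins = [[0, 0.0, 0.0] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        conf = max(p)
        b = min(int(conf * n_bins), n_bins - 1)
        bins[b][0] += 1
        bins[b][1] += conf
        bins[b][2] += 1.0 if p.index(conf) == y else 0.0
    n = len(labels)
    # Sum of (count/n) * |accuracy - mean confidence| simplifies to
    # |correct - sum of confidences| / n for each non-empty bin.
    return 100.0 * sum(abs(acc - conf) for cnt, conf, acc in bins if cnt) / n

# Toy example: four samples, two classes (hypothetical data).
probs = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8], [0.7, 0.3]]
labels = [0, 1, 1, 0]
print(nll(probs, labels), brier(probs, labels),
      ece_percent(probs, labels), error_percent(probs, labels))
```

All four metrics are lower-is-better. Temperature scaling rescales the logits by a single positive scalar and so leaves the argmax unchanged, which is consistent with its classification error matching the baseline in every row.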