Table 5.
Explanation consistency, C, of checkpoint averaging, submodel averaging, and random submodel picking on baseline models and on both normal and explanation ensembles. CA-SA denotes checkpoint averaging followed by ensemble submodel averaging; CU is the Codon Usage dataset; Class. is classification and Regr. is regression.
| Models | Method | BCW | Diabetes (Class.) | Diabetes (Regr.) | CU (DNA) | CU (Kingdom) | MIMIC-IV |
|---|---|---|---|---|---|---|---|
| Baseline models | Checkpoint averaging | 0.2117 | 0.75322 | 0.53356 | 0.06585 | 0.06378 | 0.1527 |
| Normal ensemble models | Checkpoint averaging | 0.2497 | 0.2790 | 0.5604 | 0.0007 | 0.2264 | n/a |
| | Random submodel | 0.1392 | 0.3062 | 0.5075 | 0.0440 | 0.01722 | n/a |
| | Submodel averaging | 0.1952 | 0.4597 | 0.4713 | 0.3882 | 0.1738 | n/a |
| | CA-SA | 0.2906 | 0.551 | 0.5193 | 0.5654 | 0.1638 | n/a |
| Explanation ensemble models | Checkpoint averaging | 0.2485 | 0.0175 | 0.5322 | 0.2510 | 0.2695 | 0.2954 |
| | Random submodel | 0.1365 | 0.2641 | 0.0625 | 0.2939 | 0.0193 | 0.01333 |
| | Submodel averaging | 0.2983 | 0.0355 | 0.8389 | 0.0953 | 0.0330 | 0.2080 |
| | CA-SA | 0.3964 | 0.89222 | 0.8529 | 0.6462 | 0.3481 | 0.1784 |