. 2022 Dec 15;84:102722. doi: 10.1016/j.media.2022.102722

Table 7.

Ablation study on the effectiveness of UC-MIL’s backbone networks and the proposed Uncertainty-aware Consensus-assisted mechanism. Specifically, we replace the proposed UC-MIL with two classic MIL methods: Campanella et al. (2019) (w/ Instance-based) and Ilse et al. (2018) (w/ Embedding-based). Performance is reported as F1 (%) and AUROC (%), with 95% confidence intervals in brackets.

Methods	Learning ability: F1 (%)↑	Learning ability: AUROC (%)↑	Generalisation ability: F1 (%)↑	Generalisation ability: AUROC (%)↑
Backbone
w/ ResNet34	93.2 (90.9, 95.5)	98.0 (96.1, 99.1)	86.8 (84.7, 88.1)	90.5 (88.7, 92.0)
w/ ResWide50	93.3 (91.7, 95.0)	97.7 (95.4, 98.9)	86.0 (84.2, 88.1)	90.2 (88.4, 92.3)
w/ EfficientNetB3	90.2 (88.1, 92.3)	96.0 (93.9, 98.0)	84.6 (82.7, 86.1)	88.1 (86.5, 89.7)
w/ Res2Net50	91.7 (89.9, 93.2)	96.8 (94.7, 98.0)	85.0 (83.1, 87.2)	88.7 (86.6, 89.9)
Ours	94.9 (93.0, 96.8)	98.7 (97.6, 99.4)	88.0 (82.3, 92.7)	91.8 (84.6, 93.3)

Component
w/o Uncertainty	92.2 (90.1, 94.1)	97.0 (95.8, 98.1)	86.2 (94.9, 88.3)	90.2 (88.1, 92.1)
w/o Consensus	92.2 (90.4, 94.6)	97.7 (95.1, 98.6)	85.8 (83.3, 87.0)	89.4 (87.2, 90.5)
w/ Instance-based	88.7 (86.0, 90.1)	95.5 (93.3, 96.8)	81.5 (80.0, 83.1)	85.7 (83.3, 87.1)
w/ Embedding-based	89.9 (87.4, 91.2)	95.9 (93.3, 97.6)	83.0 (81.0, 85.2)	87.0 (85.1, 88.9)
Ours	94.9 (93.0, 96.8)	98.7 (97.6, 99.4)	88.0 (82.3, 92.7)	91.8 (84.6, 93.3)
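
The table reports each metric as a point estimate with a 95% confidence interval, but the caption does not say how the intervals were computed. The sketch below shows one common way to obtain such intervals, a percentile bootstrap over test samples; this choice is an assumption, not the authors' stated procedure. The function names (`f1_score`, `auroc`, `bootstrap_ci`), the number of resamples, and the 0.5 decision threshold in the usage note are all hypothetical.

```python
import random


def f1_score(y_true, y_pred):
    """F1 for binary labels (1 = positive, 0 = negative)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def auroc(y_true, scores):
    """AUROC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(y_true, scores, metric, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for metric(y_true, scores).

    Assumption: samples are resampled independently with replacement;
    resamples that contain a single class are skipped and redrawn.
    """
    if len(set(y_true)) < 2:
        raise ValueError("need both classes to bootstrap")
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        yt = [y_true[i] for i in idx]
        if len(set(yt)) < 2:
            continue
        stats.append(metric(yt, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lo, hi
```

For AUROC, `bootstrap_ci(y_true, scores, auroc)` returns the lower and upper bounds directly. For F1, the scores must first be thresholded, for example `bootstrap_ci(y_true, scores, lambda t, s: f1_score(t, [int(x >= 0.5) for x in s]))`. Multiplying the results by 100 gives percentages in the same form as the table.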