Author manuscript; available in PMC: 2025 Oct 15.

Published in final edited form as: IEEE Access. 2025 Jul 28;13:133351–133369. doi: 10.1109/access.2025.3593420

TABLE 4.

T-test results comparing the OoD AUROC in the (panda, artif) experiment using UNI features.

(a) Statistical comparison for vaeabmil
Model	OoD/auroc	t_stat	p_value	Significant
vaeabmil	0.9993 ± 0.0004	-	-	-
daeabmil	0.9988 ± 0.0001	2.2618	0.1521	False
dsmil	0.5408 ± 0.0386	20.3737	0.0024	True
clam	0.6794 ± 0.0276	20.3732	0.0024	True
dftdmil	0.6883 ± 0.0300	18.1359	0.0030	True
transmil	0.6982 ± 0.0410	12.8230	0.0060	True
abmil	0.7546 ± 0.0255	16.7655	0.0035	True
(b) Statistical comparison for daeabmil
Model	OoD/auroc	t_stat	p_value	Significant
daeabmil	0.9988 ± 0.0001	-	-	-
vaeabmil	0.9993 ± 0.0004	-2.2618	0.1521	False
dsmil	0.5408 ± 0.0386	20.5614	0.0024	True
clam	0.6794 ± 0.0276	20.0429	0.0025	True
dftdmil	0.6883 ± 0.0300	17.9140	0.0031	True
transmil	0.6982 ± 0.0410	12.6833	0.0062	True
abmil	0.7546 ± 0.0255	16.5333	0.0036	True
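The relationship between the t_stat and p_value columns can be checked with a short sketch. The table does not state the test variant or the number of runs, so the following is an assumption: the reported p-values appear consistent with a two-sided Student t distribution with 2 degrees of freedom (as would arise, for example, from a paired test over three runs). The function name `two_sided_p_df2` and the 0.05 threshold are hypothetical choices made for illustration; the t statistics are copied from panel (a).

```python
import math


def two_sided_p_df2(t_stat: float) -> float:
    """Two-sided p-value for a Student t statistic with 2 degrees of freedom.

    Uses the closed-form CDF for 2 degrees of freedom:
    F(t) = 1/2 + t / (2 * sqrt(t^2 + 2)).
    The choice of 2 degrees of freedom is an assumption, not stated in the table.
    """
    t = abs(t_stat)
    return 1.0 - t / math.sqrt(t * t + 2.0)


# (t_stat, reported p_value) pairs copied from panel (a).
REPORTED = {
    "daeabmil": (2.2618, 0.1521),
    "dsmil": (20.3737, 0.0024),
    "clam": (20.3732, 0.0024),
    "dftdmil": (18.1359, 0.0030),
    "transmil": (12.8230, 0.0060),
    "abmil": (16.7655, 0.0035),
}

ALPHA = 0.05  # assumed significance level; the table does not state it

for model, (t_stat, reported_p) in REPORTED.items():
    p = two_sided_p_df2(t_stat)
    print(
        f"{model:10s} t={t_stat:8.4f} "
        f"recomputed p={p:.4f} reported p={reported_p:.4f} "
        f"significant={p < ALPHA}"
    )
```

Under these assumptions the recomputed values agree with the reported p_value column to within rounding, and only the vaeabmil versus daeabmil comparison falls above the assumed threshold. Panel (b) can be checked the same way, since a two-sided p-value depends only on the magnitude of the t statistic.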