Author manuscript; available in PMC: 2024 Apr 1.

Published in final edited form as: IEEE Trans Artif Intell. 2022 Mar 15;4(2):383–397. doi: 10.1109/tai.2022.3159510

TABLE VII.

OOD detection accuracy in an experiment where the model was trained on CP-younger fetus. As OOD data, we used images from two dataset groups: CP datasets (CP-older fetus and CP-newborn) and non-CP datasets (Heart, Liver-CT, Hippocampus, and Pancreas).

Test data	Method	Accuracy	Sensitivity	Specificity	AUC
CP-older fetus and CP-newborn	Proposed method	0.90	0.91	0.87	0.90
	UNC-Dropout	0.60	0.61	0.60	0.67
	UNC-Ensemble	0.84	0.82	0.74	0.81
	Outlier exposure	0.80	0.86	0.80	0.81
	Lee-2017	0.75	0.78	0.71	0.78
	ODIN	0.64	0.65	0.68	0.65
	Mah-Dist	0.67	0.70	0.60	0.64
Heart, Liver-CT, Hippocampus, and Pancreas	Proposed method	1.00	1.00	1.00	1.00
	UNC-Dropout	0.64	0.60	0.78	0.65
	UNC-Ensemble	0.80	0.88	0.71	0.76
	Outlier exposure	0.83	0.83	0.85	0.85
	Lee-2017	0.76	0.70	0.80	0.77
	ODIN	0.70	0.71	0.70	0.73
	Mah-Dist	0.78	0.77	0.78	0.75
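
For readers unfamiliar with the four reported columns, the following is a minimal sketch of how such metrics are commonly computed for a binary OOD detector. It is not the authors' evaluation code. It assumes that each test image receives a scalar OOD score (higher meaning more likely OOD), that OOD is treated as the positive class, and that a fixed decision threshold is used; the function name `ood_metrics`, the threshold, and the toy data are all hypothetical.

```python
from typing import Dict, Sequence


def ood_metrics(
    labels: Sequence[int], scores: Sequence[float], threshold: float
) -> Dict[str, float]:
    """Compute accuracy, sensitivity, specificity, and AUC.

    Assumptions (not taken from the paper): labels use 1 for OOD and 0 for
    in-distribution, and a higher score means "more likely OOD".
    """
    # Binary decisions at the chosen threshold.
    preds = [1 if s >= threshold else 0 for s in scores]

    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)

    # AUC as the probability that a random OOD sample scores higher than a
    # random in-distribution sample (ties count as one half).
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )

    return {
        "accuracy": (tp + tn) / len(labels),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "auc": wins / (len(pos) * len(neg)),
    }


if __name__ == "__main__":
    # Toy data for illustration only; these are not values from the table.
    toy_labels = [1, 1, 1, 0, 0, 0]
    toy_scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]
    print(ood_metrics(toy_labels, toy_scores, threshold=0.45))
```

Accuracy, sensitivity, and specificity depend on the chosen threshold, whereas AUC summarizes the ranking quality of the scores across all thresholds, which is why the two kinds of columns can disagree for a given method.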