2023 Mar 21;239(4):499–513. doi: 10.1159/000530225

Table 5.

Summary of quality assessment on AI models reviewed

Domain	Checklist item		Fully addressed, n (%)	Partially addressed, n (%)	Not addressed, n (%)
Data	1	Image types	21 (95)	1 (5)	0 (0)
	2	Image artifacts	12 (55)	5 (23)	5 (23)
	3	Technical acquisition details	22 (100)	0 (0)	0 (0)
	4	Pre-processing procedures	20 (91)	0 (0)	2 (9)
	5	Synthetic images made public if used	22 (100)^a	0 (0)	0 (0)
	6	Public images adequately referenced	22 (100)	0 (0)	0 (0)
	7	Patient-level metadata	5 (23)	17 (77)	0 (0)
	8	Skin tone information and procedure by which skin tone was assessed	3 (14)	16 (73)	3 (14)
	9	Potential biases that may arise from use of patient information and metadata	9 (41)	7 (32)	6 (27)
	10	Dataset partitions	12 (55)	9 (41)	1 (5)
	11	Sample sizes of training, validation, and test sets	7 (32)	14 (64)	1 (5)
	12	External test set	3 (14)	2 (9)	17 (77)
	13	Multivendor images	20 (91)	2 (9)	0 (0)
	14	Class distribution and balance	5 (23)	15 (68)	2 (9)
	15	OOD images	2 (9)	7 (32)	13 (59)
Technique	16	Labeling method (ground truth, who did it)	15 (68)	7 (32)	0 (0)
	17	References to common/accepted diagnostic labels	22 (100)	0 (0)	0 (0)
	18	Histopathologic review for malignancies	16 (73)	2 (9)	4 (18)
	19	Detailed description of algorithm development	14 (64)	6 (27)	2 (9)
Technical Assessment	20	How to publicly evaluate algorithm	5 (23)	0 (0)	17 (77)
	21	Performance measures	9 (41)	13 (59)	0 (0)
	22	Benchmarking, technical comparison, and novelty	15 (68)	0 (0)	7 (32)
	23	Bias assessment	10 (45)	6 (27)	6 (27)
Application	24	Use cases and target conditions (inside distribution)	16 (73)	6 (27)	0 (0)
	25	Potential impacts on the healthcare team and patients	3 (14)	13 (59)	6 (27)

^aNo studies included synthetic images (checklist item 5); this item was therefore marked as “fully addressed” so as not to negatively affect the quality score.

OOD, out of distribution.
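
The n (%) entries in Table 5 appear to be counts out of 22 reviewed models, since rows where every study fully addressed an item are reported as 22 (100). As a minimal sketch under that assumption, the percentages and row totals can be checked as follows. The names `TOTAL_STUDIES`, `format_count`, and `check_row` are hypothetical helpers introduced here for illustration and are not part of the original review.

```python
# Assumption: 22 models were reviewed, inferred from rows reported as "22 (100)".
TOTAL_STUDIES = 22


def format_count(n, total=TOTAL_STUDIES):
    """Return a count in the table's 'n (%)' style, with the percentage rounded to an integer."""
    return f"{n} ({round(100 * n / total)})"


def check_row(fully, partially, not_addressed, total=TOTAL_STUDIES):
    """Return True when the three counts of a row account for every reviewed model."""
    return fully + partially + not_addressed == total


if __name__ == "__main__":
    # Checklist item 2 (image artifacts): 12 fully, 5 partially, 5 not addressed.
    print(format_count(12), format_count(5), format_count(5))
    print(check_row(12, 5, 5))
```

Because each percentage is rounded independently, the three percentages in a row can sum to 99 or 101 even when the counts themselves sum to 22.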