2021 May 19;4(5):e217234. doi: 10.1001/jamanetworkopen.2021.7234

Table 2. Masked Reviewer Survey Responses for the Qualitative Evaluation of Digital Wound Assessments.

Question	Annotator	No./total No. (%)
		Site 1			Site 2
		R1	R2	R3	R1	R2	R3
1. Area tracing meets definition?	AI	42/100 (42.0)	53/110 (48.2)	67/110 (60.9)	65/89 (73.0)	78/85 (91.8)	47/89 (52.8)
	H1	65/100 (65.0)	59/110 (53.6)	82/110 (74.5)	67/89 (75.3)	79/85 (92.9)	63/89 (70.8)
	H2	51/100 (51.0)	53/110 (48.2)	72/110 (65.5)	65/89 (73.0)	82/85 (96.5)	55/89 (61.8)
	P value	.01^a	.73	.11	.88	.41	.04^a
2. Which is AI?	AI	37/105 (35.2)	42/109 (38.5)	42/109 (38.5)	3/89 (3.4)	42/85 (49.4)	24/89 (27.0)
	H1	39/105 (37.1)	27/109 (24.8)	33/109 (30.3)	36/89 (40.4)	20/85 (23.5)	44/89 (49.4)
	H2	29/105 (27.6)	40/109 (36.7)	34/109 (31.2)	50/89 (56.2)	23/85 (27.1)	21/89 (23.6)
	P value	.51	.21	.48	<.001^a	.004^a	.004^a
3. Which is most accurate?	AI	32/91 (35.2)	39/108 (36.1)	35/109 (32.1)	19/89 (21.3)	25/85 (29.4)	24/89 (27.0)
	H1	27/91 (29.7)	32/108 (29.6)	42/109 (38.5)	48/89 (53.9)	38/85 (44.7)	44/89 (49.4)
	H2	32/91 (35.2)	37/108 (34.3)	32/109 (29.4)	22/89 (24.7)	22/85 (25.9)	21/89 (23.6)
	P value	.78	.76	.45	<.001^a	.04^a	.004^a

Abbreviations: AI, artificial intelligence; H, human; R, reviewer.

^a Statistically significant (P < .05). For question 1, the P value is from a Fisher exact test of the difference in frequency of yes answers between AI and human traces; for questions 2 and 3, it is from a χ² test of bias in selection frequency vs random selection.
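The χ² comparison against random selection can be illustrated with a minimal sketch. It assumes a Pearson goodness-of-fit test in which each of the 3 annotators (AI, H1, H2) is expected to be selected equally often. The function names are hypothetical, and the source does not specify the exact test variant, so the result may not reproduce the published P values digit for digit.

```python
import math

def chi_square_vs_uniform(counts):
    """Pearson chi-square statistic against equal expected counts (assumed null)."""
    total = sum(counts)
    expected = total / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)

def p_value_df2(stat):
    """Upper-tail P value for a chi-square distribution with 2 degrees of freedom."""
    # With 3 categories there are 2 degrees of freedom, and for df = 2
    # the chi-square survival function has the closed form exp(-x / 2).
    return math.exp(-stat / 2)

# Question 2, Site 2, R1: number of times AI, H1, and H2 were picked as "AI".
counts = [3, 36, 50]
stat = chi_square_vs_uniform(counts)
p = p_value_df2(stat)
print(f"chi2 = {stat:.2f}, P = {p:.2g}")
```

Under these assumptions, the Question 2, Site 2, R1 column yields a P value far below .001, which is consistent with the <.001 entry in the table.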