2024 Jun 25;31(8):1665–1670. doi: 10.1093/jamia/ocae142

Table 2.

Means and SD for survey questions rated on a 5-point Likert scale, with 1 indicating “strongly disagree” and 5 indicating “strongly agree.”

Model	Clarity	Completeness	Conciseness	Utility
Healthcare Provider	4.2 ± 1.1	1.9 ± 1.2	4.1 ± 1.3	3.5 ± 1.5
CLAIR	4.3 ± 0.9	2.8 ± 1.2^a	4.1 ± 1.0	4.2 ± 0.7
GPT4-Simple	3.1 ± 1.5^a,b	4.1 ± 1.1^a,b	2.5 ± 1.5^a,b	3.8 ± 1.0
GPT4-Complex	3.2 ± 1.5^a,b	3.6 ± 1.1^a,b	3.6 ± 1.2^c	4.1 ± 0.8

Superscripts denote a significant difference at P = .05 using the Kruskal-Wallis H test with Dunn’s post hoc tests: ^a relative to Healthcare Provider, ^b relative to CLAIR, ^c relative to GPT4-Simple.
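The footnote names a Kruskal-Wallis H test followed by Dunn’s post hoc tests. As a rough illustration of how such a comparison can be computed, the sketch below implements the tie-corrected H statistic and Dunn’s pairwise z statistics using only the Python standard library. The function names (`rank_with_ties`, `kruskal_dunn`) and the `toy` ratings are hypothetical and are not the study’s data or code. The pairwise P values are unadjusted, because the source does not state which multiplicity adjustment, if any, was applied. The P value for H itself (chi-square with k − 1 degrees of freedom) is omitted; a library routine such as `scipy.stats.kruskal` could supply it.

```python
import math
from itertools import combinations


def rank_with_ties(values):
    """Return 1-based average ranks and the tie term sum(t^3 - t)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_sum = 0
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        t = j - i + 1
        tie_sum += t ** 3 - t
        i = j + 1
    return ranks, tie_sum


def kruskal_dunn(groups):
    """groups: dict of name -> list of ratings.

    Returns (H, pairwise) where pairwise maps (name_a, name_b) to
    (z, unadjusted two-sided p). Assumes the ratings are not all identical.
    """
    names = list(groups)
    pooled = [v for name in names for v in groups[name]]
    n_total = len(pooled)
    ranks, tie_sum = rank_with_ties(pooled)

    sizes, mean_rank = {}, {}
    weighted_sq = 0.0
    start = 0
    for name in names:
        n = len(groups[name])
        group_ranks = ranks[start:start + n]
        sizes[name] = n
        mean_rank[name] = sum(group_ranks) / n
        weighted_sq += sum(group_ranks) ** 2 / n
        start += n

    # Kruskal-Wallis H, divided by the standard tie-correction factor.
    h = 12 / (n_total * (n_total + 1)) * weighted_sq - 3 * (n_total + 1)
    h /= 1 - tie_sum / (n_total ** 3 - n_total)

    # Dunn's test: z from the difference in mean ranks, tie-corrected variance.
    var_base = n_total * (n_total + 1) / 12 - tie_sum / (12 * (n_total - 1))
    pairwise = {}
    for a, b in combinations(names, 2):
        se = math.sqrt(var_base * (1 / sizes[a] + 1 / sizes[b]))
        z = (mean_rank[a] - mean_rank[b]) / se
        p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
        pairwise[(a, b)] = (z, p)
    return h, pairwise


# Hypothetical 5-point Likert ratings, made up for illustration only.
toy = {
    "Healthcare Provider": [2, 1, 3, 2, 1, 2],
    "CLAIR": [3, 2, 4, 3, 2, 3],
    "GPT4-Simple": [4, 5, 4, 3, 5, 4],
    "GPT4-Complex": [4, 3, 4, 3, 5, 3],
}
h_stat, pairs = kruskal_dunn(toy)
print(f"H = {h_stat:.3f}")
for (a, b), (z, p) in pairs.items():
    print(f"{a} vs {b}: z = {z:.3f}, unadjusted p = {p:.4f}")
```

Average ranks are used because Likert ratings produce many ties, and both the H statistic and Dunn’s variance term carry the corresponding tie correction.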