JAMA Netw Open. 2023 Oct 9;6(10):e2336997. doi: 10.1001/jamanetworkopen.2023.36997

Table 4. Accuracy and Completeness Scores for Risks, Benefits, and Alternatives to Surgery, Generated by Surgeons vs a Large Language Model-Based Chatbot.

Area	Accuracy and completeness score, mean (SD)
	Laparoscopic cholecystectomy		Inguinal hernia		Colectomy		Coronary artery bypass graft		Knee arthroplasty		Spine fusion
	Surgeon	Chatbot	Surgeon	Chatbot	Surgeon	Chatbot	Surgeon	Chatbot	Surgeon	Chatbot	Surgeon	Chatbot
Risks	1.4 (0.2)	1.5 (0.3)	1.7 (0.6)	1.8 (0.3)	1.8 (0.5)	1.7 (0.7)	1.3 (0.2)	1.6 (0.3)	2.1 (0.6)	1.7 (0.3)	1.8 (0.5)	1.6 (0.5)
Benefits	1.5 (0.7)	1.3 (0.4)	1.7 (0.8)	2.9 (0.3)	1.4 (0.8)	2.2 (0.6)	1.3 (0.5)	2.2 (0.4)	1.5 (0.5)	2.8 (0.3)	1.5 (0.7)	2.6 (0.5)
Alternatives	1.4 (0.7)	2.4 (0.9)	1.5 (0.9)	2.8 (0.5)	1.6 (0.9)	2.8 (0.4)	1.4 (0.7)	2.6 (0.5)	1.3 (0.6)	3.0 (0)	1.2 (0.4)	2.8 (0.4)
Overall impression	1.9 (0.3)	2.3 (0.5)	1.9 (0.4)	2.7 (0.6)	2.0 (0.6)	2.4 (0.5)	1.6 (0.6)	2.3 (0.7)	2.2 (0.4)	2.0 (0)	2.1 (0.5)	2.3 (0.5)
Composite	1.6 (0.3)	1.9 (0.4)	1.7 (0.5)	2.5 (0.3)	1.6 (0.6)	2.3 (0.5)	1.4 (0.4)	2.2 (0.3)	1.8 (0.5)	2.4 (0.1)	1.6 (0.4)	2.3 (0.4)
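
As a reading aid, the Composite row can be reduced to a chatbot-minus-surgeon difference per procedure. The Python sketch below is a minimal illustration, assuming only the composite means shown in the table; the names `composite`, `chatbot_minus_surgeon`, and `differences` are hypothetical and do not come from the article.

```python
# Composite mean scores transcribed from Table 4 as (surgeon, chatbot).
# All names in this sketch are illustrative, not taken from the article.
composite = {
    "Laparoscopic cholecystectomy": (1.6, 1.9),
    "Inguinal hernia": (1.7, 2.5),
    "Colectomy": (1.6, 2.3),
    "Coronary artery bypass graft": (1.4, 2.2),
    "Knee arthroplasty": (1.8, 2.4),
    "Spine fusion": (1.6, 2.3),
}


def chatbot_minus_surgeon(scores):
    """Return the chatbot mean minus the surgeon mean for each procedure."""
    # Round to one decimal to match the table's precision and hide float noise.
    return {
        name: round(chatbot - surgeon, 1)
        for name, (surgeon, chatbot) in scores.items()
    }


differences = chatbot_minus_surgeon(composite)
for name, diff in differences.items():
    print(f"{name}: {diff:+.1f}")
```

These are differences between rounded means only. They ignore the SDs in the table and say nothing about statistical significance.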