2023 Sep 12;12(6):3109–3119. doi: 10.1007/s40123-023-00800-2

Table 1.

Comparison of the distribution of scores between the two experts for all ChatGPT responses collected (n = 100)

Expert 2 (rows) vs. Expert 1 (columns)	1 point	2 points	3 points	4 points	5 points
1 point	0	0	0	0	0
2 points	0	0	0	0	0
3 points	0	5	18	14	9
4 points	0	0	3	6	15
5 points	0	0	1	12	17

Scores were defined as follows: 1 point, “Irrelevant response/no response”; 2 points, “Relevant response with major inaccuracies and potential for harm”; 3 points, “Relevant response with major inaccuracies and no potential for harm”; 4 points, “Relevant response with minor inaccuracies and no potential for harm”; 5 points, “Relevant response without any inaccuracies”.
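As a reading aid, the cross-tabulation in Table 1 can be summarized with a few agreement measures. The sketch below is illustrative only and is not taken from the article: it assumes that rows hold Expert 2's scores and columns hold Expert 1's scores, and the names `TABLE_1` and `agreement_summary` are hypothetical. It computes the marginal totals, the number of responses on which both experts gave the same score, the number on which they differed by at most one point, and an unweighted Cohen's kappa.

```python
# Counts transcribed from Table 1.
# Assumption: rows are Expert 2's scores (1-5), columns are Expert 1's scores (1-5).
TABLE_1 = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 5, 18, 14, 9],
    [0, 0, 3, 6, 15],
    [0, 0, 1, 12, 17],
]


def agreement_summary(table):
    """Return simple inter-rater agreement measures for a square count table."""
    size = len(table)
    n = sum(sum(row) for row in table)

    # Marginal totals: how often each expert used each score.
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(size)) for j in range(size)]

    # Exact agreement is the diagonal; "within one" also allows a 1-point gap.
    exact = sum(table[i][i] for i in range(size))
    within_one = sum(
        table[i][j]
        for i in range(size)
        for j in range(size)
        if abs(i - j) <= 1
    )

    # Unweighted Cohen's kappa: (p_o - p_e) / (1 - p_e), where p_e is the
    # agreement expected by chance from the marginal totals.
    observed = exact / n
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    kappa = (observed - expected) / (1 - expected)

    return {
        "n": n,
        "row_totals": row_totals,
        "col_totals": col_totals,
        "exact": exact,
        "within_one": within_one,
        "kappa": kappa,
    }


summary = agreement_summary(TABLE_1)
for name, value in summary.items():
    print(f"{name}: {value}")
```

With the counts above, the experts gave identical scores to 41 of the 100 responses and scores within one point of each other for 90, which corresponds to an unweighted kappa of roughly 0.16. A weighted kappa, which gives partial credit for near-misses on an ordinal scale, would be a reasonable alternative for scores like these.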