J Clin Sleep Med. 2024 Apr 1;20(4):583–594. doi: 10.5664/jcsm.10948

Table 3. Readability and interrater reliability scores.

|  | Prompt 1 | Prompt 2 | Prompt 3 | Prompt 4 |
| --- | --- | --- | --- | --- |
| Average Flesch–Kincaid score | 13.2 ± 2.2 | 8.1 ± 1.9 | 15.4 ± 2.8 | 17.3 ± 2.3 |
| Clinical accuracy^a | 0.141 (0.002 to 0.279) | 0.185 (0.046 to 0.323) | 0.141 (0.002 to 0.279) | 0.176 (0.038 to 0.315) |
| Relevance^a | −0.020 (−0.159 to 0.118) | N/A^b | N/A^b |  |
| Percent agreement (CA) | 97% | 92% | 97% | 85% |
| Percent agreement (relevance) | 98% | 100% | 100% |  |
| Evaluator 1 grading (CA) | 100% | 95% | 95% | 100% |
| Evaluator 2 grading (CA) | 95% | 95% | 100% | 100% |
| Evaluator 3 grading (CA) | 95% | 90% | 95% | 85% |
| Evaluator 4 grading (CA) | 100% | 85% | 95% | 65% |
| Evaluator 5 grading (CA) | 95% | 95% | 100% | 75% |
^a Fleiss kappa (95% confidence interval).

^b All ratings were the same, so no Fleiss kappa could be calculated.

CA = clinical accuracy.
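Footnote b's edge case is easy to reproduce. Below is a minimal sketch of the Fleiss kappa computation using statsmodels, assuming for illustration the table's five evaluators, 20 graded responses, and binary accurate/inaccurate grades; the ratings are invented stand-ins, not the study's data.

```python
# Minimal sketch of a Fleiss kappa computation with statsmodels.
# Assumed setup: 5 raters x 20 items, graded 1 = accurate, 0 = inaccurate.
# The ratings are randomly generated for illustration, NOT the study's data.
import numpy as np
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

rng = np.random.default_rng(0)
ratings = rng.choice([0, 1], size=(20, 5), p=[0.05, 0.95])  # items x raters

counts, _ = aggregate_raters(ratings)  # items x categories count table
print(f"kappa = {fleiss_kappa(counts):.3f}")

# Footnote b's degenerate case: when every rating is identical, the
# chance-expected agreement P_e = 1, so kappa = (P_bar - P_e) / (1 - P_e)
# is 0/0 and cannot be calculated.
unanimous_counts, _ = aggregate_raters(np.ones((20, 5), dtype=int))
print(fleiss_kappa(unanimous_counts))  # nan
```

This also illustrates why the table can pair high raw agreement with a low kappa (e.g., 97% agreement but kappa = 0.141 for Prompt 1): when nearly all ratings fall in one category, chance-expected agreement is already high, so kappa discounts most of the observed agreement.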
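The readability row can be checked in the same spirit. The sketch below uses the open-source textstat package; this is an assumption, since the excerpt does not name the tool that produced the scores, and the sample response is invented.

```python
# Minimal sketch of a Flesch-Kincaid grade-level check like the table's
# first row. textstat is an assumed tool choice; the sample text is invented.
import textstat

response = (
    "Obstructive sleep apnea is a condition in which the upper airway "
    "collapses repeatedly during sleep, briefly interrupting breathing."
)
# Flesch-Kincaid grade = 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59
print(textstat.flesch_kincaid_grade(response))
```

A score of 13.2, as reported for Prompt 1, corresponds to roughly college-level reading, while Prompt 2's 8.1 approaches the grade-school range often recommended for patient education materials.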