Table 8.
Volunteers’ comprehension scores for the original radiology reports (ORRs) and the interpretative radiology reports (IRRs) generated using GPT-4.
| Cancer sites | Comprehension score for ORRs, mean (SD; range) | Comprehension score for IRRs, mean (SD; range) | Difference (95% CI)<sup>a</sup> | P value | Tukey post hoc test |
| --- | --- | --- | --- | --- | --- |
| All sites | 5.51 (0.89; 4-7) | 7.83 (0.99; 6-9) | −2.32 (−2.40 to −2.24) | <.001 | Significant |
| Brain | 5.59 (1.07; 4-7) | 8.13 (0.87; 6-9) | −2.53 (−2.97 to −2.09) | <.001 | Significant |
| Thyroid | 5.45 (0.85; 4-7) | 7.80 (1.05; 6-9) | −2.36 (−2.62 to −2.09) | <.001 | Significant |
| Breast | 5.38 (0.86; 4-7) | 7.74 (0.95; 6-9) | −2.36 (−2.58 to −2.14) | <.001 | Significant |
| Lung | 5.46 (0.89; 4-7) | 7.81 (1.05; 6-9) | −2.35 (−2.56 to −2.13) | <.001 | Significant |
| Esophagus | 5.70 (0.67; 4-7) | 7.40 (0.97; 6-9) | −1.70 (−2.29 to −1.11) | <.001 | Significant |
| Gastric | 5.87 (0.78; 4-7) | 8.06 (0.87; 6-9) | −2.20 (−2.60 to −1.80) | <.001 | Significant |
| Liver | 5.75 (0.98; 4-7) | 7.94 (1.08; 6-9) | −2.19 (−2.59 to −1.78) | <.001 | Significant |
| Pancreas | 5.44 (0.70; 4-7) | 7.56 (0.98; 6-9) | −2.11 (−2.56 to −1.66) | <.001 | Significant |
| Colorectal | 5.51 (0.86; 4-7) | 7.74 (0.98; 6-9) | −2.23 (−2.48 to −1.98) | <.001 | Significant |
| Kidney | 5.51 (0.94; 4-7) | 7.90 (1.01; 6-9) | −2.39 (−2.65 to −2.13) | <.001 | Significant |
| Prostate | 5.49 (1.02; 4-7) | 7.84 (1.07; 6-9) | −2.35 (−2.78 to −1.93) | <.001 | Significant |
| Bladder | 5.58 (0.88; 4-7) | 8.00 (0.90; 6-9) | −2.42 (−2.79 to −2.05) | <.001 | Significant |
| Ovary | 5.39 (0.82; 4-7) | 7.72 (1.03; 6-9) | −2.35 (−2.56 to −2.13) | <.001 | Significant |
| Uterus | 5.52 (0.94; 4-7) | 7.79 (0.96; 6-9) | −2.27 (−2.58 to −1.96) | <.001 | Significant |
<sup>a</sup>Difference in mean comprehension score, calculated as ORRs minus IRRs (negative values indicate higher scores for the IRRs), with its 95% CI.
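
As a quick consistency check on the sign convention, the following Python sketch recomputes the difference column as the ORR mean minus the IRR mean for a few rows of Table 8. The function names (`mean_difference`, `check_rows`) and the rounding tolerance are illustrative assumptions, not part of the original analysis. The reported CIs and P values cannot be reproduced from summary statistics alone and are not recomputed here.

```python
# Reported values copied from Table 8 (a subset of rows, for illustration):
# site -> (ORR mean, IRR mean, reported difference)
rows = {
    "All sites": (5.51, 7.83, -2.32),
    "Esophagus": (5.70, 7.40, -1.70),
    "Colorectal": (5.51, 7.74, -2.23),
    "Kidney": (5.51, 7.90, -2.39),
}


def mean_difference(orr_mean: float, irr_mean: float) -> float:
    """Return ORR mean minus IRR mean, rounded to 2 decimals.

    Assumption: the table's negative differences imply ORR - IRR.
    """
    return round(orr_mean - irr_mean, 2)


def check_rows(table: dict, tolerance: float = 0.015) -> dict:
    """Flag whether each recomputed difference matches the reported one.

    The tolerance allows for the reported means having been rounded
    before publication (an assumption about how the table was built).
    """
    return {
        site: abs(mean_difference(orr, irr) - reported) <= tolerance
        for site, (orr, irr, reported) in table.items()
    }


if __name__ == "__main__":
    for site, ok in check_rows(rows).items():
        print(f"{site}: {'consistent' if ok else 'check'}")
```

A negative difference under this convention means the IRRs received the higher comprehension score.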