2025 Apr 21;14(6):1281–1295. doi: 10.1007/s40123-025-01142-x

Table 4.

Readability of original online resources and performance of LLMs for improving the readability of the original online resources

| Readability metrics | Original resources | ChatGPT-4o (01 preview) | p value (Orig. vs GPT-4o) | ChatGPT-3.5 | p value (Orig. vs GPT-3.5) | Google Gemini | p value (Orig. vs Gemini) |
|---|---|---|---|---|---|---|---|
| Syllables | 1457.2 (736.9) | 330.4 (159.6) | < 0.001 | 984.5 (412.7) | 0.008 | 808.4 (205.7) | < 0.001 |
| Words | 871.0 (392.7) | 230.4 (107.9) | < 0.001 | 649.2 (256.2) | 0.21 | 521.9 (122.4) | < 0.001 |
| 3+ syllable words | 91.0 (60.7) | 13.3 (10.4) | < 0.001 | 43.0 (23.1) | 0.001 | 38.5 (12.6) | < 0.001 |
| Sentences | 41.5 (17.1) | 18.4 (7.8) | 0.04 | 41.0 (16.6) | 0.9 | 31.1 (9.0) | 0.02 |
| SMOG Readability Score | 10.3 (2.2) | 5.3 (1.6) | < 0.001 | 7.6 (1.2) | < 0.001 | 7.8 (1.3) | < 0.001 |
| Flesch-Kincaid Grade Level | 9.7 (1.9) | 5.8 (1.5) | < 0.001 | 7.7 (1.4) | < 0.001 | 7.5 (1.1) | < 0.001 |

LLM: large language model; SMOG: Simple Measure of Gobbledygook
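The count rows in the table (syllables, words, 3+ syllable words, sentences) are the inputs to the two grade-level scores it reports. As a minimal sketch, the commonly published SMOG and Flesch-Kincaid Grade Level formulas can be written as below. This assumes the standard formulas; the table does not state which tool or variant the authors used, and the function names are hypothetical. Applying the formulas to the table's averaged counts will not reproduce the reported scores, presumably because the scores were computed per resource and then averaged.

```python
import math


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch-Kincaid Grade Level formula (assumed variant)."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def smog_grade(polysyllables: int, sentences: int) -> float:
    """Standard SMOG grade formula (assumed variant).

    polysyllables is the count of words with three or more syllables,
    normalized to a 30-sentence sample.
    """
    if sentences <= 0:
        raise ValueError("sentences must be positive")
    return 1.0430 * math.sqrt(polysyllables * (30 / sentences)) + 3.1291


if __name__ == "__main__":
    # Hypothetical counts for a single document, not taken from the table.
    words, sentences, syllables, polysyllables = 100, 10, 150, 12
    print(round(flesch_kincaid_grade(words, sentences, syllables), 2))
    print(round(smog_grade(polysyllables, sentences), 2))
```

Both scores fall when sentences get shorter and long words get rarer, which is consistent with the direction of the changes in every LLM column of the table.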