2025 Nov 10;10:1684137. doi: 10.3389/frma.2025.1684137

Table 4.

Per-article appraisal time for ChatGPT-4.0 and DeepSeek R1 compared with manual evaluation (seconds).

Tool      Model         Minimum   Maximum   Mean       95% CI
AMSTAR2   ChatGPT-4.0   12.83     24.00     16.94      15.98–17.99
          DeepSeek R1   31.28     111.69    65.02      56.25–73.78
          Manual        –         –         1,200.00   –
CASP      ChatGPT-4.0   18.08     28.21     22.26      20.98–23.55
          DeepSeek R1   28.97     157.97    58.22      43.91–72.52
          Manual        –         –         750.00     –
PEDro     ChatGPT-4.0   13.41     25.32     18.41      17.16–19.66
          DeepSeek R1   31.83     64.26     44.95      41.66–48.25
          Manual        –         –         2,700.00   –
ROB2      ChatGPT-4.0   15.16     25.63     19.13      17.56–20.70
          DeepSeek R1   29.83     61.57     40.92      36.62–45.23
          Manual        –         –         1,680.00   –
Overall   ChatGPT-4.0   12.83     28.21     19.19      18.16–19.59
          DeepSeek R1   28.97     157.97    52.28      48.50–57.23
          Manual        –         –         1,582.50   –
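The practical implication of the table is the speedup ratio: dividing the manual per-article time by each model's mean time. A minimal sketch (not part of the paper's analysis) computing those ratios from the overall means in Table 4; the `mean_seconds` mapping and `speedup` helper are illustrative names, not from the source:

```python
# Speedup of LLM-assisted appraisal over manual evaluation, computed
# from the overall mean per-article times in Table 4 (seconds).
mean_seconds = {
    "ChatGPT-4.0": 19.19,
    "DeepSeek R1": 52.28,
    "Manual": 1582.50,
}

def speedup(model: str, baseline: str = "Manual") -> float:
    """Ratio of baseline time to model time (how many times faster)."""
    return mean_seconds[baseline] / mean_seconds[model]

for model in ("ChatGPT-4.0", "DeepSeek R1"):
    print(f"{model}: {speedup(model):.1f}x faster than manual")
# ChatGPT-4.0: 82.5x faster than manual
# DeepSeek R1: 30.3x faster than manual
```

By this measure ChatGPT-4.0 appraises an article roughly 82 times faster than manual evaluation, and DeepSeek R1 roughly 30 times faster, though the wider DeepSeek R1 range (28.97–157.97 s) makes its per-article time less predictable.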