2025 Sep 12;28(10):113559. doi: 10.1016/j.isci.2025.113559

Table 3.

Observed agreement between manual extraction and data extraction with ChatGPT, Elicit, and SciSpace for creating the summary table

Metric ChatGPT Elicit SciSpace Number of applicable reviews Kappa [95% CI] between tools
First author 97.22% 96.88% 84.85% 4 κ = −0.16 [−0.265 to 0.048]
Year of publication 100% 62.5% 63.64% 4 κ = −0.11 [−0.288 to 0.067]
Study design 37.5% 41.67% 30.77% 2 κ = 0.73 [0.435 to 1]
Country of publication 90% 100% 75% 1 κ = −0.09 [−0.24 to 0.06]
Sample size 69.44% 81.25% 69.7% 4 κ = 0.44 [0.24 to 0.66]
Age 55.56% 31.25% 42.42% 4 κ = 0.42 [0.18 to 0.67]
Sex 55.56% 46.88% 42.42% 4 κ = 0.21 [−0.005 to 0.42]
BMI 70% 20% 30% 1 κ = 0.23 [−0.22 to 0.68]
Population 15% 83.33% 39.89% 2 κ = 0.13 [−0.14 to 0.40]
Treatment modality 90% 70% 50% 1 κ = 0.13 [−0.23 to 0.49]
Interpretation results 70% 70% 40% 1 κ = 0.45 [−0.01 to 0.91]
Microbiome data (alpha-diversity) 30% 80% 40% 1 κ = 0.20 [−0.23 to 0.62]
Microbiome data (beta-diversity) 50% 70% 60% 1 κ = 0.38 [0.02 to 0.75]
Microbiome data (relative abundance) 10% 50% 10% 1 κ = −0.08 [−0.24 to 0.07]
Type of rehabilitation program 0% 100% 20% 1 κ = −0.37 [−0.62 to −0.12]
Comparator intervention 83.33% 100% 100% 1
Work participation 0% 0% 0% 1 κ = 0.11 [−0.64 to 0.86]
Type of neuromodulation 70% 100% 87.5% 1 κ = 0.45 [0.34 to 0.57]
Work-related outcomes 60% 75% 62.5% 1 κ = 0.81 [0.39 to 1]
Follow-up intervals 70% 75% 87.5% 1 κ = 0.7 [−0.06 to 1]
Work status at baseline 30% 62.5% 75% 1 κ = 0.55 [−0.02 to 1]
Work status after neuromodulation 20% 50% 50% 1 κ = 0.49 [0.05 to 0.94]
Percentage return to work 10% 25% 12.5% 1 κ = 0.51 [−0.28 to 1]

Abbreviations. BMI: body mass index, CI: confidence interval.
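
The two kinds of figures in Table 3 can be illustrated with a short sketch. The table does not state which kappa statistic was used for the between-tool comparison, so the example below assumes Fleiss' kappa, a common choice when more than two raters (here, three tools) are compared. The function names and the toy data are hypothetical and are not taken from the study.

```python
from collections import Counter


def observed_agreement(manual, extracted):
    """Percentage of items where a tool's value equals the manual value."""
    if len(manual) != len(extracted):
        raise ValueError("both lists must have the same length")
    matches = sum(m == e for m, e in zip(manual, extracted))
    return 100.0 * matches / len(manual)


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of items, each rated once by every rater.

    `ratings` is a list of per-item lists, e.g. [[1, 1, 0], ...], where each
    inner list holds one label per rater (assumption: same raters per item).
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    categories = sorted({label for item in ratings for label in item})
    counts = [Counter(item) for item in ratings]

    # Per-item agreement: share of rater pairs that chose the same category.
    p_items = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_bar = sum(p_items) / n_items

    # Chance agreement from the overall share of each category.
    p_cats = [sum(c[cat] for c in counts) / (n_items * n_raters) for cat in categories]
    p_e = sum(p * p for p in p_cats)

    if p_e == 1.0:
        # Every rating falls in one category: kappa is undefined.
        return float("nan")
    return (p_bar - p_e) / (1.0 - p_e)


# Hypothetical toy data: first-author field for four included studies.
manual = ["Smith", "Lee", "Chen", "Diaz"]
tool = ["Smith", "Lee", "Chan", "Diaz"]
print(observed_agreement(manual, tool))  # 75.0

# Hypothetical correctness flags (1 = matches manual) for three tools per study.
flags = [[1, 1, 1], [1, 1, 0], [0, 0, 0], [1, 0, 0]]
print(round(fleiss_kappa(flags), 3))  # 0.333
```

In this sketch a tool can score high observed agreement with the manual extraction while kappa between tools stays low or negative, because kappa discounts the agreement expected by chance when nearly all ratings fall in one category.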