2023 Nov 20;20:30. doi: 10.3352/jeehp.2023.20.30

Table 3.

Factors associated with correct answers provided by chatbots in a bivariate logistic regression model

| Variable | GPT-4 | Bing | Claude | Bard | GPT-3 |
|---|---|---|---|---|---|
| Area | | | | | |
| Surgery | Ref | Ref | Ref | Ref | Ref |
| Internal medicine | 0.37 (0.02 to 2.25) | 2.21 (0.60 to 7.63) | 2.17 (0.82 to 5.64) | 0.79 (0.28 to 2.07) | 1.16 (0.42 to 3.00) |
| Pediatrics | 0.13 (0.01 to 1.02) | 0.36 (0.09 to 1.37) | 1.08 (0.27 to 3.23) | 1.82 (0.15 to 1.99) | 0.53 (0.15 to 1.83) |
| Obstetrics & gynecology | 0.13 (0.01 to 0.82) | 1.53 (0.36 to 6.86) | 0.71 (0.24 to 2.04) | 0.42 (0.13 to 1.27) | 0.88 (0.28 to 2.70) |
| Public health | 0.23 (0.11 to 1.96) | 0.72 (0.17 to 3.02) | 3.53 (0.90 to 17.79) | 1.49 (0.38 to 6.50) | 1.05 (0.30 to 3.83) |
| Emergency medicine | 0.27 (0.01 to 7.38) | Not estimable | 4.12 (0.60 to 82.89) | 2.45 (0.34 to 50.03) | 0.42 (0.08 to 2.17) |
| Peruvian knowledge | | | | | |
| Not required | Ref | Ref | Ref | Ref | Ref |
| Required | 0.23 (0.09 to 0.61) a) | 0.65 (0.26 to 1.78) | 0.94 (0.42 to 2.21) | 0.67 (0.31 to 1.50) | 0.67 (0.31 to 1.50) |
| Type of item | | | | | |
| Recall | Ref | Ref | Ref | Ref | Ref |
| Application of knowledge | 2.25 (0.84 to 5.71) | 1.02 (0.35 to 2.60) | 0.61 (0.25 to 1.39) | 0.43 (0.16 to 0.99) | 0.88 (0.39 to 1.89) |

Values are presented as odds ratio (95% confidence interval).

Ref, reference.

a) The odds ratio was statistically significant.
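
To make the "odds ratio (95% confidence interval)" format concrete, the sketch below computes an odds ratio against a reference category from a 2x2 table of correct and incorrect answers, with a Wald interval on the log-odds scale. For a single binary predictor, this odds ratio equals the one from a bivariate logistic regression. The function name `odds_ratio_wald_ci` and the counts are hypothetical and are not taken from the study, and the Wald interval is an assumption: the article may have used a different interval method, so the output is not expected to reproduce the table.

```python
import math


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald 95% CI from a 2x2 table.

    a: correct answers in the category of interest
    b: incorrect answers in the category of interest
    c: correct answers in the reference category
    d: incorrect answers in the reference category
    """
    if min(a, b, c, d) == 0:
        # With an empty cell the odds ratio or its standard error is undefined.
        raise ValueError("odds ratio is not estimable with a zero cell")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Wald approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only (not data from the article).
or_, lo, hi = odds_ratio_wald_ci(a=20, b=10, c=30, d=5)
significant = not (lo <= 1.0 <= hi)  # CI excluding 1 implies p < 0.05
print(f"{or_:.2f} ({lo:.2f} to {hi:.2f}), significant: {significant}")
```

An interval that excludes 1 corresponds to a statistically significant odds ratio at the 5% level. A zero cell is one common reason for an estimate to be reported as not estimable, although whether that applies to any cell in this table is an assumption.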