2024 Jan 17;10:20552076241227132. doi: 10.1177/20552076241227132

Table 4.

Fleiss’ kappa values.

| Fleiss’ K (95% CI) | Between four human raters: All cases | Between four human raters: Text | Between four human raters: No text | Human raters and ChatGPT3.5: All cases | Human raters and ChatGPT3.5: Text | Human raters and ChatGPT3.5: No text | Human raters and ChatGPT4.0: All cases | Human raters and ChatGPT4.0: Text | Human raters and ChatGPT4.0: No text |
|---|---|---|---|---|---|---|---|---|---|
| Overall | .646 (.610–.682) | .577 (.522–.631) | .702 (.654–.750) | .320 (.294–.346) | .272 (.233–.310) | .355 (.320–.391) | .523 (.496–.551) | .482 (.441–.524) | .546 (.508–.583) |
| Level 1 (resuscitation) | .696 (.639–.752) | .488 (.409–.568) | .828 (.748–.908) | .182 (.138–.226) | .067 (.006–.129) | .294 (.232–.356) | .565 (.522–.609) | .322 (.261–.384) | .750 (.688–.812) |
| Level 2 (emergent) | .710 (.654–.767) | .671 (.591–.750) | .743 (.663–.823) | .281 (.238–.325) | .282 (.221–.343) | .256 (.194–.318) | .600 (.557–.644) | .565 (.503–.626) | .610 (.548–.672) |
| Level 3 (urgent) | .593 (.537–.650) | .539 (.459–.618) | .649 (.569–.729) | .359 (.316–.403) | .333 (.272–.394) | .386 (.324–.448) | .443 (.400–.487) | .440 (.378–.501) | .446 (.384–.508) |
| Level 4 (less urgent) | .616 (.560–.673) | .505 (.426–.584) | .685 (.605–.765) | .429 (.385–.473) | .358 (.296–.419) | .467 (.405–.529) | .592 (.485–.573) | .468 (.406–.529) | .559 (.497–.621) |
| Level 5 (non-urgent) | .660 (.604–.717) | .330 (.251–.409) | .708 (.628–.788) | .492 (.449–.536) | .247 (.186–.308) | .526 (.464–.588) | .464 (.421–.508) | .247 (.186–.308) | .481 (.419–.543) |
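
For readers unfamiliar with the statistic, the following is a minimal sketch of how an overall Fleiss' kappa can be computed from raw ratings. It uses the standard Fleiss formula and is not taken from the authors' analysis. The function name `fleiss_kappa`, the input layout (one list of labels per case, one label per rater), and the toy ratings are all assumptions made for illustration. The per-level values and the 95% confidence intervals reported in the table are not covered here.

```python
from collections import Counter


def fleiss_kappa(ratings, categories):
    """Overall Fleiss' kappa (hypothetical helper, illustration only).

    ratings    -- list of cases; each case is a list with one label per rater.
                  Every case must have the same number of raters (>= 2),
                  and every label must appear in `categories`.
    categories -- all possible labels, e.g. triage levels 1 to 5.
    """
    n_cases = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(case) != n_raters for case in ratings):
        raise ValueError("each case needs the same number (>= 2) of ratings")

    # n_ij: how many raters put case i into category j.
    counts = [Counter(case) for case in ratings]

    # p_j: share of all assignments that fell into category j.
    p = [sum(c[cat] for c in counts) / (n_cases * n_raters) for cat in categories]

    # P_i: proportion of agreeing rater pairs for case i.
    p_i = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]

    p_bar = sum(p_i) / n_cases      # mean observed agreement
    p_e = sum(x * x for x in p)     # agreement expected by chance
    # Undefined (division by zero) if every rating falls in a single category.
    return (p_bar - p_e) / (1 - p_e)


# Toy data (invented): 4 cases, each rated by 5 raters on levels 1 to 5.
toy_ratings = [
    [1, 1, 1, 2, 1],
    [3, 3, 2, 3, 3],
    [5, 4, 5, 5, 4],
    [2, 2, 2, 2, 3],
]
print(round(fleiss_kappa(toy_ratings, [1, 2, 3, 4, 5]), 3))
```

A value of 1 indicates perfect agreement and values near 0 indicate agreement no better than chance, which is the scale on which the table entries should be read.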