2024 Nov 18;8:129. doi: 10.1186/s41747-024-00527-0

Table 2.

Readers’ and model’s evaluation at the mammogram level

| Comparison | True positives | True negatives | Weighted accuracyᵃ | Cohen κ |
|---|---|---|---|---|
| Model versus Reader 1 | 26/32, 81% (67–94%) | 97/127, 76% (68–84%) | 0.027 + 0.813 = 0.840 (0.693–0.966) | 0.45 (0.30–0.59) |
| Model versus Reader 2 | 18/23, 78% (60–94%) | 98/136, 72% (64–80%) | 0.025 + 0.783 = 0.808 (0.623–0.972) | 0.31 (0.17–0.46) |
| Reader 2 versus Reader 1 | 17/32, 53% (35–70%) | 121/127, 95% (91–98%) | 0.033 + 0.530 = 0.563 (0.387–0.737) | 0.54 (0.35–0.70) |
| Model versus consensus | 26/32, 78% (67–83%) | 97/127, 75% (67–83%) | 0.25 + 0.579 = 0.829 (0.680–1.000) | 0.32 (0.17–0.49) |

ᵃ Class weights are inversely proportional to the respective class frequencies as assigned by Reader 1. Values in parentheses are 95% confidence intervals (CI). All p-values are below 0.001.
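As a rough cross-check of the Cohen κ column, the following Python sketch recomputes κ for the three pairwise comparisons from the true-positive and true-negative counts in the table. It is an illustration, not the authors' analysis code: the function name `cohen_kappa_2x2` is hypothetical, and the false-negative and false-positive counts are assumptions derived from the denominators (for example, 32 − 26 = 6 false negatives for the model versus Reader 1). The weighted accuracy and the confidence intervals are not reproduced, since the table does not state how they were computed.

```python
def cohen_kappa_2x2(tp: int, fn: int, tn: int, fp: int) -> float:
    """Cohen's kappa for a 2x2 agreement table between a rater and a reference."""
    n = tp + fn + tn + fp
    observed = (tp + tn) / n            # observed agreement
    rater_pos = tp + fp                 # positives called by the evaluated rater
    ref_pos = tp + fn                   # positives called by the reference rater
    # Agreement expected by chance, from the marginal totals
    expected = (rater_pos * ref_pos + (n - rater_pos) * (n - ref_pos)) / n**2
    return (observed - expected) / (1 - expected)


# (true positives, reference positives, true negatives, reference negatives)
# Counts taken from the table; FN and FP are derived, which is an assumption.
rows = {
    "Model versus Reader 1": (26, 32, 97, 127),
    "Model versus Reader 2": (18, 23, 98, 136),
    "Reader 2 versus Reader 1": (17, 32, 121, 127),
}

for name, (tp, pos, tn, neg) in rows.items():
    kappa = cohen_kappa_2x2(tp, pos - tp, tn, neg - tn)
    print(f"{name}: TPR={tp / pos:.2f}, TNR={tn / neg:.2f}, kappa={kappa:.3f}")
```

Under these assumptions the recomputed values fall within about 0.01 of the reported κ values for these three rows (0.45, 0.31 and 0.54).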