. 2026 Mar 18;6:1747783. doi: 10.3389/fradi.2026.1747783

Table 2.

Per-reader diagnostic performance without vs. with AI assistance.

| Reader (years of experience) | Metric | Without AI (95% CI) | With AI (95% CI) | Δ (95% CI) | p-value |
|---|---|---|---|---|---|
| Pooled | AUROC | 0.921 (0.889–0.954) | 0.953 (0.930–0.975) | +0.032 (0.013–0.050) | 0.002 |
| | AUPRC | 0.932 (0.921–0.943) | 0.933 (0.923–0.942) | +0.001 (−0.013–0.015) | 0.944 |
| | Sensitivity | 94.2% (90.6–96.5) | 96.3% (93.5–97.9) | +2.1% (−1.6–5.6) | 0.243 |
| | Specificity | 64.0% (57.9–69.6) | 71.6% (65.5–76.9) | +7.6% (−0.5–15.8) | 0.069 |
| | PPV | 58.4% (51.8–64.7) | 63.9% (57.1–70.2) | +5.5% (−3.7–14.8) | 0.243 |
| | NPV | 86.3% (80.1–90.9) | 90.8% (85.1–94.4) | +4.5% (−2.6–11.5) | 0.219 |
| | Accuracy | 79.1% (75.1–82.6) | 83.9% (80.2–87.0) | +4.8% (−2.4–11.9) | 0.061 |
| Reader 1 (17) | AUROC | 0.944 (0.918–0.970) | 0.952 (0.926–0.978) | +0.008 (−0.044–0.060) | 0.339 |
| | AUPRC | 0.922 (0.895–0.945) | 0.947 (0.927–0.965) | +0.025 (−0.003–0.057) | 0.495 |
| | Sensitivity | 90.7% (84.3–95.1) | 96.9% (92.3–99.1) | +6.2% (−2.8–14.8) | 0.013 |
| | Specificity | 75.2% (66.8–82.4) | 63.6% (54.6–71.9) | −11.6% (−27.8–5.1) | 0.005 |
| | PPV | 78.5% (71.1–84.8) | 72.7% (65.4–79.2) | −5.8% (−19.4–8.1) | 0.015 |
| | NPV | 89.0% (81.6–94.2) | 95.3% (88.5–98.7) | +6.3% (−5.7–17.1) | 0.019 |
| | Accuracy | 82.9% (77.8–87.3) | 80.2% (74.8–84.9) | −2.7% (−12.5–7.1) | 0.296 |
| Reader 2 (1) | AUROC | 0.913 (0.878–0.948) | 0.937 (0.908–0.966) | +0.024 (−0.040–0.088) | 0.078 |
| | AUPRC | 0.917 (0.890–0.942) | 0.917 (0.888–0.942) | +0.001 (−0.039–0.038) | 0.965 |
| | Sensitivity | 96.9% (92.3–99.1) | 94.6% (89.1–97.8) | −2.3% (−10.0–5.5) | 0.45 |
| | Specificity | 51.9% (43.0–60.8) | 72.9% (64.3–80.3) | +21.0% (3.5–37.3) | <.001 |
| | PPV | 66.8% (59.6–73.5) | 77.7% (70.4–84.0) | +10.9% (−3.1–24.4) | <.001 |
| | NPV | 94.4% (86.2–98.4) | 93.1% (86.2–97.2) | −1.3% (−12.2–11.0) | 0.66 |
| | Accuracy | 74.4% (68.6–79.6) | 83.7% (78.6–88.0) | +9.3% (−1.0–19.4) | <.001 |
| Reader 3 (14) | AUROC | 0.931 (0.902–0.960) | 0.974 (0.960–0.989) | +0.043 (0.000–0.087) | <.001 |
| | AUPRC | 0.931 (0.903–0.953) | 0.942 (0.920–0.960) | +0.011 (−0.018–0.043) | 0.546 |
| | Sensitivity | 91.5% (85.3–95.7) | 96.1% (91.2–98.7) | +4.6% (−4.5–13.4) | 0.114 |
| | Specificity | 73.6% (65.2–81.0) | 81.4% (73.6–87.7) | +7.8% (−7.4–22.5) | 0.034 |
| | PPV | 77.6% (70.2–84.0) | 83.8% (76.8–89.3) | +6.2% (−7.2–19.1) | 0.008 |
| | NPV | 85.4% (80.3–89.5) | 86.6% (81.8–90.5) | +1.2% (−7.7–10.2) | 0.604 |
| | Accuracy | 82.6% (77.4–87.0) | 88.8% (84.3–92.3) | +6.2% (−2.7–14.9) | 0.005 |
| Reader 4 (38) | AUROC | 0.942 (0.916–0.969) | 0.970 (0.953–0.987) | +0.028 (−0.016–0.071) | 0.008 |
| | AUPRC | 0.965 (0.949–0.977) | 0.938 (0.915–0.959) | −0.027 (−0.052–0.002) | 0.478 |
| | Sensitivity | 95.3% (90.2–98.3) | 97.7% (93.4–99.5) | +2.4% (−4.9–9.3) | 0.450 |
| | Specificity | 76.0% (67.7–83.1) | 72.9% (64.3–80.3) | −3.1% (−18.8–12.6) | 0.556 |
| | PPV | 79.9% (72.7–85.9) | 78.3% (71.1–84.4) | −1.6% (−14.8–11.7) | 0.533 |
| | NPV | 88.2% (83.5–92.0) | 86.6% (81.8–90.5) | −1.6% (−10.2–7.0) | 0.462 |
| | Accuracy | 85.7% (80.8–89.7) | 85.3% (80.3–89.4) | −0.4% (−9.4–8.6) | 1 |
| Reader 5 (1) | AUROC | 0.890 (0.851–0.930) | 0.941 (0.912–0.970) | +0.051 (−0.018–0.119) | 0.003 |
| | AUPRC | 0.948 (0.913–0.974) | 0.908 (0.871–0.938) | −0.04 (−0.086–0.004) | 0.531 |
| | Sensitivity | 96.9% (92.3–99.1) | 97.7% (93.4–99.5) | +0.8% (−5.7–7.2) | 1.000 |
| | Specificity | 41.1% (32.5–50.1) | 61.2% (52.3–69.7) | +20.1% (2.2–37.2) | <.001 |
| | PPV | 62.2% (55.1–68.9) | 71.6% (64.3–78.1) | +9.4% (−4.6–23.0) | <.001 |
| | NPV | 88.6% (82.7–93.0) | 86.6% (81.8–90.5) | −2.0% (−11.2–7.8) | 0.436 |
| | Accuracy | 69.0% (63.0–74.6) | 79.5% (74.0–84.2) | +10.5% (−0.6–21.2) | <.001 |
| Reader 6 (1) | AUROC | 0.903 (0.865–0.941) | 0.937 (0.904–0.970) | +0.034 (−0.037–0.105) | 0.081 |
| | AUPRC | 0.911 (0.880–0.937) | 0.945 (0.922–0.965) | +0.035 (−0.003–0.072) | 0.493 |
| | Sensitivity | 93.8% (88.1–97.3) | 94.6% (89.1–97.8) | +0.8% (−8.2–9.7) | 1.000 |
| | Specificity | 65.9% (57.0–74.0) | 77.5% (69.3–84.4) | +11.6% (−4.7–27.4) | 0.012 |
| | PPV | 73.3% (65.9–79.9) | 80.8% (73.6–86.7) | +7.5% (−6.3–20.8) | 0.006 |
| | NPV | 86.5% (81.5–90.6) | 86.6% (81.8–90.5) | +0.1% (−8.8–9.0) | 0.967 |
| | Accuracy | 79.8% (74.4–84.6) | 86.0% (81.2–90.0) | +6.2% (−3.4–15.6) | 0.015 |

AUROC, area under the receiver operating characteristic curve; AUPRC, area under the precision–recall curve; CI, confidence interval; PPV, positive predictive value; NPV, negative predictive value; Δ = absolute change (With AI minus Without AI).
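
For readers who want to check the arithmetic behind the table, the following is a minimal Python sketch, not the authors' analysis code. It shows the textbook definitions of sensitivity, specificity, PPV, NPV, and accuracy from a 2x2 confusion matrix, and recomputes Δ (With AI minus Without AI) for the pooled point estimates of Table 2. The function names (`diagnostic_metrics`, `absolute_change`, `pooled_deltas_match`) and the example counts are hypothetical and chosen only for illustration; the confidence intervals and p-values in the table are not reproduced here because the source does not state how they were computed.

```python
"""Sketch: standard diagnostic metrics and the absolute change (delta) in Table 2."""


def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Textbook definitions from a 2x2 confusion matrix (counts supplied by the caller)."""
    return {
        "sensitivity": tp / (tp + fn),            # true positive rate
        "specificity": tn / (tn + fp),            # true negative rate
        "ppv": tp / (tp + fp),                    # positive predictive value
        "npv": tn / (tn + fn),                    # negative predictive value
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


def absolute_change(without_ai: float, with_ai: float, ndigits: int) -> float:
    """Delta = With AI minus Without AI, rounded to the reporting precision."""
    return round(with_ai - without_ai, ndigits)


# Pooled point estimates copied from Table 2:
# metric -> (without AI, with AI, reported delta, decimals reported)
POOLED = {
    "AUROC": (0.921, 0.953, 0.032, 3),
    "AUPRC": (0.932, 0.933, 0.001, 3),
    "Sensitivity": (94.2, 96.3, 2.1, 1),
    "Specificity": (64.0, 71.6, 7.6, 1),
    "PPV": (58.4, 63.9, 5.5, 1),
    "NPV": (86.3, 90.8, 4.5, 1),
    "Accuracy": (79.1, 83.9, 4.8, 1),
}


def pooled_deltas_match(tolerance: float = 1e-9) -> dict[str, bool]:
    """Check that each recomputed pooled delta equals the reported one."""
    return {
        name: abs(absolute_change(before, after, nd) - reported) < tolerance
        for name, (before, after, reported, nd) in POOLED.items()
    }


if __name__ == "__main__":
    # Hypothetical counts, for illustration only (not taken from the study).
    print(diagnostic_metrics(tp=90, fp=20, tn=80, fn=10))
    print(pooled_deltas_match())
```

Because the published point estimates are themselves rounded, recomputing Δ from them can differ from the reported Δ in the last digit for some per-reader rows; such a difference is consistent with rounding and does not by itself indicate an error.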