2024 Jul 30;14(8):5845–5860. doi: 10.21037/qims-24-729

Table 4. Performance of independent and AI-assisted clinicians on the internal, external, and prospective test sets.

| Data set | Clinician | Accuracy, independent | Accuracy, AI-assisted | Accuracy, P value | Sensitivity, independent | Sensitivity, AI-assisted | Sensitivity, P value | Specificity, independent | Specificity, AI-assisted | Specificity, P value |
|---|---|---|---|---|---|---|---|---|---|---|
| Internal test set | Rad1 | 72.2 (60.4–82.1) | 81.9 (71.1–90.0) | 0.0003 | 66.7 (51.0–80.0) | 77.8 (62.9–88.8) | 0.0007 | 81.5 (62.0–93.7) | 88.9 (70.8–97.6) | 0.007 |
| Internal test set | Rad2 | 75.0 (63.4–84.5) | 86.1 (75.9–93.1) | 0.0002 | 68.9 (53.4–81.8) | 82.2 (68.0–92.0) | 0.0006 | 85.2 (66.3–95.8) | 92.6 (75.7–99.1) | 0.042 |
| Internal test set | Rad3 | 83.3 (70.0–90.1) | 88.9 (79.3–95.1) | 0.008 | 80.0 (65.4–90.4) | 86.7 (73.2–95.0) | 0.032 | 85.0 (62.1–96.8) | 92.6 (75.7–99.1) | 0.008 |
| Internal test set | Rad4 | 88.9 (79.3–95.1) | 94.4 (86.4–98.5) | 0.033 | 86.7 (73.2–94.9) | 93.3 (81.7–98.6) | 0.047 | 92.6 (75.7–99.1) | 96.3 (81.0–99.9) | 0.17 |
| External test set | Rad1 | 66.7 (52.5–78.9) | 74.1 (60.3–85.0) | 0.0004 | 68.0 (46.5–85.1) | 69.0 (49.2–84.7) | 0.008 | 65.5 (45.7–82.1) | 80.0 (59.3–93.2) | 0.0006 |
| External test set | Rad2 | 70.3 (56.4–82.0) | 77.8 (64.4–88.0) | 0.0002 | 80.0 (59.3–93.2) | 84.0 (63.9–95.5) | 0.01 | 62.1 (42.3–79.3) | 72.4 (52.7–87.3) | 0.009 |
| External test set | Rad3 | 77.8 (64.4–88.0) | 85.2 (72.9–93.4) | 0.008 | 84.0 (63.9–95.5) | 92.0 (74.0–99.0) | 0.023 | 72.4 (52.8–87.3) | 79.3 (60.3–92.0) | 0.027 |
| External test set | Rad4 | 79.6 (66.5–89.4) | 90.7 (79.7–96.9) | 0.039 | 84.0 (63.9–95.5) | 96.0 (79.6–99.9) | 0.28 | 75.9 (56.5–89.7) | 86.2 (68.3–96.1) | 0.041 |
| Prospective test set | Rad2 | 67.8 (56.9–77.4) | 79.3 (69.3–87.3) | 0.016 | 75.5 (61.7–86.2) | 81.1 (68.0–90.6) | 0.022 | 55.9 (37.9–72.8) | 76.5 (58.8–89.3) | 0.031 |
| Prospective test set | Rad3 | 75.9 (65.5–84.4) | 82.8 (73.2–90.0) | 0.007 | 84.9 (72.4–93.3) | 84.9 (72.4–93.3) | 0.040 | 61.8 (43.6–77.8) | 79.4 (62.1–91.3) | 0.19 |
| Prospective test set | Rheu1 | 76.2 (68.0–86.3) | 83.9 (74.5–91.0) | 0.006 | 77.2 (65.9–89.2) | 84.9 (72.4–93.3) | 0.006 | 76.5 (58.8–89.3) | 83.4 (65.5–93.2) | 0.005 |
| Prospective test set | Rheu2 | 86.2 (77.1–92.7) | 88.5 (79.9–96.9) | 0.044 | 88.7 (77.0–95.7) | 90.6 (79.3–96.9) | 0.53 | 82.4 (65.5–93.2) | 85.3 (68.9–95.0) | 0.36 |

Accuracy, sensitivity, and specificity are expressed as percentages. Data in brackets are 95% confidence intervals. P<0.05 was considered significant. Rad1 and Rad2 are junior radiologists. Rad3 and Rad4 are senior radiologists. Rheu1 and Rheu2 are junior and senior rheumatologists, respectively. AI, artificial intelligence.
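
As a reading aid, the sketch below shows one way that percentages and 95% confidence intervals of the kind reported in the table could be computed from a clinician's confusion counts. It rests on several assumptions: the table does not give the underlying counts or name the interval method, so the exact (Clopper-Pearson) binomial interval used here is an assumed choice. The function names `diagnostic_metrics` and `clopper_pearson` and the example counts are hypothetical and do not come from the source.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """Return P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _root_of_decreasing(f, lo: float = 0.0, hi: float = 1.0, iters: int = 60) -> float:
    """Bisection for a function that falls from positive to negative on [lo, hi]."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k: int, n: int, alpha: float = 0.05) -> tuple[float, float]:
    """Exact two-sided binomial interval for k successes in n trials.

    Assumed method: the source does not say how its intervals were derived.
    """
    # Lower bound: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else _root_of_decreasing(
        lambda p: alpha / 2 - (1 - binom_cdf(k - 1, n, p))
    )
    # Upper bound: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else _root_of_decreasing(
        lambda p: binom_cdf(k, n, p) - alpha / 2
    )
    return lower, upper


def diagnostic_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Return (estimate, lower, upper) for accuracy, sensitivity, specificity."""
    pairs = {
        "accuracy": (tp + tn, tp + fn + tn + fp),
        "sensitivity": (tp, tp + fn),  # true positives among diseased cases
        "specificity": (tn, tn + fp),  # true negatives among non-diseased cases
    }
    return {name: (k / n, *clopper_pearson(k, n)) for name, (k, n) in pairs.items()}


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not reported in the source).
    for name, (est, lo, hi) in diagnostic_metrics(tp=30, fn=15, tn=22, fp=5).items():
        print(f"{name}: {100 * est:.1f} ({100 * lo:.1f}-{100 * hi:.1f})")
```

With these hypothetical counts the point estimates are 72.2% accuracy, 66.7% sensitivity, and 81.5% specificity. The P values in the table compare independent and AI-assisted reads of the same cases, which would need a paired test that this sketch does not cover.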