Table 4.
Reader or CAD product | Cut-off Score | TP | FP | FN | TN | Sensitivity (95% CI) | Specificity (95% CI) | Accuracy (95% CI) |
---|---|---|---|---|---|---|---|---|
Intermediate Reader | N/A | 109 | 386 | 24 | 513 | 82.0% (74.4–88.1) | 57.1% (53.8–60.3) | 60.3% (57.2–63.3) |
Abnormality scores obtained by FIT | | | | | | | | |
Qure.ai | 76.5 | 109 | 307 | 24 | 592 | 82.0% (74.4–88.1) | 65.9% (62.7–69.0) | 67.9% (65.0–70.8) |
Delft Imaging | 64.7 | 109 | 309 | 24 | 590 | 82.0% (74.4–88.1) | 65.6% (62.4–68.7) | 67.7% (64.8–70.6) |
DeepTek | 55.7 | 109 | 331 | 24 | 568 | 82.0% (74.4–88.1) | 63.2% (59.9–66.3) | 65.6% (62.6–68.5) |
Abnormality scores provided by CAD company | | | | | | | | |
Lunit | 20.7 | 109 | 314 | 24 | 585 | 82.0% (74.4–88.1) | 65.1% (61.9–68.2) | 67.2% (64.3–70.1) |
JF Healthcare | 98.3 | 109 | 379 | 24 | 520 | 82.0% (74.4–88.1) | 57.8% (54.5–61.1) | 60.9% (57.9–63.9) |
InferVision | 77.4 | 109 | 387 | 24 | 512 | 82.0% (74.4–88.1) | 57.0% (53.6–60.2) | 60.2% (57.1–63.2) |
OXIPIT | 23.8 | 109 | 441 | 24 | 458 | 82.0% (74.4–88.1) | 50.9% (47.6–54.3) | 54.9% (51.9–58.0) |
Artelus | 5.6 | 109 | 492 | 24 | 407 | 82.0% (74.4–88.1) | 45.3% (42.0–48.6) | 50.0% (46.9–53.1) |
EPCON | 11.7 | 109 | 547 | 24 | 352 | 82.0% (74.4–88.1) | 39.2% (36.0–42.4) | 44.7% (41.6–47.8) |
COTO | 12.2 | 109 | 568 | 24 | 331 | 82.0% (74.4–88.1) | 36.8% (33.7–40.1) | 42.6% (39.6–45.7) |
Dr CADx | 64.1 | 108 | 713 | 25 | 186 | 81.2% (73.5–87.5)╪ | 20.7% (18.1–23.5) | 28.5% (25.8–31.4) |
SemanticMD | 0.9 | 109 | 714 | 24 | 185 | 82.0% (74.4–88.1) | 20.6% (18.0–23.4) | 28.5% (25.8–31.4) |
TP: true positive; FP: false positive; FN: false negative; TN: true negative; CI: confidence interval.
Bolded figures indicate performance significantly higher than the Intermediate Reader.
╪It was impossible to select a cut-off score achieving 109 true positives, as two Xpert-positive participants have the same score.
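
The percentages in each row follow from its four counts. The sketch below recomputes them under the standard definitions (sensitivity = TP / (TP + FN), specificity = TN / (TN + FP), accuracy = (TP + TN) / total). The names `ConfusionCounts` and `as_percent` are hypothetical and used only for illustration. The confidence intervals are not reproduced, because the table does not state which interval method was used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Hypothetical container for one table row's TP/FP/FN/TN counts."""
    tp: int
    fp: int
    fn: int
    tn: int

    @property
    def sensitivity(self) -> float:
        # Share of reference-positive participants flagged as abnormal.
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        # Share of reference-negative participants flagged as normal.
        return self.tn / (self.tn + self.fp)

    @property
    def accuracy(self) -> float:
        # Share of all participants classified correctly.
        total = self.tp + self.fp + self.fn + self.tn
        return (self.tp + self.tn) / total


def as_percent(value: float) -> float:
    """Express a proportion as a percentage rounded to one decimal place."""
    return round(100 * value, 1)


# Counts copied from the Qure.ai and Dr CADx rows of the table.
qure = ConfusionCounts(tp=109, fp=307, fn=24, tn=592)
dr_cadx = ConfusionCounts(tp=108, fp=713, fn=25, tn=186)

for name, row in [("Qure.ai", qure), ("Dr CADx", dr_cadx)]:
    print(
        name,
        as_percent(row.sensitivity),
        as_percent(row.specificity),
        as_percent(row.accuracy),
    )
```

The same counts also allow a quick consistency check: in every row, FP + TN should equal the 899 reference-negative participants and TP + FN the 133 reference-positive participants.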