2021 Dec 13;11:23895. doi: 10.1038/s41598-021-03265-0

Table 4.

CAD software performance when matching the sensitivity of the Intermediate Reader.

	Cut-off Score	TP	FP	FN	TN	Sensitivity (95% CI)	Specificity (95% CI)	Accuracy (95% CI)
Intermediate Reader	N/A	109	386	24	513	82.0% (74.4–88.1)	57.1% (53.8–60.3)	60.3% (57.2–63.3)
Abnormality scores obtained by FIT
Qure.ai	76.5	109	307	24	592	82.0% (74.4–88.1)	65.9% (62.7–69.0)	67.9% (65.0–70.8)
Delft Imaging	64.7	109	309	24	590	82.0% (74.4–88.1)	65.6% (62.4–68.7)	67.7% (64.8–70.6)
DeepTek	55.7	109	331	24	568	82.0% (74.4–88.1)	63.2% (59.9–66.3)	65.6% (62.6–68.5)
Abnormality scores provided by CAD company
Lunit	20.7	109	314	24	585	82.0% (74.4–88.1)	65.1% (61.9–68.2)	67.2% (64.3–70.1)
JF Healthcare	98.3	109	379	24	520	82.0% (74.4–88.1)	57.8% (54.5–61.1)	60.9% (57.9–63.9)
InferVision	77.4	109	387	24	512	82.0% (74.4–88.1)	57.0% (53.6–60.2)	60.2% (57.1–63.2)
OXIPIT	23.8	109	441	24	458	82.0% (74.4–88.1)	50.9% (47.6–54.3)	54.9% (51.9–58.0)
Artelus	5.6	109	492	24	407	82.0% (74.4–88.1)	45.3% (42.0–48.6)	50.0% (46.9–53.1)
EPCON	11.7	109	547	24	352	82.0% (74.4–88.1)	39.2% (36.0–42.4)	44.7% (41.6–47.8)
COTO	12.2	109	568	24	331	82.0% (74.4–88.1)	36.8% (33.7–40.1)	42.6% (39.6–45.7)
Dr CADx	64.1	108	713	25	186	81.2% (73.5–87.5)^╪	20.7% (18.1–23.5)	28.5% (25.8–31.4)
SemanticMD	0.9	109	714	24	185	82.0% (74.4–88.1)	20.6% (18.0–23.4)	28.5% (25.8–31.4)

TP true positive, FP false positive, FN false negative, TN true negative.
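
The sensitivity, specificity and accuracy columns follow directly from the four counts in each row. A minimal sketch in Python, using the Intermediate Reader row as input; the function name `summarise` is illustrative and not from the study, and the confidence intervals are left out because the interval method is not stated in this table.

```python
def summarise(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Return sensitivity, specificity and accuracy as percentages (1 d.p.)."""
    sensitivity = tp / (tp + fn)                 # share of reference-positives detected
    specificity = tn / (tn + fp)                 # share of reference-negatives cleared
    accuracy = (tp + tn) / (tp + fp + fn + tn)   # share of all readings that were correct
    return {
        "sensitivity": round(100 * sensitivity, 1),
        "specificity": round(100 * specificity, 1),
        "accuracy": round(100 * accuracy, 1),
    }


# Intermediate Reader row: TP=109, FP=386, FN=24, TN=513
print(summarise(tp=109, fp=386, fn=24, tn=513))
# {'sensitivity': 82.0, 'specificity': 57.1, 'accuracy': 60.3}
```

Because every row shares the same 133 reference-positive and 899 reference-negative participants, TP + FN and FP + TN should give those totals in each row, which is a quick consistency check on the counts.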

Bolded figures indicate performance significantly higher than the Intermediate Reader.

^╪No cut-off score yielding exactly 109 true positives could be selected, because two Xpert-positive participants had the same score.
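
The tie problem described in this footnote can be illustrated with a short sketch. It assumes a participant is called CAD-positive when the score is at or above the cut-off (the thresholding rule is an assumption, not stated in the table), and the scores below are invented for illustration rather than taken from the study. The helper name `cutoff_for_target_tp` is hypothetical.

```python
def cutoff_for_target_tp(positive_scores, target_tp):
    """Return a cut-off giving exactly target_tp true positives, or None.

    positive_scores: abnormality scores of the reference-positive participants.
    Assumed rule: a participant is CAD-positive when score >= cut-off.
    """
    ranked = sorted(positive_scores, reverse=True)
    if not 0 < target_tp <= len(ranked):
        return None
    # The score of the target_tp-th highest case is the natural candidate.
    cutoff = ranked[target_tp - 1]
    # Tied scores at the candidate are all captured together.
    achieved = sum(score >= cutoff for score in ranked)
    return cutoff if achieved == target_tp else None


scores = [0.91, 0.80, 0.64, 0.64, 0.30]  # hypothetical scores, not study data

print(cutoff_for_target_tp(scores, 2))  # 0.8
print(cutoff_for_target_tp(scores, 3))  # None: the tie at 0.64 gives 2 or 4, never 3
print(cutoff_for_target_tp(scores, 4))  # 0.64
```

When the target count falls inside a group of tied scores, the nearest achievable count has to be used instead, which is the situation the footnote reports for the row with 108 true positives.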