2020 May 28;10:680. doi: 10.3389/fonc.2020.00680

Table 3.

Diagnostic performance of three-way classifiers and radiologists in the malignancy validation set.

	Value (95% CI)				McNemar's P-value^a	DeLong P-value
	Model E	Model F	Model G	Reader consensus	Model G vs. Reader	Model E vs. Model G	Model F vs. Model G
HCC
AUC	0.879 (0.808, 0.949)	0.972 (0.938, 1.000)	0.951 (0.906, 0.997)			0.002	0.792
Sensitivity, %	93.6 (82.5, 98.7)	95.7 (85.5, 99.5)	95.7 (85.5, 99.5)	89.1 (76.4, 96.4)	0.289
Specificity, %	67.3 (52.9, 79.7)	96.2 (86.8, 99.5)	90.4 (79.0, 96.8)	90.4 (79.0, 96.8)	0.063
Metastatic malignancy
AUC	0.814 (0.722, 0.907)	0.980 (0.947, 1.000)	0.985 (0.958, 1.000)			0.0002	0.403
Sensitivity, %	59.5 (42.1, 75.3)	100 (90.5, 100)	94.6 (81.8, 99.3)	89.2 (74.6, 97.0)	0.688
Specificity, %	93.6 (84.3, 98.2)	96.8 (88.8, 99.6)	100 (94.2, 100)	95.1 (86.3, 99.0)	1.000
Primary malignancy except HCC
AUC	0.761 (0.613, 0.909)	0.989 (0.951, 1.000)	0.905 (0.801, 1.000)			0.008	0.081
Sensitivity, %	53.3 (26.6, 78.7)	86.7 (59.5, 98.3)	73.3 (44.9, 92.2)	60.0 (32.3, 83.7)	0.688
Specificity, %	95.2 (88.3, 98.7)	100 (95.7, 100)	96.4 (89.9, 99.3)	91.6 (83.4, 96.5)	0.250

^a P-value for the comparison of Model G (three-sequence images + clinical data) vs. reader consensus, calculated using McNemar's test.
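
For readers unfamiliar with McNemar's test, the sketch below shows one common form of it, the two-sided exact (binomial) version, which compares two paired classifications using only the discordant cases. The text here does not state whether the exact or the chi-square form was used, so this is an illustration under that assumption rather than the authors' procedure. The function name `mcnemar_exact` and the counts `b` and `c` are hypothetical: `b` is the number of cases the model classified correctly and the readers did not, and `c` is the reverse.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from the two discordant-pair counts.

    b: cases where rater 1 is correct and rater 2 is wrong (hypothetical input)
    c: cases where rater 1 is wrong and rater 2 is correct (hypothetical input)
    """
    n = b + c
    if n == 0:
        # No discordant pairs: the paired data give no evidence of a difference.
        return 1.0
    k = min(b, c)
    # Under the null hypothesis, the smaller count follows Binomial(n, 0.5).
    # Compute the lower tail P(X <= k) and double it for a two-sided test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    # Illustrative counts only; they are not taken from the study.
    print(mcnemar_exact(1, 5))
```

Concordant pairs (both correct or both wrong) do not enter the calculation, which is why a small validation set with few disagreements tends to give large P-values.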