2024 Mar 27;15:2681. doi: 10.1038/s41467-024-46700-2

Table 2.

Diagnostic performance of different models

| Metric | Clinical model | Image-based DL predictions | OvcaFinder |
|---|---|---|---|
| Internal test dataset | | | |
| AUC | 0.936 (0.902, 0.975) | 0.970 (0.934, 0.993) | 0.978 (0.953, 0.998) |
| p | 0.007 | 0.152 | Reference |
| Sensitivity (%) | 97.3 (93.3, 100.0) | 97.3 (93.3, 100.0) | 97.3 (93.3, 100.0) |
| p | 1.00 | 1.00 | Reference |
| Specificity (%) | 40.7 (28.3, 52.8) | 74.1 (62.3, 84.9) | 83.3 (73.6, 92.4) |
| p | 1.52 × 10⁻⁵ | 0.062 | Reference |
| Accuracy (%) | 73.6 (68.0, 79.7) | 87.6 (82.0, 92.2) | 91.5 (86.7, 96.1) |
| PPV (%) | 69.5 (65.5, 74.8) | 83.9 (78.5, 90.1) | 89.0 (83.3, 94.9) |
| NPV (%) | 91.7 (79.2, 100.0) | 95.2 (88.6, 100.0) | 95.7 (88.9, 100.0) |
| External test dataset | | | |
| AUC | 0.842 (0.776, 0.895) | 0.893 (0.855, 0.933) | 0.947 (0.917, 0.970) |
| p | 4.65 × 10⁻⁵ | 3.93 × 10⁻⁶ | Reference |
| Sensitivity (%) | 85.2 (76.5, 92.6) | 88.9 (81.5, 95.1) | 88.9 (81.5, 95.1) |
| p | 0.581 | 1.000 | Reference |
| Specificity (%) | 53.3 (47.7, 58.5) | 68.6 (64.0, 73.5) | 90.5 (87.3, 93.8) |
| p | 2.21 × 10⁻²⁹ | 1.36 × 10⁻²⁰ | Reference |
| Accuracy (%) | 59.9 (55.3, 64.3) | 72.9 (68.5, 77.3) | 90.2 (87.1, 93.0) |
| PPV (%) | 32.5 (29.5, 35.7) | 42.9 (38.6, 47.8) | 71.3 (64.2, 79.1) |
| NPV (%) | 93.1 (89.8, 96.3) | 95.9 (93.3, 98.2) | 96.9 (94.8, 98.6) |

Data in parentheses are 95% confidence intervals. DL, deep learning; AUC, area under the receiver operating characteristic curve; PPV, positive predictive value; NPV, negative predictive value. We used the average of the O-RADS scores as the input factor for OvcaFinder. All p values are for comparisons with OvcaFinder. The p values for AUC were calculated using the function ‘roc_test’ in the Python package pROC. The p values for sensitivity and specificity were calculated using a two-sided McNemar test.
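
As an illustration of the quantities reported in this table, the following minimal Python sketch computes sensitivity, specificity, accuracy, PPV and NPV from confusion-matrix counts, and a two-sided McNemar p value from discordant pair counts. It is not the authors' code. The function names `diagnostic_metrics` and `mcnemar_exact` are hypothetical, and the use of the exact (binomial) form of the McNemar test is an assumption, since the source does not state which variant was applied.

```python
from math import comb


def diagnostic_metrics(tp, fp, tn, fn):
    """Return the table's diagnostic metrics as percentages.

    tp, fp, tn, fn are confusion-matrix counts with malignant as the
    positive class (hypothetical inputs; the table does not report counts).
    """
    return {
        "sensitivity": 100 * tp / (tp + fn),
        "specificity": 100 * tn / (tn + fp),
        "accuracy": 100 * (tp + tn) / (tp + fp + tn + fn),
        "ppv": 100 * tp / (tp + fp),
        "npv": 100 * tn / (tn + fn),
    }


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p value (assumed variant).

    b: cases the comparison model classified correctly and the
       reference model classified incorrectly
    c: cases with the opposite pattern
    Only these discordant pairs carry information about the difference.
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(b, c)
    # Under the null hypothesis, discordant pairs follow Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

For sensitivity, the discordant pairs are counted among malignant cases only; for specificity, among benign cases only. For example, `mcnemar_exact(0, 5)` returns 0.0625, and equal discordant counts such as `mcnemar_exact(3, 3)` return 1.0.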