2024 Mar 27;15:2681. doi: 10.1038/s41467-024-46700-2

Table 3.

Diagnostic performance of OvcaFinder and human readers using O-RADS

Internal dataset

| Metric | Reader A, O-RADS | Reader A, OvcaFinder | Reader B, O-RADS | Reader B, OvcaFinder | Reader C, O-RADS | Reader C, OvcaFinder | Reader D, O-RADS | Reader D, OvcaFinder | Reader E, O-RADS | Reader E, OvcaFinder |
|---|---|---|---|---|---|---|---|---|---|---|
| AUC | 0.907 (0.860, 0.946) | 0.971 (0.943, 0.993) | 0.900 (0.838, 0.946) | 0.980 (0.957, 0.999) | 0.958 (0.919, 0.987) | 0.981 (0.949, 0.998) | 0.947 (0.907, 0.978) | 0.978 (0.949, 0.994) | 0.924 (0.874, 0.966) | 0.976 (0.951, 0.999) |
| p | 0.002 | | 1.50 × 10⁻³ | | 0.120 | | 0.056 | | 0.007 | |
| Sensitivity (%) | 97.3 (93.3, 100.0) | 97.3 (93.3, 100.0) | 96.0 (90.7, 100.0) | 96.0 (89.3, 100.0) | 93.3 (86.7, 98.7) | 97.3 (93.3, 100.0) | 97.3 (94.7, 100.0) | 97.3 (92.0, 100.0) | 97.3 (93.3, 100.0) | 97.3 (93.3, 100.0) |
| p | 1.00 | | 1.00 | | 0.375 | | 1.00 | | 1.00 | |
| Specificity (%) | 61.1 (49.1, 75.5) | 81.5 (71.7, 92.5) | 72.2 (60.4, 81.1) | 92.6 (83.0, 98.1) | 87.0 (79.3, 94.3) | 92.6 (83.0, 98.1) | 77.8 (66.0, 88.7) | 83.3 (73.6, 90.6) | 68.5 (54.7, 81.1) | 83.3 (71.7, 94.3) |
| p | 0.013 | | 9.77 × 10⁻⁴ | | 0.375 | | 0.549 | | 0.057 | |
| Accuracy (%) | 82.2 (77.3, 88.3) | 90.7 (85.9, 94.5) | 86.1 (81.3, 90.6) | 94.6 (89.8, 97.7) | 90.7 (85.2, 95.3) | 95.4 (91.4, 97.7) | 89.2 (84.4, 94.5) | 91.5 (86.7, 95.3) | 85.3 (80.5, 90.6) | 91.5 (86.7, 95.3) |
| PPV (%) | 77.7 (73.3, 85.2) | 88.0 (83.0, 94.9) | 82.8 (77.9, 88.1) | 94.7 (88.9, 98.7) | 90.9 (86.3, 96.0) | 94.8 (88.9, 98.7) | 85.9 (80.2, 92.4) | 89.0 (84.1, 93.7) | 81.1 (75.5, 87.5) | 89.0 (83.3, 95.9) |
| NPV (%) | 94.3 (86.5, 100.0) | 95.7 (89.6, 100.0) | 92.9 (84.4, 100.0) | 94.3 (86.2, 100.0) | 90.4 (82.0, 97.9) | 96.2 (90.9, 100.0) | 95.5 (90.0, 100.0) | 95.7 (88.5, 100.0) | 94.9 (87.8, 100.0) | 95.7 (89.1, 100.0) |

External test dataset

| Metric | Reader A, O-RADS | Reader A, OvcaFinder | Reader B, O-RADS | Reader B, OvcaFinder | Reader C, O-RADS | Reader C, OvcaFinder | Reader D, O-RADS | Reader D, OvcaFinder | Reader E, O-RADS | Reader E, OvcaFinder |
|---|---|---|---|---|---|---|---|---|---|---|
| AUC | 0.888 (0.847, 0.928) | 0.941 (0.902, 0.966) | 0.894 (0.854, 0.922) | 0.935 (0.902, 0.967) | 0.894 (0.855, 0.932) | 0.942 (0.913, 0.971) | 0.915 (0.882, 0.945) | 0.943 (0.909, 0.968) | 0.927 (0.890, 0.954) | 0.946 (0.911, 0.971) |
| p | 0.006 | | 0.005 | | 0.008 | | 0.025 | | 0.149 | |
| Sensitivity (%) | 87.7 (79.0, 93.8) | 88.9 (81.5, 96.3) | 86.4 (79.0, 92.6) | 87.7 (81.5, 92.6) | 77.8 (69.1, 86.4) | 87.7 (77.8, 95.1) | 87.7 (80.3, 95.1) | 88.9 (81.5, 95.1) | 88.9 (79.0, 95.1) | 88.9 (81.5, 93.8) |
| p | 1.00 | | 1.00 | | 0.077 | | 1.00 | | 1.00 | |
| Specificity (%) | 70.6 (65.0, 75.5) | 87.6 (83.7, 91.2) | 81.4 (76.1, 85.6) | 90.5 (86.9, 93.5) | 89.2 (85.3, 92.5) | 91.8 (88.6, 94.8) | 81.7 (78.4, 86.0) | 89.5 (85.6, 92.5) | 86.0 (82.0, 89.2) | 90.9 (87.3, 93.5) |
| p | 3.07 × 10⁻¹² | | 8.36 × 10⁻⁶ | | 0.185 | | 6.96 × 10⁻⁵ | | 0.004 | |
| Accuracy (%) | 74.2 (69.8, 78.0) | 87.9 (83.7, 91.0) | 82.4 (77.8, 86.1) | 89.9 (86.1, 93.3) | 86.8 (83.7, 89.9) | 91.0 (87.9, 94.1) | 83.0 (79.6, 86.6) | 89.4 (85.8, 91.7) | 86.6 (83.2, 89.4) | 90.4 (87.6, 92.8) |
| PPV (%) | 44.1 (39.7, 48.7) | 65.5 (57.6, 72.6) | 55.1 (48.0, 61.5) | 71.0 (62.6, 78.4) | 65.6 (58.5, 74.2) | 74.0 (67.0, 82.0) | 55.9 (50.8, 62.0) | 69.2 (61.4, 75.3) | 62.6 (56.5, 68.9) | 72.0 (65.2, 78.3) |
| NPV (%) | 95.6 (92.8, 97.8) | 96.8 (94.6, 98.9) | 95.8 (93.6, 97.7) | 96.5 (94.6, 98.0) | 93.8 (91.6, 96.1) | 96.6 (94.1, 98.6) | 96.2 (94.0, 98.4) | 96.8 (94.7, 98.6) | 96.7 (94.1, 98.5) | 96.8 (95.0, 98.3) |

Data in parentheses are 95% confidence intervals. O-RADS: Ovarian-Adnexal Reporting and Data System; AUC: area under the receiver operating characteristic curve; PPV: positive predictive value; NPV: negative predictive value. The p-values for AUC were calculated using the function ‘roc_test’ in the Python package pROC. The p-values for sensitivity and specificity were calculated with a two-sided McNemar test.
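
The McNemar test named in the footnote compares two paired classifications of the same cases, so it depends only on the discordant pairs: cases one reading got right and the other got wrong. The following is a minimal sketch of the two-sided exact (binomial) form of the test. It assumes the exact variant, since the footnote does not say whether the exact or the chi-square version was used. The function name `mcnemar_exact` and the discordant counts in the usage lines are hypothetical and are not taken from the study.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value.

    b: cases classified correctly by reading 1 only (hypothetical input)
    c: cases classified correctly by reading 2 only (hypothetical input)
    Under the null hypothesis the discordant pairs split 50/50,
    so min(b, c) follows a Binomial(b + c, 0.5) distribution.
    """
    n = b + c
    if n == 0:
        # No discordant pairs: the two readings never disagree.
        return 1.0
    k = min(b, c)
    # Probability of a split at least as lopsided as the one observed, one tail.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    # Double the tail for a two-sided test, capped at 1.
    return min(1.0, 2.0 * tail)


# Illustrative discordant counts only, not data from the table.
print(mcnemar_exact(1, 4))   # 0.375
print(mcnemar_exact(0, 11))  # 0.0009765625
print(mcnemar_exact(2, 2))   # 1.0
```

Because the exact test has few possible outcomes when discordant pairs are scarce, it often returns round values such as 1.0 or 0.375, which is worth keeping in mind when reading p-values for sensitivity on a small number of malignant cases.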