Table 3.
Metric | Reader A: O-RADS | Reader A: OvcaFinder | Reader B: O-RADS | Reader B: OvcaFinder | Reader C: O-RADS | Reader C: OvcaFinder | Reader D: O-RADS | Reader D: OvcaFinder | Reader E: O-RADS | Reader E: OvcaFinder |
---|---|---|---|---|---|---|---|---|---|---|
Internal dataset | ||||||||||
AUC | 0.907 (0.860, 0.946) | 0.971 (0.943, 0.993) | 0.900 (0.838, 0.946) | 0.980 (0.957, 0.999) | 0.958 (0.919, 0.987) | 0.981 (0.949, 0.998) | 0.947 (0.907, 0.978) | 0.978 (0.949, 0.994) | 0.924 (0.874, 0.966) | 0.976 (0.951, 0.999) |
p | 0.002 | | 1.50 × 10⁻³ | | 0.120 | | 0.056 | | 0.007 | |
Sensitivity (%) | 97.3 (93.3, 100.0) | 97.3 (93.3, 100.0) | 96.0 (90.7, 100.0) | 96.0 (89.3, 100.0) | 93.3 (86.7, 98.7) | 97.3 (93.3, 100.0) | 97.3 (94.7, 100.0) | 97.3 (92.0, 100.0) | 97.3 (93.3, 100.0) | 97.3 (93.3, 100.0) |
p | 1.00 | | 1.00 | | 0.375 | | 1.00 | | 1.00 | |
Specificity (%) | 61.1 (49.1, 75.5) | 81.5 (71.7, 92.5) | 72.2 (60.4, 81.1) | 92.6 (83.0, 98.1) | 87.0 (79.3, 94.3) | 92.6 (83.0, 98.1) | 77.8 (66.0, 88.7) | 83.3 (73.6, 90.6) | 68.5 (54.7, 81.1) | 83.3 (71.7, 94.3) |
p | 0.013 | | 9.77 × 10⁻⁴ | | 0.375 | | 0.549 | | 0.057 | |
Accuracy (%) | 82.2 (77.3, 88.3) | 90.7 (85.9, 94.5) | 86.1 (81.3, 90.6) | 94.6 (89.8, 97.7) | 90.7 (85.2, 95.3) | 95.4 (91.4, 97.7) | 89.2 (84.4, 94.5) | 91.5 (86.7, 95.3) | 85.3 (80.5, 90.6) | 91.5 (86.7, 95.3) |
PPV (%) | 77.7 (73.3, 85.2) | 88.0 (83.0, 94.9) | 82.8 (77.9, 88.1) | 94.7 (88.9, 98.7) | 90.9 (86.3, 96.0) | 94.8 (88.9, 98.7) | 85.9 (80.2, 92.4) | 89.0 (84.1, 93.7) | 81.1 (75.5, 87.5) | 89.0 (83.3, 95.9) |
NPV (%) | 94.3 (86.5, 100.0) | 95.7 (89.6, 100.0) | 92.9 (84.4, 100.0) | 94.3 (86.2, 100.0) | 90.4 (82.0, 97.9) | 96.2 (90.9, 100.0) | 95.5 (90.0, 100.0) | 95.7 (88.5, 100.0) | 94.9 (87.8, 100.0) | 95.7 (89.1, 100.0) |
External test dataset | ||||||||||
AUC | 0.888 (0.847, 0.928) | 0.941 (0.902, 0.966) | 0.894 (0.854, 0.922) | 0.935 (0.902, 0.967) | 0.894 (0.855, 0.932) | 0.942 (0.913, 0.971) | 0.915 (0.882, 0.945) | 0.943 (0.909, 0.968) | 0.927 (0.890, 0.954) | 0.946 (0.911, 0.971) |
p | 0.006 | | 0.005 | | 0.008 | | 0.025 | | 0.149 | |
Sensitivity (%) | 87.7 (79.0, 93.8) | 88.9 (81.5, 96.3) | 86.4 (79.0, 92.6) | 87.7 (81.5, 92.6) | 77.8 (69.1, 86.4) | 87.7 (77.8, 95.1) | 87.7 (80.3, 95.1) | 88.9 (81.5, 95.1) | 88.9 (79.0, 95.1) | 88.9 (81.5, 93.8) |
p | 1.00 | | 1.00 | | 0.077 | | 1.00 | | 1.00 | |
Specificity (%) | 70.6 (65.0, 75.5) | 87.6 (83.7, 91.2) | 81.4 (76.1, 85.6) | 90.5 (86.9, 93.5) | 89.2 (85.3, 92.5) | 91.8 (88.6, 94.8) | 81.7 (78.4, 86.0) | 89.5 (85.6, 92.5) | 86.0 (82.0, 89.2) | 90.9 (87.3, 93.5) |
p | 3.07 × 10⁻¹² | | 8.36 × 10⁻⁶ | | 0.185 | | 6.96 × 10⁻⁵ | | 0.004 | |
Accuracy (%) | 74.2 (69.8, 78.0) | 87.9 (83.7, 91.0) | 82.4 (77.8, 86.1) | 89.9 (86.1, 93.3) | 86.8 (83.7, 89.9) | 91.0 (87.9, 94.1) | 83.0 (79.6, 86.6) | 89.4 (85.8, 91.7) | 86.6 (83.2, 89.4) | 90.4 (87.6, 92.8) |
PPV (%) | 44.1 (39.7, 48.7) | 65.5 (57.6, 72.6) | 55.1 (48.0, 61.5) | 71.0 (62.6, 78.4) | 65.6 (58.5, 74.2) | 74.0 (67.0, 82.0) | 55.9 (50.8, 62.0) | 69.2 (61.4, 75.3) | 62.6 (56.5, 68.9) | 72.0 (65.2, 78.3) |
NPV (%) | 95.6 (92.8, 97.8) | 96.8 (94.6, 98.9) | 95.8 (93.6, 97.7) | 96.5 (94.6, 98.0) | 93.8 (91.6, 96.1) | 96.6 (94.1, 98.6) | 96.2 (94.0, 98.4) | 96.8 (94.7, 98.6) | 96.7 (94.1, 98.5) | 96.8 (95.0, 98.3) |
Data in parentheses are 95% confidence intervals. O-RADS: Ovarian-Adnexal Reporting and Data System; AUC: area under the receiver operating characteristic curve; PPV: positive predictive value; NPV: negative predictive value. Each p-value compares O-RADS with OvcaFinder for the same reader. The p-values for AUC were calculated using the function ‘roc_test’ in the Python package of pROC. The p-values for sensitivity and specificity were calculated using a two-sided McNemar test.
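
As an illustration of how a two-sided McNemar p-value for paired sensitivity or specificity can be obtained, the following is a minimal sketch, not the authors' code. It assumes the exact (binomial) form of the test, which the footnote does not specify, and the function name `mcnemar_exact` and the discordant counts in the usage line are hypothetical. Here `b` is the number of cases classified correctly by the first method only, and `c` the number classified correctly by the second method only; concordant cases do not enter the test.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from the two discordant pair counts.

    Under the null hypothesis, each discordant pair is equally likely to
    favour either method, so min(b, c) follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        # No discordant pairs: the two methods never disagree.
        return 1.0
    k = min(b, c)
    # Probability of observing k or fewer pairs in the smaller group.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    # Double the one-sided tail and cap at 1 for the two-sided p-value.
    return min(1.0, 2.0 * tail)


# Hypothetical counts: 1 case favours the first method, 4 favour the second.
print(mcnemar_exact(1, 4))  # 0.375
```

With few discordant pairs the exact form is usually preferred over the chi-square approximation, which is why it is used in this sketch.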