Table 2.
The performance of Deep-VCUG in the internal and external testing sets for unilateral and bilateral VUR grading.
| Model | Accuracy (95% CI) | Precision (95% CI) | Sensitivity (95% CI) | F1 Score (95% CI) | AUROC (95% CI) | Specificity (95% CI) |
|---|---|---|---|---|---|---|
| Unilateral VUR grading (N = 374) | ||||||
| Internal testing set (N = 141) | ||||||
| Deep-VCUG | 0.805 (0.738–0.866) | 0.761 (0.688–0.841) | 0.805 (0.738–0.866) | 0.782 (0.710–0.851) | 0.962 (0.943–0.978) | 0.960 (0.941–0.975) |
| MobileNetv2 | 0.779 (0.711–0.846) | 0.749 (0.683–0.828) | 0.779 (0.711–0.846) | 0.762 (0.695–0.832) | 0.948 (0.920–0.972) | 0.957 (0.941–0.972) |
| GoogLeNet | 0.597 (0.523–0.671) | 0.518 (0.417–0.633) | 0.597 (0.523–0.671) | 0.529 (0.438–0.618) | 0.881 (0.845–0.914) | 0.878 (0.856–0.900) |
| ResNet 101 | 0.738 (0.671–0.805) | 0.729 (0.652–0.809) | 0.738 (0.671–0.805) | 0.731 (0.658–0.800) | 0.931 (0.899–0.962) | 0.951 (0.935–0.968) |
| DenseNet161 | 0.758 (0.691–0.826) | 0.783 (0.724–0.850) | 0.758 (0.691–0.826) | 0.762 (0.693–0.826) | 0.942 (0.916–0.967) | 0.955 (0.938–0.970) |
| EfficientNet-B0 | 0.570 (0.483–0.644) | 0.539 (0.438–0.656) | 0.570 (0.483–0.644) | 0.533 (0.444–0.616) | 0.900 (0.868–0.928) | 0.893 (0.871–0.915) |
| External testing set (N = 233) | ||||||
| Deep-VCUG | 0.807 (0.755–0.858) | 0.827 (0.788–0.873) | 0.807 (0.755–0.858) | 0.807 (0.756–0.858) | 0.944 (0.921–0.964) | 0.958 (0.946–0.970) |
| MobileNetv2 | 0.785 (0.734–0.837) | 0.792 (0.747–0.845) | 0.785 (0.734–0.837) | 0.786 (0.734–0.838) | 0.952 (0.933–0.968) | 0.954 (0.942–0.965) |
| GoogLeNet | 0.790 (0.734–0.841) | 0.804 (0.760–0.852) | 0.790 (0.734–0.841) | 0.790 (0.736–0.839) | 0.954 (0.935–0.971) | 0.952 (0.938–0.964) |
| ResNet 101 | 0.764 (0.704–0.820) | 0.782 (0.731–0.834) | 0.764 (0.704–0.820) | 0.765 (0.706–0.820) | 0.933 (0.907–0.953) | 0.950 (0.936–0.962) |
| DenseNet161 | 0.768 (0.717–0.820) | 0.792 (0.751–0.839) | 0.768 (0.717–0.820) | 0.772 (0.718–0.823) | 0.943 (0.924–0.963) | 0.948 (0.934–0.961) |
| EfficientNet-B0 | 0.326 (0.266–0.391) | 0.295 (0.234–0.368) | 0.326 (0.266–0.391) | 0.290 (0.231–0.356) | 0.725 (0.684–0.768) | 0.837 (0.806–0.863) |
| Bilateral VUR grading (N = 74) | ||||||
| Internal testing set (N = 27) | ||||||
| Deep-VCUG | 0.796 (0.685–0.889) | 0.833 (0.671–0.917) | 0.796 (0.685–0.889) | 0.775 (0.640–0.879) | 0.960 (0.922–0.983) | 0.936 (0.898–0.969) |
| MobileNetv2 | 0.741 (0.630–0.852) | 0.782 (0.604–0.882) | 0.741 (0.630–0.852) | 0.720 (0.585–0.847) | 0.936 (0.888–0.975) | 0.911 (0.861–0.954) |
| GoogLeNet | 0.741 (0.630–0.852) | 0.766 (0.590–0.869) | 0.741 (0.630–0.852) | 0.716 (0.581–0.840) | 0.932 (0.876–0.971) | 0.912 (0.863–0.951) |
| ResNet 101 | 0.741 (0.611–0.852) | 0.804 (0.647–0.886) | 0.741 (0.611–0.852) | 0.721 (0.583–0.847) | 0.960 (0.925–0.984) | 0.904 (0.851–0.950) |
| DenseNet161 | 0.778 (0.667–0.889) | 0.795 (0.657–0.913) | 0.778 (0.667–0.889) | 0.762 (0.632–0.880) | 0.943 (0.899–0.980) | 0.946 (0.910–0.976) |
| EfficientNet-B0 | 0.667 (0.537–0.778) | 0.679 (0.560–0.817) | 0.667 (0.537–0.778) | 0.656 (0.522–0.777) | 0.919 (0.863–0.959) | 0.893 (0.836–0.935) |
| External testing set (N = 47) | ||||||
| Deep-VCUG | 0.745 (0.660–0.830) | 0.766 (0.671–0.852) | 0.745 (0.660–0.830) | 0.720 (0.617–0.819) | 0.924 (0.887–0.957) | 0.808 (0.738–0.874) |
| MobileNetv2 | 0.723 (0.628–0.809) | 0.740 (0.640–0.827) | 0.723 (0.628–0.809) | 0.699 (0.587–0.788) | 0.930 (0.888–0.963) | 0.799 (0.728–0.859) |
| GoogLeNet | 0.628 (0.521–0.734) | 0.646 (0.543–0.760) | 0.628 (0.521–0.734) | 0.616 (0.510–0.724) | 0.878 (0.834–0.918) | 0.787 (0.713–0.850) |
| ResNet 101 | 0.691 (0.606–0.777) | 0.691 (0.592–0.791) | 0.691 (0.606–0.777) | 0.676 (0.577–0.773) | 0.900 (0.858–0.937) | 0.803 (0.728–0.864) |
| DenseNet161 | 0.670 (0.574–0.766) | 0.688 (0.582–0.787) | 0.670 (0.574–0.766) | 0.639 (0.527–0.742) | 0.916 (0.881–0.949) | 0.769 (0.697–0.835) |
| EfficientNet-B0 | 0.489 (0.383–0.596) | 0.511 (0.407–0.634) | 0.489 (0.383–0.596) | 0.493 (0.391–0.604) | 0.843 (0.795–0.889) | 0.730 (0.644–0.805) |
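
In every row of Table 2 the Accuracy and Sensitivity entries are identical. That is what one would expect if the multi-class sensitivity is averaged across VUR grades weighted by the number of cases in each grade, although the table itself does not state the averaging scheme or how the 95% CIs were obtained. The following is a minimal sketch under those assumptions, not the authors' evaluation code: the function names, the toy labels, and the choice of a percentile bootstrap for the interval are all hypothetical.

```python
import random


def accuracy(y_true, y_pred):
    """Fraction of cases whose predicted grade equals the true grade."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_sensitivity(y_true, y_pred):
    """Per-grade sensitivity (recall), averaged with weights equal to each
    grade's share of the true labels (assumed averaging scheme)."""
    total = len(y_true)
    score = 0.0
    for grade in set(y_true):
        idx = [i for i, t in enumerate(y_true) if t == grade]
        recall = sum(y_pred[i] == grade for i in idx) / len(idx)
        score += recall * len(idx) / total
    return score


def bootstrap_ci(y_true, y_pred, metric, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for `metric` (assumed CI method):
    resample cases with replacement and take the empirical quantiles."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(metric([y_true[i] for i in idx],
                            [y_pred[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical toy labels (grades coded 0-4), not data from the study.
toy_true = [0, 0, 0, 1, 1, 2, 2, 2, 3, 4]
toy_pred = [0, 0, 1, 1, 1, 2, 2, 3, 3, 4]

acc = accuracy(toy_true, toy_pred)
sens = weighted_sensitivity(toy_true, toy_pred)
ci = bootstrap_ci(toy_true, toy_pred, accuracy)
print(f"accuracy={acc:.3f}  weighted sensitivity={sens:.3f}  95% CI={ci}")
```

With support weighting, each grade's recall is multiplied by that grade's share of the cases, so the weighted sum collapses to the overall fraction of correct predictions. Macro (unweighted) averaging would not generally reproduce the accuracy.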