Table 1:
Comparison of the performance of GMIC and the baselines on NYUBCS. AUC(M) and AUC(B) denote the AUC for identifying breasts with malignant and benign findings, respectively. For both GMIC and ResNet-34, we report the test AUC (mean and standard deviation) of the top-5 models that achieved the highest validation AUC in identifying breasts with malignant findings. We also report the total number of learnable parameters in millions (#Param), the peak GPU memory usage in gigabytes (Mem) when training on a single exam (4 images), the time taken for forward (Fwd) and backward (Bwd) propagation in milliseconds, and the number of floating-point operations (FLOPs) in billions.
| Model | AUC(M) | AUC(B) | #Param | Mem(GB) | Fwd/Bwd (ms) | FLOPs |
|---|---|---|---|---|---|---|
| ResNet-34 | 0.736 ± 0.026 | 0.684 ± 0.015 | 21.30M | 13.95 | 189/459 | 1622B |
| ResNet-34-1 × 1 conv | 0.889 ± 0.015 | 0.772 ± 0.008 | 21.30M | 12.58 | 201/450 | 1625B |
| DMV-CNN (w/o heatmaps) | 0.827 ± 0.008 | 0.731 ± 0.004 | 6.13M | 2.4 | 38/86 | 65B |
| DMV-CNN (w/ heatmaps) | 0.886 ± 0.003 | 0.747 ± 0.002 | 6.13M | 2.4 | 38/86 | 65B |
| Faster R-CNN | 0.908 ± 0.014 | 0.761 ± 0.008 | 104.8M | 25.75 | 920/2019 | -³ |
| GMIC-ResNet-18 | 0.913 ± 0.007 | 0.791 ± 0.005 | 15.17M | 3.01 | 46/82 | 122B |
| GMIC-ResNet-34 | 0.909 ± 0.005 | 0.790 ± 0.006 | 25.29M | 3.45 | 58/94 | 180B |
| GMIC-ResNet-50 | 0.915 ± 0.005 | 0.797 ± 0.003 | 27.95M | 5.05 | 66/131 | 194B |
| GMIC-ResNet-18-ensemble | 0.930 | 0.800 | - | - | - | - |
| GMIC-ResNet-34-ensemble | 0.920 | 0.795 | - | - | - | - |
| GMIC-ResNet-50-ensemble | 0.927 | 0.805 | - | - | - | - |
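
The "mean ± standard deviation of the top-5 models" entries in the table can be illustrated with a minimal sketch. It assumes each trained model is recorded as a dictionary with hypothetical `val_auc` and `test_auc` keys, and it uses the population standard deviation; the caption does not say whether the population or the sample form was used, so that choice is an assumption. The numbers in `runs` are made up for illustration and are not results from the paper.

```python
from statistics import mean, pstdev


def summarize_top_k(runs, k=5):
    """Return (mean, std) of test AUC over the k runs with the highest validation AUC."""
    # Rank runs by validation AUC, best first, and keep the top k.
    top = sorted(runs, key=lambda r: r["val_auc"], reverse=True)[:k]
    test_aucs = [r["test_auc"] for r in top]
    # Assumption: population standard deviation (pstdev), not the sample form.
    return mean(test_aucs), pstdev(test_aucs)


# Hypothetical runs for illustration only (not data from the paper).
runs = [
    {"val_auc": 0.921, "test_auc": 0.915},
    {"val_auc": 0.918, "test_auc": 0.909},
    {"val_auc": 0.917, "test_auc": 0.920},
    {"val_auc": 0.915, "test_auc": 0.905},
    {"val_auc": 0.912, "test_auc": 0.911},
    {"val_auc": 0.880, "test_auc": 0.930},  # excluded: validation AUC too low
    {"val_auc": 0.875, "test_auc": 0.860},  # excluded: validation AUC too low
]

avg, std = summarize_top_k(runs, k=5)
print(f"test AUC: {avg:.3f} ± {std:.3f}")
```

Selection is driven by validation AUC only, so a run with a high test AUC but a low validation AUC (the sixth entry above) does not enter the summary. This keeps the test set out of model selection.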