Pattern Recognition. 2022 Jan 19;125:108538. doi: 10.1016/j.patcog.2022.108538

Table 9.

Runtime analysis of the attention-based CNN models considered for comparison in Table 7. To the extent possible, the backbone was uniformly chosen to be ResNet50 so that the different attention approaches could be compared on top of the same CNN.
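Per-image inference times such as those reported below are typically obtained by averaging many forward passes after a warm-up phase. A minimal, framework-agnostic timing harness sketch (the `forward` callable and run counts are illustrative assumptions, with a stand-in workload in place of an actual model forward pass):

```python
import time

def measure_inference_ms(forward, n_warmup=10, n_runs=100):
    # warm up first so caches and lazy initialization do not skew the timing
    for _ in range(n_warmup):
        forward()
    start = time.perf_counter()
    for _ in range(n_runs):
        forward()
    # average wall-clock time per run, in milliseconds
    return (time.perf_counter() - start) * 1000.0 / n_runs

# stand-in workload; in practice `forward` would run model(image)
ms = measure_inference_ms(lambda: sum(i * i for i in range(10000)))
print(f"{ms:.3f} ms/run")
```

On a GPU, one would additionally synchronize the device before reading the clock, since kernel launches are asynchronous.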

| S. No. | Method | Backbone | Inference time (ms/image) | Parameters (millions) | GFLOPs |
|---|---|---|---|---|---|
| 1 | FocusNet [34] | SE-Net50 | 12.38 | 26.82 | 2.74 |
| 2 | Dual Attention Network [35] | ResNet50 | 35.45 | 49.51 | 14.27 |
| 3 | Asymmetric Non-local Networks [36] | ResNet50 | 52.78 | 44.04 | 12.57 |
| 4 | Multi-scale Self-guided Attention [37] | ResNet50 | 60.73 | 38.78 | 10.19 |
| 5 | Criss-Cross Attention [38] | ResNet50 | 25.14 | 28.18 | 6.32 |
| 6 | Semi-Inf-Net [8] | Res2Net | 44.23 | 33.12 | 7.36 |
| 7 | Proposed CNN | Inception-ResNet-V2-based MKE module | 38.24 | 30.51 | 13.78 |
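Parameter and FLOP counts like those in the table are usually accumulated layer by layer. A minimal sketch for a single 2-D convolution, under the common convention of counting 2 FLOPs per multiply-accumulate (the example layer shapes are illustrative, not taken from the models above):

```python
def conv2d_params(c_in, c_out, k, bias=True):
    # c_out filters, each of shape (c_in, k, k), plus an optional bias per filter
    return c_out * (c_in * k * k + (1 if bias else 0))

def conv2d_flops(c_in, c_out, k, h_out, w_out, bias=True):
    # one multiply-accumulate (MAC) per kernel element per output position;
    # counted as 2 FLOPs each (one multiply + one add)
    macs = c_out * h_out * w_out * c_in * k * k
    return 2 * macs + (c_out * h_out * w_out if bias else 0)

# example: a ResNet-style stem conv (7x7, 3->64 channels, 112x112 output, no bias)
p = conv2d_params(3, 64, 7, bias=False)
f = conv2d_flops(3, 64, 7, 112, 112, bias=False)
print(p, f / 1e9)  # prints 9408 and roughly 0.236 GFLOPs
```

Some tools report MACs rather than FLOPs (a factor-of-2 difference), so headline GFLOP figures are only comparable when the convention is the same.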