2020 Feb 17;20(4):1085. doi: 10.3390/s20041085

Table 4.

Comparison of recognition accuracy with different network architectures.

Architectures            Pre-Training  Spatial ConvNets  Temporal ConvNets  Two-Stream
BN-Inception             ImageNet      91.85%            95.05%             95.61%
BN-Inception + TSN       ImageNet      94.74%            97.65%             98.42%
InceptionV3              ImageNet      94.44%            95.07%             96.31%
InceptionV3 + TSN        ImageNet      95.52%            97.22%             97.39%
InceptionV4              ImageNet      94.74%            96.02%             96.81%
InceptionV4 + TSN        ImageNet      96.02%            97.72%             97.72%
InceptionResNetV2        ImageNet      94.61%            96.02%             96.25%
InceptionResNetV2 + TSN  ImageNet      94.38%            95.38%             96.32%
ResNet18                 ImageNet      94.17%            95.99%             96.11%
ResNet18 + TSN           ImageNet      94.38%            97.22%             98.12%
ResNet50                 ImageNet      94.88%            97.22%             97.55%
ResNet50 + TSN           ImageNet      96.09%            98.20%             98.55%
ResNet101                ImageNet      96.46%            97.79%             98.12%
ResNet101 + TSN          ImageNet      96.32%            98.56%             98.99%
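
As a reading aid, the effect of adding TSN to each backbone can be tallied from the Two-Stream column of Table 4. The sketch below is illustrative only: the accuracies are copied from the table, while the names `two_stream` and `tsn_gain` are hypothetical and not part of the original work.

```python
# Two-stream accuracies (%) copied from Table 4: (without TSN, with TSN).
# The dictionary and function names are illustrative assumptions.
two_stream = {
    "BN-Inception": (95.61, 98.42),
    "InceptionV3": (96.31, 97.39),
    "InceptionV4": (96.81, 97.72),
    "InceptionResNetV2": (96.25, 96.32),
    "ResNet18": (96.11, 98.12),
    "ResNet50": (97.55, 98.55),
    "ResNet101": (98.12, 98.99),
}


def tsn_gain(accuracies):
    """Return the change in percentage points from adding TSN, per backbone."""
    return {
        name: round(with_tsn - base, 2)
        for name, (base, with_tsn) in accuracies.items()
    }


gains = tsn_gain(two_stream)

# List backbones from largest to smallest two-stream gain.
for name, gain in sorted(gains.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<18} {gain:+.2f} pp")
```

By this tally, every backbone improves in the two-stream setting when TSN is added, with the largest gain for BN-Inception and the smallest for InceptionResNetV2. The same comparison on the Spatial ConvNets column shows slight decreases for InceptionResNetV2 and ResNet101.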