Table 4.
Architecture | Pre-Training | Spatial ConvNets | Temporal ConvNets | Two-Stream |
---|---|---|---|---|
BN-Inception | ImageNet | 91.85% | 95.05% | 95.61% |
BN-Inception + TSN | ImageNet | 94.74% | 97.65% | 98.42% |
InceptionV3 | ImageNet | 94.44% | 95.07% | 96.31% |
InceptionV3 + TSN | ImageNet | 95.52% | 97.22% | 97.39% |
InceptionV4 | ImageNet | 94.74% | 96.02% | 96.81% |
InceptionV4 + TSN | ImageNet | 96.02% | 97.72% | 97.72% |
InceptionResNetV2 | ImageNet | 94.61% | 96.02% | 96.25% |
InceptionResNetV2 + TSN | ImageNet | 94.38% | 95.38% | 96.32% |
ResNet18 | ImageNet | 94.17% | 95.99% | 96.11% |
ResNet18 + TSN | ImageNet | 94.38% | 97.22% | 98.12% |
ResNet50 | ImageNet | 94.88% | 97.22% | 97.55% |
ResNet50 + TSN | ImageNet | 96.09% | 98.20% | 98.55% |
ResNet101 | ImageNet | 96.46% | 97.79% | 98.12% |
ResNet101 + TSN | ImageNet | 96.32% | 98.56% | 98.99% |