Author manuscript; available in PMC: 2022 May 11.
Published in final edited form as: Proc IEEE Int Conf Comput Vis. 2021 Oct;2021:13557–13567. doi: 10.1109/iccv48922.2021.01332

Table 4:

Quantitative analysis on the Kinetics-400 dataset. The performance gain is defined as the difference in top-1 accuracy between the VidTr network and I3D (VidTr minus I3D, so a positive gain means VidTr is more accurate on that class).

Top 5 (+)            Acc. gain    Top 5 (−)             Acc. gain
making a cake        +26.0%       shaking head          −21.7%
catching fish        +21.2%       dunking basketball    −20.8%
catching baseball    +20.8%       lunge                 −19.9%
stretching arm       +19.1%       playing guitar        −19.9%
spraying             +18.0%       tap dancing           −16.3%

(a) Top 5 classes on which VidTr outperforms I3D. (b) Top 5 classes on which I3D outperforms VidTr.
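
The gain defined in the caption can be illustrated with a minimal sketch. It assumes the gain is a plain per-class difference in top-1 accuracy, expressed in percentage points. The function names and the toy labels below are hypothetical and are not taken from the paper or from any VidTr or I3D code.

```python
from collections import defaultdict


def per_class_top1(labels, preds):
    """Return {class: top-1 accuracy} from parallel lists of true and predicted labels."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for y, p in zip(labels, preds):
        total[y] += 1
        correct[y] += int(y == p)
    return {c: correct[c] / total[c] for c in total}


def accuracy_gain(acc_a, acc_b):
    """Per-class gain of model A over model B, in percentage points (assumed definition)."""
    return {c: 100.0 * (acc_a[c] - acc_b[c]) for c in acc_a if c in acc_b}


def top_k(gains, k=5):
    """Return the k largest gains and the k largest losses (most negative first)."""
    ranked = sorted(gains.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:k], ranked[::-1][:k]


# Toy data only: two classes, two clips each (not Kinetics-400 results).
labels = ["a", "a", "b", "b"]
preds_model_a = ["a", "a", "b", "a"]  # hypothetical predictions of the first model
preds_model_b = ["a", "b", "b", "b"]  # hypothetical predictions of the second model

gains = accuracy_gain(
    per_class_top1(labels, preds_model_a),
    per_class_top1(labels, preds_model_b),
)
best, worst = top_k(gains, k=1)
```

With real per-clip predictions from both models, calling `top_k(gains, k=5)` would give the two halves of a table like the one above, under the stated assumption about how the gain is computed.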