Fig 5. AVMIT vs other datasets.
Audiovisual similarity scores, as estimated by MMV, across a series of audiovisual action recognition datasets: AVMIT (ours), MIT-16, Kinetics-Sounds, VGG-Sound and AVE. (a) Average audiovisual similarity score for each dataset, computed across the entire dataset. (b) Raincloud plot showing the distribution of audiovisual similarity scores for each dataset.