Figure 7.
Hybrid model (VGG16-LSTM). Column (A) shows the sequence of images extracted from the videos. Group (B) shows the convolutional layers, which are based on VGG16. Group (C) shows the long short-term memory (LSTM) layer, which learns temporal features. Group (D) shows the classification layer.
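
To make the data flow through groups (A) to (D) concrete, the following minimal sketch traces the per-sample tensor shape after each stage. It relies only on two standard properties of VGG16 (five 2x2 max-pooling stages and 512 channels in the last convolutional block). Everything else is an assumption made for illustration and is not taken from the figure: the function name `trace_shapes`, the RGB input, the sequence length of 16 frames, the 224x224 resolution, the 256 LSTM units, the two output classes, the flattening of the convolutional features before the LSTM, and the use of only the final LSTM state for classification.

```python
from typing import Dict, Tuple

VGG16_POOL_STAGES = 5     # VGG16 contains five 2x2 max-pooling stages
VGG16_OUT_CHANNELS = 512  # channels produced by the last VGG16 conv block


def trace_shapes(seq_len: int, height: int, width: int,
                 lstm_units: int, num_classes: int) -> Dict[str, Tuple[int, ...]]:
    """Return the per-sample tensor shape after each stage (A)-(D).

    Hypothetical helper: it only does shape arithmetic and trains nothing.
    """
    # (B) Each max-pooling stage halves the spatial size (floor division).
    h, w = height, width
    for _ in range(VGG16_POOL_STAGES):
        h, w = h // 2, w // 2

    # Assumption: the feature map of each frame is flattened into one vector
    # so that the LSTM receives a sequence of vectors.
    feature_dim = h * w * VGG16_OUT_CHANNELS

    return {
        "A_frames": (seq_len, height, width, 3),            # assumed RGB frames
        "B_conv_features": (seq_len, h, w, VGG16_OUT_CHANNELS),
        "B_flattened": (seq_len, feature_dim),
        "C_lstm_state": (lstm_units,),                      # assumed: last state only
        "D_class_scores": (num_classes,),
    }


# Illustrative values only; none of them come from the figure.
shapes = trace_shapes(seq_len=16, height=224, width=224,
                      lstm_units=256, num_classes=2)
for stage, shape in shapes.items():
    print(stage, shape)
```

Under these assumptions, the convolutional layers are applied to each frame independently, and only the LSTM in group (C) combines information across frames.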