PeerJ Comput Sci. 2022 Apr 6;8:e920. doi: 10.7717/peerj-cs.920

Table 5. Violence detection using deep learning techniques.

Reference | Method | Object detection | Feature extraction | Scene type | Accuracy
--- | --- | --- | --- | --- | ---
Ding et al. (2014) | Violence detection using 3D CNN | 3D convolution is used to get spatial information | Backpropagation method | Crowded | 91%
Arandjelovic et al. (2016) | Deep architecture for place recognition | VGG VLAD method for image retrieval | Backpropagation method for feature extraction | Crowded | 87%–96%
Fenil et al. (2019) | Framework for football stadiums comprising big data analysis and deep learning through bidirectional LSTM | Bidirectional LSTM | HOG, SVM | Crowded | 94.5%
Mu, Cao & Jin (2016) | Violent scene detection using CNN and deep audio features | MFB | CNN | Crowded | Approximately 90%
Mohtavipour, Saeidi & Arabsorkhi (2021) | A multi-stream CNN using handcrafted features | A deep violence detection framework based on specific features (speed of movement and representative image) derived from handcrafted methods | CNN | Both crowded and uncrowded |
Sudhakaran & Lanz (2017) | Detecting violent videos using ConvLSTM | CNN along with ConvLSTM | CNN | Crowded | Approximately 97%
Naik & Gopalakrishna (2021) | Deep violence detection framework based on specific features derived from handcrafted methods | Discriminative feature with a novel differential motion energy image | CNN | Both crowded and uncrowded |
Meng, Yuan & Li (2017) | Detecting human violent behavior by integrating trajectory and deep CNN | Deep CNN | Optical flow method | Crowded | 98%
Rendón-Segador et al. (2021) | ViolenceNet: dense multi-head self-attention with bidirectional convolutional LSTM | 3D DenseNet | Optical flow method | Crowded | 95.6%–100%
Xia et al. (2018) | Violence detection method based on a bi-channels CNN and SVM | Linear SVM | Bi-channels CNN | Both crowded and uncrowded | 95.90 ± 3.53% on Hockey Fight, 93.25 ± 2.34% on Violence Crowd
Meng et al. (2020) | Trajectory-pooled deep convolutional networks | ConvNet model containing 17 convolution-pool-norm layers and two fully connected layers | Deep ConvNet model | Both crowded and uncrowded | 92.5% on Crowd Violence, 98.6% on Hockey Fight
Ullah et al. (2019) | Violence detection using spatiotemporal features | Pre-trained MobileNet CNN model | 3D CNN | Crowded | Approximately 97%
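
Several entries in Table 5 rely on 3D convolution. As a rough illustration only, and not the implementation used by any of the cited works, the sketch below applies a 3D kernel to a toy video clip in plain Python. The function name `conv3d_valid`, the toy clip, and the temporal-difference kernel are all hypothetical choices made for this example. Because the kernel spans neighbouring frames as well as pixels, its output is non-zero only where something changes over time.

```python
def conv3d_valid(clip, kernel):
    """Cross-correlate a clip [T][H][W] with a kernel [kt][kh][kw].

    No padding, stride 1. This is the sliding-window operation that CNN
    libraries call "3D convolution"; the kernel is not flipped.
    """
    T, H, W = len(clip), len(clip[0]), len(clip[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):
        plane = []
        for y in range(H - kh + 1):
            row = []
            for x in range(W - kw + 1):
                acc = 0.0
                # Sum over the kernel's temporal and spatial extent.
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += clip[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


# Hypothetical toy clip: 3 frames of 3x3 pixels, with one bright pixel
# moving one step to the right in each frame.
clip = [
    [[0, 0, 0], [1, 0, 0], [0, 0, 0]],
    [[0, 0, 0], [0, 1, 0], [0, 0, 0]],
    [[0, 0, 0], [0, 0, 1], [0, 0, 0]],
]

# Hypothetical 2x1x1 kernel: next frame minus current frame.
kernel = [[[-1.0]], [[1.0]]]

response = conv3d_valid(clip, kernel)

# A clip with no motion (the first frame repeated) for comparison.
static_clip = [clip[0], clip[0], clip[0]]
static_response = conv3d_valid(static_clip, kernel)
```

In this toy case `response` is non-zero only along the path of the moving pixel, while `static_response` is zero everywhere. Trained 3D CNNs learn many such kernels rather than using a fixed one.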