2022 Apr 6;8:e920. doi: 10.7717/peerj-cs.920

Table 5. Violence detection using deep learning techniques.


Ding et al. (2014)	Violence Detection using 3D CNN	3D convolution is used to get spatial information	Backpropagation method	Crowded	91% accuracy
Arandjelovic et al. (2016)	Deep architecture for place recognition	VGG VLAD method for image retrieval	Backpropagation method for feature extraction	Crowded	87%–96% accuracy
Fenil et al. (2019)	Framework for football stadium comprising big data analysis and deep learning through bidirectional LSTM	Bidirectional LSTM	HOG, SVM	Crowded	94.5% accuracy
Mu, Cao & Jin (2016)	Violent scene detection using CNN and deep audio features	MFB	CNN	Crowded	Approximately 90% accuracy
Mohtavipour, Saeidi & Arabsorkhi (2021)	A multi-stream CNN using handcrafted features	A deep violence detection framework based on specific features (speed of movement and representative image) derived from handcrafted methods	CNN	Both crowded and uncrowded
Sudhakaran & Lanz (2017)	Detect violent videos using ConvLSTM	CNN along with the ConvLSTM	CNN	Crowded	Approximately 97% accuracy
Naik & Gopalakrishna (2021)	Deep violence detection framework based on the specific features derived from handcrafted methods	Discriminative feature with a novel differential motion energy image	CNN	Both crowded and uncrowded
Meng, Yuan & Li (2017)	Detecting Human Violent Behavior by integrating trajectory and Deep CNN	Deep CNN	Optical flow method	Crowded	98% accuracy
Rendón-Segador et al. (2021)	ViolenceNet: Dense Multi-Head Self-Attention with Bidirectional Convolutional LSTM	3D DenseNet	Optical flow method	Crowded	95.6%–100% accuracy
Xia et al. (2018)	Violence detection method based on a bi-channels CNN and the SVM	Linear SVM	Bi-channels CNN	Both crowded and uncrowded scenes	95.90 ± 3.53% accuracy in Hockey fight, 93.25 ± 2.34% accuracy in Violence crowd
Meng et al. (2020)	Trajectory-Pooled Deep Convolutional Networks	ConvNet model which contains 17 convolution-pool-norm layers and two fully connected layers	Deep ConvNet model	Both crowded and uncrowded	92.5% accuracy in Crowd Violence, 98.6% in Hockey Fight dataset
Ullah et al. (2019)	Violence Detection using Spatiotemporal Features	Pre-trained MobileNet CNN model	3D CNN	Crowded	Approximately 97% accuracy