PeerJ Comput Sci. 2024 May 27;10:e2037. doi: 10.7717/peerj-cs.2037

Table 2. Summary of digital forensic techniques for deepfake detection.

A summary of the method used, modality, features analyzed, validation datasets, and performance metrics, along with the advantages and limitations of each technique.

Reference | Method | Modality | Features analyzed | Validation dataset | Performance metrics | Advantages | Limitations
Xia et al. (2022) | MesoNet with preprocessing | Video (face images) | Enhanced MesoNet features, frame consistency, color and texture details | FaceForensics++ and DFDC Preview dataset | Accuracy: 95.6% (FaceForensics++), 93.7% (DFDC) | Preprocessing module enhances the discriminative capability of the network. Robust against various compression levels and deepfake generation techniques (see the MesoNet-style sketch after this table). | Performance might vary with the quality of the deepfakes. Slight computational overhead due to preprocessing.
Guarnera et al. (2020) | Feature-based forensic analysis | Image | JPEG artifacts, quantization tables, sensor noise patterns | Custom dataset of StarGAN and StyleGAN images | Qualitative analysis | Targets intrinsic features and artifacts, making it robust against typical manipulations. Can be applied to a wide variety of image sources and formats. | Might not be as effective against advanced manipulation techniques. Requires high-quality original images for optimal performance.
Kumar et al. (2020) | Counter anti-forensic approach | Image | JPEG compression artifacts, histogram analysis, noise inconsistencies | Self-created dataset with a variety of JPEG manipulations | Effectiveness in detecting anti-forensic manipulations discussed | Specifically designed to detect and counter anti-forensic techniques. Utilizes multiple feature sets for a comprehensive analysis. | May require calibration for the specific JPEG anti-forensic technique used. Performance might vary with the quality and type of manipulations.
Raza, Munir & Almutairi (2022) | Convolutional neural network (CNN) approach | Video | Deep features from CNN layers, temporal dynamics and spatial details | DFDC and DeepFake-TIMIT | Accuracy: 96.4% (DFDC), 95.7% (DeepFake-TIMIT) | Utilizes deep features that capture intricate details often missed by traditional methods. Highly scalable due to the deep learning framework. | Requires a significant amount of labeled data for training. Performance might degrade with limited training data or diverse manipulations.
Mitra et al. (2020) | Machine learning-based forensic analysis | Video (face regions) | Frame-by-frame pixel intensity, facial expressions and landmarks, audio-visual synchronization | DFDC Preview dataset | Accuracy: 94.7%, Precision: 94.5%, Recall: 94.8%, F1 score: 94.6% | Integrates both visual and auditory features for improved detection. Applicable to a wide range of videos sourced from social media platforms. | Might be sensitive to noisy social media data. Requires substantial computational resources for feature extraction and analysis.
Vamsi et al. (2022) | Media forensic deepfake detection | Image and video | Compression artifacts, lighting anomalies, physiological signals (e.g., heartbeat, breathing patterns) | Combined dataset from FaceForensics++, DFDC, and DeepFake-TIMIT | Accuracy: 93.5% | Comprehensive approach that combines various media forensic techniques. Targets both superficial and deep features of manipulated content. | May require high-resolution data to detect subtle physiological signals. Computationally intensive due to the combination of multiple forensic methods.
Lee et al. (2021) | Temporal Artifacts Reduction (TAR) | Video (face regions) | Temporal artifacts in frame sequences, lighting and shadow inconsistencies | DeepFake Detection Challenge (DFDC) dataset | Accuracy: 97.3% | Targets inconsistencies arising from the deepfake generation process. Effective in detecting subtle temporal artifacts. | Might be sensitive to video quality and resolution. Requires a sequence of frames.
Li et al. (2020) | Dataset-based forensics (Celeb-DF) | Video (face regions) | Dataset creation and benchmarking | Custom Celeb-DF dataset | Focus on dataset creation | Provides a large-scale, challenging dataset for deepfake forensics. Contains high-quality deepfakes. | Dataset complexity might challenge traditional forensic techniques. Needs other datasets for comprehensive evaluation.
Kumar & Sharma (2023) | GAN-based forensic detection | Image and video | Discriminative features from GAN layers, texture and color anomalies | DFDC and FaceForensics++ | Accuracy: 96.1% (DFDC), 95.8% (FaceForensics++) | Leverages GANs for deepfake detection. Capable of detecting intricate manipulations. | Sensitive to the quality of GAN-generated fakes. Requires significant computational resources.
Hao et al. (2022) | Multi-modal fusion | Image and audio | Image: differences in pixel intensity, facial landmarks, skin tone inconsistencies. Audio: spectral, prosodic, and phonotactic features. | DeepFake Detection Challenge (DFDC) dataset | Accuracy: 94.2%, Precision: 93.8%, Recall: 94.1%, F1 score: 94.0% | Fuses image and audio modalities, which increases robustness. Effective in real-world scenarios where only one modality might be tampered with (see the late-fusion sketch after this table). | Requires both audio and video data, which might not always be available. Slightly higher computational overhead due to multi-modal processing.
Jafar et al. (2020) | Temporal forensic analysis | Video | Temporal inconsistencies (frame-to-frame variations), compression artifacts from video encoding, lighting inconsistencies in shadows and reflections | FaceForensics++ and DeepFake-TIMIT | Accuracy: 91.5% (FaceForensics++), 89.8% (DeepFake-TIMIT) | Targets inconsistencies that arise from the video generation process. Robust against various deepfake generation techniques (see the frame-difference sketch after this table). | Performance might degrade with higher-quality deepfakes. Requires a sequence of frames rather than individual images.
Ferreira, Antunes & Correia (2021) | Dataset-based forensics | Image and video | Metadata extraction, image source identification, manipulation detection | Proprietary dataset introduced in the article | Accuracy and precision | Provides a diverse set of images and videos for forensic analysis. Can be used to benchmark multiple forensic techniques. | Dataset might not cover all possible manipulations and scenarios. Requires periodic updates to remain relevant.
Wang et al. (2022a, 2022b, 2022c) | Reliability-based forensics | Video and audio | Frame consistency, eye-blinking patterns, facial muscle movements, skin texture analysis, and voice patterns | Celeb-DF dataset | Accuracy: 92.3%, Precision: 92.1%, Recall: 92.4%, F1 score: 92.2% | Uses natural physiological signals that are hard for deepfakes to mimic. Applicable to a wide range of videos regardless of content. | Might be sensitive to video quality and resolution. Real-life scenarios with partial occlusion or low lighting might affect performance.
Xue et al. (2022) | Combination of F0 information and spectrogram features | Audio | Fundamental frequency (F0), real and imaginary spectrogram features | ASVspoof 2019 LA dataset | Equal error rate (EER) of 0.43% | Highly effective at detecting audio deepfakes, surpassing most existing systems (see the EER sketch after this table). | Limited discussion of applicability to diverse real-world scenarios.
Müller et al. (2022) | Re-implementation and evaluation of existing architectures | Audio | Various audio spoofing detection features | New dataset of celebrity and politician recordings | Performance degradation on real-world data | Systematizes audio spoofing detection and identifies key features of successful systems. | Poor performance on real-world data, suggesting limited generalizability.
Khalid et al. (2021) | Novel multimodal detection method | Audio and video | Deepfake videos and synthesized cloned audio | FakeAVCeleb dataset | Accuracy and precision for both audio and video streams | Addresses multimodal deepfake detection and racial bias issues. | Dataset might not cover all possible manipulations and scenarios.
Fagni et al. (2021) | Dataset introduction and evaluation | Text (tweets) | Tweets from various generative models | TweepFake dataset | Evaluation of 13 methods | First dataset of real deepfake tweets; provides a baseline for future research. | Specific detection techniques not developed in the article.
Kietzmann et al. (2020) | R.E.A.L. framework | Various (including text) | Deepfake types and technologies | Primarily text-based assessment of accuracy | Framework effectiveness | Comprehensive overview and risk-management strategy. | Lacks empirical validation.
Pu et al. (2023) | Semantic analysis | Text | Semantic information in text | Online services powered by Transformer-based tools | Robustness against adversarial attacks | Improves robustness and generalization. | Performance degradation under certain scenarios.
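
For readers unfamiliar with MesoNet-style detectors (Xia et al., 2022), the following is a minimal, illustrative sketch of a Meso-4-like convolutional network that classifies 256×256 face crops as real or fake. It follows the publicly described Meso-4 layer layout; the layer widths, dropout rate, and input size are assumptions for illustration, and Xia et al.'s preprocessing module is not reproduced.

```python
# Minimal Meso-4-style CNN for binary real/fake classification of 256x256 face crops.
# Illustrative sketch only; it does NOT include Xia et al.'s preprocessing module.
import torch
import torch.nn as nn

class Meso4Like(nn.Module):
    def __init__(self):
        super().__init__()
        def block(c_in, c_out, k, pool):
            # conv -> batch norm -> ReLU -> max pooling
            return nn.Sequential(
                nn.Conv2d(c_in, c_out, k, padding=k // 2),
                nn.BatchNorm2d(c_out),
                nn.ReLU(inplace=True),
                nn.MaxPool2d(pool),
            )
        self.features = nn.Sequential(
            block(3, 8, 3, 2),    # 256 -> 128
            block(8, 8, 5, 2),    # 128 -> 64
            block(8, 16, 5, 2),   # 64 -> 32
            block(16, 16, 5, 4),  # 32 -> 8
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Dropout(0.5),
            nn.Linear(16 * 8 * 8, 16),
            nn.LeakyReLU(0.1),
            nn.Linear(16, 1),  # logit; apply a sigmoid for a fake-probability
        )

    def forward(self, x):
        return self.classifier(self.features(x))

# Example: a batch of four random 256x256 RGB face crops
logits = Meso4Like()(torch.randn(4, 3, 256, 256))
```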
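
Temporal forensic approaches such as Lee et al. (2021) and Jafar et al. (2020) exploit frame-to-frame inconsistencies. The snippet below is a generic sketch of one such low-level cue, the mean absolute difference between consecutive grayscale frames computed with OpenCV; it is not either paper's actual pipeline, and the function name and frame limit are illustrative assumptions.

```python
# Illustrative frame-to-frame difference profile; spikes can hint at temporal
# artifacts introduced by frame-wise face swapping. Not the cited methods.
import cv2
import numpy as np

def frame_difference_profile(video_path, max_frames=300):
    cap = cv2.VideoCapture(video_path)
    prev, diffs = None, []
    while len(diffs) < max_frames:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY).astype(np.float32)
        if prev is not None:
            diffs.append(float(np.mean(np.abs(gray - prev))))
        prev = gray
    cap.release()
    return np.array(diffs)  # one value per consecutive frame pair
```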
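
Hao et al. (2022) fuse image and audio cues; their architecture is not reproduced here, but a simple late-fusion baseline illustrates the general idea of combining per-modality fake probabilities into a single decision. The weights, scores, and threshold below are hypothetical.

```python
import numpy as np

def late_fusion(p_video, p_audio, w_video=0.5):
    """Weighted average of per-modality deepfake probabilities (illustrative only)."""
    return w_video * np.asarray(p_video) + (1.0 - w_video) * np.asarray(p_audio)

# Hypothetical outputs from two single-modality detectors
video_scores = np.array([0.91, 0.15, 0.78])
audio_scores = np.array([0.88, 0.22, 0.35])
fused = late_fusion(video_scores, audio_scores)
predicted_fake = fused >= 0.5  # final per-sample decision
```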
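
Audio spoofing results such as Xue et al.'s 0.43% are reported as an equal error rate (EER): the operating point at which the false acceptance and false rejection rates coincide. The snippet below is a standard way to estimate EER from detector scores with scikit-learn; it is a generic utility, not code from the cited work.

```python
import numpy as np
from sklearn.metrics import roc_curve

def equal_error_rate(labels, scores):
    """labels: 1 = spoofed/deepfake, 0 = bona fide; scores: higher = more likely spoofed."""
    fpr, tpr, _ = roc_curve(labels, scores)
    fnr = 1.0 - tpr
    idx = np.nanargmin(np.abs(fnr - fpr))  # point where FPR ~= FNR
    return (fpr[idx] + fnr[idx]) / 2.0     # EER as a fraction (multiply by 100 for %)

# Toy example with made-up scores
labels = np.array([0, 0, 0, 1, 1, 1])
scores = np.array([0.1, 0.4, 0.2, 0.8, 0.7, 0.9])
print(f"EER = {equal_error_rate(labels, scores):.2%}")
```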