| CelebA | Academic | 200k images | Various | Various | 40 attribute annotations per image | High | Public |
| DeeperForensics-1.0 | Academic | 60,000 videos | 1,920 × 1,080 pixels | 100 actors from 26 countries, with various skin tones and ages | Eight naturally expressed emotions | High | Public |
| Deepfake-TIMIT | Academic | 640 videos | 64 × 64 pixels (low quality), 128 × 128 pixels (high quality) | Various | Face swap | Low/High | Public |
| Celeb-DF | Academic | 5,639 videos | 256 × 256 pixels | 59 celebrities of various ethnicities and ages | Face swap | High | Public |
| FaceForensics (FF) | Academic | 1,004 videos | Various | Various | Face2Face reenactment | High | Public |
| UADFV | University at Albany | 49 videos | 294 × 500 pixels | Various | Physiological cues | Medium | Public |
| FaceForensics++ (FF++) | Academic | 1,000 videos | VGA, HD, FullHD | 60% male, 40% female | Face manipulation (swap and reenactment) | High | Public |
| Deepfake Detection Challenge (DFDC) | Facebook | 5,000 videos | Various | 74% female, 26% male, and diverse ethnicities | Face swap | High | Public |
| HOHA-based | Academic | 600 videos | 360 × 240 pixels | Various | Human actions | High | Public |
| FakeAVCeleb | Real YouTube videos of celebrities | Audio and video | Various | Covers four racial backgrounds | Deepfake videos and corresponding synthesized cloned audio | High, with perfect lip-syncing | Public |
| Attack Agnostic | Combines two audio deepfake datasets and one anti-spoofing dataset | Audio clips | N/A (audio only) | Varied types of spoofing attacks | Audio deepfakes designed to challenge detection algorithms | Designed to improve detection generalization | Public |
| ADD 2022 | Audio Deep Synthesis Detection Challenge | Covers a range of real-life scenarios | N/A (audio only) | Real-life and challenging scenarios | Tracks for low-quality fake audio detection and partially fake audio detection | Aims to push the boundaries of detection research | Public |
| TweepFake | Twitter; motivated by advances in language models such as GPT-2 | 25,572 tweets | Various | 23 bots imitating 17 human accounts | Tweets generated using techniques such as Markov chains and LSTMs | Real tweets posted on Twitter | Public |
| ADBT | Twitter; focused on detecting bot-generated messages | Extended from existing work | Various | Generated tweets designed to mimic human writing | Tweets generated to test detection models against language models | Generated content that is difficult to distinguish from human-written text | Public |