2020 Sep 8;7:293. doi: 10.1038/s41597-020-00630-y

Table 1.

Comparison of the K-EmoCon dataset with the existing multimodal emotion recognition datasets.

| Name (year) | Size | Modalities | Spon. vs. posed | Natural vs. induced | Annotation method | Annotation type | Context |
|---|---|---|---|---|---|---|---|
| IEMOCAP (2008) [51] | 10 | Videos, face motion capture, gesture, speech (audio & transcribed) | Both | Both | Per dialog turn | S, E | Dyadic |
| SEMAINE (2011) [52] | 150 | Videos, FAUs, speech (audio & transcribed) | Spon. | Induced | Trace-style continuous | E | Dyadic |
| MAHNOB-HCI (2011) [23] | 27 | Videos (face and body), eye gaze, audio, biosignals (EEG, GSR, ECG, respiration, skin temp.) | Spon. | Induced | Per stimuli | S | Individual |
| DEAP (2012) [24] | 32 | Face videos, biosignals (EEG, GSR, BVP, respiration, skin temp., EMG & EOG) | Spon. | Induced | Per stimuli | S | Individual |
| DECAF (2015) [25] | 30 | NIR face videos, biosignals (MEG, hEOG, ECG, tEMG) | Spon. | Induced | Per stimuli | S | Individual |
| ASCERTAIN (2016) [26] | 58 | Facial motion units (EMO), biosignals (ECG, GSR, EEG) | Spon. | Induced | Per stimuli | S | Individual |
| MSP-IMPROV (2016) [53] | 12 | Face videos, speech audio | Both | Both | Per dialog turn | E | Dyadic |
| DREAMER (2017) [27] | 23 | Biosignals (EEG, ECG) | Spon. | Induced | Per stimuli | S | Individual |
| AMIGOS (2018) [28] | 40 | Videos (face & body), biosignals (EEG, ECG, GSR) | Spon. | Induced | Per stimuli | S, E | Individual, Group |
| MELD (2019) [38] | 7 | Videos, speech (audio & transcribed) | Both | Both | Turn-based | E | Dyadic, Group |
| CASE (2019) [29] | 30 | Biosignals (ECG, respiration, BVP, GSR, skin temp., EMG) | Spon. | Induced | Trace-style continuous | S | Individual |
| CLAS (2020) [100] | 64 | Biosignals (ECG, PPG, EDA), accelerometer | Spon. | Induced | Per stimuli/task | Predefined | Individual |
| K-EmoCon (2020) | 32 | Videos (face, gesture), speech audio, accelerometer, biosignals (EEG, ECG, BVP, EDA, skin temp.) | Spon. | Natural | Interval-based continuous | S, P, E | Dyadic |

Spon. = spontaneous. Posed emotions are those that a subject is instructed to enact, and induced emotions are those elicited with a set of selected stimuli. For annotation types, S = self-annotations, P = partner annotations, and E = external observer annotations.

A dataset was considered to contain induced emotions if scripted interaction was involved in the data collection, even if no artificial stimuli (such as an emotion-inducing video clip) were used.

Predefined emotion categories of stimuli and success rates of participants in a set of purposefully selected cognitive tasks were used as ground-truth labels.
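
As an illustration of how the legend for Table 1 can be applied, the following is a minimal Python sketch that encodes the annotation-type codes (S, P, E) and a few rows copied from the table, then selects the datasets annotated from all three perspectives. The `Dataset` class, its field names, and the helper functions are hypothetical names introduced only for this example; they are not part of any released K-EmoCon tooling. Only a subset of rows is included, and the "Predefined" annotation type used by CLAS is left out because it is not one of the three codes.

```python
from dataclasses import dataclass

# Annotation-type codes as defined in the legend of Table 1.
ANNOTATION_CODES = {"S": "self", "P": "partner", "E": "external observer"}


@dataclass(frozen=True)
class Dataset:
    """One row of Table 1 (hypothetical structure for illustration)."""
    name: str
    year: int
    size: int
    natural_vs_induced: str
    annotation_types: tuple
    context: str


# A subset of rows copied from Table 1; the remaining rows are omitted for brevity.
ROWS = [
    Dataset("IEMOCAP", 2008, 10, "Both", ("S", "E"), "Dyadic"),
    Dataset("DEAP", 2012, 32, "Induced", ("S",), "Individual"),
    Dataset("AMIGOS", 2018, 40, "Induced", ("S", "E"), "Individual, Group"),
    Dataset("K-EmoCon", 2020, 32, "Natural", ("S", "P", "E"), "Dyadic"),
]


def expand_annotations(dataset):
    """Spell out the annotation codes of a dataset using the legend."""
    return [ANNOTATION_CODES[code] for code in dataset.annotation_types]


def with_all_perspectives(rows):
    """Names of datasets annotated by self, partner, and external observers."""
    return [
        ds.name
        for ds in rows
        if set(ds.annotation_types) == set(ANNOTATION_CODES)
    ]


if __name__ == "__main__":
    for ds in ROWS:
        print(ds.name, expand_annotations(ds))
    print(with_all_perspectives(ROWS))
```

Among the rows encoded here, only K-EmoCon carries all three annotation perspectives, which matches the "S, P, E" entry in the table.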