Author manuscript; available in PMC: 2024 May 21.
Published in final edited form as: Adv Neural Inf Process Syst. 2021 Dec;2021(DB1):1–20.

Table 13:

Results on multimodal datasets in the affective computing domain. U: unimodal models, M: multimodal fusion paradigms, O: optimization objectives, T: training structures. MulT is the best-performing model on all four datasets; it is categorized as an in-domain method since it was originally proposed and evaluated on affect recognition datasets. Many out-of-domain methods struggle on these datasets.

| Category | Model | MUStARD Acc(2)↑ | CMU-MOSI Acc(2)↑ | UR-FUNNY Acc(2)↑ | CMU-MOSEI Acc(2)↑ |
|---|---|---|---|---|---|
| U | Unimodal (t) | 68.6±0.4 | 74.2±0.5 | 58.3±0.2 | 78.8±1.5 |
| U | Unimodal (a) | 64.9±0.4 | 65.5±0.2 | 57.2±0.9 | 66.4±0.7 |
| U | Unimodal (v) | 65.7±0.7 | 66.3±0.3 | 57.3±0.5 | 67.2±0.4 |
| M | EF-GRU | 66.3±0.3 | 73.2±2.2 | 60.2±0.5 | 78.4±0.6 |
| M | LF-GRU | 66.1±0.9 | 75.2±0.8 | 62.5±0.5 | 79.2±0.4 |
| M | EF-Transformer | 65.3±1.4 | 78.8±0.4 | 62.9±0.2 | 79.6±0.3 |
| M | LF-Transformer | 66.1±0.9 | 79.6±0.4 | 63.4±0.3 | 80.6±0.3 |
| M | TF [179] | 62.1±2.2 | 74.4±0.2 | 61.2±0.4 | 79.4±0.5 |
| M | LRTF [106] | 65.2±1.5 | 76.3±0.3 | 62.7±0.2 | 79.6±0.6 |
| M | MI-Matrix [77] | 61.8±0.3 | 73.9±0.4 | 61.9±0.3 | 76.5±0.4 |
| M | MulT [154] | 71.8±0.3 | 83.0±0.1 | 66.7±0.3 | 82.1±0.5 |
| O | MFM [155] | 66.3±0.3 | 78.1±0.9 | 62.4±1.1 | 79.4±0.7 |
| O | MVAE [168] | 64.5±0.4 | 77.2±0.3 | 62.0±0.5 | 79.1±0.2 |
| O | MCTN [123] | 63.2±1.4 | 76.9±2.1 | 63.2±0.8 | 76.4±0.4 |
| T | GradBlend [167] | 66.1±0.3 | 75.5±0.5 | 62.3±0.3 | 78.1±0.3 |
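The EF-* and LF-* fusion rows differ only in where modality information is combined: early fusion concatenates per-modality features before a single classifier, while late fusion scores each modality separately and combines the outputs. A minimal NumPy sketch of that distinction (toy feature dimensions and random stand-in weights, not the paper's actual GRU/Transformer encoders):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-modality features for one batch of 4 examples.
text  = rng.normal(size=(4, 8))   # (batch, text_dim)
audio = rng.normal(size=(4, 6))   # (batch, audio_dim)
video = rng.normal(size=(4, 5))   # (batch, video_dim)

def linear_head(x, out_dim=2, seed=1):
    """Stand-in for a trained binary-classification head (random weights)."""
    w = np.random.default_rng(seed).normal(size=(x.shape[1], out_dim))
    return x @ w

# Early fusion (EF-*): concatenate modality features, then classify once.
early_logits = linear_head(np.concatenate([text, audio, video], axis=1))

# Late fusion (LF-*): classify each modality separately, then average scores.
late_logits = (linear_head(text) + linear_head(audio) + linear_head(video)) / 3

print(early_logits.shape, late_logits.shape)  # both (4, 2)
```

In practice the heads are learned sequence encoders rather than a single linear map, but the structural difference (fuse-then-predict vs. predict-then-fuse) is exactly what separates the two row families in the table.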