Author manuscript; available in PMC: 2020 May 1.
Published in final edited form as: Proc Conf Assoc Comput Linguist Meet. 2019 Jul;2019:6558–6569. doi: 10.18653/v1/p19-1656

Table 2:

Results for multimodal sentiment analysis on the (relatively large-scale) CMU-MOSEI dataset with word-aligned and unaligned multimodal sequences. Superscript h marks metrics where higher is better; superscript l marks metrics where lower is better.

(Word Aligned) CMU-MOSEI Sentiment

| Metric | Acc7 (h) | Acc2 (h) | F1 (h) | MAE (l) | Corr (h) |
|---|---|---|---|---|---|
| EF-LSTM | 47.4 | 78.2 | 77.9 | 0.642 | 0.616 |
| LF-LSTM | 48.8 | 80.6 | 80.6 | 0.619 | 0.659 |
| Graph-MFN (Zadeh et al., 2018b) | 45.0 | 76.9 | 77.0 | 0.71 | 0.54 |
| RAVEN (Wang et al., 2019) | 50.0 | 79.1 | 79.5 | 0.614 | 0.662 |
| MCTN (Pham et al., 2019) | 49.6 | 79.8 | 80.6 | 0.609 | 0.670 |
| MulT (ours) | 51.8 | 82.5 | 82.3 | 0.580 | 0.703 |

(Unaligned) CMU-MOSEI Sentiment

| Metric | Acc7 (h) | Acc2 (h) | F1 (h) | MAE (l) | Corr (h) |
|---|---|---|---|---|---|
| CTC (Graves et al., 2006) + EF-LSTM | 46.3 | 76.1 | 75.9 | 0.680 | 0.585 |
| LF-LSTM | 48.8 | 77.5 | 78.2 | 0.624 | 0.656 |
| CTC + RAVEN (Wang et al., 2019) | 45.5 | 75.4 | 75.7 | 0.664 | 0.599 |
| CTC + MCTN (Pham et al., 2019) | 48.2 | 79.3 | 79.7 | 0.631 | 0.645 |
| MulT (ours) | 50.7 | 81.6 | 81.6 | 0.591 | 0.694 |
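The metric columns above follow the standard CMU-MOSEI evaluation protocol: models regress a continuous sentiment score in [-3, 3], from which 7-class accuracy (Acc7, after rounding to integer classes), binary accuracy and F1 (Acc2/F1, positive vs. non-positive), mean absolute error (MAE), and Pearson correlation (Corr) are derived. The sketch below, a hypothetical helper not taken from the paper's code, shows one common way these metrics are computed; note that exact conventions (e.g. whether exactly-zero labels are excluded from the binary split) vary slightly across papers.

```python
import numpy as np

def mosei_metrics(preds, labels):
    """Sketch of the CMU-MOSEI sentiment metrics (assumed conventions).

    preds/labels: continuous sentiment scores in [-3, 3].
    Returns Acc7, Acc2, F1 (higher is better), MAE (lower is better),
    and Pearson correlation (higher is better).
    """
    preds = np.asarray(preds, dtype=float)
    labels = np.asarray(labels, dtype=float)

    # Acc7: round to the nearest integer class and clip to [-3, 3]
    preds7 = np.clip(np.rint(preds), -3, 3)
    labels7 = np.clip(np.rint(labels), -3, 3)
    acc7 = float(np.mean(preds7 == labels7))

    # Acc2/F1: binarize as positive (> 0) vs. non-positive
    pred_pos = preds > 0
    label_pos = labels > 0
    acc2 = float(np.mean(pred_pos == label_pos))
    tp = np.sum(pred_pos & label_pos)
    fp = np.sum(pred_pos & ~label_pos)
    fn = np.sum(~pred_pos & label_pos)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)

    mae = float(np.mean(np.abs(preds - labels)))   # lower is better
    corr = float(np.corrcoef(preds, labels)[0, 1]) # Pearson r

    return {"Acc7": acc7, "Acc2": acc2, "F1": f1,
            "MAE": mae, "Corr": corr}
```

For example, `mosei_metrics([2.1, -1.2, 0.5, -0.4], [2.0, -1.0, 1.0, -0.5])` yields an MAE of 0.225 and a binary accuracy of 1.0, since all predictions fall on the correct side of zero.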