. 2023 Jan 5;2023:9645611. doi: 10.1155/2023/9645611

Table 4. Comparative experiment details.

Model          Speech       Text            Vision      Fusion
MFN [43]       LSTM         LSTM            LSTM        Feature fusion
MCTN [32]      CNN          CNN             CNN         Concatenation
RAVEN [33]     LSTM         LSTM            LSTM        Feature fusion
HFusion [34]   openSMILE    Word2vec + CNN  3D CNN      Hierarchical fusion
MulT [35]      Conv 1D      Conv 1D         Conv 1D     Feature fusion
SSE-FT [40]    Wav2vec      RoBERTa         FabNet      Hierarchical fusion
CMC-HF (ours)  CSA encoder  EFC encoder     PC encoder  Hierarchical fusion