Author manuscript; available in PMC: 2024 May 21.
Published in final edited form as: Adv Neural Inf Process Syst. 2021 Dec;2021(DB1):1–20.

Table 11:

Table of hyperparameters for prediction on the MM-IMDb dataset in the multimedia domain.

| Component | Model | Parameter | Value |
|---|---|---|---|
| Text Encoder | 2-Layer MaxoutMLP | Hidden size | 512 |
| | | Output dim | 128/256/512 |
| | | MLP num | 2 |
| Image Encoder | 2-Layer MaxoutMLP | Hidden size | 1024 |
| | | Output dim | 128/256/512 |
| | | MLP num | 2 |
| Classification Head | Linear | | |
| | 2-Layer MLP | Hidden size | 512 |
| | | Activation | ReLU |
| | 2-Layer Maxout_Linear | Hidden size | 512 |
| | | MLP num | 2 |
| Fusion | Concatenate | | |
| | LRTF [106] | Output dim | 512 |
| | | Ranks | 128 |
| | MI-Matrix [77] | Output dim | 1024 |
| Training | Unimodal, EF, LF, LRTF, MI-Matrix | Loss | Binary Cross Entropy |
| | | Batch size | 128 |
| | | Num epochs | Text: 125, Image: 25, LF: 5, EF/LRTF: 15, MI-Matrix: 20 |
| | | Optimizer | AdamW |
| | | Learning rate | Unimodal: 0.0001, EF: 0.04, LF/LRTF/MI-Matrix: 0.008 |
| | | Weight decay | 0.01 |
| | CCA [145] | Loss | Binary Cross Entropy + CCA |
| | | CCA weight | 0.001 |
| | | Batch size | 800 |
| | | Num epochs | 20 |
| | | Optimizer | AdamW |
| | | Learning rate | 0.01 |
| | | Weight decay | 0.01 |
| | RMFE [53] | Loss | Binary Cross Entropy + Regularization |
| | | Regularization weight | 1e-10 |
| | | Batch size | 128 |
| | | Num epochs | 10 |
| | | Optimizer | AdamW |
| | | Learning rate | 0.01 |
| | | Weight decay | 0.01 |
| | RefNet [135] | Loss | Binary Cross Entropy + Contrast + Self-supervised |
| | | Contrast weight | 0.0001 |
| | | Self-supervised weight | 0.1 |
| | | Batch size | 128 |
| | | Num epochs | 10 |
| | | Optimizer | AdamW |
| | | Learning rate | 0.01 |
| | | Weight decay | 0.01 |
| | MFM [155] | Loss | Binary Cross Entropy + Reconstruction (MSE) |
| | | Batch size | 128 |
| | | Num epochs | 10 |
| | | Optimizer | Adam |
| | | Learning rate | 0.005 |
| | | Recon Loss Modality Weight | [1,1] |
| | | Cross Entropy Weight | 2.0 |
| | | Intermediate Modules | MLP [512,256,256]; MLP [512,256,256]; MLP [1024,512,256] |
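
As an illustration of how the shared training settings in Table 11 might be carried into code, the following is a minimal sketch of a per-method configuration lookup for the Unimodal, EF, LF, LRTF, and MI-Matrix rows. The numeric values are transcribed from the table. The names `TrainConfig`, `train_config`, and the dictionary layout are hypothetical and do not come from the authors' code, and the sketch assumes that the "Unimodal" learning rate applies to both the Text and Image models.

```python
from dataclasses import dataclass

# Settings shared by the Unimodal, EF, LF, LRTF, and MI-Matrix rows of Table 11.
LOSS = "Binary Cross Entropy"
BATCH_SIZE = 128
OPTIMIZER = "AdamW"
WEIGHT_DECAY = 0.01

# Per-method values from Table 11. EF and LRTF share an epoch count;
# LF, LRTF, and MI-Matrix share a learning rate.
NUM_EPOCHS = {
    "Text": 125, "Image": 25, "LF": 5, "EF": 15, "LRTF": 15, "MI-Matrix": 20,
}
LEARNING_RATE = {
    # Assumption: the table's "Unimodal" rate covers both Text and Image.
    "Text": 0.0001, "Image": 0.0001,
    "EF": 0.04, "LF": 0.008, "LRTF": 0.008, "MI-Matrix": 0.008,
}


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical container for one method's training settings."""
    method: str
    loss: str
    batch_size: int
    num_epochs: int
    optimizer: str
    learning_rate: float
    weight_decay: float


def train_config(method: str) -> TrainConfig:
    """Return the Table 11 training settings for one of the listed methods."""
    if method not in NUM_EPOCHS:
        raise KeyError(f"no Table 11 entry for method {method!r}")
    return TrainConfig(
        method=method,
        loss=LOSS,
        batch_size=BATCH_SIZE,
        num_epochs=NUM_EPOCHS[method],
        optimizer=OPTIMIZER,
        learning_rate=LEARNING_RATE[method],
        weight_decay=WEIGHT_DECAY,
    )


if __name__ == "__main__":
    for name in NUM_EPOCHS:
        print(train_config(name))
```

CCA, RMFE, RefNet, and MFM are left out of this sketch because they use their own loss terms and extra weights, so they would need additional fields beyond the shared ones.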