Table 4:
| Component | Model | Parameter | Value |
|---|---|---|---|
| GRU Encoder | GRU | Input sizes | [5,20,35,74,300,704] |
|  |  | Hidden sizes | [32,32,64,128,512,1024] |
|  |  | Num of layers | 1 or 2 |
|  |  | Dropout | 0.0 or 0.1 |
| Transformer Encoder [158] | Transformer [158] | Input sizes | [5,20,35,74,300,704] |
|  |  | Hidden sizes | [32,32,64,128,512,1024] |
|  |  | Num of layers | 2 or 3 |
|  |  | Dropout | 0.2 |
| Head | MLP | Input sizes | [5,20,32,64,128,256] |
|  |  | Hidden sizes | [5,20,32,64,128,256] |
|  |  | Num layers | [2 |
|  |  | Dropout | 0.2 |
| MCTN [123] Encoder | GRU | Input sizes | 300 |
|  |  | Hidden sizes | [32, 64] |
|  |  | Num of layers | 1 or 2 |
|  |  | Dropout | 0.0 or 0.1 |
| MCTN [123] Decoder | GRU | Input sizes | [32, 64] |
|  |  | Hidden sizes | 300 |
|  |  | Num of layers | 1 or 2 |
|  |  | Dropout | 0.0 or 0.1 |
| MCTN [123] Seq2Seq | GRU+GRU | Teaching ratio | 0.5 |
|  |  | Embed sizes | 32 |
|  |  | , , | 0.01 |
| Fusion | LRTF [106] | Num ranks | 64 |
|  |  | Output sizes | 128 |
|  | MI-Matrix [77] | Hidden size | 128 |
|  | MulT [ | Hidden size | 40 |
|  |  | Num heads | 8 or 10 |
| Training |  | Loss | MAE or Cross Entropy |
|  |  | Batch size | 32 |
|  |  | Seq length | 50 or 20 |
|  |  | Num epochs | 100 or 300 |
|  |  | Early stop | True |
|  |  | Patience | [8,20] |
|  |  | Activation | ReLU |
|  |  | Optimizer | AdamW |
|  |  | Weight decay | $1 \times 10^{-4}$ |
|  |  | Learning rate | $1 \times 10^{-4}$ |
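
As a rough illustration of how the settings in Table 4 might be carried into an experiment script, the sketch below stores the single-valued training hyperparameters in a dataclass and expands the two-valued GRU encoder options into candidate configurations. The names `TrainingConfig`, `GRU_ENCODER_OPTIONS`, and `expand_options` are hypothetical and not taken from the source. Treating the "1 or 2" and "0.0 or 0.1" entries as a full grid is also an assumption, since the table does not say whether these values were searched jointly or chosen per dataset.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class TrainingConfig:
    """Single-valued training settings copied from Table 4 (class name is hypothetical)."""
    batch_size: int = 32
    early_stop: bool = True
    activation: str = "ReLU"
    optimizer: str = "AdamW"
    weight_decay: float = 1e-4
    learning_rate: float = 1e-4


# Two-valued GRU encoder entries from Table 4.
# Assumption: they are combined as a full grid; the table does not state this.
GRU_ENCODER_OPTIONS = {
    "num_layers": [1, 2],
    "dropout": [0.0, 0.1],
}


def expand_options(options):
    """Return every combination of the listed option values as a list of dicts."""
    keys = list(options)
    value_lists = [options[key] for key in keys]
    return [dict(zip(keys, combo)) for combo in product(*value_lists)]


if __name__ == "__main__":
    training = TrainingConfig()
    for encoder_choice in expand_options(GRU_ENCODER_OPTIONS):
        print(training.optimizer, training.learning_rate, encoder_choice)
```

Under this assumption the GRU encoder yields four candidate configurations. The other two-valued rows (sequence length, number of epochs, number of heads) could be expanded the same way if they were also meant to be searched.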