Table 2.
Parameter | Task 1 setting | Task 2 setting | Task 3 setting |
---|---|---|---|
Maximum permitted labels | 5 | 8 | 70 |
Maximum tokens in documents | All tokens | First 1500 tokens | |
Neurons in the MLP | 128, 128, 64 | 7024, 7024, 128 | |
Batch size | 32 | 16 | |
Training epochs (label prediction network) | 50 | 30 | |
RNN unit (dimension) | LSTM (50) | LSTM (50) | LSTM (50) |
Attention layer dimension | 50 | 50 | 50 |
Dropout rate | 0.5 | 0.5 | 0.5 |
Optimizer (learning rate) | Adam (0.001) | Adam (0.001) | Adam (0.001) |
Abbreviations: LSTM, long short-term memory; MLP, multi-layer perceptron; RNN, recurrent neural network.
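
As an illustration only, the settings that Table 2 lists as common to all three tasks, together with the per-task label caps, could be collected in a small configuration object. This is a minimal sketch: the names `SharedSettings`, `MAX_LABELS`, and `truncate_labels` are hypothetical, and it assumes the label cap is applied to a ranked list of predicted labels, which the table itself does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SharedSettings:
    """Hyperparameters that Table 2 reports as identical across tasks."""
    rnn_unit: str = "LSTM"
    rnn_dim: int = 50
    attention_dim: int = 50
    dropout_rate: float = 0.5
    optimizer: str = "Adam"
    learning_rate: float = 0.001


# Maximum permitted labels per task, taken from Table 2.
# The dictionary keys are hypothetical identifiers for the three tasks.
MAX_LABELS = {"task1": 5, "task2": 8, "task3": 70}


def truncate_labels(ranked_labels, task):
    """Keep at most the permitted number of labels for the given task.

    Assumes `ranked_labels` is ordered from most to least likely.
    """
    return list(ranked_labels)[: MAX_LABELS[task]]


settings = SharedSettings()
top_labels = truncate_labels(["a", "b", "c", "d", "e", "f"], "task1")
```

Freezing the dataclass keeps the shared values from being changed accidentally between task-specific runs; the task-specific rows of the table (document length, MLP sizes, batch size, epochs) are deliberately left out of this sketch.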