2021 Mar 26;117:103761. doi: 10.1016/j.jbi.2021.103761

Table 8.

Hyperparameters for the Span-based Event Extractor.

Parameter                                   Value
Maximum sentence length, n                  30
Maximum span length, M                      6
Top-K spans per classifier                  sentence token count
Batch size                                  100
Number of epochs                            100
Learning rate                               0.001
Optimizer                                   Adam
Maximum gradient L2-norm                    100
BERT embedding dropout                      0.3
bi-LSTM hidden size, v_h                    200
bi-LSTM activation function                 tanh
bi-LSTM dropout                             0.3
Span classifier projection size, v_s        100
Span classifier activation function         ReLU
Span classifier dropout                     0.3
Role classifier projection size, v_r        100
Role classifier activation function         ReLU
Role classifier dropout                     0.3
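
As an illustration of how these settings might be carried into code, the sketch below collects the table values in a Python dataclass and counts the candidate spans implied by n and M. The class name `ExtractorConfig`, its field names, and the helper `enumerate_spans` are hypothetical and do not come from the paper. The enumeration assumes that candidates are all contiguous token spans of length at most M, which is a common convention for span-based extractors but is not stated in the table itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtractorConfig:
    """Values copied from Table 8; the field names are hypothetical."""
    max_sentence_length: int = 30      # n
    max_span_length: int = 6           # M
    batch_size: int = 100
    num_epochs: int = 100
    learning_rate: float = 0.001
    optimizer: str = "Adam"
    max_grad_l2_norm: float = 100.0
    bert_embedding_dropout: float = 0.3
    lstm_hidden_size: int = 200        # v_h
    lstm_activation: str = "tanh"
    lstm_dropout: float = 0.3
    span_projection_size: int = 100    # v_s
    span_activation: str = "ReLU"
    span_dropout: float = 0.3
    role_projection_size: int = 100    # v_r
    role_activation: str = "ReLU"
    role_dropout: float = 0.3


def enumerate_spans(num_tokens: int, max_span_length: int) -> list[tuple[int, int]]:
    """List (start, end) token indices, end inclusive, for every contiguous
    span of at most max_span_length tokens.

    Assumption: exhaustive enumeration; the table does not say how
    candidate spans are generated.
    """
    return [
        (start, end)
        for start in range(num_tokens)
        for end in range(start, min(start + max_span_length, num_tokens))
    ]


config = ExtractorConfig()
spans = enumerate_spans(config.max_sentence_length, config.max_span_length)
# Top-K is listed as the sentence token count, so K = n for a full-length sentence.
top_k = config.max_sentence_length
print(len(spans), top_k)
```

Under these assumptions, a sentence of the maximum length n = 30 with M = 6 yields 165 candidate spans, of which each classifier would keep the top 30.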