. 2020 Dec 15;8(12):e22982. doi: 10.2196/22982

Table 2.

The optimized hyperparameters of BERT-based models for various tasks.

| Task | Pretrained model | Number of epochs | Batch size | Learning rate |
|---|---|---|---|---|
| NERᵃ | BERTᵇ-LARGE | 30 | 4 | 1.00 × 10⁻⁵ |
| Negation classification | BERT-LARGE | 5 | 8 | 1.00 × 10⁻⁵ |
| Side of family classification | BERT-LARGE | 10 | 4 | 1.00 × 10⁻⁵ |
| Role of family classification | BERT-LARGE | 5 | 8 | 1.00 × 10⁻⁵ |
| Living status classification | BERT-LARGE | 6 | 8 | 1.00 × 10⁻⁵ |
| Relation identification | BERT-LARGE | 12 | 16 | 2.00 × 10⁻⁵ |

ᵃNER: named entity recognition.

ᵇBERT: bidirectional encoder representations from transformers.
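
For readers who want to reuse these settings, the values in Table 2 can be captured as a small configuration mapping. The following is a minimal sketch, not the authors' code: the class name `FineTuneConfig`, the dictionary `TABLE2_HYPERPARAMETERS`, the task keys, and the helper `get_config` are all hypothetical names introduced for illustration. Only the numeric values and the model label "BERT-LARGE" come from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    """One row of Table 2 (hypothetical container, not from the paper)."""
    pretrained_model: str
    num_epochs: int
    batch_size: int
    learning_rate: float


# Values transcribed from Table 2; the task keys are illustrative names.
TABLE2_HYPERPARAMETERS = {
    "ner": FineTuneConfig("BERT-LARGE", 30, 4, 1e-5),
    "negation_classification": FineTuneConfig("BERT-LARGE", 5, 8, 1e-5),
    "side_of_family_classification": FineTuneConfig("BERT-LARGE", 10, 4, 1e-5),
    "role_of_family_classification": FineTuneConfig("BERT-LARGE", 5, 8, 1e-5),
    "living_status_classification": FineTuneConfig("BERT-LARGE", 6, 8, 1e-5),
    "relation_identification": FineTuneConfig("BERT-LARGE", 12, 16, 2e-5),
}


def get_config(task: str) -> FineTuneConfig:
    """Return the Table 2 settings for a task, with a readable error otherwise."""
    try:
        return TABLE2_HYPERPARAMETERS[task]
    except KeyError:
        known = ", ".join(sorted(TABLE2_HYPERPARAMETERS))
        raise KeyError(f"unknown task {task!r}; expected one of: {known}") from None


if __name__ == "__main__":
    cfg = get_config("relation_identification")
    print(cfg.num_epochs, cfg.batch_size, cfg.learning_rate)
```

The table does not say which BERT-LARGE checkpoint (for example, cased or uncased) was used, so the model label is kept exactly as printed rather than mapped to a specific checkpoint identifier.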