
Table 4.

Hyperparameters used for further in-domain pretraining of the deep learning methods.

Parameter                         BERT^a     RoBERTa^b    XLNet^c
Learning rate                     2e-5       2e-5         2e-5
Training steps                    100,000    100,000      100,000
Maximum sequence length (tokens)  256        256          256
Batch size                        16         16           16
Warm-up steps                     10,000     10,000       10,000

^a BERT: bidirectional encoder representations from transformers.

^b RoBERTa: robustly optimized bidirectional encoder representations from transformers pretraining approach.

^c XLNet: generalized autoregressive pretraining for language understanding.
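As a minimal sketch of how these hyperparameters could be applied in practice, the following example assumes the Hugging Face transformers and datasets libraries, a BERT/RoBERTa-style masked-language-modeling objective (XLNet uses a permutation-language-modeling objective instead), and a placeholder in-domain corpus file named domain_corpus.txt; it is illustrative only and not the authors' original training code.

```python
# Illustrative sketch (not the authors' code): continued in-domain pretraining
# with the hyperparameters listed in Table 4, using Hugging Face Transformers.
from transformers import (
    AutoTokenizer,
    AutoModelForMaskedLM,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)
from datasets import load_dataset

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

# "domain_corpus.txt" is a placeholder for the in-domain pretraining corpus.
dataset = load_dataset("text", data_files={"train": "domain_corpus.txt"})["train"]

def tokenize(batch):
    # Maximum sequence length of 256 tokens, as in Table 4.
    return tokenizer(batch["text"], truncation=True, max_length=256)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

# Masked-language-modeling collator (BERT/RoBERTa-style pretraining objective).
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="bert-indomain",
    learning_rate=2e-5,              # learning rate from Table 4
    max_steps=100_000,               # training steps from Table 4
    per_device_train_batch_size=16,  # batch size from Table 4
    warmup_steps=10_000,             # warm-up steps from Table 4
)

Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=collator,
).train()
```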