2020 Jul 29;8(7):e17958. doi: 10.2196/17958

Table 3.

Hyperparameters for the deep-learning methods.

Parameter          BERT^a    RoBERTa^b    XLNet^c
Learning rate      1e-5      1e-5         2e-5
Training steps     7000      7000         7000
Maximum length     128       128          128
Batch size         16        16           16
Warm-up steps      700       700          700
Dropout rate       0.3       0.3          0.3

^a BERT: bidirectional encoder representations from transformers.

^b RoBERTa: robustly optimized bidirectional encoder representations from transformers pretraining approach.

^c XLNet: generalized autoregressive pretraining for language understanding.
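
As a minimal sketch, the hyperparameters in Table 3 map onto a standard Hugging Face transformers fine-tuning setup roughly as follows. The checkpoint name, the number of labels, and the dataset wiring are illustrative assumptions, not the authors' exact training code; only the numeric values (learning rate, steps, maximum length, batch size, warm-up steps, dropout) come from the table.

```python
# Sketch: fine-tuning configuration using the Table 3 hyperparameters.
# Checkpoint, task (binary classification), and dataset are assumptions.
from transformers import (
    AutoConfig,
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

MODEL_NAME = "bert-base-uncased"  # assumed checkpoint; RoBERTa/XLNet analogous

# Dropout rate 0.3 (Table 3), applied to the encoder's dropout layers.
config = AutoConfig.from_pretrained(
    MODEL_NAME,
    num_labels=2,                      # assumed binary classification task
    hidden_dropout_prob=0.3,
    attention_probs_dropout_prob=0.3,
)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, config=config)
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)

def tokenize(batch):
    # Maximum sequence length 128 (Table 3).
    return tokenizer(
        batch["text"], truncation=True, padding="max_length", max_length=128
    )

training_args = TrainingArguments(
    output_dir="out",
    learning_rate=1e-5,                # 2e-5 for XLNet per Table 3
    max_steps=7000,                    # training steps
    per_device_train_batch_size=16,    # batch size
    warmup_steps=700,                  # warm-up steps
)

# trainer = Trainer(model=model, args=training_args, train_dataset=...)
# trainer.train()
```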