. 2022 Sep 6;22:234. doi: 10.1186/s12911-022-01977-5

Table 4.

Hyperparameters

Hyperparameter                 Value
num_train_epochs               10
learning_rate                  5e-5
per_device_train_batch_size    16
per_device_eval_batch_size     64
warmup_ratio                   0.1
weight_decay                   0.01
adam_beta1                     0.9
adam_beta2                     0.999
adam_epsilon                   1e-8
max_grad_norm                  1
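
The settings in Table 4 can be collected into a plain configuration mapping, which also makes it easy to see what warmup_ratio implies in optimizer steps. The sketch below is illustrative only: the helper schedule_lengths, the dataset size of 8000 examples, and the single-device, no-gradient-accumulation setup are assumptions and do not come from the table.

```python
import math

# Values copied from Table 4; keys are the parameter names listed there.
HYPERPARAMETERS = {
    "num_train_epochs": 10,
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 64,
    "warmup_ratio": 0.1,
    "weight_decay": 0.01,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "max_grad_norm": 1.0,
}


def schedule_lengths(num_examples, num_devices=1, hp=HYPERPARAMETERS):
    """Return (total_steps, warmup_steps) for a hypothetical dataset size.

    Assumes one optimizer step per batch (no gradient accumulation).
    """
    global_batch = hp["per_device_train_batch_size"] * num_devices
    steps_per_epoch = math.ceil(num_examples / global_batch)
    total_steps = steps_per_epoch * hp["num_train_epochs"]
    # The warmup phase covers the first warmup_ratio fraction of all steps.
    warmup_steps = math.ceil(total_steps * hp["warmup_ratio"])
    return total_steps, warmup_steps


if __name__ == "__main__":
    # 8000 training examples is an assumed figure, used only for illustration.
    total, warmup = schedule_lengths(num_examples=8000)
    print(f"total optimizer steps: {total}, warmup steps: {warmup}")
```

The parameter names match the keyword arguments of the Hugging Face transformers TrainingArguments class, so, assuming that library was used and the installed version accepts these names, the mapping could be passed as TrainingArguments(output_dir=..., **HYPERPARAMETERS).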