2023 Apr 14;4(4):100729. doi: 10.1016/j.patter.2023.100729

Table 3.

Ablation study on optimization adjustments in fine-tuning by comparing BIOSSES test performance under various pretraining settings

| Pretraining setting | Improved optimization | Standard epochs | No bias correction |
|---|---|---|---|
| BERT | 93.46* | 92.64 | 91.75 |
| BERT (no NSP) | 93.12* | 91.31 | 92.35 |
| BERT (no NSP, single seq) | 75.50* | 0.65 | 70.50 |
| ELECTRA | 80.24 | 49.87 | 80.41* |

Improved optimization used bias correction in Adam and up to 100 fine-tuning epochs (vs. up to five epochs in the standard setting). All results use BASE models.

* Highest performance for each model (row).
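To make the "No bias correction" column concrete, the sketch below shows where bias correction enters a textbook Adam update. It is an illustration only, not the paper's training code: the function names, the `bias_correction` flag, and the hyperparameter defaults are assumptions chosen for the example.

```python
import math


def adam_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
              eps=1e-8, bias_correction=True):
    """One scalar Adam update; `bias_correction` is a hypothetical toggle."""
    # Exponential moving averages of the gradient and the squared gradient.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    if bias_correction:
        # Both averages start at zero, so early estimates are biased toward
        # zero; dividing by (1 - beta^t) removes that bias.
        m_hat = m / (1 - beta1 ** t)
        v_hat = v / (1 - beta2 ** t)
    else:
        m_hat, v_hat = m, v
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


def first_step_size(bias_correction, grad=2.0, lr=1e-3):
    """Magnitude of the very first update, starting from zeroed moments."""
    new_param, _, _ = adam_step(0.0, grad, 0.0, 0.0, t=1, lr=lr,
                                bias_correction=bias_correction)
    return abs(new_param)


if __name__ == "__main__":
    print("with correction:   ", first_step_size(True))
    print("without correction:", first_step_size(False))
```

With these default betas, the uncorrected first step is roughly three times larger than the corrected one (a factor of (1 - beta1) / sqrt(1 - beta2)). A larger effective step size early in training is one plausible reason that omitting the correction could affect fine-tuning on a small dataset such as BIOSSES, though the table alone does not establish the mechanism.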