2024 Aug 27;3:e52190. doi: 10.2196/52190

Table 3.

BERT^a hyperparameter tuning in the internal training and validation cohorts using 5-fold experiments.

Batch size	Max length	Learning rate	Epoch	Value, mean (SD)
64	—^b	2×10^–5	—	0.78 (0.01)
128	—	2×10^–5	—	0.80 (0.01)
128	128	2×10^–5	3	0.80 (0.01)
256	64	2×10^–5	—	0.79 (0.01)
64	—	3×10^–5	—	0.79 (0.01)
128	—	3×10^–5	—	0.79 (0.01)
128	128	3×10^–5	3	0.78 (0.01)
256	64	3×10^–5	—	0.78 (0.01)
64	—	5×10^–5	—	0.79 (0.01)
128	—	5×10^–5	—	0.80 (0.01)
128	128	5×10^–5	3	0.79 (0.01)
256	64	5×10^–5	—	0.79 (0.01)

^aBERT: Bidirectional Encoder Representations from Transformers.

^bNot applicable.
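
The "mean (SD)" column can be illustrated with a minimal sketch of how a grid of settings might be enumerated and the per-fold scores summarized. This is an assumption-laden illustration, not the authors' code: the names `BATCH_SIZES`, `LEARNING_RATES`, `grid`, and `summarize` are hypothetical, the fold scores in the usage note are made-up placeholders rather than study results, and the use of the sample standard deviation is assumed because the table does not say which estimator was used.

```python
import itertools
import statistics

# Batch sizes and learning rates as listed in Table 3.
BATCH_SIZES = (64, 128, 256)
LEARNING_RATES = (2e-5, 3e-5, 5e-5)

# Every (batch size, learning rate) combination to evaluate.
grid = list(itertools.product(BATCH_SIZES, LEARNING_RATES))


def summarize(fold_scores):
    """Format per-fold scores as 'mean (SD)' with two decimals.

    Assumption: sample standard deviation (n - 1 denominator).
    """
    mean = statistics.mean(fold_scores)
    sd = statistics.stdev(fold_scores)
    return f"{mean:.2f} ({sd:.2f})"
```

For example, `summarize([0.79, 0.80, 0.81, 0.80, 0.80])` formats five placeholder fold scores in the same style as the table's last column. In a real run, each entry of `grid` would be trained and scored once per fold before being summarized.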