. 2025 Jan 2;25(1):211. doi: 10.3390/s25010211

Table 6.

Training hyperparameters for BERT QA RL + RS models.

Hyperparameter	Description	Value
train_epochs	Number of complete passes through the entire training dataset. A higher number of epochs may improve model performance, though an excessive number could lead to overfitting.	3
train_batch_size	Determines the number of samples the model processes simultaneously during training. A larger batch size can accelerate training but requires more memory.	16
eval_batch_size	The evaluation counterpart of train_batch_size: the number of samples the model processes at once during evaluation.	16
learning_rate	Defines the rate at which the model adjusts its weights based on the loss gradient. A high learning rate speeds up training but may hinder convergence, while a lower rate results in more stable, albeit slower, learning.	$2 \times 10^{-5}$
weight_decay	A regularization parameter that helps prevent overfitting by penalizing large weights, which helps the model generalize to unseen data.	$0.01$
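
The values in Table 6 can be collected into a single configuration object, as in the minimal sketch below. The mapping onto Hugging Face `TrainingArguments` keyword names is an assumption, since the table does not state which training framework was used. The dataset size of 10,000 examples is hypothetical and serves only to illustrate how batch size and epoch count determine the number of optimizer steps. The helper functions `to_hf_kwargs` and `total_optimizer_steps` are illustrative names, not part of the original work.

```python
import math

# Values copied from Table 6; keys follow the table's own names.
TABLE_6 = {
    "train_epochs": 3,
    "train_batch_size": 16,
    "eval_batch_size": 16,
    "learning_rate": 2e-5,
    "weight_decay": 0.01,
}

# ASSUMPTION: the table does not name a framework. These are the
# corresponding keyword names of transformers.TrainingArguments.
HF_NAMES = {
    "train_epochs": "num_train_epochs",
    "train_batch_size": "per_device_train_batch_size",
    "eval_batch_size": "per_device_eval_batch_size",
    "learning_rate": "learning_rate",
    "weight_decay": "weight_decay",
}


def to_hf_kwargs(table):
    """Rename Table 6 keys to the assumed TrainingArguments keywords."""
    return {HF_NAMES[name]: value for name, value in table.items()}


def total_optimizer_steps(num_examples, batch_size, epochs):
    """Optimizer steps on one device without gradient accumulation.

    The last batch of each epoch may be smaller, hence the ceiling.
    """
    return math.ceil(num_examples / batch_size) * epochs


hf_kwargs = to_hf_kwargs(TABLE_6)

# HYPOTHETICAL dataset size, chosen only for illustration.
steps = total_optimizer_steps(
    num_examples=10_000,
    batch_size=TABLE_6["train_batch_size"],
    epochs=TABLE_6["train_epochs"],
)

print(hf_kwargs)
print(steps)
```

If the `transformers` library is in use, `hf_kwargs` could be passed as `TrainingArguments(output_dir=..., **hf_kwargs)`, where the output directory is chosen by the user.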