2025 Jan 2;25(1):211. doi: 10.3390/s25010211

Table 6.

Training hyperparameters for BERT QA RL + RS models.

| Hyperparameter | Description | Value |
|---|---|---|
| train_epochs | Number of complete passes through the entire training dataset. A higher number of epochs may improve model performance, though an excessive number could lead to overfitting. | 3 |
| train_batch_size | Number of samples the model processes simultaneously during training. A larger batch size can accelerate training but requires more memory. | 16 |
| eval_batch_size | Number of samples the model processes at once during evaluation, analogous to the training batch size. | 16 |
| learning_rate | Rate at which the model adjusts its weights based on the loss gradient. A high learning rate speeds up training but may hinder convergence, while a lower rate results in more stable, albeit slower, learning. | 2 × 10⁻⁵ |
| weight_decay | Regularization parameter that helps prevent overfitting by penalizing large weights, so that the model generalizes better to unseen data. | 0.01 |
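
For readers who want to reuse these settings, the table can be written as a small configuration object. The following is a minimal sketch in plain Python, not the authors' training code: the `TrainingConfig` class and the `total_optimizer_steps` helper are hypothetical names introduced here, the learning rate is read as 2 × 10⁻⁵, and the dataset size of 1000 samples is a made-up figure used only to show how epochs and batch size together determine the number of weight updates.

```python
import math
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Hyperparameter values copied from Table 6 (the class itself is illustrative)."""

    train_epochs: int = 3
    train_batch_size: int = 16
    eval_batch_size: int = 16
    learning_rate: float = 2e-5  # assumed reading of the table entry: 2 x 10^-5
    weight_decay: float = 0.01


def total_optimizer_steps(config: TrainingConfig, num_train_samples: int) -> int:
    """Count weight updates, assuming one update per batch (no gradient accumulation)."""
    steps_per_epoch = math.ceil(num_train_samples / config.train_batch_size)
    return steps_per_epoch * config.train_epochs


config = TrainingConfig()

# num_train_samples=1000 is a hypothetical dataset size, not a figure from the paper.
steps = total_optimizer_steps(config, num_train_samples=1000)

print(asdict(config))
print(steps)
```

Under these assumptions, 1000 samples at a batch size of 16 give 63 batches per epoch (the last one partial), so three epochs amount to 189 updates. A frozen dataclass is used so that the values cannot be changed by accident once a run has started.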