2025 Jan 25;15:3198. doi: 10.1038/s41598-025-86743-z

Table 2.

Optimized hyperparameters for the MAB-Ensemble technique.

Hyperparameter               Value
Learning Rate (α)            0.01
Discount Factor (γ)          0.9
Reward System                Binary
Maximum Episodes             1000
Maximum Steps per Episode    500
Exploration Strategy         Thompson sampling
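
To illustrate how the settings in Table 2 might fit together, the following is a minimal sketch, not the authors' implementation. It assumes that the binary reward is modelled with a Beta-Bernoulli posterior per arm, which is the standard form of Thompson sampling for 0/1 rewards. The names MABConfig, BetaBernoulliThompson, and run_demo, the uniform Beta(1, 1) prior, and the simulated success rates are all hypothetical. The learning rate (α) and discount factor (γ) are stored in the configuration only; how the MAB-Ensemble technique applies them is not specified in the table, so the sketch does not use them.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class MABConfig:
    """Values copied from Table 2; the field names are hypothetical."""
    learning_rate: float = 0.01        # α (not used by this sketch)
    discount_factor: float = 0.9       # γ (not used by this sketch)
    max_episodes: int = 1000
    max_steps_per_episode: int = 500


class BetaBernoulliThompson:
    """Thompson sampling for binary rewards, assuming a Beta(1, 1) prior."""

    def __init__(self, n_arms, seed=None):
        self.successes = [1] * n_arms  # Beta alpha parameter per arm
        self.failures = [1] * n_arms   # Beta beta parameter per arm
        self.rng = random.Random(seed)

    def select_arm(self):
        # Draw one sample from each arm's posterior and pick the largest.
        samples = [
            self.rng.betavariate(a, b)
            for a, b in zip(self.successes, self.failures)
        ]
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, arm, reward):
        # The reward system in Table 2 is binary, so only 0 or 1 is accepted.
        if reward not in (0, 1):
            raise ValueError("reward must be 0 or 1")
        if reward == 1:
            self.successes[arm] += 1
        else:
            self.failures[arm] += 1


def run_demo(true_rates, config, seed=0):
    """Run one episode against simulated arms and return pulls per arm."""
    env_rng = random.Random(seed)
    sampler = BetaBernoulliThompson(len(true_rates), seed=seed + 1)
    pulls = [0] * len(true_rates)
    for _ in range(config.max_steps_per_episode):
        arm = sampler.select_arm()
        reward = 1 if env_rng.random() < true_rates[arm] else 0
        sampler.update(arm, reward)
        pulls[arm] += 1
    return pulls


if __name__ == "__main__":
    # Two simulated arms with made-up success rates, for illustration only.
    print(run_demo([0.1, 0.9], MABConfig()))
```

In this sketch each arm could stand for one base model of an ensemble, with a reward of 1 for a correct prediction and 0 otherwise; that mapping is an assumption made for illustration.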