. 2025 Jan 25;15:3198. doi: 10.1038/s41598-025-86743-z

Table 2. Optimized hyperparameters for the MAB-Ensemble technique.

Hyperparameter	Value
Learning Rate (α)	0.01
Discount Factor (γ)	0.9
Reward System	Binary
Maximum Episodes	1000
Maximum Steps per Episode	500
Exploration Strategy	Thompson sampling
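
To make the settings in Table 2 concrete, the following is a minimal sketch of Thompson sampling with a binary reward, run for a configurable number of episodes and steps. It is not the authors' implementation. The names `MABConfig`, `BernoulliThompsonSampler`, and `run_bandit`, the Beta(1, 1) prior, and the simulated arm success probabilities are all assumptions introduced for illustration. The table does not state how the learning rate (α) and discount factor (γ) enter the update, so the sketch stores them in the configuration but does not use them.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class MABConfig:
    """Hypothetical container for the values listed in Table 2."""
    learning_rate: float = 0.01        # alpha; stored only, not used below
    discount_factor: float = 0.9       # gamma; stored only, not used below
    max_episodes: int = 1000
    max_steps_per_episode: int = 500


class BernoulliThompsonSampler:
    """Beta-Bernoulli Thompson sampling for binary (0/1) rewards."""

    def __init__(self, n_arms, seed=None):
        # Assumed uniform Beta(1, 1) prior on each arm's success probability.
        self.successes = [1.0] * n_arms
        self.failures = [1.0] * n_arms
        self.rng = random.Random(seed)

    def select_arm(self):
        # Draw one sample per arm from its posterior and pick the largest.
        samples = [
            self.rng.betavariate(a, b)
            for a, b in zip(self.successes, self.failures)
        ]
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, arm, reward):
        # Binary reward system: 1 counts as a success, 0 as a failure.
        if reward not in (0, 1):
            raise ValueError("reward must be 0 or 1")
        if reward == 1:
            self.successes[arm] += 1.0
        else:
            self.failures[arm] += 1.0


def run_bandit(config, true_success_probs, seed=0):
    """Simulate the bandit loop and return the pull count of each arm.

    `true_success_probs` is made-up simulation input, not data from the paper.
    """
    env_rng = random.Random(seed)
    sampler = BernoulliThompsonSampler(len(true_success_probs), seed=seed + 1)
    pulls = [0] * len(true_success_probs)
    for _ in range(config.max_episodes):
        for _ in range(config.max_steps_per_episode):
            arm = sampler.select_arm()
            reward = 1 if env_rng.random() < true_success_probs[arm] else 0
            sampler.update(arm, reward)
            pulls[arm] += 1
    return pulls


if __name__ == "__main__":
    # A shortened run for a quick check; MABConfig() uses the Table 2 limits.
    small = MABConfig(max_episodes=20, max_steps_per_episode=50)
    print(run_bandit(small, [0.1, 0.5, 0.9]))
```

In this sketch, each arm could stand for one candidate model in the ensemble, with a reward of 1 when its prediction is correct. That mapping is also an assumption, since the table alone does not define the arms.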