Table 2. Optimized hyperparameters for the MAB-Ensemble technique.

| Hyperparameter | Value |
|---|---|
| Learning Rate (α) | 0.01 |
| Discount Factor (γ) | 0.9 |
| Reward System | Binary |
| Maximum Episodes | 1000 |
| Maximum Steps per Episode | 500 |
| Exploration Strategy | Thompson sampling |
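
To make the settings in the table concrete, the following is a minimal sketch of how they might be collected in a configuration object and used with Beta-Bernoulli Thompson sampling under a binary reward. It is an illustration under stated assumptions, not the authors' implementation: the names `MABConfig`, `BernoulliThompson`, and `run`, the uniform Beta(1, 1) prior, and the simulated `true_rates` are all hypothetical. The table does not say how α and γ enter the update, so the sketch only stores them and does not use them.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class MABConfig:
    """Hyperparameter values copied from the table (class name is hypothetical)."""
    learning_rate: float = 0.01       # alpha; stored only, usage not specified in the table
    discount_factor: float = 0.9      # gamma; stored only, usage not specified in the table
    max_episodes: int = 1000
    max_steps_per_episode: int = 500


class BernoulliThompson:
    """Beta-Bernoulli Thompson sampling over a fixed set of arms."""

    def __init__(self, n_arms, rng):
        # Assumption: uniform Beta(1, 1) prior on each arm's success rate.
        self.successes = [1.0] * n_arms
        self.failures = [1.0] * n_arms
        self.rng = rng

    def select_arm(self):
        # Draw one sample per arm from its posterior and pick the largest.
        samples = [
            self.rng.betavariate(a, b)
            for a, b in zip(self.successes, self.failures)
        ]
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, arm, reward):
        # Binary reward: 1 counts as a success, 0 as a failure.
        if reward == 1:
            self.successes[arm] += 1.0
        else:
            self.failures[arm] += 1.0


def run(config, true_rates, seed=0):
    """Simulate the bandit; true_rates are made-up success probabilities per arm."""
    rng = random.Random(seed)
    bandit = BernoulliThompson(len(true_rates), rng)
    pulls = [0] * len(true_rates)
    for _ in range(config.max_episodes):
        for _ in range(config.max_steps_per_episode):
            arm = bandit.select_arm()
            reward = 1 if rng.random() < true_rates[arm] else 0
            bandit.update(arm, reward)
            pulls[arm] += 1
    return pulls
```

Calling `run(MABConfig(), [0.2, 0.5, 0.8])` would simulate the full 1000 episodes of 500 steps each; in an ensemble setting each arm would presumably correspond to a base model, with the reward derived from whether its prediction was correct.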