Skip to main content
. 2025 Jan 2;25(1):211. doi: 10.3390/s25010211

Table 1.

Description of hyperparameters for A and their respective values.

Hyperparameter Description Value 1 Value 2
Alpha (α) Learning rate that controls how much the agent learns from each new experience. A higher value accelerates learning but may lead to unstable convergence. 0.01 0.5
Gamma (γ) Discount factor that determines the importance of future rewards. A higher value prioritizes long-term rewards. 0.9 0.5
Epsilon (ϵ) Exploration rate that controls the probability of the agent taking a random action instead of following its policy. A higher value encourages exploration. 0.2 0.015
Epsilon Decay (ϵdecay) Decay rate for the exploration rate (ϵ), which controls how ϵ decreases over time, allowing the agent to reduce exploration as it learns. 0.999 0.9