2023 Feb 2;23(3):1634. doi: 10.3390/s23031634
Algorithm 1: Predict the optimal route
Input: Start state;
Result: Optimal route;
initialization;
Initialize Q(s,a);
Initialize state 's';
Choose an action 'a' using epsilon-greedy approach;
for each time step do
 Take action a(t);
 Observe the reward r(t+1) and the state s(t+1);
 Choose a(t+1) from s(t+1) using epsilon-greedy approach;
 Update Q(s(t),a(t));
 s(t) ← s(t+1);
 a(t) ← a(t+1);
end
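
The loop in Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the algorithm box does not spell out the Q update, so the sketch assumes the standard on-policy (SARSA-style) rule suggested by the a(t) ← a(t+1) step. The graph, link costs, reward definition (negative link cost), hyperparameters, and the names `epsilon_greedy` and `sarsa_route` are all hypothetical.

```python
import random


def epsilon_greedy(Q, state, actions, epsilon, rng):
    """With probability epsilon pick a random action, otherwise the greedy one."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])


def sarsa_route(graph, start, goal, episodes=2000, alpha=0.1, gamma=0.9,
                epsilon=0.1, max_steps=100, seed=0):
    """Learn Q(s, a) on a weighted graph, then read off a greedy route.

    Assumptions (not from the source): an action is "move to neighbour a",
    the reward is the negative link cost, and every non-goal node has at
    least one outgoing link.
    """
    rng = random.Random(seed)
    # Initialize Q(s, a) to zero for every link (state, next hop).
    Q = {(s, a): 0.0 for s, nbrs in graph.items() for a in nbrs}

    for _ in range(episodes):
        s = start                                              # initialize state
        a = epsilon_greedy(Q, s, list(graph[s]), epsilon, rng)  # choose action
        for _ in range(max_steps):
            s_next = a                    # take a: move to the chosen neighbour
            r = -graph[s][a]              # observe reward r(t+1)
            if s_next == goal:
                # Terminal step: no bootstrapped future value.
                Q[(s, a)] += alpha * (r - Q[(s, a)])
                break
            a_next = epsilon_greedy(Q, s_next, list(graph[s_next]), epsilon, rng)
            # Assumed SARSA update of Q(s(t), a(t)).
            Q[(s, a)] += alpha * (r + gamma * Q[(s_next, a_next)] - Q[(s, a)])
            s, a = s_next, a_next         # s(t) <- s(t+1); a(t) <- a(t+1)

    # Follow the greedy policy from the start state to extract a route.
    route, s = [start], start
    while s != goal and len(route) <= len(graph):
        s = max(graph[s], key=lambda a: Q[(s, a)])
        route.append(s)
    return route, Q


# Hypothetical four-node topology; the numbers are link costs.
graph = {
    "A": {"B": 1, "C": 4},
    "B": {"C": 1, "D": 5},
    "C": {"D": 1},
    "D": {},
}
route, Q = sarsa_route(graph, "A", "D")
print(route)
```

Choosing a(t+1) before the update is what makes this sketch on-policy. An off-policy (Q-learning) variant would instead bootstrap from the maximum of Q(s(t+1), ·) and would not need to carry the action over to the next step.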