| Algorithm 1: Q-learning algorithm. |
|
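1. Initialize $Q(s, a)$ arbitrarily, with $Q(\text{terminal}, \cdot) = 0$
2. Repeat (until convergence):
   1. Initialize $s$
   2. Repeat (until $s$ is terminal):
      1. Choose $a$ from $s$ using a policy derived from $Q$ (e.g., $\varepsilon$-greedy)
      2. Take action $a$, observe $r$, $s'$
      3. $Q(s, a) \leftarrow Q(s, a) + \alpha \left[ r + \gamma \max_{a'} Q(s', a') - Q(s, a) \right]$
      4. $s \leftarrow s'$ |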
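
The Q-learning loop of Algorithm 1 can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the five-state corridor environment, the `step` function, the ε-greedy action choice, and the values of `alpha`, `gamma`, `epsilon`, and `episodes` are all hypothetical, and a fixed episode budget stands in for "until convergence".

```python
import random

N_STATES = 5      # hypothetical toy corridor: states 0..4, state 4 is terminal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(s, a):
    """Assumed toy dynamics: reward 1 on reaching the terminal state, else 0."""
    s_next = max(0, s - 1) if a == 0 else s + 1
    reward = 1.0 if s_next == N_STATES - 1 else 0.0
    return reward, s_next


def q_learning(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    # Initialize Q; the terminal row is never updated, so Q(terminal, .) stays 0.
    Q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
    for _ in range(episodes):        # fixed budget in place of a convergence test
        s = 0                        # initialize s
        while s != N_STATES - 1:     # repeat until s is terminal
            # Choose a: epsilon-greedy with respect to Q, random tie-breaking.
            if rng.random() < epsilon:
                a = rng.choice(ACTIONS)
            else:
                best = max(Q[s])
                a = rng.choice([x for x in ACTIONS if Q[s][x] == best])
            # Take action a, observe r and s'.
            r, s_next = step(s, a)
            # Q-learning update: bootstrap from the greedy value of s'.
            Q[s][a] += alpha * (r + gamma * max(Q[s_next]) - Q[s][a])
            s = s_next
    return Q


if __name__ == "__main__":
    for state, values in enumerate(q_learning()):
        print(state, [round(v, 3) for v in values])
```

In this toy setting the learned values favor moving right in every non-terminal state. The update target uses `max(Q[s_next])` regardless of which action the ε-greedy policy takes next, which is the off-policy property that distinguishes Q-learning from SARSA.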