2022 Oct 28;22(21):8278. doi: 10.3390/s22218278
Algorithm 1: Q-learning algorithm.

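Initialize Q(s, a) arbitrarily, with Q(terminal, ·) = 0

Repeat

      initialize s

      Repeat

            choose a in s ϵ-greedily with respect to Q

            take action a, observe r, s'

            $Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \alpha \left[ r_{t+1} + \gamma \max_{a} Q(s_{t+1}, a) - Q(s_t, a_t) \right]$

            s ← s'

      until s is terminal

until convergence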
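The loop above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation. The environment is a hypothetical five-state chain (action 0 moves left, action 1 moves right, and reaching the last state ends the episode with reward 1). The names `q_learning` and `step` and all hyperparameter values are assumptions chosen for the example. Q is initialized to zero as one possible "arbitrary" choice, "until convergence" is replaced by a fixed number of episodes, and the update uses the standard Q-learning target with a maximum over the actions available in the next state.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.1, gamma=0.9,
               epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D chain environment."""
    rng = random.Random(seed)
    n_actions = 2                      # 0 = left, 1 = right (assumed)
    terminal = n_states - 1            # last state ends the episode

    # Initialize Q; the terminal row is never updated, so it stays 0.
    Q = [[0.0] * n_actions for _ in range(n_states)]

    def step(s, a):
        """Assumed toy dynamics: deterministic move, reward 1 at the goal."""
        s_next = max(0, s - 1) if a == 0 else s + 1
        r = 1.0 if s_next == terminal else 0.0
        return r, s_next

    for _ in range(episodes):          # stands in for "until convergence"
        s = 0                          # initialize s
        while s != terminal:           # repeat until s is terminal
            # Choose a epsilon-greedily, breaking ties at random.
            if rng.random() < epsilon:
                a = rng.randrange(n_actions)
            else:
                best = max(Q[s])
                a = rng.choice([i for i, q in enumerate(Q[s]) if q == best])

            r, s_next = step(s, a)     # take action a, observe r, s'

            # Q(s,a) <- Q(s,a) + alpha * [r + gamma * max_a' Q(s',a') - Q(s,a)]
            Q[s][a] += alpha * (r + gamma * max(Q[s_next]) - Q[s][a])

            s = s_next                 # s <- s'
    return Q


if __name__ == "__main__":
    for state, row in enumerate(q_learning()):
        print(state, [round(q, 3) for q in row])
```

In this toy setting the learned values for "right" should exceed those for "left" in every non-terminal state, which is the greedy policy one would expect for a goal at the right end of the chain.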