2021 Sep 24;2021:7588221. doi: 10.1155/2021/7588221

Figure 8.


Experimental results for the CartPole task under different sparse-reward settings, where T denotes the interval at which the agent receives rewards. Plots show training performance against the number of episodes. (a) T = 25. (b) T = 50. (c) T = 100.
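
To make the role of T concrete, the following is a minimal sketch of one common way to build a sparse-reward setting: per-step rewards are withheld and their accumulated sum is delivered every T steps. This is an assumption for illustration only; the caption does not say how the sparse rewards were constructed. The function name `sparsify_rewards` and the `flush_at_end` option are hypothetical.

```python
from typing import List


def sparsify_rewards(
    rewards: List[float], interval: int, flush_at_end: bool = True
) -> List[float]:
    """Delay per-step rewards so they arrive only every `interval` steps.

    Illustrative assumption, not the paper's confirmed procedure: rewards
    are accumulated and released as one sum at steps T, 2T, 3T, ...
    If `flush_at_end` is True, any remainder is released on the last step.
    """
    if interval < 1:
        raise ValueError("interval must be >= 1")

    sparse: List[float] = []
    pending = 0.0  # reward accumulated since the last release
    for step, reward in enumerate(rewards, start=1):
        pending += reward
        is_last_step = step == len(rewards)
        if step % interval == 0 or (flush_at_end and is_last_step):
            sparse.append(pending)
            pending = 0.0
        else:
            sparse.append(0.0)  # the agent sees no reward at this step
    return sparse


# A 60-step episode with the usual CartPole reward of +1 per step, T = 25.
episode = sparsify_rewards([1.0] * 60, interval=25)
```

With larger T the agent receives feedback less often per episode, which is the sense in which panels (a) to (c) move toward sparser rewards.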