2021 Aug 16;18:126. doi: 10.1186/s12984-021-00919-y

Fig. 6.

Reinforcement learning algorithms for continuous action spaces. The diagram, adapted from [112], presents a partial taxonomy of RL algorithms for continuous control, that is, for problems with continuous action spaces. It focuses on a few modern deep RL algorithms and some traditional RL algorithms relevant to those used by the top teams in our competition. TRPO: trust region policy optimization [113]; PPO: proximal policy optimization [114]; DDPG: deep deterministic policy gradient [115]; TD3: twin delayed deep deterministic policy gradient [116]; SAC: soft actor-critic [117]