2021 Sep 13;8:738113. doi: 10.3389/frobt.2021.738113

FIGURE 8.


Sample trajectories from the different RL agents acting in the MovingObstaclesNoRules environment. The agents shown were trained using the simplified reward function. Black dashed line: path to follow; red dashed line: path taken by the vessel. The environment was generated and sampled identically for all algorithms by setting the random seed to zero and using the result from the first episode.