
FIGURE 3.
Average progress for all algorithms plotted versus environments of increasing complexity. Each agent is individually trained and tested in its respective environment, e.g., agents trained in Env0 are tested only in Env0. All algorithms except PPO show significantly reduced performance once static obstacles are introduced in Env2. Although neither off-policy algorithm exceeds 50% average progress, their results vary considerably across the environments.