Figure 12.
External reward obtained by different learning agents during training over 30 episodes in the surface-classification task.