2022 Oct 7;13(10):1688. doi: 10.3390/mi13101688

Figure 5.

Training reward curves produced by the different approaches. The vertical axis shows the reward gained per episode, averaged over all environments. Blue: proposed approach. Orange: hybrid method without curriculum stage one. Green: end-to-end approach.