2018 May 29;12:24. doi: 10.3389/fnbot.2018.00024

Figure 9.

Sampling with relevance weighting. (A) Samples from the original distribution. (B) Samples used to optimize the distribution over trajectories with respect to beginning at the start position. (C) Samples used to optimize the distribution over trajectories with respect to passing through the center of the window. (D) Samples used to optimize the distribution over trajectories with respect to reaching the end position. For each objective, the proposed algorithm explores a wide range of values for the policy parameters relevant to that objective, while sampling values close to the mean for the remaining policy parameters. The variance of the irrelevant parameters is then recovered according to Equation (43). As a result, after optimizing for each objective, the distribution over the relevant parameters is updated, while the distribution over the irrelevant parameters is preserved.
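
The sampling behaviour described in the caption can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a diagonal Gaussian over the policy parameters and a relevance value in [0, 1] per parameter that linearly scales the exploration standard deviation. The function name `sample_relevance_weighted`, the relevance vector, and the linear scaling rule are all hypothetical, and the variance recovery of Equation (43) is not reproduced here.

```python
import numpy as np


def sample_relevance_weighted(mean, std, relevance, n_samples, rng):
    """Draw samples whose per-parameter spread is scaled by relevance.

    Assumption (not from the source): a diagonal Gaussian and a linear
    scaling of the standard deviation by a relevance value in [0, 1].
    """
    mean = np.asarray(mean, dtype=float)
    std = np.asarray(std, dtype=float)
    relevance = np.asarray(relevance, dtype=float)

    # Relevant parameters (relevance near 1) keep their full spread;
    # irrelevant parameters (relevance near 0) stay close to the mean.
    scaled_std = relevance * std

    noise = rng.standard_normal((n_samples, mean.size))
    return mean + noise * scaled_std


rng = np.random.default_rng(0)
mean = np.zeros(4)
std = np.ones(4)
# Hypothetical relevance: only the first two parameters matter
# for the current objective.
relevance = np.array([1.0, 1.0, 0.0, 0.0])

samples = sample_relevance_weighted(mean, std, relevance, 500, rng)
```

In this sketch the last two columns of `samples` equal the mean exactly, because their relevance is zero, while the first two columns vary with the full standard deviation. A relevance strictly between 0 and 1 would give intermediate exploration.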