2011 Mar 14;6(3):e14760. doi: 10.1371/journal.pone.0014760

Figure 10. Actor's parameter adaptation during closed-loop control.

(A) Cumulative reward over time. (B) Action values computed at the output layer of the Actor. Each color represents the value of a specific action; red corresponds to the action that moves the robot along a direct path to the target. (C) Outputs of the three hidden-layer processing elements of the Actor. The values adapt more strongly before the “knee” of the cumulative reward curve. After the “knee”, the system parameters stabilize their relative values, indicating that performance has consolidated.