PLoS One. 2011 Mar 14;6(3):e14760. doi: 10.1371/journal.pone.0014760

Table 1. Adaptation algorithm of the Actor structure.

1 The user generates motor state s_t, and its reward expectation is Inline graphic.
2 The Actor associates s_t with action a_i and executes the selected action.
3 Execution of action a_i increases or decreases the reward expectation, which is reflected by Inline graphic.
4 The error is defined as Inline graphic
5 If Inline graphic
This error is used to update the parameters of the selected action.
The hidden layer weights are not changed.
If Inline graphic
The parameters of the selected action are updated.
The error backpropagates to the hidden layer and updates the hidden layer weights.
6 Return to step one.
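
The two branches of step 5 can be illustrated with a minimal Python sketch. It is not the authors' implementation. The inline formulas for the reward expectation, the error, and the two conditions are not recoverable from the text, so the sketch takes the error as an input and uses a hypothetical flag `update_hidden` in place of the conditions. The class name `ActorSketch`, the tanh hidden layer, the argmax action selection, and the plain delta-rule update are likewise assumptions made only for illustration.

```python
import numpy as np


class ActorSketch:
    """Hypothetical two-layer Actor: state -> hidden layer -> one output per action."""

    def __init__(self, n_inputs, n_hidden, n_actions, lr=0.1, seed=0):
        rng = np.random.default_rng(seed)
        # Small random initial weights (an assumption, not from the source).
        self.W_hidden = rng.normal(scale=0.1, size=(n_hidden, n_inputs))
        self.W_out = rng.normal(scale=0.1, size=(n_actions, n_hidden))
        self.lr = lr

    def select_action(self, state):
        """Step 2: associate the motor state with an action (argmax is assumed)."""
        hidden = np.tanh(self.W_hidden @ state)
        values = self.W_out @ hidden
        return int(np.argmax(values)), hidden

    def adapt(self, state, action, hidden, error, update_hidden):
        """Step 5: update the selected action; optionally update the hidden layer.

        `error` is supplied by the caller because its definition (step 4)
        is missing from the text. `update_hidden` stands in for the two
        missing conditions of step 5.
        """
        # Keep the pre-update output weights for the backward pass.
        w_selected = self.W_out[action].copy()

        # Both branches: only the selected action's parameters change.
        self.W_out[action] += self.lr * error * hidden

        if update_hidden:
            # Second branch: propagate the error back through the tanh layer.
            delta_hidden = error * w_selected * (1.0 - hidden ** 2)
            self.W_hidden += self.lr * np.outer(delta_hidden, state)


# One pass through steps 1 to 5 with a made-up state and error value.
actor = ActorSketch(n_inputs=4, n_hidden=3, n_actions=2)
state = np.ones(4)
action, hidden = actor.select_action(state)
actor.adapt(state, action, hidden, error=0.5, update_hidden=False)
```

With `update_hidden=False` the hidden layer weights stay fixed and only the row of `W_out` for the selected action moves, which matches the first branch of step 5. With `update_hidden=True` the same error also changes `W_hidden`, as in the second branch.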