Table 1. Adaptation algorithm of the Actor structure.
1 | The user generates motor state s_t, and its reward expectation is . |
2 | The Actor associates s_t with action a_i and executes the selected action. |
3 | Execution of action a_i increases or decreases the reward expectation, which is reflected by . |
4 | The error is defined as |
5 | If |
  | This error is used to update the parameters of the selected action. |
  | The hidden layer weights are not changed. |
  | If |
  | The parameters of the selected action are updated. |
  | The error backpropagates to the hidden layer and updates the hidden layer weights. |
6 | Return to step one. |
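
The loop in Table 1 can be sketched in code. The table's error definition and its two branch conditions did not survive in the text above, so this sketch fills them with loudly labeled placeholders: the error is assumed to be the change in reward expectation (after minus before), and the hidden layer is assumed to be updated only when that error is negative. The class `Actor`, the function `adaptation_step`, the network sizes, the tanh hidden units, the greedy action choice, and the learning rate are all hypothetical and are not taken from the source.

```python
import math
import random


class Actor:
    """Tiny one-hidden-layer actor. Sizes, rates, and activations are illustrative."""

    def __init__(self, n_in, n_hidden, n_actions, lr=0.1, seed=0):
        rng = random.Random(seed)
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                         for _ in range(n_hidden)]
        self.w_out = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
                      for _ in range(n_actions)]
        self.lr = lr

    def select_action(self, state):
        # Step 2: map the motor state to an action (greedy choice, an assumption).
        hidden = [math.tanh(sum(w * x for w, x in zip(row, state)))
                  for row in self.w_hidden]
        scores = [sum(w * h for w, h in zip(row, hidden)) for row in self.w_out]
        action = max(range(len(scores)), key=scores.__getitem__)
        return action, hidden

    def adapt(self, state, action, hidden, error, backprop_to_hidden):
        # Keep the pre-update output weights for the backward pass.
        old_out = list(self.w_out[action])
        # Step 5, both branches: update the parameters of the selected action only.
        for j, h in enumerate(hidden):
            self.w_out[action][j] += self.lr * error * h
        # Step 5, second branch: propagate the error into the hidden layer.
        if backprop_to_hidden:
            for j, h in enumerate(hidden):
                delta = error * old_out[j] * (1.0 - h * h)  # tanh derivative
                for k, x in enumerate(state):
                    self.w_hidden[j][k] += self.lr * delta * x


def adaptation_step(actor, state, expectation_before, expectation_after_fn):
    """One pass through steps 1 to 5 of Table 1, under the stated assumptions."""
    action, hidden = actor.select_action(state)           # step 2
    expectation_after = expectation_after_fn(action)      # step 3
    error = expectation_after - expectation_before        # step 4 (ASSUMED form)
    actor.adapt(state, action, hidden, error,
                backprop_to_hidden=(error < 0))           # step 5 (ASSUMED condition)
    return action, error


if __name__ == "__main__":
    actor = Actor(n_in=2, n_hidden=3, n_actions=2)
    # Step 6: repeat; the expectation values here are made-up inputs.
    for before, after in [(0.2, 0.7), (0.7, 0.1)]:
        action, error = adaptation_step(actor, [1.0, 0.5], before, lambda a: after)
        print(action, round(error, 3))
```

Swapping in the table's actual error definition and branch conditions only requires changing the two lines marked ASSUMED in `adaptation_step`; the split between output-only and output-plus-hidden updates stays the same.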