Table 1.
General steps of the gating algorithms (see text for details).
| Step | Description |
|---|---|
| 1 | Choose motor action a and gating action g for current state s according to softmax over relevant action values in QM and QG (Equations 1, 2) |
| 2 | Observe reward r and next state s′ |
| 3 | Compute TD errors (Equations 3–5) |
| 4 | Update specific eligibility traces associated with current state s and actions a, g (Equations 6–8) |
| 5 | Update all state/state-action values (Equations 9–13) |
| 6 | Update all eligibility traces (Equations 14–16) |
| 7 | Repeat steps 1–6 until termination |
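
The loop in Table 1 can be sketched in code as follows. This is a minimal tabular illustration under stated assumptions, not a reproduction of Equations 1–16: the form of the three TD errors (a shared target built from a state value V), the replacing-trace update, the parameter names (`alpha`, `gamma`, `lam`, `temperature`), and the toy environment `toy_step` are all hypothetical choices made for the example. The arrays `Q_M` and `Q_G` stand in for QM and QG.

```python
import numpy as np

def softmax(values, temperature=1.0):
    """Softmax probabilities over a 1-D array of action values."""
    z = (values - values.max()) / temperature  # shift for numerical stability
    p = np.exp(z)
    return p / p.sum()

def run_gating_sketch(env_step, n_states, n_motor, n_gate, n_episodes=50,
                      alpha=0.1, gamma=0.9, lam=0.8, temperature=1.0,
                      max_steps=100, rng=None):
    """Tabular sketch of the seven steps in Table 1 (assumed update forms)."""
    rng = np.random.default_rng() if rng is None else rng
    V = np.zeros(n_states)               # state values
    Q_M = np.zeros((n_states, n_motor))  # motor action values
    Q_G = np.zeros((n_states, n_gate))   # gating action values
    for _ in range(n_episodes):
        # Assumption: eligibility traces are reset at the start of each episode.
        e_V, e_M, e_G = np.zeros_like(V), np.zeros_like(Q_M), np.zeros_like(Q_G)
        s = 0
        for _ in range(max_steps):
            # Step 1: softmax choice of motor action a and gating action g.
            a = rng.choice(n_motor, p=softmax(Q_M[s], temperature))
            g = rng.choice(n_gate, p=softmax(Q_G[s], temperature))
            # Step 2: observe reward r and next state s'.
            r, s_next, done = env_step(s, a, g)
            # Step 3: TD errors (assumed form: one shared bootstrapped target).
            target = r + (0.0 if done else gamma * V[s_next])
            d_V = target - V[s]
            d_M = target - Q_M[s, a]
            d_G = target - Q_G[s, g]
            # Step 4: set the traces for the current state and chosen actions.
            e_V[s] = 1.0
            e_M[s, a] = 1.0
            e_G[s, g] = 1.0
            # Step 5: update all values in proportion to their traces.
            V += alpha * d_V * e_V
            Q_M += alpha * d_M * e_M
            Q_G += alpha * d_G * e_G
            # Step 6: decay all traces.
            e_V *= gamma * lam
            e_M *= gamma * lam
            e_G *= gamma * lam
            # Step 7: repeat until termination.
            if done:
                break
            s = s_next
    return V, Q_M, Q_G

def toy_step(s, a, g):
    """Hypothetical 3-state chain: motor action 1 advances, state 2 is terminal.
    The gating action g is accepted but ignored in this toy environment."""
    s_next = s + 1 if a == 1 else s
    done = s_next == 2
    return (1.0 if done else 0.0), s_next, done

V, Q_M, Q_G = run_gating_sketch(toy_step, n_states=3, n_motor=2, n_gate=2,
                                rng=np.random.default_rng(0))
```

In a task that actually requires gating, the state passed to `env_step` would include the current working-memory contents, and the gating action would determine how those contents change. The toy chain omits this to keep the control flow of the seven steps visible.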