Author manuscript; available in PMC: 2022 Oct 1.
Published in final edited form as: Psychol Rev. 2021 Sep 13;129(3):513–541. doi: 10.1037/rev0000294

Figure 6.

Schematic of the Advantage Actor-Critic with Mood algorithm, which approximates momentum using mood. This agent is similar to those in Figure 5, except that it maintains a mood variable (red) that is recursively updated according to the critic’s estimates of the Advantage of the actor’s chosen actions. The mood variable, in turn, influences policy updates within the actor.
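
The loop described in this caption can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a single-state (bandit) task, a softmax actor, and a critic whose Advantage estimate is the reward minus its value estimate. It also assumes that mood is an exponential moving average of those Advantage estimates and that mood enters the actor update as an additive term. The function names and the parameters `eta_mood` and `mood_weight` are hypothetical and do not come from the paper.

```python
import math
import random


def softmax(prefs):
    """Convert action preferences into choice probabilities."""
    top = max(prefs)
    exps = [math.exp(p - top) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def run_bandit(reward_means, n_steps=2000, alpha_actor=0.1, alpha_critic=0.1,
               eta_mood=0.1, mood_weight=0.5, seed=0):
    """Illustrative actor-critic with a mood variable on a single-state task."""
    rng = random.Random(seed)
    n_actions = len(reward_means)
    prefs = [0.0] * n_actions  # actor: one preference per action
    value = 0.0                # critic: value estimate of the single state
    mood = 0.0                 # mood: running summary of recent Advantage

    for _ in range(n_steps):
        probs = softmax(prefs)
        action = rng.choices(range(n_actions), weights=probs)[0]
        reward = rng.gauss(reward_means[action], 1.0)

        # Critic's Advantage estimate for the chosen action (assumed form).
        advantage = reward - value
        value += alpha_critic * advantage

        # Recursive mood update driven by the Advantage (assumed form).
        mood += eta_mood * (advantage - mood)

        # Mood influences the policy update: assumed additive, momentum-like.
        signal = advantage + mood_weight * mood
        for a in range(n_actions):
            grad = (1.0 if a == action else 0.0) - probs[a]
            prefs[a] += alpha_actor * signal * grad

    return prefs, value, mood


if __name__ == "__main__":
    prefs, value, mood = run_bandit([0.0, 3.0])
    print("preferences:", prefs)
    print("value:", value, "mood:", mood)
```

Setting `mood_weight` to zero in this sketch recovers a plain actor-critic update, which makes it easy to compare learning with and without the mood term.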