2019 May 9;13:153. doi: 10.3389/fnhum.2019.00153

Figure 3.

A schematic of value updating in the parallel model (left panel) and the parsimonious learning-rate adjustment model (right panel). The panels contrast how the two models update values after the agent selects action 1 in state A and then experiences state C and an outcome. In the parallel model, model-free values are updated for the experienced state-action pairs, and model-based values are updated for every state-action pair; the two sets of values are then mixed according to a model-based weighting parameter w. In contrast, the parsimonious learning-rate adjustment model uses the transition-probability model to update only the first-stage actions related to the state that produced the outcome. In the example shown, the outcome in state C updates the value of action 1 in state A. The value of action 1 in state B is updated at the same time, but that update is downweighted by the model-based weighting parameter w.
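The two updating schemes in this caption can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the dictionary layout, the learning rate `alpha`, and all numeric values are hypothetical. The linear mixing rule `w * Q_MB + (1 - w) * Q_MF` and the reading of "downweighted by w" as scaling the learning rate by `w` are assumptions made for illustration only.

```python
# Illustrative sketch only: update rules and parameter values are assumptions,
# not taken from the paper.

def mix_values(q_mf, q_mb, w):
    """Parallel model (assumed form): weighted mix of model-free and
    model-based values, with w as the model-based weight."""
    return {key: w * q_mb[key] + (1.0 - w) * q_mf[key] for key in q_mf}


def parsimonious_update(q, experienced, related, reward, alpha, w):
    """Parsimonious model (assumed form): update the experienced pair at
    rate alpha and the related pair at the reduced rate w * alpha."""
    updated = dict(q)  # copy so the caller's values are left unchanged
    updated[experienced] += alpha * (reward - updated[experienced])
    updated[related] += w * alpha * (reward - updated[related])
    return updated


# Hypothetical example: action 1 chosen in state A, outcome of 1.0 observed.
q = {("A", 1): 0.0, ("B", 1): 0.0}
q_new = parsimonious_update(q, ("A", 1), ("B", 1), reward=1.0, alpha=0.5, w=0.4)

# Hypothetical example of mixing in the parallel model.
q_net = mix_values({("A", 1): 0.2}, {("A", 1): 0.8}, w=0.5)
```

Under these assumptions, the value for action 1 in state A moves toward the outcome at the full learning rate, while the value for action 1 in state B moves at a rate reduced by `w`. The parallel model instead keeps two separate value tables and combines them only when the net value is computed.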