2022 Dec 7;9(12):211800. doi: 10.1098/rsos.211800

Figure 2.

Decision information heat map in a 5 × 5 gridworld with A:{,,,}. Annotated values show $D_{\pi_F}(s)$ in bits. The policy presented is optimal with respect to free energy, i.e. $\pi_F = \arg\min_{\pi} F_{\pi}(s;\beta)$. The arrow lengths are proportional to the conditional probability $\pi(a|s)$ in the indicated direction; for convenience, the prior $\hat{p}(a)$ is shown instead of the policy in the yellow goal state. Reward-maximizing behaviour (β = 100): (a) the goal is in the corner and (b) the goal is on the diagonal between the corner and the middle. Behaviour where information processing is constrained (β = 0.1): (c) the goal is in the corner and (d) the goal is diagonally adjacent to the corner.
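The contrast between the β = 100 and β = 0.1 panels can be illustrated with a minimal sketch for a single state. It is not the paper's computation: it assumes a Boltzmann-style policy π(a|s) ∝ p̂(a) exp(β q(s,a)), a uniform prior over four actions, and made-up action values, and it reports only the one-step divergence of the policy from the prior in bits, whereas the annotated values in the figure are decision information for the whole policy. The names `soft_policy`, `kl_bits` and `q_values` are hypothetical.

```python
import math


def soft_policy(q_values, prior, beta):
    """Return pi(a) proportional to prior(a) * exp(beta * q(a)).

    Assumed functional form for illustration only.
    """
    q_max = max(q_values)  # subtract the maximum for numerical stability
    weights = [p * math.exp(beta * (q - q_max)) for p, q in zip(prior, q_values)]
    total = sum(weights)
    return [w / total for w in weights]


def kl_bits(policy, prior):
    """Kullback-Leibler divergence KL(policy || prior) in bits."""
    return sum(pi * math.log2(pi / p) for pi, p in zip(policy, prior) if pi > 0)


# Hypothetical single state: four actions, uniform prior, one action is better.
prior = [0.25, 0.25, 0.25, 0.25]
q_values = [1.0, 0.0, 0.0, 0.0]  # made-up action values, not from the paper

for beta in (100.0, 0.1):
    pi = soft_policy(q_values, prior, beta)
    print(f"beta={beta}: policy={[round(x, 3) for x in pi]}, "
          f"KL={kl_bits(pi, prior):.4f} bits")
```

Under these assumptions, the large β gives an almost deterministic policy whose divergence from the uniform prior approaches log2(4) = 2 bits, while the small β leaves the policy close to the prior and the divergence close to zero.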