Decision information heat map in a 5 × 5 gridworld with . Annotated values show the decision information in bits. The policy presented is optimal with respect to free energy, i.e. . Arrow lengths are proportional to the conditional probability π(a|s) in the indicated direction; for convenience, the prior is shown instead of the policy in the yellow goal state. Reward-maximizing behaviour (β = 100): (a) the goal is in the corner and (b) the goal is on the diagonal between the corner and the middle. Behaviour where information processing is constrained (β = 0.1): (c) the goal is in the corner and (d) it is diagonally adjacent to the corner.
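To illustrate how β separates the two regimes in the caption, the following is a minimal sketch for a single state. It assumes the commonly used form π(a|s) ∝ prior(a) · exp(β · Q(s, a)) for a free-energy-optimal policy, which the caption itself does not spell out. The action values, the uniform prior, and the helper names `soft_policy` and `kl_bits` are hypothetical, and the per-state KL divergence is only a local information cost, not necessarily the quantity plotted in the heat map.

```python
import numpy as np

def soft_policy(q_values, prior, beta):
    """Policy proportional to prior * exp(beta * Q); an assumed form, see lead-in."""
    logits = np.log(prior) + beta * q_values
    logits -= logits.max()  # shift for numerical stability before exponentiating
    weights = np.exp(logits)
    return weights / weights.sum()

def kl_bits(policy, prior):
    """KL divergence D(policy || prior) in bits, skipping zero-probability actions."""
    mask = policy > 0
    return float(np.sum(policy[mask] * np.log2(policy[mask] / prior[mask])))

# Hypothetical action values for one state with four moves (not from the figure).
q_values = np.array([-1.0, -2.0, -2.0, -3.0])
prior = np.full(4, 0.25)  # assumed uniform prior over actions

for beta in (100.0, 0.1):
    pi = soft_policy(q_values, prior, beta)
    print(f"beta={beta}: policy={pi.round(3)}, KL={kl_bits(pi, prior):.4f} bits")
```

With β = 100 the sketch concentrates almost all probability on the best action, so the local cost approaches log2(4) = 2 bits. With β = 0.1 the policy stays close to the prior and the cost is near zero, which matches the qualitative contrast between panels (a, b) and (c, d).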