2022 Dec 7;9(12):211800. doi: 10.1098/rsos.211800

© 2022 The Authors.

Published by the Royal Society under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/4.0/, which permits unrestricted use, provided the original author and source are credited.

Figure 1. Value, decision information and free energy plots in a 5 × 5 gridworld with cardinal (Manhattan) actions $A : \{\uparrow, \leftarrow, \downarrow, \rightarrow\}$. The goal g = #12 is in the centre and is coloured yellow in the grid plots. The arrow lengths are proportional to the conditional probability $\pi(a|s)$ in the indicated direction. The relevant prior, i.e. the joint state and action distribution marginalized over all transient states, $\hat{p}(a; \pi)$, is shown in the yellow goal state. (a) The policy displayed is the optimal value policy $\pi_{V} = \arg\max_{\pi} V_{g}^{\pi}(s)$ for all $s \in S$. The heatmap and annotations show the negative optimal value function $-V_{g}^{\pi_{V}}(s)$ for each state. (b) The policy displayed is optimal with respect to free energy, i.e. $\pi_{F} = \arg\min_{\pi} F_{g}^{\pi}(s; \beta)$ for all $s \in S$. The heatmap and annotations show decision information $\Im_{D}^{\pi_{F}}(s)$ with β = 100. (c) The policy displayed is again $\pi_{F}$ with β = 100. The heatmap and annotations show free energy $F_{g}^{\pi_{F}}(s; \beta)$. (d) Graph showing the numbering of states in the gridworld; the goal is coloured green and the other colours indicate levels radiating from the centre.
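
As a rough illustration of panel (a), the following Python sketch computes an optimal value function and a greedy policy for a 5 × 5 gridworld with the goal at state 12. Several details are assumptions that the caption does not state: states are numbered 0 to 24 in row-major order, every step costs 1 (reward −1) until the goal is reached, transitions are deterministic, a move into a wall leaves the agent in place, and ties between equally good actions are split uniformly. The helper names `step`, `optimal_values` and `greedy_policy` are hypothetical and not taken from the paper.

```python
N = 5          # grid side length
GOAL = 12      # centre state under 0-based row-major numbering (assumed)

# Cardinal (Manhattan) actions as (row, column) offsets.
ACTIONS = {"up": (-1, 0), "left": (0, -1), "down": (1, 0), "right": (0, 1)}


def step(s, a):
    """Deterministic transition; moving into a wall leaves the state unchanged."""
    r, c = divmod(s, N)
    dr, dc = ACTIONS[a]
    nr, nc = r + dr, c + dc
    if 0 <= nr < N and 0 <= nc < N:
        return nr * N + nc
    return s


def optimal_values(goal=GOAL):
    """Synchronous value iteration with reward -1 per step (assumed cost model)."""
    V = [0.0] * (N * N)
    while True:
        new_V = [
            0.0 if s == goal else max(-1.0 + V[step(s, a)] for a in ACTIONS)
            for s in range(N * N)
        ]
        if new_V == V:  # fixed point reached
            return V
        V = new_V


def greedy_policy(V, goal=GOAL):
    """pi(a|s): uniform over the actions that maximise the one-step lookahead."""
    policy = {}
    for s in range(N * N):
        if s == goal:
            continue
        q = {a: -1.0 + V[step(s, a)] for a in ACTIONS}
        best = max(q.values())
        winners = [a for a in ACTIONS if q[a] == best]
        policy[s] = {a: (1.0 / len(winners) if a in winners else 0.0) for a in ACTIONS}
    return policy


if __name__ == "__main__":
    V = optimal_values()
    # Print the negative value function as a grid, as in the heatmap of panel (a).
    for r in range(N):
        print(" ".join(f"{-V[r * N + c]:.0f}" for c in range(N)))
```

Under these assumptions the negative value of a state equals its Manhattan distance to the goal, and the probabilities returned by `greedy_policy` play the role of the arrow lengths in the grid plots. The free energy policy of panels (b) and (c) is not sketched here, because the caption alone does not define $F_{g}^{\pi}(s; \beta)$.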