2022 Dec 7;9(12):211800. doi: 10.1098/rsos.211800

© 2022 The Authors.

Published by the Royal Society under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/4.0/, which permits unrestricted use, provided the original author and source are credited.

Figure 4. Infodesic 7 × 7 gridworld with the Moore neighbourhood, the goal in the corner state #6, and β = 100. (a) A heat map showing the live state distribution, with the policy distribution denoted by arrows of length proportional to $\pi_{F}(a \mid s)$ in the direction of the action. (b) The proportion of sampled sequences, composed of contiguous states, for an agent following a single policy, $\pi_{F}$, from S = #0 that pass through various states en route to the final goal S = #6. (c) A lookup grid with states labelled with their indices and infodesic sequence states highlighted in green. The deviation from the triangle inequality is given by the normalized free energy difference, which is −0.0005. We observe informationally efficient states on the diagonal; furthermore, the policy guides the agent towards these states even if this requires the agent to first navigate away from the edges. (d) The proportion of subgoaled sampled sequences, composed of contiguous states, for an agent following a subgoal policy, $\pi_{F}^{(1)}$, from S = #0 that pass through various states en route to the subgoal S = #12.
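
To make the quantities in panels (b) to (d) concrete, the following is a minimal sketch of a state lookup grid, the Moore neighbourhood, and the proportion of sequences passing through each state. It is not the authors' code. The row-major index layout, the function names (`lookup_grid`, `moore_neighbours`, `visit_proportions`) and the two example sequences are assumptions made for illustration, and the sketch does not compute the policy $\pi_{F}$ or any free energy.

```python
import numpy as np

N = 7  # grid side length, taken from the caption


def lookup_grid(n=N):
    """State indices on an n x n grid, laid out row-major (assumed layout)."""
    return np.arange(n * n).reshape(n, n)


def moore_neighbours(state, n=N):
    """Indices of the up-to-eight cells adjacent to `state`, diagonals included."""
    row, col = divmod(state, n)
    neighbours = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if (dr, dc) == (0, 0):
                continue  # skip the cell itself
            r, c = row + dr, col + dc
            if 0 <= r < n and 0 <= c < n:
                neighbours.append(r * n + c)
    return neighbours


def visit_proportions(sequences, n=N):
    """Fraction of sequences that pass through each state at least once."""
    counts = np.zeros(n * n)
    for seq in sequences:
        for s in set(seq):  # count each state once per sequence
            counts[s] += 1
    return (counts / len(sequences)).reshape(n, n)


# Two hypothetical state sequences from #0 to #6, for illustration only.
example_sequences = [[0, 1, 2, 3, 4, 5, 6], [0, 8, 2, 3, 4, 12, 6]]
proportions = visit_proportions(example_sequences)
```

Under the assumed row-major layout, `moore_neighbours(6)` contains state 12, so the subgoal #12 would be the diagonal neighbour of the corner goal #6. With real sampled trajectories in place of `example_sequences`, `proportions` would correspond to the kind of grid shown in panels (b) and (d).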