2021 Jan 25;376(1820):20190752. doi: 10.1098/rstb.2019.0752

Figure 2.

An extended HRL model integrating valence. Sensory inputs from the environment (exteroceptive) are evaluated against predictions about interoceptive and exteroceptive outcomes in an integrative field, which determines the valence (advantage/harm) of incoming information. Internal state regulation further integrates these inputs by calculating allostatic load relative to homeostatic setpoints and the metabolic cost of current and potential actions. A policy for action is then selected on the basis of the prediction errors resulting from this HRL-like learning scheme, together with valence and metabolic constraints. Policy selection and the resulting action are implemented by genetic and epigenetic regulatory networks. Action in turn modifies the next round of exteroceptive sensory inputs the organism receives. Rounded rectangles represent higher-order functions (sensing, information integration, decision making, implementation, behaviour), while ovals denote processes or products that feed into or arise from the higher-order functions.
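The loop in the caption can be illustrated with a toy sketch. This is not the authors' model. Every name and number below (`SETPOINT`, `drive`, `Agent`, `ENVIRONMENT`, the costs and learning rate) is hypothetical, a single scalar stands in for the whole internal state, and valence is assumed to be the reduction in distance from the setpoint, a convention borrowed from common drive-reduction formulations rather than taken from the figure. The genetic and epigenetic implementation layer is not modelled.

```python
from dataclasses import dataclass, field

SETPOINT = 1.0  # hypothetical homeostatic setpoint for one internal variable
# Hypothetical environment: true change in internal state caused by each action.
ENVIRONMENT = {"feed": 0.2, "rest": -0.05}


def drive(state: float) -> float:
    """Distance from the setpoint (a crude stand-in for allostatic load)."""
    return abs(SETPOINT - state)


@dataclass
class Agent:
    state: float = 0.2          # current internal (interoceptive) state
    learning_rate: float = 0.5  # assumed step size for the prediction update
    # Predicted change in internal state per action (starts uninformed).
    predicted_gain: dict = field(
        default_factory=lambda: {"feed": 0.0, "rest": 0.0})
    # Assumed metabolic cost of each action.
    cost: dict = field(
        default_factory=lambda: {"feed": 0.05, "rest": 0.0})

    def select_policy(self) -> str:
        """Pick the action with the best predicted drive reduction net of cost."""
        def score(action: str) -> float:
            predicted_state = self.state + self.predicted_gain[action]
            reduction = drive(self.state) - drive(predicted_state)
            return reduction - self.cost[action]
        return max(self.predicted_gain, key=score)

    def step(self, action: str, observed_gain: float) -> tuple:
        """Apply an outcome, then return (valence, prediction_error)."""
        new_state = self.state + observed_gain
        # Assumed valence: positive if the outcome moved the state toward
        # the setpoint (advantage), negative if it moved it away (harm).
        valence = drive(self.state) - drive(new_state)
        # Prediction error drives a simple delta-rule update.
        error = observed_gain - self.predicted_gain[action]
        self.predicted_gain[action] += self.learning_rate * error
        self.state = new_state
        return valence, error
```

Calling `agent.step(action, ENVIRONMENT[action])` once for each action and then `agent.select_policy()` closes the loop: the outcome of one action updates the predictions that shape the next choice. Folding metabolic cost into the same score as predicted drive reduction is only one possible design; the caption does not specify how these quantities are combined.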