Author manuscript; available in PMC: 2019 May 1.
Published in final edited form as: Cortex. 2017 Aug 24;102:150–160. doi: 10.1016/j.cortex.2017.08.019

Figure 1. Decision chains for sampling and actions.

(A) Instrumental sampling: The agent decides which cue to sample (“Sampling”), discriminates the properties of the selected cue (“Discrimination”), decides which action to take based on the discrimination (“Action”), and realizes an outcome (“Outcome”, or reward, r) with probability P(r). In the specific example, a pedestrian decides whether to sample a traffic light or a cloud, discriminates the color of the sampled stimulus (red/green for the light, blue/white for the cloud), and decides whether to stop or proceed (NoGo/Go) in order to be safe (reward, r). The Shannon entropy of the possible actions is high before sampling either cue and remains high after sampling the cloud (1 bit if the Go/NoGo actions are equally likely), but after sampling the traffic light it falls by an amount that depends on the reliability of the cue (e.g., to 0 if the cue produces perfect certainty about the optimal action).
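The entropy values quoted for panel (A) can be checked with a short calculation. The sketch below is illustrative only and is not taken from the paper: the function name `shannon_entropy` and the 90% reliability case are assumptions introduced here, and the action distribution is written as [P(Go), P(NoGo)].

```python
import math

def shannon_entropy(probs):
    """Shannon entropy, in bits, of a discrete probability distribution.

    Zero-probability outcomes contribute nothing (the limit of p*log(1/p) is 0).
    """
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

# Before sampling, or after sampling the uninformative cloud:
# Go and NoGo are equally likely.
h_uninformed = shannon_entropy([0.5, 0.5])

# After sampling a perfectly reliable traffic light:
# the optimal action is known with certainty.
h_perfect_cue = shannon_entropy([1.0, 0.0])

# Hypothetical intermediate case (an assumption, not from the figure):
# the cue leaves a 10% chance that the other action is optimal.
h_partial_cue = shannon_entropy([0.9, 0.1])

print(h_uninformed, h_perfect_cue, round(h_partial_cue, 3))
```

Under these assumptions the equiprobable case gives 1 bit and the perfectly reliable cue gives 0 bits, matching the caption; a partially reliable cue falls between the two.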

(B) Non-instrumental sampling: The cues indicate a pre-ordained outcome, but the agent cannot alter that outcome. The agent decides whether to sample cue A or cue B and discriminates the signal given by the sampled cue. Signals A1 and A2, produced upon sampling cue A, predict with certainty whether the reward will be large or small. Signals B1 and B2, produced upon sampling cue B, are random and do not reduce the uncertainty about reward size.
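The contrast between cues A and B in panel (B) can be expressed as the mutual information between the observed signal and the reward size. The following is a minimal sketch, not the authors' analysis: the function name `mutual_information`, the joint probability tables, and the assumption that large and small rewards are equally likely a priori are all introduced here for illustration.

```python
import math

def mutual_information(joint):
    """Mutual information I(signal; reward), in bits.

    joint[s][r] is the joint probability of signal s and reward size r.
    """
    p_signal = [sum(row) for row in joint]        # marginal over rewards
    p_reward = [sum(col) for col in zip(*joint)]  # marginal over signals
    return sum(
        p * math.log2(p / (p_signal[s] * p_reward[r]))
        for s, row in enumerate(joint)
        for r, p in enumerate(row)
        if p > 0
    )

# Rows: signals (A1, A2) or (B1, B2); columns: reward (large, small).
# Assumed prior: large and small rewards are equally likely.
cue_a = [[0.5, 0.0],    # A1 always precedes a large reward
         [0.0, 0.5]]    # A2 always precedes a small reward
cue_b = [[0.25, 0.25],  # B1 is independent of reward size
         [0.25, 0.25]]  # B2 is independent of reward size

info_a = mutual_information(cue_a)
info_b = mutual_information(cue_b)
print(info_a, info_b)
```

With these assumed tables, sampling cue A removes the full 1 bit of uncertainty about reward size, while sampling cue B removes none, even though neither cue changes the outcome itself.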