2014 Aug 20;40(2):454–462. doi: 10.1038/npp.2014.193

Figure 2.


Intuition for the explore parameter. The strength of a subject's belief (y axis) about the probability of a better-than-expected outcome (x axis) depends on both the stage of learning and the task condition, shown here for the IEV (left) and DEV (right) conditions. In early stages (dashed lines) of the IEV condition (left panel), the subject has not yet learned whether a faster (in gray) or slower (in black) response is more likely to be rewarded, a state reflected in both a weaker belief strength (value on the y axis) and greater uncertainty (a broader belief distribution). Later in the task (solid lines), belief strength increases and the uncertainty about reward likelihood decreases; ie, the subject learns that slower responses are more likely to yield reward (ie, a positive prediction error). The explore parameter indexes the degree to which subjects use the relative uncertainty between the faster and slower distributions to explore the reward space (ie, to reduce the variance in the more uncertain distribution). Similar qualitative changes in the probability of reward, with a different sign relative to the fast and slow distributions, can be seen for the DEV condition (right panel).
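The idea in this caption can be illustrated with a minimal Python sketch. It rests on assumptions that the caption does not state: each belief is modeled as a Beta distribution over the probability of a positive prediction error, uncertainty is taken as that distribution's standard deviation, and the explore term is a weight `epsilon` times the difference in uncertainty between the slow and fast beliefs. The function names, the sign convention (positive values favor slower responses), and the example counts are all hypothetical.

```python
import math


def beta_sd(alpha, beta):
    """Standard deviation of a Beta(alpha, beta) belief distribution."""
    total = alpha + beta
    return math.sqrt(alpha * beta / (total ** 2 * (total + 1)))


def update(alpha, beta, better_than_expected):
    """Assumed update rule: count one more positive or negative prediction error."""
    if better_than_expected:
        return alpha + 1, beta
    return alpha, beta + 1


def explore_bonus(epsilon, slow, fast):
    """Assumed explore term: epsilon times (uncertainty of slow - uncertainty of fast).

    A positive value pushes toward slower responses, the more uncertain option.
    """
    return epsilon * (beta_sd(*slow) - beta_sd(*fast))


# Early in learning: both beliefs are flat and equally uncertain.
slow = (1, 1)
fast = (1, 1)
early_bonus = explore_bonus(1.0, slow, fast)

# The subject then samples only fast responses (hypothetical outcomes),
# which narrows the fast belief while the slow belief stays broad.
for outcome in (True, False, True, True, False):
    fast = update(*fast, outcome)
later_bonus = explore_bonus(1.0, slow, fast)
```

In this sketch `early_bonus` is zero because the two beliefs are equally broad, and `later_bonus` is positive because the slow belief remains the more uncertain one, so a subject with a larger `epsilon` would shift further toward slower responses to reduce that uncertainty.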