2016 Apr 21;5:e13747. doi: 10.7554/eLife.13747

Figure 4. Human decision-making Experiment-1.

(A) On each trial, subjects chose either of two colored targets (Red or Blue in this example). Given Red, cue S+ (oval) or S0 (triangle) was presented, each with probability 0.5; S+ was followed by a reward (an erotic picture) after time TDelay, while S0 was not followed by reward. Given Blue, either a reward or nothing followed after the fixed time delay TDelay with probability 0.5 each. (B) Results. Human participants (n=14) showed a significant modulation of choice over delay conditions [one-way ANOVA, F(3,52)=3.09, p=0.035]. They showed a significant preference for the 100% info target (Red) at long delays [20 s: t(13)=3.14, p=0.0078; 40 s: t(13)=2.60, p=0.022]. The mean ± SEM is indicated by the solid line. The dotted line shows simulated data using the fitted parameters. (C) Mean Q-values of targets and predicting cues estimated by the model. The value of the informative cue is the mean of the values of the reward-predictive cue (oval), which has an inverted U-shape due to positive anticipation, and the no-reward-predictive cue (triangle), which has the opposite U-shape due to negative anticipation. The positive anticipation peaks at around 25 s, which is consistent with the animal studies shown in Figure 3(B,C). See Table 2 for the estimated model parameters. (D) Model comparison based on integrated Bayesian Information Criterion (iBIC) scores. The lower the score, the more favorable the model. Our model of RPE-boosted anticipation with a negative value for no-outcome achieves a significantly better score than the one without a negative value, the one without RPE-boosting, the one without temporal discounting, or other conventional Q-learning models with or without discounting.
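
The averaging described for panel C can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' model code: the function name `informative_target_value` and the numeric cue values are hypothetical, and only the 0.5 cue probability is taken from the caption.

```python
def informative_target_value(q_reward_cue, q_no_reward_cue, p_reward_cue=0.5):
    """Probability-weighted mean of the two cue values.

    q_reward_cue:    value of the reward-predictive cue (S+, oval)
    q_no_reward_cue: value of the no-reward-predictive cue (S0, triangle)
    p_reward_cue:    probability that S+ is shown (0.5 in the caption)
    """
    return p_reward_cue * q_reward_cue + (1.0 - p_reward_cue) * q_no_reward_cue


# Hypothetical values for illustration only (not fitted parameters):
# a positive value for S+ and a negative value for S0.
example_value = informative_target_value(1.0, -0.5)
print(example_value)
```

With equal cue probabilities this reduces to the plain mean of the two cue values, so a sufficiently negative value for the no-reward cue can pull the informative option's value down even when the reward cue's value is high.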

DOI: http://dx.doi.org/10.7554/eLife.13747.010

Figure 4—figure supplement 1. (A) Control experiment, where the first block and the last (5th) block of the experiment had the same delay duration of 2.5 s. Subjects showed no difference [t(10)=1.04, p=0.32] in preference before and after experiencing the other delay conditions. (B) A large change in delay duration affects choice behavior. The Y-axis shows the difference in choice percentage between the shortest (2.5 s) and the longest (40 s) delay conditions. In our main experiment, the delay duration was gradually increased (Left), while in the control experiment, the delay was abruptly increased. The difference between the two procedures was significant [two-sample t(23)=2.15, p=0.042]. Subjects reported a particularly unpleasant feeling for the long delay condition in the control experiment.

Figure 4—figure supplement 2. Choices generated by the model without the negative value assigned to the no-reward outcome. The model fails to capture choice behavior at the short delay (7.5 s). This corresponds to the time point at which the effect of negative anticipation was the largest, according to the model with R2.