Figure 9. Theoretical Reward Prediction Error Computations.
Value of the reward prediction error term, $\alpha[R_t - V_{t-1}]$, computed using the update equation $V_t = V_{t-1} + \alpha[R_t - V_{t-1}]$. Plotted in black, $\alpha = 0.5$; plotted in gray, $\alpha = 0.7$. A unit-value reward was simulated during trial t, followed by no reward on each of the next ten trials.
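The curves described in this caption can be reproduced with a short simulation. The following is a minimal sketch rather than the authors' code: the function name `simulate_rpe`, the `rewards` list, and the starting value `v0 = 0` (no reward expectation before the rewarded trial) are assumptions introduced here for illustration.

```python
def simulate_rpe(alpha, rewards, v0=0.0):
    """Iterate V_t = V_{t-1} + alpha * (R_t - V_{t-1}).

    Returns the scaled prediction error alpha * (R_t - V_{t-1}) and the
    updated value V_t for every trial.
    v0 is an assumed initial value; the caption does not state it.
    """
    v = v0
    errors, values = [], []
    for r in rewards:
        delta = alpha * (r - v)  # scaled reward prediction error
        v = v + delta            # value update
        errors.append(delta)
        values.append(v)
    return errors, values


# One unit reward, then ten unrewarded trials, as in the caption.
rewards = [1.0] + [0.0] * 10

for alpha in (0.5, 0.7):
    errors, values = simulate_rpe(alpha, rewards)
    print(f"alpha={alpha}:", [round(e, 4) for e in errors])
```

Under these assumptions the error term is positive on the rewarded trial (equal to α) and negative on every following trial, shrinking toward zero by a factor of (1 − α) per trial.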