Author manuscript; available in PMC: 2018 Sep 15.
Published in final edited form as: Biol Psychiatry. 2017 May 31;82(6):431–439. doi: 10.1016/j.biopsych.2017.05.017

Figure 1. Experimental protocol.

A) Learning phase. Participants learn to select one of three actions (key presses A1–3) for each stimulus in a block, using reward feedback. Incorrect choices lead to feedback 0, while correct choices lead to reward, either +1 or +2 points, probabilistically. The probability of obtaining 2 vs. 1 points is fixed for each stimulus and drawn from the set {0.2, 0.5, 0.8}. The number of stimuli in a block (set size ns) varies from 1 to 6. B) In learning blocks, stimuli are presented individually, randomly intermixed. Delay indicates the number of trials that occurred since the last correct choice for the current stimulus. C) In a surprise test phase following learning, participants are asked to choose the more rewarding stimulus among pairs of previously encountered stimuli, without feedback. D) The computational model assumes that choice during learning comes from two separate systems (working memory, WM, and reinforcement learning, RL), making behavior sensitive to load, delay, and reward history. In contrast, test performance depends only on RL, such that if RL and WM are independent, choice at test should depend only on reward history. E) 100 simulations of the computational model with the new design for two sets of parameters representing poor WM use (capacity 2) and good WM use (capacity 3), respectively. Left: Learning curves indicate the proportion of correct trials as a function of the number of encounters with given stimuli in different set sizes. Middle: The difference in overall proportion of correct choices between subsequent set sizes shows a maximal drop in performance between set sizes 2 and 3 with capacity 2, while the drop is maximal between set sizes 3 and 4 for capacity 3. Right: Assuming RL is independent of WM, the learned RL value at the end of each block is independent of set size (colors) and capacity (top vs. bottom), but is sensitive to the probability of obtaining 2 vs. 1 points in correct trials.
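The feedback rule in panel A and the delay measure in panel B can be illustrated with a minimal Python sketch. It is not the authors' task code or their computational model. The helper names (`make_block`, `feedback`, `delays`) are hypothetical. The uniform random assignment of correct actions and reward probabilities is an assumption, as is counting delay as the difference in trial indices; the caption specifies neither.

```python
import random

# Values taken from the caption of panel A.
N_ACTIONS = 3                      # key presses A1-3
P_HIGH_OPTIONS = (0.2, 0.5, 0.8)   # probability of +2 (vs. +1) on correct trials


def make_block(set_size, rng):
    """Build one learning block of `set_size` stimuli (hypothetical helper).

    Assumption: the correct action and the +2 probability are drawn
    uniformly at random for each stimulus.
    """
    return [
        {"correct_action": rng.randrange(N_ACTIONS),
         "p_high": rng.choice(P_HIGH_OPTIONS)}
        for _ in range(set_size)
    ]


def feedback(stimulus, action, rng):
    """Return 0 for an incorrect choice; otherwise +2 with probability
    `p_high` and +1 with probability 1 - `p_high`."""
    if action != stimulus["correct_action"]:
        return 0
    return 2 if rng.random() < stimulus["p_high"] else 1


def delays(stimulus_ids, was_correct):
    """For each trial, count trials since the last correct choice for the
    same stimulus (None if there has been no correct choice yet).

    Assumption: delay is the difference in trial indices, so an immediate
    repeat after a correct choice has delay 1.
    """
    last_correct = {}
    out = []
    for t, (stim, correct) in enumerate(zip(stimulus_ids, was_correct)):
        out.append(t - last_correct[stim] if stim in last_correct else None)
        if correct:
            last_correct[stim] = t
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    block = make_block(set_size=3, rng=rng)
    # A random responder, only to exercise the feedback rule.
    for stim in block:
        action = rng.randrange(N_ACTIONS)
        print(stim, action, feedback(stim, action, rng))
```

Because correct choices always earn at least 1 point, the 2-vs-1 probability affects the size of the reward but not which action is correct. This is why reward history can be separated from accuracy in the test phase of panel C.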