Fig. 1.
Task outline and classifier construction. (A) Reversal task setup. Subjects choose one of two fractals, which on each trial are randomly placed to the left or right of the fixation cross. The chosen stimulus is illuminated until 2 s after trial onset. After a further 1 s, a reward (winning 25 cents) or punishment (losing 25 cents) is delivered for 1 s, with the total money earned displayed at the top. The screen is then cleared, and a central fixation cross is presented for 8 s before the next trial begins. One stimulus is designated the correct stimulus, in that choosing it leads to a monetary reward 70% of the time and a monetary loss 30% of the time. The other stimulus is “incorrect,” in that choosing it leads to a reward 40% of the time and a punishment 60% of the time. After subjects choose the correct stimulus on four consecutive occasions, the contingencies reverse with a probability of 0.25 on each successive trial. Subjects have to infer that the reversal has taken place and switch their choice, at which point the process is repeated. The last three scans in a trial are used by our classifier to decode whether subjects will switch their choice on the next trial. A canonical BOLD response elicited at the time of reward receipt is shown (in green) to illustrate the time points in the trial at which the hemodynamic response is sampled for decoding purposes. A new trial is triggered every 12 s to ensure adequate separation of hemodynamic signals related to choices on consecutive trials. Specifically, the average of the three scans between outcome delivery and the time of choice on the next trial is used to decode the subject's behavioral choice on that next trial. 
These three time points therefore contain not only activity from the decision itself (activity taking place after the receipt of feedback, but before the next trial) but also activity from the reward/punishment received in the current trial and activity consequent to the choice made in the current trial. (B) The multivariate region classifier used in this study is divided into two parts. The first part (Left) extracts a representative signal from each region of interest by computing a weighted average of the voxels within the region, with each voxel weighted by its discriminability between the switch and stay conditions. To avoid overfitting the fMRI data, we did not take into account the correlations between voxels within a region of interest (Eq. 3). The second part of the classifier (Right) sums the signals from all regions, each weighted by the region's importance in classifying the subject's decision (Eq. 2). These region weights are calculated by a multivariate classifier that uses each region's decoding strength, together with the correlations between regions, to maximize accuracy in decoding whether subjects are going to switch or stay (see Discriminative Analysis).
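The reward contingencies and reversal rule described in (A) can be illustrated with a minimal simulation. This is only a sketch under stated assumptions, not the code used in the study: the function names, the win-stay/lose-shift subject, and the random choice of the initially correct stimulus are hypothetical, and we assume that once the four-in-a-row criterion has been met, a reversal is drawn with probability 0.25 at the start of every following trial regardless of the subject's later choices.

```python
import random

# Values stated in the caption.
P_REWARD = {True: 0.70, False: 0.40}  # P(reward) for the correct / incorrect stimulus
STAKE_CENTS = 25                      # amount won or lost on each trial
CRITERION = 4                         # consecutive correct choices before reversals can occur
P_REVERSAL = 0.25                     # per-trial reversal probability once the criterion is met


def win_stay_lose_shift(last_choice, last_outcome):
    """Hypothetical subject: repeat the choice after a win, switch after a loss."""
    if last_choice is None:           # first trial: arbitrary starting choice
        return 0
    return last_choice if last_outcome > 0 else 1 - last_choice


def simulate_reversal_task(choose, n_trials, seed=None):
    """Simulate the task; returns (per-trial records, total earnings in cents)."""
    rng = random.Random(seed)
    correct = rng.randrange(2)        # assumption: initially correct stimulus is random
    streak = 0                        # current run of consecutive correct choices
    armed = False                     # True once the criterion has been met
    total = 0
    last_choice = last_outcome = None
    history = []
    for trial in range(n_trials):
        reversed_now = False
        # Assumption: the reversal draw happens at trial onset, before the choice.
        if armed and rng.random() < P_REVERSAL:
            correct = 1 - correct
            streak, armed, reversed_now = 0, False, True
        choice = choose(last_choice, last_outcome)
        is_correct = choice == correct
        rewarded = rng.random() < P_REWARD[is_correct]
        outcome = STAKE_CENTS if rewarded else -STAKE_CENTS
        total += outcome
        streak = streak + 1 if is_correct else 0
        if streak >= CRITERION:
            armed = True
        history.append({
            "trial": trial,
            "choice": choice,
            "correct": correct,
            "outcome": outcome,
            "reversed": reversed_now,
            # Switch vs. stay relative to the previous trial: the label being decoded.
            "switch": last_choice is not None and choice != last_choice,
        })
        last_choice, last_outcome = choice, outcome
    return history, total


if __name__ == "__main__":
    history, total = simulate_reversal_task(win_stay_lose_shift, n_trials=100, seed=1)
    n_switch = sum(h["switch"] for h in history)
    print(f"earned {total} cents, switched on {n_switch} of {len(history)} trials")
```

The `switch` field of each record corresponds to the switch vs. stay label that the classifier in (B) is trained to predict from the three scans preceding that trial.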