Abstract
In natural behavior animals actively gather information that is relevant for learning or actions, but the mechanisms of active sampling are rarely investigated. We tested parietal neurons involved in oculomotor control in a task in which monkeys made saccades to gather visual information before reporting a decision based on the information. We show that the neurons encode, before the saccade, the information gains (reduction in decision uncertainty) that the saccade was expected to bring, correlating with the monkeys’ efficiency in processing the information in the post-saccadic fixation. Informational sensitivity is independent of the neurons’ reward sensitivity, which is unreliable across task contexts, inconsistent with the view that the cells encode economic utility. Instead, we suggest that parietal cells are involved in implementing active sampling policies, showing uncertainty-dependent boosts of neural gain that facilitate the selection of relevant cues and the efficient use of the information delivered by these cues.
Introduction
To survive and thrive in complex environments animals must make decisions under uncertainty and gather information that reduces that uncertainty. In neuroscience and psychology, information accumulation is studied by presenting participants with given (experimenter-selected) sensory cues and asking them to make decisions based on those cues1. In natural behavior, however, animals use active sensing strategies whereby they endogenously decide what information to sample (i.e., which stimuli to listen to, touch or look at to guide future actions), but the neural mechanisms of these strategies are seldom investigated2–4.
In humans and monkeys, the primary means of sampling visual information are rapid eye movements (saccades) that place the fovea on selected items in visual scenes. Cortical neurons involved in spatial attention and saccadic decisions have spatial receptive fields (RF) and selectively encode attention-worthy stimuli and locations5, 6, but the significance of these selective responses has been intensely debated. A longstanding question, debated primarily in the lateral intraparietal area (LIP), is whether saccade-related responses encode attentional priority or the reward values of alternative actions4, 7.
We have argued that understanding this question requires the use of behavioral tasks in which participants deploy eye movements not merely to gather rewards but also in their natural role of gathering information 2–4. To this end we devised a new “two-step” task in which monkeys made two coordinated saccades - an initial saccade to gather information from a visual cue and a subsequent saccade to report a decision based on the information. In an initial study using this paradigm we showed that pre-saccadic responses of LIP cells encoded the percent validity of alternative cues – i.e., the extent to which a cue, when examined during the post-saccadic fixation, will reduce the uncertainty of the subsequent action8.
Here we extend these results by showing that the neurons also encode expected information gains based on dynamic changes in decision uncertainty. LIP neurons had stronger pre-saccadic responses if the monkeys had ex ante decision uncertainty and expected the initial saccade to reduce that uncertainty, relative to an alternative context in which the monkeys had prior knowledge of the appropriate final action and expected the saccade merely to bring redundant information. Moreover, the neural sensitivity to information gains was uncorrelated with the neurons’ sensitivity to reward gains, and reward sensitivity showed positive or negative scaling in different task contexts, inconsistent with the idea that the cells encode the economic utility of alternative actions. Instead, the findings support a two-stage model of attention control9. A monitoring stage, implemented in areas other than LIP, allocates control based on the benefits and costs of competing actions, while a regulatory stage, which includes the parietal cortex, implements the required control, in part through uncertainty-dependent enhancement of neural gain that enables animals to select informative cues and efficiently process the information conveyed by these cues.
Results
Two monkeys (Macaca mulatta) performed an information sampling task in which they made two coordinated saccades on each trial (Fig. 1a): a first saccade to obtain information from a visual cue (100% coherent motion toward one of the two decision alternatives) and a second saccade to indicate the final decision based on the information (a saccade to one of the alternatives).
Figure 1: Task design.
a. Trial structure All the trials had an identical structure and differed only in whether they appeared in INF (top) or unINF (bottom) blocks. Each trial started with a Fixation stage when the monkeys were informed about the reward size (large or small, 300 ms vs 100 ms solenoid open time, respectively; signaled by fixation point color), and the block type (INF or unINF, signaled by a shape around the fixation point). This was followed by the onset of the trial display containing two targets (white squares) and a cue (a patch of small stationary dots). After viewing the display for 500 ms during central fixation (Delay period), the fixation point disappeared, instructing the monkeys to make the first saccade to the cue. When the monkeys fixated the cue for 100 ms, the dots began to move with 100% coherence toward one of the targets (Motion and second saccade, gray arrows). At this point the monkeys were free to make their final decision and received a reward if they made a second saccade (cyan arrow) to the target that had been cued by the motion. The dark frames highlighting the Delay/First saccade periods indicate the epochs of interest for the neural data analysis. unINF blocks (bottom two rows) had an identical trial structure but were distinguished from INF blocks by the distribution of the correct second saccade targets. In INF blocks, the correct target was randomly selected on each trial, whereas in unINF blocks a single target was correct for 50 consecutive correct trials (“up” or “down” in this cartoon example). Therefore, whereas in an INF block the monkeys started the trial with decision uncertainty and resolved this uncertainty by viewing the motion, in unINF blocks the uncertainty was resolved from the outset of the block and the motion provided redundant information. b. Example block sequence in a recording session INF and unINF conditions were presented in alternating blocks of 50 correct trials until at least 4 blocks were completed.
Small and large reward sizes were pseudo-randomly interleaved within each block. The block type that was presented first (INF/unINF) and the target direction in the first unINF block were randomized across sessions. c. Factorial design The 2 × 2 factorial design dissociating information gains (IG) and reward size (RS). The colors indicate the convention we use throughout the paper, whereby INF blocks are shown in red and unINF blocks in blue, and saturated/pale colors indicate, respectively, large and small reward sizes.
We varied the information gains (IG) of the initial saccade through blockwise manipulations of the monkeys’ ex ante decision uncertainty. In informative trial blocks (INF, 50 correct trials), the monkeys started each trial with uncertainty about their final decision and could expect that the motion will resolve their uncertainty (indicating which one of the two equally likely targets was correct; Fig. 1a, top). In uninformative blocks, in contrast (unINF, 50 correct trials), the correct target was fixed across trials, so that the monkeys could identify the correct second saccade in advance and the motion merely confirmed their prior expectations (Fig. 1a, bottom). INF and unINF conditions were presented in alternation with the initial block type randomized across sessions (Fig. 1b).
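One way to make the manipulation concrete is to express information gain as the drop in Shannon entropy over the two targets once the motion has been viewed. The sketch below is an idealization rather than the authors' analysis: it assumes an exactly 50/50 prior in INF blocks, a fully certain prior in unINF blocks, and a fully coherent motion cue that removes all remaining uncertainty. The function names are hypothetical.

```python
import math

def entropy_bits(probs):
    """Shannon entropy (in bits) of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def information_gain(prior, posterior):
    """Reduction in entropy from prior to posterior, in bits."""
    return entropy_bits(prior) - entropy_bits(posterior)

# Assumed idealized beliefs over the two targets (not measured values).
inf_prior = [0.5, 0.5]    # INF block: either target equally likely
uninf_prior = [1.0, 0.0]  # unINF block: correct target known in advance
posterior = [1.0, 0.0]    # belief after viewing 100% coherent motion

ig_inf = information_gain(inf_prior, posterior)
ig_uninf = information_gain(uninf_prior, posterior)
print(ig_inf, ig_uninf)
```

Under these assumptions the first saccade is worth one bit in INF blocks and nothing in unINF blocks, even though the saccade itself and the reward contingencies are identical.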
Importantly, the monkeys had to make their first saccade to the cue in both INF and unINF blocks, meaning that this saccade had reward value independently of IG. We further manipulated the reward size (RS) for the second saccade, signaling to the monkey whether the trial will deliver a large or small reward by means of the fixation point color, with RS randomly interleaved in each block (Fig. 1a). This created a 2 × 2 factorial design that statistically dissociated RS and IG (Fig. 1c).
Behavior
RS affected the monkeys’ motivation, as shown by the fact that the monkeys had higher rates of trial completion at large relative to small RS (Fig. 2, right, 2-way ANOVA, main effect of RS: F(1,344) = 20.0, p < 10−5). However, RS had only a modest influence on the initial cue-directed saccade (Fig. S1a) and the monkeys’ final decision, including the accuracy and time to make the decision (post-saccadic motion viewing time (VT); Fig. 2 left and center panels, 2-way ANOVA; main effect of RS on VT: F(1,344) = 5.5, p = 0.02; IG*RS interaction, F(1,344) = 0.69, p = 0.4; main effect of RS on accuracy: F(1,344) = 0.6, p = 0.4; IG*RS interaction, F(1,344) = 10−5, p = 1.0; see Fig. 6c and Fig. S5 for individual monkey behavior).
Figure 2: Behavior was sensitive to IG.
Each point shows the mean and standard errors (SEM) across all neural recording sessions (n=87), indicating decision accuracy (the probability of selecting the correct final target), viewing times (VT) (the time that the monkeys spent viewing the motion before their second saccade) and completion rate (the fraction of trials in which the monkeys traversed all task states up to the second saccade).
Figure 6: the strength of IG signals correlates with performance.
a. For a fixed VT, stronger IG modulations are associated with higher decision accuracy. The distribution of INF trials for both RS was divided into deciles according to VT, and mean decision accuracy was plotted as a function of deciles (n=87, error bars show SEM), separately for neurons showing large or small βIG (median split). Decision accuracy was higher in sessions in which neurons had higher βIG. This difference was found on large reward trials (main panels) but not small reward trials (inset). b. Stronger RS modulations are not related to higher decision accuracy. Same format as in a, but splitting sessions according to the βRS of the recorded cell. Note that βRS > median means a weaker reward effect (less negative coefficient). c. IG modulations do not correlate with speed accuracy tradeoffs. Each point is one session (n=87), with color denoting individual monkeys. The abscissa shows the difference between the speed-accuracy index (calculated as described in Fig. S5) on large reward relative to small reward trials. The ordinate is the difference between the βIG on the same trials. The dashed diagonal line is the least square regression; r and p values refer to the correlation coefficient.
In contrast with the weak effects of RS, IG strongly modulated both the timing and accuracy of the final decision. In unINF blocks the monkeys spent minimal time viewing the motion and nevertheless reached very high decision accuracy (Fig. 2, center and left panels, blue) whereas in INF blocks VT were significantly longer and decision accuracy dropped (Fig. 2, red; 2-way ANOVA; main effect of IG on VT: F(1,344) = 388, p<10−57; accuracy F(1,344) = 222, p<10−38; all p < 10−24 in individual monkeys). Completion rates were insensitive to IG (main effect of IG: F(1,344) = 0.8, p = 0.4; IG*RS interaction: F(1,344) = 0.7, p = 0.4) showing that the monkeys were motivated to complete INF and unINF blocks. IG had a minimal influence on the metrics of the initial cue-directed saccade (Fig. S1a), and there was no negative correlation between the first saccade latency and the post-saccadic viewing durations, ruling out that the monkeys traded off the time they spent preparing the first and second saccades (Fig. S1b). Thus, monkeys were highly sensitive to IG and required more time to link the motion to the correct final action when the motion provided new rather than redundant information.
LIP neurons respond more for informative and low reward trials
To understand the neural mechanisms related to informational actions we placed the cue in the receptive field (RF) of an LIP cell and focused on its responses during the delay period when the monkeys prepared their initial cue-directed saccade (Fig. 1a, delay/1st saccade; see Fig. S2 for control geometries). The neurons had visual responses to the onset of the cue, followed by sustained delay period activity lasting until the onset of the cue-directed saccade (Fig. 3). For many neurons, this pre-saccadic activity differed as a function of IG and RS. Some neurons, like the example cell shown in Fig. 3a (left), responded more strongly in INF relative to unINF blocks while other cells, like the example shown in Fig. 3a (right), were sensitive to RS. Strikingly, the RS-sensitive cells were typically enhanced by smaller rewards, opposite to the effect commonly observed in this area (e.g.,10–12). The average population response (n = 87; n = 49 in monkey M) showed both effects, with stronger firing rates in INF relative to unINF blocks (Fig. 3b, red vs blue) and on small-reward relative to large-reward trials (Fig. 3b, left vs right).
Figure 3: Neural responses are stronger on INF blocks and small reward trials a. Left: Example neuron with stronger responses in INF relative to unINF blocks.
Peristimulus time histograms (PSTHs) were constructed by convolving trial-by-trial spike trains with a Gaussian filter of 10 ms standard deviation and averaging across trials. From the time of cue onset (left) through the end of the delay period, the monkeys maintained central fixation and the stationary dots were located inside the RF of the cell (dark cone in the cartoon). At the time of saccade onset (right alignment) the monkeys initiated their first saccade to the dots. This neuron had higher activity for INF vs unINF blocks, with no sensitivity to RS. After the first saccade the RF moved away from the visual display and the neuron no longer had task-related responses. PSTHs were produced for each of the 87 recorded neurons, and their means are presented in b. Right: Example neuron with stronger responses for smaller rewards. A different neuron that had higher activity on small reward versus large reward trials but was insensitive to IG. Conventions as in a. b. Population response (n = 87 cells) Population PSTHs were constructed by z-scoring the raw firing rates within each neuron (using its activity across the entire displayed epoch) and averaging across neurons. For clarity, large and small reward sizes are shown in separate panels. The population response is larger in the INF context (red vs blue in each panel) and on trials with smaller rewards (left vs right).
To quantitatively measure these modulations we fit the delay activity of each cell to a linear model that included IG and RS as categorical regressors, along with nuisance regressors related to the 1st saccade latency, velocity, endpoint accuracy, post-saccadic VT and the direction of the final saccade (Methods, eq. 1). While sensitivity to saccade parameters was slight (Fig. S3a), sizeable fractions of cells showed significant IG and RS modulations. 38% of the cells showed significant effects of RS (Fig. 4; monkey M: 29%; monkey S: 50%). The average βRS coefficient was negative across the population (Fig. 4, black triangle; −1.7 (0.35) sp/s, z = −4.6, p < 10−5 relative to 0; monkey M: z = −3.1, p = 0.0022; monkey S: z = −3.3, p = 0.0011) and in 79% of the RS-sensitive cells (teal triangle: −3.3 (0.8) sp/s, z = −3.4, p < 10−4; 85% in monkey M, 74% in monkey S) indicating that the predominant response was enhancement for smaller RS. Significant effects of IG were found in 39% of the cells (Fig. 4, ordinate; monkey M: 47%; monkey S: 29 %). The IG coefficient was positive across the population (Fig. 4, black triangle; 2.3 (0.6) sp/s, z = 4.2, p < 10−4 relative to 0; monkey M, z = 3.7, p = 0.0002; monkey S: z=2.0, p = 0.05), and in 85% of the significant cells (orange triangle: 5.5 (1.2) sp/s, z=3.8, p < 10−4; M: 91%; monkey S: 73%) indicating that most cells had higher firing rates for INF relative to unINF blocks.
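The core of such a per-cell regression can be illustrated with a stripped-down version that keeps only an intercept and the two categorical regressors. This is a minimal sketch, not the model of Methods eq. 1: the nuisance regressors are omitted, the firing rates are invented for illustration, and the helper names (`solve_linear_system`, `fit_ols`) are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_ols(X, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Invented delay-period firing rates (sp/s) for one cell; NOT real data.
# Each trial is (IG, RS, rate), with IG = 1 for INF and RS = 1 for large reward.
trials = [
    (0, 0, 20.0), (0, 0, 22.0),
    (0, 1, 18.0), (0, 1, 20.0),
    (1, 0, 23.0), (1, 0, 25.0),
    (1, 1, 21.0), (1, 1, 23.0),
]
X = [[1.0, ig, rs] for ig, rs, _ in trials]
y = [rate for _, _, rate in trials]
b0, beta_ig, beta_rs = fit_ols(X, y)
print(round(b0, 3), round(beta_ig, 3), round(beta_rs, 3))
```

With this coding, a positive `beta_ig` means higher firing in INF blocks and a negative `beta_rs` means higher firing on small-reward trials, which is the sign convention used for βIG and βRS in the text.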
Figure 4. Distribution of IG and RS effects in individual cells:
Coefficients of a regression model (Methods, eq. 1) capturing the effects of IG in units of spikes/second (βIG; ordinate) and RS (βRS; abscissa). Each point is one cell (n=87), and colors indicate the significance of the two coefficients. In the marginal distributions, significant cells are indicated in darker shades and the arrowheads indicate the average values across the entire sample (black) and the subset of cells with significant coefficients (teal/orange). The gray vertical and horizontal lines show the null effects (βIG = 0 and βRS = 0). The dashed diagonal line is the least square regression; r and p values refer to correlation coefficient.
We conducted several analyses to estimate the reliability of these neural effects. In the significant cells, the average RS and IG coefficients represented a change of more than 10% relative to the neurons’ average firing rate (RS: −11% (2.7%), IG: 13% (2.4%)) and more than 35% of the firing rate standard deviation (RS: −0.36 (0.08) zscore units, IG, 0.43 (0.08) zscore units). The results were replicated when using receiver operating characteristic (ROC) analysis, which revealed significant discrimination across the population (Fig. S3b: RS: mean 0.55 (0.011), z = 4.0, p < 10−4, IG: mean 0.55 (0.012), z = 4.0, p < 10−4) and in a sizeable fraction of cells (RS: 38%; IG: 44%). Time-resolved analysis showed that the sensitivity to IG was significant throughout the delay period (Fig. S3c). Finally, the variance in firing rates was smaller in unINF relative to INF blocks, consistent with the lower firing rates in unINF blocks but at odds with the idea that these blocks reflected mixtures of states related to the two anticipated directions of the final saccade (Fig. S3d,e).
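For readers unfamiliar with ROC indices, the area under the ROC curve equals the probability that a response drawn from one condition exceeds a response drawn from the other, with ties counted as one half. The sketch below computes it that way. The spike counts are invented, the function name is hypothetical, and the choice of which condition counts as "positive" is an assumption, since the text does not state the convention used.

```python
def roc_area(pos, neg):
    """Area under the ROC curve, computed as the probability that a value
    from `pos` exceeds a value from `neg` (ties count one half)."""
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Invented delay-period spike counts for one cell; NOT real data.
inf_counts = [12, 15, 14, 10, 13]   # trials in INF blocks
uninf_counts = [11, 9, 14, 8, 10]   # trials in unINF blocks

auc = roc_area(inf_counts, uninf_counts)
print(auc)
```

A value of 0.5 indicates no discrimination between conditions, so a population mean of 0.55 corresponds to a modest but reliable separation.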
IG responses are not value effects
We conducted several analyses to determine whether the neurons’ IG modulations were explained by their reward sensitivity. In a first analysis we asked whether the cells encode the expected value (EV) of the initial saccade, defined as the product of RS and reward rate in each trial type (Fig. 5a). EV primarily depended on RS as intended in the design of the task, and in addition was larger in unINF relative to INF blocks because of the monkeys’ higher decision accuracy in the former blocks (Fig. 5a). Thus, EV showed significant positive effects of RS and IG, and a negative IG*RS interaction (2-way ANOVA: RS main effect F(1,344) > 1060, p <10−50 for the full data set and each monkey; IG main effect: all F(1,344) > 90, p < 10−18; IG*RS interaction, all F(1,344) > 8.5, p < 10−3). This pattern qualitatively differed from the average LIP firing rates, which showed a negative effect of RS and no IG*RS interaction (Fig. 5c; 2-way ANOVA, main effects of IG and RS, both F(1,344) > 39, p = 10−9; F(1,344) >7, p < 0.01 in each monkey individually; interaction, F(1,344)=1.3, p = 0.3; monkey M, F(1,344)=1.8, p = 0.2; monkey S, F(1,344)=0.03, p = 0.9).
Fig. 5. IG signals cannot be explained by VOI or EV.
a. The EV experienced by the monkeys. We estimated EV as the product of RS and decision accuracy. Each point is the average and SEM across sessions (n=87). b. The VOI experienced by the monkeys, measured as the difference in EV between trials in which the monkeys viewed the dot motion and catch trials in which the motion was absent and the monkeys guessed the direction of the final saccade. Each point is the average and SEM across sessions (n=87). c. Population firing rates were inconsistent with VOI or EV. Each point is the FR during the delay period, in the full data set (thick traces) and individual monkeys (thin traces). Symbols show mean and SEM across cells (n=87). d: The majority of neurons do not show IG*RS interactions. Distribution of interaction coefficients from the 3-parameter model. Significant coefficients were found in only 8 cells (black bars), of which only 3 were positive. The gray triangle shows the mean coefficient and the vertical line shows abscissa=0. e: INF coefficients are equivalent when estimated with the 2-parameter and 3 parameter models. Each point is one cell (n=87). The line is the best fit least squares linear regression.
We next examined whether the cells may encode the value of information (VOI) – a higher order reward function that measures the added value of obtaining information relative to what the monkeys may expect to obtain had they acted without the information. The monkeys could have estimated VOI based on their experience with catch trials in which no motion was shown (10% in each block; Methods). Catch trials had high success rates of 93% in unINF blocks (monkey M: 96%, monkey S, 90%) but were only at chance in INF blocks (49% overall; monkey M, 53%, monkey S, 49%), consistent with the different levels of ex ante uncertainty in the two types of blocks. Thus, VOI – the difference in EV between motion and no-motion (catch) trials – was very low on unINF blocks, but it was positive in INF blocks particularly at higher RS (Fig. 5b; 2-way ANOVA, F(1,344)>470, p<10−47 for main effect of IG, RS and IG*RS interactions, in the full data set and each monkey). The strong positive interaction between IG and RS distinguishes VOI from EV (which shows a negative interaction; Fig. 5a) and from uncertainty reduction per se (which is independent of reward magnitude). Importantly, this pattern also distinguishes VOI from the LIP pre-saccadic response, in which IG and RS were encoded additively and with opposite sign (cf Fig. 5a).
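The opposite interaction signs of EV and VOI follow directly from their definitions, as a small numerical sketch can show. The catch-trial success rates below are the overall values reported in the text. The motion-trial accuracies are placeholders chosen only for illustration, and the reward units assume that reward scales linearly with solenoid open time (300 ms vs 100 ms). All names are hypothetical.

```python
# Reward in arbitrary units, ASSUMING it scales with solenoid open time.
REWARD = {"small": 1.0, "large": 3.0}
# Catch-trial (no motion) success rates reported in the text.
CATCH_ACC = {"unINF": 0.93, "INF": 0.49}
# Motion-trial accuracies: placeholders, NOT values from the paper.
MOTION_ACC = {"unINF": 0.99, "INF": 0.90}

def expected_value(block, reward):
    """EV = reward size x decision accuracy on motion trials."""
    return REWARD[reward] * MOTION_ACC[block]

def value_of_information(block, reward):
    """VOI = EV with the motion minus EV on catch trials without it."""
    return REWARD[reward] * (MOTION_ACC[block] - CATCH_ACC[block])

def interaction(f):
    """IG*RS interaction contrast: the INF-minus-unINF difference at
    large reward minus the same difference at small reward."""
    return (f("INF", "large") - f("unINF", "large")) - (
        f("INF", "small") - f("unINF", "small"))

ev_interaction = interaction(expected_value)
voi_interaction = interaction(value_of_information)
print(round(ev_interaction, 3), round(voi_interaction, 3))
```

Because both quantities multiply reward size by an accuracy term, any difference between blocks is amplified at large reward: the accuracy deficit in INF blocks makes the EV interaction negative, whereas the accuracy benefit of the motion in INF blocks makes the VOI interaction positive. An additive code for IG and RS predicts an interaction of zero.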
These findings were confirmed by further examination of individual cells. We reasoned that, if the IG effects were fully explained by reward sensitivity, the two signals would be highly correlated, such that neurons would only show IG sensitivity if they also had reward modulations. Contrary to this view, βIG and βRS coefficients were uncorrelated (Fig. 4; r = −0.0051, p = 0.96; monkey M: r = −0.009, p = 0.95; monkey S: r = −0.11, p = 0.5). To rule out that this negative result was an artefact of low statistical power we estimated βIG separately for small reward and large reward trials – i.e., using only half the number of trials for each cell. The resulting βIG coefficients were highly correlated across small and large reward trials (r = 0.76, p < 10−17; monkey M: r = 0.79, p < 10−10; monkey S: r = 0.51, p 0.0011), showing that we could detect reliable correlations even if we used half the number of trials, and confirming that the lack of correlation between the IG and RS modulations reflects a true independence of the two modulations.
Because both EV and VOI measures showed prominent interactions between RS and IG, we repeated the individual neuron analysis using a model that included IG, RS, and the IG * RS interaction (Methods, eq. 2). In contrast with EV and VOI, the interaction coefficients in LIP firing rates did not differ from 0 across the population (Fig. 5d, βINT mean (SE) = 0.44 (0.42), z=1.6, p = 0.11; monkey M: 0.34 (0.63), z=1.4, p = 0.15; monkey S: 0.58 (0.51), z=0.6, p = 0.56) and the βIG coefficients were statistically equivalent whether we did or did not include an interaction term (Fig. 5e; z=1.42, p = 0.16; r = 0.94, p < 10−40).
We further evaluated these observations at the level of the population by fitting the population responses with the full set of 127 models resulting from all the possible combinations of 7 regressors including categorical indicators of task context (IG, RS and IG*RS), and average EV, VOI, decision accuracy and completion rates in individual sessions (cf Fig. 2). The best fitting model was a 2-parameter model that included only terms for IG and RS (Fig. S4). Consistent with the individual-neuron results, this model produced a significant positive effect of IG and negative effect of RS (βIG_population (SE): combined data: 0.17 (0.016); monkey M: 0.24 (0.022); monkey S: 0.079 (0.024), all p < 0.05; βRS_population = −0.17 (0.016); monkey M: −0.094 (0.02); monkey S: −0.26 (0.024); all p < 0.05). A 3-parameter model with IG, RS and IG*RS as regressors produced an inferior fit and a non-significant interaction coefficient (p = 0.09). Finally, the models with the lowest Bayesian Information Criterion (BIC) scores included the IG and RS terms, but only inconsistently included other predictors (Fig. S4, bold). In sum, at the level of the population and individual cells, LIP firing rates are best described as encoding IG and RS, rather than reward gains or behavioral indicators that covaried with context.
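The count of 127 models is the number of non-empty subsets of 7 regressors, and BIC trades goodness of fit against the number of parameters. The sketch below enumerates the subsets and shows the penalty at work. The BIC expression is the standard form for a Gaussian linear model up to an additive constant, which is an assumption about the criterion used here. The residual sums of squares and the observation count (348, i.e. 87 sessions × 4 conditions) are likewise illustrative assumptions, and the regressor labels are shorthand.

```python
import math
from itertools import combinations

# Shorthand labels for the 7 candidate regressors named in the text.
REGRESSORS = ["IG", "RS", "IG*RS", "EV", "VOI", "accuracy", "completion"]

def all_models(regressors):
    """Every non-empty subset of the candidate regressors."""
    return [subset
            for k in range(1, len(regressors) + 1)
            for subset in combinations(regressors, k)]

def bic(rss, n_obs, n_params):
    """BIC for a Gaussian linear model, up to an additive constant
    (assumed form; the text does not give the exact expression)."""
    return n_obs * math.log(rss / n_obs) + n_params * math.log(n_obs)

models = all_models(REGRESSORS)
print(len(models))  # 2**7 - 1 = 127

# Illustrative residual sums of squares; NOT values from the paper.
bic_two_param = bic(rss=100.0, n_obs=348, n_params=2)    # IG + RS
bic_three_param = bic(rss=99.5, n_obs=348, n_params=3)   # IG + RS + IG*RS
print(bic_two_param < bic_three_param)
```

In this toy case the interaction term reduces the residual error only slightly, so the extra parameter is penalized and the 2-parameter model has the lower (better) BIC.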
LIP responses to information gains correlate with post-saccadic discrimination efficacy
Since the neurons showed enhanced pre-saccadic firing rates when the task required more engagement during the post-saccadic fixation, we asked whether the IG modulations were related to the efficacy of the post-saccadic motion discrimination. To this end we plotted the accuracy of the final decision as a function of post-saccadic VT (Fig. 6a,b). While accuracy in unINF blocks was near ceiling regardless of VT (Fig. S5, blue), in INF blocks accuracy and VT were positively related (Fig. 6a,b) confirming that, although the motion was fully coherent, the monkeys needed time to select the appropriate action. A median split of the data based on the neural βIG showed that the increase in accuracy was steeper in sessions in which the recorded neurons had stronger IG modulations (Fig. 6a, black vs gray). Note that this effect involves a comparison across distinct groups of cells, suggesting that the relationship between βIG and performance is a network effect – i.e., can be detected above and beyond the specific sample of neurons that we happened to record in a session.
Interestingly, the relation between decision efficiency and βIG was prominent in large reward but not small reward trials (Fig. 6a, inset). A 2-way ANCOVA with VT as a continuous covariate confirmed that βIG had a significant effect on accuracy above and beyond the effect of VT (Lawley-Hotelling Trace T = 12.71, p = 0.0055), and the relationship was stronger on large-reward relative to small-reward trials (RS*βIG interaction, T = 7.0, p = 0.0084), even though VT showed no effect of RS (main effect of RS, T = 2.8, p > 0.4; RS*VT interaction, T = 0.6, p > 0.3 in the combined data and individual monkeys). In contrast to the effects of IG, performance did not differ when the data were split according to βRS, whether we examined low-reward trials, high-reward trials or both (Fig. 6b; effects of βRS, VT*βRS, and RS*βRS all T < 2.0, p > 0.1 overall and in individual monkeys). Therefore, the efficiency of the post-saccadic discrimination is not related to the LIP reward modulations but is related to the neurons’ IG sensitivity in a reward-dependent fashion.
Our earlier finding that the neurons did not encode variations in post-saccadic VT (Fig. S2) suggests that, while LIP firing correlates with decision efficiency at a constant VT, it may not encode speed-accuracy tradeoffs across the different contexts. To directly examine this hypothesis, we examined how the monkeys traded off VT and accuracy for different RS in INF blocks. The two monkeys made different adjustments in response to RS. Whereas monkey M slowed down and became more accurate on large-reward relative to small-reward trials, monkey S showed the opposite pattern, making faster and less accurate responses when higher rewards were at stake (Fig. S5). We captured this difference by calculating a combined index of speed and accuracy and taking the difference between the indices on large-reward versus small-reward trials (Fig. S5 and Fig. 6c). Monkey M showed a positive difference on average, indicating that he slowed down and achieved higher accuracy on large reward versus small reward trials (Fig. 6c, abscissa, median difference (SEM) = 3.1 (1.48), z=2.5, p = 0.013 relative to 0). Monkey S showed a negative difference indicating that he sped up and sacrificed accuracy when larger rewards were at stake (a.u, median difference (SEM) = −11.3 (3.04), z=3.5, p = 0.00049 relative to 0). This behavioral difference was highly significant (z=4.1, p < 10−5 between the two monkeys), but was uncorrelated with the LIP response. The behavioral index was not correlated with the magnitude of the RS sensitivity (r = 0.0072, p = 0.51 overall; monkey M: r = −0.077, p = 0.6; monkey S: r = 0.012, p = 0.94) or with the difference between the IG coefficient for small and large RS (Fig. 6c; r = 0.089, p = 0.41 overall; monkey M: r = 0.059, p = 0.69; monkey S: r = 0.13, p = 0.44). 
In sum, LIP responses are correlated with behavior in a very specific fashion, via an increase in decision efficiency as a function of the strength of the IG effect independently of speed-accuracy strategies.
Negative reward effects are explained by the task
Our finding that the neurons had negative reward modulations was unexpected given previous reports that the cells have higher firing rates for higher rewards, raising the possibility that we may have inadvertently recorded from a different population of cells. Two observations strongly argue against this possibility. First, all the neurons that were tested with the 2-step task were pre-screened using a memory-guided saccade task and showed spatially tuned delay period activity, conforming to the functional definition of LIP cells13, 14 (Fig. S6; population: z = 8.0, p < 10−14 overall; z > 5.1, p < 10−7 in individual monkeys; p < 0.05 in 93% of individual cells).
For additional confirmation, we tested a subset of the cells on a traditional 1-step saccade task in which the monkeys made a single saccade to receive a large or small reward (with reward sizes equated to those in the 2-step task; see Methods for details). Replicating the findings in the entire sample, the cells tested in this control task (n = 56/87) showed a significant enhancement by small reward sizes during the delay period of the 2-step task (average (SEM) βRS = −1.8 (0.5) sp/s, z=−3.1, p = 0.002, n = 56). When tested with the 1-step task, however, the same neurons had a positive reward effect as previously shown for this area (Fig. 7a; a 200 ms window centered on saccade onset; mean βRS_1step, 2.7 (1.2), z=2.2, p = 0.025 relative to 0; z=3.7, p = 0.0002 relative to the delay period of the 2-step task).
Figure 7: LIP neurons are enhanced by reward on a traditional 1-step task.
a. Saccade-aligned population PSTHs on the standard 1-step task. The cartoon shows the task geometry, which involved a single saccade to a target inside the RF with the expectation of a large or small reward. b. Saccade-aligned population PSTHs in the same format as in a, but for the second saccade of the 2-step task (control geometry 2, Fig. S2c).
We finally asked whether the neurons showed reward enhancement for the final saccade in the 2-step task which, like the saccade on the 1-step task, harvested the final reward. However, the 48 cells that were tested in control geometry 2 of the 2-step task (Fig. S2c) showed no effect of RS in this epoch (Fig. 7b; mean (SEM) = −0.46 (0.73), z=−1.21, p = 0.22, n = 48) and no correlation between the RS coefficients before the 1st and 2nd saccades of the 2-step task (r = −0.01, p=0.92, overall; monkey M, r = 0.09, p = 0.64; monkey S, r = −0.19, p = 0.46). Like the entire sample, these cells showed a significant enhancement by larger rewards on the 1-step task (mean (SEM) = 3.0 (1.3), z=2.04, p = 0.04) and a significant difference between the two tasks (z=2.25, p = 0.02). Thus, the discrepancy between our results and previous investigations is explained by the behavioral context and not the neuronal population.
IG effects do not encode uncertainty, arousal or difficulty
To determine whether the IG modulations may reflect non-specific differences between the INF and unINF contexts, we compared the IG modulations related to the cue/initial saccade with those related to the targets of the final saccade. We reasoned that, if the IG modulations were non-specific effects of uncertainty, arousal or difficulty, we should find enhancements not only for the cue but also the target-related responses in the INF relative to the unINF blocks15. We thus used control geometry 1, which revealed how the neurons encoded the saccade targets during the delay period preceding the initial saccade (Fig. S2b) and compared the IG modulations elicited by the targets in this geometry, pooling across saccade directions to detect non-spatial effects, with the neurons’ modulations related to the cues (Fig. 8a). The neurons showed no significant IG modulation to the second saccade targets (Fig. 8a abscissa; mean βIG_target = −0.31 (0.64), z = 0.41, p = 0.68 relative to 0; n = 36), resulting in a highly significant difference between the cue and target-related effects (z = 3.21, p = 0.0014; cue-related IG: mean βIG_cue = 3.2 (1.1), z = 2.9, p = 0.0038 relative to 0; n = 36). Thus the IG modulations were specific to the visual cue rather than indicating global changes of gain in the INF context.
Fig. 8. Controls ruling out alternative explanations a. IG modulations are specific rather than global effects.
Comparison of IG coefficients in response to the cue (standard geometry) and to the targets during the delay period (geometry 1, Fig. S2b). Each point is one cell (n=36). The diagonal line is the equality line. Arrowheads show marginal means, black if p<0.05, otherwise gray. During the delay period, IG modulates responses to the cue/first saccade (black) but not the responses to the target in geometry 1 (gray). b. IG effects are not byproducts of spatial competition. Comparison of βIG in response to the cue (standard geometry) and to the target when it is opposite the cell’s RF in unINF blocks (geometry 1, Fig. S2b). Each point is one cell (n=36). The diagonal line is the equality line. Arrowheads show marginal means, black if p<0.05, otherwise gray. Contrary to the spatial competition hypothesis, the neurons did not show IG modulations when the final saccade target was out of the RF. c. IG modulations do not depend on the relative distance between the cue and targets. The IG coefficient (βIG, ordinate) as a function of the TRF index measuring the relative responses to the locations occupied by the cue and the target in the standard geometry (Methods, eq. 3). Larger values along the abscissa indicate cells for which the target elicited a stronger response; larger values along the ordinate indicate cells with stronger IG sensitivity. Each point is one cell (n=87), color denotes individual monkeys. The dashed diagonal line is the least square regression; r and p values refer to the correlation coefficient. d. IG modulations do not depend on the visual hemifield locations of the cue and targets. Comparison of the βIG coefficients on trials in which the final saccade was directed to the same (purple) versus the opposite (pink) hemifield relative to the cue, for cells that had a target in each hemifield (n = 41). Arrowheads show marginal means.
IG effects cannot be explained by spatial normalization
A final hypothesis we consider is that the IG modulations arose indirectly from spatial competition between pools of LIP neurons that encode plans for the 1st and 2nd saccade. According to this view, IG modulations may arise because the monkeys could plan their 2nd saccade as early as the delay period in unINF but not in INF blocks. If the neurons encoded this advance saccade plan, this might have triggered competitive interactions that reduced the responses to the 1st saccade specifically in unINF blocks, masquerading as an IG effect. Several analyses speak against this interpretation.
For the first analysis we reasoned that, if the IG effect were merely explained by spatial competition, it would also be seen in the neurons’ response to the targets. Crucially, neurons should show an apparent IG effect in trials in which the final saccade was away from the RF, as the advance planning of the null-direction saccade would produce lower activity in unINF relative to INF blocks. We therefore evaluated the IG coefficient in control geometry 1 as we did for Fig. 8a, but this time separately for unINF blocks with different saccade directions (Fig. 8b). Contrary to the spatial competition hypothesis, the neurons did not show IG modulations when the final saccade target was out of the RF (Fig. 8b, abscissa, mean 0.66 (0.76), z = 0.72, p = 0.47 relative to 0), and the IG coefficients in this condition were uncorrelated with and significantly lower than the cue-related effects (r = 0.04, p = 0.81; z = 2.2, p = 0.028 paired comparison; cue-related coefficient in this subset of cells was 3.2 (1.1), z = 2.9, p = 0.0038 relative to 0, n = 36). Similar results were obtained when using unINF blocks in which the final saccade was toward the RF (mean IG coefficient −1.7 (0.95), z = −1.9, p = 0.06 relative to 0; relative to cue-related IG: z = 3.4, p < 10⁻³ paired comparison; r = 0.16, p = 0.35).
A second important prediction stems from the fact that competitive interactions between stimuli depend strongly on the overlap in their neural representations, implying that the IG modulations should be inversely related to the distance between the cue and the targets16, 17. The relevant distance may be defined by the neural representation – i.e., the extent to which the cue and target locations activate the same population of cells. To test this hypothesis, we used RF mapping data from the memory-guided saccade task to compute a target RF (TRF) index that ranged between 1 (indicating a cell for which the target location was entirely outside the RF) and 0 (indicating a cell for which the target location strongly encroached on the RF, eliciting responses equivalent to cue location; Methods, eq. 3). If the IG sensitivity arose from spatial competition, it should be inversely correlated with the TRF – i.e., be larger for neurons whose RF included both the cue and target locations. However, no such correlation was present in the neural response (Fig. 8c, r = 0.044, p = 0.69 overall; monkey M: r = 0.083, p = 0.57; monkey S: r = −0.12, p = 0.46).
An alternative version of this hypothesis is that the relevant distance is relative to the visual field, and that competition is strongest when the cue and target locations fall within the same hemifield rather than in opposite hemifields. To evaluate this possibility, we focused on neurons for which the cue was at a diagonal location (i.e., not on the vertical or horizontal meridians) and the two saccade targets fell in opposite hemifields. If IG sensitivity were due to visual competition, it should be stronger when the monkeys planned the second saccade to the target that occupied the same hemifield rather than the opposite hemifield relative to the cue. Contrary to this prediction, the βIG were highly correlated and statistically equivalent between the two geometries (Fig. 8d; z = 0.21, p = 0.83 between the two conditions (n = 41); monkey S: z = 0.12, p = 0.91, monkey M: z = 0.4, p = 0.69). In sum, neither the temporal nor the spatial properties of the IG modulations are consistent with a spatial interaction hypothesis.
Discussion
In contrast with laboratory tasks in which participants make saccades to obtain reward gains, in this experiment we focused on the neural correlates of saccades deployed for information gains. In a previous study using this paradigm we showed that LIP neurons encoded expected information gains based on fixed, long-term estimates of cue validity8. Here we extend this result by showing that the cells are also sensitive to expected information gains occasioned by dynamic changes in decision uncertainty. Responses to decision uncertainty are found in frontal cortical areas18, 19 and subcortical structures including dopamine cells20–22 and have been implicated in reward expectation, confidence and risk attitudes (e.g., 23). Our findings suggest that these responses also play key roles in determining the informativeness of sensory cues, raising important new questions about the neural links between decision uncertainty, attention and active sensing strategies2, 24–26.
Given the dual sensitivity of LIP cells to rewards and saccadic decisions, it seems natural to propose that the cells encode the economic utility of competing alternatives10, 11. Our results argue against this interpretation. They argue instead that the apparent encoding of economic utility may be limited to laboratory tasks in which animals make saccades directly to harvest reward. Although LIP cells had the expected reward-related enhancement on a traditional 1-step paradigm, they showed no reward sensitivity for the final saccade of the 2-step task and, strikingly, showed enhancement for smaller rewards before the information-sampling saccade on this task. Moreover, the neurons’ responses to IG could not be explained by the reward value of gathering information. In striking contrast with the value hypothesis - which predicts that the neural sensitivity to IG and reward gains should be correlated, interact multiplicatively rather than additively, and have congruent signs - the neurons’ sensitivity to reward and IG were uncorrelated, combined additively and, critically, had opposite signs - increasing as a function of IG but decreasing with reward gains.
In contrast with their inconsistent encoding of utility, the neurons’ sensitivity to IG correlated with the efficiency with which the monkeys used the information in the post-saccadic fixation – measured by the decision accuracy at a given VT. Contrary to the traditional view that top-down attentional feedback facilitates visual discrimination in a retinotopic fashion within a single fixation5, 16, the facilitatory effect in our task transcended a saccade in non-retinotopic fashion, linking neural responses to a peripheral location before the saccade with motion processing at a foveal location after the saccade. Contrary to the prevailing pre-motor view whereby attention is recruited merely in relation to a saccade motor plan5, the effects we describe linked the LIP pre-saccadic responses with the cognitive demands of the post-saccadic fixation.
These findings suggest that current views based on economic utility or attentional priority are insufficient to describe the neural mechanisms of information sampling decisions. Instead, we propose that understanding this process requires a broader framework that integrates oculomotor decisions and cognitive control mechanisms. Specifically, our results support the expected value of control (EVC) theory that postulates a separation between monitoring and regulative (implementation) aspects of control9.
According to EVC theory, the dorsal anterior cingulate cortex (dACC) monitors the rewards and costs of alternative actions and uses this information to decide whether to engage in a task and how much effort (control) to allocate to the task. In contrast, more posterior areas including lateral frontal and parietal areas27, 28 are involved in separate regulatory mechanisms that implement the attentional effort – i.e., execute the control signal generated by the dACC. The dACC is proposed to call for additional effort by boosting neuromodulators – including dopamine and norepinephrine29 – which may have led to the enhanced neural gain we observed in conditions of higher uncertainty. In this view, “attentional priority” is a type of focused arousal – a transformation of global arousal into a stimulus-specific gain increase, which enables animals to efficiently reduce uncertainty by focusing on and extracting the information delivered by relevant cues.
This idea naturally accounts for the fact that the effects of IG in our task were mere modulatory effects on the neurons’ larger visual and pre-saccadic responses. The effects we describe are comparable with previously reported attentional and reward modulations14, 30, 31 and, although they seem modest in size, the fact that they correlate with behavioral efficiency suggests that they are functionally significant. A full understanding of the functional consequences of IG-related enhancements ultimately requires an understanding of the large-scale networks controlling active sampling, including the dACC and possibly also the orbitofrontal cortex and dopamine cells that signal the value of information in tasks motivated by curiosity32, 33, along with downstream readout mechanisms that translate these responses into actions34.
The interpretation of priority maps as implementing cognitive effort may also explain our present result that the cells scaled negatively with reward magnitude, as well as other seemingly paradoxical findings in LIP cells. In experiments that randomly interleave small and large reward sizes, a smaller reward is perceived as a loss and elicits motivational conflict - prepotent “escape” reactions in both behavior and neural responses35, 36. Motivational conflict was also seen in our monkeys’ behavior, as rates of trial completion were smaller for lower rewards (Fig. 2). If resolving the conflict and completing a low-reward trial requires enhanced attentional effort, this may explain the stronger responses for smaller RS we observe. The need for enhanced control in conditions of conflict can parsimoniously explain reports that LIP neurons have enhanced responses to punishment-predicting cues that monkeys must look away from37 or stimuli signaling visuomotor conflict such as an antisaccade38 or change in motor effector39. Should this hypothesis be correct, our finding that responses to RS and IG are uncorrelated suggests that different types of effort (e.g., related to uncertainty reduction, visuomotor conflict or motivational conflict) require boosting through distinct circuits, consistent with a recent report40. Thus a central question for future investigation concerns the relation between priority maps, value and cognitive effort in implementing active sensing strategies.
Methods
General
Data were collected from two adult male rhesus monkeys using standard behavioral and neurophysiological techniques41. All methods were approved by the Animal Care and Use Committees of Columbia University and New York State Psychiatric Institute as complying with the guidelines within the Public Health Service Guide for the Care and Use of Laboratory Animals. Behavioral control was implemented in MonkeyLogic42; stimuli were presented on a Mitsubishi Diamond Pro 2070 monitor (30.4 × 40.6 cm viewing area); eye tracking was performed with an Applied Science Laboratories model 5000 eye tracker (digitized at 240 Hz); and action potentials were recorded with the APM digital processing module (Fred Haer, Inc.). Individual electrodes (glass-coated tungsten electrodes, Alpha Omega, impedance at 1 kHz: 0.5–1 MOhm) were inserted in daily sessions and aimed at the lateral bank of the intraparietal sulcus based on stereotactic coordinates and structural magnetic resonance imaging.
Memory-guided saccade (MGS) task
After obtaining a well isolated waveform, each neuron was first screened with a standard MGS task in which a peripheral target was flashed for 300 ms while the monkeys maintained central fixation and, after a 500 ms delay period, the monkeys were rewarded for making a saccade to the remembered target location. Neurons were tested further only if they had spatially tuned visual and delay period responses on this task (Fig. S6). For these cells the RF was mapped by conducting the MGS at 4 locations, including the RF center and 3 equally eccentric locations spaced at 90-degree intervals.
Information sampling task
Each trial began with the presentation of a fixation point, whose color (red or blue) signaled reward size, and which was initially surrounded by a shape (circle or square) signaling the block type (INF/unINF). When the monkeys fixated this point for a variable period of 1,300 to 1,500 ms, the circle/square was removed, and the monkeys were shown a display containing a cue (a collection of ~60 dots randomly positioned within a circular aperture with a diameter of 4.6 degrees of visual angle (DVA)) and two targets (white squares measuring 1 × 1 DVA). In the standard geometry, the display was adjusted so that the cue was at the center of the neurons’ RF while the targets were at 90-degree angular separation at an equal eccentricity (with all three locations having been tested in the MGS task; Fig. 1a and S2a). After an additional 500 ms delay period, the fixation point was removed, and the monkeys had to make a first saccade to the dots within 50 – 2,000 ms of fixation point offset. If the monkeys fixated the cue for 100 ms, the dots began to move with 100% coherence at a speed of 3.7 DVA/sec toward one of the targets. A trial was scored as correct if the monkeys made a second saccade to the cued target within 0–1,000 ms after motion onset and maintained fixation in a 3-degree window surrounding this target for an additional 200 ms. The motion was terminated as soon as the monkeys’ eye exited the cue window. Correct trials were signaled by an auditory cue (frequency: 500 Hz; duration: 400 ms) and the delivery of a small or large juice reward according to the fixation point color. Large and small reward sizes were, respectively, 300 and 100 ms solenoid open times, i.e., a 3:1 magnitude ratio in both monkeys. An error occurring at any point in the trial was followed by no reward and the immediate removal of the visual stimuli.
Each neuron was tested with a minimum of 2 INF blocks and 2 unINF blocks of 50 correct trials each. The blocks were presented in alternating order (Fig. 1b), such that the condition that was presented first (INF or unINF) and the target that was rewarded in the first unINF block were randomized across sessions. In each block, trials were randomly assigned to deliver a small or large reward with 50% probability. In addition, 8% of trials in each block were catch trials, which were identical with the main trials in all respects except that there was no dot motion (the dots remained stationary). The monkeys thus had to guess the direction of the correct final saccade and were rewarded for making the second saccade to the blocked target in unINF blocks, and to the randomly selected, but not signaled target, in INF blocks. If the monkeys prematurely broke fixation or had a saccade latency that was outside the allowed window, the trial was scored as incomplete and was immediately repeated with the same RS and motion direction until correctly completed (up to a limit of 10 consecutive errors). Trials with erroneous decisions (in which the monkeys made a second saccade with the correct timing but to the wrong target) were not repeated and counted as completed error trials.
To signal which target location was correct in an unINF block, the first 10 trials of each unINF block were “instruction” trials in which the rewarded target had a higher luminance than the unrewarded one (these trials were not included in the data analysis). These signals, together with the monkeys’ extensive practice, resulted in very fast behavioral transitions, as shown by the fact that decision accuracy and VT changed little as a function of time in a block and differed significantly between the INF/unINF contexts from the very first trial in a block.
After obtaining a data set with the main geometry in the 2-step task, cells were further tested in 3 control conditions that were presented in randomized order until isolation was lost or the monkeys stopped working. Two conditions presented the 2-step task in the same format as described above but using modified geometries in which a target was in the RF during the delay period before the first saccade (geometry 1), or during the preparation of the second saccade (geometry 2; Fig. S2).
In addition, to test for reward sensitivity, neurons were tested in a control 1-step saccade task in which the monkeys made a single saccade to a target to obtain a large or small reward. In these trials the monkeys achieved fixation and, when the fixation point disappeared, made a saccade to a target whose shape (upward or downward pointing triangle) signaled reward size. Reward sizes were identical to those used in the main task. The task randomly interleaved free-choice trials in which the monkeys were offered both reward alternatives (with the large and small reward targets falling randomly inside or opposite the RF) and forced-choice trials in which a single target was present (either inside or opposite the RF). On free-choice trials the monkeys nearly always chose the large reward, showing that they were highly sensitive to these differences in RS (monkey M: 97.6%, monkey S: 97.8%). Neural responses were analyzed on interleaved forced-choice trials, in which we could obtain RF-directed saccades at both reward magnitudes. Note that this task was designed to act as a screening tool for the presence of reward modulations. Therefore, it was designed to emulate the structures of tasks used in previous investigations of reward in this area (e.g., 43) rather than provide a systematic comparison with the information sampling task.
Data analysis
Data are reported from 87 neurons recorded from two 16-year-old male monkeys (49 in monkey M, 38 in monkey S) that were tested in the standard geometry of the 2-step task. Of these cells, 36 neurons were further tested in control geometry 1 (26 in monkey M), 48 neurons (30 in monkey M) were tested in geometry 2, and 56 neurons (36 in monkey M) were tested with the 1-step reward control task. No statistical methods were used to pre-determine sample sizes but our sample sizes are similar to those reported in previous studies of this area8, 14, 37. Data collection and analysis were not performed blind to the conditions of the experiments. Error trials were not considered in analysis of neural responses. All statistical comparisons used non-parametric tests (two-sided paired rank or signed-rank tests) unless otherwise noted. The Life Sciences Reporting Summary contains summaries of the statistics and data of this Methods section.
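As an illustration of the two-sided signed-rank comparisons mentioned above, the following is a minimal Python sketch. It assumes the large-sample normal approximation without tie or continuity corrections, which may differ in detail from the routine actually used for the reported z and p values; the function name `signed_rank_z` is hypothetical.

```python
import math

def signed_rank_z(x, y):
    """Two-sided Wilcoxon signed-rank test for paired samples.

    Assumption: normal approximation, no tie or continuity correction.
    Returns (z, p).
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]  # drop zero differences
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    # Rank the absolute differences; tied values share their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean_w = n * (n + 1) / 4
    sd_w = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean_w) / sd_w
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return z, p
```

Testing a set of coefficients against zero corresponds to pairing each value with 0, e.g. `signed_rank_z(coefficients, [0] * len(coefficients))`.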
Trial completion rate (Fig. 2) was the number of completed trials divided by the number of the trials in which the monkeys initiated fixation. Decision accuracy was the number of rewarded trials divided by the number of completed trials. Beyond the analysis of completion rates, incomplete trials were discarded and not analyzed further. Saccade latency was measured between fixation point offset and saccade start as defined based on velocity and acceleration criteria44. Endpoint accuracy was defined as the Euclidean distance between the center of the cue/target and the saccade landing position, “velocity” refers to the peak saccade velocity, and VT was the interval between motion onset and the start of the second saccade. Note that, because the monkeys had to complete 100 ms of post-saccadic fixation before motion onset, the inter-saccadic interval (i.e., the latency of the second saccade) was equal to VT + 100 ms.
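The behavioral measures defined above reduce to simple ratios and a fixed offset. A minimal Python sketch follows; the function and constant names are hypothetical and the trial counts in the usage note are made up for illustration.

```python
POST_SACCADIC_FIXATION_MS = 100  # fixation required before motion onset

def behavioral_summary(n_initiated, n_completed, n_rewarded):
    """Return (completion rate, decision accuracy) as defined in the text."""
    if n_initiated <= 0 or n_completed <= 0:
        raise ValueError("counts must be positive")
    # Completed trials relative to trials in which fixation was initiated.
    completion_rate = n_completed / n_initiated
    # Rewarded trials relative to completed trials.
    decision_accuracy = n_rewarded / n_completed
    return completion_rate, decision_accuracy

def inter_saccadic_interval(viewing_time_ms):
    """Latency of the second saccade: VT plus the 100 ms fixation period."""
    return viewing_time_ms + POST_SACCADIC_FIXATION_MS
```

For example, with 200 initiated, 160 completed and 144 rewarded trials, `behavioral_summary(200, 160, 144)` gives a completion rate of 0.8 and a decision accuracy of 0.9.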
Delay period firing rates (FR) were measured from the raw spike trains.
For each neuron, trial by trial firing rates were fit with two regression equations:
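(eq. 1) $FR = \beta_0 + \beta_{IG}\,IG + \beta_{RS}\,RS + \beta_{LAT}\,LAT + \beta_{VEL}\,VEL + \beta_{ACC}\,ACC + \beta_{VT}\,VT + \beta_{SDIR}\,SDIR$
(eq. 2) $FR = \beta_0 + \beta_{IG}\,IG + \beta_{RS}\,RS + \beta_{IG*RS}\,(IG*RS) + \beta_{LAT}\,LAT + \beta_{VEL}\,VEL + \beta_{ACC}\,ACC + \beta_{VT}\,VT + \beta_{SDIR}\,SDIR$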
IG is information gain (0 for unINF, 1 for INF), RS is reward size (0 for small, 1 for large), and IG*RS is their interaction. LAT, VEL and ACC are, respectively, the latency, velocity and accuracy of the first saccade. VT is the motion viewing time and SDIR is the direction of the 2nd saccade. SDIR was arbitrarily set to 0 or 1 for each of the two final saccades. The fits were implemented with the stepwiselm function in Matlab 2014b. All the regressors were z-scored using all the trials in a session. IG and RS terms in eq. 1, and IG, RS and IG*RS terms in eq. 2 were locked in the model. The remaining saccade descriptors were included according to a backward stepwise procedure using p-values of 0.05 and 0.10 as cutoffs for, respectively, including and excluding regressors14.
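The core of this procedure, z-scoring the regressors and fitting the locked IG and RS terms by least squares, can be sketched in plain Python. This is only an illustration under stated assumptions: it omits the backward stepwise selection of the saccade descriptors performed by stepwiselm, it uses the sample standard deviation (n − 1) for z-scoring, and the helper names and firing rates are hypothetical.

```python
import math

def zscore(values):
    """Z-score a regressor using the sample standard deviation (n - 1)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]

def ols(columns, y):
    """Least-squares coefficients [intercept, b1, b2, ...] via normal equations."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
         + [sum(X[r][i] * y[r] for r in range(n))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Made-up trials: IG (0 = unINF, 1 = INF), RS (0 = small, 1 = large), FR in sp/s.
ig = [0, 0, 1, 1, 0, 0, 1, 1]
rs = [0, 1, 0, 1, 0, 1, 0, 1]
fr = [20.0, 18.0, 24.0, 22.0, 21.0, 17.0, 25.0, 21.0]
beta_0, beta_ig, beta_rs = ols([zscore(ig), zscore(rs)], fr)
```

In this toy data set the fitted IG coefficient is positive and the RS coefficient is negative, mirroring the sign pattern reported for the delay period; because the regressors are z-scored, each coefficient is expressed in sp/s per standard deviation of the regressor.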
All analyses were done on raw firing rates and repeated with z-scored and mean-normalized firing rates. Unless otherwise noted, regression coefficients are reported in units of sp/s. Exploratory analyses showed that the IG and RS effects were sustained during the delay period (Fig. 3b and Fig. S3c) and were not sensitive to a range of window sizes spanning this period. Thus the results are based on the FR averaged throughout the delay period (i.e., 150 ms after cue onset until the saccade onset).
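The delay-period firing rate described above can be computed per trial as a spike count divided by the window duration. The following Python sketch assumes spike and event times expressed in milliseconds on a common clock; the function name and argument names are hypothetical.

```python
def delay_period_rate(spike_times_ms, cue_onset_ms, saccade_onset_ms,
                      offset_ms=150.0):
    """Mean firing rate (sp/s) from cue onset + offset until saccade onset."""
    start = cue_onset_ms + offset_ms
    duration_ms = saccade_onset_ms - start
    if duration_ms <= 0:
        raise ValueError("saccade onset precedes the start of the window")
    # Count spikes inside the half-open window [start, saccade onset).
    n_spikes = sum(1 for t in spike_times_ms if start <= t < saccade_onset_ms)
    return 1000.0 * n_spikes / duration_ms
```

For instance, spikes at 100, 200, 400, 650 and 900 ms with cue onset at 0 ms and saccade onset at 650 ms give two spikes in a 500 ms window, i.e. 4 sp/s.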
The “target RF index” (TRF) measured the extent to which the targets encroached on a cell’s RF in the standard geometry of the 2-step task. TRF was computed for each cell as:
(eq. 3) |
All FR values are measured during the delay period of the memory guided saccade task (300 – 800 ms after cue onset). FRcue_loc is the FR at the RF center (the location occupied by the cue in the standard geometry of the 2-step task), FRtar_loc is the average FR at the two locations closest to the center (the same locations that were occupied by the targets in the standard geometry of the 2-step task). The TRF ranges between 1 (indicating a cell that does not respond at all at the target locations) and 0 (indicating a cell that responded equivalently to the cue and target locations). Note that negative values are impossible in this index since the cue location was defined as the location that had higher FR.
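For illustration only, one expression that satisfies the properties stated above (1 when the target locations evoke no response, 0 when they evoke the same response as the cue location) is a difference normalized by the cue-location response. This specific form is an assumption made for the sketch and is not taken from eq. 3; the function name is hypothetical.

```python
def trf_index(fr_cue_loc, fr_target_locs):
    """Assumed TRF form: (FRcue_loc - FRtar_loc) / FRcue_loc.

    fr_cue_loc: delay-period FR at the RF center (sp/s).
    fr_target_locs: delay-period FRs at the two target locations (sp/s).
    """
    if fr_cue_loc <= 0:
        raise ValueError("cue-location firing rate must be positive")
    # FRtar_loc is the average response across the two target locations.
    fr_tar_loc = sum(fr_target_locs) / len(fr_target_locs)
    return (fr_cue_loc - fr_tar_loc) / fr_cue_loc
```

Under this assumed form, `trf_index(20.0, [0.0, 0.0])` returns 1.0, `trf_index(20.0, [20.0, 20.0])` returns 0.0, and intermediate target responses give intermediate values.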
Data and code availability
The data generated and analyzed for this study are available from the corresponding author upon request.
The code written to analyze the data and produce the figures for this study is available from the corresponding author upon request.
Supplementary Material
Acknowledgements
The work was supported by NIH grants R24 EY015634, RO1EY25965 to JG, and fellowships from the Danish Council For Independent Research, Reinholdt W. Jorck og Hustrus Fond and the Marie og MB Richters Fond to MH.
Footnotes
Disclosure statement:
The authors declare that they have no conflict of interest.
References
1. Hanks TD & Summerfield C. Perceptual Decision Making in Rodents, Monkeys, and Humans. Neuron 93, 15–31 (2017).
2. Gottlieb J & Oudeyer PY. Toward a neuroscience of active sampling and curiosity. Nat Rev Neurosci 19, 758–770 (2018).
3. Gottlieb J. Understanding active sampling strategies: empirical approaches and implications for attention and decision research. Cortex 17 30276–30279 (2017).
4. Gottlieb J, Hayhoe M, Hikosaka O & Rangel A. Attention, reward and information seeking. Journal of Neuroscience 34, 15497–15504 (2014).
5. Bisley JW & Goldberg ME. Attention, intention, and priority in the parietal lobe. Annual Review of Neuroscience 33, 1–21 (2010).
6. Thompson KG & Bichot NP. A visual salience map in the primate frontal eye field. Prog Brain Res 147, 251–262 (2005).
7. Maunsell JH. Neuronal representations of cognitive state: reward or attention? Trends Cogn Sci 8, 261–265 (2004).
8. Foley NC, Kelley SP, Mhatre H, Lopes M & Gottlieb J. Parietal neurons encode expected gains in instrumental information. Proceedings of the National Academy of Sciences 114, E3315–E3323 (2017).
9. Shenhav A, Botvinick M & Cohen J. The expected value of control: an integrative theory of anterior cingulate cortex function. Neuron 79, 217–240 (2013).
10. Kable JW & Glimcher PW. The neurobiology of decision: consensus and controversy. Neuron 63, 733–745 (2009).
11. Sugrue LP, Corrado GS & Newsome WT. Choosing the greater of two goods: neural currencies for valuation and decision making. Nat Rev Neurosci 6, 363–375 (2005).
12. Peck CJ, Jangraw DC, Suzuki M, Efem R & Gottlieb J. Reward modulates attention independently of action value in posterior parietal cortex. J Neurosci 29, 11182–11191 (2009).
13. Barash S, Bracewell RM, Fogassi L, Gnadt JW & Andersen RA. Saccade-related activity in the lateral intraparietal area. I. Temporal properties. J Neurophysiol 66, 1095–1108 (1991).
14. Sugrue LP, Corrado GS & Newsome WT. Matching behavior and the representation of value in the parietal cortex. Science 304, 1782–1787 (2004).
15. Aston-Jones G & Cohen JD. An integrative theory of locus coeruleus-norepinephrine function: adaptive gain and optimal performance. Annu Rev Neurosci 28, 403–450 (2005).
16. Reynolds JH & Heeger DJ. The normalization model of attention. Neuron 61, 168–185 (2009).
17. Balan PF, Oristaglio J, Schneider DM & Gottlieb J. Neuronal correlates of the set-size effect in monkey lateral intraparietal area. PLoS Biol 6, e158 (2008).
18. O’Neill M & Schultz W. Coding of reward risk by orbitofrontal neurons is mostly distinct from coding of reward value. Neuron 68, 789–800 (2010).
19. Tobler PN, Christopoulos GI, O’Doherty JP, Dolan RJ & Schultz W. Risk-dependent reward value signal in human prefrontal cortex. Proc Natl Acad Sci U S A 106, 7185–7190 (2009).
20. Monosov IE, Leopold DA & Hikosaka O. Neurons in the Primate Medial Basal Forebrain Signal Combined Information about Reward Uncertainty, Value, and Punishment Anticipation. Journal of Neuroscience 35, 7443–7459 (2015).
21. Monosov IE & Hikosaka O. Selective and graded coding of reward uncertainty by neurons in the primate anterodorsal septal region. Nat Neurosci 16, 756–762 (2013).
22. Schultz W et al. Explicit neural signals reflecting reward uncertainty. Philos Trans R Soc Lond B Biol Sci 363, 3801–3811 (2008).
23. Pouget A, Drugowitsch J & Kepecs A. Confidence and certainty: distinct probabilistic quantities for different goals. Nat Neurosci 19, 366–374 (2016).
24. Fan J. An information theory account of cognitive control. Front Hum Neurosci 8 (2014).
25. Vossel S et al. Spatial Attention, Precision, and Bayesian Inference: A Study of Saccadic Response Speed. Cerebral Cortex 24, 1436–1450 (2014).
26. Dayan P, Kakade S & Montague PR. Learning and selective attention. Nat Neurosci 3 Suppl, 1218–1223 (2000).
27. Krebs RM, Boehler CN, Roberts KC, Song AW & Woldorff MG. The involvement of the dopaminergic midbrain and cortico-striatal-thalamic circuits in the integration of reward prospect and attentional task demands. Cereb Cortex 22, 607–615 (2012).
28. Chong TT et al. Neurocomputational mechanisms underlying subjective valuation of effort costs. PLoS Biol 15, e1002598 (2017).
29. Silvetti M, Vassena E, Abrahamse E & Verguts T. Dorsal anterior cingulate-brainstem ensemble as a reinforcement meta-learner. PLoS Comput Biol 14, e1006370 (2018).
30. Bisley JW & Goldberg ME. Neuronal activity in the lateral intraparietal area and spatial attention. Science 299, 81–86 (2003).
31. Louie K, Grattan LE & Glimcher PW. Reward value-based gain control: divisive normalization in parietal cortex. J Neurosci 31, 10627–10639 (2011).
32. Blanchard TC, Hayden BY & Bromberg-Martin ES. Orbitofrontal cortex uses distinct codes for different choice attributes in decisions motivated by curiosity. Neuron 85, 602–614 (2015).
33. Bromberg-Martin ES & Hikosaka O. Midbrain dopamine neurons signal preference for advance information about upcoming rewards. Neuron 63, 119–126 (2009).
- 34. Park IM, Meister ML, Huk AC & Pillow JW Encoding and decoding in parietal cortex during sensorimotor decision making. Nature Neuroscience 17, 1395–1403 (2014).
- 35. Hikosaka O & Isoda M Switching from automatic to controlled behavior: cortico-basal ganglia mechanisms. Trends Cogn Sci 14, 154–161 (2010).
- 36. Isoda M & Hikosaka O A neural correlate of motivational conflict in the superior colliculus of the macaque. J Neurophysiol 100, 1332–1342 (2008).
- 37. Leathers ML & Olson CR In monkeys making value-based decisions, LIP neurons encode cue salience and not action value. Science 338, 132–135 (2012).
- 38. Gottlieb J & Goldberg ME Activity of neurons in the lateral intraparietal area of the monkey during an antisaccade task. Nature Neurosci. 2, 906–912 (1999).
- 39. Snyder LH, Batista AP & Andersen RA Change in motor plan, without a change in the spatial locus of attention, modulates activity in posterior parietal cortex. J Neurophysiol 79, 2814–2819 (1998).
- 40. Hosokawa T, Kennerley SW, Sloan J & Wallis JD Single-neuron mechanisms underlying cost-benefit analysis in frontal cortex. Journal of Neuroscience 33, 17385–17397 (2013).
- 41. Oristaglio J, Schneider DM, Balan PF & Gottlieb J Integration of visuospatial and effector information during symbolically cued limb movements in monkey lateral intraparietal area. J Neurosci 26, 8310–8319 (2006).
- 42. Asaad WF & Eskandar EN A flexible software tool for temporally-precise behavioral control in Matlab. J Neurosci Methods 174, 245–258 (2008).
- 43. Platt ML & Glimcher PW Neural correlates of decision variables in parietal cortex. Nature 400, 233–238 (1999).
- 44. Nystrom M & Holmqvist K An adaptive algorithm for fixation, saccade, and glissade detection in eyetracking data. Behav Res Methods 42, 188–204 (2010).
Associated Data
Data Availability Statement
The data generated and analyzed for this study are available from the corresponding author upon request.
The code written to analyze the data and produce the figures for this study is available from the corresponding author upon request.