Abstract
Humans and animals navigate uncertain environments by seeking information about the future. Remarkably, we often seek information even when it has no instrumental value for aiding our decisions – as if the information is a source of value in its own right. In recent years, there has been a flourishing of research into these non-instrumental information preferences and their implementation in the brain. Individuals value information about uncertain future rewards, and do so for multiple reasons, including valuing resolution of uncertainty and overweighting desirable information. The brain motivates this information seeking by tapping into some of the same circuitry as primary rewards like food and water. However, it also employs cortex and basal ganglia circuitry that predicts and values information as distinct from primary reward. Uncovering how these circuits cooperate will be fundamental to understanding information seeking and motivated behavior as a whole, in our increasingly complex and information-rich world.
Keywords: uncertainty, information, information seeking, temporal resolution of uncertainty, observing response, cingulate, striatum, pallidum, orbitofrontal, habenula, dopamine
Humans and animals navigate uncertain environments by seeking information about the future. Of course, this is partly due to the instrumental value of information to help us choose better actions [1–5]. Remarkably, however, we can be strongly motivated to seek information even when we know there is no way to use it to influence our future actions and outcomes – as if knowledge is a source of value in its own right. Many of us have the experience of voting in an election, knowing there is nothing more we can do to influence the outcome, and telling ourselves we should get a good night’s sleep and find out in the morning…and instead, staying up late into the night with our eyes glued to the TV screen, in order to get the information the first moment it becomes available.
More than ten years ago, neuroscientists began to study how these preferences for non-instrumental information are encoded by single neurons in the brain, focusing on information about uncertain rewards [6]. In its simplest form, this preference can be measured by giving a choice between two offers, one of which provides informative cues that indicate the reward outcome in advance, while the other provides non-informative cues that do not indicate the outcome (Figure 1A). Importantly, both offers provide exactly the same reward distribution, and there is no way to use the information to influence the outcome. Yet both humans and animals can strongly prefer information (Figure 1B) and willingly pay a price for it [7–10], assigning it considerable value in their decisions (Figure 1C).
Figure 1. Humans and animals can seek non-instrumental information about uncertain future rewards.

(A) A simple task to study information preferences (originally invented by [14]). Individuals are offered a choice between options that provide either informative cues that indicate the outcome in advance (Info, red; reward cue or no-reward cue) or non-informative cues that do not indicate the outcome (Noinfo, blue). Importantly, there is no way to use this information to influence the outcome. (B) Humans and macaque monkeys can both prefer to view informative cues. These data were collected with more sophisticated tasks in which offers provided different chances of obtaining information about gaining future rewards – money for humans, and juice or water for monkeys [8,**9,22]. (C) A simple candidate mechanism for non-instrumental information seeking. Individuals can value information for multiple reasons, including valuing the resolution of uncertainty, and overweighting desirable vs. undesirable information. The value of information is then combined with the value of primary rewards to compute the total reward value that guides decisions.
At the time, non-instrumental information seeking already had a long history in several fields. In psychology it was called a form of “observing behavior” [11] and was primarily studied in rats and pigeons [10,12–14]. In economics it was called “temporal resolution of uncertainty” [15] or informational attitude, and was primarily studied in theoretical models and surveys [16–18], with studies beginning to examine choices with real consequences [7]. Developmental psychology and machine learning also studied the importance of intrinsic motivation for learning about the world [19,20]. However, these literatures were largely independent, approaching the phenomenon and interpreting their findings in very different frameworks, with surprisingly little communication between them. This made it difficult to pool their knowledge to understand how informational preferences are created by neural circuits in the brain.
In the last few years this picture has changed dramatically. There has been an explosion of research on information seeking in neuroscience, bringing together researchers from different backgrounds to bridge the gap between these diverse fields, and a flourishing of new discoveries. This has been especially true for information seeking about a specific type of future event – uncertain rewards – which has become the target of systematic and comparative neuroscientific studies in both humans and animals. Here we highlight advances in understanding the mechanisms that motivate this form of information seeking, and their neural implementations in the brain.
Neural networks integrating information and primary reward
Early neuroscience studies of information seeking in monkeys focused on the “reward prediction error” (RPE) system. The RPE system has a key role in motivating actions to seek primary rewards like food and water, by signaling the difference between a situation’s predicted and actual reward value [21]. The RPE system was shown to have similar signals for information – effectively treating information about uncertain outcomes as a reward in itself [6,22]. For example, just as many midbrain dopamine neurons are excited when a monkey learns that it will get a large water reward (‘more water than predicted’), many are also excited when the monkey learns that it will see an informative cue (‘more information than predicted’; Figure 2B). This suggested that information seeking is motivated by the same RPE circuitry that motivates primary reward seeking.
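As a concrete illustration of this idea, the RPE can be sketched as a delta between predicted and received value, with information treated as carrying value of its own. This is a toy model under our own assumptions – the information bonus term and its magnitude are illustrative, not taken from the reviewed studies:

```python
# Minimal sketch of a reward prediction error (RPE) signal in which
# information delivery is treated as rewarding alongside primary reward.
# Illustrative toy model only, not the circuit's actual computation.

def rpe(predicted_value, received_value):
    """Classic RPE: difference between received and predicted value."""
    return received_value - predicted_value

# Hypothetical total value that counts an informative cue as worth a
# bonus on top of the primary (e.g., water) reward it predicts.
INFO_BONUS = 0.3  # assumed subjective value of information

def total_value(primary_reward, is_informative):
    return primary_reward + (INFO_BONUS if is_informative else 0.0)

# Learning that an informative cue is coming yields a positive RPE
# ('more information than predicted'), even before any water arrives.
print(rpe(predicted_value=0.5, received_value=total_value(0.5, True)))   # positive
print(rpe(predicted_value=0.5, received_value=total_value(0.5, False)))  # zero
```

In this sketch, merely learning that an informative cue is forthcoming generates a positive prediction error even though the expected amount of water is unchanged, mirroring the dopamine responses to Info offers described above.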
Figure 2. Candidate neural signals to motivate information seeking: information predictions and reward prediction errors.

(A) An example neuron in the dorsal striatum with activity resembling an information prediction (data from [**39]). This neuron activated during reward uncertainty, ramped up to the time the animal expected to receive information, and then returned to baseline after the information was received. Specifically, this neuron had ramping activation in anticipation of the offer, which indicated the availability of information and of juice reward. If the offer indicated that reward was uncertain and information was forthcoming, the neuron activated again and ramped to the predicted time of the informative cue (red). The neuron had much less activity if no information was predicted (blue) or if reward was fully certain to occur (gray). (B) An example midbrain dopamine neuron with an RPE signal treating information as a reward (data from [6]). This neuron was activated by Info offers (red, ‘more information than predicted’) and by informative cues indicating water reward delivery (red solid line, ‘more reward than predicted’); it was inhibited by Noinfo offers (blue, ‘less information than predicted’) and by informative cues indicating reward omission (red dashed line, ‘less reward than predicted’); and had little response to non-informative cues (blue, ‘predictions unchanged’). As a result, the information prediction signal (A) ramps up to the expected time of detecting an RPE (B).
Recent studies have replicated and greatly extended this finding in humans, uncovering principles by which this network evaluates information. Midbrain regions that contain dopamine neurons have a blood oxygen level dependent (BOLD) signal responding to informational prediction errors [**9]. This signal is sensitive to the subjective value of the information – scaling up for desirable information about likely monetary gains, and scaling down for less desirable information about likely losses. Indeed, the signal strength in a major target of dopamine projections, the ventral striatum, predicts information preferences. This suggests that dopamine projections to basal ganglia may transform prediction errors into motivation to seek information.
This convergence of information and monetary reward processing extends to cortex. Both monetary and informational prediction errors induce electroencephalographic (EEG) signals with strikingly similar spatial and temporal profiles [*23]. This feedback-related negativity originates from medial prefrontal cortex (mPFC), including anterior cingulate cortex (ACC) and supplementary eye field [24,25]. Many mPFC neurons respond to unpredicted outcomes that motivate changes in behavior [25–31] and may regulate evaluation of uncertain rewards [32]. Thus, mPFC may also motivate adjustments in information seeking behavior.
In these cases people sought information with no instrumental value. Could the same network handle information that does have instrumental value? This was recently addressed by allowing humans to pay for partial information about a lottery’s outcome before deciding whether to accept the lottery [*33]. People valued information for both instrumental and non-instrumental reasons. This combined subjective value of information correlated with BOLD signals in many of the same regions as the value of money. This included the ventral striatum region discussed above and the ventromedial prefrontal cortex. Thus, these areas may represent the total value of information and primary reward to guide decisions.
Neural networks for information seeking
The networks discussed so far may combine information and primary reward into a common currency of total value. However, we can also treat information and primary reward as distinct, separate entities (Fig 1C). After all, when we sit down in a restaurant we are pleased to get the menu or to get the meal itself – but we know exactly which we are expecting, and something is wrong if one comes in place of the other!
Early support for this hypothesis came from evidence that the monkey orbitofrontal cortex (OFC) can encode the distinct values of both information about primary reward and primary reward itself [8]. This is consistent with evidence that OFC associates cues with the distinct values of different rewards [34–36], and separately encodes the confidence of a decision and its primary reward value [37]. Human OFC regions respond to the availability [**9] and receipt [*38] of information about uncertain rewards. Thus OFC may separately adjust the values of information and primary reward cues – so that hunger leads us to seek signs of food, while curiosity and uncertainty lead us to seek signs of information.
How, then, could the brain create a specific motivation to seek information? It would need to (1) detect when rewards are uncertain, (2) predict when information will become available to resolve the uncertainty, and (3) use this prediction to promote information seeking actions.
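These three steps can be summarized in a toy sketch. The linear ramp, the gating threshold, and all function names are illustrative assumptions of ours, not the recorded circuit's computation:

```python
# Toy sketch of the three hypothesized steps: (1) detect reward
# uncertainty, (2) build a prediction that ramps to the time information
# will arrive, (3) use that prediction to gate information seeking.

THRESHOLD = 0.5  # assumed gating threshold for action

def info_prediction(reward_prob, now, info_time):
    uncertain = 0.0 < reward_prob < 1.0      # (1) reward is uncertain
    if not uncertain or now > info_time:
        return 0.0
    return now / info_time                   # (2) ramp up to information time

def seek_information(reward_prob, now, info_time):
    # (3) promote information seeking when the prediction is strong
    return info_prediction(reward_prob, now, info_time) > THRESHOLD

print(seek_information(0.5, now=0.9, info_time=1.0))  # True: uncertain, info imminent
print(seek_information(1.0, now=0.9, info_time=1.0))  # False: reward is certain
```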
A recent study showed evidence for a neural system that carries out these steps, in an anatomically interconnected cortex-basal ganglia network including regions of ACC, dorsal striatum (DS), and pallidum (Pal) [**39]. A subset of neurons in this network has information-predictive activity. These neurons activate when monkeys are uncertain about future rewards and ramp up to the time information will arrive to resolve the uncertainty (Figure 2A, red). They have much less activity when rewards are certain or no information is predicted (Figure 2A, gray and blue).
Crucially, this network’s activity causally influences information seeking. Monkeys rapidly shift their gaze to view informative cues [6] in a manner sensitive to reward uncertainty [40]. These information seeking gaze shifts are predictive, ramping up to the time of getting information [**39]. The neural information prediction signal is linked to this behavior. Strong neural signals are followed by gaze shifts toward information-related cues, while weak signals are followed by gaze shifts away from them. The signal has no comparable relationship to primary reward seeking. Furthermore, inactivating basal ganglia regions that contain the information signal impairs information seeking gaze shifts. This suggests that the network’s information prediction signal specifically motivates and sustains information seeking.
The ACC may have a supervisory role in sustaining information seeking. In the ACC-DS-Pal network its information predictions have early tuning to graded levels of uncertainty, and are the earliest predictor of information-seeking gaze shifts [**39]. ACC activity has been linked to anticipation of multiple pieces of information to resolve reward uncertainty [41], and integrating gathered information to change actions or strategies [42–44].
Furthermore, a recent study implicated ACC in sustained information sampling [**45]. Humans and monkeys were allowed to collect information about choice options before making a final decision. Both species showed evidence of non-instrumental information seeking. After they collected strong evidence favoring a specific option, they did not simply choose it immediately, but instead spent additional time gathering information about its future outcome [**45,46]. In parallel, a subset of monkey ACC neurons tracked how each piece of information influenced the certainty of choosing that favored option, and hence the certainty of receiving that outcome. This activity was prevalent in ACC and rare in OFC and dorsolateral prefrontal cortex, supporting a key role of ACC in information sampling.
How do these networks cooperate to motivate behavior?
We have discussed networks that (1) specifically signal information, and (2) integrate information and primary reward. How do they cooperate to motivate behavior? We propose the following hypothesis (Figure 3).
Figure 3. Hypothesized network mechanisms of information seeking.

Left: anatomically connected neural networks for motivated behavior. Left top: cortex-striatum-pallidum network. Left bottom: classic RPE network. Sharp/blunt arrows indicate predominant excitatory/inhibitory projections [**39,51]. Right: neuronal signals demonstrated to exist in each network, and their hypothesized influences on each other (arrows). Note that the RPE signal is depicted to straddle the two networks because RPE-like activity has been reported in subsets of neurons in both. Top: the cortex-striatum-pallidum network contains distinct predictions about the availability of information and primary reward (red, gray). These may directly motivate the specific pursuit of information or of primary reward. They may also be combined (purple) to compute the total predicted reward value, and in turn the total reward prediction error. These signals could then motivate the general pursuit of total reward value. The information prediction signal may also interface with RPEs: information predictions could prepare the brain for impending prediction errors, while RPEs could instruct information predictions by signaling the receipt of new information that changes reward predictions.
Cortex-basal ganglia pathways have primary reward predictive neurons that promote primary reward seeking [47–49], and the work reviewed here suggests they contain a parallel process for information seeking (Figure 3, top). These processes are intermixed, with information and primary reward signals in nearby neurons [**39] and potentially mixed in single neurons. However, there is some clustering. For example, information-related neurons are enriched in a DS region that receives strong ACC projections [**39] and can activate in humans during risky decisions [50].
As a result, these pathways could combine information and primary reward predictions to compute the total predicted reward value that guides decisions. These total reward predictions could also be used to compute RPEs, by sending them directly to the classic RPE system that regulates dopamine, including the lateral habenula, rostromedial tegmental nucleus, and dopamine neurons themselves [51] (Figure 3, bottom; LHb, RMTg, DA). RPE computations could also occur in the cortex-basal ganglia pathway itself, where subsets of neurons have RPE-related signals [52–55]. Indeed, these areas have all been implicated in predicting values, controlling the RPE system, or both [48,55–66].
Neural systems for information prediction and RPEs are well positioned to support each other. Information seeking provides the raw material for predictions, while erroneous predictions indicate the need for new information. In particular, information predictive activity anticipates the moment of gaining information about uncertain future rewards (Figure 2A), and this information immediately triggers a phasic RPE signal based on whether it is better or worse than predicted (Figure 2B). Therefore, information prediction signals may have a special role in preparing the brain to compute and learn from RPEs (Figure 3).
Conversely, RPEs are ideally suited to instruct information predictions (Figure 3). An RPE indicates the receipt of new information about rewards. In classic theories, RPEs instruct learning of reward predictions: positive RPEs increase reward predictions, while negative RPEs decrease reward predictions [67]. In principle, however, RPEs could also instruct learning of information predictions: large positive and negative RPEs both indicate the receipt of a large amount of information about future reward value, while small RPEs typically indicate the receipt of little information. If so, RPEs could instruct both total reward value predictions and information predictions.
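This dual role can be made concrete with a minimal learning-rule sketch, assuming a standard delta rule for reward predictions and using the RPE's absolute magnitude to train information predictions. The learning rate and the exact update forms are our assumptions:

```python
# Sketch of the hypothesis that one RPE teaching signal could update
# both a reward prediction (signed delta rule) and an information
# prediction (driven by the RPE's unsigned magnitude).

ALPHA = 0.1  # learning rate (assumed)

def update(reward_pred, info_pred, outcome):
    rpe = outcome - reward_pred
    reward_pred += ALPHA * rpe                    # signed RPE shifts reward prediction
    info_pred += ALPHA * (abs(rpe) - info_pred)   # large |RPE| = much information received
    return reward_pred, info_pred

reward_pred, info_pred = 0.5, 0.0
for outcome in [1.0, 0.0, 1.0, 0.0]:  # highly uncertain outcomes produce large |RPE|s
    reward_pred, info_pred = update(reward_pred, info_pred, outcome)
# info_pred grows even though reward_pred hovers near its starting value of 0.5
```

Under this sketch, a stream of surprising outcomes leaves the average reward prediction roughly unchanged while steadily building up a prediction that this situation delivers information, as the theory requires.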
In addition, many dopamine neurons are activated by alerting events, which are important for motivated behavior but do not increase the situation’s reward value [68–70] (such as memoranda in a memory task [71], unexpected stop signals [72], and rewards changing flavor [73,74]). Some dopamine neurons are also activated by certain aversive events [70,75–77]. This could motivate information prediction and seeking about these important events as well.
What algorithm does the brain use to calculate the value of information?
We have discussed networks that translate the value of information into motivated behavior. How, then, does the brain decide information’s value? Shortly after the discovery of “observing behavior” in 1952 [11], two general theories emerged about the underlying mechanism [78,79] (Figure 1C). These theories have been remarkably durable, emerging in similar forms in multiple fields. They share the common basis that, since non-instrumental information cannot be used to change the objective rate of gaining primary rewards from the environment, the brain must value information based on how it changes subjective, internal states.
The first theory, proposed by the curiosity research pioneer Daniel Berlyne, is that we value information because it reduces uncertainty [78] (Figure 1C). His original paper suggested computing uncertainty with Shannon’s newly invented information theory [80]. Later work proposed alternate computations for uncertainty, often considering its experience and resolution over time [81]. For instance, uncertainty about potential gains and losses may produce anticipatory emotions like hope and anxiety [16–18]. This theory is supported by evidence that uncertainty and its resolution activate cortex-basal ganglia networks in monkeys [**39] (Figure 2A) and multiple cortical areas in humans [*38], and in those settings uncertainty strongly influences information seeking [*38,**39].
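Berlyne's information-theoretic proposal can be illustrated for the simplest case, a binary reward delivered with probability p, where Shannon entropy quantifies the uncertainty. This is a standard textbook computation, not a model taken from the reviewed studies:

```python
import math

# Shannon entropy (in bits) of a binary reward with probability p,
# as a simple operationalization of Berlyne's "uncertainty" quantity.

def entropy_bits(p):
    if p in (0.0, 1.0):
        return 0.0  # outcome is certain: nothing left to learn
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

# Uncertainty peaks at p = 0.5 and vanishes when the outcome is certain,
# qualitatively matching the uncertain vs. certain conditions in Figure 2A.
print(entropy_bits(0.5))  # 1.0 bit
print(entropy_bits(1.0))  # 0.0
```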
The second theory, proposed by the “observing behavior” discoverer L. Benjamin Wyckoff, is that we value information because we overweight desirable information [79] (Figure 1C). In a nutshell, objectively neutral information sources may become subjectively valuable if we overweight their desirable information (‘good news’), or underweight their undesirable information (‘bad news’). The first proposal by economists was a similar mechanism, even suggesting the same equation for overweighting (a squaring nonlinearity) [15]. Later work proposed many mechanisms for overweighting, including selective observing [13], engagement [82], savoring desirable outcomes, and dreading undesirable outcomes [83,84].
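The overweighting account can likewise be illustrated with the squaring nonlinearity mentioned above: passing anticipated outcomes through a convex weighting makes early resolution of a 50/50 gamble worth more than anticipating its average. The specific anticipated values used here are our assumptions:

```python
# Sketch of Wyckoff's overweighting account: an objectively neutral
# information source becomes subjectively valuable when anticipated
# outcomes are passed through a convex weighting (here, squaring).

def anticipation_value(prob_outcomes, weight=lambda v: v ** 2):
    """Expected weighted anticipation over (probability, anticipated value) pairs."""
    return sum(p * weight(v) for p, v in prob_outcomes)

# 50/50 gamble over anticipated values 1 (reward) and 0 (no reward).
informative = anticipation_value([(0.5, 1.0), (0.5, 0.0)])  # outcome known early
noninformative = anticipation_value([(1.0, 0.5)])           # anticipate the average
print(informative, noninformative)  # 0.5 > 0.25: information preferred
```

Because the weighting is convex, anticipating the resolved outcomes (good news fully savored) exceeds anticipating their average, so the informative option wins even though both deliver the same rewards.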
An important message of recent work is that individuals may value information about uncertain rewards through both mechanisms – in a way that may vary across individuals, species, and situations. For example, information seeking can be tuned to both reward uncertainty and expected reward value, but to different degrees in different studies [**9,*38,40,**85,86]. Even work reporting predominance of uncertainty often finds some effect of expected value [*38,86], while work reporting predominance of ‘good news’ often finds that this requires variance between good news and other possible outcomes [84,87]. Most strikingly, even in a single study, humans have a remarkable diversity of information attitudes: some are guided by uncertainty, some by expected value, and some by both [**85]. Taken together, this work suggests the need for hybrid theories [18,88,89] with flexible mechanisms to value information.
Looking forward
A key goal of future research will be to uncover the neural basis of a broader spectrum of information attitudes. We have discussed information seeking about uncertain rewards, but similar neural mechanisms may apply to other events. Humans have similar information seeking about trivia, and even remember trivia better when it evokes a more positive informational prediction error – when it satisfies their curiosity better than predicted [90]. Humans and animals can also seek information about aversive events [91], and humans can have ‘morbid fascination’ with observing the aversive experiences of others [92] that may activate regions of mPFC and OFC [93]. A recent study reported evidence that monkeys even seek information about counterfactual outcomes – outcomes they would have received if they had chosen a different option [*94]. Lastly, we have discussed information seeking, but humans and animals sometimes avoid information [9,12,95,96], such as avoiding medical screening for disease [95] or refusing to check their stock portfolio during a bear market [**9].
Finally, an important long-term goal will be to discover the evolutionary and developmental origin of information seeking. That is, why do brains develop a motivational system that treats non-instrumental information as valuable? In natural environments, organisms can rarely estimate the instrumental value of a specific piece of information with high precision. This is especially true for unfamiliar environments and important life events, such as moving to a new home, finding a mate, or voting in a national election.
We hypothesize that the brain solves this problem by nudging its estimate of information value toward the value that similar types of information typically have in natural environments. This could be based on the environments the organism encountered during its development, and an evolved prior about the environments its species typically encounters. This would explain why organisms can be well adapted to natural environments, yet persistently seek information in controlled lab experiments where it has no instrumental value.
There is precedent for this phenomenon in neuroscience. ‘Visual illusions’ are often viewed as simple errors in perception, perhaps due to the brain using flawed algorithms for visual processing. Why should we perceive certain objects to be closer than they really are, and others to move slower than they really are? However, when vision scientists measured the natural statistics of visual scenes, they realized that many illusions may actually result from the brain making rational inferences about the world based on its evolved and learned knowledge about the structure of natural environments [97,98]. ‘Most objects move slowly, so err on the side of slowness.’ In a similar manner, non-instrumental information seeking may arise from a ‘value illusion’ as the brain attempts to infer the value of information based on the natural statistics of cues, rewards, and actions. ‘Information about rewards is usually valuable, so err on the side of value.’ If so, then measuring the natural statistics of motivational events would revolutionize our understanding of why we value information, and what rules we use to calculate its value.
Our increasingly interconnected world puts a vast ocean of information at our fingertips. Learning to navigate it is vital for our own happiness and the health of our society. Thus, understanding our informational preferences is becoming increasingly important not only for the sake of scientific discovery, but also society as a whole.
HIGHLIGHTS.
Humans and animals seek information about uncertain future rewards
Information is valued for resolving uncertainty and signaling desirable outcomes
The reward prediction error system integrates information with primary reward
A cortex-basal ganglia network specifically predicts and drives information seeking
Acknowledgements:
This work has been supported by the National Institute of Mental Health under Award Numbers R01MH110594 and R01MH116937.
Footnotes
Declarations of interest: none.
REFERENCES
- 1.Daw ND, O’Doherty JP, Dayan P, Seymour B, Dolan RJ: Cortical substrates for exploratory decisions in humans. Nature 2006, 441:876–879.
- 2.Nakamura K: Neural representation of information measure in the primate premotor cortex. J Neurophysiol 2006, 96:478–485.
- 3.Costa VD, Tran VL, Turchi J, Averbeck BB: Dopamine modulates novelty seeking behavior during decision making. Behav Neurosci 2014, 128:556–566.
- 4.Horan M, Daddaoua N, Gottlieb J: Parietal neurons encode information sampling based on decision uncertainty. Nat Neurosci 2019, 22:1327–1335.
- 5.Nakamura K, Komatsu M: Information seeking mechanism of neural populations in the lateral prefrontal cortex. Brain Res 2019, 1707:79–89.
- 6.Bromberg-Martin ES, Hikosaka O: Midbrain dopamine neurons signal preference for advance information about upcoming rewards. Neuron 2009, 63:119–126.
- 7.Eliaz K, Schotter A: Experimental testing of intrinsic preferences for non-instrumental information. American Economic Review 2007, 97:166–169.
- 8.Blanchard TC, Hayden BY, Bromberg-Martin ES: Orbitofrontal cortex uses distinct codes for different choice attributes in decisions motivated by curiosity. Neuron 2015, 85:602–614.
- ** 9.Charpentier CJ, Bromberg-Martin ES, Sharot T: Valuation of knowledge and ignorance in mesolimbic reward circuitry. Proc Natl Acad Sci U S A 2018, 115:E7255–E7264. This study used functional imaging in humans to compare informational and monetary reward prediction errors for gains and losses. They demonstrated BOLD signals for informational prediction errors in regions of the mesolimbic dopamine system classically known to have RPEs. Information attitudes in their tasks scaled strongly with expected value, and the informational prediction error signal tracked this preference. This marks an important convergence between human and animal data.
- 10.Zentall TR: Suboptimal choice by pigeons: an analog of human gambling behavior. Behav Processes 2014, 103:156–164.
- 11.Wyckoff LB Jr.: The role of observing responses in discrimination learning. Psychol Rev 1952, 59:431–442.
- 12.Daly HB: Preference for unpredictability is reversed when unpredictable nonreward is aversive: procedures, data, and theories of appetitive observing response acquisition. In Learning and Memory: The Behavioral and Biological Substrates. Edited by Gormezano I, Wasserman EA: L. Erlbaum Associates; 1992:81–104.
- 13.Dinsmoor JA: Observing and conditioned reinforcement. Behavioral and Brain Sciences 1983, 6:693–704.
- 14.Prokasy WF Jr.: The acquisition of observing responses in the absence of differential external reinforcement. J Comp Physiol Psychol 1956, 49:131–134.
- 15.Kreps DM, Porteus EL: Temporal resolution of uncertainty and dynamic choice theory. Econometrica 1978, 46:185–200.
- 16.Chew SH, Ho JL: Hope: an empirical study of attitude toward the timing of uncertainty resolution. Journal of Risk and Uncertainty 1994, 8:267–288.
- 17.Wu G: Anxiety and decision making with delayed resolution of uncertainty. Theory and Decision 1999, 46:159–198.
- 18.Caplin A, Leahy J: Psychological expected utility theory and anticipatory feelings. Quarterly Journal of Economics 2001, 116:55–79.
- 19.Schmidhuber J: A possibility for implementing curiosity and boredom in model-building neural controllers. In Proc. of the International Conference on Simulation of Adaptive Behavior. Edited by Meyer JA, Wilson SW: MIT Press/Bradford Books; 1991:222–227.
- 20.Kidd C, Hayden BY: The Psychology and Neuroscience of Curiosity. Neuron 2015, 88:449–460.
- 21.Schultz W, Dayan P, Montague PR: A neural substrate of prediction and reward. Science 1997, 275:1593–1599.
- 22.Bromberg-Martin ES, Hikosaka O: Lateral habenula neurons signal errors in the prediction of reward information. Nat Neurosci 2011, 14:1209–1216.
- * 23.Brydevall M, Bennett D, Murawski C, Bode S: The neural encoding of information prediction errors during non-instrumental information seeking. Sci Rep 2018, 8:6134. This study used a carefully designed task in humans to dissociate informational and monetary prediction errors. They found that both resulted in EEG signals with strikingly similar spatial and temporal response profiles, consistent with the classic feedback-related negativity. This signal is thought to be generated by the mPFC, marking a point of convergence between human and animal data implicating this region in information seeking.
- 24.Holroyd CB, Coles MGH: The neural basis of human error processing: reinforcement learning, dopamine, and the error-related negativity. Psychol Rev 2002, 109:679–709.
- 25.Sajad A, Godlove DC, Schall JD: Cortical microcircuitry of performance monitoring. Nat Neurosci 2019, 22:265–274.
- 26.Stuphorn V, Taylor TL, Schall JD: Performance monitoring by the supplementary eye field. Nature 2000, 408:857–860.
- 27.Hayden BY, Heilbronner SR, Pearson JM, Platt ML: Surprise Signals in Anterior Cingulate Cortex: Neuronal Encoding of Unsigned Reward Prediction Errors Driving Adjustment in Behavior. Journal of Neuroscience 2011, 31:4178–4187.
- 28.Bryden DW, Johnson EE, Tobia SC, Kashtelyan V, Roesch MR: Attention for learning signals in anterior cingulate cortex. J Neurosci 2011, 31:18266–18274.
- 29.Monosov IE: Anterior cingulate is a source of valence-specific information about value and uncertainty. Nat Commun 2017, 8:134.
- 30.Kennerley SW, Behrens TE, Wallis JD: Double dissociation of value computations in orbitofrontal and anterior cingulate neurons. Nat Neurosci 2011, 14:1581–1589.
- 31.Wallis JD, Rich EL: Challenges of Interpreting Frontal Neurons during Value-Based Decision-Making. Front Neurosci 2011, 5:124.
- 32.Chen X, Stuphorn V: Inactivation of Medial Frontal Cortex Changes Risk Preference. Curr Biol 2018, 28:3709.
- * 33.Kobayashi K, Hsu M: Common neural code for reward and information value. Proc Natl Acad Sci U S A 2019, 116:13061–13066. This study made an important advance by using functional imaging in humans to directly test whether the same brain areas have BOLD responses related to the subjective values of information and monetary reward, in a situation where people placed both instrumental and non-instrumental value on information. They reported evidence that these values could be decoded from multiple brain areas, including ventromedial prefrontal cortex, ventral striatum, and the middle frontal gyrus.
- 34.Murray EA, Rudebeck PH: Specializations for reward-guided decision-making in the primate ventral prefrontal cortex. Nat Rev Neurosci 2018, 19:404–417.
- 35.Burke KA, Franz TM, Miller DN, Schoenbaum G: The role of the orbitofrontal cortex in the pursuit of happiness and more specific rewards. Nature 2008, 454:340–344.
- 36.Howard JD, Reynolds R, Smith DE, Voss JL, Schoenbaum G, Kahnt T: Targeted Stimulation of Human Orbitofrontal Networks Disrupts Outcome-Guided Behavior. Curr Biol 2020, 30:490–498 e494.
- 37.Hirokawa J, Vaughan A, Masset P, Ott T, Kepecs A: Frontal cortex neuron types categorically encode single decision variables. Nature 2019, 576:446–451.
- * 38.van Lieshout LLF, Vandenbroucke ARE, Muller NCJ, Cools R, de Lange FP: Induction and Relief of Curiosity Elicit Parietal and Frontal Activity. Journal of Neuroscience 2018, 38:2579–2588. This study used a series of experiments to measure human information attitudes and their relationship to the uncertainty and expected value of rewards. In their task, information preferences were more strongly driven by uncertainty than by expected value, and they reported BOLD signals in multiple cortical areas related to induction and resolution of uncertainty.
- ** 39.White JK, Bromberg-Martin ES, Heilbronner SR, Zhang K, Pai J, Haber SN, Monosov IE: A neural network for information seeking. Nat Commun 2019, 10:5168. This study demonstrated a network of anatomically connected brain areas in monkeys containing neurons that predict the moment of receiving information about uncertain future rewards, and are linked to information seeking gaze shifts. Furthermore, it showed the first causal evidence that specific brain areas regulate the motivation to seek non-instrumental information.
- 40.Daddaoua N, Lopes M, Gottlieb J: Intrinsically motivated oculomotor exploration guided by uncertainty reduction and conditioned reinforcement in non-human primates. Scientific Reports 2016, 6.
- 41.Luhmann CC, Chun MM, Yi DJ, Lee D, Wang XJ: Neural dissociation of delay and uncertainty in intertemporal choice. J Neurosci 2008, 28:14459–14466.
- 42.Kolling N, Wittmann MK, Behrens TE, Boorman ED, Mars RB, Rushworth MF: Value, search, persistence and model updating in anterior cingulate cortex. Nat Neurosci 2016, 19:1280–1285.
- 43.Shenhav A, Cohen JD, Botvinick MM: Dorsal anterior cingulate cortex and the value of control. Nat Neurosci 2016, 19:1286–1291.
- 44.Heilbronner SR, Hayden BY: Dorsal Anterior Cingulate Cortex: A Bottom-Up View. Annu Rev Neurosci 2016, 39:149–170.
- ** 45.Hunt LT, Malalasekera WMN, de Berker AO, Miranda B, Farmer SF, Behrens TEJ, Kennerley SW: Triple dissociation of attention and decision computations across prefrontal cortex. Nat Neurosci 2018, 21:1471–1481. This study used an elegant task in which monkeys were allowed to seek multiple pieces of information about the reward probabilities and magnitudes of two options, before making a final decision. Monkeys were biased to keep gathering information about an option even after they had already collected strong evidence that it was the right choice. In parallel, many ACC neurons (but few OFC and dorsolateral prefrontal cortex neurons) reflected the degree to which each new piece of information supported that decision.
- 46.Hunt LT, Rutledge RB, Malalasekera WM, Kennerley SW, Dolan RJ: Approach-Induced Biases in Human Information Sampling. PLoS Biol 2016, 14:e2000638.
- 47.Hikosaka O, Nakamura K, Nakahara H: Basal ganglia orient eyes to reward. J Neurophysiol 2006, 95:567–584.
- 48.Lak A, Okun M, Moss MM, Gurnani H, Farrell K, Wells MJ, Reddy CB, Kepecs A, Harris KD, Carandini M: Dopaminergic and Prefrontal Basis of Learning from Sensory Confidence and Reward Value. Neuron 2020, 105:700–711 e706.
- 49.Bari BA, Grossman CD, Lubin EE, Rajagopalan AE, Cressy JI, Cohen JY: Stable Representations of Decision Variables for Flexible Behavior. Neuron 2019, 103:922–933 e927.
- 50.Hsu M, Bhatt M, Adolphs R, Tranel D, Camerer CF: Neural systems responding to degrees of uncertainty in human decision-making. Science 2005, 310:1680–1683.
- 51.Hikosaka O: The habenula: from stress evasion to value-based decision-making. Nat Rev Neurosci 2010, 11:503–513.
- 52.Seo H, Lee D: Temporal filtering of reward signals in the dorsal anterior cingulate cortex during a mixed-strategy game. J Neurosci 2007, 27:8366–8377.
- 53.Matsumoto M, Matsumoto K, Abe H, Tanaka K: Medial prefrontal cell activity signaling prediction errors of action values. Nat Neurosci 2007, 10:647–656.
- 54.Oyama K, Hernadi I, Iijima T, Tsutsui K: Reward prediction error coding in dorsal striatal neurons. J Neurosci 2010, 30:11447–11457.
- 55.Hong S, Hikosaka O: The globus pallidus sends reward-related signals to the lateral habenula. Neuron 2008, 60:720–729.
- 56.Takahashi YK, Roesch MR, Wilson RC, Toreson K, O’Donnell P, Niv Y, Schoenbaum G: Expectancy-related changes in firing of dopamine neurons depend on orbitofrontal cortex. Nat Neurosci 2011, 14:1590–1597.
- 57.Takahashi YK, Stalnaker TA, Roesch MR, Schoenbaum G: Effects of inference on dopaminergic prediction errors depend on orbitofrontal processing. Behav Neurosci 2017, 131:127–134.
- 58.Starkweather CK, Gershman SJ, Uchida N: The Medial Prefrontal Cortex Shapes Dopamine Reward Prediction Errors under State Uncertainty. Neuron 2018, 98:616–629 e616.
- 59.Hong S, Amemori S, Chung E, Gibson DJ, Amemori KI, Graybiel AM: Predominant Striatal Input to the Lateral Habenula in Macaques Comes from Striosomes. Curr Biol 2019, 29:51–61 e55.
- 60.Wallace ML, Saunders A, Huang KW, Philson AC, Goldman M, Macosko EZ, McCarroll SA, Sabatini BL: Genetically Distinct Parallel Pathways in the Entopeduncular Nucleus for Limbic and Sensorimotor Output of the Basal Ganglia. Neuron 2017, 94:138–152 e135.
- 61.Stephenson-Jones M, Yu K, Ahrens S, Tucciarone JM, van Huijstee AN, Mejia LA, Penzo MA, Tai LH, Wilbrecht L, Li B: A basal ganglia circuit for evaluating action outcomes. Nature 2016, 539:289–293.
- 62.Stephenson-Jones M, Bravo-Rivera C, Ahrens S, Furlan A, Xiao X, Fernandes-Henriques C, Li B: Opposing Contributions of GABAergic and Glutamatergic Ventral Pallidal Neurons to Motivational Behaviors. Neuron 2020, 105:921–933 e925.
- 63.Li H, Pullmann D, Jhou TC: Valence-encoding in the lateral habenula arises from the entopeduncular region. Elife 2019, 8.
- 64.Li H, Vento PJ, Parrilla-Carrero J, Pullmann D, Chao YS, Eid M, Jhou TC: Three Rostromedial Tegmental Afferents Drive Triply Dissociable Aspects of Punishment Learning and Aversive Valence Encoding. Neuron 2019, 104:987–999 e984.
- 65.Ottenheimer D, Richard JM, Janak PH: Ventral pallidum encodes relative reward value earlier and more robustly than nucleus accumbens. Nature Communications 2018, 9.
- 66.Yun M, Kawai T, Nejime M, Yamada H, Matsumoto M: Signal dynamics of midbrain dopamine neurons during economic decision making in monkeys. Sci Adv 2020, 6.
- 67.Schultz W, Dayan P, Montague PR: A neural substrate of prediction and reward. Science 1997, 275:1593–1599.
- 68.Schultz W: Predictive reward signal of dopamine neurons. J Neurophysiol 1998, 80:1–27.
- 69.Schultz W: Dopamine reward prediction-error signalling: a two-component response. Nat Rev Neurosci 2016, 17:183–195.
- 70.Bromberg-Martin ES, Matsumoto M, Hikosaka O: Dopamine in motivational control: rewarding, aversive, and alerting. Neuron 2010, 68:815–834.
- 71.Matsumoto M, Takada M: Distinct representations of cognitive and motivational signals in midbrain dopamine neurons. Neuron 2013, 79:1011–1024.
- 72.Ogasawara T, Nejime M, Takada M, Matsumoto M: Primate Nigrostriatal Dopamine System Regulates Saccadic Response Inhibition. Neuron 2018, 100:1513–1526 e1514.
- 73.Takahashi YK, Batchelor HM, Liu B, Khanna A, Morales M, Schoenbaum G: Dopamine Neurons Respond to Errors in the Prediction of Sensory Features of Expected Rewards. Neuron 2017, 95:1395–1405 e1393.
- 74.Stalnaker TA, Howard JD, Takahashi YK, Gershman SJ, Kahnt T, Schoenbaum G: Dopamine neuron ensembles signal the content of sensory prediction errors. Elife 2019, 8.
- 75.Mirenowicz J, Schultz W: Preferential activation of midbrain dopamine neurons by appetitive rather than aversive stimuli. Nature 1996, 379:449–451.
- 76.Menegas W, Akiti K, Amo R, Uchida N, Watabe-Uchida M: Dopamine neurons projecting to the posterior striatum reinforce avoidance of threatening stimuli. Nat Neurosci 2018, 21:1421–1430.
- 77.Steinberg EE, Gore F, Heifets BD, Taylor MD, Norville ZC, Beier KT, Foldy C, Lerner TN, Luo L, Deisseroth K, Malenka RC: Amygdala-Midbrain Connections Modulate Appetitive and Aversive Learning. Neuron 2020, 106.
- 78.Berlyne DE: Uncertainty and Conflict - a Point of Contact between Information-Theory and Behavior-Theory Concepts. Psychological Review 1957, 64:329–339.
- 79.Wyckoff LB: Toward a quantitative theory of secondary reinforcement. Psychol Rev 1959, 66:68–78.
- 80.Shannon CE: A mathematical theory of communication. Bell System Technical Journal 1948, 27:379–423.
- 81.Bennett D, Bode S, Brydevall M, Warren H, Murawski C: Intrinsic Valuation of Information in Decision Making under Uncertainty. PLoS Comput Biol 2016, 12:e1005020.
- 82.Beierholm UR, Dayan P: Pavlovian-instrumental interaction in ‘observing behavior’. PLoS Comput Biol 2010, 6.
- 83.Loewenstein G: Anticipation and the Valuation of Delayed Consumption. Economic Journal 1987, 97:666–684.
- 84.Iigaya K, Story GW, Kurth-Nelson Z, Dolan RJ, Dayan P: The modulation of savouring by prediction error and its effects on choice. Elife 2016, 5.
- ** 85.Kobayashi K, Ravaioli S, Baranes A, Woodford M, Gottlieb J: Diverse motives for human curiosity. Nat Hum Behav 2019, 3:587–595. This study used a cleverly designed behavioral task in humans to separate information seeking motivated by reducing uncertainty vs. viewing high-value cues. Humans had a remarkable diversity of information preferences, with different individuals motivated by the former, the latter, both, or neither of these factors. This is key evidence that information seeking can be driven by multiple mechanisms, or a mechanism that tunes its parameters differently in different individuals.
- 86.Rodriguez Cabrero JAM, Zhu JQ, Ludvig EA: Costly curiosity: People pay a price to resolve an uncertain gamble early. Behav Processes 2019, 160:20–25.
- 87.Zentall TR, Andrews DM, Case JP: Contrast between what is expected and what occurs increases pigeon’s suboptimal choice. Anim Cogn 2019, 22:81–87.
- 88.Golman R, Loewenstein G: Information gaps: A theory of preferences regarding the presence and absence of information. Decision 2018, 5:143–164.
- 89.Sharot T, Sunstein CR: How people decide what they want to know. Nat Hum Behav 2020, 4:14–19.
- 90.Marvin CB, Shohamy D: Curiosity and reward: Valence predicts choice and information prediction errors enhance learning. J Exp Psychol Gen 2016, 145:266–272.
- 91.Badia P, Harsh J, Abbott B: Choosing between Predictable and Unpredictable Shock Conditions - Data and Theory. Psychological Bulletin 1979, 86:1107–1131.
- 92.Oosterwijk S: Choosing the negative: A behavioral demonstration of morbid curiosity. PLoS One 2017, 12:e0178399.
- 93.Oosterwijk S, Lindquist KA, Adebayo M, Barrett LF: The neural representation of typical and atypical experiences of negative images: comparing fear, disgust and morbid fascination. Soc Cogn Affect Neurosci 2016, 11:11–22.
- * 94.Wang MZ, Hayden BY: Monkeys are curious about counterfactual outcomes. Cognition 2019, 189:1–10. This creative study showed evidence that monkeys not only prefer information about the actual outcomes they will receive, but also about the outcomes they would have received if they had chosen a different option. Such counterfactual outcomes were previously shown to be encoded in ACC, putting them in a good position to drive information preferences.
- 95.Miller SM: Monitoring Versus Blunting Styles of Coping with Cancer Influence the Information Patients Want and Need About Their Disease - Implications for Cancer Screening and Management. Cancer 1995, 76:167–177.
- 96.Golman R, Hagmann D, Loewenstein G: Information Avoidance. Journal of Economic Literature 2017, 55:96–135.
- 97.Geisler WS: Visual perception and the statistical properties of natural scenes. Annu Rev Psychol 2008, 59:167–192.
- 98.Purves D, Wojtach WT, Lotto RB: Understanding vision in wholly empirical terms. Proc Natl Acad Sci U S A 2011, 108 Suppl 3:15588–15595.
