Philosophical Transactions of the Royal Society B: Biological Sciences
2021 Jan 11;376(1819):20190665. doi: 10.1098/rstb.2019.0665

The description–experience gap: a challenge for the neuroeconomics of decision-making under uncertainty

Basile Garcia 1, Fabien Cerrotti 1, Stefano Palminteri 1
PMCID: PMC7815421  PMID: 33423626

Abstract

The experimental investigation of decision-making in humans relies on two distinct types of paradigms, involving either description- or experience-based choices. In description-based paradigms, decision variables (i.e. payoffs and probabilities) are explicitly communicated by means of symbols. In experience-based paradigms, decision variables are learnt from trial-by-trial feedback. In the decision-making literature, the ‘description–experience gap’ refers to the fact that different biases are observed in the two experimental paradigms. Remarkably, well-documented biases of description-based choices, such as the over-weighting of rare events and loss aversion, do not apply to experience-based decisions. Here, we argue that the description–experience gap represents a major challenge, not only to current decision theories, but also to the neuroeconomics research framework, which relies heavily on the translation of neurophysiological findings between human and non-human primate research. In fact, most non-human primate neurophysiological research relies on behavioural designs that share features of both description- and experience-based choices. As a consequence, it is unclear whether the neural mechanisms derived from non-human primate electrophysiology should be linked to description-based or experience-based decision-making processes. The picture is further complicated by additional methodological gaps between human and non-human primate neuroscience research. After analysing these methodological challenges, we conclude by proposing new lines of research to address them.

This article is part of the theme issue ‘Existence and prevalence of economic behaviours among non-human primates’.

Keywords: neuroeconomics, description–experience gap, reinforcement learning, decision-making, macaque, risk

1. The neuroeconomic research programme

The expected utility model was established as the standard normative model of decision-making under risk [1,2]. Integrating Bernoulli's intuition about the curvature of the utility function with probability theory, von Neumann and Morgenstern demonstrated that choices based on expected utility (i.e. the product of the utility of an outcome and its probability) satisfy four basic axioms of rationality (completeness, transitivity, continuity and independence). Historically, the neoclassical economics research programme disregarded the study of the internal processes governing economic behaviours. Keynes' animal spirits [3] were considered unmeasurable, and economic theory was built on the assumption that the human mind, as well as the brain, was ultimately a black box. The ‘as-if’ hypothesis [4] illustrates this position by endorsing an instrumentalist epistemology: a theory's predictive power prevails over the realism of its initial assumptions. Accordingly, it was considered acceptable to rely on unrealistic assumptions regarding the unbounded cognitive capacities or perfect knowledge of economic agents, as long as the predictions were sufficiently accurate.

However, with the accumulation of behavioural evidence against the standard normative expected utility model, it soon appeared that it had to be profoundly amended to successfully account for actual decisions under risk [5,6]. Positive, descriptive models of decision-making under risk that integrate insights from psychology, such as the notions of bounded rationality (i.e. humans display limited computational capacities), heuristics (taking computational shortcuts to make decisions) and biases (systematically distorted representations of behavioural variables), were then proposed and formalized [7–9]. Among the descriptive theories of decision under risk and uncertainty, ‘prospect theory’ (PT) stood out for its strong empirical grounding [8,10]. PT postulates that utility is evaluated relative to a reference point (the frame), that gains and losses are treated asymmetrically (loss aversion), and that probabilities are subjectively weighted (probability distortion). PT successfully explained known paradoxes (such as the Allais paradoxes) and new ones (e.g. the Asian disease problem), as well as a certain number of ‘real life’ irrational behaviours [11,12].

However, despite these successes, some aspects of the descriptive approach, in general, and PT, in particular, remained unsatisfactory. First, it remained difficult to ultimately arbitrate between competing descriptive theories solely on the basis of behavioural data. For instance, alternative behavioural theories have been proposed (such as rank-dependent utility, regret and disappointment theories; see [13] for a review) that make predictions overlapping with those of PT, making the theories hard to disentangle. Second, while making accurate predictions, PT and other descriptive theories do not specify the actual cognitive operations involved, nor how they are implemented by the brain. In terms of Marr's levels of analysis, PT (like other descriptive theories) is situated at the computational level, which specifies the goal of the agent (in this case, maximizing a subjective utility that includes reference point dependence, loss aversion and probability deformation), but it is silent concerning the algorithmic level (i.e. what operations are involved in the manipulation of decision variables) and the implementational level (i.e. how these operations are physically embodied and realized) [14].

A couple of decades later, the time was ripe for a group of scholars from diverse backgrounds to look to neuroscientific data as a way to overcome the limitations of the descriptive theories developed by psychologists and behavioural economists. This was facilitated by the rapid development of non-invasive neuroimaging techniques in humans (most notably functional magnetic resonance imaging: fMRI [15–17]) and by improvements in single-unit electrophysiological recordings in monkeys [18,19]. The hope was (and still is) that, by taking advantage of neuroscientific methods and concepts, neuroeconomics (as this rising field was named) would be able to address the epistemological issues of economic theories highlighted above.

Concerning the adjudication between competing theories (our first issue), by opening the brain's ‘black box’, functional neuroimaging studies would provide an additional crucial observable measure, the blood oxygen level dependent signal (BOLD: an aggregate and indirect measure of neural electrical activity), with which to compare, falsify and ultimately refine behavioural models. We define this approach as the weak neuroeconomic agenda, as it does not involve rewriting economic descriptive theories [20–22]. Coming back to our example: while making similar behavioural predictions with respect to preferences under risk, different theories postulate different utility functions that can be searched for in the brain [23–25]. Assuming one knows where to look for utility representations in the brain,1 it would in principle be possible to assess which model better predicts their activity (a sort of neural model comparison: see [29]). Beyond comparing different theories, neural activity could in principle help refine a theory by fixing some of its parameters. For instance, in many circumstances, PT is silent about how the reference point should be set [30]. Assuming one knows where to look for positive (gain) and negative (loss) utility representations in the brain, in some cases the reference point could be inferred by comparing the activity profiles of the ‘gains’ and ‘losses’ areas2 [25,33].

Concerning the building of new theories (our second issue), accepting the fundamental ontological tenet that (economic) decisions ultimately stem from neural activity in the brain (a standard materialistic and monistic solution to the mind–body problem, see [34]) entails that neuroscientific methods should provide the conceptual and methodological tools necessary to develop new, neurobiologically grounded models encompassing the algorithmic and implementational levels. By contrast with the previous approach, we define this as the strong neuroeconomic agenda, as it involves rewriting economic theories in neurobiological terms. By integrating biological constraints and cost functions, these hypothetical neurobiologically grounded economic models have the potential to explain why human decision-making presents certain biases from a biologically (rather than logically or statistically) normative perspective [35,36].

The methodological requirements of the two main neuroeconomic agendas are not quite the same. The weak neuroeconomic agenda can, in principle, be fulfilled by experiments relying on aggregate and indirect measures of neural activity, such as the BOLD signal recorded by fMRI scanners in areas encoding subjective values. Furthermore, since the goal is to arbitrate between different behavioural theories of decision-making developed by psychologists and economists, the experiments belonging to this research agenda should be preferentially (if not exclusively) performed in humans.

On the other hand, as neural models are, ultimately, models of which information is encoded in neurons and of how neurons are connected (networks), the strong neuroeconomic research agenda cannot be pursued relying only on fMRI signals.3 In fact, the BOLD signal, at its best resolution, aggregates over thousands of neurons [37–39]. Furthermore, it is still unclear to what extent it reflects presynaptic or postsynaptic activity (probably a mixture of both) [39,40]. Such neural models should eventually be validated against recordings of single-cell activity, which is, for obvious ethical reasons, nearly impossible in humans.4 This is why neuroeconomics research has, from its very inception, strongly relied on electrophysiological research in animal models, which have been employed in the study of neural mechanisms and cognition for almost 80 years [42]. Monkeys (especially the rhesus monkey, Macaca mulatta) are particularly popular models because they present a wide behavioural repertoire and a high degree of neuro-anatomical homology with humans, especially concerning the prefrontal cortices that underpin decision-making [43].

In figure 1, we represent what a prototypical workflow combining human and monkey data to deliver a neural model of decision-making should look like. Of note, we describe it from an abstract, theory-building perspective, but in reality its different steps can occur simultaneously (or in reverse order), and in very distant laboratories. Once a behavioural process of interest has been identified (e.g. decision-making under uncertainty), a behavioural protocol is designed (typically, a series of choice problems involving different amounts of reward and probabilities) and administered to both humans and monkeys. If the behaviour is comparable across species (meaning that the monkey represents a valid experimental model of human behaviour5), functional imaging in humans can then be deployed to identify neural targets encoding macroscopic variables (e.g. probabilities, outcomes), which are later used to guide the selection of the areas where neurons will be recorded in monkeys. A desirable intermediate step, to reinforce the functional correspondence between human and monkey brain activations, would be to also deploy fMRI in monkeys [45]. Similarly, in some neurological and psychiatric diseases, intra-cranial neural activity can also be recorded in humans [41]. Finally, all these data can be combined to propose and validate a neurobiologically plausible model of the behavioural process of interest. Thereafter, the proposed model should be tested for causality and generalizability. Methods such as trans-cranial magnetic stimulation and brain lesions can be used to test the alleged causal relationship between neural correlates and behavioural processes [46–48]. The model's ability to generalize can be assessed by generating predictions in tasks involving different decision problems and behavioural processes (out-of-sample validation).

Figure 1.

Prototypical workflow combining human (purple) and monkey (green) data to pursue the strong neuroeconomic agenda. Dotted lines designate optional steps. (Online version in colour.)

A crucial step in this workflow is checking that humans and monkeys display the same behavioural processes and biases as a result of a true homology. This is notoriously tricky to assess, because several, to some extent unavoidable, methodological differences exist between human and non-human primate research.

The foundational experimental paradigm of behavioural decision-making research consists of choices between ‘lotteries’ or ‘gambles’, i.e. options associated with known or unknown probabilities of obtaining different outcomes [2,5]. According to the gambling metaphor of individual choice [49], lotteries are believed to be prototypical of real-life decisions [50]. Outcomes and their probabilities are described to participants, who often (especially in the first generation of behavioural economics studies) make only one or very few decisions, without being informed of the outcome of their choices (in general to purposely prevent learning processes from influencing decision-making [51]). Monkey electrophysiological research, on the other hand, adopts very different methodological standards. For various reasons (including ethical ones), monkey studies are limited in sample size, and the number of observations per subject is therefore greatly increased in order to boost statistical power and reduce measurement noise. In fact, behavioural tasks in monkeys feature a much greater number of trials per subject, collected from samples of often fewer than five subjects (e.g. [52,53]). Both parameters (sample size and number of trials) differ by roughly a couple of orders of magnitude from what is common practice in behavioural economics (e.g. [54,55]) (figure 2a). Interestingly, fMRI studies of decision-making present experimental parameters somewhere in-between those used in monkey and human behavioural studies: they usually involve hundreds of trials and sample sizes of about 20–40 subjects (see two notable examples in neuroeconomics: [25,56]). Assuming that decision-making possesses ergodicity (i.e. the behaviour averaged across trials is the same as the behaviour averaged across subjects), different trials-to-participants ratios should not per se present a big challenge for comparing results from human and monkey studies (but note that ergodicity does not seem to be granted for psychological processes, see [57]). However, in addition to these quantitative differences, in monkey studies an outcome (usually a primary reward) is provided on a trial-by-trial basis, because a monkey would simply stop performing the experiment in the absence of extrinsic motivation. Thus, in virtually all cases, monkey experiments include a reinforcement learning component, where actions are associated with past outcomes. This is true even when the paradigm involves establishing a symbolic system to communicate outcomes and probabilities: in the absence of a shared language or semantic system, monkeys are compelled to learn any representational system by trial-and-error from feedback.

Figure 2.

Methodological differences between description, experience and description–experience studies. (a) Sample size and number of trials listed in two electrophysiological studies [52,53], two human fMRI studies [25,56] and two human behavioural studies [54,55]. (b) Successive screens of a trial in the different behavioural decision-making paradigms. In pure ‘description’ paradigms, decision variables are explicitly described and no feedback is provided. In pure ‘experience’ paradigms, decision variables are hidden and feedback is provided on a trial-by-trial basis. In the ‘description plus experience’ paradigms, decision variables are explicitly described and feedback is provided on a trial-by-trial basis. (Online version in colour.)

In the present article, we argue that the above-mentioned differences present not only a technical issue, but also a major epistemological challenge for the (strong) neuroeconomic agenda. We detail why below.

2. The description–experience gap

As mentioned before, foundational contributions to behavioural decision-making research were made through the use of explicitly described gambles. Several representations have been used to convey outcome values and probabilities, including textual and numerical descriptions (e.g. [5,8,54]), later replaced by visual cues such as pie-charts (e.g. [25,58]). In these paradigms, the information pertaining to the decision-relevant variables is processed by verbal and mental calculation systems and relies upon some degree of semantic knowledge to decode the meaning of the symbols used. In addition, decision problems were usually presented only once and, when multiple decision problems were used, the final outcome (i.e. the realization of the lottery) was usually not displayed on a trial-by-trial basis (figure 2b).

However, relatively few situations in real life match the characteristics of pure description-based paradigms, namely complete and explicit information about outcome values and probabilities. In many circumstances, it seems more prudent to assume that beliefs about outcome values and probabilities are shaped by past encounters with the same decision problem. Experimentally, this configuration is often translated into multi-armed bandit problems (starting with Thompson [59]; see [60] for a review), where the decision-maker faces abstract cues of unknown value and has to figure out the value of the options by trial-and-error. Computationally, behaviour in multi-armed bandit problems is generally well captured by associative or reinforcement learning processes [61]. In the early 2000s, a line of enquiry arose in which researchers translated the typical decision problems of behavioural economics (i.e. involving choices between a safe and a risky prospect in the gain and loss domains6) into experience-based paradigms [55,63,64] (figure 2b). Systematic comparisons between these two decision-making modes revealed the existence of robust description–experience gaps regarding risk preferences in humans [65–67]. More precisely, probability weighting functions show opposite deformations when comparing description-based and experience-based choices (figure 3, box 1). In particular, most of the tenets of PT do not seem to hold in experience-based choices [8]. While in the description domain the occurrence of rare events is traditionally overestimated (possibility effect) and the occurrence of frequent events is underestimated, experience-based decisions tend to show the opposite biases: an effect that is only partially explained by incomplete sampling [55,63,64,66].

Figure 3.

(a) Illustration of the nonlinear transformation of probabilities in description (left panel) and experience (right panel). In the description domain, subjective probability is reflected by a probability weighting function (here denoted π) following an inverse S-shape (i.e. low probabilities are overweighted while high probabilities are underweighted). This tendency is reversed in the experience domain, where the curve follows an S-shape. (b) Illustration of the classical nonlinear utility function in the description domain (left panel) and of the update of the value function in the experience domain (right panel). In description, the utility curve displays a steeper slope for losses than for gains. In experience, the opposite phenomenon is frequently observed: the sign of the prediction error (i.e. the difference between the obtained reward R and the associative value Q) affects the learning rate.

Box 1. Description- and experience-based behavioural models.

In this box, we sketch the formalisms standardly employed to explain and quantify risk preferences in description-based and experience-based decisions. The two approaches radically differ in how they model decision under risk. In the description domain, risk preferences are the direct result of subjective deformations of explicitly stated probabilities and outcomes. In the experience domain, by contrast, there is no separate representation of outcome probabilities and no explicit deformation of outcome values; risk preferences are instead the indirect result of the learning process that links past outcome information to subsequent choices. Eventually, these two approaches lead to different explanations of risk attitudes.

Risk preferences in description-based paradigms are commonly explained by prospect theory (PT). The expected value of a gamble X (a random variable with k possible outcomes) is computed as follows:

E(X) = \sum_{i=1}^{k} p_i x_i,

where x_i is the value of an individual outcome and p_i is its objective probability. PT states that the utility of an outcome, that is its subjective value u(x_i), is nonlinear and modulated by several parameters: α and β, the powers to which, respectively, a positive or a negative outcome is raised, and λ, the loss aversion coefficient. The PT utility function is thus defined as follows:

u(x_i) =
\begin{cases}
x_i^{\alpha} & \text{if } x_i \geq 0 \\
-\lambda(-x_i)^{\beta} & \text{if } x_i < 0,
\end{cases}

An α < 1 corresponds to risk aversion in the gain domain (the intuition dates back to Bernoulli), while α > 1 corresponds to risk-seeking behaviour. In the loss domain, because the utility of losses is sign-flipped, the relation is mirrored: β < 1 makes the utility function convex over losses and therefore corresponds to risk seeking, while β > 1 corresponds to risk aversion. A value of λ > 1 corresponds to loss aversion; its typical empirical value is around 2 [10,68]. A decision-maker with α < 1, β < 1 and λ > 1 will thus present different risk preferences in the gain (risk aversion) and the loss (risk seeking) domain (figure 3b).

In addition, PT postulates a subjective deformation of probabilities. There are multiple ways to mathematically express the probability weighting function. One of the most common is the ‘Prelec’ function [69]:

\pi(p_i) = e^{-\delta(-\log(p_i))^{\gamma}}

with δ controlling the elevation and γ the curvature. When both parameters are set to 1, the function is exactly linear (π(p) = p). The more γ exceeds 1, the more the function adopts an S-shape. A classical result is the overweighting of low probabilities relative to high probabilities, in which case the curve follows an inverse S-shape (figure 3a), with γ < 1. Note that other probability weighting functions have also been proposed [54]. Finally, the subjective expected utility is given by

SEU(X) = \sum_{i=1}^{k} \pi(p_i)\, u(x_i).

By varying these parameters, PT accounts for inter-individual differences in risk preferences. Of note, concurrent theories such as regret theory [70] or rank-dependent utility models [71], which use very different representational structures and parameterizations, are also used to model decision-making under risk.
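As a numerical illustration, the PT quantities above can be computed in a few lines. The sketch below is illustrative only: the parameter values (α = β = 0.88, λ = 2.25, γ = 0.65) are typical empirical estimates, not taken from any study discussed here.

```python
import math

def pt_utility(x, alpha=0.88, beta=0.88, lam=2.25):
    """Prospect theory utility: concave power function for gains,
    convex, loss-aversion-scaled power function for losses."""
    return x ** alpha if x >= 0 else -lam * (-x) ** beta

def prelec(p, delta=1.0, gamma=0.65):
    """Prelec probability weighting: pi(p) = exp(-delta * (-log p)^gamma)."""
    return math.exp(-delta * (-math.log(p)) ** gamma)

def seu(gamble):
    """Subjective expected utility of a gamble given as [(outcome, probability), ...]
    (zero-valued outcomes can be omitted, since u(0) = 0)."""
    return sum(prelec(p) * pt_utility(x) for x, p in gamble)

# A 50% chance of winning 100 (else nothing) versus a sure 50:
# with alpha < 1, the sure amount has the higher subjective value,
# i.e. risk aversion in the gain domain.
print(seu([(100, 0.5)]), pt_utility(50))
```

Note how the same functions also reproduce loss aversion (|u(−10)| > u(10) when λ > 1) and the overweighting of rare events (π(0.01) > 0.01 when γ < 1).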

Experience-based paradigms can be seen as reinforcement learning problems operationalized as k-armed bandit tasks [61]. Consider an environment composed of a set of states S, with s ∈ S. In each state s, there is a set of available actions, denoted a ∈ A. Each state–action pair has an underlying reward probability distribution, such that P[R|s, a] is the probability of obtaining the reward R given the state–action couple (s, a). An agent must then follow a policy in order to maximize a state–action value function Q(s, a) (i.e. to maximize the average expected reward). A common learning rule is to compute, after each choice, a prediction error δ, which is used to incrementally update the value associated with the specific state–action pair (s, a):

\delta = R - Q(s,a)
Q(s,a) \leftarrow Q(s,a) + \alpha\delta

with α the learning rate, which determines to what extent newly acquired information overrides older information. In this framework, inter-individual variability in behaviour can be accounted for by differences in individual parameters such as the learning rate α. However, a model with only one learning rate is too simple to accommodate different risk preferences.
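The delta-rule update above can be sketched as a short simulation. The payoff schedule below is invented for illustration; it shows the learned value converging towards the arm's true expected reward:

```python
import random

def learn_arm(p_reward, alpha=0.1, n_trials=5000, seed=0):
    """Delta-rule learning of a single bandit arm paying 1 with probability p_reward.

    Each trial: sample the outcome, compute the prediction error delta = R - Q,
    and move Q a fraction alpha of the way towards the outcome.
    """
    rng = random.Random(seed)
    q = 0.0
    trace = []
    for _ in range(n_trials):
        r = 1.0 if rng.random() < p_reward else 0.0
        delta = r - q        # prediction error
        q += alpha * delta   # incremental value update
        trace.append(q)
    return q, trace

# After learning, Q fluctuates around the true expected reward (here 0.7).
q_final, trace = learn_arm(0.7)
```

Averaging the tail of the trace recovers the arm's expected reward, which is what makes this simple rule a building block for the risk-preference models below.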

One way to refine this model to account for different risk preferences is to allow for two different learning rates, α+ for positive and α− for negative prediction errors:

Q(s,a) \leftarrow Q(s,a) +
\begin{cases}
\alpha^{+}\delta & \text{if } \delta > 0 \\
\alpha^{-}\delta & \text{if } \delta < 0
\end{cases}

If α+ = α−, the two-learning-rates model is equivalent to the one-learning-rate model. We define the tendency to preferentially update Q(s, a) from positive rather than negative prediction errors (α+ > α−) as a positivity bias (or loss neglect). Conversely, we define the opposite situation (α+ < α−) as a negativity bias (or loss enhancement).

The learning rate asymmetry has direct consequences for risk preferences in settings where a subject has to learn the value of a safe option (say, a fixed value of 0) and a risky one (say, a 50% chance of winning or losing one euro). A subject displaying a positivity bias will neglect past losses and will, therefore, be risk seeking (figure 3b). Conversely, a negativity bias implies risk aversion. Both pessimistic and optimistic biases have been reported in the literature, the latter more frequently [72–75].
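This consequence of asymmetric learning rates can be checked with a small simulation of the scenario just described (a safe option worth 0 versus a ±1 coin-flip; the code and parameter values are an illustrative sketch, not taken from any study). At the stochastic fixed point, the learned value of the risky option settles around (α+ − α−)/(α+ + α−): positive under a positivity bias, negative under a negativity bias.

```python
import random

def learned_risky_value(alpha_pos, alpha_neg, n_trials=20000, seed=1):
    """Learn the value of a risky option paying +1 or -1 with equal probability,
    using separate learning rates for positive and negative prediction errors.
    Returns the learned value averaged over the second half of trials."""
    rng = random.Random(seed)
    q, qs = 0.0, []
    for _ in range(n_trials):
        r = 1.0 if rng.random() < 0.5 else -1.0
        delta = r - q
        q += (alpha_pos if delta > 0 else alpha_neg) * delta
        qs.append(q)
    return sum(qs[n_trials // 2:]) / (n_trials // 2)

# Positivity bias: past losses are under-weighted, so the risky option is valued
# above the safe 0 (fixed point (0.3 - 0.1)/(0.3 + 0.1) = 0.5) -> risk seeking.
q_optimist = learned_risky_value(alpha_pos=0.3, alpha_neg=0.1)
# Negativity bias: the risky option is valued below 0 -> risk aversion.
q_pessimist = learned_risky_value(alpha_pos=0.1, alpha_neg=0.3)
```

With symmetric learning rates the same simulation settles around the true expected value of 0, reproducing the risk neutrality of the one-learning-rate model.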

While it is tempting to see the positivity bias as the experience-based antithesis of loss aversion, their formalisms and psychological interpretations are quite different, and the two are, therefore, not mutually exclusive. Indeed, loss aversion concerns the valuation of prospective losses, while the positivity bias concerns the retrospective assessment of past losses.

It is important to note that, in humans, although the average values of the behavioural biases are as described above (for instance, an inverse S-shape in description-based paradigms and loss neglect in experience-based paradigms; see figure 3a), these results are tempered by a high degree of inter-individual variability in the bias parameters. At the individual level, some subjects may in fact display opposite biases in both experimental settings [72,76]. If inter-individual variability is equally high in other primates, the very small sample sizes of monkey studies (figure 2) may contribute to explaining the comparatively less consistent picture observed (table 1).

In description-based choices, a behavioural hallmark of loss aversion (the overweighting of negative outcomes) is the reflection effect, whereby subjects are risk averse in the gain domain and risk seeking in the loss domain. The opposite pattern has been repeatedly found in experience-based decisions [67]. This observation may be explained by biases in the learning process, such as preferentially remembering extreme outcomes or preferentially integrating better-than-expected outcomes [72,77]. Finally, a smaller subset of studies has investigated a hybrid situation where decision problems are fully described and choices are repeated and followed by trial-by-trial feedback. These ‘description plus experience’ paradigms showed that probability distortions compatible with prospect theory are initially present, but are corrected by the presence of feedback [78,79]. To summarize, the whole spectrum of decision-making under uncertainty in humans is far from being fully captured by PT's loss aversion and subjective probability deformation. Specifically, different descriptive models seem to apply depending on how outcome and probability information is conveyed. In the remainder of the paper, we illustrate why we believe this feature seriously challenges the use of neural and behavioural data in monkeys to build a neural model of decision-making under uncertainty.

3. Decision under risk in monkeys

In this section, we address the question of whether monkeys are a good experimental model for human decision-making under uncertainty. We focus this survey on rhesus monkey (Macaca mulatta) results, because most electrophysiological studies are performed in this species (but see [44] for a more detailed review including other primates). Asking whether monkeys are a good experimental model translates into asking whether, in the laboratory setting, their behaviour displays the distinctive features and biases observed in humans. We stress again that the comparison is complicated by the fact that pure description-based paradigms cannot exist in monkey studies because of the lack of language. In fact, in monkey studies, whenever outcomes and probabilities are conveyed via a symbolic system, the system is nonetheless learned and maintained through trial-by-trial outcomes (i.e. a situation similar to the ‘description plus experience’ paradigm described above). In such ‘pseudo’ description-based paradigms, monkeys are trained to associate continuous variations in one visual feature (e.g. colour or size) with continuous variations of a decision variable (e.g. outcomes or probabilities). The comparison is further complicated by the fact that only a few studies formalize risk preferences in terms of model parameters (such as probability distortion, loss aversion or learning rates), and data reporting is often limited to behavioural measures.

The general picture (table 1) emerging from ‘pseudo’ description-based paradigms in monkeys (i.e. studies relying on learned symbolic systems to communicate values) is, at best, mixed. PT has been explicitly tested in paradigms using visual cues carrying symbolic information similar to those presented to humans (e.g. pie-charts). Only a few studies show results conforming to the pattern of description-based decisions observed in humans. Risk aversion, suggestive of marginally decreasing utility in the gain domain, has rarely been reported [93]. Nioche et al. [98] is the sole study confirming all PT features: marginally decreasing utility (risk aversion in the gain domain), loss aversion, risk seeking in the loss domain, and subjective probability weighting consistent with overestimation of rare events. Probability weighting functions consistent with standard PT have been reported by other studies, but the same studies also reported increasing marginal utility and risk seeking in the gain domain, which is not typically observed in description-based decisions in humans [95,97]. Many other pseudo description-based experiments also reported risk-seeking attitudes and/or marginally increasing utility in gains [91,92,94,96]. In addition, although the traditional inverse S-shaped probability weighting function has sometimes been observed [95,98], variations in experimental design features (such as randomly mixing gambles instead of repeating the same gamble sequentially) can reverse the direction of the probability weighting function [99].

Table 1.

Studies investigating risk attitudes in rhesus monkeys. E, experience-based paradigms (i.e. without explicit representation of outcomes and probabilities); D, description-based paradigms (i.e. involving explicit representation of outcomes and probabilities; note that in monkeys this implies a ‘description plus experience’ set-up); liquid, the utilization of either water or fruit juice; tokens, the acquisition of a secondary reward, which is later exchanged for a primary reward; seek, an overall preference for the risky option; avoid, an overall preference for the safe option; inverse S-shape, the probability distortion postulated by prospect theory; S-shape, the probability distortion traditionally found in experience-based paradigms; N/A, the information is not available.

study | sample size | modality | reward | risk attitude in gains | risk attitude in losses | probability distortion | loss aversion
McCoy & Platt [80] | 2 | E | liquid | seek | N/A | N/A | N/A
Hayden & Platt [81] | 2 | E | liquid | seek | N/A | N/A | N/A
Hayden et al. [82] | 5 | E | liquid | seek | N/A | N/A | N/A
Long [83] | 3 | E | liquid | seek | N/A | N/A | N/A
Watson [84] | 8 | E | liquid | seek | N/A | N/A | N/A
O'Neill & Schultz [85] | 2 | E | liquid | seek | N/A | N/A | N/A
Heilbronner et al. [86] | 3 | E | liquid | seek | N/A | N/A | N/A
Kim et al. [87] | 2 | E | liquid | seek | N/A | N/A | N/A
Heilbronner & Hayden [88] | 2 | E | liquid | seek | N/A | N/A | N/A
Xu & Kralik [89] | 2 | E | liquid | seek | N/A | N/A | N/A
Smith et al. [90] | 7 | E | liquid | seek | seek | N/A | N/A
Hayden et al. [91] | 4 | D | liquid | seek | N/A | N/A | N/A
So & Stuphorn [92] | 2 | D | liquid | seek | N/A | N/A | N/A
Yamada et al. [93] | ? | D | liquid | avoid | N/A | N/A | N/A
Raghuraman & Padoa-Schioppa [94] | 2 | D | liquid | seek | N/A | N/A | N/A
Stauffer et al. [95] | 2 | D | liquid | seek | seek | inverse S-shape | N/A
Farashahi et al. [96], experiment 1 | 3 | D | liquid | seek | N/A | none | N/A
Farashahi et al. [96], experiment 2 | 3 | D | token | seek | seek | S-shape | N/A
Chen & Stuphorn [97] | 2 | D | liquid | seek | seek | inverse S-shape | N/A
Nioche et al. [98] | 2 | D | liquid | avoid | seek | inverse S-shape | yes
Ferrari-Toniolo et al. [99], experiment 1 | 2 | D | liquid | N/A | N/A | inverse S-shape | N/A
Ferrari-Toniolo et al. [99], experiment 2 | 2 | D | liquid | N/A | N/A | S-shape | N/A
Eisenreich et al. [100] | 3 | D | liquid | seek | seek | N/A | N/A

Regarding ‘pure’ experience-based studies in monkeys (i.e. involving no symbolic system to communicate values), the picture is somewhat clearer. Indeed, rhesus macaques exhibit robust risk-seeking behaviour in the gain domain [80–89]. Risk-seeking attitudes have also been reported in the loss domain [90].

Risk-seeking behaviour in experience-based studies can be computationally explained by an increased sensitivity to positive (compared to negative) prediction errors (a ‘positivity’ bias), which is well documented in human reinforcement learning (box 1) [72–74]. This hypothesis is corroborated by studies demonstrating a stronger impact of past positive outcomes on choices, using either model-free or model-based measures [81,82,101].
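The mechanism can be illustrated with a minimal simulation (a sketch with arbitrary parameter values, not a model fitted to any of the cited datasets): a learner that updates values with a higher learning rate after positive than after negative prediction errors ends up overvaluing a risky option relative to a safe option of identical expected value.

```python
import random

random.seed(1)
ALPHA_POS, ALPHA_NEG = 0.4, 0.1  # 'positivity' bias: learn more from good surprises

def update(q, reward):
    delta = reward - q                       # prediction error
    alpha = ALPHA_POS if delta > 0 else ALPHA_NEG
    return q + alpha * delta

q_safe = q_risky = 0.0
trace = []
for _ in range(10_000):
    q_safe = update(q_safe, 0.5)                          # safe: always pays 0.5
    q_risky = update(q_risky, random.choice([0.0, 1.0]))  # risky: 0 or 1, EV = 0.5
    trace.append(q_risky)

avg_risky = sum(trace[-1000:]) / 1000
# q_safe converges to the true value 0.5, whereas the biased learner settles
# around ALPHA_POS / (ALPHA_POS + ALPHA_NEG) = 0.8 for the risky option,
# predicting risk-seeking choices despite equal expected values.
```

Setting the two learning rates equal removes the bias, which is why asymmetric updating is the computational ingredient invoked to explain risk seeking in experience-based choice.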

Finally, it can be argued that if monkeys are a good model of human decision-making under uncertainty, they should display a description–experience gap. To our knowledge, only one study has explicitly tackled this issue so far [102]. Monkeys made repeated choices between safe and risky options whose outcome probability was either learned from experience or described by the ratio between colours on a rectangle. Replicating previous findings in monkeys, and in discordance with the standard result in humans, Heilbronner and Hayden found that monkeys were risk seeking in the description condition. However, consistent with the gap observed in humans, they also found that risk-seeking behaviour was stronger for experience-based cues.

To summarize, the literature suggests that monkeys' experience-based decision-making is quite consistent with what is observed in humans in terms of risk preference. This is consistent with a large body of literature showing that the neural substrates of reinforcement learning are largely preserved across the two species [103,104]. Risk seeking in this context may be driven by a higher learning rate for positive compared to negative prediction errors, which is essentially a computational reinforcement learning translation of the ‘hot hand’ fallacy [105,106]. The situation is much less reassuring concerning description-based decisions, as preferences compatible with PT are rarely observed. This may be because pseudo description-based designs in monkeys resemble the ‘description plus experience’ set-up in humans, where PT-like distortions are blunted or even disappear, as if description-based and experience-based biases cancelled each other out [78,79]. As a result, it remains unclear to what extent description-based processes can be elicited in the non-human primate animal model.

4. The impact of other experimental differences

Experimental results concerning decision-making under uncertainty in monkeys do not straightforwardly comply with the predictions of PT. Overall, monkeys' behaviour seems better accounted for as an experience-based decision process, which is consistent with the fact that pure description-based paradigms are not possible in monkeys and monkey experiments always involve trial-by-trial feedback. The presence of trial-by-trial feedback is, however, not the only systematic methodological difference between monkey and human studies (figures 2 and 4).

Figure 4.

The figure illustrates how human (purple) and monkey (green) experimental settings map into a four-dimensional space, whose axes are: the way value information is provided (from description to experience); the nature of the reward (from primary to secondary) (a); the amount of training (from moderate to extreme) (b); and the level of the stakes (from low to high) (c). (Online version in colour.)

First, monkey studies essentially rely on primary rewards (mainly water or fruit juice), while human studies mainly use secondary rewards (sometimes hypothetical ones), with primary reinforcers only occasionally employed [107,108]. Preliminary evidence from a study comparing risk propensity for different kinds of rewards in humans (money versus a sports beverage) and monkeys showed similar patterns in the two species, suggesting that under more comparable experimental conditions risk preferences in the two species could converge [109]. Furthermore, while the neural correlates of different kinds of rewards converge in the ventral prefrontal and striatal systems (the principle of a common currency [110]), they also have specific correlates, which may engage different neural mechanisms and result in distinct, reward-specific risk preferences [107]. On the other hand, a proxy for secondary reward can be found in monkey paradigms that involve collecting (virtual) tokens to be later exchanged for a primary reward. Unlike pure primary reward tasks, where losses cannot be implemented (it is impossible to take fruit juice away from the stomach of a monkey), tokens make it possible to subtract previously acquired rewards from the animal, thus inducing ‘losses’ in the same manner as in humans. However, a recent study using tokens showed risk-seeking attitudes comparable to those observed using primary rewards [96]. Furthermore, when tokens are used, they are almost immediately exchanged for primary reward, making them hardly comparable to money, whose value is much more permanent. Taken together, the available evidence suggests that the primary/secondary reward dichotomy does not explain why human description-based biases are hardly observed in monkeys.

Second, in addition to the difference in the nature of the reward, description-based paradigms in humans and paradigms in monkeys often differ systematically in the amount of the reward (figure 4). Indeed, most of the original studies on PT used hypothetical gambles of hundreds of dollars, and the same biases have been replicated using real stakes of about a month's salary [111]. Monkey studies, by contrast, use very small amounts of reward (mere drops of liquid). It has been argued that part of the description–experience gap may simply derive from this difference in stakes, rather than from fundamental differences in the decision-making process [88]. This would be consistent with Markowitz's utility function, which assumes risk seeking for small stakes (the ‘peanuts effect’) before switching to risk aversion for higher stakes [112], and is supported by the finding that increasing the relative amount of reward (by reducing its frequency) decreases risk seeking down to risk neutrality in monkeys [88,112]. However, risk aversion in the gain domain (and the reverse pattern in the loss domain: the reflection effect) has also been observed with small stakes in description-based decisions in humans [67]. Thus, the available evidence suggests that differences in the size of the stakes cannot fully explain why human description-based preferences are hardly observed in monkeys.
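The stake-size argument can be illustrated with a toy Markowitz-style utility function (purely illustrative; the specific polynomial is our choice, not one fitted to data): convex for small amounts and concave for large ones, it produces a preference for a mean-preserving gamble at low stakes that reverses at high stakes.

```python
def markowitz_utility(x):
    """Toy Markowitz-style utility on [0, 2]: convex below x = 1
    (risk seeking for 'peanuts'), concave above (risk aversion)."""
    return 3 * x**2 - x**3

def prefers_gamble(stake, spread=0.3):
    """Does the agent prefer a 50/50 gamble of stake +/- spread
    over the sure stake of equal expected value?"""
    risky = 0.5 * (markowitz_utility(stake - spread)
                   + markowitz_utility(stake + spread))
    return risky > markowitz_utility(stake)
```

At a small stake (e.g. 0.5) the gamble falls in the convex region and is preferred; at a large stake (e.g. 1.5) the concave region dominates and the sure amount wins: the peanuts-effect pattern invoked above.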

Finally, another notable difference between human and monkey experiments is the amount and type of training required to perform the task (see figures 2 and 4). In human experiments, task training rarely takes more than a few minutes (in some extreme cases of description-based paradigms, there is virtually no training: subjects are simply asked to reveal their preferences). Monkey experiments, by contrast, require extensive training, generally spanning several months (training usually takes longer than the experiment itself). It can therefore be argued that monkeys' behaviour becomes to some extent habitual or automatized: a cognitive state that contrasts dramatically with the declarative and deliberative stance of description-based choices made by humans [113]. In addition, training in monkeys (and other animals) often involves simplified versions of the task (often with deterministic contingencies), which may reinforce specific risk preferences. Although the role of extended (several days or weeks) training and the resulting behavioural automation (or habituation) in risk preferences is unclear, it may contribute to the fact that human description-based biases are rarely observed in monkeys.

5. Conclusion and perspectives

Our review suggests that the rhesus monkey is a partial model of human decision-making under uncertainty. Risk preferences in monkeys are generally better explained as experience-based processes. Accordingly, monkeys have proved to be a very good model of human reinforcement learning processes, providing crucial insights into its neural implementation (the dopamine prediction error hypothesis [56,62,114]). The situation is less clear concerning description-based choices. In paradigms using explicit symbolic information about decision variables, monkeys only rarely displayed risk preferences compatible with human results. Deciding by description implies a symbolic system of communication. While in humans this system pre-exists (language), in monkeys it has to be learnt by trial and error, thus irremediably confounding description and experience. In addition to differences in the way value information is conveyed (experience- or description-based), other methodological factors (training, reward type and stakes) further drive apart the experimental set-ups of the two species. This situation is problematic, as building a neural model of decision-making under uncertainty should integrate human (fMRI) and monkey (single-unit) neurophysiological data, while explaining risk preferences in a wide range of situations spanning from pure description-based to pure experience-based choices.

We propose further lines of research that could eventually help fill these gaps and ultimately fulfil the strong neuroeconomic agenda. On the human side, the description–experience gap has been extensively studied at the behavioural level, but surprisingly neglected at the neural level. A notable exception [115] found different neural representations for description- and experience-oriented decisions. Furthering this line of enquiry would help redefine the target areas in which to look specifically for description-based processes in monkey electrophysiological studies.

With the development of online testing techniques, it is becoming easier to implement extended, massive training in humans [116]. Translated to the field of decision-making under risk, such experiments would provide crucial insights into the impact of extensive training on risk preferences. While description-based studies in monkeys require learning a symbolic system ex novo, in humans the meaning of pie-charts is provided by language. It would be interesting to put humans in situations where they have to figure out by trial and error the code linking continuous visual features to decision variables.

In general, all efforts aimed at increasing the methodological overlap between human and monkey studies will provide further insights into which behavioural processes are shared across the two species. Popularizing fMRI experiments in monkeys would help confirm the neuro-anatomical targets and increase the focus on shared neural systems. The token paradigm (conceptually closer to the notion of a secondary reward) offers the possibility of implementing losses in monkeys, hence facilitating the cross-species study of loss aversion.

Finally, on the monkey side, PT has been only sporadically replicated. It will be important to clarify and formalize the experimental factors (in terms of stimuli, training and reward type; see table 1) that predict whether PT-like behaviour will be observed in a monkey experiment [88]. Determining under which experimental conditions PT is replicated in monkeys will yield a deeper understanding of the cognitive mechanisms underlying decision-making under uncertainty.

Endnotes

1

Subjective utility (or subjective value) representation seems to be distributed across a network of areas that includes the ventral and dorsal prefrontal cortices (both medial and lateral parts), the posterior cingulate cortex, the striatum, the insula, the amygdala and the hippocampus [26–28].

2

It is indeed the case that the brain systems encoding positive and negative values are, at least partially, dissociable. Losses are generally encoded by the insula, the amygdala and the dorsal prefrontal cortex, while gains are generally encoded in the ventral prefrontal cortex and the striatum [31,32].

3

Other non-invasive imaging techniques, such as magneto- and electro-encephalography, present no advantage over fMRI when it comes to inferring single-unit activity: their better temporal resolution is traded off against worse spatial resolution.

4

There are a few exceptions of single-unit recordings in humans, obtained from neurological patients undergoing brain surgery. While informative, these data are limited by the fact that the neuro-anatomical targets cannot be chosen freely and that findings may not generalize to the general population [41].

5

Of course, there is also much information to be gained in cases where humans and monkeys do not display the same decisions and biases. Such differences currently represent a strong area of research in comparative psychology and ethology [44]. However, the (not so implicit) assumption of the vast majority of research in neuroeconomics is that monkeys are valid experimental models for human cognition, and they are not investigated for comparative reasons.

6

In the human reinforcement learning literature, the most frequently used paradigms involve options that possess, at a given trial, different expected values but overall similar risk levels [56,62]. As a result, the human reinforcement learning literature is more concerned with measures of objective performance than with subjective preference.

Data accessibility

This article has no additional data.

Authors' contributions

S.P. designed the review. B.G., F.C. and S.P. discussed the review. B.G., F.C. and S.P. wrote the review.

Competing interests

We declare we have no competing interests.

Funding

S.P. was supported by an ATIP-Avenir grant (R16069JS), the Programme Emergence(s) de la Ville de Paris, the Fyssen Foundation, the Fondation Schlumberger pour l'Education et la Recherche (FSER) and the CNRS projet 80 I Prime.

References

  • 1.Samuelson PA. 1938. A note on the pure theory of consumer's behaviour. Economica 5, 61–71. ( 10.2307/2548836) [DOI] [Google Scholar]
  • 2.Von Neumann J, Morgenstern O. 1944. Theory of games and economic behavior. Princeton, NJ: Princeton University Press. [Google Scholar]
  • 3.Keynes JM. 1936. The general theory of employment, interest, and money . Berlin, Germany: Springer. [Google Scholar]
  • 4.Friedman M. 1953. Essays in positive economics. Chicago, IL: University of Chicago Press. [Google Scholar]
  • 5.Allais M. 1953. Le comportement de l'Homme rationnel devant le risque: critique des postulats et axiomes de l'Ecole Americaine. Econometrica 21, 503–546. ( 10.2307/1907921) [DOI] [Google Scholar]
  • 6.Ellsberg D. 1961. Risk, ambiguity, and the Savage axioms. Q. J. Econ. 75, 643–669. ( 10.2307/1884324) [DOI] [Google Scholar]
  • 7.Simon HA. 1955. A behavioral model of rational choice. Q. J. Econ. 69, 99–118. ( 10.2307/1884852) [DOI] [Google Scholar]
  • 8.Kahneman D, Tversky A. 1979. Prospect theory: an analysis of decision under risk. Econometrica 47, 263–291. ( 10.2307/1914185) [DOI] [Google Scholar]
  • 9.Kahneman D, Slovic SP, Slovic P, Tversky A. 1982. Judgment under uncertainty: heuristics and biases. Cambridge, UK: Cambridge University Press. [Google Scholar]
  • 10.Tversky A, Kahneman D. 1992. Advances in prospect theory: cumulative representation of uncertainty. J. Risk Uncertain. 5, 297–323. ( 10.1007/BF00122574) [DOI] [Google Scholar]
  • 11.Kahneman D, Knetsch JL, Thaler RH. 1991. Anomalies: the endowment effect, loss aversion, and status quo bias. J. Econ. Perspect. 5, 193–206. ( 10.1257/jep.5.1.193) [DOI] [Google Scholar]
  • 12.Camerer CF.1998. Prospect theory in the wild: evidence from the field. In Advances in behavioral economics (eds CF Camerer, G Loewenstein, M Rabin), pp. 148–161. Princeton, NJ: Princeton University Press.
  • 13.Vlaev I, Chater N, Stewart N, Brown GDA. 2011. Does the brain calculate value? Trends Cogn. Sci. 15, 546–554. ( 10.1016/j.tics.2011.09.008) [DOI] [PubMed] [Google Scholar]
  • 14.Marr D, Poggio T.1976. From understanding computation to understanding neural circuitry. Technical Report. Cambridge, MA: Massachusetts Institute of Technology.
  • 15.Friston KJ, Ashburner J, Frith CD, Poline J-B, Heather JD, Frackowiak RS. 1995. Spatial registration and normalization of images. Hum. Brain Mapp. 3, 165–189. ( 10.1002/hbm.460030303) [DOI] [Google Scholar]
  • 16.Friston KJ, Holmes AP, Worsley KJ, Poline J-P, Frith CD, Frackowiak RS. 1994. Statistical parametric maps in functional imaging: a general linear approach. Hum. Brain Mapp. 2, 189–210. ( 10.1002/hbm.460020402) [DOI] [Google Scholar]
  • 17.Worsley KJ, Friston KJ. 1995. Analysis of fMRI time-series revisited--again. Neuroimage. 23, 173–181. ( 10.1006/nimg.1995.1023) [DOI] [PubMed] [Google Scholar]
  • 18.Nordhausen CT, Maynard EM, Normann RA. 1996. Single unit recording capabilities of a 100 microelectrode array. Brain Res. 726, 129–140. ( 10.1016/0006-8993(96)00321-6) [DOI] [PubMed] [Google Scholar]
  • 19.Baker JT, Patel GH, Corbetta M, Snyder LH. 2006. Distribution of activity across the monkey cerebral cortical surface, thalamus and midbrain during rapid, visually guided saccades. Cereb. Cortex. 16, 447–459. ( 10.1093/cercor/bhi124) [DOI] [PubMed] [Google Scholar]
  • 20.Camerer CF, Loewenstein G, Prelec D. 2004. Neuroeconomics: why economics needs brains. Scand. J. Econ. 106, 555–579. ( 10.1111/j.0347-0520.2004.00377.x) [DOI] [Google Scholar]
  • 21.Camerer C, Loewenstein G, Prelec D. 2005. Neuroeconomics: how neuroscience can inform economics. J. Econ. Lit. 43, 9–64. ( 10.1257/0022051053737843) [DOI] [Google Scholar]
  • 22.Rustichini A. 2009. Neuroeconomics: what have we found, and what should we search for. Curr. Opin Neurobiol. 19, 672–677. ( 10.1016/j.conb.2009.09.012) [DOI] [PubMed] [Google Scholar]
  • 23.Chua HF, Gonzalez R, Taylor SF, Welsh RC, Liberzon I. 2009. Decision-related loss: regret and disappointment. Neuroimage 47, 2031–2040. ( 10.1016/j.neuroimage.2009.06.006) [DOI] [PubMed] [Google Scholar]
  • 24.Coricelli G, Critchley HD, Joffily M, O'Doherty JP, Sirigu A, Dolan RJ. 2005. Regret and its avoidance: a neuroimaging study of choice behavior. Nat. Neurosci. 8, 1255–1262. ( 10.1038/nn1514) [DOI] [PubMed] [Google Scholar]
  • 25.De Martino B, Kumaran D, Seymour B, Dolan RJ. 2006. Frames, biases, and rational decision-making in the human brain. Science 313, 684–687. ( 10.1126/science.1128356) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26.Lebreton M, Jorge S, Michel V, Thirion B, Pessiglione M. 2009. An automatic valuation system in the human brain: evidence from functional neuroimaging. Neuron 64, 431–439. ( 10.1016/j.neuron.2009.09.040) [DOI] [PubMed] [Google Scholar]
  • 27.Bartra O, McGuire JT, Kable JW. 2013. The valuation system: a coordinate-based meta-analysis of BOLD fMRI experiments examining neural correlates of subjective value. Neuroimage 7, 412–427. ( 10.1016/j.neuroimage.2013.02.063) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.Clithero JA, Rangel A. 2014. Informatic parcellation of the network involved in the computation of subjective value. Soc. Cogn. Affect. Neurosci. 9, 1289–1302. ( 10.1093/scan/nst106) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29.Palminteri S, Khamassi M, Joffily M, Coricelli G. 2015. Contextual modulation of value signals in reward and punishment learning. Nat. Commun. 6, 1–14. ( 10.1038/ncomms9096) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30.Baillon A, Bleichrodt H, Spinu V. 2020. Searching for the reference point. Manag. Sci. 66, 93–112. ( 10.1287/mnsc.2018.3224) [DOI] [Google Scholar]
  • 31.Palminteri S, Pessiglione M. 2017. Opponent brain systems for reward and punishment learning: causal evidence from drug and lesion studies in humans. In Decision neuroscience (eds Dreher J-C, Tremblay L), pp. 291–303. London, UK: Elsevier. [Google Scholar]
  • 32.Pessiglione M, Delgado MR. 2015. The good, the bad and the brain: neural correlates of appetitive and aversive values underlying decision making. Curr. Opin. Behav. Sci. 5, 78–84. ( 10.1016/j.cobeha.2015.08.006) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33.Tom SM, Fox CR, Trepel C, Poldrack RA. 2007. The neural basis of loss aversion in decision-making under risk. Science 315, 515–518. ( 10.1126/science.1134239) [DOI] [PubMed] [Google Scholar]
  • 34.Bunge M. 2014. The mind–body problem: a psychobiological approach. Oxford, UK: Pergamon Press. [Google Scholar]
  • 35.Glimcher PW. 2011. Foundations of neuroeconomic analysis. Oxford, UK: Oxford University Press. [Google Scholar]
  • 36.Padoa-Schioppa C, Rustichini A. 2014. Rational attention and adaptive coding: a puzzle and a solution. Am. Econ. Rev. 104, 507–513. ( 10.1257/aer.104.5.507) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37.Goense J, Bohraus Y, Logothetis NK. 2016. fMRI at high spatial resolution: implications for BOLD-models. Front. Comput. Neurosci. 10, 66 ( 10.3389/fncom.2016.00066/abstract) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38.Iranpour J, Morrot G, Claise B, Jean B, Bonny J-M. 2015. Using high spatial resolution to improve BOLD fMRI detection at 3T. PLoS ONE 10, e0141358 ( 10.1371/journal.pone.0141358) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39.Kayser C, Logothetis NK. 2010. The electrophysiological background of the fMRI signal. In fMRI (eds Ulmer S, Jansen O), pp. 23–33. Berlin, Germany: Springer. [Google Scholar]
  • 40.Logothetis NK. 2008. What we can do and what we cannot do with fMRI. Nature 453, 869–878. ( 10.1038/nature06976) [DOI] [PubMed] [Google Scholar]
  • 41.Zaghloul KA, Blanco JA, Weidemann CT, McGill K, Jaggi JL, Baltuch GH, Kahana MJ. 2009. Human substantia nigra neurons encode unexpected financial rewards. Science 323, 1496–1499. ( 10.1126/science.1167342) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42.Jacobsen CF, Nissen HW. 1937. Studies of cerebral function in primates. IV. The effects of frontal lobe lesions on the delayed alternation habit in monkeys. J. Comp. Psychol. 23, 101–112. ( 10.1037/h0056632) [DOI] [Google Scholar]
  • 43.Kennerley SW, Walton ME. 2011. Decision making and reward in frontal cortex: complementary evidence from neurophysiological and neuropsychological studies. Behav. Neurosci. 125, 297 ( 10.1037/a0023575) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 44.Addessi E, Beran MJ, Bourgeois-Gironde S, Brosnan SF, Leca J-B. 2020. Are the roots of human economic systems shared with non-human primates? Neurosci. Biobehav. Rev. 109, 1–15. ( 10.1016/j.neubiorev.2019.12.026) [DOI] [PubMed] [Google Scholar]
  • 45.Fouragnan EF, et al. 2019. The macaque anterior cingulate cortex translates counterfactual choice value into actual behavioral change. Nat. Neurosci. 22, 797–808. ( 10.1038/s41593-019-0375-6) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 46.Ruff CC, Driver J, Bestmann S. 2009. Combining TMS and fMRI: from ‘virtual lesions’ to functional-network accounts of cognition. Cortex 45, 1043–1049. ( 10.1016/j.cortex.2008.10.012) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 47.Gläscher J, Adolphs R, Damasio H, Bechara A, Rudrauf D, Calamia M, Paul LK, Tranel D. 2012. Lesion mapping of cognitive control and value-based decision making in the prefrontal cortex. Proc. Natl Acad. Sci. USA 109, 14 681–14 686. ( 10.1073/pnas.1206608109) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 48.Si Y, et al. 2018. Different decision-making responses occupy different brain networks for information processing: a study based on EEG and TMS. Cerebral Cortex 29, 4119–4129. ( 10.1093/cercor/bhy294) [DOI] [PubMed] [Google Scholar]
  • 49.Goldstein WM, Hogarth RM. 1997. Judgment and decision research: some historical context. In Research on judgment and decision making: currents, connections, and controversies (eds WM Goldstein, RM Hogarth), pp. 3–65. Cambridge, UK: Cambridge University Press.
  • 50.Savage LJ. 1972. The foundations of statistics. New York, NY: Dover Publications Inc. [Google Scholar]
  • 51.Lichtenstein S, Slovic P. 2006. The construction of preference. Cambridge, UK: Cambridge University Press. [Google Scholar]
  • 52.Platt ML, Glimcher PW. 1999. Neural correlates of decision variables in parietal cortex. Nature 400, 233–238. ( 10.1038/22268) [DOI] [PubMed] [Google Scholar]
  • 53.Fiorillo CD, Tobler PN, Schultz W. 2003. Discrete coding of reward probability and uncertainty by dopamine neurons. Science 299, 1898–1902. ( 10.1126/science.1077349) [DOI] [PubMed] [Google Scholar]
  • 54.Wu G, Gonzalez R. 1996. Curvature of the probability weighting function. Manag. Sci. 42, 1676–1690. ( 10.1287/mnsc.42.12.1676) [DOI] [Google Scholar]
  • 55.Hertwig R, Barron G, Weber EU, Erev I. 2004. Decisions from experience and the effect of rare events in risky choice. Psychol. Sci. 15, 534–539. ( 10.1111/j.0956-7976.2004.00715.x) [DOI] [PubMed] [Google Scholar]
  • 56.Pessiglione M, Seymour B, Flandin G, Dolan RJ, Frith CD. 2006. Dopamine-dependent prediction errors underpin reward-seeking behaviour in humans. Nature 442, 1042–1045. ( 10.1038/nature05051) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 57.Fisher AJ, Medaglia JD, Jeronimus BF. 2018. Lack of group-to-individual generalizability is a threat to human subjects research. Proc. Natl Acad. Sci. USA 115, E6106–E6115. ( 10.1073/pnas.1711978115) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 58.Ludvig EA, Spetch ML. 2011. Of black swans and tossed coins: is the description-experience gap in risky choice limited to rare events? PLoS ONE 6, e20262 ( 10.1371/journal.pone.0020262) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 59.Thompson WR. 1933. On the likelihood that one unknown probability exceeds another in view of the evidence of two samples. Biometrika 25, 285–294. ( 10.1093/biomet/25.3-4.285) [DOI] [Google Scholar]
  • 60.Slivkins A. 2019. Introduction to multi-armed bandits. arXiv 190407272.
  • 61.Sutton RS, Barto AG. 2018. Reinforcement learning: an introduction. Cambridge, MA: MIT Press. [Google Scholar]
  • 62.Frank MJ, Seeberger LC, O'Reilly RC. 2004. By carrot or by stick: cognitive reinforcement learning in Parkinsonism. Science 306, 1940–1943. ( 10.1126/science.1102941) [DOI] [PubMed] [Google Scholar]
  • 63.Barron G, Erev I. 2003. Small feedback-based decisions and their limited correspondence to description-based decisions. J. Behav. Decis. Mak. 16, 215–233. ( 10.1002/bdm.443) [DOI] [Google Scholar]
  • 64.Weber EU, Shafir S, Blais A-R. 2004. Predicting risk sensitivity in humans and lower animals: risk as variance or coefficient of variation. Psychol. Rev. 111, 430 ( 10.1037/0033-295X.111.2.430) [DOI] [PubMed] [Google Scholar]
  • 65.Hertwig R, Erev I. 2009. The description–experience gap in risky choice. Trends Cogn. Sci. 13, 517–523. ( 10.1016/j.tics.2009.09.004) [DOI] [PubMed] [Google Scholar]
  • 66.Wulff DU, Mergenthaler-Canseco M, Hertwig R. 2018. A meta-analytic review of two modes of learning and the description-experience gap. Psychol. Bull. 144, 140 ( 10.1037/bul0000115) [DOI] [PubMed] [Google Scholar]
  • 67.Madan CR, Ludvig EA, Spetch ML. 2019. Comparative inspiration: from puzzles with pigeons to novel discoveries with humans in risky choice. Behav. Processes 160, 10–19. ( 10.1016/j.beproc.2018.12.009) [DOI] [PubMed] [Google Scholar]
  • 68.Abdellaoui M, Bleichrodt H, Kammoun H. 2013. Do financial professionals behave according to prospect theory? An experimental study. Theory Decis. 74, 411–429. ( 10.1007/s11238-011-9282-3) [DOI] [Google Scholar]
  • 69.Prelec D. 1998. The probability weighting function. Econometrica 66, 497 ( 10.2307/2998573) [DOI] [Google Scholar]
  • 70.Loomes G, Sugden R. 1982. Regret theory: an alternative theory of rational choice under uncertainty. Econ. J. 92, 805–824. ( 10.2307/2232669) [DOI] [Google Scholar]
  • 71.Quiggin J. 2012. Generalized expected utility theory: the rank-dependent model. Dordrecht, The Netherlands: Springer. [Google Scholar]
  • 72.Lefebvre G, Lebreton M, Meyniel F, Bourgeois-Gironde S, Palminteri S. 2017. Behavioural and neural characterization of optimistic reinforcement learning. Nat. Hum. Behav. 1, 1–9. ( 10.1038/s41562-017-0067) [DOI] [Google Scholar]
  • 73. Chambon V, Théro H, Vidal M, Vandendriessche H, Haggard P, Palminteri S. 2020. Information about action outcomes differentially affects learning from self-determined versus imposed choices. Nat. Hum. Behav. 4, 1067–1079. (doi:10.1038/s41562-020-0919-5)
  • 74. Palminteri S, Lefebvre G, Kilford EJ, Blakemore S-J. 2017. Confirmation bias in human reinforcement learning: evidence from counterfactual feedback processing. PLoS Comput. Biol. 13, e1005684. (doi:10.1371/journal.pcbi.1005684)
  • 75. Niv Y, Edlund JA, Dayan P, O'Doherty JP. 2012. Neural prediction errors reveal a risk-sensitive reinforcement-learning process in the human brain. J. Neurosci. 32, 551–562. (doi:10.1523/JNEUROSCI.5498-10.2012)
  • 76. Tobler PN, Christopoulos GI, O'Doherty JP, Dolan RJ, Schultz W. 2008. Neuronal distortions of reward probability without choice. J. Neurosci. 28, 11703–11711. (doi:10.1523/JNEUROSCI.2870-08.2008)
  • 77. Ludvig EA, Madan CR, McMillan N, Xu Y, Spetch ML. 2018. Living near the edge: how extreme outcomes and their neighbors drive risky choice. J. Exp. Psychol. Gen. 147, 1905–1918. (doi:10.1037/xge0000414)
  • 78. Jessup RK, Bishara AJ, Busemeyer JR. 2008. Feedback produces divergence from prospect theory in descriptive choice. Psychol. Sci. 19, 1015–1022. (doi:10.1111/j.1467-9280.2008.02193.x)
  • 79. Erev I, Ert E, Plonsky O, Cohen D, Cohen O. 2017. From anomalies to forecasts: toward a descriptive model of decisions under risk, under ambiguity, and from experience. Psychol. Rev. 124, 369. (doi:10.1037/rev0000062)
  • 80. McCoy AN, Platt ML. 2005. Risk-sensitive neurons in macaque posterior cingulate cortex. Nat. Neurosci. 8, 1220–1227. (doi:10.1038/nn1523)
  • 81. Hayden BY, Platt ML. 2007. Temporal discounting predicts risk sensitivity in rhesus macaques. Curr. Biol. 17, 49–53. (doi:10.1016/j.cub.2006.10.055)
  • 82. Hayden BY, Heilbronner SR, Nair AC, Platt ML. 2008. Cognitive influences on risk-seeking by rhesus macaques. Judgm. Decis. Mak. 3, 389.
  • 83. Long AB, Kuhn CM, Platt ML. 2009. Serotonin shapes risky decision making in monkeys. Soc. Cogn. Affect. Neurosci. 4, 346–356. (doi:10.1093/scan/nsp020)
  • 84. Watson KK, Ghodasra JH, Platt ML. 2009. Serotonin transporter genotype modulates social reward and punishment in rhesus macaques. PLoS ONE 4, e4156. (doi:10.1371/journal.pone.0004156)
  • 85. O'Neill M, Schultz W. 2010. Coding of reward risk by orbitofrontal neurons is mostly distinct from coding of reward value. Neuron 68, 789–800. (doi:10.1016/j.neuron.2010.09.031)
  • 86. Heilbronner S, Hayden BY, Platt M. 2011. Decision salience signals in posterior cingulate cortex. Front. Neurosci. 5, 55. (doi:10.3389/fnins.2011.00055)
  • 87. Kim S, Bobeica I, Gamo NJ, Arnsten AF, Lee D. 2012. Effects of α-2A adrenergic receptor agonist on time and risk preference in primates. Psychopharmacology 219, 363–375. (doi:10.1007/s00213-011-2520-0)
  • 88. Heilbronner SR, Hayden BY. 2013. Contextual factors explain risk-seeking preferences in rhesus monkeys. Front. Neurosci. 7, 7. (doi:10.3389/fnins.2013.00007)
  • 89. Xu ER, Kralik JD. 2014. Risky business: rhesus monkeys exhibit persistent preferences for risky options. Front. Psychol. 5, 258. (doi:10.3389/fpsyg.2014.00258)
  • 90. Smith TR, Beran MJ, Young ME. 2017. Gambling in rhesus macaques (Macaca mulatta): the effect of cues signaling risky choice outcomes. Learn. Behav. 45, 288–299. (doi:10.3758/s13420-017-0270-5)
  • 91. Hayden B, Heilbronner S, Platt M. 2010. Ambiguity aversion in rhesus macaques. Front. Neurosci. 4, 166. (doi:10.3389/fnins.2010.00166)
  • 92. So N-Y, Stuphorn V. 2010. Supplementary eye field encodes option and action value for saccades with variable reward. J. Neurophysiol. 104, 2634–2653. (doi:10.1152/jn.00430.2010)
  • 93. Yamada H, Tymula A, Louie K, Glimcher PW. 2013. Thirst-dependent risk preferences in monkeys identify a primitive form of wealth. Proc. Natl Acad. Sci. USA 110, 15788–15793. (doi:10.1073/pnas.1308718110)
  • 94. Raghuraman AP, Padoa-Schioppa C. 2014. Integration of multiple determinants in the neuronal computation of economic values. J. Neurosci. 34, 11583–11603. (doi:10.1523/JNEUROSCI.1235-14.2014)
  • 95. Stauffer WR, Lak A, Bossaerts P, Schultz W. 2015. Economic choices reveal probability distortion in macaque monkeys. J. Neurosci. 35, 3146–3154. (doi:10.1523/JNEUROSCI.3653-14.2015)
  • 96. Farashahi S, Azab H, Hayden B, Soltani A. 2018. On the flexibility of basic risk attitudes in monkeys. J. Neurosci. 38, 4383–4398. (doi:10.1523/JNEUROSCI.2260-17.2018)
  • 97. Chen X, Stuphorn V. 2018. Inactivation of medial frontal cortex changes risk preference. Curr. Biol. 28, 3114–3122.e4. (doi:10.1016/j.cub.2018.07.043)
  • 98. Nioche A, Bourgeois-Gironde S, Boraud T. 2019. An asymmetry of treatment between lotteries involving gains and losses in rhesus monkeys. Sci. Rep. 9, 1–13. (doi:10.1038/s41598-019-46975-2)
  • 99. Ferrari-Toniolo S, Bujold PM, Schultz W. 2019. Probability distortion depends on choice sequence in rhesus monkeys. J. Neurosci. 39, 2915–2929. (doi:10.1523/JNEUROSCI.1454-18.2018)
  • 100. Eisenreich BR, Hayden BY, Zimmermann J. 2019. Macaques are risk-averse in a freely moving foraging task. Sci. Rep. 9, 15091. (doi:10.1038/s41598-019-51442-z)
  • 101. Farashahi S, Donahue CH, Hayden BY, Lee D, Soltani A. 2019. Flexible combination of reward information across primates. Nat. Hum. Behav. 3, 1215–1224. (doi:10.1038/s41562-019-0714-3)
  • 102. Heilbronner SR, Hayden BY. 2016. The description–experience gap in risky choice in nonhuman primates. Psychon. Bull. Rev. 23, 593–600. (doi:10.3758/s13423-015-0924-2)
  • 103. Niv Y. 2009. Reinforcement learning in the brain. J. Math. Psychol. 53, 139–154. (doi:10.1016/j.jmp.2008.12.005)
  • 104. Daw ND, Doya K. 2006. The computational neurobiology of learning and reward. Curr. Opin. Neurobiol. 16, 199–204. (doi:10.1016/j.conb.2006.03.006)
  • 105. Croson R, Sundali J. 2005. The gambler's fallacy and the hot hand: empirical data from casinos. J. Risk Uncertain. 30, 195–209. (doi:10.1007/s11166-005-1153-2)
  • 106. Blanchard TC, Wilke A, Hayden BY. 2014. Hot-hand bias in rhesus monkeys. J. Exp. Psychol. Anim. Learn. Cogn. 40, 280–286. (doi:10.1037/xan0000033)
  • 107. Sescousse G, Caldú X, Segura B, Dreher J-C. 2013. Processing of primary and secondary rewards: a quantitative meta-analysis and review of human functional neuroimaging studies. Neurosci. Biobehav. Rev. 37, 681–696. (doi:10.1016/j.neubiorev.2013.02.002)
  • 108. Amiez C, Neveu R, Warrot D, Petrides M, Knoblauch K, Procyk E. 2013. The location of feedback-related activity in the midcingulate cortex is predicted by local morphology. J. Neurosci. 33, 2217–2228. (doi:10.1523/JNEUROSCI.2779-12.2013)
  • 109. Hayden BY, Platt ML. 2009. Gambling for Gatorade: risk-sensitive decision making for fluid rewards in humans. Anim. Cogn. 12, 201–207. (doi:10.1007/s10071-008-0186-8)
  • 110. Levy DJ, Glimcher PW. 2012. The root of all value: a neural common currency for choice. Curr. Opin. Neurobiol. 22, 1027–1038. (doi:10.1016/j.conb.2012.06.001)
  • 111. Barberis NC. 2013. Thirty years of prospect theory in economics: a review and assessment. J. Econ. Perspect. 27, 173–196. (doi:10.1257/jep.27.1.173)
  • 112. Markowitz H. 1952. The utility of wealth. J. Polit. Econ. 60, 151–158. (doi:10.1086/257177)
  • 113. Wood W, Neal DT. 2007. A new look at habits and the habit-goal interface. Psychol. Rev. 114, 843–863. (doi:10.1037/0033-295X.114.4.843)
  • 114. Schultz W, Dayan P, Montague PR. 1997. A neural substrate of prediction and reward. Science 275, 1593–1599. (doi:10.1126/science.275.5306.1593)
  • 115. FitzGerald TH, Seymour B, Bach DR, Dolan RJ. 2010. Differentiable neural substrates for learned and described value and risk. Curr. Biol. 20, 1823–1829. (doi:10.1016/j.cub.2010.08.048)
  • 116. Wimmer GE, Poldrack RA. 2020. Reward learning and working memory: effects of massed versus spaced training and post-learning delay period. bioRxiv, 997098. (doi:10.1101/2020.03.19.997098)

Associated Data

Data Availability Statement

This article has no additional data.


Articles from Philosophical Transactions of the Royal Society B: Biological Sciences are provided here courtesy of The Royal Society