Author manuscript; available in PMC: 2009 May 18.
Published in final edited form as: Anim Cogn. 2008 Aug 22;12(1):201–207. doi: 10.1007/s10071-008-0186-8

Gambling for Gatorade: risk-sensitive decision making for fluid rewards in humans

Benjamin Y Hayden, Michael L Platt
PMCID: PMC2683409  NIHMSID: NIHMS98255  PMID: 18719953

Abstract

Determining how both humans and animals make decisions in risky situations is a central problem in economics, experimental psychology, behavioral economics, and neurobiology. Typically, humans are risk averse for gains and risk seeking for losses, while animals may display a variety of preferences under risk depending on, amongst other factors, internal state. Such differences in behavior may reflect major cognitive and cultural differences or they may reflect differences in the way risk sensitivity is probed in humans and animals. Notably, in most studies humans make one or a few choices amongst hypothetical or real monetary options, while animals make dozens of repeated choices amongst options offering primary rewards like food or drink. To address this issue, we probed risk-sensitive decision making in human participants using a paradigm modeled on animal studies, in which rewards were either small squirts of Gatorade or small amounts of real money. Possible outcomes and their probabilities were not made explicit in either case. We found that individual patterns of decision making were strikingly similar for juice and for money, both in overall risk preferences and in trial-to-trial effects of reward outcome on choice. Comparison with decisions made by monkeys for juice in a similar task revealed highly similar gambling styles. These results unite known patterns of risk-sensitive decision making in human and nonhuman primates and suggest that factors such as the way a decision is framed or internal state may underlie observed variation in risk preferences between and within species.

Keywords: Risk, Primary reinforcer, Neuroeconomics

Introduction

Uncertainty characterizes most of the decisions humans and other animals confront in their daily lives (Kahneman and Tversky 2000; Stephens and Krebs 1986). Within economics, uncertainty is often subdivided into risk, in which the distribution of possible outcomes is known, and ambiguity, in which the distribution of possible outcomes is unknown (Knight 1921). In real life, such distinctions may not be so readily apparent, particularly for nonhuman animals. Nonetheless, a distinct pattern of behavior has emerged from decades of study of the influence of risk on decision making in humans and other animals. Typically, humans are risk averse for gains and risk seeking for losses (Kahneman and Tversky 2000). In contrast, risk preferences in animals are more variable. Although many studies report risk aversion (Stephens and Krebs 1986), risk seeking for primary rewards varying in magnitude has been reported for monkeys (McCoy and Platt 2005) and rats (Rachlin 2000), risk seeking for rewards varying in delay has been found in several species (Kacelnik and Bateson 1996), and risk neutrality has been reported for some species as well (e.g., Shafir et al. 1999). In addition, physiological variables such as hunger or thirst can promote risk seeking under some conditions (Caraco 1981; Schuck-Paim et al. 2004).

Differences in risk-sensitive decision making observed in humans and other animals may reflect important species differences, the effects of physiological state on reward evaluation and choice, or simply differences in the tasks used. For example, humans and other animals clearly differ in cognitive and cultural abilities such as advanced language, planning, and delayed gratification skills, as well as the use of an arbitrary monetary system. On the other hand, individual risk-sensitive behavior within a given species may reflect parameters related to the gamble under consideration. For example, although humans are thought to be risk averse for gains, repeated gambles may promote risk-seeking behavior (Samuelson 1963). Moreover, risk preferences in humans deviate from strict aversion when hypothetical rewards are learned through experience rather than presented explicitly (Hertwig et al. 2004). Finally, state variables such as hunger or thirst in animals, or financial wealth in humans, may make important contributions to decision making, yet these factors typically are not explicitly controlled between studies (Schuck-Paim et al. 2004). These observations suggest the possibility that task design and decision framing may contribute as much as inherent species differences in cognition to preferences displayed by human and nonhuman animals in risky decision contexts.

There are several factors of task design that distinguish human and animal studies, which may influence risk sensitivity. Animals typically learn about reward outcomes and their associated probabilities through repeated gambles rather than through explicit symbolic presentation of information. This repetition may influence risk seeking by biasing animals to choose between long-term strategies rather than make individual choices between two immediate options (Hayden and Platt 2007; Rachlin 2000; Rachlin et al. 1991). Furthermore, repetition of responses may lead to changes in internal state via satiety; such changes have been shown to influence decisions (Schuck-Paim et al. 2004; Caraco 1981). In addition, most animals do not use symbolic currencies (but see Chen et al. 2006), so their rewards must be primary (almost always food or fluid), and they are typically given immediate reinforcement. In contrast, in most studies with repeated choices by humans, gambles are not resolved until after the experimental session is over (e.g., Huettel et al. 2006). The availability of immediate feedback may influence risk sensitivity by fostering reliance on recent information (Hertwig et al. 2004).

To directly test the idea that task parameters strongly influence patterns of risk-sensitive decision making between species, we studied decision making in humans using methods designed to mirror, as closely as possible, conditions previously used to study risk-sensitive decision making in monkeys (McCoy and Platt 2005; Hayden and Platt 2007). These methods share several features in common with other animal studies (Kacelnik and Bateson 1996; Shafir et al. 1999). Specifically, we used real (i.e., non-hypothetical) primary rewards. As in our studies of monkeys, the reward we used was a flavored sugary juice (in this case, Gatorade) that was consumed immediately. Furthermore, reward probability and magnitude were unsignaled and learned through experience, and gambles were repeated many times. Our aim was to determine whether, when placed in the same conditions, humans would exhibit behavior similar to that of monkeys. For comparison, in alternating blocks of trials, subjects were rewarded with small amounts of money. To our knowledge, this is the first study to examine risk-sensitive decision making by humans for primary rewards.

We hypothesized that the primary determinants of risk-sensitive behavior are specific factors of task design, including reward type, the immediacy of the reward, and whether one gamble is resolved before the next occurs. Such results would be consistent with earlier findings that risk sensitivity reflects the way in which probabilities are learned (Hertwig et al. 2004), the delay between choices (Hayden and Platt 2007) and the number of choices (Samuelson 1963). We specifically hypothesized that people would be risk seeking for juice, just like monkeys. Moreover, we predicted that people would show more risk seeking for monetary rewards when there were repeated gambles than in typical one-shot studies. Finally, we hypothesized that people would be more likely to gamble again after receiving a large reward than a small reward, a variant of a win–stay lose–shift strategy, as we observed previously in monkeys (McCoy and Platt 2005).

Methods

Subjects and apparatus

Fifty adults (ages 18–32, mean age 21.4, SD 3.7 years; 27 male, 23 female) participated and gave informed consent. Ten of the subjects performed the fluid discrimination task (see below) while the other 40 performed the gambling task. The study was approved by the Duke University Medical School IRB. The task was run in a 1-h session that paid $12–18 depending on gamble outcomes. Testing was performed on PCs using Matlab and Psychtoolbox (Brainard 1997). Experiments were performed in an anechoic chamber. Each subject was provided with a fresh section of plastic tubing (American Plastics) that was placed in the mouth. A small piece of tubing was attached in parallel to the first so that sucking would not provide more fluid from the tube. Flow was controlled by a solenoid valve connected to the PC (interfaced through Board # 6025, National Instruments). Mild thirst was induced by providing subjects a small bag of pretzels or corn chips before the task. Subjects performed fluid and money trials in ten-trial blocks, randomly interleaved.

We initially performed a discrimination task on ten subjects to verify that they could distinguish the rewards used (Fig. 1). In this task, subjects performed a series of discriminations between two sequentially presented volumes of juice, separated by a 1-s interval, by indicating which of the two volumes was larger. The delay between trials was about 15 s, and lasted until the subject indicated he or she was ready to begin the next trial. Each subject performed 50 trials and was given feedback at the end of the testing session. The reward volumes used in the discrimination experiment were sampled from a larger range of possible volumes than were used in the subsequent experiment. One of the two volumes was 0.75 s (1.125 mL) while the other varied from it by anywhere from 4 to 48%.

Fig. 1.

a Schematic of gambling task. Top row: fluid gambling task. Subjects were initially presented with a blank screen. Following a 3-s delay, two targets appeared. Both targets were identical. Subjects could then choose one of the two targets by pressing either the left or the right key on the keyboard. Following the choice, a fluid reward was delivered into the subject’s mouth via a juice tube. Each trial was then followed by a short delay. Bottom row: monetary gambling task. The task was identical to the fluid task, except that the targets were a different color and the reward was announced via text on the monitor but was not given immediately. Trial types were switched at random in blocks of 10 trials. b Population data for discrimination ability for juice volumes. Subjects performed a two-alternative forced-choice task in which they had to identify the larger of the two juice volumes. Accuracy increased as the relative size of the two squirts increased. The two vertical lines indicate the relative values used in the low and high-risk conditions of the gambling task.

Gambling task

Subjects were seated in front of a computer monitor and provided with a plastic tube. On each trial, subjects were presented with two squares on the monitor (~5°). On fluid trials the squares were green; on money trials they were red. Subjects could press either a left arrow or a right arrow on the keyboard (no time limit) to obtain either a risky or safe payoff. To minimize confusion, the position of the risky option was fixed for each subject, but selected randomly on a subject-by-subject basis. We observed no population side-bias effects.

On fluid trials, the safe option offered 0.75 s (1.125 mL) fluid while the risky option offered an even gamble between a higher and lower volume. These values were 0.5 s (0.75 mL) and 1 s (1.5 mL) in the high-risk condition and 0.625 s (0.94 mL) and 0.875 s (1.32 mL) in the low-risk condition. On money trials, the safe option offered 12 cents and the risky option offered an even gamble between a larger and a smaller payoff. These payoffs were 10 and 14 cents in the low-risk condition, 7 and 17 cents in the medium-risk condition, and 3 and 21 cents in the high-risk condition. Trials were run in blocks of 10. Blocks of all types (juice/money and risk level) were mixed randomly, and all subjects participated in all block types. Overall, 200 trials (100 juice and 100 monetary) were performed. Each trial lasted about 3 s and the inter-trial interval was 1 s. These durations are nearly identical to those used in our studies of monkeys (McCoy and Platt 2005; Hayden and Platt 2007).
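The payoff structure described above can be checked numerically. The following is a minimal sketch, not the authors' code: it takes the valve-open durations and cent values listed in the text, treats each risky option as a 50/50 gamble, and confirms that every risky option has the same expected value as its safe counterpart. The function and variable names are illustrative assumptions, and the coefficient of variation is included only because it is one common way to quantify risk.

```python
from statistics import mean, pstdev

# Payoffs as listed in the text. Fluid rewards are valve-open durations in
# seconds; money rewards are in cents. Each risky outcome has probability 0.5.
SAFE = {"fluid": 0.75, "money": 12}
RISKY = {
    ("fluid", "low"): (0.625, 0.875),
    ("fluid", "high"): (0.5, 1.0),
    ("money", "low"): (10, 14),
    ("money", "medium"): (7, 17),
    ("money", "high"): (3, 21),
}


def gamble_stats(outcomes):
    """Return (expected value, SD, coefficient of variation) of an even gamble."""
    ev = mean(outcomes)
    # Population SD: the listed outcomes are the entire distribution.
    sd = pstdev(outcomes)
    return ev, sd, sd / ev


if __name__ == "__main__":
    for (reward, level), outcomes in RISKY.items():
        ev, sd, cv = gamble_stats(outcomes)
        matches = abs(ev - SAFE[reward]) < 1e-12
        print(f"{reward:5s} {level:6s} EV={ev:.3f} SD={sd:.3f} CV={cv:.3f} "
              f"equal to safe option: {matches}")
```

Because the safe and risky options are matched in expected value, any systematic preference between them reflects sensitivity to variance rather than to mean payoff.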

Results

Sensitivity to risk in fluid and monetary rewards

We investigated the appeal of risky options in a group of 40 subjects. Following convention, we define risk aversion as a statistically significant tendency to choose the safer option in a task offering two options with identical expected values, and risk seeking as a corresponding tendency to choose the risky option (McCoy and Platt 2005). In the fluid gambling task, 21 subjects were risk averse (13 significant, binomial test, t100 > 2.0, P < 0.05, n = 100 trials), while 19 were risk seeking (9 significant). In the monetary gambling task, 24 subjects were risk averse (15 significant), while 16 subjects were risk seeking (5 significant). Across all subjects, there was no significant bias towards either risk seeking or risk aversion for fluid rewards (mean likelihood of choosing the risky option was 0.49, Student’s t test, t40 = 0.42, P = 0.68, n = 40 subjects, Fig. 2c). There was a weak but significant bias towards risk aversion for monetary rewards (mean likelihood of choosing the risky option was 0.46, Student’s t test, t40 = 2.11, P = 0.041, n = 40 subjects, Fig. 2b). These data thus document a diversity of risk preferences in humans, consistent with previous studies (Huettel et al. 2006; Reboreda and Kacelnik 1991; Kuhnen and Knutson 2005). This diversity is illustrated in histograms of preference levels for the individual subjects (Fig. 2b, c). Subjects with preferences significantly different from chance are shown in black; others are shown in white. We manipulated reward variance, because this parameter has previously been shown to influence risk preferences (McCoy and Platt 2005). However, there was no effect of reward variance on average risk preferences in this study (Student’s t test, t40 < 1.28, P > 0.2 for both money and fluid, n = 40 subjects).
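To make the per-subject classification concrete, here is a minimal sketch of an exact two-sided binomial test against chance (0.5). It is an illustration under stated assumptions, not the authors' analysis: the exact form of the test they ran is not given in the text, and the function names and the 0.05 threshold argument are hypothetical.

```python
from math import comb


def binomial_two_sided_p(k, n):
    """Exact two-sided binomial p-value for k successes in n trials, p = 0.5.

    With p = 0.5 the distribution is symmetric, so the two-sided p-value is
    twice the tail probability beyond the more extreme of k and n - k.
    """
    extreme = max(k, n - k)
    upper_tail = sum(comb(n, i) for i in range(extreme, n + 1)) / 2 ** n
    return min(1.0, 2 * upper_tail)


def classify(k_risky, n_trials, alpha=0.05):
    """Label a subject by direction of preference and whether it is significant."""
    if 2 * k_risky == n_trials:
        return "indifferent", False
    direction = "risk seeking" if 2 * k_risky > n_trials else "risk averse"
    return direction, binomial_two_sided_p(k_risky, n_trials) < alpha


if __name__ == "__main__":
    for k in (39, 45, 55, 61):
        print(k, classify(k, 100), round(binomial_two_sided_p(k, 100), 4))
```

Under this sketch, a subject with 100 trials in one task needs to choose the risky option at least 61 times, or at most 39 times, to be classified as significantly risk seeking or risk averse at the 0.05 level.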

Fig. 2.

Risk preferences generalize across reward types. a Scatter plot showing relative tendency to gamble in the two tasks. Each dot represents a single subject. Horizontal and vertical axes represent propensity to gamble on fluid and monetary gambling tasks, respectively. Dashed diagonal line indicates unity line. Solid diagonal line indicates best fit correlation line. b Population histogram for money risk preferences. Subjects with preferences significantly different from chance are shown in black; others are shown in white. c Population histogram for fluid risk preferences. Conventions as in b.

Despite variation in preferences, each individual subject tended to either prefer or avoid risk for both reward types. Across subjects, the frequency of choosing the risky option was correlated for fluid and monetary rewards (Fig. 2a, bootstrap correlation test, r40 = 0.4972, P < 0.001, n = 40 subjects). Moreover, there was no significant difference in risk preference for the two reward types across the population of subjects (Student’s t test, t40 = 1.20, P = 0.24, n = 40 subjects). These results suggest that propensity to choose the risky option may be more strongly influenced by factors that the two tasks had in common than by the factor that distinguished them, namely, reward type. Thus, these results suggest that task design is a critical factor influencing risk preference (Weber et al. 2004). Nonetheless, we acknowledge that further studies will be needed to fully validate this hypothesis.
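The across-subject correlation can be illustrated with a short sketch. The text does not describe the bootstrap procedure in detail, so what follows is an assumption: a standard percentile bootstrap that resamples subjects with replacement and recomputes Pearson's r each time. The data in the usage section are simulated and hypothetical; they are not the study's data.

```python
import random
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def bootstrap_r_interval(x, y, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for r, resampling subjects as pairs."""
    rng = random.Random(seed)
    n = len(x)
    rs = []
    while len(rs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        ys = [y[i] for i in idx]
        if len(set(xs)) < 2 or len(set(ys)) < 2:
            continue  # r is undefined when a resample is constant
        rs.append(pearson_r(xs, ys))
    rs.sort()
    lo = rs[int((alpha / 2) * n_boot)]
    hi = rs[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


if __name__ == "__main__":
    # Hypothetical data: 40 simulated subjects whose fluid and money
    # preferences share a common trait plus independent noise.
    rng = random.Random(1)
    trait = [rng.uniform(0.3, 0.7) for _ in range(40)]
    fluid = [t + rng.gauss(0, 0.1) for t in trait]
    money = [t + rng.gauss(0, 0.1) for t in trait]
    print("r =", round(pearson_r(fluid, money), 3))
    print("95% bootstrap interval:", bootstrap_r_interval(fluid, money))
```

An interval that excludes zero would indicate that the tendency to gamble for one reward type predicts the tendency to gamble for the other.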

Local patterns of choice behavior

We showed previously that monkeys are more likely to choose the risky option again after they have received a larger than average reward than after they have received a small reward for the same choice (McCoy and Platt 2005). This behavioral pattern is similar to a win-stay-lose-shift strategy (Barraclough et al. 2004; Sugrue et al. 2005) and is consistent with reinforcement learning (Sutton and Barto 1998). To examine the dependence of risky choices on the recent history of rewards, we compared risk-seeking behavior following different trial outcomes, for both fluid and monetary reward. In these analyses, we combined data over risk levels.
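The trial-to-trial analysis described here amounts to computing the probability of a risky choice conditional on the outcome of the preceding trial. A minimal sketch follows; the data layout (a list of choice and outcome labels per trial) and the label names are assumptions made for illustration, and the example session is invented rather than taken from the study.

```python
from collections import defaultdict


def p_risky_after(trials):
    """Probability of choosing the risky option, split by previous outcome.

    `trials` is a list of (choice, outcome) tuples in presentation order,
    with choice in {"risky", "safe"} and outcome in {"large", "small", "safe"}.
    """
    counts = defaultdict(lambda: [0, 0])  # previous outcome -> [risky, total]
    for (_, prev_outcome), (choice, _) in zip(trials, trials[1:]):
        counts[prev_outcome][1] += 1
        if choice == "risky":
            counts[prev_outcome][0] += 1
    return {outcome: risky / total for outcome, (risky, total) in counts.items()}


if __name__ == "__main__":
    # Hypothetical nine-trial session from one subject in one task.
    session = [
        ("risky", "large"), ("risky", "small"), ("safe", "safe"),
        ("risky", "large"), ("risky", "large"), ("safe", "safe"),
        ("safe", "safe"), ("risky", "small"), ("safe", "safe"),
    ]
    print(p_risky_after(session))
```

A win-stay-lose-shift pattern shows up as a higher value for "large" than for "small". In practice the function would be applied separately to each subject and reward type, since fluid and money trials were run in separate blocks.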

The behavior of subjects in the fluid gambling task on a trial-to-trial basis is illustrated in Fig. 3. In each panel, the propensity to choose the risky option (y-axis) is shown as a function of the outcome of the last trial (x-axis). The asymmetric V-shape of the graphs reflects the interaction of two processes: first, the influence of reward variability on propensity to gamble again (the V-shape), and second, the greater draw of wins than of losses (the asymmetry). Overall, subjects were significantly more likely to choose the risky option following a large reward (62% likelihood) than following a small reward (57% likelihood; paired samples t test on individual risk preference levels, t40 = 2.51, P = 0.016, n = 40 subjects). These results demonstrate that humans, like monkeys, have a tendency to use a win-stay-lose-shift strategy in repeated gambling situations, and illustrate the common importance of simple decision-making heuristics for the two species. Subjects were risk seeking following a risky choice, independent of outcome (59%), and risk averse following a safe choice (37%, paired samples t test, t40 = 4.1, P < 0.001). Subjects were significantly less likely to switch following a safe choice (37%) than following a risky choice (41%, paired samples t test, t40 = 2.99, P = 0.0049, n = 40 subjects).

Fig. 3.

Reward history influences risk preference in humans and monkeys. a Plot of propensity to choose the risky option as a function of outcome of previous trial for all subjects. Subjects were more likely to gamble following a large reward than following a small reward. Error bars indicate one standard error above and below the mean. b Same plot as a, but shows only 10 most risk-averse subjects. c Same plot as a, but shows only 10 most risk-seeking subjects. d Plot of propensity to gamble as a function of outcome of previous trial in the monetary gambling task. As in the fluid task, subjects showed a greater propensity to gamble following large rewards than following small rewards. e Plot of propensity to gamble following different outcomes for two monkeys. Behavior of these two monkeys was qualitatively similar to that of the most risk-seeking quartile of humans (Data from McCoy and Platt 2005)

We next considered the range of behavioral patterns observed in the more risk-seeking and more risk-averse subjects. We therefore divided our subject pool into quartiles based on risk preference level, and separately examined the most risk-seeking quartile (Fig. 3c, 10 subjects, mean risk preference 65%) and the most risk-averse quartile (Fig. 3b, 10 subjects, mean risk preference 34%). Both groups of subjects exhibited win-stay-lose-shift patterns. Risk-seeking subjects were significantly more likely to choose the risky option following a large reward (78%) than following a small reward (64%, Student’s t test, t10 = 2.47, P = 0.0354, n = 10 subjects). Risk-averse subjects were also significantly more likely to choose the risky option following a large reward (51%) than following a small reward (39%, Student’s t test, t10 = 2.52, P = 0.0329, n = 10 subjects). To compare choice patterns for primary (juice) and secondary (money) rewards, we examined trial-to-trial choices in the gambling task made by humans working for money (Fig. 3d) using the same mathematical methods used for the juice study. Humans were significantly more likely to choose the risky option following a large reward (62%) than following a smaller reward (44%, Student’s t test, t40 = 5.2, P < 0.0001, n = 40 subjects).
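The quartile split and the within-subject comparison can be sketched as follows. This is an illustrative reconstruction under assumptions, not the authors' procedure: subjects are ranked by overall proportion of risky choices, the bottom and top quarters are selected, and a textbook paired t statistic compares each subject's propensity to gamble after large versus small rewards. The degrees of freedom returned are n − 1; the subscript convention used for t values in the text is not specified. All names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def extreme_quartiles(prefs):
    """Return (indices of most risk-averse quarter, indices of most risk-seeking quarter)."""
    order = sorted(range(len(prefs)), key=prefs.__getitem__)
    q = len(prefs) // 4
    return order[:q], order[-q:]


def paired_t(after_large, after_small):
    """Paired-samples t statistic and degrees of freedom for two matched lists."""
    diffs = [a - b for a, b in zip(after_large, after_small)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


if __name__ == "__main__":
    # Hypothetical per-subject values, for illustration only.
    prefs = [0.10, 0.90, 0.50, 0.40, 0.60, 0.20, 0.80, 0.30]
    averse, seeking = extreme_quartiles(prefs)
    print("most risk-averse subjects:", averse)
    print("most risk-seeking subjects:", seeking)
    print(paired_t([0.8, 0.7, 0.9, 0.6], [0.6, 0.6, 0.7, 0.5]))
```

Converting the t statistic to a P value requires the t distribution, which is omitted here to keep the sketch dependency-free.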

Discussion

We studied choices made by humans in a gambling situation designed to mirror that used in previous studies of risk-sensitive decision making in monkeys (McCoy and Platt 2005; Hayden and Platt 2007). We found that subjects’ decision patterns were surprisingly similar to those made by monkeys. Specifically, neither people nor monkeys were risk averse, and both adopted a win-stay-lose-shift strategy. Moreover, we found that individual preferences for risky fluid and monetary rewards were somewhat correlated, suggesting that factors of task design other than reward type influence choice. The present results are consistent with an emerging portrait of risk-sensitive decision making behavior that is highly similar in humans and animals when tested in similar situations (Hertwig et al. 2004; Weber et al. 2004).

Together, these results suggest that species differences in decision-making behavior often found in different studies may reflect, at least in part, the specific details of the paradigm used. These differences may include the way that options are presented, whether outcomes are learned, changes in internal state, and the nature of the individual being tested, rather than the intrinsic cognitive capacities of the species tested (Weber et al. 2004; Kuhberger et al. 1999; Schuck-Paim et al. 2004). Moreover, attempts to formulate a general theory of risk sensitivity may benefit from attention to evolutionary considerations (Schuck-Paim et al. 2004; Brandstätter et al. 2006).

In a previous study of monkeys performing a nearly identical task (McCoy and Platt 2005), the animals were 87% likely to choose the risky option following a large juice reward and only 65% likely to choose the risky option following a small reward (Fig. 3e, paired samples t test, P < 0.001); monkeys were indifferent following choice of the safe option. These data suggest that our most risk-seeking quartile of human subjects behaved nearly identically to our monkeys, and that risk-sensitive decision-making behavior in monkeys is well within the human range. Collectively, these results suggest that both species’ choices were powerfully influenced by the recent history of reward (cf. Sugrue et al. 2004). We also note that the observed difference in overall risk preferences between monkeys and humans may reflect factors other than intrinsic biological differences. Specifically, monkeys had years of experience with the task, performed around 1,000 trials per day, earned most of their daily fluids by their choices, and received smaller rewards on each trial. It is quite possible that humans would show similar overall preferences if tested under these conditions.

It is interesting that humans exhibited a greater propensity to gamble following choices in which the larger reward was obtained (the ‘win’ situation) than following gambles in which the smaller reward was obtained (the ‘lose’ situation). Such behavior is similar to a win-stay-lose-shift heuristic (Barraclough et al. 2004). Like other useful heuristics, this strategy provides a computationally efficient method by which decision-makers can choose effectively in an uncertain environment. The observation that humans and monkeys both use this strategy, even when they do not realize they are doing so, highlights its simple appeal. A win-stay-lose-shift strategy is consistent with local satiety effects on choice; receiving a large reward may temporarily increase satiety, and receiving a small reward may temporarily decrease satiety. If so, win-stay-lose-shift behavior in this task contrasts markedly with state-dependent risk preferences reported in previous studies (Caraco 1981; Schuck-Paim et al. 2004).

The present study indicates that species differences in cognition or emotion may not be the most critical factor mediating risk preferences. However, we have not identified which of the factors that typically distinguish human and animal studies are critical for explaining risk preferences. Our results also suggest that other factors, such as whether rewards are actually delivered, may be important. Future studies will be needed to tease apart the role of each of these factors.

Acknowledgments

We thank Jason-Flor Sisante for helping to set up the human experimental system. We thank Ashley Nutter and Cameron Martin for collecting the data, and the SROP program at Duke for financial support. Experiments were performed in compliance with the laws of the United States.

References

  1. Barraclough DJ, Conroy ML, Lee D. Prefrontal cortex and decision making in a mixed-strategy game. Nat Neurosci. 2004;7:404–410. doi: 10.1038/nn1209.
  2. Brainard DH. The Psychophysics toolbox. Spat Vis. 1997;10:433–436.
  3. Brandstätter E, Gigerenzer G, Hertwig R. The priority heuristic: making choices without trade-offs. Psychol Rev. 2006;113:409–432. doi: 10.1037/0033-295X.113.2.409.
  4. Chen M, Lakshminarayanan V, Santos L. How basic are behavioral biases? Evidence from Capuchin monkey trading behavior. J Polit Econ. 2006;114:517–537.
  5. Caraco T. Energy budgets, risk and foraging preferences in dark-eyed juncos (Junco hyemalis). Behav Ecol Sociobiol. 1981;8:213–217.
  6. Hayden BY, Platt ML. Temporal discounting predicts risk sensitivity in rhesus macaques. Curr Biol. 2007;17:49–53. doi: 10.1016/j.cub.2006.10.055.
  7. Hertwig R, Barron G, Weber EU, Erev I. Decisions from experience and the effect of rare events in risky choice. Psychol Sci. 2004;15:534–539. doi: 10.1111/j.0956-7976.2004.00715.x.
  8. Huettel SA, Stowe CJ, Gordon EM, Warner BT, Platt ML. Neural signatures of economic preferences for risk and ambiguity. Neuron. 2006;49:765–775. doi: 10.1016/j.neuron.2006.01.024.
  9. Kacelnik A, Bateson M. Risky theories—the effects of variance on foraging decisions. Am Zool. 1996;36:402–434.
  10. Kahneman D, Tversky A. Choices, values, and frames. Cambridge University Press; Cambridge: 2000.
  11. Knight FH. Risk, uncertainty, and profit. Houghton Mifflin; Boston: 1921.
  12. Kuhberger A, Schulte-Mecklenbeck M, Perner J. The effects of framing, reflection, probability, and payoff on risk preference in choice tasks. Organ Behav Hum Decis Process. 1999;78:204–231. doi: 10.1006/obhd.1999.2830.
  13. Kuhnen CM, Knutson B. The neural basis of financial risk taking. Neuron. 2005;47:763–770. doi: 10.1016/j.neuron.2005.08.008.
  14. McCoy AN, Platt ML. Risk-sensitive neurons in macaque posterior cingulate cortex. Nat Neurosci. 2005;8:1220–1227. doi: 10.1038/nn1523.
  15. Rachlin H, Raineri A, Cross D. Subjective probability and delay. J Exp Anal Behav. 1991;55:233–244. doi: 10.1901/jeab.1991.55-233.
  16. Rachlin H. The science of self-control. Harvard University Press; Cambridge: 2000.
  17. Reboreda J, Kacelnik A. Risk sensitivity in starlings: variability in food amount and food delay. Behav Ecol. 1991;2:301–308.
  18. Samuelson PA. Risk and uncertainty: a fallacy of large numbers. Scientia. 1963;98:108–113.
  19. Schuck-Paim C, Pompilio L, Kacelnik A. State-dependent decisions cause apparent violations of rationality in animal choice. PLoS Biol. 2004;2:12. doi: 10.1371/journal.pbio.0020402.
  20. Shafir S, Wiegmann DD, Smith BH, Real LA. Risk-sensitive foraging: choice behaviour of honeybees in response to variability in volume of reward. Anim Behav. 1999;57:1055–1061. doi: 10.1006/anbe.1998.1078.
  21. Stephens DW, Krebs JR. Foraging theory. Princeton University Press; Princeton: 1986.
  22. Sugrue LP, Corrado GS, Newsome WT. Matching behavior and the representation of value in the parietal cortex. Science. 2004;304:1782–1787. doi: 10.1126/science.1094765.
  23. Sugrue LP, Corrado GS, Newsome WT. Choosing the greater of two goods: neural currencies for valuation and decision making. Nat Rev Neurosci. 2005;6:363–375. doi: 10.1038/nrn1666.
  24. Sutton RS, Barto AG. Reinforcement learning: an introduction. MIT Press; Cambridge: 1998.
  25. Weber EU, Shafir S, Blais AR. Predicting risk sensitivity in humans and lower animals: risk as variance or coefficient of variation. Psychol Rev. 2004;111:430–445. doi: 10.1037/0033-295X.111.2.430.
