A Neurocomputational Model for Intrinsic Reward

Benjamin Chew; Bastien Blain; Raymond J Dolan; Robb B Rutledge

doi:10.1523/JNEUROSCI.0858-20.2021

J Neurosci. 2021 Oct 27;41(43):8963–8971.

PMCID: PMC8549542 PMID: 34544831

Abstract

Standard economic indicators provide an incomplete picture of what we value both as individuals and as a society. Furthermore, canonical macroeconomic measures, such as GDP, do not account for non-market activities (e.g., cooking, childcare) that nevertheless impact well-being. Here, we introduce a computational tool that measures the affective value of experiences (e.g., playing a musical instrument without errors). We go on to validate this tool with neural data, using fMRI to measure neural activity in male and female human subjects performing a reinforcement learning task that incorporated periodic ratings of subjective affective state. Learning performance determined level of payment (i.e., extrinsic reward). Crucially, the task also incorporated a skilled performance component (i.e., intrinsic reward) which did not influence payment. Both extrinsic and intrinsic rewards influenced affective dynamics, and their relative influence could be captured in our computational model. Individuals for whom intrinsic rewards had a greater influence on affective state than extrinsic rewards had greater ventromedial prefrontal cortex (vmPFC) activity for intrinsic than extrinsic rewards. Thus, we show that computational modeling of affective dynamics can index the subjective value of intrinsic relative to extrinsic rewards, a “computational hedonometer” that reflects both behavior and neural activity that quantifies the affective value of experience.

SIGNIFICANCE STATEMENT Traditional economic indicators are increasingly recognized to provide an incomplete picture of what we value as a society. Standard economic approaches struggle to accurately assign values to non-market activities that nevertheless may be intrinsically rewarding, prompting a need for new tools to measure what really matters to individuals. Using a combination of neuroimaging and computational modeling, we show that despite their lack of instrumental value, intrinsic rewards influence subjective affective state and ventromedial prefrontal cortex (vmPFC) activity. The relative degree to which extrinsic and intrinsic rewards influence affective state is predictive of their relative impacts on neural activity, confirming the utility of our approach for measuring the affective value of experiences and other non-market activities in individuals.

Keywords: affect, mood, reward, risky decision making, value

Introduction

A key index of quality of life is subjective well-being which reflects “how people experience and evaluate their lives and specific domains and activities in their lives” (Oswald and Wu, 2010). Individuals with higher subjective well-being display lower mortality rates (Chida and Steptoe, 2008; Steptoe et al., 2015) and have a lower risk of disease (Davidson et al., 2010). In the workplace, employees who report higher subjective well-being have higher productivity without loss of output quality (Oswald et al., 2015), reduced rates of absenteeism (Pelled and Xin, 1999), and are rated more positively by their supervisors (Peterson et al., 2011). On this basis, maximizing subjective well-being should be of prime interest not only to individuals but also to companies and governments, as well as a target for health and economic policies (Dolan and White, 2007).

A problem arises when designing measures intended to increase well-being. When contemplating the future, people exhibit biases in affective forecasting: they consistently misjudge how future events will affect their affective state, which can lead them to act in ways that are detrimental to their subjective well-being (Wilson and Gilbert, 2005; Meyvis et al., 2010). In particular, people overestimate both the intensity and the duration of their hedonic responses to future events, a tendency referred to as impact bias (Gilbert and Wilson, 2007; Morewedge and Buechel, 2013). Furthermore, the value of tangible goods can be quantified by prices or willingness-to-pay (Plassmann et al., 2007), but the value of intangible goods and intrinsically rewarding experiences (e.g., hobbies, recreational sports) is often more difficult to define or elicit accurately because of biases (Van de Mortel, 2008; Nisbet and Zelenski, 2011), while the predictive validity of implicit measures is unclear (Levesque et al., 2008; Keatley et al., 2013).

Neuroscience-informed methods can provide a means to evaluate the subjective value of an intrinsic reward (e.g., the experience of mastering a musical composition for its own sake), allowing extrinsic and intrinsic rewards to be compared using a common scale of objectively measured neural activity (FitzGerald et al., 2009). We hypothesized that extrinsic and intrinsic rewards would both influence affective states, and the extent of their relative influences should be reflected in regional brain activity. Recent studies (Rutledge et al., 2014, 2015; Vinckier et al., 2018) demonstrate that experience sampling during reward-based tasks can link affective and motivational responses to extrinsic reward. Here, we extend this approach to investigate how affective state is influenced by the history of intrinsic rewards.

We developed a reinforcement learning task incorporating both an explicit reward component and a skilled performance component, where the latter did not affect payment (Fig. 1A). On each trial, subjects selected one of two options, one of which was on average more rewarding than the other, and then navigated a cursor past a series of barriers (see Materials and Methods). We hypothesized that the experience of successful skilled performance, a source of intrinsic reward, would influence the momentary happiness of subjects in a manner that is quantitatively akin to the impacts of extrinsic rewards and that this would also be evident at the level of neural activity.

Figure 1. — Extrinsic and intrinsic reward paradigm. A, Subjects (n = 33) experienced both extrinsic and intrinsic rewards on each trial. A trial starts with subjects selecting from one or two available options each associated with an implicit extrinsic reward. One option on average leads to the larger reward (mean = 50, SD = 10) whereas the other leads to a lower reward (mean = 25, SD = 10) with a reversal every 19–23 trials. Four barriers then appear along the path to the outcome and a cursor appears at the bottom of the screen which automatically advances after a 1-s delay. Subjects press left and right keys to navigate around barriers, constituting a form of skilled performance that can be intrinsically rewarding. Successfully avoiding a barrier turns it green whereas contact with a barrier turns it red. There is no financial penalty for contact with barriers nor financial benefit for avoiding them. Earnings depend only on the outcome delivered at the end of the trial. After every two to three trials, subjects report their current happiness by moving a cursor on a rating line. B, Probability of choice to the initial high-reward option averaged across subjects (n = 33) in black. Shaded areas correspond to SEM. Gray vertical bands represent intervals where probability reversals could occur. C, D, Happiness trajectories and model fits for a computational model with both reward and performance parameters are displayed for two example subjects (C, r² = 0.45; D, r² = 0.42). Also see Figures 2, 3 and Tables 1, 2.

Materials and Methods

Participants

A total of 37 healthy young adults (age: 25.8 ± 4.7, mean ± SD; 8 males, 29 females) were recruited through the University College London (UCL) Psychology Subject Database. Subjects were screened to ensure no history of neurologic or psychiatric disorders. Four subjects were excluded because of excessive head movement during scanning, leaving a total of 33 subjects (age: 26.1 ± 4.9; 8 males, 25 females). The study was approved by the UCL research ethics committee, and all subjects gave written informed consent.

Study design

Subjects completed the experiment at the Wellcome Center for Human Neuroimaging at UCL in an appointment that lasted ∼90 min. Stimuli were presented in MATLAB (MathWorks) using Cogent 2000. The layout of each trial resembled a T-Maze (Howe et al., 2013). On each trial, subjects selected a blue or magenta box, one of which resulted in 50 points on average and the other which resulted in 25 points on average. The SD of points received for each box was 10. Points were assigned based on draws from Gaussian distributions. Every 19–23 trials, a reversal occurred where the box that previously contained the higher number of points on average now contained a lower number of points and vice versa. On half of the trials, subjects were afforded a free choice. For the remaining half, subjects were only presented with a single option. After a choice was made, the chosen option was indicated and four barriers appeared on the screen along with a small cursor at the bottom of the screen. Following a 1-s delay, the cursor automatically advanced along the path to the outcome. Subjects were able to control the horizontal position of the cursor to avoid colliding with barriers. If they passed a barrier without colliding with it, the barrier turned green. Contact with a barrier turned it red and provided immediate feedback about performance. Subjects then had to press the appropriate directional key to navigate around the barrier for the cursor to continue advancing on its course. Crucially, the subjects' final payment depended only on the number of points accumulated across the experiment and not their ability to quickly navigate past barriers. After the cursor had entered the chosen box, the outcome was displayed for 800 ms after a 1.5-s delay. Total cumulative points were displayed on the top right of the screen throughout the experiment. Subjects were presented with the question, “How happy are you at this moment?” after every two to three trials. 
After a 1-s delay period, a rating line appeared with a cursor at the midpoint, and subjects had 4 s to move the cursor along the scale with button presses. The left end of the line was labeled “very unhappy” and the right end was labeled “very happy.”
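The reward schedule described above can be sketched in a few lines. This is an illustrative simulation, not the study's MATLAB/Cogent code: the function names, the random initial assignment of the better option, and the absence of any rounding or clipping of points are assumptions made for the example.

```python
import random


def reversal_schedule(n_trials, rng, min_block=19, max_block=23):
    """Return, for each trial, which option (0 or 1) currently has the higher mean.

    The better option switches after a block of 19-23 trials (values from the text).
    The initial assignment is drawn at random, which is an assumption.
    """
    better = rng.randint(0, 1)
    schedule = []
    while len(schedule) < n_trials:
        block_length = rng.randint(min_block, max_block)
        schedule.extend([better] * block_length)
        better = 1 - better  # reversal
    return schedule[:n_trials]


def draw_points(chosen, better, rng, high_mean=50, low_mean=25, sd=10):
    """Draw the outcome for the chosen option from a Gaussian distribution."""
    mean = high_mean if chosen == better else low_mean
    return rng.gauss(mean, sd)


rng = random.Random(0)
schedule = reversal_schedule(100, rng)
# Outcome for a subject who always picks option 0 on the first five trials.
outcomes = [draw_points(0, better, rng) for better in schedule[:5]]
```

Forced-choice trials, which made up half of the trials in the study, could be added by fixing `chosen` instead of letting the simulated subject select it.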

Staircase procedure

To ensure that differences in affective responses were not due to skill-related differences in how often each subject collided with barriers, we used a standard staircase procedure, parameter estimation by sequential testing (PEST; Taylor and Creelman, 1967). This procedure calibrated the cursor speed for each subject such that they avoided contact with the barriers on ∼70% of trials. The calibration was conducted over 60 trials before the start of the task in the scanner. The procedure continued during the task, allowing small adjustments (e.g., to compensate for fatigue) that kept successful skilled performance consistent.
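The idea of titrating difficulty to a fixed success rate can be illustrated with a rule that is simpler than PEST. The sketch below assumes a weighted up-down staircase, which is a different procedure from the one used in the study: the cursor speeds up slightly after a clean run and slows down more after a collision, with the two step sizes chosen so that the expected change is zero at 70% success. The function names, step size, and speed units are hypothetical.

```python
TARGET_SUCCESS = 0.70  # proportion of trials without barrier contact (from the text)


def update_speed(speed, success, step_down=0.07, min_speed=0.1):
    """Return the cursor speed for the next trial (arbitrary units).

    After a collision the speed drops by `step_down`. After a clean run it
    rises by a smaller step, chosen so that the expected change is zero when
    the success rate equals TARGET_SUCCESS.
    """
    step_up = step_down * (1.0 - TARGET_SUCCESS) / TARGET_SUCCESS
    new_speed = speed + step_up if success else speed - step_down
    return max(min_speed, new_speed)


def expected_change(p_success, step_down=0.07):
    """Expected speed change per trial at a given success probability."""
    step_up = step_down * (1.0 - TARGET_SUCCESS) / TARGET_SUCCESS
    return p_success * step_up - (1.0 - p_success) * step_down
```

When a subject succeeds more often than the target, `expected_change` is positive and the task gets harder; when they succeed less often, it is negative and the task gets easier.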

Questionnaire measures

Subjects were administered the Beck depression inventory (BDI-II; Beck et al., 1996), apathy evaluation scale (AES; Marin et al., 1991), and apathy motivation index (AMI; Ang et al., 2017).

Image acquisition

MRI scanning took place at the Wellcome Center for Human Neuroimaging at UCL using a Siemens Prisma 3-Tesla scanner equipped with a 64-channel head coil. Functional images were acquired with a gradient echo T2*-weighted echoplanar sequence with whole-brain coverage. Each volume consisted of 48 slices with 3-mm isotropic voxels [repetition time (TR): 3.36 s; echo time (TE): 30 ms; slice tilt: 0°] acquired in ascending order. A field map (double-echo FLASH, TE1 = 10 ms, TE2 = 12.46 ms) with 3-mm isotropic voxels (whole-brain coverage) was also acquired for each subject to correct the functional images for inhomogeneities in the magnetic field. The first six volumes of each run were discarded to allow for T1 saturation effects. Structural images were T1-weighted (1 × 1 × 1 mm resolution) and acquired using an MPRAGE sequence.

Model-based analyses

Models were fit to happiness ratings in individual subjects by minimizing the residual sum of squares between actual and predicted ratings, which served as the objective function for the fmincon optimizer in MATLAB (MathWorks). The significance of individual parameters was determined using likelihood ratio tests comparing the full model with a model that had only a reward or only a performance parameter; significant tests are indicated by filled circles in Figure 4. Models were first fit to the raw happiness ratings to test the relationship between the baseline mood parameter (denoted w₀ in the equations below) and questionnaire measures, to replicate findings in the literature. Models were then fit to standardized (z-scored) ratings. Normalizing ratings prevents individuals with greater variance in their ratings from having a disproportionate effect on model comparisons. The SD of ratings differs widely across participants, although rating variance is known to be stable over time (Rutledge et al., 2015) and across tasks (Blain and Rutledge, 2020).
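The effect of normalization on the objective function can be made concrete with a small example. The sketch below is not the study's code: the helper names are hypothetical, and the use of the sample SD (n − 1 denominator) for z-scoring is an assumption.

```python
from statistics import mean, stdev


def zscore(values):
    """Standardize values to mean 0 and SD 1 (sample SD; an assumption)."""
    m, s = mean(values), stdev(values)
    if s == 0:
        raise ValueError("cannot z-score constant ratings")
    return [(v - m) / s for v in values]


def residual_sum_of_squares(actual, predicted):
    """Objective minimized during fitting: sum of squared rating errors."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))


# Two hypothetical subjects with the same pattern but different rating ranges.
ratings_a = [40, 50, 60]
ratings_b = [10, 50, 90]
flat_prediction = [50, 50, 50]

rss_raw_a = residual_sum_of_squares(ratings_a, flat_prediction)  # 200
rss_raw_b = residual_sum_of_squares(ratings_b, flat_prediction)  # 3200
```

On the raw scale, the second subject contributes 16 times more to a summed error term than the first, even though both show the same pattern; after `zscore`, the two rating series are identical.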

Figure 4. — Computational model parameters and task behavior. A, B, The contribution of reward to happiness varied across subjects despite a similar high choice accuracy across subjects. Despite titrating difficulty at the individual level to match performance across subjects at 70%, subjects displayed considerable variation in the degree to which performance impacted affective state as captured by the computational model. Filled circles indicate betas that are significant at the individual level.

Recovery analysis

To ensure that the model parameters were recoverable, we performed model recovery and parameter recovery analyses following established procedures (Wilson and Collins, 2019). To test for parameter recovery, we first estimated the parameters for each participant. Then, we simulated data with each of the four generative models using parameters estimated for each participant. To account for noise in the simulation, we computed the SD of the residuals from the model at the individual level and then generated Gaussian noise with the same SD using the MATLAB randn function and added that noise to generated ratings. We then estimated parameters from the generated data using the same procedure as applied to the actual mood dynamics data (n = 33). The SDs of residuals in the recovery analysis were highly correlated with the noise parameter in the generative process (e.g., for reward and performance, the correlation is Spearman ρ(31) = 0.98, p < 10⁻¹⁸).
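The logic of parameter recovery (simulate with known parameters, add Gaussian noise, refit, then correlate generated and estimated values) can be sketched as follows for the reward-only model. This is a simplified illustration rather than the authors' procedure: it assumes one rating per trial, replaces fmincon with a grid search over γ combined with a closed-form least-squares weight, omits the baseline term, and draws hypothetical parameter ranges. All function names are introduced here.

```python
import random


def decayed_sum(events, gamma):
    """Exponentially weighted history: s_t = sum over j <= t of gamma**(t-j) * events[j]."""
    history, total = [], 0.0
    for value in events:
        total = gamma * total + value
        history.append(total)
    return history


def simulate_ratings(rewards, w_reward, gamma, noise_sd, rng):
    """Generate ratings from the reward-only model plus Gaussian noise."""
    return [w_reward * s + rng.gauss(0.0, noise_sd)
            for s in decayed_sum(rewards, gamma)]


def fit_reward_model(rewards, ratings, grid_size=101):
    """Grid search over gamma; least-squares weight in closed form for each gamma."""
    best = None
    for i in range(grid_size):
        gamma = i / (grid_size - 1)
        s = decayed_sum(rewards, gamma)
        w = sum(a * b for a, b in zip(s, ratings)) / sum(a * a for a in s)
        rss = sum((h - w * a) ** 2 for h, a in zip(ratings, s))
        if best is None or rss < best[0]:
            best = (rss, w, gamma)
    return best[1], best[2]  # (w_reward, gamma)


def spearman(x, y):
    """Spearman rank correlation; ties are broken by position, an approximation."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda k: v[k])
        r = [0] * len(v)
        for rank, k in enumerate(order, start=1):
            r[k] = rank
        return r
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def recovery_demo(n_agents=33, n_trials=120, noise_sd=0.3, seed=0):
    """Simulate agents with known parameters, refit, and correlate true vs. estimated."""
    rng = random.Random(seed)
    true_w, est_w, true_g, est_g = [], [], [], []
    for _ in range(n_agents):
        rewards = [rng.gauss(0.0, 1.0) for _ in range(n_trials)]
        w, g = rng.uniform(0.1, 0.8), rng.uniform(0.1, 0.9)  # hypothetical ranges
        ratings = simulate_ratings(rewards, w, g, noise_sd, rng)
        w_hat, g_hat = fit_reward_model(rewards, ratings)
        true_w.append(w)
        est_w.append(w_hat)
        true_g.append(g)
        est_g.append(g_hat)
    return spearman(true_w, est_w), spearman(true_g, est_g)
```

In the study, the noise SD for each simulated agent was set to the SD of that participant's model residuals; here `noise_sd` is a single illustrative value.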

Results

Subjects completed two trial blocks while in the MRI scanner. We first asked whether subjects could learn the reward contingencies (Fig. 1B) and found that they could, making 85.8 ± 1.0% (mean ± SEM, z = 5.0, p < 10⁻⁶) of choices to the current high-reward option. Subjects were not penalized for contact with barriers, and thus actual performance was non-instrumental to the receipt of eventual monetary reward. We observed no correlation between earnings and how often subjects successfully avoided barriers (ρ(31) = 0.21, p = 0.24). During debriefing, all 33 subjects reported that they believed there was no association between successful skilled performance and earnings.

Reports of affective state for example subjects are included in Figure 1C and D. On average, subjects reported being happier after receiving outcomes from the high- compared with low-reward option (high-reward: 63.8 ± 1.9, low-reward: 59.5 ± 2.1, z = 4.7, p < 10⁻⁵), consistent with previous research (Rutledge et al., 2014, 2015). On average, subjects reported also being happier when they navigated through the barriers without collisions compared with when they contacted at least one barrier (without collisions: 63.5 ± 1.9; collision: 60.0 ± 2.1, z = 4.6, p < 10⁻⁵), suggesting that intrinsic rewards related to performance influence subjective affective state.

Because participants vary in how they use the scale, we next z-scored happiness ratings. Consistent with analyses using non-normalized ratings, subjects reported greater average happiness after receiving high compared with low rewards (high-reward: 0.08 ± 0.01, low-reward: −0.18 ± 0.02, z = 4.8, p < 10⁻⁵; Fig. 2A). Subjects also reported being happier after navigating through the maze without contacting any barriers compared with when they collided with at least one barrier (without collisions: 0.08 ± 0.01; collision: −0.17 ± 0.03, z = 4.7, p < 10⁻⁵; Fig. 2A), consistent with an impact of intrinsic rewards. There was considerable variation across subjects in terms of how much extrinsic rewards and skilled performance contributed to momentary happiness (Fig. 2B), but there was no relationship between happiness for reward outcomes and happiness for skilled performance (ρ(31) = −0.20, p = 0.26).

Figure 2. — Computational modeling of affective dynamics. A, Subjects were happier when they received a reward from high-reward compared with low-reward options (Z = 4.7, p < 10⁻⁵, in blue). Subjects were happier on average when they navigated through the barriers without contacting them, compared with when they contacted at least one barrier (Z = 4.6, p < 10⁻⁵, in orange); ***p < 0.001. B, The majority of subjects (29 of 33) were happier after receiving a reward from a high-reward compared with low-reward option. The majority of subjects (29 of 33) were happier after successful compared with unsuccessful performance. There was no relationship between happiness for reward outcomes and happiness for skilled performance (ρ(31) = −0.20, p = 0.26). C, Average happiness across all subjects and model fit is displayed for the computational model (n = 33, mean r² = 0.26). D, According to the computational model, happiness was significantly related to the history of extrinsic rewards in the form of points converted to money (Z = 4.9, p < 10⁻⁵) and also to the history of skilled performance, a proxy for intrinsic rewards (Z = 4.4, p < 10⁻⁴); ***p < 0.001.

Computational model of affective dynamics

We next employed a previously established methodology (Rutledge et al., 2014, 2015; Blain and Rutledge, 2020) to quantify the extent to which rewards influenced the affective state of our participants. In particular, we aimed to replicate two findings: (1) that the recent history of reward influences happiness and (2) that the baseline happiness parameter correlates with depressive symptoms. To that end, we fit the model to the raw happiness ratings. We considered influences that decay exponentially in time:

\text{Happiness}(t) = w_{0} + w_{\text{reward}} \sum_{j=1}^{t} \gamma^{t-j} \, \text{Reward}_{j} + \epsilon, \qquad (1)

where t and j are trial numbers, w₀ is a baseline mood parameter, w_reward captures the influence of reward, Reward_j is the z-scored outcome of the option selected on trial j, and 0 ≤ γ ≤ 1 is a forgetting factor that reduces the impact of distal relative to recent events. If γ is equal to 0, only the most recent reward outcome influences happiness. The model includes a Gaussian noise term, ϵ ∼ N(0, σ). The parameters of this model are recoverable (for details about parameter recovery, see Fig. 3A and Table 1). Parameters were first fit to non-normalized happiness ratings in each individual subject. The mean r² was 0.26 ± 0.03 and the mean forgetting factor was 0.40 ± 0.06 (mean ± SEM; for example subjects, see Fig. 1C). Consistent with previous findings (Rutledge et al., 2014, 2015), happiness was significantly associated with the history of reward (w_reward = 0.06 ± 0.01; Wilcoxon signed-rank test: z = 4.7, p < 10⁻⁵). On average, σ was estimated to be 0.13 ± 0.01.
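The discounted sum in Equation 1 can also be written recursively, which may make the role of the forgetting factor easier to see. The shorthand S_t is introduced here for illustration and is not notation from the original model description.

```latex
S_t \equiv \sum_{j=1}^{t} \gamma^{t-j} \, \text{Reward}_{j}
    = \text{Reward}_{t} + \gamma \sum_{j=1}^{t-1} \gamma^{(t-1)-j} \, \text{Reward}_{j}
    = \text{Reward}_{t} + \gamma \, S_{t-1}, \qquad S_{0} = 0,

\text{Happiness}(t) = w_{0} + w_{\text{reward}} \, S_{t} + \epsilon .

% Limiting cases:
%   \gamma = 0:  S_t = \text{Reward}_t               (only the latest outcome counts, taking 0^0 = 1)
%   \gamma = 1:  S_t = \sum_{j=1}^{t} \text{Reward}_j  (all past outcomes count equally)
```

Each new outcome is added at full weight while the accumulated history is scaled down by γ, so an outcome k trials in the past contributes with weight γ^k.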

Figure 3. — Parameter recovery analysis for reward model (A), performance model (B), and reward and performance model (C), plotting the parameter values used to generate the data against the estimated parameters for z-scored happiness ratings. The model parameters were recoverable with no bias. See Materials and Methods for details; ***p < 10⁻⁷.

Table 1.

Model parameter recovery results

Spearman ρ between generated and estimated parameters
Model	w_reward	w_performance	γ₁	γ₂
Reward	0.91***	-	0.82***	-
Performance	-	0.70***	0.61***	-
Reward and performance	0.89***	0.73***	0.76***	-
Reward and performance (separate γ)	0.86***	0.90***	0.81**	0.81***

The values correspond to the Spearman correlation between the generated parameters and the estimated parameters of 33 agents using z-scored happiness ratings. See Materials and Methods for details; **p < 0.01, ***p < 0.001.

Likewise, we found that baseline mood parameters, estimated using raw happiness ratings while accounting for mood dynamics due to reward history, were negatively correlated with symptom severity assessed using the BDI-II (Beck et al., 1996; Spearman ρ(31) = −0.35, p = 0.046). This result shows that depressive symptoms relate to happiness ratings during a novel task that includes a performance component, consistent with previous findings during risky decision-making (Rutledge et al., 2017) and learning in volatile environments (Blain and Rutledge, 2020). This relationship is consistent with an affective set point, to which happiness returns over time, that is lower in individuals with a greater symptom load.

We also found that baseline mood parameters tended to be negatively related to apathy as measured by the AES (Marin et al., 1991; ρ(31) = −0.32, p = 0.07) and to behavioral apathy as assessed by the AMI (Ang et al., 2017; ρ(31) = −0.33, p = 0.06; see Table 3). The first happiness rating, made before the start of the first trial, was positively correlated with the baseline mood parameter (ρ(31) = 0.46, p = 0.007). In contrast to baseline mood parameters, first happiness ratings were not significantly correlated with BDI-II (ρ(31) = −0.21, p = 0.25) or AES (ρ(31) = −0.17, p = 0.35) scores but were correlated with behavioral AMI scores (ρ(31) = −0.39, p = 0.027). We found no correlation between the baseline mood parameter and the average staircased cursor speed (ρ(31) = −0.01, p = 0.95), suggesting that the speed of the cursor was not associated with persistent affective state.

Table 2.

Model comparison results

Model	Parameters	Mean r²	BIC	ΔBIC
Reward	2	0.19	−326	145
Performance	2	0.09	−26	445
Reward and performance	3	0.26	−471	0
Reward and performance (separate γ)	4	0.27	−351	120

Bayesian information criterion (BIC) scores are summed across 33 subjects. The winning model (lowest BIC) was the model with both reward and performance having the same forgetting factor γ rather than a model where the influence of past reward and performance differs in their forgetting factor. ΔBIC refers to the difference in BIC between each model and the winning model. Ratings are z-scored to prevent individuals with greater rating variance from disproportionally influencing model comparison.
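The ΔBIC column can be reproduced directly from the summed BIC values in Table 2. The short sketch below does that and also includes one common form of the BIC for least-squares fits; the exact BIC formula used in the study is not stated in the text, so `gaussian_bic` is an assumption (Gaussian residuals, constant terms dropped) and its name is hypothetical.

```python
import math


def gaussian_bic(rss, n_ratings, n_params):
    """BIC for a least-squares fit under Gaussian residuals (up to a constant)."""
    return n_ratings * math.log(rss / n_ratings) + n_params * math.log(n_ratings)


# Summed BIC values copied from Table 2.
summed_bic = {
    "Reward": -326,
    "Performance": -26,
    "Reward and performance": -471,
    "Reward and performance (separate γ)": -351,
}

# The winning model has the lowest summed BIC; ΔBIC is measured relative to it.
winner = min(summed_bic, key=summed_bic.get)
delta_bic = {model: bic - summed_bic[winner] for model, bic in summed_bic.items()}

for model, delta in delta_bic.items():
    print(f"{model}: ΔBIC = {delta}")
```

The separate-γ model fits slightly better in terms of mean r² (0.27 versus 0.26), but the penalty for its extra parameter outweighs that gain, which is why its ΔBIC is positive.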

We next evaluated the relative contributions of extrinsic and intrinsic reward to affective state. To that end, we z-scored the happiness ratings, which prevents individuals with greater rating variance from disproportionately affecting analyses. Because the z-scored ratings and the Reward and Performance vectors are all centered on zero, any constant term would be expected to be near zero, and we omitted w₀ from analyses with z-scored ratings. We expanded the model to include an additional term that accounts for influences of skilled performance:

\text{Happiness}(t) = w_{\text{reward}} \sum_{j=1}^{t} \gamma^{t-j} \, \text{Reward}_{j} + w_{\text{performance}} \sum_{j=1}^{t} \gamma^{t-j} \, \text{Performance}_{j} + \epsilon, \qquad (2)

where t and j are trial numbers, w_reward and w_performance capture the influence of task events related to reward and performance, respectively, and 0 ≤ γ ≤ 1 is a forgetting factor that reduces the impact of distal relative to recent events. The model includes a Gaussian noise term, ϵ ∼ N(0, σ). The model parameters were recoverable (see Fig. 3C and Table 1; for details, see Materials and Methods). Reward is the z-scored outcome of the selected option on each trial. Performance is the z-scored binary indicator of success on each trial, coded as 1 when no barriers were contacted and 0 when at least one barrier was contacted. This simple model explained a substantial amount of variance in happiness with r² = 0.26 ± 0.03 (mean ± SEM; Fig. 2C and Table 2). Weights for both performance (w_performance = 0.18 ± 0.03; z = 4.4, p < 10⁻⁴; Fig. 2D) and reward (w_reward = 0.39 ± 0.04, z = 4.9, p < 10⁻⁵; Fig. 2D) were positive on average. The forgetting factor γ was 0.48 ± 0.05 (mean ± SEM), indicating that happiness depended on the past four to five trials on average. On average, σ was estimated to be 0.85 ± 0.02.
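A direct transcription of Equation 2 shows how the two weighted histories combine and how quickly past trials lose influence. This is an illustrative sketch with hypothetical function names, assuming one prediction per trial and inputs that have already been z-scored; it omits the noise term.

```python
def decay_weights(gamma, n_back):
    """Weight gamma**k given to an event k trials in the past, for k = 0..n_back-1."""
    return [gamma ** k for k in range(n_back)]


def share_of_total_weight(gamma, n_back):
    """Fraction of the infinite-horizon weight carried by the last n_back trials.

    Uses the geometric series identity (valid for gamma < 1):
    sum_{k < n} gamma**k / sum_{k >= 0} gamma**k = 1 - gamma**n.
    """
    return 1.0 - gamma ** n_back


def predict_happiness(rewards, performance, w_reward, w_performance, gamma):
    """Noise-free prediction from Equation 2 for every trial t."""
    predictions = []
    for t in range(len(rewards)):
        reward_term = sum(gamma ** (t - j) * rewards[j] for j in range(t + 1))
        perf_term = sum(gamma ** (t - j) * performance[j] for j in range(t + 1))
        predictions.append(w_reward * reward_term + w_performance * perf_term)
    return predictions


# Group-mean estimates reported in the text.
weights = decay_weights(0.48, 6)
share_last_five = share_of_total_weight(0.48, 5)
```

With the reported mean γ of 0.48, an event four trials back carries about 5% of the weight of the current trial, and the last five trials account for roughly 97% of the total weight. This is one way to read the statement that happiness depended on the past four to five trials; the criterion the authors used is not stated.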

In previous studies, we found expectations of reward exerted a substantial influence on happiness (Rutledge et al., 2014, 2015; Blain and Rutledge, 2020). In the current study, we used high-reward and low-reward distributions with minimal overlap to maximize learning accuracy. We also employed a staircase to keep skilled performance stable and at a similar level across individuals. These features render the current design unsuitable for quantifying the impact of expectations on happiness. We chose a design that maximized our power for quantifying individual differences in the relative subjective values of extrinsic and intrinsic rewards.

Model comparison (Table 2) shows that a model with parameters for past rewards and performance (mean r² = 0.26) outperformed models containing individual terms for reward (mean r² = 0.19) or performance (mean r² = 0.09) alone. These results show that the happiness of subjects in this task is, on average, dependent on both receipt of explicit rewards (e.g., money) and the non-instrumental experience of skilled performance.

We found considerable variation across individuals in how much reward outcomes contributed to affective dynamics, although subjects on average learned reward contingencies to a similar degree (Fig. 4A). Despite performance being held constant because of staircasing of cursor speed (successful performance: 69.1 ± 2.4%, mean ± SD; Fig. 4B), there was considerable variation also across individuals in how much non-instrumental performance influenced affective state. Many subjects showed a negligible impact of successful performance on affective state, despite a similar level of successful performance. Furthermore, learning choice accuracy was not correlated with either happiness reward parameters (ρ(31) = 0.12, p = 0.49) or successful skilled performance (ρ(31) = −0.05, p = 0.78).

Intrinsic rewards can be associated with an increased motivation or metacognitive strategy to improve performance over time (Son and Metcalfe, 2005). Before scanning, participants completed 60 practice trials to determine an appropriate starting speed for the experiment. The performance weight w_performance was positively correlated with the starting cursor speed (ρ(31) = 0.38, p = 0.03). There was no correlation between percent successful skilled performance and w_performance derived from the happiness model (ρ(31) = 0.056, p = 0.76). Intrinsic rewards are often thought of as resulting from uncertainty reduction or from learning progress (Gottlieb and Oudeyer, 2018). However, we did not find any significant difference in the median cursor speed between blocks (z = 0.63, p = 0.53), suggesting that participants were at a stable level of performance from the start and did not improve over time. Similarly, w_performance was not significantly different between blocks (z = 1.47, p = 0.14). Together, these results suggest that performing this task accurately was intrinsically rewarding, with a stable relationship between performance and happiness despite no signs of learning progress during the experiment.

We then checked whether the link between the baseline mood parameter from the reward model (see above) and apathy and depression scores extends to the baseline mood parameter of models that include a performance term. Results indicate a trend toward the same relationship as for the reward model (see Table 3).

Table 3.

Correlation between baseline mood parameter and questionnaire score

Questionnaire	w₀ (reward model)	w₀ (performance model)	w₀ (reward and performance model)
BDI	−0.35*	−0.31†	−0.34†
AES	−0.32†	−0.32†	−0.30†
bAMI	−0.33†	−0.32†	−0.29†

Values correspond to the Spearman coefficient ρ; *p < 0.05, †p < 0.1.

Neural correlates of extrinsic and intrinsic rewards

Having established interindividual variability in the impact of outcomes and performance on reported happiness, we next asked whether this variability was also predictive of neural responses to both rewards and performance. The experiment was separated into two scans and we first evaluated whether happiness model parameters were stable across scans. We found that both extrinsic (ρ(31) = 0.35, p = 0.044) and intrinsic (ρ(31) = 0.35, p = 0.044) reward computational parameters were positively correlated across the two scans.

We regressed event-related activity on parametrically modulated task events to assess brain activity related to receipt of extrinsic and intrinsic rewards. We found an effect of reward magnitude at time of outcome in ventromedial prefrontal cortex [vmPFC; −3, 38, −1; t₍₃₂₎ = 5.92, p < 0.05 family-wise error (FWE) cluster-corrected at the whole brain level; Fig. 5A, top], as well as an effect of successful skilled performance in an overlapping region of the vmPFC (−3, 50, −1; t₍₃₂₎ = 4.24, p < 0.05 FWE cluster-corrected; Fig. 5A, bottom).

Figure 5. Relative affective impacts of reward and performance predict vmPFC activity. A, top, BOLD activity in vmPFC was parametrically modulated by reward magnitude (peak: −3, 38, −1). Bottom, BOLD activity in an overlapping region of vmPFC was modulated by trial-by-trial successful skilled performance (peak: −3, 50, −1). B, An independent vmPFC ROI shows modulation by both reward magnitude and skilled performance (both p < 0.01). C, In the independent vmPFC ROI, subjects with higher performance than reward weights in the computational analysis of affective dynamics displayed stronger neural responses in the vmPFC for performance than subjects with higher reward than performance weights (p = 0.003). D, The difference between performance and reward weights in the happiness computational model predicted the difference in vmPFC neural responses for successful skilled performance relative to reward magnitude (ρ(31) = 0.50, p = 0.003); *p < 0.05, **p < 0.01.

The vmPFC is widely implicated in representation of subjective reward value. On this basis, we used an independent vmPFC mask from a meta-analysis of subjective value studies of extrinsic reward for further analysis (Bartra et al., 2013). Within this region of interest (ROI), we extracted weights for reward magnitude and skilled performance from each individual subject. We found that within this independent ROI, BOLD activity was significantly associated with both reward magnitude (0.26 ± 0.08, Z = 3.0, p = 0.0029) and skilled performance (0.38 ± 0.13, Z = 2.8, p = 0.0052; Fig. 5B).

Having established that neural responses in vmPFC are associated with both extrinsic and intrinsic rewards, we next examined whether neural responses were predicted by computational parameters estimated from individual affective dynamics. Across subjects, we found a positive relationship (ρ(31) = 0.50, p = 0.003; Fig. 5D) between the relative weights for extrinsic and intrinsic rewards in our happiness computational model and the relative effect sizes for neural responses in the vmPFC. Initial happiness ratings deviate from model predictions on average (Fig. 2C). The relationship between relative happiness weights and relative neural effect sizes was still present after removing the initial 10% of ratings (ρ(31) = 0.54, p = 0.0015). The relationship was also present after removing the initial 10% and detrending the remaining ratings before estimating model parameters (ρ(31) = 0.49, p = 0.0038).

We also subdivided subjects into two groups: one with higher performance than reward parameters and one with the opposite pattern. The group with higher performance than reward parameters showed greater vmPFC responses for skilled performance than the group with higher reward than performance parameters (Z = 2.8, p = 0.0047; Fig. 5C). These findings suggest that the pattern of momentary affective dynamics reflects the impact of both extrinsic and intrinsic rewards and is mirrored at the level of vmPFC activity.
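The group comparison above can be sketched as follows. The test behind the reported Z statistic is not named in this section, so a Wilcoxon rank-sum test with a normal approximation (no tie correction) is assumed here. All values are synthetic and every variable name is hypothetical.

```python
import math
import random

def rank_sum_z(group_a, group_b):
    """Wilcoxon rank-sum Z for group_a versus group_b (normal approximation)."""
    pooled = list(group_a) + list(group_b)
    order = sorted(range(len(pooled)), key=lambda i: pooled[i])
    rank = [0.0] * len(pooled)
    i = 0
    while i < len(order):
        j = i
        # Tied values share their mean rank.
        while j + 1 < len(order) and pooled[order[j + 1]] == pooled[order[i]]:
            j += 1
        for k in range(i, j + 1):
            rank[order[k]] = (i + j) / 2 + 1
        i = j + 1
    n_a, n_b = len(group_a), len(group_b)
    w = sum(rank[:n_a])                      # rank sum of group_a
    mean_w = n_a * (n_a + n_b + 1) / 2       # expected rank sum under H0
    sd_w = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    return (w - mean_w) / sd_w

def two_sided_p(z):
    """Two-sided p value for a standard normal Z statistic."""
    return math.erfc(abs(z) / math.sqrt(2))

# Synthetic subjects (assumption): happiness-model weights plus a vmPFC
# effect size for skilled performance that loosely tracks the weight difference.
rng = random.Random(1)
subjects = []
for _ in range(33):
    w_reward = rng.gauss(0.5, 0.2)
    w_performance = rng.gauss(0.5, 0.2)
    beta_performance = 0.4 + (w_performance - w_reward) + rng.gauss(0.0, 0.3)
    subjects.append((w_reward, w_performance, beta_performance))

# Split by which weight is larger, then compare neural responses.
high_perf = [b for wr, wp, b in subjects if wp > wr]
high_reward = [b for wr, wp, b in subjects if wp <= wr]

z = rank_sum_z(high_perf, high_reward)
p = two_sided_p(z)
print(f"Z = {z:.2f}, p = {p:.4f}")
```

A positive Z in this sketch means that the group with higher performance weights tends to have larger performance-related responses.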

Discussion

Using experience sampling (Reis and Gable, 2000; Kahneman et al., 2004) combined with functional neuroimaging, we show that extrinsic and intrinsic rewards contribute to affective dynamics (i.e., happiness). Recent studies demonstrate that computational approaches can quantify consistent relationships between subjective feelings and value-based decision-making (Rutledge et al., 2014; Eldar et al., 2016, 2018; Vinckier et al., 2018; Blain and Rutledge, 2020), including in relation to individual social preferences (Rutledge et al., 2016). Here, using the same computational approach applied during reinforcement learning, we show that momentary happiness is influenced by both extrinsic and intrinsic rewards. The computational parameters we extract from affective dynamics enabled us to quantify, on a common value scale, the affective value of intrinsic relative to extrinsic rewards. Our key finding is that the relative weight of intrinsic and extrinsic reward extracted from affective dynamics predicts neural activity in the vmPFC, a region proposed to represent rewards in a common neural currency (Chib et al., 2009; Levy and Glimcher, 2011, 2012), validating our computational approach.

While improvements in skilled performance can be enhanced by rewarding individuals for performance (Sugawara et al., 2012), holding performance constant across subjects allowed us to investigate how happiness varied independently of the level of skill individuals manifest in the task. We show that individuals whose happiness was substantially influenced by intrinsic rewards had increased vmPFC BOLD responses for successful versus unsuccessful skilled performance, relative to individuals whose happiness was influenced more by extrinsic rewards.

The vmPFC is known to represent the value of different types of goods, including food and juice (Padoa-Schioppa, 2007; Hare et al., 2011), money (De Martino et al., 2006), esthetic judgments (Kawabata and Zeki, 2004; Jacobsen et al., 2006), and even perceived pleasantness (Plassmann et al., 2008). This suggests that vmPFC plays a central role in representing qualitatively different types of goods on a common scale, an operation that can facilitate making decisions between otherwise incommensurable goods (Chib et al., 2009; Levy and Glimcher, 2011, 2012). Our study builds on these prior results by now identifying an association between vmPFC BOLD activity and intrinsic rewards, here the experience of performing a skilled task without error. Whole-brain analysis showed that the representation of subjective intrinsic reward values involved an adjacent region in the vmPFC, anterior to the representation for extrinsic rewards but still residing within a central vmPFC cluster (Clithero and Rangel, 2014), a finding that parallels a distinction between experienced and decision values previously mapped to anterior and posterior vmPFC, respectively (Smith et al., 2010).

The vmPFC has been demonstrated to play a role in affect, with subjective emotional experiences elicited by images and pleasurable music leading to changes in both vmPFC BOLD activity and regional cerebral blood flow (Blood and Zatorre, 2001; Zald et al., 2002; Winecoff et al., 2013). Damage to the vmPFC can lead to aberrant emotional responses (Koenigs et al., 2007; Zald and Andreotti, 2010; Hiser and Koenigs, 2018) and maladaptive decision-making in environments where emotional regulation may be useful (Grossman et al., 2010; Spaniol et al., 2019). Numerous studies suggest that subjective reward values are represented by vmPFC neural activity. Unfortunately, the constraints and expense of neuroimaging make it impractical as an everyday tool for assessing individual values for non-market activities. The strong association between neural responses for intrinsic and extrinsic rewards and computational parameters extracted from affective dynamics suggests that computational models combined with experience sampling can provide a valid measure for the subjective reward value of experience.

A limitation of the current study is that the staircase procedure we used does not allow us to address questions about our subjects' intrinsic motivation for learning. The staircase procedure can be useful for the study of interindividual variation, either by keeping performance constant across individuals despite differences in abilities (Fleming et al., 2010) or by tailoring choice options to individuals (Klein-Flügge et al., 2015). Using the staircase procedure meant that subjects quickly reached the limit by which they could improve performance. Our design is thus unsuitable for studying intrinsic motivation pertaining to learning. However, such a framework for measuring affective value could be valuable for other features related to intrinsic rewards (Blain and Sharot, 2021), like metacognitive control and learning (Son and Sethi, 2006), resource allocation under external pressures (Son and Metcalfe, 2005), as well as curiosity-driven exploration of the environment where rewards may be more dependent on the learning progress of an individual (Gottlieb and Oudeyer, 2018).

Humans exhibit biases when predicting how future events are likely to impact their affective states, and are prone to making sub-optimal decisions by misjudging the hedonic consequences of options (Wilson and Gilbert, 2005; Meyvis et al., 2010; Nisbet and Zelenski, 2011). Increasing subjective well-being is widely believed to be an appropriate societal goal (OECD, 2020), but these biases pose a difficulty for enacting policies that are likely to be successful. Additional factors such as social desirability bias (Van de Mortel, 2008) can decrease the reliability of self-reported values when an individual's assessment of a hypothetical experience or good, such as the availability of public parks, differs from prevailing social norms. An advantage of our method (i.e., repeated mood sampling combined with computational modeling) is that it can in principle be applied not only to any cognitive task but also to any repeatable experience (e.g., commuting, walking in a park, exercising, doing yoga) without a need to probe people explicitly about the content of those experiences (e.g., how do you feel after having done yoga?). Mood measurements make no reference to recent events but allow the relative influence of multiple factors to be simultaneously estimated, reducing biases associated with social desirability (e.g., following social norms about how one should feel after doing yoga). For example, affective dynamics reflect depressive symptoms (Rutledge et al., 2017; Blain and Rutledge, 2020), show consistent relationships to reward in the lab and outside the lab in anonymous participants who did not interact with an experimenter (Rutledge et al., 2014), and allow quantification of the extent of guilt and envy in response to social inequality (Rutledge et al., 2016). A potential application, yet to be tested, would be to combine our computational approach with experience sampling in different naturalistic settings, such as a corporate workplace, to identify factors important for employee well-being. Thus, the approach we use in this study demonstrates a novel tool for understanding preferences and well-being.

Over a century ago, Francis Edgeworth described an idealized instrument, which he called a hedonometer, for “continually registering the height of pleasure experienced by an individual” (Edgeworth, 1881). Here, we introduce a “computational hedonometer” that has a distinct advantage over Edgeworth's hypothetical hedonometer in that it mathematically quantifies the relative contributions of different factors to an affective state, including the relative values of intrinsic and extrinsic rewards. We validate our computational tool using objective neural measurements, suggesting that computational parameters can capture the affective values for abstract goods and experiences that may be otherwise challenging to accurately quantify.
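To make concrete how such a tool could assign weights to different factors, the following minimal sketch fits a simple happiness model to simulated ratings. It assumes a model in the style of Rutledge et al. (2014), in which happiness is a baseline plus exponentially discounted sums of past reward and performance outcomes. The exact specification used in this study is given in its Methods and may differ. All function names, parameter values, and the one-rating-per-trial simplification are hypothetical.

```python
import random

def decayed_sum(events, gamma):
    """Running discounted sum: s_t = event_t + gamma * s_(t-1)."""
    out, s = [], 0.0
    for e in events:
        s = e + gamma * s
        out.append(s)
    return out

def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for c in range(3):
        p = max(range(c, 3), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, 3):
            f = m[r][c] / m[c][c]
            for k in range(c, 4):
                m[r][k] -= f * m[c][k]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        x[r] = (m[r][3] - sum(m[r][k] * x[k] for k in range(r + 1, 3))) / m[r][r]
    return x

def fit_weights(happiness, reward, performance, gamma):
    """Least-squares [w0, w_reward, w_performance] and squared error for one gamma."""
    cols = [[1.0] * len(happiness),
            decayed_sum(reward, gamma),
            decayed_sum(performance, gamma)]
    xtx = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(p * h for p, h in zip(ci, happiness)) for ci in cols]
    w = solve3(xtx, xty)
    pred = [sum(wi * c[t] for wi, c in zip(w, cols)) for t in range(len(happiness))]
    sse = sum((h - p) ** 2 for h, p in zip(happiness, pred))
    return w, sse

def fit_model(happiness, reward, performance):
    """Grid search over the decay gamma; weights are fitted at each grid point."""
    grid = [g / 20 for g in range(1, 20)]  # 0.05, 0.10, ..., 0.95
    fits = [(fit_weights(happiness, reward, performance, g), g) for g in grid]
    (w, _sse), gamma = min(fits, key=lambda item: item[0][1])
    return w, gamma

def simulate(n_trials, w, gamma, noise_sd, seed):
    """Simulate binary outcomes and happiness ratings (synthetic data only)."""
    rng = random.Random(seed)
    reward = [float(rng.random() < 0.5) for _ in range(n_trials)]
    performance = [float(rng.random() < 0.7) for _ in range(n_trials)]
    dr, dp = decayed_sum(reward, gamma), decayed_sum(performance, gamma)
    happiness = [w[0] + w[1] * a + w[2] * b + rng.gauss(0.0, noise_sd)
                 for a, b in zip(dr, dp)]
    return happiness, reward, performance

# Simulated subject whose mood depends more on performance than on reward.
happiness, reward, performance = simulate(
    150, w=(0.5, 0.1, 0.2), gamma=0.6, noise_sd=0.05, seed=0)
(w0, w_reward, w_performance), gamma = fit_model(happiness, reward, performance)
print(f"w_reward = {w_reward:.2f}, w_performance = {w_performance:.2f}, "
      f"gamma = {gamma:.2f}")
print(f"relative weight (performance - reward) = {w_performance - w_reward:.2f}")
```

Because both weights multiply discounted sums of comparable events, their difference gives a single number per subject, which is the kind of quantity that was related to vmPFC responses in Figure 5D.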

Footnotes

We thank Tobias Hauser, Matilde Vaghi, and Rachel Bedder for helpful comments. B.C. is a predoctoral fellow of the International Max Planck Research School on Computational Methods in Psychiatry and Ageing Research. The participating institutions are the Max Planck Institute for Human Development and University College London (UCL). B.C. is also supported by a scholarship from the Singapore Institute of Management. R.J.D. holds the Wellcome Trust Investigator Award 098362/Z/12/Z. R.B.R. is supported by the Medical Research Council Career Development Award MR/N02401X/1, the 2018 National Alliance for Research on Schizophrenia and Depression Young Investigator Grant 27674 from the Brain and Behavior Research Foundation, P&S Fund, and the National Institute of Mental Health Grant 1R01MH124110. The Max Planck UCL Centre is a joint initiative supported by UCL and the Max Planck Society. The Wellcome Centre for Human Neuroimaging is supported by core funding from the Wellcome Trust (203147/Z/16/Z).

The authors declare no competing financial interests.

References

Ang YS, Lockwood P, Apps MA, Muhammed K, Husain M (2017) Distinct subtypes of apathy revealed by the apathy motivation index. PLoS One 12:e0169938. 10.1371/journal.pone.0169938
Bartra O, McGuire JT, Kable JW (2013) The valuation system: a coordinate-based meta-analysis of BOLD fMRI experiments examining neural correlates of subjective value. Neuroimage 76:412–427. 10.1016/j.neuroimage.2013.02.063
Beck AT, Steer RA, Ball R, Ranieri WF (1996) Comparison of Beck Depression Inventories-ia and-ii in psychiatric outpatients. J Pers Assess 67:588–597. 10.1207/s15327752jpa6703_13
Blain B, Rutledge RB (2020) Momentary subjective well-being depends on learning and not reward. Elife 9:e57977. 10.7554/eLife.57977
Blain B, Sharot T (2021) Intrinsic reward: potential cognitive and neural mechanisms. Curr Opin Behav Sci 39:113–118. 10.1016/j.cobeha.2021.03.008
Blood AJ, Zatorre RJ (2001) Intensely pleasurable responses to music correlate with activity in brain regions implicated in reward and emotion. Proc Natl Acad Sci USA 98:11818–11823. 10.1073/pnas.191355898
Chib VS, Rangel A, Shimojo S, O'Doherty JP (2009) Evidence for a common representation of decision values for dissimilar goods in human ventromedial prefrontal cortex. J Neurosci 29:12315–12320. 10.1523/JNEUROSCI.2575-09.2009
Chida Y, Steptoe A (2008) Positive psychological well-being and mortality: a quantitative review of prospective observational studies. Psychosom Med 70:741–756. 10.1097/PSY.0b013e31818105ba
Clithero JA, Rangel A (2014) Informatic parcellation of the network involved in the computation of subjective value. Soc Cogn Affect Neurosci 9:1289–1302. 10.1093/scan/nst106
Davidson KW, Mostofsky E, Whang W (2010) Don't worry, be happy: positive affect and reduced 10-year incident coronary heart disease: the Canadian Nova Scotia Health Survey. Eur Heart J 31:1065–1070. 10.1093/eurheartj/ehp603
De Martino B, Kumaran D, Seymour B, Dolan RJ (2006) Frames, biases, and rational decision-making in the human brain. Science 313:684–687. 10.1126/science.1128356
Dolan P, White MP (2007) How can measures of subjective well-being be used to inform public policy? Perspect Psychol Sci 2:71–85. 10.1111/j.1745-6916.2007.00030.x
Edgeworth FY (1881) Mathematical psychics: an essay on the application of mathematics to the moral sciences. London: Kegan Paul.
Eldar E, Rutledge RB, Dolan RJ, Niv Y (2016) Mood as representation of momentum. Trends Cogn Sci 20:15–24.
Eldar E, Roth C, Dayan P, Dolan RJ (2018) Decodability of reward learning signals predicts mood fluctuations. Curr Biol 28:1433–1439.
FitzGerald THB, Seymour B, Dolan RJ (2009) The role of human orbitofrontal cortex in value comparison for incommensurable objects. J Neurosci 29:8388–8395. 10.1523/JNEUROSCI.0717-09.2009
Fleming SM, Weil RS, Nagy Z, Dolan RJ, Rees G (2010) Relating introspective accuracy to individual differences in brain structure. Science 329:1541–1543. 10.1126/science.1191883
Gilbert DT, Wilson TD (2007) Prospection: experiencing the future. Science 317:1351–1354. 10.1126/science.1144161
Gottlieb J, Oudeyer P-Y (2018) Towards a neuroscience of active sampling and curiosity. Nat Rev Neurosci 19:758–770. 10.1038/s41583-018-0078-0
Grossman M, Eslinger PJ, Troiani V, Anderson C, Avants B, Gee JC, McMillan C, Massimo L, Khan A, Antani S (2010) The role of ventral medial prefrontal cortex in social decisions: converging evidence from fMRI and frontotemporal lobar degeneration. Neuropsychologia 48:3505–3512. 10.1016/j.neuropsychologia.2010.07.036
Hare TA, Malmaud J, Rangel A (2011) Focusing attention on the health aspects of foods changes value signals in vmPFC and improves dietary choice. J Neurosci 31:11077–11087. 10.1523/JNEUROSCI.6383-10.2011
Hiser J, Koenigs M (2018) The multifaceted role of ventromedial prefrontal cortex in emotion, decision-making, social cognition, and psychopathology. Biol Psychiatry 83:638–647. 10.1016/j.biopsych.2017.10.030
Howe MW, Tierney PL, Sandberg SG, Phillips PE, Graybiel AM (2013) Prolonged dopamine signalling in striatum signals proximity and value of distant rewards. Nature 500:575–579. 10.1038/nature12475
Jacobsen T, Schubotz R, Höfel L, Cramon Y (2006) Brain correlates of aesthetic judgment of beauty. Neuroimage 29:276–285. 10.1016/j.neuroimage.2005.07.010
Kahneman D, Krueger AB, Schkade DA, Schwarz N, Stone AA (2004) A survey method for characterizing daily life experience: the day reconstruction method. Science 306:1776–1780. 10.1126/science.1103572
Kawabata H, Zeki S (2004) Neural correlates of beauty. J Neurophysiol 91:1699–1705. 10.1152/jn.00696.2003
Keatley D, Clarke DD, Hagger MS (2013) The predictive validity of implicit measures of self-determined motivation across health-related behaviours. Br J Health Psychol 18:2–17. 10.1111/j.2044-8287.2011.02063.x
Klein-Flügge MC, Kennerley SW, Saraiva AC, Penny WD, Bestmann S (2015) Behavioral modeling of human choices reveals dissociable effects of physical effort and temporal delay on reward devaluation. PLoS Comput Biol 11:e1004116. 10.1371/journal.pcbi.1004116
Koenigs M, Young L, Adolphs R, Tranel D, Cushman F, Hauser M, Damasio A (2007) Damage to the prefrontal cortex increases utilitarian moral judgements. Nature 446:908–911. 10.1038/nature05631
Levesque C, Copeland KJ, Sutcliffe RA (2008) Conscious and nonconscious processes: implications for self-determination theory. Can Psychol 49:218–224. 10.1037/a0012756
Levy DJ, Glimcher PW (2011) Comparing apples and oranges: using reward-specific and reward-general subjective value representation in the brain. J Neurosci 31:14693–14707. 10.1523/JNEUROSCI.2218-11.2011
Levy DJ, Glimcher PW (2012) The root of all value: a neural common currency for choice. Curr Opin Neurobiol 22:1027–1038. 10.1016/j.conb.2012.06.001
Marin RS, Biedrzycki RC, Firinciogullari S (1991) Reliability and validity of the apathy evaluation scale. Psychiatry Res 38:143–162. 10.1016/0165-1781(91)90040-v
Meyvis T, Ratner RK, Levav J (2010) Why don't we learn to accurately forecast feelings? How misremembering our predictions blinds us to past forecasting errors. J Exp Psychol Gen 139:579–589. 10.1037/a0020285
Morewedge CK, Buechel EC (2013) Motivated underpinnings of the impact bias in affective forecasts. Emotion 13:1023–1029. 10.1037/a0033797
Nisbet EK, Zelenski JM (2011) Underestimating nearby nature: affective forecasting errors obscure the happy path to sustainability. Psychol Sci 22:1101–1106. 10.1177/0956797611418527
OECD (2020) How's life? 2020: measuring well-being. OECD iLibrary. Available at: https://www.oecd.org/statistics/how-s-life-23089679.htm#:∼:text=Measuring%20Well%2Dbeing-,How's%20Life%3F,resources%20for%20future%20well%2Dbeing. Accessed March 22, 2021.
Oswald AJ, Wu S (2010) Objective confirmation of subjective measures of human well-being: evidence from the U.S.A. Science 327:576–579. 10.1126/science.1180606
Oswald AJ, Proto E, Sgroi D (2015) Happiness and productivity. J Labor Econ 33:789–822. 10.1086/681096
Padoa-Schioppa C (2007) Orbitofrontal cortex and the computation of economic value. Ann NY Acad Sci 1121:232–253. 10.1196/annals.1401.011
Pelled LH, Xin KR (1999) Down and out: an investigation of the relationship between mood and employee withdrawal behavior. J Management 25:875–895. 10.1177/014920639902500605
Peterson S, Luthans F, Avolio BJ, Walumbwa FO, Zhang Z (2011) Psychological capital and employee performance: a latent growth modeling approach. Pers Psychol 64:427–450. 10.1111/j.1744-6570.2011.01215.x
Plassmann H, O'Doherty J, Rangel A (2007) Orbitofrontal cortex encodes willingness to pay in everyday economic transactions. J Neurosci 27:9984–9988. 10.1523/JNEUROSCI.2131-07.2007
Plassmann H, O'Doherty J, Shiv B, Rangel A (2008) Marketing actions can modulate neural representations of experienced pleasantness. Proc Natl Acad Sci USA 105:1050–1054. 10.1073/pnas.0706929105
Reis HT, Gable SL (2000) Event-sampling and other methods for studying everyday experience. In: Handbook of research methods in social and personality psychology (Reis HT, Judd CM, eds), pp 190–222. Cambridge University Press.
Rutledge RB, Skandali N, Dayan P, Dolan RJ (2014) A computational and neural model of momentary subjective well-being. Proc Natl Acad Sci USA 111:12252–12257. 10.1073/pnas.1407535111
Rutledge RB, Skandali N, Dayan P, Dolan RJ (2015) Dopaminergic modulation of decision making and subjective well-being. J Neurosci 35:9811–9822. 10.1523/JNEUROSCI.0702-15.2015
Rutledge RB, de Berker AO, Espenhahn S, Dayan P, Dolan RJ (2016) The social contingency of momentary subjective well-being. Nat Commun 7:11825. 10.1038/ncomms11825
Rutledge RB, Moutoussis M, Smittenaar P, Zeidman P, Taylor T, Hrynkiewicz L, Lam J, Skandali N, Siegel JZ, Ousdal OT, Prabhu G, Dayan P, Fonagy P, Dolan RJ (2017) Association of neural and emotional impacts of reward prediction errors with major depression. JAMA Psychiatry 74:790–797. 10.1001/jamapsychiatry.2017.1713
Smith DV, Hayden BY, Truong T-K, Song AW, Platt ML, Huettel SA (2010) Distinct value signals in anterior and posterior ventromedial prefrontal cortex. J Neurosci 30:2490–2495. 10.1523/JNEUROSCI.3319-09.2010
Son LK, Metcalfe J (2005) Judgments of learning: evidence for a two-stage process. Mem Cognit 33:1116–1129. 10.3758/bf03193217
Son LK, Sethi R (2006) Metacognitive control and optimal learning. Cogn Sci 30:759–774. 10.1207/s15516709cog0000_74
Spaniol J, Di Muro F, Ciaramelli E (2019) Differential impact of ventromedial prefrontal cortex damage on “hot” and “cold” decisions under risk. Cogn Affect Behav Neurosci 19:477–489. 10.3758/s13415-018-00680-1
Steptoe A, Deaton A, Stone AA (2015) Subjective wellbeing, health, and ageing. Lancet 385:640–648. 10.1016/S0140-6736(13)61489-0
Sugawara SK, Tanaka S, Okazaki S, Watanabe K, Sadato N (2012) Social rewards enhance offline improvements in motor skill. PLoS One 7:e48174. 10.1371/journal.pone.0048174
Taylor M, Creelman CD (1967) PEST: efficient estimates on probability functions. J Acoust Soc Am 41:782–787. 10.1121/1.1910407
Van de Mortel TF (2008) Faking it: social desirability response bias in self-report research. Austr J Adv Nurs 25:40.
Vinckier F, Rigoux L, Oudiette D, Pessiglione M (2018) Neuro-computational account of how mood fluctuations arise and affect decision making. Nat Commun 9:1708. 10.1038/s41467-018-03774-z
Wilson RC, Collins AG (2019) Ten simple rules for the computational modeling of behavioral data. Elife 8:e49547. 10.7554/eLife.49547
Wilson T, Gilbert DT (2005) Affective forecasting: knowing what to want. Curr Dir Psychol Sci 14:131–134. 10.1111/j.0963-7214.2005.00355.x
Winecoff A, Clithero JA, Carter RM, Bergman SR, Wang L, Huettel SA (2013) Ventromedial prefrontal cortex encodes emotional value. J Neurosci 33:11032–11039. 10.1523/JNEUROSCI.4317-12.2013
Zald DH, Andreotti C (2010) Neuropsychological assessment of the orbital and ventromedial prefrontal cortex. Neuropsychologia 48:3377–3391. 10.1016/j.neuropsychologia.2010.08.012
Zald DH, Mattson DL, Pardo JV (2002) Brain activity in ventromedial prefrontal cortex correlates with individual differences in negative affect. Proc Natl Acad Sci USA 99:2450–2454. 10.1073/pnas.042457199

[B1] Ang YS, Lockwood P, Apps MA, Muhammed K, Husain M (2017) Distinct subtypes of apathy revealed by the apathy motivation index. PLoS One 12:e0169938. 10.1371/journal.pone.0169938 [DOI] [PMC free article] [PubMed] [Google Scholar]

[B2] Bartra O, McGuire JT, Kable JW (2013) The valuation system: a coordinate-based meta-analysis of BOLD fMRI experiments examining neural correlates of subjective value. Neuroimage 76:412–427. 10.1016/j.neuroimage.2013.02.063 [DOI] [PMC free article] [PubMed] [Google Scholar]

[B3] Beck AT, Steer RA, Ball R, Ranieri WF (1996) Comparison of Beck Depression Inventories-ia and-ii in psychiatric outpatients. J Pers Assess 67:588–597. 10.1207/s15327752jpa6703_13 [DOI] [PubMed] [Google Scholar]

[B4] Blain B, Rutledge RB (2020) Momentary subjective well-being depends on learning and not reward. Elife 9:e57977. 10.7554/eLife.57977 [DOI] [PMC free article] [PubMed] [Google Scholar]

[B5] Blain B, Sharot T (2021) Intrinsic reward: potential cognitive and neural mechanisms. Curr Opin Behav Sci 39:113–118. 10.1016/j.cobeha.2021.03.008 [DOI] [Google Scholar]

[B6] Blood AJ, Zatorre RJ (2001) Intensely pleasurable responses to music correlate with activity in brain regions implicated in reward and emotion. Proc Natl Acad Sci USA 98:11818–11823. 10.1073/pnas.191355898 [DOI] [PMC free article] [PubMed] [Google Scholar]

[B7] Chib VS, Rangel A, Shimojo S, O'Doherty JP (2009) Evidence for a common representation of decision values for dissimilar goods in human ventromedial prefrontal cortex. J Neurosci 29:12315–12320. 10.1523/JNEUROSCI.2575-09.2009 [DOI] [PMC free article] [PubMed] [Google Scholar]

[B8] Chida Y, Steptoe A (2008) Positive psychological well-being and mortality: a quantitative review of prospective observational studies. Psychosom Med 70:741–756. 10.1097/PSY.0b013e31818105ba [DOI] [PubMed] [Google Scholar]

[B9] Clithero JA, Rangel A (2014) Informatic parcellation of the network involved in the computation of subjective value. Soc Cogn Affect Neurosci 9:1289–1302. 10.1093/scan/nst106 [DOI] [PMC free article] [PubMed] [Google Scholar]

[B10] Davidson KW, Mostofsky E, Whang W (2010) Don't worry, be happy: positive affect and reduced 10-year incident coronary heart disease: the Canadian Nova Scotia Health Survey. Eur Heart J 31:1065–1070. 10.1093/eurheartj/ehp603 [DOI] [PMC free article] [PubMed] [Google Scholar]

[B11] De Martino B, Kumaran D, Seymour B, Dolan RJ (2006) Frames, biases, and rational decision-making in the human brain. Science 313:684–687. 10.1126/science.1128356 [DOI] [PMC free article] [PubMed] [Google Scholar]

[B12] Dolan P, White MP (2007) How can measures of subjective well-being be used to inform public policy? Perspect Psychol Sci 2:71–85. 10.1111/j.1745-6916.2007.00030.x [DOI] [PubMed] [Google Scholar]

[B13] Edgeworth FY (1881) Mathematical psychics: an essay on the application of mathematics to the moral sciences. London: Kegan Paul. [Google Scholar]

[B14] Eldar E, Rutledge RB, Dolan RJ, Niv Y (2016) Mood as representation of momentum. Trends in cognitive sciences 20:15–24. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B15] Eldar E, Roth C, Dayan P, Dolan RJ (2018) Decodability of reward learning signals predicts mood fluctuations. Current Biology 28:1433–1439. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B16] FitzGerald THB, Seymour B, Dolan RJ (2009) The role of human orbitofrontal cortex in value comparison for incommensurable objects. J Neurosci 29:8388–8395. 10.1523/JNEUROSCI.0717-09.2009 [DOI] [PMC free article] [PubMed] [Google Scholar]

[B17] Fleming SM, Weil RS, Nagy Z, Dolan RJ, Rees G (2010) Relating introspective accuracy to individual differences in brain structure. Science 329:1541–1543. 10.1126/science.1191883 [DOI] [PMC free article] [PubMed] [Google Scholar]

[B18] Gilbert DT, Wilson TD (2007) Prospection: experiencing the future. Science 317:1351–1354. 10.1126/science.1144161 [DOI] [PubMed] [Google Scholar]

[B19] Gottlieb J, Oudeyer P-Y (2018) Towards a neuroscience of active sampling and curiosity. Nat Rev Neurosci 19:758–770. 10.1038/s41583-018-0078-0 [DOI] [PubMed] [Google Scholar]

[B20] Grossman M, Eslinger PJ, Troiani V, Anderson C, Avants B, Gee JC, McMillan C, Massimo L, Khan A, Antani S (2010) The role of ventral medial prefrontal cortex in social decisions: converging evidence from fMRI and frontotemporal lobar degeneration. Neuropsychologia 48:3505–3512. 10.1016/j.neuropsychologia.2010.07.036 [DOI] [PMC free article] [PubMed] [Google Scholar]

[B21] Hare TA, Malmaud J, Rangel A (2011) Focusing attention on the health aspects of foods changes value signals in vmPFC and improves dietary choice. J Neurosci 31:11077–11087. 10.1523/JNEUROSCI.6383-10.2011 [DOI] [PMC free article] [PubMed] [Google Scholar]

[B22] Hiser J, Koenigs M (2018) The multifaceted role of ventromedial prefrontal cortex in emotion, decision-making, social cognition, and psychopathology. Biol Psychiatry 83:638–647. 10.1016/j.biopsych.2017.10.030 [DOI] [PMC free article] [PubMed] [Google Scholar]

[B23] Howe MW, Tierney PL, Sandberg SG, Phillips PE, Graybiel AM (2013) Prolonged dopamine signalling in striatum signals proximity and value of distant rewards. Nature 500:575–579. 10.1038/nature12475 [DOI] [PMC free article] [PubMed] [Google Scholar]

[B24] Jacobsen T, Schubotz R, Höfel L, Cramon Y (2006) Brain correlates of aesthetic judgment of beauty. Neuroimage 29:276–285. 10.1016/j.neuroimage.2005.07.010 [DOI] [PubMed] [Google Scholar]

[B25] Kahneman D, Krueger AB, Schkade DA, Schwarz N, Stone AA (2004) A survey method for characterizing daily life experience: the day reconstruction method. Science 306:1776–1780. 10.1126/science.1103572 [DOI] [PubMed] [Google Scholar]

[B26] Kawabata H, Zeki S (2004) Neural correlates of beauty. J Neurophysiol 91:1699–1705. 10.1152/jn.00696.2003 [DOI] [PubMed] [Google Scholar]

[B27] Keatley D, Clarke DD, Hagger MS (2013) The predictive validity of implicit measures of self-determined motivation across health-related behaviours. Br J Health Psychol 18:2–17. 10.1111/j.2044-8287.2011.02063.x [DOI] [PubMed] [Google Scholar]

[B28] Klein-Flügge MC, Kennerley SW, Saraiva AC, Penny WD, Bestmann S (2015) Behavioral modeling of human choices reveals dissociable effects of physical effort and temporal delay on reward devaluation. PLoS Comput Biol 11:e1004116. 10.1371/journal.pcbi.1004116 [DOI] [PMC free article] [PubMed] [Google Scholar]

[B29] Koenigs M, Young L, Adolphs R, Tranel D, Cushman F, Hauser M, Damasio A (2007) Damage to the prefrontal cortex increases utilitarian moral judgements. Nature 446:908–911. 10.1038/nature05631 [DOI] [PMC free article] [PubMed] [Google Scholar]

[B30] Levesque C, Copeland KJ, Sutcliffe RA (2008) Conscious and nonconscious processes: implications for self-determination theory. Can Psychol 49:218–224. 10.1037/a0012756 [DOI] [Google Scholar]

[B31] Levy DJ, Glimcher PW (2011) Comparing apples and oranges: using reward-specific and reward-general subjective value representation in the brain. J Neurosci 31:14693–14707. 10.1523/JNEUROSCI.2218-11.2011 [DOI] [PMC free article] [PubMed] [Google Scholar]

[B32] Levy DJ, Glimcher PW (2012) The root of all value: a neural common currency for choice. Curr Opin Neurobiol 22:1027–1038. 10.1016/j.conb.2012.06.001 [DOI] [PMC free article] [PubMed] [Google Scholar]

[B33] Marin RS, Biedrzycki RC, Firinciogullari S (1991) Reliability and validity of the apathy evaluation scale. Psychiatry Res 38:143–162. 10.1016/0165-1781(91)90040-v [DOI] [PubMed] [Google Scholar]

[B34] Meyvis T, Ratner RK, Levav J (2010) Why don't we learn to accurately forecast feelings? How misremembering our predictions blinds us to past forecasting errors. J Exp Psychol Gen 139:579–589. 10.1037/a0020285 [DOI] [PubMed] [Google Scholar]

[B35] Morewedge CK, Buechel EC (2013) Motivated underpinnings of the impact bias in affective forecasts. Emotion 13:1023–1029. 10.1037/a0033797 [DOI] [PubMed] [Google Scholar]

[B36] Nisbet EK, Zelenski JM (2011) Underestimating nearby nature: affective forecasting errors obscure the happy path to sustainability. Psychol Sci 22:1101–1106. 10.1177/0956797611418527 [DOI] [PubMed] [Google Scholar]

[B37] OECD (2020) How's life? 2020: measuring well-being. OECD iLibrary. Available at: https://www.oecd.org/statistics/how-s-life-23089679.htm#:∼:text=Measuring%20Well%2Dbeing-,How's%20Life%3F,resources%20for%20future%20well%2Dbeing. Accessed March 22, 2021.

[B38] Oswald AJ, Wu S (2010) Objective confirmation of subjective measures of human well-being: evidence from the U.S.A. Science 327:576–579. 10.1126/science.1180606 [DOI] [PubMed] [Google Scholar]

[B39] Oswald AJ, Proto E, Sgroi D (2015) Happiness and productivity. J Labor Econ 33:789–822. 10.1086/681096

[B40] Padoa-Schioppa C (2007) Orbitofrontal cortex and the computation of economic value. Ann NY Acad Sci 1121:232–253. 10.1196/annals.1401.011

[B41] Pelled LH, Xin KR (1999) Down and out: an investigation of the relationship between mood and employee withdrawal behavior. J Management 25:875–895. 10.1177/014920639902500605

[B42] Peterson S, Luthans F, Avolio BJ, Walumbwa FO, Zhang Z (2011) Psychological capital and employee performance: a latent growth modeling approach. Pers Psychol 64:427–450. 10.1111/j.1744-6570.2011.01215.x

[B43] Plassmann H, O'Doherty J, Rangel A (2007) Orbitofrontal cortex encodes willingness to pay in everyday economic transactions. J Neurosci 27:9984–9988. 10.1523/JNEUROSCI.2131-07.2007

[B44] Plassmann H, O'Doherty J, Shiv B, Rangel A (2008) Marketing actions can modulate neural representations of experienced pleasantness. Proc Natl Acad Sci USA 105:1050–1054. 10.1073/pnas.0706929105

[B45] Reis HT, Gable SL (2000) Event-sampling and other methods for studying everyday experience. In: Handbook of research methods in social and personality psychology (Reis HT, Judd CM, eds), pp 190–222. Cambridge University Press.

[B46] Rutledge RB, Skandali N, Dayan P, Dolan RJ (2014) A computational and neural model of momentary subjective well-being. Proc Natl Acad Sci USA 111:12252–12257. 10.1073/pnas.1407535111

[B47] Rutledge RB, Skandali N, Dayan P, Dolan RJ (2015) Dopaminergic modulation of decision making and subjective well-being. J Neurosci 35:9811–9822. 10.1523/JNEUROSCI.0702-15.2015

[B48] Rutledge RB, de Berker AO, Espenhahn S, Dayan P, Dolan RJ (2016) The social contingency of momentary subjective well-being. Nat Commun 7:11825. 10.1038/ncomms11825

[B49] Rutledge RB, Moutoussis M, Smittenaar P, Zeidman P, Taylor T, Hrynkiewicz L, Lam J, Skandali N, Siegel JZ, Ousdal OT, Prabhu G, Dayan P, Fonagy P, Dolan RJ (2017) Association of neural and emotional impacts of reward prediction errors with major depression. JAMA Psychiatry 74:790–797. 10.1001/jamapsychiatry.2017.1713

[B50] Smith DV, Hayden BY, Truong T-K, Song AW, Platt ML, Huettel SA (2010) Distinct value signals in anterior and posterior ventromedial prefrontal cortex. J Neurosci 30:2490–2495. 10.1523/JNEUROSCI.3319-09.2010

[B51] Son LK, Metcalfe J (2005) Judgments of learning: evidence for a two-stage process. Mem Cognit 33:1116–1129. 10.3758/bf03193217

[B52] Son LK, Sethi R (2006) Metacognitive control and optimal learning. Cogn Sci 30:759–774. 10.1207/s15516709cog0000_74

[B53] Spaniol J, Di Muro F, Ciaramelli E (2019) Differential impact of ventromedial prefrontal cortex damage on “hot” and “cold” decisions under risk. Cogn Affect Behav Neurosci 19:477–489. 10.3758/s13415-018-00680-1

[B54] Steptoe A, Deaton A, Stone AA (2015) Subjective wellbeing, health, and ageing. Lancet 385:640–648. 10.1016/S0140-6736(13)61489-0

[B55] Sugawara SK, Tanaka S, Okazaki S, Watanabe K, Sadato N (2012) Social rewards enhance offline improvements in motor skill. PLoS One 7:e48174. 10.1371/journal.pone.0048174

[B56] Taylor M, Creelman CD (1967) PEST: efficient estimates on probability functions. J Acoust Soc Am 41:782–787. 10.1121/1.1910407

[B57] Van de Mortel TF (2008) Faking it: social desirability response bias in self-report research. Austr J Adv Nurs 25:40.

[B58] Vinckier F, Rigoux L, Oudiette D, Pessiglione M (2018) Neuro-computational account of how mood fluctuations arise and affect decision making. Nat Commun 9:1708. 10.1038/s41467-018-03774-z

[B59] Wilson RC, Collins AG (2019) Ten simple rules for the computational modeling of behavioral data. Elife 8:e49547. 10.7554/eLife.49547

[B60] Wilson T, Gilbert DT (2005) Affective forecasting: knowing what to want. Curr Dir Psychol Sci 14:131–134. 10.1111/j.0963-7214.2005.00355.x

[B61] Winecoff A, Clithero JA, Carter RM, Bergman SR, Wang L, Huettel SA (2013) Ventromedial prefrontal cortex encodes emotional value. J Neurosci 33:11032–11039. 10.1523/JNEUROSCI.4317-12.2013

[B62] Zald DH, Andreotti C (2010) Neuropsychological assessment of the orbital and ventromedial prefrontal cortex. Neuropsychologia 48:3377–3391. 10.1016/j.neuropsychologia.2010.08.012

[B63] Zald DH, Mattson DL, Pardo JV (2002) Brain activity in ventromedial prefrontal cortex correlates with individual differences in negative affect. Proc Natl Acad Sci USA 99:2450–2454. 10.1073/pnas.042457199
