Selecting and Executing Actions for Rewards

Pierre Vassiliadis; Gerard Derosiere

J Neurosci. 2020 Aug 19;40(34):6474–6476. doi: 10.1523/JNEUROSCI.1250-20.2020

PMCID: PMC7486658 PMID: 32817389

Rewards shape human actions. The mere possibility of earning a reward induces substantial improvements in the way we choose and execute actions (Chen et al., 2017). This observation has raised hope for rehabilitation: reward is regarded as a promising means to magnify the positive effects of practice on motor control (Quattrocchi et al., 2017). Yet this line of research is still in its early stages, and neuroscientists have yet to identify the mechanisms through which reward improves movements.

At present, two distinct spheres of study have provided insights into how reward improves motor control. First, studies on action selection show that reward can speed up reaction times (RTs; the time elapsed between stimulus presentation and action initiation; Klein et al., 2012) and enhance selection accuracy (subjects select the “right action” more often when reward is at stake; Derosiere et al., 2017a,b). Second, studies on action execution reveal a beneficial effect of reward on movement times (MTs; the time elapsed between action initiation and completion; Reppert et al., 2015) and execution accuracy (e.g., subjects execute faster and more precise movements when reaching to a rewarding target) (Manohar et al., 2019). Strikingly, most work investigating the effects of reward on action selection and execution has examined these effects in separate studies (Chen et al., 2017), impeding the genesis of an integrative understanding of how reward shapes the two processes in more natural settings.

Even when considered in isolation, the precise mechanisms underlying the effects of reward on selection and execution processes have remained obscure. An important gap in our knowledge concerns how reward improves execution accuracy. One possibility is that the presence of reward increases limb stiffness, enhancing the resistance of the moving effectors to internal and external perturbations (Gribble et al., 2003) and ultimately reducing movement variability (so-called “motor noise”; Manohar et al., 2015). Yet, the contribution of stiffness to reward-driven improvements in execution accuracy has been speculative.

In a recent article published in The Journal of Neuroscience, Codol et al. (2020) addressed the two issues mentioned above. In a series of experiments, the authors asked human subjects to use reaching movements to displace a manipulandum from a starting position to one of four target locations. Before starting each movement, subjects were informed of the maximum reward they could obtain in the trial (0, 10, or 50 pence). In 10p and 50p trials, the magnitude of the reward ultimately obtained by the subject depended on their performance in the trial (see below).

The first aim of the study was to test the impact of reward on the speed and accuracy of action selection and execution in a single setting. To do so, on 10p and 50p trials, Codol et al. (2020) provided rewards that were inversely proportional to the RT and the MT combined (reflecting the speed of selection and execution processes, respectively). Reward magnitude also depended on the accuracy of both selection and execution. Importantly, some trials required subjects to ignore distractor cues; initiating a movement toward these cues was classified as a selection error and thus unrewarded. Furthermore, trials on which the final position of the manipulandum fell >4 cm away from the target center were classified as execution errors and thus unrewarded. Hence, to maximize reward in 10p and 50p trials, participants had to select and execute reaching movements as quickly as possible while keeping both selection and execution accuracy high.
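The reward contingencies described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the commentary does not give the exact payoff function, so the inverse-proportional form, the `time_scale_s` constant, the cap at the announced maximum, and all names (`Trial`, `trial_reward`) are hypothetical. Only the 0/10/50 pence levels, the distractor rule, and the 4 cm threshold come from the text.

```python
from dataclasses import dataclass

MAX_ENDPOINT_ERROR_CM = 4.0  # execution-error threshold stated in the text


@dataclass
class Trial:
    max_reward_pence: float        # 0, 10, or 50, announced before the movement
    rt_s: float                    # reaction time in seconds
    mt_s: float                    # movement time in seconds
    moved_toward_distractor: bool  # True if the movement was initiated toward a distractor
    endpoint_error_cm: float       # distance between final position and target center


def trial_reward(trial: Trial, time_scale_s: float = 0.5) -> float:
    """Hypothetical payoff: inversely proportional to RT + MT, zero on errors.

    `time_scale_s` is an assumed constant: a trial completed in exactly
    `time_scale_s` seconds (or less) earns the full announced reward.
    """
    if trial.max_reward_pence == 0:
        return 0.0
    if trial.moved_toward_distractor:                    # selection error
        return 0.0
    if trial.endpoint_error_cm > MAX_ENDPOINT_ERROR_CM:  # execution error
        return 0.0
    total_time_s = trial.rt_s + trial.mt_s
    if total_time_s <= 0:
        raise ValueError("RT + MT must be positive")
    payoff = trial.max_reward_pence * time_scale_s / total_time_s
    return min(trial.max_reward_pence, payoff)           # never exceed the announced maximum
```

Under this sketch, halving RT + MT doubles the payoff until the cap is reached, whereas a single selection or execution error forfeits the trial. That is the trade-off participants had to manage.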

A second objective of the study was to test the contribution of limb stiffness to reward-driven improvements in execution accuracy. To investigate this, the authors had subjects perform the same task as described above, with the addition that some trials involved a displacement of the manipulandum after movement completion, pushing the subject's arm away from the target. Arm stiffness was evaluated by measuring the amount of force exerted by the subject during this perturbation. The authors were able to assess the impact of reward on stiffness by comparing this measure of force in 50p versus 0p trials. In a control experiment, the authors also tested the effect of reward on arm stiffness before movement initiation (i.e., the displacement of the manipulandum pushed the subject's arm away from the starting position).
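The comparison described above can be illustrated with a minimal Python sketch. It assumes, for illustration only, that stiffness is summarized as resistive force divided by the imposed displacement and that conditions are compared by a difference of means; the commentary says only that force during the perturbation was measured and compared between 50p and 0p trials. The function names and the `(reward_pence, force_n)` input format are hypothetical.

```python
from statistics import mean
from typing import Iterable, Tuple


def stiffness_proxy(force_n: float, displacement_m: float) -> float:
    """Resistive force per unit of imposed displacement (N/m); an assumed summary measure."""
    if displacement_m <= 0:
        raise ValueError("displacement must be positive")
    return force_n / displacement_m


def reward_effect_on_stiffness(
    trials: Iterable[Tuple[int, float]], displacement_m: float
) -> float:
    """Mean stiffness proxy in 50p trials minus mean stiffness proxy in 0p trials.

    `trials` holds (reward_pence, force_n) pairs from perturbation trials.
    A positive return value means the arm was stiffer when reward was at stake.
    """
    by_reward = {}
    for reward_pence, force_n in trials:
        by_reward.setdefault(reward_pence, []).append(
            stiffness_proxy(force_n, displacement_m)
        )
    if 0 not in by_reward or 50 not in by_reward:
        raise ValueError("need at least one 0p and one 50p perturbation trial")
    return mean(by_reward[50]) - mean(by_reward[0])
```

The same function could be applied separately to perturbations delivered at the end of the reach and before movement initiation, mirroring the main and control experiments.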

The results indicated that, when considered in a single task, reward can have a dissociable impact on action selection and execution. Indeed, 10p and 50p trials were not associated with any change in selection speed (i.e., no significant effect on RTs, compared with 0p trials), but entailed a boost of execution speed (i.e., a reduction in MTs). Conversely, selection accuracy was enhanced in rewarded trials (i.e., a smaller proportion of movements were initiated toward distractor cues than on 0p trials), whereas execution accuracy remained unchanged (i.e., the deviation between the manipulandum final position and the target center was stable). Interestingly, computational analyses revealed that the maintenance of high execution accuracy in rewarded trials (despite faster MTs) could be in part attributed to a reduction in motor noise. Most importantly, this reduction in motor noise was associated with a substantial increase in arm stiffness in 50p compared with 0p trials that was observed specifically at the end of the reaching movement (and not before movement initiation), thus confirming the contribution of endpoint stiffness to reward-driven improvements in execution accuracy.

The dissociable effect of reward on selection and execution speeds is striking. Indeed, one major framework in motor neuroscience views selection and execution processes as part of a continuum with a shared neural basis centered on the motor system (Cisek, 2007). In this view, altered activity in specific neural structures (e.g., in the case of reward processing, midbrain dopaminergic neurons; Schultz, 2015) could produce changes in both selection and execution processes at the behavioral level. The roots of this idea lie so deep within the field that researchers often consider RTs and MTs together as a single measure, thought to reflect action vigor (Shadmehr et al., 2019). The findings of Codol et al. (2020) ask us to carefully reconsider this view, suggesting that, in some conditions, the speed of action selection and execution can be regulated by independent (yet likely interacting) neural structures. Consistent with this hypothesis, a recent study revealed the existence of distinct subpopulations of midbrain dopaminergic neurons, with some cells encoding behavioral choice and others sensitive to movement features (Engelhard et al., 2019).

An alternative explanation for the lack of effect of reward on RTs may arise, however, if one concedes that this measure reflects not only the speed of action selection but also the rapidity of sensory processing (Haith et al., 2016; Vassiliadis et al., 2020), and that reward could have affected these two processes in opposite ways. Indeed, the task described above placed a considerable demand on sensory processing, as it required participants to discriminate between four target locations and, in some trials, to avoid distractor cues. This time-consuming process relies on attentional mechanisms that amplify and suppress neural responses in visual neurons encoding target and distractor cues, respectively (Itthipuripat et al., 2019). The prospect of reward may have strengthened the emphasis on such attentional mechanisms, slowing them down so that more time was taken to sharpen visual activity. Importantly, this interpretation offers a potential mechanistic explanation for how subjects may have improved selection accuracy in rewarded trials. Notably, if such a scenario holds true, the lack of effect of reward on RTs may have emerged from a concomitant, antagonistic hastening of action selection. In this case, the increase in selection speed would be concurrent with the boost of execution speed, and would therefore be in accordance with the continuum framework mentioned above. This hypothesis suggests new avenues of research, aiming to disentangle the effects of reward on the different processes occurring between sensation and action.
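The cancellation argument in the preceding paragraph can be written out as a short worked equation. This is a sketch under an assumption that the commentary does not state, namely that sensory processing and action selection are strictly serial stages whose durations add up to the RT. The symbols T_sens, T_sel, and δ are introduced here for illustration only.

```latex
\begin{align}
\mathrm{RT} &= T_{\mathrm{sens}} + T_{\mathrm{sel}} \\
\Delta\mathrm{RT} &= \Delta T_{\mathrm{sens}} + \Delta T_{\mathrm{sel}} \\
\Delta T_{\mathrm{sens}} = +\delta,\quad \Delta T_{\mathrm{sel}} = -\delta
  \;&\Rightarrow\; \Delta\mathrm{RT} = 0, \qquad \delta > 0
\end{align}
```

If reward lengthens sensory processing by δ while shortening selection by a similar amount, the measured RT would show no net change even though both underlying processes are modulated. A null effect on RT alone therefore cannot distinguish this scenario from a genuine absence of reward effects on selection speed.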

Another important finding of the study is that reward reduced motor noise through increased limb stiffness, limiting the potential negative consequence of high execution speed on accuracy. Interestingly, the movement pattern reported by Codol et al. (2020), a parallel increase in movement speed and stiffness during rewarded trials, is similar to that observed when participants are exposed to unpredictable perturbations of their movements during execution (Crevecoeur et al., 2019). This pattern is thought to reflect the implementation of a specific strategy of the motor system (so-called “robust strategy”), minimizing the impact of perturbations on action execution in uncertain environments (Bian et al., 2020). Critically, the results of Codol et al. (2020) suggest that the presence of reward also influences the reliance on such a robust strategy. More generally, the reliance of the motor system on this strategy may depend on the expected outcome of a movement: it increases both when the risk of execution failure is high (i.e., in uncertain environments) and when adequate execution can lead to a reward.

The finding of a reward-driven increase in stiffness has at least two major implications for the development of rehabilitation protocols. First, high stiffness may induce muscular fatigue, a process that might reduce the magnitude of rehabilitative learning (Branscheidt et al., 2019). Therefore, therapists should track patients' fatigue systematically when training involves reward. Second, the ability to regulate limb stiffness could be a relevant marker of whether a patient may or may not benefit from reward-based rehabilitation. For instance, patients with excessive stiffness (e.g., because of poststroke spasticity) may not display the reward-driven improvements in execution reported by Codol et al. (2020), at least not without appropriate antispastic treatment.

In conclusion, the study by Codol et al. (2020) builds on timely questions regarding the mechanisms underlying the impact of reward on motor control. In a series of experiments, the authors show that the presence of reward can have dissociable impacts on action selection and execution, with effects on the latter process associated with increased arm stiffness. As we discussed, these findings provide mechanistic insights and have implications for future clinical translation.

Footnotes

Editor's Note: These short reviews of recent JNeurosci articles, written exclusively by students or postdoctoral fellows, summarize the important findings of the paper and provide additional insight and commentary. If the authors of the highlighted article have written a response to the Journal Club, the response can be found by viewing the Journal Club at www.jneurosci.org. For more information on the format, review process, and purpose of Journal Club articles, please see http://jneurosci.org/content/jneurosci-journal-club.

P.V. was a PhD student supported by the Fund for Research Training in Industry and Agriculture (FRIA/Fonds National de la Recherche Scientifique). G.D. was a postdoctoral fellow supported by the Belgian National Funds for Scientific Research (Fonds National de la Recherche Scientifique).

The authors declare no competing financial interests.

References

Bian T, Wolpert DM, Jiang ZP (2020) Model-free robust optimal feedback mechanisms of biological motor control. Neural Comput 32:562–595. 10.1162/neco_a_01260
Branscheidt M, Kassavetis P, Anaya M, Rogers D, Huang HD, Lindquist MA, Celnik P (2019) Fatigue induces long-lasting detrimental changes in motor-skill learning. Elife 8:e40578. 10.7554/eLife.40578
Chen X, Holland P, Galea JM (2017) The effects of reward and punishment on motor skill learning. Curr Opin Behav Sci 20:83–88. 10.1016/j.cobeha.2017.11.011
Cisek P (2007) Cortical mechanisms of action selection: the affordance competition hypothesis. Philos Trans R Soc Lond B Biol Sci 362:1585–1599. 10.1098/rstb.2007.2054
Codol O, Holland PJ, Manohar SG, Galea JM (2020) Reward-based improvements in motor control are driven by multiple error-reducing mechanisms. J Neurosci 40:3604–3620. 10.1523/JNEUROSCI.2646-19.2020
Crevecoeur F, Scott SH, Cluff T (2019) Robust control in human reaching movements: a model-free strategy to compensate for unpredictable disturbances. J Neurosci 39:8135–8148. 10.1523/JNEUROSCI.0770-19.2019
Derosiere G, Vassiliadis P, Demaret S, Zénon A, Duque J (2017a) Learning stage-dependent effect of M1 disruption on value-based motor decisions. Neuroimage 162:173–185. 10.1016/j.neuroimage.2017.08.075
Derosiere G, Zénon A, Alamia A, Duque J (2017b) Primary motor cortex contributes to the implementation of implicit value-based rules during motor decisions. Neuroimage 146:1115–1127. 10.1016/j.neuroimage.2016.10.010
Engelhard B, Finkelstein J, Cox J, Fleming W, Jang HJ, Ornelas S, Koay SA, Thiberge SY, Daw ND, Tank DW, Witten IB (2019) Specialized coding of sensory, motor and cognitive variables in VTA dopamine neurons. Nature 570:509–513. 10.1038/s41586-019-1261-9
Gribble PL, Mullin LI, Cothros N, Mattar A (2003) Role of cocontraction in arm movement accuracy. J Neurophysiol 89:2396–2405. 10.1152/jn.01020.2002
Haith AM, Pakpoor J, Krakauer JW (2016) Independence of movement preparation and movement initiation. J Neurosci 36:3007–3015. 10.1523/JNEUROSCI.3245-15.2016
Itthipuripat S, Vo VA, Sprague TC, Serences JT (2019) Value-driven attentional capture enhances distractor representations in early visual cortex. PLoS Biol 17:e3000186. 10.1371/journal.pbio.3000186
Klein PA, Olivier E, Duque J (2012) Influence of reward on corticospinal excitability during movement preparation. J Neurosci 32:18124–18136. 10.1523/JNEUROSCI.1701-12.2012
Manohar SG, Chong TT, Apps MA, Batla A, Stamelou M, Jarman PR, Bhatia KP, Husain M (2015) Reward pays the cost of noise reduction in motor and cognitive control. Curr Biol 25:1707–1716. 10.1016/j.cub.2015.05.038
Manohar SG, Muhammed K, Fallon SJ, Husain M (2019) Motivation dynamically increases noise resistance by internal feedback during movement. Neuropsychologia 123:19–29. 10.1016/j.neuropsychologia.2018.07.011
Quattrocchi G, Greenwood R, Rothwell JC, Galea JM, Bestmann S (2017) Reward and punishment enhance motor adaptation in stroke. J Neurol Neurosurg Psychiatry 88:730–736. 10.1136/jnnp-2016-314728
Reppert TR, Lempert KM, Glimcher PW, Shadmehr R (2015) Modulation of saccade vigor during value-based decision making. J Neurosci 35:15369–15378. 10.1523/JNEUROSCI.2621-15.2015
Schultz W (2015) Neuronal reward and decision signals: from theories to data. Physiol Rev 95:853–951. 10.1152/physrev.00023.2014
Shadmehr R, Reppert TR, Summerside EM, Yoon T, Ahmed AA (2019) Movement vigor as a reflection of subjective economic utility. Trends Neurosci 42:323–336.
Vassiliadis P, Derosiere G, Grandjean J, Duque J (2020) Motor training strengthens corticospinal suppression during movement preparation. bioRxiv 948877. 10.1101/2020.02.14.948877
