2022 Sep 27;56(10):5823–5835. doi: 10.1111/ejn.15826

The value of an action: Impact of motor behaviour on outcome processing and stimulus preference

Kotryna Bikute 1, Caroline Di Bernardi Luft 1, Frederike Beyer 1
PMCID: PMC9828266  PMID: 36114689

Abstract

While influences of Pavlovian associations on instrumental behaviour are well established, we still do not know how motor actions affect the formation of Pavlovian associations. To address this question, we designed a task in which participants were presented with neutral stimuli, half of which were paired with an active response, half with a passive waiting period. Stimuli had an 80% chance of predicting either a monetary gain or loss. We compared the feedback‐related negativity (FRN) in response to predictive stimuli and outcomes, as well as directed phase synchronization before and after outcome presentation between trials with versus without a motor response. We found a larger FRN amplitude in response to outcomes presented after a motor response (active trials). This effect was driven by a positive deflection in active reward trials, which was absent in passive reward trials. Connectivity analysis revealed that the motor action reversed the direction of the phase synchronization at the time of the feedback presentation: Top‐down information flow during the outcome anticipation phase in active trials, but bottom‐up information flow in passive trials. This main effect of action was mirrored in behavioural data showing that participants preferred stimuli associated with an active response. Our findings suggest an influence of neural systems that initiate motor actions on neural systems involved in reward processing. We suggest that motor actions might modulate the brain responses to feedback by affecting the dynamics of brain activity towards optimizing the processing of the resulting action outcome.

Keywords: FRN, motor action, Pavlovian conditioning, phase synchronization, punishment, reward

Short abstract

In this study, we investigated the influence of performing a motor action on the neural processing of monetary wins and losses. We found a larger FRN amplitude in response to outcomes presented after a motor response. Connectivity analysis revealed that the motor action reversed the direction of the phase synchronization at the time of the feedback presentation. Our findings suggest an influence of neural systems that initiate motor actions on neural systems involved in reward processing.

1. INTRODUCTION

Associative learning is generally divided into two distinct subtypes: Pavlovian learning—the formation of associations between stimuli—and instrumental learning—the formation of associations between actions and their outcomes. While Pavlovian and instrumental learning are understood as relatively independent processes (Fanselow & LeDoux, 1999; Ivlieva & Ivliev, 2014), there is also evidence for interaction between them. Reward‐associated cues can enhance instrumental responding (Cartoni et al., 2016), while aversive Pavlovian associations can facilitate avoidance behaviour (Lewis et al., 2013). There is evidence that certain actions are more amenable to Pavlovian conditioning than others: participants find it easier to learn a Go‐response to obtain rewards and a No‐Go‐response to avoid punishment than vice versa (Guitart‐Masip, Huys, et al., 2012; Swart et al., 2017). Thus, there is evidence for a directed connection between the two types of associative learning, such that established Pavlovian associations can influence instrumental actions (Guitart‐Masip, Huys, et al., 2012; Swart et al., 2017). However, we still do not know about influences in the opposite direction, that is, whether actions affect the formation of Pavlovian associations.

The influence of Pavlovian associations on instrumental behaviour is shown in two phenomena commonly observed in conditioning paradigms. First, conditioned stimuli in instrumental reward learning tasks can eventually become conditioned (or secondary) reinforcers (Skinner, 1951; Williams, 1994). That is, the stimulus associated with a reward takes on rewarding properties itself. Second, as mentioned above, existing Pavlovian associations can affect instrumental learning, as conditioned stimuli generally facilitate behaviour in line with the expected reward or punishment. This phenomenon is referred to as Pavlovian‐to‐instrumental transfer (Cartoni et al., 2016) and has played a crucial role in our understanding of instrumental behaviour. For example, observations of Pavlovian‐to‐instrumental transfer in animal models are central to current models of addiction (Robinson & Berridge, 2001). Yet, many classical conditioning paradigms contain instrumental elements in the form of active reward consumption, or avoidance responses against expected punishment, for example, freezing. Indeed, it has been argued that in the context of research on addiction, evidence for true Pavlovian‐to‐instrumental transfer is scarce, given the instrumental confounds present in classical conditioning paradigms (Lamb et al., 2016). Thus, it will be crucial to better understand bidirectional influences between the two types of learning.

A first step towards understanding the impact of a motor action on Pavlovian learning is understanding its impact on the processing of rewards and punishment outcomes. Previous studies investigated feedback‐guided learning using the feedback‐related negativity (FRN) as a measure of outcome processing (Burnside et al., 2019; Donkers et al., 2005; Miltner et al., 1997; Yeung et al., 2005), thought to emerge from the anterior cingulate cortex (ACC) and to reflect prediction error or conflict processing (San Martín, 2012). Previous studies suggest that the FRN is sensitive to the behavioural relevance of an outcome: FRN amplitudes tend to be larger for outcomes over which participants had control (e.g., by making an active choice between two or more competing responses), than for outcomes which are due to chance (Li et al., 2011), or in forced‐choice situations (Yeung et al., 2005). Similarly, Zhou and colleagues (Zhou et al., 2010) found enhanced FRN and P300 amplitudes when participants chose to change a previous decision, compared with trials in which they opted to stick with their initial choice. This suggests that changes of mind enhance outcome monitoring similarly to manipulations increasing control over action outcomes. Together, these studies suggest that the need for instrumental learning enhances outcome monitoring. They do not, however, allow for conclusions regarding how the action itself affects outcome monitoring, nor the impact of motor actions on Pavlovian learning, as in these studies, participants always made an active response, whether or not they made an active choice.

Other studies introduced passive conditions: they observed reduced (Yeung et al., 2005) or less specific (Burnside et al., 2019) FRN amplitudes, suggesting an impact of passivity on outcome monitoring. However, they lacked a control condition in which participants performed an action in the absence of choice (Burnside et al., 2019), or used blocked designs, in which participants were presented with series of passive trials (Yeung et al., 2005). Such passive blocks could have plausible effects on participants' attention, as they did not have to respond for prolonged periods. Thus, we cannot determine whether outcome processing is affected by the performance of a motor action itself, or by attentional processes related to task engagement.

We addressed this gap by designing a task in which active and passive trials are mixed randomly, and in which participants are presented with neutral stimuli predicting high probabilities of either winning or losing money. Crucially, in this task passive trials included no motor response, and active trials included a motor response in the absence of choice. This allowed for the direct investigation of the impact of motor behaviour itself on outcome processing.

Given that the FRN has been widely associated with outcome monitoring and reinforcement learning (San Martín, 2012), in this study we focused on FRN amplitude as a core measure of the neural processing of reward and punishment. If performing an action enhances outcome monitoring, this should be reflected in a stronger FRN in active, compared with passive trials. If this, in turn, is related to stronger associative learning, a stronger FRN should also be observed for active predictor stimuli, compared with passive predictors (Liao et al., 2011; Walsh & Anderson, 2011). We also expect that if making an action enhances associative learning, participants should show the strongest preference towards active stimuli.

To further understand how actions might affect anticipation and processing of outcomes, we compared the directed phase synchronization between trials with and without actions. Directed phase synchronization measures indicate how different areas communicate and in which direction. From the direction of the phase synchronization, we can infer the nature of the interaction involved, for example, bottom‐up versus top‐down (Nolte et al., 2008). Previous studies looking at performance monitoring connectivity patterns observed changes in phase synchronization between different brain areas in the theta frequency band (López et al., 2019; Luft et al., 2013; van de Vijver et al., 2011). The direction of the information flow was found to reverse after feedback is presented (Luft et al., 2013). In this study, we explored the possibility of action changing the direction of information flow before and after the outcomes were presented. We adopted an exploratory approach (nonparametric cluster permutation) because, due to the lack of studies on Pavlovian learning and directed phase synchronization, we could not precisely predict the specific electrode pairs and direction of synchronization.

2. METHODS

2.1. Participants

Eighty‐six participants took part in the study (13 male, age 18–26, mean age = 20.5), but data from six participants were lost due to technical failure. Participants were reimbursed for their time either with course credit (1 credit per 15 min), or payment (£7.50 per hour). Participants were predominantly university students; all were resident in the United Kingdom and fluent English speakers. All procedures were approved by the local ethics committee.

2.2. Experimental design and statistical analysis

This study used a randomized within‐subject experimental design, comparing the amplitude of the FRN between active and passive trials. Active and passive trials were randomized on a trial‐wise basis. A paired‐samples t‐test was used to compare the FRN amplitude between the two conditions. For analysis of neural and behavioural responses to predictive stimuli, 2 × 2 repeated measures analyses of variance (ANOVAs) with the within‐subject factors Action (active vs. passive) and Valence (high probability of Win vs. Loss) were used.

2.3. Tasks

The main task was a probabilistic learning task (Figure 1). At the beginning of each trial, participants saw a picture of a safe. The safe could either be closed (active condition) or open (passive condition). On top of the safe, a symbol taken from the Japanese Hiragana system was displayed. In the active condition, participants were instructed to press the space bar with their right hand to ‘open the safe’. After the button press, a fixation cross was shown for 0.5–1 s, before the outcome was shown for 1 s. In the passive condition, an open safe with the symbol on top was displayed for 1 s, and participants were instructed not to respond. This was followed by a fixation cross for 0.5–1 s and the outcome for 1 s. The outcome was either ‘−40’ displayed in red, or ‘+50’ displayed in green, representing a corresponding loss or win of 40 or 50 monetary points, respectively. An imbalanced value of loss versus win outcome was chosen to ensure participants remained motivated throughout the task, rather than being frustrated by losses continually offsetting win outcomes. The intertrial interval varied between 1.5 and 2 s, to prevent impact of motor responses themselves on the neural signature of outcome processing. If participants responded within the 1‐s presentation of a passive stimulus, a red cross was presented in the centre of the screen for 1 s, and the trial repeated.

FIGURE 1.

Task outline. (a) Active trial with Loss outcome. (b) Passive trial with Win outcome (safe image: freepik.com)

Symbols were distinct for active and passive trials, and predictive of outcomes in a probabilistic manner. Thus, one symbol (‘active good’) was always shown on top of a closed safe, and was followed by a win outcome in 80% of trials. Another symbol (‘active bad’) was always shown on top of a closed safe and was followed by a loss outcome in 80% of trials. Two different symbols were associated with an 80% probability of win and loss in passive trials, respectively (‘passive good’ and ‘passive bad’). Assignment of symbols to conditions was counterbalanced across participants.

The task consisted of four blocks of 60 trials each (15 trials per condition and block; 60 trials per condition in total). Within each block, trials of the four conditions were mixed randomly. Each block lasted about 5 min. The same four symbols were used in blocks one and two. In blocks three and four, a separate set of four symbols was used. This was done to ensure participants were learning throughout the task, rather than getting too familiar with the same symbols presented for four blocks.
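The block structure described above can be sketched in a few lines of Python. This is an illustration only, not the authors' task code: the condition names, the `build_block` and `block_points` helpers, and the assumption that the 80% contingency was implemented as exactly 12 likely and 3 unlikely outcomes per condition and block are all hypothetical.

```python
import random

# Hypothetical condition table: name -> (response type, outcome on 80% of trials)
CONDITIONS = {
    "active_good": ("active", "win"),
    "active_bad": ("active", "loss"),
    "passive_good": ("passive", "win"),
    "passive_bad": ("passive", "loss"),
}
WIN_POINTS, LOSS_POINTS = 50, -40  # outcome values stated in the task description


def build_block(trials_per_condition=15, p_likely=0.8, seed=None):
    """Return one shuffled block of trials (assumes fixed outcome counts)."""
    rng = random.Random(seed)
    n_likely = round(trials_per_condition * p_likely)  # 12 of 15
    trials = []
    for name, (action, likely) in CONDITIONS.items():
        unlikely = "loss" if likely == "win" else "win"
        outcomes = [likely] * n_likely + [unlikely] * (trials_per_condition - n_likely)
        trials += [{"condition": name, "action": action, "outcome": o} for o in outcomes]
    rng.shuffle(trials)  # the four conditions are mixed randomly within a block
    return trials


def block_points(trials):
    """Net points earned in a block."""
    return sum(WIN_POINTS if t["outcome"] == "win" else LOSS_POINTS for t in trials)
```

Under these assumptions every block contains 30 win and 30 loss trials, so the net result per block does not depend on the shuffle order.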

Participants were instructed to pay attention to their wins and losses, as they would receive their resulting earnings at the end of the experiment. They were further instructed that this was not a reaction time task, and that rather than trying to respond fast, they should ensure high accuracy, that is, only responding to active stimuli, and not responding to passive stimuli.

After the probabilistic learning task, participants performed a forced choice task, during which in each trial, they were shown two of the symbols from the learning task, without an underlying safe. The two symbols were shown to the left and right of the screen centre, and participants were asked to indicate which symbol they liked better, by pressing a left or right button correspondingly, using their left and right index fingers. In the first half of the task, symbols from blocks one and two of the probabilistic learning task were shown. In the second half of the task, symbols from the third and fourth block were shown. Each possible stimulus combination (e.g., ‘active‐good’ – ‘passive‐bad’) was shown 10 times. Thus, the maximum number of times a participant could choose one symbol type (e.g., ‘active good’) was 30. Besides the chosen symbol being highlighted by a surrounding rectangle for 1 s, participants received no feedback in this task, and their choices did not contribute to their overall payoff.
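The maximum of 30 follows from the pairing scheme: four symbol types give six possible pairings, each type appears in three of them, and each pairing is shown 10 times. A minimal sketch, assuming hypothetical type labels and ignoring left/right placement and the split across the two symbol sets (neither is specified above):

```python
from itertools import combinations

# Hypothetical labels for the four symbol types
SYMBOL_TYPES = ["active_good", "active_bad", "passive_good", "passive_bad"]


def forced_choice_pairs(symbol_types=SYMBOL_TYPES, repetitions=10):
    """List every unordered pairing of symbol types, repeated `repetitions` times."""
    return [pair for pair in combinations(symbol_types, 2) for _ in range(repetitions)]


def appearances(trials, symbol_type):
    """Number of trials in which a symbol type is on screen (its maximum choice count)."""
    return sum(symbol_type in pair for pair in trials)
```

This yields 60 forced-choice trials in total, with each symbol type available for choice on 30 of them.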

2.4. Procedures

Upon arriving in the laboratory, participants were given a study information sheet and completed an informed consent form. They were then brought into a shielded EEG cabin, where they sat in front of a standard computer monitor and keyboard. After setup of the EEG cap and electrodes, participants were provided with written instructions for the probabilistic learning task. Once the task was started, the experimenter waited outside the EEG cabin, monitoring EEG signal and participant responses. In between task blocks, participants were given a chance to have a short break if desired.

After the probabilistic learning task, participants received written instructions for the forced choice task, and were shown the correct response keys on the keyboard. The experimenter left the EEG cabin whilst the participant completed the task.

Once the forced choice task was finished, participants completed other tasks unrelated to the current study, including creativity tasks and personality questionnaires. As these were always completed after the tasks of interest, they are not further explained here.

After completion of all tasks and removal of the EEG cap and electrodes, participants were debriefed, reimbursed for their time at a rate of £7.50 per hour, and received the monetary gains from the probabilistic learning task (£2.40). Task payoff was fixed across participants, as they experienced the same number of win and loss trials.

2.5. EEG recording

Continuous EEG signal was measured throughout the probabilistic learning task, using a 32‐channel active electrode BioSemi system. The signal was recorded at 1000 Hz and offline referenced against the average of two electrodes placed on the earlobes. To monitor eye movements, four electrodes were placed on the participant's face: To the outside of the left and right eye, and above and below the left eye.

2.6. EEG analysis

A highpass filter of 0.5 Hz was applied to the continuous EEG data. Independent component analysis was used to identify and remove eye‐movement related artefacts. For ERP analysis, data were then lowpass filtered at 20 Hz, and segmented into epochs of −100 to 1000 ms time‐locked to the presentation of outcomes, as well as to the stimulus presentation. Data were baseline corrected to the prestimulus period. Epochs with excessive noise were automatically identified and removed using a voltage threshold of ±80 μV. For directed synchronization analyses, data were lowpass filtered at 50 Hz and segmented into epochs of −500 to 2500 ms time‐locked to outcome presentation, without baseline correction. Epochs with excessive noise were automatically identified and removed using a voltage threshold of ±80 μV. Participants with >30% of rejected epochs were excluded from data analysis.
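The epoching, baseline correction and threshold-based rejection steps for the ERP analysis could look roughly like the following NumPy sketch. It is not the authors' pipeline: the function names are hypothetical, filtering and ICA are omitted, and it assumes the data are already in microvolts and that the rejection threshold is applied to baseline-corrected epochs across all channels.

```python
import numpy as np


def epoch_and_reject(data, event_samples, sfreq, tmin=-0.1, tmax=1.0, threshold_uv=80.0):
    """Cut epochs around events, baseline-correct them, drop noisy ones.

    data: array (n_channels, n_samples) in microvolts.
    event_samples: sample indices of the time-locking events.
    Returns (clean_epochs, rejected_fraction). Assumes tmin < 0 and that
    at least one event lies fully inside the recording.
    """
    start = int(round(tmin * sfreq))  # negative offset, e.g. -100 samples
    stop = int(round(tmax * sfreq))
    epochs = []
    for ev in event_samples:
        lo, hi = ev + start, ev + stop
        if lo < 0 or hi > data.shape[1]:
            continue  # skip events too close to the recording edges
        segment = data[:, lo:hi].astype(float)
        # Baseline: mean of the pre-event samples, per channel
        baseline = segment[:, :-start].mean(axis=1, keepdims=True)
        epochs.append(segment - baseline)
    epochs = np.stack(epochs)
    # Keep an epoch only if no channel exceeds the +/- threshold
    keep = np.abs(epochs).max(axis=(1, 2)) <= threshold_uv
    return epochs[keep], float(1.0 - keep.mean())


def exclude_participant(rejected_fraction, max_fraction=0.30):
    """Participants with more than 30% rejected epochs are excluded."""
    return rejected_fraction > max_fraction
```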

2.7. ERP analysis

For outcome‐locked ERPs, based on the Loss‐Win difference wave for the grand average across active and passive trials, we visually identified the FRN peak at 302 ms at electrode FZ. We computed the average amplitude per condition for the time window 280–320 ms after outcome presentation. For analysis of the FRN, we calculated the Loss‐Win difference of those averages, separately for active and passive trials. As outcome‐locked ERPs did not differ between trials with predicted win and predicted loss (see Table S1), we combined both trial types for FRN analysis. For analysis of the stimulus‐locked ERP, we used the same time window at electrode FZ, testing for differences in amplitude between the four conditions (‘active‐bad’, ‘active‐good’, ‘passive‐bad’, ‘passive‐good’).
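As an illustration of this measure, the sketch below computes the mean amplitude in the 280–320 ms window, forms the Loss minus Win difference, and compares active and passive FRN values with a paired-samples t-test. The function names and array layout (one averaged waveform per participant at a single electrode) are assumptions made for the example.

```python
import numpy as np
from scipy import stats


def mean_amplitude(erp, times, tmin=0.280, tmax=0.320):
    """Mean amplitude in a time window. erp: (..., n_times); times in seconds."""
    mask = (times >= tmin) & (times <= tmax)
    return erp[..., mask].mean(axis=-1)


def frn_difference(loss_erp, win_erp, times):
    """FRN as the Loss minus Win difference of the window averages."""
    return mean_amplitude(loss_erp, times) - mean_amplitude(win_erp, times)


def compare_frn(frn_active, frn_passive):
    """Paired-samples t-test across participants (active vs. passive FRN)."""
    t_value, p_value = stats.ttest_rel(frn_active, frn_passive)
    return float(t_value), float(p_value)
```

A more negative difference value corresponds to a stronger FRN, since the component is defined as the negative-going Loss minus Win deflection.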

2.7.1. Directed connectivity analysis

We measured the phase slope index (PSI) (Nolte et al., 2008) to estimate the synchronization between the electrodes in the theta frequency band (4–8 Hz). The PSI estimates synchronization between two signals by estimating the slope of the phase of their cross‐spectrum. The PSI is sensitive to noninstantaneous functional relations between the signals. We calculated the PSI in two time windows based on the presentation of the outcome: −0.3 to 0 s (prestimulus phase) and 0 to 0.3 s following the outcome presentation.
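To make the definition concrete, the following sketch computes a PSI for one pair of signals from epoched data, following the formula in Nolte et al. (2008): the imaginary part of the sum, over adjacent frequency bins in the band, of the conjugated complex coherency at one bin multiplied by the coherency at the next. It is not the analysis code used in the study; the Hann taper, the FFT-based cross-spectrum and the function names are assumptions, and the normalisation by an estimated standard deviation that is often applied to the PSI is omitted.

```python
import numpy as np


def phase_slope_index(x, y, sfreq, fmin=4.0, fmax=8.0):
    """Unnormalised PSI between two signals; a positive value suggests x leads y.

    x, y: arrays of shape (n_epochs, n_samples).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n_samples = x.shape[1]
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / sfreq)
    window = np.hanning(n_samples)
    fx = np.fft.rfft(x * window, axis=1)
    fy = np.fft.rfft(y * window, axis=1)
    # Cross- and auto-spectra averaged over epochs
    s_xy = np.mean(fx * np.conj(fy), axis=0)
    s_xx = np.mean(np.abs(fx) ** 2, axis=0)
    s_yy = np.mean(np.abs(fy) ** 2, axis=0)
    coherency = s_xy / np.sqrt(s_xx * s_yy)
    # Adjacent frequency bins inside the band of interest
    band = np.flatnonzero((freqs >= fmin) & (freqs <= fmax))
    return float(np.imag(np.sum(np.conj(coherency[band[:-1]]) * coherency[band[1:]])))


def demo_delayed_pair(n_epochs=200, n_samples=250, lag=5, seed=0):
    """Synthetic pair in which y is a noisy copy of x delayed by `lag` samples."""
    rng = np.random.default_rng(seed)
    source = rng.standard_normal((n_epochs, n_samples + lag))
    x = source[:, lag:]
    y = source[:, :-lag] + 0.5 * rng.standard_normal((n_epochs, n_samples))
    return x, y
```

On the synthetic pair the sign of the index identifies the leading signal, and swapping the two inputs flips the sign. The demo uses 1-s epochs at 250 Hz purely so that several frequency bins fall inside 4–8 Hz; it makes no claim about the spectral estimation settings used for the 0.3-s windows in the study.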

2.7.2. Nonparametric cluster permutation on the PSI network

Because we did not have a clear hypothesis regarding the electrode pairs that could be affected by the action, we adopted a data‐driven exploratory approach (Maris & Oostenveld, 2007) on the connectivity space (connectivity matrices). To reduce potential biases introduced by multiple comparisons and distribution assumptions of parametric tests, the difference distribution for active versus passive conditions was constructed in a data‐driven manner using randomizations combined with a network‐based clustering criterion for the t‐statistic extraction (Zalesky et al., 2010). The network‐based statistic controls for the family‐wise error rate, offering a substantial gain in power by considering the topological characteristics of the graph, on the assumption that a biologically relevant effect on the network cannot be isolated to single or disconnected edges. Meaningful clusters need to show strongly connected components (connected to each other). We first calculated the statistical difference for each brain edge (Active vs. Passive trials in loss trials), discarding absolute t‐values lower than 2. Then, the surviving edges were clustered into strongly connected components (SCCs; partition into subgraphs with the property of having at least one path between all pairs of nodes) depending on whether they reflect identical effects (separate clusters for positive and negative edges). Subsequently, difference distribution curves of the condition differences were estimated using 1000 permutations by randomly shuffling the condition labels. In each iteration, we summed the t‐scores within each cluster and kept the maximum (absolute value) cluster score as the cluster t‐statistic. The t‐critical values were then calculated to align with the significance level of .05 (two‐tailed). Clusters formed by the actual labels with t‐score exceeding the t‐critical values were finally identified following an SCC‐wise inference on the difference distribution.

To avoid problems with the signs (because the PSI is a directional measure), for each significant contrast (t‐value >2), we considered whether the link was associated with increased connectivity in active or passive trials by looking at the highest absolute PSI value (e.g., if a PSI value is significantly lower in the passive condition, it does not mean the active condition had higher connectivity because it is possible that the value was negative). This procedure enabled us to separate negative and positive clusters which were due to increased connectivity in active versus passive trials. This enabled us to build a larger t‐distribution and compare each of those clusters (Active > Passive/Passive > Active) against the permuted distribution. To test whether the clusters observed generalized to the win trials, we extracted the PSI values of the electrode pairs forming the cluster observed using the loss trials (if significant) in the win trials in each condition. Because the win trials were not used in the nonparametric cluster permutation, this was a way of cross‐validating our analysis, especially because in this case, we were not interested in the differences between win and loss trials (we also tested the difference in PSI between win and loss trials to validate our contrasts).
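The general logic of this procedure (edge-wise t-values, thresholding at |t| > 2, clustering of connected supra-threshold edges by sign, and a permutation distribution of the maximum cluster score) can be sketched as follows. This is a deliberately simplified illustration, not the authors' implementation: it treats each electrode pair as one undirected edge in the upper triangle of the matrix, uses ordinary connected components rather than strongly connected components of a directed graph, shuffles condition labels within participants by sign-flipping the paired differences, and leaves out the additional handling of PSI sign described above. All function names are hypothetical.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components


def edge_t(diff):
    """Paired t-value per edge. diff: (n_subjects, n_nodes, n_nodes) differences."""
    n = diff.shape[0]
    sd = diff.std(axis=0, ddof=1)
    with np.errstate(divide="ignore", invalid="ignore"):
        t = diff.mean(axis=0) / (sd / np.sqrt(n))
    return np.nan_to_num(t)


def max_cluster_mass(t, thresh=2.0):
    """Largest summed |t| over connected supra-threshold edges, per sign."""
    upper = np.triu(np.ones(t.shape, dtype=bool), k=1)
    best = 0.0
    for sign in (1, -1):
        mask = (sign * t > thresh) & upper
        if not mask.any():
            continue
        adjacency = csr_matrix((mask | mask.T).astype(int))
        _, labels = connected_components(adjacency, directed=False)
        rows, cols = np.nonzero(mask)
        # Both endpoints of an edge share a component label, so index by row
        mass = np.bincount(labels[rows], weights=np.abs(t[rows, cols]))
        best = max(best, float(mass.max()))
    return best


def cluster_permutation_test(active, passive, n_perm=1000, thresh=2.0, seed=0):
    """Return (observed maximum cluster mass, permutation p-value)."""
    rng = np.random.default_rng(seed)
    diff = active - passive
    observed = max_cluster_mass(edge_t(diff), thresh)
    null = np.empty(n_perm)
    for k in range(n_perm):
        # Swapping condition labels within a participant flips the sign
        flips = rng.choice([-1.0, 1.0], size=(diff.shape[0], 1, 1))
        null[k] = max_cluster_mass(edge_t(diff * flips), thresh)
    p_value = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, float(p_value)
```

Taking the maximum cluster score in each permutation is what provides family-wise error control: an observed cluster is only declared significant if its score is extreme relative to the largest cluster that chance alone produces anywhere in the network.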

2.8. Analysis of behavioural data

To assess whether participants showed a significant preference for any of the four symbol categories (active‐good; active‐bad; passive‐good; passive‐bad), we conducted 2 × 2 ANOVA on the choice frequency values, with the within‐subject factors Action (active vs. passive stimuli) and Valence (stimuli predicting a high chance of Win vs. Loss).
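In a 2 × 2 fully within-subject design, each effect has one degree of freedom, so each F-test is equivalent to a one-sample t-test on a per-participant contrast, with F(1, n − 1) = t². The sketch below uses this equivalence; it is an illustration rather than the software used for the reported analyses, and the function and argument names are hypothetical.

```python
import numpy as np
from scipy import stats


def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F-value and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


def rm_anova_2x2(active_good, active_bad, passive_good, passive_bad):
    """2 x 2 repeated measures ANOVA via per-participant contrasts.

    Each argument is a 1-D array with one value per participant.
    """
    ag, ab, pg, pb = (np.asarray(a, dtype=float)
                      for a in (active_good, active_bad, passive_good, passive_bad))
    contrasts = {
        "Action": (ag + ab) - (pg + pb),
        "Valence": (ag + pg) - (ab + pb),
        "Action x Valence": (ag - ab) - (pg - pb),
    }
    df_error = ag.size - 1
    results = {}
    for name, contrast in contrasts.items():
        t_value, p_value = stats.ttest_1samp(contrast, 0.0)
        f_value = float(t_value) ** 2  # F(1, n - 1) equals t squared
        results[name] = {
            "F": f_value,
            "df": (1, df_error),
            "p": float(p_value),
            "eta_p2": partial_eta_squared(f_value, 1, df_error),
        }
    return results
```

The same relation between F and partial eta squared, F / (F + df_error) for a one-degree-of-freedom effect, can be used to sanity-check reported effect sizes.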

3. RESULTS

Data from eight participants were excluded due to signal noise (>30% of trials rejected during artefact detection). Thus, data from 72 participants were included in the analyses.

For the remaining datasets, the mean number of rejected trials was 10.25, distributed evenly across conditions (2.4, 2.3, 2.7 and 2.8 trials on average for Win‐passive, Loss‐passive, Win‐active and Loss‐active, respectively).

3.1. Outcome‐locked ERPs

For the 280‐ to 320‐ms time window, we found a significant difference between FRN amplitudes for passive and active conditions (t(71) = 2.4, p = .021), with a stronger FRN amplitude in active (M = −1.6, SD = 2.9) compared with passive (M = −.7, SD = 3.4) trials (Figure 2a).

FIGURE 2.

Outcome‐locked event‐related potentials. (a) Feedback related negativity (difference wave Loss − Win outcome) for active and passive conditions at electrode FZ. (b) Condition‐wise event‐related potentials time‐locked to outcome presentation, at electrode FZ. (c) Topographic maps for outcome‐locked event‐related potentials averaged across 280‐ to 320‐ms post‐outcome presentation

To better understand the effects underlying the reduced FRN in passive trials, we analysed the amplitudes for Win and Loss trials separately. This showed a significantly reduced amplitude for Win trials in the passive (M = 2.4, SD = 3.8) compared with the active condition (M = 3.5, SD = 3.6, t(71) = 3.1, p = .003, corr. p = .006). Amplitude for Loss trials did not differ significantly between passive (M = 1.7, SD = 2.8) and active conditions (M = 1.9, SD = 3.3, t(71) = .6, p = .563; Figure 2b).

For an overview of comparison of FRN amplitudes between active and passive trials across all electrodes, see Table S2 and Figure S1.

3.2. Stimulus‐locked ERPs

Analysing the same FRN time window as above (280–320 ms at electrode FZ) for stimulus‐locked ERPs in a 2 × 2 repeated measures ANOVA (with the factors valence and action) showed a main effect of action (F(1, 71) = 57.0, p < .001, ηp² = .445), and no effect of valence (F(1, 71) = .1, p = .803, ηp² = .001), nor a valence * action interaction (F(1, 71) = 2.3, p = .132, ηp² = .032). This was due to a negative amplitude which was reduced (i.e., less negative) in active (M = −.7, SD = 3.5) compared with passive trials (M = −2.7, SD = 3.4, t(71) = 7.6, p < .001; Figure 3).

FIGURE 3.

Stimulus‐locked event‐related potentials. (a) Condition‐wise event‐related potentials time‐locked to stimulus presentation, at electrode FZ. (b) Topographic maps for stimulus‐locked event‐related potentials averaged across 280‐ to 320‐ms post‐stimulus presentation

3.3. Directed connectivity results

We conducted two separate cluster permutation analyses comparing the PSI on the theta band in active and passive trials, one in each of these two different time‐windows: immediately before (−300 ms to the outcome) and after the outcome was presented (0 to 300 ms). Because we were not interested in the differences between winning or losing, we conducted the nonparametric cluster permutation on the losing trials and tested whether they generalized to the winning trials. This is also justified by the fact that the type of outcome (positive or negative) was not found to influence directed connectivity in previous work (Luft et al., 2013).

In the period immediately before the outcome was presented, we found two significant clusters, and both generalized from loss to win trials: one in which the PSI was higher during active compared with passive trials (Figure 4a), and one in which it was higher during passive compared with active trials (Figure 4b). During losses, the active cluster (t‐statistic = 289.18, t‐critical = 95.62, p = .0105) shows strong modulations from the prefrontal regions to the posterior sensory regions. Interestingly, the passive trials cluster shows the opposite direction: flow from mid‐posterior to frontal areas (t‐statistic = 287.29, t‐critical = 95.62, p = .0110). To test whether the differences between active and passive trials replicated in the win trials, we extracted the PSI values of the significant connections for each cluster separately and compared the values using a t‐test. The findings replicated in the win trials: the PSI in the connections of cluster 1 (Active) was significantly higher for active compared with passive trials (t(71) = 2.594, p = .012, Cohen's d = 0.501), whereas the connections of cluster 2 presented higher PSI for passive compared with active trials (t(71) = 2.198, p = .031, Cohen's d = 0.259). Furthermore, we checked whether there was a difference between loss and win trials in relation to the PSI values of each cluster and observed no significant difference for either the active (t(71) = 1.392, p = .168) or the passive cluster (t(71) = −0.195, p = .846). We conducted the same cluster permutation analysis on the post‐outcome window (0 to 300 ms), but we did not find any significant cluster between active and passive trials (p > .1).

FIGURE 4.

Cluster and topography of the phase slope index. (a) Significant cluster connections during active compared with passive trials (left hand side) and the topography of the PSI during active trials (right hand side). (b) Significant cluster connections during passive compared with active trials (left hand side) and the topography of the PSI during passive trials (right hand side). Only loss trials were used for this analysis

3.4. Behavioural choice data

Behavioural data were lost for one participant, so 71 participants are included in these analyses. The 2 × 2 ANOVA showed a main effect of Action (F(1, 70) = 5.5, p = .022, ηp² = .073), but no main effect of Valence (F(1, 70) = .5, p = .473, ηp² = .007) and no Action * Valence interaction (F(1, 70) = .1, p = .711, ηp² = .002). The main effect of Action was based on a slight preference for active over passive symbols (M = 32.1 vs. 27.9 choices, respectively; SD = 7.5). The difference in preference for active minus passive symbols was not correlated with the difference in FRN amplitude between active and passive trials (r = .003, p = .982).

4. DISCUSSION

In this study, we examined the effect of performing a motor action on the neural processing of trial outcomes, on the processing of predictive stimuli, and on directed phase synchronization before and after the trial outcome. The FRN was enhanced for outcomes following actions, in the absence of instrumental choice. For predictive stimuli, we found a main effect of performing a motor action on both neural processing and behavioural preference, regardless of stimulus valence. Furthermore, we found that the action was associated with a shift in the direction of phase synchronization just before the outcome was presented: Actions were associated with an increase in flow from frontal to posterior areas whereas passive trials presented the opposite pattern.

4.1. Making an action enhances the neural processing of win, but not loss, outcomes

Previous studies have shown that the FRN amplitude is increased when participants make a meaningful choice, or believe that they have active control over action outcomes (Li et al., 2011; Yi et al., 2018). These findings can be interpreted as evidence for the FRN being sensitive to the behavioural relevance of an outcome: outcome processing is enhanced when it is informative for the value of future actions. Here, we show that beyond such effects linked to instrumental learning, the neural processing of outcomes is enhanced by the mere act of performing a motor action, in the absence of instrumental control. In our task, participants pressed a single key, thus they had no choice over which action to perform, and no control over the valence of the outcome. Nevertheless, FRN amplitude was increased in trials in which participants made a button press, compared with completely passive trials.

Our findings significantly expand the findings of Yeung et al. (2005), who showed similar effects using a blocked design. Their observation of reduced FRN amplitude in passive blocks, however, can be easily explained by reduced task engagement in prolonged periods of passivity. Our task was specifically designed to avoid profound differences in attention between task conditions. Several explanations for the effects observed here are possible.

One possibility is that performing any kind of motor action acutely enhances general attention towards subsequent environmental events in a top‐down manner. Our task, mixing active and passive instructions on a trial‐wise basis, was designed to prevent general effects of prolonged passivity on attention. Thus, any effects of passive versus active trials on attentional processes would be a short‐lived, action‐driven process.

Our connectivity findings are in line with the notion that performing a motor action is linked to transient changes in attention. In active trials, following the button press and preceding outcome presentation, we observed directed information flow from prefrontal areas to parietal and occipital areas, consistent with the post‐response pattern observed previously (Luft et al., 2013). This suggests top‐down modulation of visual areas in preparation for outcome presentation considering the evidence that top‐down processes are initiated in the frontal to posterior areas (Buschman & Miller, 2007). In contrast, in passive trials, information flow was reversed, from posterior to frontal areas. Thus, information flow during the outcome anticipation phase in passive trials was in line with preparation for bottom‐up processing of incoming stimuli. Notably, these effects of reversed information flow for active and passive trials were specific to the preoutcome phase and were similar for win and loss trials. This difference in flow direction preceding the outcome presentation may suggest that the action might change the state of the brain before receiving the feedback, making it more receptive to it.

Our ERP findings showed action‐related enhancement of outcome processing specifically for win trials. ERPs for loss outcomes did not differ between active and passive trials. Such a valence‐specific enhancement of outcome processing is not in line with general attentional effects. However, we cannot fully exclude the possibility that passive trials decreased attention sufficiently to reduce reward processing, without affecting loss processing. Nevertheless, given the trial‐wise randomization of passive and active conditions, such attentional effects would need to be immediate and short‐lived, in line with the core interpretation of our findings that making an action acutely facilitates reward processing.

An alternative explanation considering the valence‐specific nature of the effects observed here is based on the overlap in neural systems involved in action initiation and reward processing. Both processes involve an activation of dopaminergic systems (da Silva et al., 2018; Schultz, 1998). Administration of levodopa leads to a general facilitation of motor actions, as well as to a specific enhancement of preparation for actions expected to lead to a reward in striatum and ventral tegmental area (Guitart‐Masip, Chowdhury, et al., 2012). Thus, expectation of reward appears to facilitate dopaminergic firing underlying action preparation. We suggest that in turn, dopaminergic firing associated with action preparation might enhance subsequent neural responses to rewarding events. Such an overlap is in line with existing findings on biases in instrumental learning, suggesting that action‐win associations are learned more easily than action‐no‐loss associations (Guitart‐Masip, Huys, et al., 2012; Swart et al., 2017). Crucially, our findings suggest that these effects are bi‐directional: not only do existing Pavlovian associations bias stimulus–response learning, but neutral motor actions bias the processing of subsequent outcomes towards stronger responses to rewards.

Alternatively, inhibiting motor responses in the passive trials may have suppressed neural responses to rewards. With the current design, it is difficult to distinguish whether making an action facilitates reward processing, or inhibiting an action suppresses reward processing. Future studies could try to distinguish between the two by increasing the delay between the condition cue (active/passive) and the motor action itself, or by introducing different types of action requiring stronger or weaker motor responses. However, this would need to be carefully balanced against the risk of inducing task disengagement in passive trials, which the current task was designed to avoid.

4.2. Processing of active versus passive cues

Contrary to our hypotheses, we did not observe FRN‐type responses to predictive cues. Rather, a strong negative deflection was observed in response to passive cues, regardless of whether they predicted win or loss outcomes. Thus, the neural representation of an upcoming motor action, or suppression thereof, dominated over neural responses based on stimulus valence. This was mirrored in behavioural choice data, with participants showing a stronger preference for active compared with passive cues, regardless of their associated valence. Importantly, in the choice task, active and passive stimuli were presented simultaneously and their position (left/right) on the screen was counterbalanced. Thus, participants had to select the corresponding motor response whether they chose an active or a passive stimulus, such that the preference for active stimuli could not be explained by participants having learned during the main task to initiate a motor response whenever an active stimulus was presented. It should be noted that while we asked participants to choose which of two stimuli they liked better, this was presented within a forced choice paradigm. Thus, we cannot distinguish between ‘liking’ and ‘wanting’ responses as proposed for reward‐based control of behaviour (Berridge & Robinson, 2016). It is plausible that if the behavioural effects observed here are driven by an association of cues with action‐based dopaminergic firing, the preference for active cues was based on motivational ‘wanting’ responses, which are dependent on dopaminergic firing, rather than affective ‘liking’.

This main effect of action on subjective cue valence is in line with a previous study showing that faces consistently paired with NoGo‐responses are evaluated as less trustworthy (Kiss et al., 2008). Interestingly, this study also showed that faces evaluated as less trustworthy elicited stronger NoGo‐related N200 amplitudes. Thus, these findings are in line with our suggestion of bidirectional links between motor behaviour and valence processing: negative stimuli are inherently linked to response suppression (and, as shown by Swart et al., 2017, positive stimuli are linked to action initiation); in turn, pairing a stimulus with action inhibition facilitates negative associations formed with that stimulus, whereas, as our data suggest, performing an action enhances the subjective value of that stimulus.

In contrast to our findings, some studies have observed FRN responses to predictive cues (Liao et al., 2011; Walsh & Anderson, 2011). However, one of these used predictive cues that were identical to the subsequent signals for win versus loss (Liao et al., 2011). Thus, this finding does not clearly show an FRN response to predictive cues per se but might reflect a lack of differentiation between the first (predictive) and second (outcome) cue. While an FRN‐like response to a predictive cue was also observed by Walsh and Anderson (2011), this cue was preceded by an active choice, and thus may have been processed as an action outcome in itself. In our study, predictive cues were superimposed on the stimuli instructing an active response versus passive waiting. It is possible that strong neural markers of behavioural activation versus inhibition masked FRN‐type effects. Thus, considering previous findings and our current data, it remains to be shown whether the FRN shows a true shift from outcomes to predictive cues during associative learning.

4.3. Implications and outlook

If supported by future studies, the mechanisms suggested here can be of great relevance to our understanding of addiction, in particular of behavioural addictions such as problematic gambling. Specific action‐based enhancement of the neural processing of win outcomes, but not loss outcomes, could explain the highly addictive properties of games such as slot machines. In these games, outcomes are directly preceded by actions such as pulling a lever or pressing a button, and frequent players appear relatively resilient to repeated losses, with their behaviour driven by infrequent wins. Problematic gambling is disproportionately associated with playing slot machines or slot‐machine‐like video games (MacLaren, 2016), rather than, for example, playing the lottery, which has much longer gaps between the gambling action and the outcome (Bakken et al., 2009).

The proposed mechanisms also have implications for our understanding of substance‐based addiction, and merit further research into the role of active drug consumption. Habit‐based theories of addiction emphasize the role of drug‐related behaviour in acquiring and maintaining addiction (Lipton et al., 2019). In turn, models centred on incentive salience treat drug‐related behaviour as the consequence of Pavlovian associations (Berridge & Robinson, 2016). What remains to be shown is whether the rewarding properties of the drug itself are enhanced by active consumption (i.e., self‐administration). While most drugs of abuse are self‐administered, such understanding could have great implications for drug administration in medical settings, as well as for therapeutic approaches towards addiction.

4.4. Limitations

A key limitation of the current study is that predictive cues were not presented outside the task context. Future studies should include a passive stimulus viewing phase after the reward learning task, to measure the neural response to predictive cues alone. Further, our measure of stimulus preference was limited to an explicit forced‐choice measure, which did not allow us to distinguish between affectively driven preference and motivationally driven approach behaviour.

Another limitation of our task was that in the passive condition, the open safe appeared to be empty. If this prompted expectations of a loss outcome, we would expect it to enhance the neural response to (more unexpected) win outcomes, in line with prediction error accounts of FRN amplitude (San Martín, 2012). However, we cannot exclude the possibility that the opposite occurred, and that seeing an apparently empty safe reduced participants' sensitivity to win outcomes.
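The prediction error reasoning in this paragraph can be illustrated with a minimal sketch. It assumes a simple reward prediction error (outcome minus expected value), hypothetical outcome magnitudes of +1 and −1, and a hypothetical drop in the subjective win probability from 0.8 to 0.5 caused by the apparently empty safe; none of these values are taken from the study, and the function names are illustrative.

```python
def expected_value(p_win, win=1.0, loss=-1.0):
    """Expected outcome for a cue with win probability p_win (assumed magnitudes)."""
    return p_win * win + (1.0 - p_win) * loss


def prediction_error(outcome, expectation):
    """Simple reward prediction error: obtained outcome minus expected outcome."""
    return outcome - expectation


# Win outcome following a cue with the task's 80% win probability.
pe_win_cued = prediction_error(1.0, expected_value(0.8))

# Same win outcome if the empty safe lowered the subjective win
# probability to 0.5 (a hypothetical value chosen for illustration).
pe_win_lowered = prediction_error(1.0, expected_value(0.5))
```

Under these assumptions, `pe_win_lowered` exceeds `pe_win_cued`, which is the sense in which a lowered expectation would be predicted to enhance, rather than reduce, the response to wins.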

Given the homogeneity of our sample (85% female; 88% between 18 and 22 years of age), we did not test for gender or age effects. We did not record handedness, as we were not studying lateralized motor effects. Future studies should consider such effects in more heterogeneous samples.

Finally, the imbalance in win versus loss outcomes in this task may have affected the overall amplitude of outcome‐locked ERPs (although the impact of outcome magnitude on FRN amplitude is disputed; San Martín, 2012). However, as this imbalance was identical for active and passive trials, it is unlikely to have played a significant role in the core effects discussed here.

5. CONCLUSIONS

In this study, we showed that performing a simple motor action, in the absence of choice, modulates top‐down information processing during the outcome anticipation phase, and selectively enhances the neural processing of rewards. While we found no evidence of a shift in FRN from outcomes to predictive symbols, symbols associated with an active response acquired stronger preference properties than symbols associated with passive waiting. Our results suggest a close link between motor control and reward processing.

CONFLICT OF INTEREST

The authors report no conflicts of interest.

AUTHOR CONTRIBUTION

KB: conducted data collection & analysis, revised manuscript. CDBL: conducted data analysis, wrote & revised manuscript. FB: designed research, conducted data analysis, wrote & revised manuscript.

PEER REVIEW

The peer review history for this article is available at https://publons.com/publon/10.1111/ejn.15826.

Supporting information

Table S1: Comparison of Outcome‐locked ERPs between trials with high probability of win vs. loss (280‐320 ms post outcome presentation, electrode FZ)

Table S2: Comparison of FRN amplitude between active and passive trials across electrodes

Figure S1: FRN amplitudes for active and passive trials across electrodes.

ACKNOWLEDGEMENTS

This research was supported by a grant from the Royal Society to Frederike Beyer (RGS\R1\191344).

Bikute, K., Di Bernardi Luft, C., & Beyer, F. (2022). The value of an action: Impact of motor behaviour on outcome processing and stimulus preference. European Journal of Neuroscience, 56(10), 5823–5835. 10.1111/ejn.15826

Edited by: David Belin

Funding information: Royal Society, Grant/Award Number: RGS\R1\191344

Contributor Information

Kotryna Bikute, Email: kbikute@gmail.com.

Frederike Beyer, Email: f.beyer@qmul.ac.uk.

DATA AVAILABILITY STATEMENT

Data are available upon request from the corresponding author.

REFERENCES

  1. Bakken, I. J., Götestam, K. G., Gråwe, R. W., Wenzel, H. G., & Øren, A. (2009). Gambling behavior and gambling problems in Norway 2007. Scandinavian Journal of Psychology, 50(4), 333–339. 10.1111/j.1467-9450.2009.00713.x
  2. Berridge, K. C., & Robinson, T. E. (2016). Liking, wanting, and the incentive‐sensitization theory of addiction. American Psychologist, 71(8), 670–679. 10.1037/amp0000059
  3. Burnside, R., Fischer, A. G., & Ullsperger, M. (2019). The feedback‐related negativity indexes prediction error in active but not observational learning. Psychophysiology, 56(9), e13389. 10.1111/psyp.13389
  4. Buschman, T. J., & Miller, E. K. (2007). Top‐down versus bottom‐up control of attention in the prefrontal and posterior parietal cortices. Science, 315(5820), 1860–1862. 10.1126/science.1138071
  5. Cartoni, E., Balleine, B., & Baldassarre, G. (2016). Appetitive Pavlovian‐instrumental transfer: A review. Neuroscience & Biobehavioral Reviews, 71, 829–848. 10.1016/j.neubiorev.2016.09.020
  6. da Silva, J. A., Tecuapetla, F., Paixão, V., & Costa, R. M. (2018). Dopamine neuron activity before action initiation gates and invigorates future movements. Nature, 554(7691), 244–248. 10.1038/nature25457
  7. Donkers, F. C., Nieuwenhuis, S., & Van Boxtel, G. J. (2005). Mediofrontal negativities in the absence of responding. Cognitive Brain Research, 25(3), 777–787. 10.1016/j.cogbrainres.2005.09.007
  8. Fanselow, M. S., & LeDoux, J. E. (1999). Why we think plasticity underlying Pavlovian fear conditioning occurs in the basolateral amygdala. Neuron, 23(2), 229–232. 10.1016/S0896-6273(00)80775-8
  9. Guitart‐Masip, M., Chowdhury, R., Sharot, T., Dayan, P., Duzel, E., & Dolan, R. J. (2012). Action controls dopaminergic enhancement of reward representations. Proceedings of the National Academy of Sciences, 109(19), 7511–7516. 10.1073/pnas.1202229109
  10. Guitart‐Masip, M., Huys, Q. J. M., Fuentemilla, L., Dayan, P., Duzel, E., & Dolan, R. J. (2012). Go and no‐go learning in reward and punishment: Interactions between affect and effect. NeuroImage, 62(1), 154–166. 10.1016/j.neuroimage.2012.04.024
  11. Ivlieva, N. Y., & Ivliev, D. A. (2014). Specific role of dopamine in striatum during instrumental learning. Zhurnal Vyssheĭ Nervnoĭ Deiatelnosti Imeni I P Pavlova, 64(3), 251–254. 10.7868/s004446771403006x
  12. Kiss, M., Raymond, J. E., Westoby, N., Nobre, A. C., & Eimer, M. (2008). Response inhibition is linked to emotional devaluation: Behavioural and electrophysiological evidence. Frontiers in Human Neuroscience, 2, 13. 10.3389/neuro.09.013.2008
  13. Lamb, R. J., Schindler, C. W., & Pinkston, J. W. (2016). Conditioned stimuli's role in relapse: Preclinical research on Pavlovian‐instrumental‐transfer. Psychopharmacology, 233(10), 1933–1944. 10.1007/s00213-016-4216-y
  14. Lewis, A. H., Niznikiewicz, M. A., Delamater, A. R., & Delgado, M. R. (2013). Avoidance‐based human Pavlovian‐to‐instrumental transfer. The European Journal of Neuroscience, 38(12), 3740–3748. 10.1111/ejn.12377
  15. Li, P., Han, C., Lei, Y., Holroyd, C. B., & Li, H. (2011). Responsibility modulates neural mechanisms of outcome processing: An ERP study. Psychophysiology, 48(8), 1129–1133. 10.1111/j.1469-8986.2011.01182.x
  16. Liao, Y., Gramann, K., Feng, W., Deák, G. O., & Li, H. (2011). This ought to be good: Brain activity accompanying positive and negative expectations and outcomes. Psychophysiology, 48(10), 1412–1419. 10.1111/j.1469-8986.2011.01205.x
  17. Lipton, D. M., Gonzales, B. J., & Citri, A. (2019). Dorsal striatal circuits for habits, compulsions and addictions. Frontiers in Systems Neuroscience, 13, 28. 10.3389/fnsys.2019.00028
  18. López, M. E., Pusil, S., Pereda, E., Maestú, F., & Barceló, F. (2019). Dynamic low frequency EEG phase synchronization patterns during proactive control of task switching. NeuroImage, 186, 70–82. 10.1016/j.neuroimage.2018.10.068
  19. Luft, C. D. B., Nolte, G., & Bhattacharya, J. (2013). High‐learners present larger mid‐frontal theta power and connectivity in response to incorrect performance feedback. Journal of Neuroscience, 33(5), 2029–2038. 10.1523/JNEUROSCI.2565-12.2013
  20. MacLaren, V. V. (2016). Video lottery is the most harmful form of gambling in Canada. Journal of Gambling Studies, 32(2), 459–485. 10.1007/s10899-015-9560-z
  21. Maris, E., & Oostenveld, R. (2007). Nonparametric statistical testing of EEG‐ and MEG‐data. Journal of Neuroscience Methods, 164(1), 177–190. 10.1016/j.jneumeth.2007.03.024
  22. Miltner, W. H., Braun, C. H., & Coles, M. G. (1997). Event‐related brain potentials following incorrect feedback in a time‐estimation task: Evidence for a “generic” neural system for error detection. Journal of Cognitive Neuroscience, 9(6), 788–798. 10.1162/jocn.1997.9.6.788
  23. Nolte, G., Ziehe, A., Nikulin, V. V., Schlögl, A., Krämer, N., Brismar, T., & Müller, K.‐R. (2008). Robustly estimating the flow direction of information in complex physical systems. Physical Review Letters, 100(23), 234101. 10.1103/PhysRevLett.100.234101
  24. Robinson, T. E., & Berridge, K. C. (2001). Incentive‐sensitization and addiction. Addiction, 96(1), 103–114. 10.1046/j.1360-0443.2001.9611038.x
  25. San Martín, R. (2012). Event‐related potential studies of outcome processing and feedback‐guided learning. Frontiers in Human Neuroscience, 6, 304. 10.3389/fnhum.2012.00304
  26. Schultz, W. (1998). Predictive reward signal of dopamine neurons. Journal of Neurophysiology, 80(1), 1–27. 10.1152/jn.1998.80.1.1
  27. Skinner, B. F. (1951). How to teach animals. Scientific American, 185(6), 26–29. 10.1038/scientificamerican1251-26
  28. Swart, J. C., Froböse, M. I., Cook, J. L., Geurts, D. E., Frank, M. J., Cools, R., & den Ouden, H. E. (2017). Catecholaminergic challenge uncovers distinct Pavlovian and instrumental mechanisms of motivated (in)action. eLife, 6, e22169. 10.7554/eLife.22169
  29. van de Vijver, I., Ridderinkhof, K. R., & Cohen, M. X. (2011). Frontal oscillatory dynamics predict feedback learning and action adjustment. Journal of Cognitive Neuroscience, 23(12), 4106–4121. 10.1162/jocn_a_00110
  30. Walsh, M. M., & Anderson, J. R. (2011). Learning from delayed feedback: Neural responses in temporal credit assignment. Cognitive, Affective, & Behavioral Neuroscience, 11(2), 131–143. 10.3758/s13415-011-0027-0
  31. Williams, B. A. (1994). Conditioned reinforcement: Experimental and theoretical issues (Vol. 17) (pp. 261–285). Springer.
  32. Yeung, N., Holroyd, C. B., & Cohen, J. D. (2005). ERP correlates of feedback and reward processing in the presence and absence of response choice. Cerebral Cortex, 15(5), 535–544. 10.1093/cercor/bhh153
  33. Yi, W., Mei, S., Li, Q., Liu, X., & Zheng, Y. (2018). How choice influences risk processing: An ERP study. Biological Psychology, 138, 223–230. 10.1016/j.biopsycho.2018.08.011
  34. Zalesky, A., Fornito, A., & Bullmore, E. T. (2010). Network‐based statistic: Identifying differences in brain networks. NeuroImage, 53(4), 1197–1207. 10.1016/j.neuroimage.2010.06.041
  35. Zhou, Z., Yu, R., & Zhou, X. (2010). To do or not to do? Action enlarges the FRN and P300 effects in outcome evaluation. Neuropsychologia, 48(12), 3606–3613. 10.1016/j.neuropsychologia.2010.08.010

Articles from The European Journal of Neuroscience are provided here courtesy of Wiley
