PLOS One. 2021 Jan 7;16(1):e0245191. doi: 10.1371/journal.pone.0245191

How using brain-machine interfaces influences the human sense of agency

Emilie A Caspar 1,*,#, Albert De Beir 2,3,#, Gil Lauwers 2, Axel Cleeremans 1, Bram Vanderborght 2,3
Editor: Jane Elizabeth Aspell
PMCID: PMC7790430  PMID: 33411838

Abstract

Brain-machine interfaces (BMI) allow individuals to control an external device by controlling their own brain activity, without requiring bodily or muscle movements. Performing voluntary movements is associated with the experience of agency (“sense of agency”) over those movements and their outcomes. When people voluntarily control a BMI, they should likewise experience a sense of agency. However, acting through a BMI differs in several ways from performing normal movements. In particular, BMIs lack sensorimotor feedback, afford lower controllability and are associated with increased cognitive fatigue. Here, we explored how these different factors influence the sense of agency across two studies in which participants learned to control a robotic hand through motor imagery decoded online through electroencephalography. We observed that the lack of sensorimotor information when using a BMI did not appear to influence the sense of agency. We further observed that experiencing lower control over the BMI reduced the sense of agency. Finally, we observed that the better participants controlled the BMI, the greater their appropriation of the robotic hand, as measured by body-ownership and agency scores. Results are discussed in relation to existing theories of the sense of agency and in light of the importance of BMI technology for patients using prosthetic limbs.

Introduction

Most of our actions are automatic and triggered externally. Voluntary actions, however, because they are initiated endogenously on the basis of our intentions, are seen as a fundamental marker of human behaviour. Such goal-oriented behaviour is associated with the ‘sense of agency,’ that is, the subjective feeling that one exerts control over one’s own actions and on their outcomes [1].

The sense of agency (SoA) is not a unitary phenomenon. SoA involves different aspects of the conscious experience of being the agent of one’s actions; in particular, feelings of agency or authorship (FoA), and judgments of agency (JoA) [2]. FoA refers to a pre-reflective sensorimotor experience of being the author of an action, while JoA refers to the explicit declaration that an outcome was (or was not) caused by our own actions. JoA is typically measured through explicit questions asked of the participants (e.g. Are you the author of that action?), while FoA is typically measured more implicitly, for instance through time perception (i.e., the intentional binding paradigm). In the classical intentional binding paradigm [3], participants estimate the delay between their action (i.e. a keypress) and an outcome (i.e. a tone) by judging the moment at which each event occurs by means of a clock-like representation [4]. If the movement is performed voluntarily, the perceived time between the two events is reported to be shorter than in a condition in which the movement is performed involuntarily (for instance, when it is triggered by a TMS pulse over the motor cortex). This suggests that the sense of agency modifies time perception by reducing the perceived duration of the delay between an action that we feel we carried out (vs. not) and its consequences (see [5–7] for reviews and [8]).

Today, many actions that we carry out are mediated by computers and machines, thus modifying the basic experience of performing an action through one’s own body [9]. Such human-computer interactions appear to influence the sense of agency. For instance, Coyle and colleagues [10] showed that intentional binding was stronger in a condition in which participants used a skin-based input system that detected when they tapped on their arm to produce a resulting tone than in a condition in which they tapped on a button to produce a similar tone. These results can be understood based on the comparator model ([11]; see [12] for a review), which suggests that SoA is experienced most clearly when there is a match between the predicted outcome and the actual outcome. According to the comparator model, internal forward models continuously predict the sensory consequences of motor commands by computing the effects of an efference copy of such commands and comparing predicted and actual sensory outcomes. If there is a mismatch (for instance, created by a spatial or temporal inconsistency), the discrepancy between the predicted and actual sensory feedback is detected and the sense of being the author of the action and of controlling its outcome is reduced. In the experiment of Coyle et al. [10], the degree of congruence (and thus predictability) between the internally predicted outcome and the actual outcome of skin-based stimulation could have been higher than when an external device is used to produce the outcome. SoA could thus have been boosted with skin-based inputs.

Another account is based on cue integration theory [13, 14], which suggests that SoA is based on multiple sources of information that are then combined online to infer agency. Based on this theory, receiving additional sensory cues from the limbs may contribute to a greater sense of agency vs. when not receiving this additional source of information. Using technologies such as brain-machine interfaces (BMI, or brain-computer interfaces, BCI) might thus reduce one’s primary experience of being the author of one’s own action, since reafference (i.e., the sensory consequences of producing an action) is weakened or absent in such cases.

The key principle of BMI is that participants can control external devices (such as a computer or neuroprostheses) by monitoring and controlling their mental state without the intervention of their sensorimotor system [15]. To do so, bidirectional learning takes place between the user and the computer: While the user learns to produce specific task-relevant (i.e., imagining moving one’s right hand) vs. irrelevant (i.e., thinking about nothing) mental states, a decoding algorithm that monitors brain activity (i.e., electrical signals over motor cortex) is trained to differentiate between relevant and irrelevant activity so that the movements of an external device (i.e., a robotic hand) under its control reflect the motor intentions of the user. If the accuracy provided by the algorithm is sufficiently high, the user will then be able to efficiently control the BMI by using the same mental states as used during training. Thus, to generate an action of the external device, only internal cues are used; BMIs bypass the muscular system of the user [16]. According to the hypotheses generated by Limerick et al. [9], the absence of sensory feedback during BMI-generated actions should reduce SoA in comparison with situations in which participants use their own hand to perform the action, since it involves either a reduction in the quality of predictions (i.e. the comparator model) or a reduced number of cues, since no reafferences are produced (i.e. the cue integration theory). The observation of decreased SoA during BMI-generated actions could have substantial implications for the daily life support of patients using neuroprostheses, but also for the notion of responsibility for actions carried out when using such devices [17, 18].

In light of these issues, we therefore aimed to address the following questions. First, we assessed to what extent individual performance in controlling the BMI has a measurable influence on both the explicit (i.e. JoA) and the implicit sense of agency (i.e., FoA). According to either the comparator model or the cue integration theory, using a BMI should reduce the sense of agency since no efference copy is available. Second, using a BMI can be particularly tiring. We thus also evaluated to what extent the reported difficulty of making the robotic hand move influences the sense of agency over BMI-generated actions. Finally, we investigated to what extent the explicit and implicit sense of agency over BMI-generated actions and their outcomes influences the appropriation of the BMI device (here, a robotic hand), as measured through the embodiment of the device (see the literature on the rubber hand illusion, [19, 20]). The embodiment of the device was measured through body-ownership, location and agency towards the robotic hand [21].

In a first study, we investigated whether or not the use of a BMI influences SoA by comparing explicit and implicit SoA for body-generated and BMI-generated actions. In the body-generated action condition, participants used their right index finger to press a key whenever they chose, so as to trigger a resulting tone. Next, they had to estimate, in milliseconds, the duration of the delay between their keypress and the tone (i.e. implicit SoA; [21]). In the BMI-generated action condition, participants controlled an external (right) robotic hand through motor imagery. Instead of using their own hand, participants were instructed to relax and to use the robotic hand to press the key. Again, they had to judge the duration of the delay between the robotic hand keypress and the tone. They also had to indicate how much they felt in control of the movement that they had seen (i.e. explicit SoA). Based on Limerick et al. [9], we expected a reduced SoA in the BMI-generated action condition in comparison with the body-generated action condition. However, a previous study suggested that BMI-generated actions could share similar characteristics with body-generated actions [22]. These authors observed that explicit judgements of agency were higher for congruent BMI-actions than when neuro-visual delays were introduced—an effect that had already been observed for body-generated actions [23]. However, these authors did not systematically compare implicit and explicit SoA for both BMI-generated and body-generated actions, which constitutes the main extension of the first study reported here. At the end of the experiment, participants had to complete a questionnaire assessing the extent to which they felt the robotic hand was embodied, as well as their level of cognitive fatigue before and after each experimental condition.

Study 1—Method

Participants

A total of 30 naïve right-handed participants were recruited. To the best of our knowledge, no previous studies have directly compared SoA for manual and BCI actions. Given that the average sample size for SoA studies is about 20 participants (e.g. [20, 24]) and that a substantial proportion of people do not succeed in controlling a brain-computer interface–a phenomenon called ‘BCI illiteracy’ [25]–we decided to test 30 participants. Each participant received €15 for their participation. The following exclusion criteria were chosen prior to the experiment: (1) failure to produce temporal intervals covarying monotonically with the actual action-tone interval; (2) failure to reach chance level after the motor imagery training session (accuracy score < 50%); or (3) evidence of muscular activity during the motor imagery task. Three participants were excluded due to (3), see also point 3.4. Of the 27 remaining participants, 9 were males. The mean age was 23.78 (SD = 2.68). All participants provided written informed consent prior to the experiment. The study was approved by the local ethical committee of the Université libre de Bruxelles (054/2015).

Materials and procedure

The experiment lasted about 1.5 hours and took place in a single session. In the first ±30 minutes, participants listened to instructions about the task and filled in the consent form. Next, the electroencephalography equipment was fitted. Participants then performed the first interval estimates task with their own hand (i.e., body-generated action condition). During the next ±30 minutes, participants performed the BCI training procedure, the duration of which depended on the number of training sessions necessary for participants to control the BCI. The final ±30 minutes were spent on the real-time feedback session with the robotic hand, the second interval estimates task (i.e., BMI-generated action condition) and the completion of the questionnaire assessing body-ownership, location and agency over the robotic hand as well as cognitive fatigue.

Robotic hand

The robotic hand we used was an upgrade of the open-source version described in [26] (for general requirements when designing such devices, see [27]). The index finger was actuated by a high-voltage coreless digital servomotor (BMS-28A). The motion of the finger was triggered via a Matlab function using serial communication (USB). The latency between the classifier and the motion of the finger was measured to be about 110 ms, caused by the latency of the serial communication (10 ms) and the latency of the classifier (100 ms). The motion of the finger was pre-recorded in such a way that, once the motion was triggered, the finger moved quickly, taking less than 10 ms to press the key. The robotic finger kept the key pressed for 900 ms before returning to its initial position. This 900-ms timing made it possible to avoid creating a temporal distortion in the interval estimates carried out by participants, since the tone was always produced before the finger of the robotic hand lifted up. We used a membrane keyboard that produced a quiet but audible sound when pressed or released. In addition, the servomotor of the robotic hand also produced an audible “motor” sound when actuated. Both sounds were substantially dampened by the headphones worn by participants during the task.
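To make the trigger pathway concrete, the sketch below shows how a pre-recorded finger motion could be triggered over a serial (USB) link. The original setup used a Matlab function; this Python/pyserial version, the port name and the command byte are illustrative assumptions rather than the authors' implementation.

```python
import time
import serial  # pyserial

# Hypothetical port and command byte; the paper only states that the motion was
# triggered via serial (USB) communication from a Matlab function.
PORT = "/dev/ttyUSB0"
TRIGGER_BYTE = b"\x01"

def trigger_keypress(link):
    """Send the trigger byte to the servomotor controller.

    In the reported setup, the total latency from classifier output to finger
    motion was about 110 ms (about 100 ms classifier + about 10 ms serial)."""
    link.write(TRIGGER_BYTE)   # the servo then executes the pre-recorded press (< 10 ms)
    link.flush()

with serial.Serial(PORT, baudrate=115200, timeout=1) as link:
    trigger_keypress(link)
    time.sleep(0.9)            # the finger holds the key for 900 ms before lifting
```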

Electroencephalography recordings and processing

Brain activity was recorded using a 64-channel electrode cap with the ActiveTwo system (BioSemi). Data were analysed using the Fieldtrip software [28]. Activity from the left and right mastoids was also recorded and used to re-reference the electrodes on the scalp. Amplified voltages were sampled at 2048 Hz. The classifier analysed the data from 15 electrodes over the motor cortex (FC3, FC1, FCz, FC2, FC4, C3, C1, Cz, C2, C4, CP3, CP1, CPz, CP2, CP4). Raw EEG data were bandpass filtered in the mu- and beta-bands (i.e., 8–30 Hz). Time-frequency decomposition of all epochs was obtained using a Hanning taper with a fixed time window. In the training phase, the epochs consisted of the subtraction of the 3-s time window of the rest phase from the 3-s time window of the imagery phase. In the main experimental conditions, the epochs consisted of the 500-ms period preceding each keypress, either with participants’ own fingers or with the robotic hand. Ocular activity was removed through Independent Component Analysis (ICA).
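As a rough illustration of this preprocessing step (the original analysis was done in Fieldtrip/Matlab), the Python sketch below band-pass filters the motor-cortex channels and computes Hann-windowed band power in the mu and beta bands for one epoch; the array shapes and the Welch segment length are assumptions, not the authors' exact settings.

```python
import numpy as np
from scipy.signal import butter, filtfilt, welch

FS = 2048  # Hz, sampling rate reported in the paper
MOTOR_CHANNELS = ["FC3", "FC1", "FCz", "FC2", "FC4",
                  "C3", "C1", "Cz", "C2", "C4",
                  "CP3", "CP1", "CPz", "CP2", "CP4"]

def bandpass(data, lo=8.0, hi=30.0, fs=FS, order=4):
    """Zero-phase band-pass filter covering the mu and beta bands (8-30 Hz)."""
    b, a = butter(order, [lo / (fs / 2), hi / (fs / 2)], btype="band")
    return filtfilt(b, a, data, axis=-1)

def band_power(epoch, band, fs=FS):
    """Mean Hann-windowed Welch power within `band` for an epoch of shape
    (n_channels, n_samples); the 0.5-s segment length is an assumption."""
    freqs, psd = welch(epoch, fs=fs, window="hann", nperseg=fs // 2)
    mask = (freqs >= band[0]) & (freqs <= band[1])
    return psd[:, mask].mean(axis=-1)

# Example: mu (8-12 Hz) and beta (13-30 Hz) power for a 500-ms pre-keypress epoch
epoch = bandpass(np.random.randn(len(MOTOR_CHANNELS), FS // 2))   # placeholder data
mu_power, beta_power = band_power(epoch, (8, 12)), band_power(epoch, (13, 30))
```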

Electromyography recordings and processing

Two external electrodes were placed over the flexor digitorum superficialis to control for finger movements. Participants were reminded to remain motionless when using the robotic hand. The signal was monitored by the experimenter during the experimental session and participants were reminded not to move if muscular contractions were detectable. In addition, the signal from these electrodes was recorded and analysed after the session in order to remove trials in which muscular activity occurred before the robotic hand movement. Data were filtered with a 10 Hz high-pass filter and baseline-corrected using the period from -2 s to -1.9 s prior to the keypress.
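A minimal sketch of such an offline rejection step is given below: each EMG trace is high-pass filtered at 10 Hz, rectified, and compared against its own -2 to -1.9 s baseline. The 3-SD rejection criterion is an assumption; the paper does not report the exact rule used.

```python
import numpy as np
from scipy.signal import butter, filtfilt

FS = 2048  # Hz

def emg_contaminated(trial, fs=FS, n_sd=3.0):
    """Flag a trial if rectified EMG before the keypress exceeds a baseline-derived
    threshold. `trial` is a 1-D EMG trace assumed to start 2 s before the keypress
    (keypress at the end of the trace); the 3-SD criterion is an assumption."""
    b, a = butter(4, 10 / (fs / 2), btype="high")
    emg = np.abs(filtfilt(b, a, trial))
    t = np.arange(trial.size) / fs - 2.0
    baseline = emg[(t >= -2.0) & (t < -1.9)]
    pre_move = emg[(t >= -1.9) & (t < 0.0)]
    return pre_move.max() > baseline.mean() + n_sd * baseline.std()

# Keep only trials without detectable muscular activity before the robotic keypress:
# clean_trials = [tr for tr in trials if not emg_contaminated(tr)]
```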

BMI training procedure

The BCILAB platform was used for the classifier [29]. The classifier had to discriminate between two different states: motor imagery of the right hand and being at rest. To obtain the relevant data, participants were invited to sit in a relaxed position and to watch a screen. A cross was presented on the screen, either alone or accompanied by a red arrow pointing to the right (see Fig 1A). The arrow was displayed for 3 s and then absent for 3 s. These two phases were followed by a 2-s resting phase in which the arrow was not present but the cross was still displayed. Markers used for classification were placed at the beginning of the 3 s of cross display and when the arrow appeared on the screen. When the cross was presented alone, participants were invited to think of nothing. Some participants reported difficulty thinking of nothing and were thus invited to think of a landscape or a blue sky. Some participants also used their own strategy, such as thinking of a banana on a table or of the screensaver of their own computer. When the red arrow appeared, participants were asked to imagine a movement of their right hand. They were told that they had to visualise the movement but also try to feel it from a somato-motor point of view, without actually performing the movement. Previous literature has indeed indicated that kinaesthetic motor imagery gives better performance than visual-motor imagery (e.g. [30]). Each training session lasted 2.5 minutes and was composed of 20 trials. Each trial was composed of 3 s for the rest phase (i.e. thinking about nothing), 3 s for the motor imagery phase and a 2-s break (see Fig 1A). The feature extraction that followed each training session was carried out with a variant of Common Spatial Patterns (CSP), and the classifier used Linear Discriminant Analysis (LDA, see [29]). We used the Filter Bank Common Spatial Pattern (FBCSP) variant, which first separates the signal into different frequency bands before applying CSP on each band. Results in each frequency band are then automatically weighted in such a way that the two conditions are optimally discriminated. Two frequency bands were selected: 8–12 Hz and 13–30 Hz. FBCSP and LDA were applied on data collected within a 1-s time window. The position of this time window inside the 3 s of rest or right-hand motor imagery was automatically optimized for each participant by sliding the time window, training the classifier, comparing the achieved accuracy by cross-validation, and keeping the classifier giving the best classification accuracy.
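The sketch below illustrates the general shape of this pipeline (filter-bank CSP features fed into an LDA, with the 1-s window position selected by cross-validation). It uses MNE-Python and scikit-learn rather than BCILAB, so it is an approximation of the approach, not the authors' code; the epoch shapes and cross-validation scheme are assumptions.

```python
import numpy as np
from mne.decoding import CSP
from scipy.signal import butter, filtfilt
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

FS = 2048
BANDS = [(8, 12), (13, 30)]   # the two frequency bands used for FBCSP in the paper

def bandpass(x, lo, hi, fs=FS, order=4):
    b, a = butter(order, [lo / (fs / 2), hi / (fs / 2)], btype="band")
    return filtfilt(b, a, x, axis=-1)

def fbcsp_lda_accuracy(epochs, labels):
    """Cross-validated accuracy of an FBCSP + LDA classifier.

    epochs: (n_trials, n_channels, n_samples); labels: rest vs right-hand imagery.
    For brevity the CSP filters are fitted on all trials before cross-validating
    the LDA; a stricter scheme would refit CSP within each fold."""
    features = []
    for lo, hi in BANDS:
        csp = CSP(n_components=4, log=True)
        features.append(csp.fit_transform(bandpass(epochs, lo, hi), labels))
    X = np.concatenate(features, axis=1)
    return cross_val_score(LinearDiscriminantAnalysis(), X, labels, cv=5).mean()

def best_window(epochs, labels, win_s=1.0, step_s=0.25):
    """Slide a 1-s window across the 3-s segments and keep the position with the
    best cross-validated accuracy, as described for the training procedure."""
    n_win, n_step = int(win_s * FS), int(step_s * FS)
    scores = {start: fbcsp_lda_accuracy(epochs[..., start:start + n_win], labels)
              for start in range(0, epochs.shape[-1] - n_win + 1, n_step)}
    return max(scores, key=scores.get), scores
```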

Fig 1. (A) Visual display of the participant’s screen during the training phase. When the cross appeared alone, the participant was told to think about nothing. When the red arrow appeared, the participant had to think about a movement of the right hand. (B) Visual display on the experimenter’s computer. On the left, the real-time prediction window and on the right, the final plot displaying the fit between the actual data and the expected data. (C) Graphical representation of the interval estimate tasks, in both the body-generated action condition (top) and the BMI-generated action condition (bottom). (D) General setup of the experiment with the robotic hand.

These training sessions were performed at least twice. The first training session was used to build an initial data set containing the EEG data of the subject and the known associated condition: “motor imagery of the right hand” or “being at rest”. This data set was then used to train the classifier. During the training of the classifier, a first theoretical indication of performance was obtained by cross-validation, which consisted of training the classifier on a subset of the data set and testing the obtained classifier on the remaining data. The second training session was used to properly evaluate the classification performance of the classifier. When the participant performed the training session a second time, the classifier predicted in real time whether the participant was at rest or in the right-hand motor imagery condition. If classification performance was still low (around 50–60%), a new model was trained based on the additional training session, combined with all previous training sessions. At most, six training sessions were performed. If classification performance failed to reach chance level (i.e. 50%) after six sessions, the experiment was terminated. We chose to also accept participants who were at chance level in order to create variability in the control of the BMI, so as to be able to study how this variability influences other factors. An online prediction of the classifier was shown to the experimenter in real time, in order to provide an idea of the real-time accuracy of the classifier (see Fig 1B). This prediction oscillated between 1 (corresponding to a classifier output of 0) and 2 (corresponding to a classifier output of 1), with 1 corresponding to rest and 2 corresponding to imagery of the right hand. The decision threshold was set at 0.5: each time the prediction was above 0.5, it was transformed into a ‘2’, and when it was below 0.5, it was transformed into a ‘1’. The classification performance score represents the number of correct predictions divided by the total number of predictions, expressed as a percentage.
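In other words, the continuous classifier output was binarized at 0.5 and compared with the known state of each prediction window. A small sketch of that computation (variable names are illustrative):

```python
import numpy as np

def classification_performance(outputs, targets, threshold=0.5):
    """Map continuous classifier outputs (0..1) to discrete states (1 = rest,
    2 = right-hand imagery) and return the percentage of correct predictions."""
    predictions = np.where(np.asarray(outputs, dtype=float) > threshold, 2, 1)
    return 100.0 * np.mean(predictions == np.asarray(targets))

# Example: outputs [0.2, 0.7, 0.9, 0.4] vs. known states [1, 2, 2, 2] -> 75.0%
print(classification_performance([0.2, 0.7, 0.9, 0.4], [1, 2, 2, 2]))
```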

Real-time feedback session

After the training session, participants were invited to freely control the robotic hand. They were asked to relax and to avoid performing movements with their right fingers or right arm. For 2–3 minutes, they could freely decide when to make the robotic hand move, without any cues. We used the model that had reached the highest classification performance. If participants reported that they did not feel they controlled the robotic hand well enough (which happened for one participant), they performed the BMI training procedure again in order to try to increase accuracy. If participants reported being satisfied with the degree of accuracy during the real-time feedback session, we used the same model for the intentional binding task that they performed directly afterwards.

Interval estimates

Participants performed the intentional binding task through the method of interval estimates [13]. This method is relatively similar to the classical intentional binding paradigm, but here, participants have to explicitly estimate and report, in ms, the duration of the delay between their keypress and the resulting tone (e.g. [21]).

When they were ready to start a trial, participants were invited to press the ENTER key on the keyboard, using their left hand. Next, a cross appeared on the screen and participants were told that they could press the ‘+’ key with their right index finger whenever they wanted (see Fig 1C). They were nonetheless asked to wait a minimum of 2s. before pressing the key because of the EEG recording. After the keypress, a tone occurred randomly after 100, 500 or 900 ms, and participants were asked to estimate the delay between the keypress and the tone. To do so, they had to enter their answer with their left hand by using the keyboard’s numeric keypad. They were informed that the delay would vary randomly between 1 and 1000 ms on a trial-by-trial basis (they were reminded that 1000 ms equals 1s). Participants were also told (1) to make use of all possible numbers between 1 and 1000, as appropriate, (2) to avoid restricting their answer space (i.e., not to keep using numbers falling between 1 and 100 for instance), and (3) to avoid rounding. This task was composed of 60 trials (20 for each delay).
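For illustration, the trial structure of this task (60 trials, 20 per action-tone delay, delays of 100, 500 and 900 ms) can be sketched as follows; the presentation callbacks are placeholders, not the authors' experiment code.

```python
import random
import time

DELAYS_MS = (100, 500, 900)   # action-tone delays used in the task
TRIALS_PER_DELAY = 20

def build_trial_list():
    """60 trials, 20 per delay, presented in random order."""
    trials = [d for d in DELAYS_MS for _ in range(TRIALS_PER_DELAY)]
    random.shuffle(trials)
    return trials

def run_trial(delay_ms, wait_for_keypress, play_tone, collect_estimate):
    """One trial: wait for the participant's keypress, play the tone after the
    scheduled delay, then collect the interval estimate (in ms). The three
    callbacks stand in for the actual stimulus-presentation routines."""
    wait_for_keypress()
    time.sleep(delay_ms / 1000.0)
    play_tone()
    return collect_estimate()
```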

The second interval estimates task was performed only if participants had reached at least chance-level classification performance after the training sessions. The procedure was similar to the first interval estimates task, with the only difference being that participants were requested to keep their right hand in a relaxed position during the entire procedure. They were told that, as before, their left hand would serve to start each trial and to enter their interval estimates (see Fig 1D). To press the ‘+’ key and thus trigger the tone, participants were told that they had to imagine, whenever they wanted, the movement they had imagined during the BMI training session in order to make the robotic hand move. When the robotic hand moved, its right index finger pressed the ‘+’ key and a tone occurred. Participants were first asked to estimate the delay that occurred between the robotic hand’s keypress and the tone. Then, they were invited to indicate, on a scale from 0 (‘the hand moved on its own’) to 10 (‘the hand moved according to my own will’), whether the movement of the robotic hand was caused by their own will, that is, whether the movement of the hand corresponded to the moment at which they imagined the movement or not. For this task, the threshold to trigger the robotic hand was set to ‘1’, meaning that the classifier output had to reach ‘2’ in order to trigger the movement of the hand. This was decided in order to reduce the number of false positives, at the cost of making the movement of the robotic hand harder to trigger.

Post-session questionnaire. Participants filled in a single questionnaire at the end of the experimental session. It assessed participants’ feelings of body-ownership, location and agency towards the robotic hand. This questionnaire was built based on questionnaires used in previous studies [20, 31] and adapted to the robotic hand [32]. Participants had to rate each of the 12 items of the scale on a scale from 0 (‘totally disagree’) to 6 (‘totally agree’) (see Table 1). We also asked participants to complete a brief questionnaire assessing their degree of cognitive fatigue on a scale from ‘0’ (no cognitive fatigue at all) to ‘10’ (very high cognitive fatigue) before and after the body-generated action condition and the BMI-generated action condition.

Table 1. RHI questionnaire—appropriation of the robotic hand.
Ownership
I felt as if I was looking at my own hand, instead of a robotic hand
I felt as if the robotic hand started to look like my real hand
I felt as if the robotic hand was my hand
I felt as if the robotic hand belonged to me
I felt as if the robotic hand was part of my body
Location
I felt as if my real hand was localized where the robotic hand was
I felt as if the robotic hand was localized where my real hand was
It seemed as if I were sensing the movement of my finger in the location where the robotic finger moved
Agency
The robotic hand moved just like I wanted it to, as if it was obeying my will
I felt as if I was controlling the movements of the robotic hand
I felt as if I was causing the movement I saw
Whenever I moved my finger I expected the robotic finger to move in the same way

Study 1—Results

We performed different analyses. First, we evaluated whether or not motor imagery involves a greater desynchronization in the contralateral component in the mu and beta bands during the training phase than when being at rest. Next, we tried to ascertain if those differences correlated with the classification performance provided by the algorithm. Next, we assessed whether or not the classification performance provided by the algorithm predicted the perceived feeling of control that participants had experienced over the robotic hand during the interval estimation task. To address our main research question, we compared the interval estimates in the body-generated action condition and in the BMI-generated action condition. We additionally investigated if the variability of the perceived control of the robotic hand amongst participants could account for those results. Then, we examined whether or not this perceived control during the task would influence the global appropriation of the robotic hand, as measured through body-ownership, location and agency over this hand. Finally, we examined the potential differences between mu and beta activity during the first interval estimate task (real hand) and the second interval estimate task (robotic hand).

Electrodes and rhythms associated with the desynchronization during the training

We conducted a repeated-measures ANOVA with Rhythm (Mu, Beta) and Electrode (C3, Cz, C4) as within-subject factors on the difference in spectral power in the mu- and beta-bands between the imagery phases and the rest phases during the training session. Data were normalized by subtracting the global mean within each frequency band from each data point and dividing the result by the global SD within the same frequency band. Since desynchronization is expected during motor imagery phases while no desynchronization is expected during rest phases, a negative value indicates that the desynchronization was stronger during motor imagery phases than during rest phases. The main effect of Rhythm was not significant (p > .4). The main effect of Electrode was significant (F(2,52) = 14.471, p < .001, η2partial = .358). Paired comparisons indicated that the desynchronization was stronger (i.e. lower power) over C3 (-.239, SD = .87) than over Cz (.412, SD = .83; t(26) = -4.874, p < .001, Cohen’s d = -.938) and than over C4 (.035, SD = .77; t(26) = -2.250, p = .033, Cohen’s d = -.433), see Fig 2A. Desynchronization was also stronger over C4 than over Cz (t(26) = 3.495, p = .002, Cohen’s d = .673). This stronger desynchronization over C3 confirmed that participants were imagining a movement of the right hand. The interaction Electrode x Rhythm was significant (F(2,52) = 28.037, p < .001, η2partial = .519). Power in the mu- and beta-bands did not differ within C3, Cz or C4 (all ps > .1). Power over C3 was lower than power over Cz in the mu-band (t(26) = -7.256, p < .001, Cohen’s d = -1.396) but not in the beta-band (p > .3). Similarly, power over C4 was lower than power over Cz in the mu-band (t(26) = 6.102, p < .001, Cohen’s d = 1.174) but not in the beta-band (p > .2). Results also indicated that power over C3 was lower than power over C4 in the beta-band (t(26) = -2.088, p = .047, Cohen’s d = -.402), whereas the corresponding difference in the mu-band was not conclusive (p = .083).
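The normalization described above is a simple z-scoring of the desynchronization values within each frequency band; a minimal sketch (variable names are illustrative):

```python
import numpy as np

def normalize_within_band(values):
    """Z-score the imagery-minus-rest power differences within one frequency band:
    subtract the global mean and divide by the global SD, as described for the ANOVA."""
    values = np.asarray(values, dtype=float)
    return (values - values.mean()) / values.std()
```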

Fig 2. (A) Topographical representations of the power in the mu-band and of the power in the beta-band during the training phase. The color bar represents the power difference between the imagery phase and the rest phase. 1 = 100% and -1 = -100%. (B) Graphical representation of the correlations between the classification performance and the power in the mu- and the beta-bands over C3. All tests were two-tailed. (C) Graphical representation of the correlation between the classification performance and the perceived control over the movement of the robotic hand.

Correlation between classification performance and mu/beta oscillations

We performed Pearson correlations between the classification performance and the mu and beta oscillations over C3, Cz and C4. Given that multiple correlations were performed, we applied a Bonferroni correction (α/6 = 0.008). After this correction, we observed a significant negative correlation between the classification performance and power in the mu-band over C3 (r = -.569, p = .002) and power in the beta-band over C3 (r = -.506, p = .007), see Fig 2B. Other correlations were not significant (all ps > .014). Further, linear regressions with classification performance as the dependent variable and power in the mu- and beta-bands over C3 as independent variables indicated that power in the mu-band over C3 (t(26) = -2.305, p = .030, Beta = -.422, VIF = 1.321) was a better predictor of classification performance than power in the beta-band over C3 (p > .1, Beta = -.299). One participant could be considered an outlier since their power in the mu-band differed by more than 5 SDs from the rest of the sample. We therefore repeated the previous analysis without this participant to ensure that their presence or absence did not change the results. We again observed that power in the mu-band (r = -.675, p < .001) significantly correlated with the classification performance. Power in the beta-band over C3 marginally correlated with the classification performance (r = -.493, p = .011). Other correlations remained non-significant (all ps > .021).
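The correlation screen described above (six tests, Bonferroni-corrected at α/6 = 0.008) can be sketched as follows; the dictionary of electrode/band power values is an illustrative input format, not the authors' data structure.

```python
from scipy.stats import pearsonr

def bonferroni_correlations(performance, power_by_measure, alpha=0.05):
    """Pearson correlation between classification performance and each power measure
    (e.g. mu/beta over C3, Cz, C4), flagged as significant only if p falls below the
    Bonferroni-corrected threshold (alpha / number of tests = 0.008 for six tests)."""
    corrected_alpha = alpha / len(power_by_measure)
    results = {}
    for name, power in power_by_measure.items():
        r, p = pearsonr(performance, power)
        results[name] = {"r": r, "p": p, "significant": p < corrected_alpha}
    return results
```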

Does classification performance predict perceived control?

We conducted a linear regression with classification performance as the independent variable and perceived control as the dependent variable. Results indicated that the classification performance score, although unknown to the participants, significantly predicted their perceived control over the robotic hand (t(26) = 2.438, p = .022, Beta = .438), see Fig 2C.

Interval estimates

Before conducting further statistical analyses, we first ensured that participants had not moved their right-hand muscles when they had to control the robotic hand through motor imagery in the second intentional binding task. We observed that only 2.21/60 trials (SD = 2.99) on average contained muscular activity prior to the movement of the robotic hand, thus confirming that participants globally managed to keep their right hand completely relaxed while using the BCI. Fig 3B shows the averaged data (after trial rejection) in the condition with participants’ real hand and in the condition with the robotic hand at the moment of the keypress. Trials containing muscular activity prior to the actual movement of the robotic hand were discarded from all further statistical analyses. We then compared the interval estimates between the two tasks in order to assess whether or not the lack of sensorimotor information coming from the muscles at the moment of the keypress would reduce the implicit sense of agency. We conducted a paired-samples t-test comparing the interval estimates in the real hand condition and the robotic hand condition. This difference was not significant (p > .8), see Fig 3A. The Bayesian version of the same analysis supported H0 (BF10 = .207). The BF value was calculated using JASP (JASP Team, 2019) with the default priors implemented in JASP [33]. Afterwards, we investigated whether the perceived control of the robotic hand during the second interval estimates task could influence the reported interval estimates in this task, on a trial-by-trial basis. We thus performed a linear regression with the interval estimates as the dependent variable and perceived control as the independent variable. This regression was not significant (p > .1).

Fig 3. (A) Graphical representation of the comparison between interval estimates in the real hand condition and in the robotic hand condition. The test was two-tailed. (B) Graphical representation of the muscular activity before the keypress with the real hand or with the robotic hand, confirming that participants used motor imagery rather than their own muscles to make the robotic hand move.

We also explored whether the reported interval estimates varied with the actual action-tone interval, depending on the experimental condition. We thus ran a repeated-measures ANOVA with Action (real hand, robotic hand) and Delay (100, 400, 700 ms) as within-subject factors on the reported interval estimates. We observed a significant interaction between Action and Delay (F(2,52) = 12.956, p < .001, η2partial = .358). Paired comparisons indicated that interval estimates were shorter when participants performed the action with the real hand (210 ms, SD = 163) compared to when they used the robotic hand (290 ms, SD = 137) for the 100-ms action-tone delay (t(26) = -2.717, p = .012, Cohen’s d = -.523). We also observed that interval estimates were longer when participants used their real hand (624 ms, SD = 133) compared to the robotic hand (541 ms, SD = 142) for the 700-ms action-tone delay (t(26) = 3.339, p = .003, Cohen’s d = .643). The difference was not significant for the intermediate action-tone delay.

Perceived control, interval estimates, and appropriation of the robotic hand

We first examined whether or not the perceived control of the robotic hand could predict a higher appropriation of the robotic hand, as measured through scores on body-ownership, location and agency. We took the averaged perceived control for each participant and computed non-parametric correlations (Spearman’s rho, ⍴) with body-ownership, location and agency. We applied a Bonferroni correction for multiple correlations (α/3 = 0.016). Results indicated that a higher score on perceived control correlated with a higher score on agency (⍴ = .620, p < .001). The other correlations were also positive but failed to reach significance (ps > .1). Then, we performed a Pearson correlation on a trial-by-trial basis between the reported perceived control and the reported interval estimates. This correlation was not significant (p > .1).

Differences in mu and beta rhythms when participants used their real hand VS motor imagery

We conducted a repeated-measures ANOVA with Action (real hand, robotic hand), Rhythm (mu, beta) and Electrode (C3, Cz, C4) on the 500-ms period preceding each keypress. Importantly, we conducted these analyses without any baseline correction. This decision was taken because we could not ensure that baselines were identical, either between the first and the second interval estimates task or between trials within the second interval estimates task. Participants could indeed press the key (with their real hand or with the robotic hand) whenever they wanted, thus resulting in different time periods before the actual keypress. Also, in the second interval estimates task, participants had to reach the maximum threshold (‘1’) to trigger the movement of the robotic hand. Thus, on some trials participants may have started to imagine the movement a relatively long (and undetermined) time before the actual keypress, while on others the robotic hand moved even without motor imagery (false positives). We therefore performed the subsequent analyses without baseline correction. Data were normalized by subtracting the global mean from each data point and dividing the result by the global SD. Due to a technical failure (i.e., EEG triggers not recorded), the data of one participant were lost. The main effect of Action was not significant (p > .2), suggesting no overall difference between keypresses performed with the participant’s own finger and those involving the robotic hand (Fig 4). The main effect of Rhythm was also not significant (p > .9). The main effect of Electrode was significant (F(2,50) = 6.568, p = .003, η2partial = .208). Consistent with the fact that participants performed or imagined a movement of the right hand, paired comparisons indicated that power was lower over C3 (-.06, SD = .83) than over C4 (.14, SD = 1.15, t(25) = -2.546, p = .017, Cohen’s d = -.499) and lower over Cz (-.077, SD = .81) than over C4 (t(25) = -2.946, p = .007, Cohen’s d = -.578). The difference between C3 and Cz was not significant (p > .8). We also observed a significant interaction Rhythm x Electrode (F(2,50) = 7.285, p = .002, η2partial = .226). Paired comparisons indicated that power in the mu-band did not statistically differ between C3 and Cz, or between Cz and C4 (all ps > .1). Power in the mu-band over C3 (-0.45, SD = .89) was marginally lower than power in the mu-band over C4 (.082, SD = 1.14, t(25) = -2.030, p = .053, Cohen’s d = -.398). Paired comparisons further indicated that power in the beta-band was lower over C3 (-.09, SD = .82) than over C4 (.21, SD = 1.20, t(25) = -2.694, p = .012, Cohen’s d = -.528) and that power in the beta-band was also lower over Cz (-.11, SD = .85) than over C4 (t(25) = -4.139, p < .001, Cohen’s d = -.812). The difference between C3 and Cz in the beta-band was not significant (p > .6). The interaction Action x Electrode was also significant (F(2,50) = 6.765, p = .003, η2partial = .213). Paired comparisons indicated that in the body-generated action condition, power was lower over C3 than over C4 (t(25) = -2.851, p = .009, Cohen’s d = -.559) and lower over Cz than over C4 (t(25) = -3.160, p = .004, Cohen’s d = -.620). The difference between power over C3 and power over Cz was not significant (p > .9). In the BMI-generated action condition, power was lower over Cz than over C4 (t(25) = -2.457, p = .021, Cohen’s d = .482). Other comparisons were not significant (all ps > .070). These results are consistent with the fact that the lateralization associated with a right-hand movement is more marked when participants execute a movement with their own right hand than when they imagine a movement of the right hand. Other interactions were not significant (all ps > .079).

Fig 4. Topographical representations of the power in the mu-band and of the power in the beta-band preceding the keypress in both the real hand condition and the robotic hand condition. Data are displayed without baseline correction.

We additionally investigated, with Pearson correlations, whether or not the difference in power in the mu- and the beta-bands between the real hand keypress and the robotic hand keypress on C3, Cz and C4 for each participant could predict the difference in interval estimates between the same two tasks. None of these correlations were significant (all ps > .2), thus suggesting that none of the differences observed in the mu- and beta-bands could predict the difference at the implicit level of the sense of agency, as measured through interval estimates. Finally, we performed Pearson correlations to check whether or not mu and/or beta desynchronization during the second interval estimates task were associated with the perceived control of the robotic hand. None of these correlations were significant, no matter the rhythm or the electrode (all ps > .4).

Cognitive fatigue after each interval estimate task

Participants rated their level of cognitive fatigue on a scale from ‘0’ (no cognitive fatigue at all) to ‘10’ (very high cognitive fatigue) before and after the body-generated action condition and the BMI-generated action condition. We conducted a repeated-measures ANOVA with Condition (body-generated, BMI-generated) and Moment (before the task, after the task) on the reported level of cognitive fatigue. We observed a main effect of Condition, with participants reporting more cognitive fatigue in the BMI-generated action condition (2.77, SD = 2.20) than in the body-generated action condition (1.68, SD = 1.32, F(1,25) = 86.510, p < .001, η2partial = .776). We also observed a main effect of Moment, with more cognitive fatigue reported after each task (4.59, SD = 1.58) than before each task (2.30, SD = 1.43, F(1,25) = 89.378, p < .001, η2partial = .781). The interaction Condition x Moment was also significant (F(1,25) = 5.749, p = .024, η2partial = .187). Paired comparisons indicated that participants reported a higher level of cognitive fatigue when they started the BMI-generated action condition (3.27, SD = 1.61) than when they started the body-generated action condition (1.35, SD = 1.59, t(25) = -6.809, p < .001, Cohen’s d = -1.335). This result is consistent with the fact that participants always performed the BMI-generated action condition after the body-generated action condition. Participants also reported a higher level of cognitive fatigue after the BMI-generated action condition (6.12, SD = 2.1) than after the body-generated action condition (3.08, SD = 1.6, t(25) = -7.354, p < .001, Cohen’s d = -1.442).

Study 1—Discussion

The main goal of Study 1 was to examine whether or not the lack of sensorimotor information when using a BMI would diminish the primary experience of agency, and to investigate whether a greater degree of control over the BMI would lead to a greater embodiment of the robotic hand.

First, we ensured that the motor imagery-based BMI that we developed was reliable. In the training session, we replicated the classical pattern of oscillations during motor imagery: The desynchronization during motor imagery, relative to rest, was greater over C3 than over Cz and C4, which is consistent with the fact that we asked participants to imagine a movement of the right hand. We additionally observed that the difference between the imagery phases and the rest phases in the mu-band over C3 was the better predictor of the classification performance provided by the algorithm after training. Even though power in the mu- and beta-bands did not differ over C3, the fact that power in the mu-band was a better predictor of classification performance than power in the beta-band is consistent with past literature comparing kinaesthetic motor imagery and visual-motor imagery [30]. Power in the mu-band has indeed been associated with the sensorimotor processing of movements [34–38], and we specifically asked our participants not only to try to imagine a movement of the right hand, but also to try to ‘feel’ that movement. This probably explains why power in the mu-band was a better predictor of classification performance than power in the beta-band. It is important to note that participants were not told what their classification performance score was after the training sessions. Yet, their perceived control over the movement of the robotic hand positively correlated with this classification performance score. Based on these results, we concluded that our BMI was reliable, since it involved motor imagery similar to that of previous studies and since the desynchronization of the mu-band during the motor imagery task predicted the classification performance score, which in turn correlated with perceived control.

To examine whether or not the lack of sensorimotor information when using a BMI would reduce the primary experience of agency, we compared a condition in which participants had to use their own hand to press a key versus a robotic hand controlled through motor imagery. When they use motor imagery to control an external device, participants do not receive sensorimotor feedback while using the robotic hand to press the key [39]. If sensorimotor information during the action is crucial for SoA, we would have expected longer interval estimates when participants used a robotic hand in comparison with when they used their own hand. However, we observed that this was not the case, since interval estimates did not differ between these two experimental conditions. It thus appears that the sensorimotor information is not the most important cue for generating a sense of agency (i.e. the cue integration theory). This result is consistent with previous studies, which showed that the congruence of visual cues was more important for SoA than sensorimotor feedback [22, 40]. It could nonetheless be argued that since participants used kinaesthetic motor imagery to control the BMI they still somehow received sensorimotor information during the preparation of the movement. However, this sensorimotor information was necessarily incongruent, since participants imagined a movement of their whole right hand while the robotic hand performed a simple keypress with its index finger. Future studies could use a BMI that does not rely on kinaesthetic motor imagery (e.g. [41]) to further explore the relevance of sensorimotor information in the generation of a sense of agency.

We also observed that for short action-tone delays, the reported interval estimates were smaller for biological movements executed with one’s own hand than for movements executed through the robotic hand. However, this pattern reversed for longer action-tone delays. This suggests that different mechanisms may drive agency for biological and non-biological movements, an aspect of our findings that warrants further research.

As in previous studies, we did not observe differences in mu and beta oscillations between motor imagery and real hand movements (e.g. [42]). We nonetheless observed that the desynchronization was more contralateral during real hand movements than during motor imagery. This may be because, in the motor imagery task, even though participants were instructed to imagine a movement of their right hand, some of them could have used their own strategy, such as imagining a movement of both hands.

We also observed that the higher the perceived control over the robotic hand was, the higher the scores of agency over the robotic hand were. For body-ownership and location scores, this correlation was also positive but did not reach significance.

Learning to control a BMI, to some extent, involves learning to perform a new movement. Successfully triggering a movement of the robotic hand with a BMI requires greater cognitive effort than simply pressing the key with one’s own finger, which is a very common movement for healthy adults. Using a BMI is also more cognitively demanding than acting with one’s own finger directly (e.g. [43]). The majority of our participants indeed reported that using the robotic hand was exhausting: They reported a higher level of cognitive fatigue in the questionnaires after the BMI-generated action condition than after the body-generated action condition. This may be due to, first, the fact that the BMI-generated action condition was always conducted after the body-generated action condition, and second, the fact that the BMI-generated action condition was particularly tiring. These factors have previously been shown to modulate the experience of agency [44–47] and could thus also have influenced our results in the present study. To diminish the level of cognitive fatigue in Study 2, we did not include the body-generated condition. Instead, we trained participants on two consecutive days with the BMI-generated action condition only. On each day, participants performed 5 training sessions instead of a variable number of sessions, as in Study 1. This was intended to keep cognitive fatigue comparable between participants.

Study 2—Method

Participants

As in Study 1, a total of 30 (new) naïve right-handed participants (13 males) were recruited. Each participant received €40 for their participation. No participants were excluded based on the exclusion criteria used in Study 1. However, we realised after the first 4 participants that an error in the code introduced a delay between the action and the tone that was longer than the intended one. Those 4 participants were excluded from further analyses involving interval estimates, but were retained to evaluate the reliability of our BMI. The mean age of all included participants was 23.77 (SD = 2.76). All participants provided written informed consent prior to the experiment. The study was approved by the local ethical committee of the Université libre de Bruxelles (054/2015).

Method and material

The method was globally similar to that of Study 1. Participants were trained on two consecutive days, at the same time of day. On the first day, they started with a 24-trial practice session for the interval estimates task. During the two experimental sessions, all participants completed a fixed number of training sessions (i.e. 5). At the end of those 5 training sessions, they performed the interval estimates task with the robotic hand controlled through the BMI. After judging the action-tone interval, participants were asked, on each trial, to judge whether the movement of the robotic hand was caused by their own will (as in Study 1) and to indicate how difficult (‘0’—not difficult at all to ‘10’—very difficult) it was to make the robotic hand move. At the end of the session, participants also completed a scale evaluating their level of cognitive fatigue (‘0’—not cognitively tired at all to ‘10’—very cognitively tired) before and after the interval estimates task. They also completed the questionnaire evaluating body-ownership, location and agency over the robotic hand on both days.

Study 2—Results

We first assessed whether we could replicate the main characteristics of the BMI classification performance observed in Study 1. To do so, we first evaluated desynchronization in the mu- and beta-bands between the imagery phases and the rest phases of the training sessions on both days. Again, we then checked whether power in the mu- and beta-bands correlated with the classification performance provided by the algorithm. Next, we assessed whether or not the classification performance predicted the perceived feeling of control that participants had over the robotic hand during the interval estimates task. Then, we investigated whether the implicit sense of agency when using the BMI differed between the two days and whether it correlated, on a trial-by-trial basis, with the reported perceived control of the hand and the reported difficulty of making the robotic hand perform the movement. We also examined whether the interval estimates correlated with the appropriation of the hand. Finally, we evaluated whether perceived control, classification performance, cognitive fatigue and the appropriation of the hand differed between days 1 and 2, and how these variables correlated with each other.

Electrodes and rhythms associated with the desynchronization during the training

We conducted a repeated-measures ANOVA with Session (Day 1, Day 2), Rhythm (mu, beta) and Electrode (C3, Cz, C4) as within-subject factors on the difference between the imagery phases and the rest phases during the training sessions. The data of 2 participants were lost due to faulty electrodes in at least one training session. The main effect of Electrode was significant (F(2,54) = 8.993, p < .001, η2partial = .250). Paired comparisons indicated that the desynchronization was stronger over C3 (-.239, SD = .85) than over Cz (.22, SD = .92; t(27) = -3.448, p = .002, Cohen’s d = -.652) and than over C4 (.01, SD = .72; t(27) = -2.517, p = .018, Cohen’s d = -.476). The desynchronization was also stronger over C4 than over Cz (t(27) = 2.393, p = .024, Cohen’s d = .452). The stronger desynchronization over C3 confirmed that participants mostly imagined a movement of the right hand, see Fig 5. Neither the main effect of Rhythm (p > .9) nor the main effect of Session (p > .9) was significant. The interaction Electrode x Rhythm was marginal (F(2,54) = 3.066, p = .055, η2partial = .102) and thus not investigated further. Other interactions were not significant (all ps > .4).

Fig 5. Topographical representations of the power in the mu-band and in the beta-band during the training session on both days. The color bar represents the power difference between the imagery phase and the rest phase. 1 = 100% and -1 = -100%.

Classification performance and correlation with mu/beta oscillations

We performed Pearson correlations between the classification performance and the mu and beta oscillations over C3, Cz and C4. Analyses were performed on both days combined to achieve better statistical power, since the effect of Session did not significantly influence power in the mu- and beta-bands. Given that multiple correlations were performed, we applied a Bonferroni correction (α/6 = 0.008). After this correction, we observed a significant correlation between classification performance and power in the mu-band (r = -.557, p < .001) and power in the beta-band (r = -.488, p < .001) over C3. The correlation was also significant with power in the mu-band over C4 (r = -.417, p = .001). Other correlations were not significant (all ps > .064). Linear regressions with classification performance as the dependent variable and power in the mu- and beta-bands over C3 and power in the mu-band over C4 as independent variables indicated that power in the mu-band over C3 (t(55) = -2.833, p = .006, Beta = -.460) was a better predictor of the classification performance than power in the beta-band over C3 (t(55) = -2.116, p = .039, Beta = -.294) and than power in the mu-band over C4, which was no longer significant (p > .6). Again, results were similar to those of Study 1, with a greater desynchronization in the mu-band over C3 being associated with a higher classification performance.

Does classification performance predict perceived control?

We again conducted a linear regression with classification performance as the independent variable and perceived control as the dependent variable, again on both sessions combined. As in Study 1, results indicated that classification performance, unbeknownst to participants, significantly predicted their perceived control over the robotic hand (t(51) = 2.247, p = .029, Beta = .303).

Interval estimates

As expected, participants reported shorter interval estimates for the 100-ms delay (442.8 ms, SD = 212.3) than for the 400-ms delay (528.9 ms, SD = 208.7) and than for the 700-ms delay (575 ms, SD = 218.72). Paired comparisons indicated that interval estimates did not significantly differ between day 1 and day 2 (p > .06). We then performed non-parametric correlations with Spearman’s rho (⍴) between the reported interval estimates, the perceived control over the movement of the robotic hand and the reported difficulty of making the robotic hand perform the movement, on a trial-by-trial basis. Bonferroni correction was applied (α/2 = 0.025). Results revealed a significant negative correlation between the interval estimates and the perceived control (⍴ = -.106, p < .001), see Fig 6A. Since longer interval estimates reflect a reduced sense of agency, this negative correlation indicates that the higher the perceived control, the higher the sense of agency. The correlation between the interval estimates and the reported difficulty was also significant (⍴ = .061, p = .001): The more difficult it was for participants to make the robotic hand move, the longer their interval estimates and thus the lower their sense of agency. Perceived control also correlated negatively with the reported difficulty (⍴ = -.184, p < .001), suggesting that the more difficult it was for participants to make the robotic hand move, the lower their perceived control of the hand. We then performed a linear regression with the interval estimates as the dependent variable and the perceived control and the reported difficulty of making the hand perform the movement as independent variables. Results indicated that perceived control (t(3184) = -4.131, p < .001, Beta = -.074) was a better predictor of interval estimates than the reported difficulty (t(3184) = 2.195, p = .028, Beta = .039), see Fig 6B. VIFs were 1.021, indicating an absence of collinearity. Further, we took the averaged interval estimates for each participant as the independent variable and conducted linear regressions to assess whether they predicted scores on body-ownership, location and agency over the robotic hand. Results were again computed on sessions 1 and 2 combined. We observed that interval estimates negatively predicted the score on agency (t(51) = -2.229, p = .031, Beta = -.382), meaning that the higher participants’ implicit sense of agency during the task, the more agency they reported over the robotic hand. Other linear regressions with body-ownership and location were not significant (all ps > .6).

Fig 6.


(A) Graphical representation of the correlations between interval estimates and the perceived control over the movement of the robotic hand (left) and the difficulty of controlling the movement of the robotic hand (right). (B) Graphical representation of the relationship between interval estimates and body-ownership, location and agency scores over the robotic hand. All tests were two-tailed. Solid lines represent a significant result and dotted lines represent a non-significant result.

Classification performance, perceived control, cognitive fatigue and appropriation of the robotic hand

We first performed a paired comparison between the classification performance on day 1 and day 2. Results indicated that classification performance was higher on day 2 (61.72%, SD = 7.43) than on day 1 (59.47%, SD = 6.79, t(29) = -2.266, p = .031, Cohen’s d = .414), showing that participants improved their performance from day 1 to day 2, see Fig 7A. However, perceived control did not differ between the two days (p > .3). We then compared the reported cognitive fatigue on day 1 and day 2 after the use of the BMI during the interval estimate task. To do so, we first subtracted the cognitive fatigue reported before the task from the cognitive fatigue reported after the task. A paired comparison on these change scores revealed that participants reported less cognitive fatigue on day 2 (2.96, SD = 1.93) than on day 1 (4.03, SD = 2.64, t(29) = 2.948, p = .006, Cohen’s d = .535). Other dependent variables associated with the appropriation of the robotic hand (i.e., body-ownership, location, agency) did not differ between the two days (all ps > .06).
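The paired comparisons reported in this paragraph could be sketched as follows. The computation of Cohen’s d from the difference scores is one common convention for paired designs (the paper does not specify which variant was used), and the data file and column names are hypothetical.

```python
import pandas as pd
from scipy.stats import ttest_rel

def cohens_d_paired(diff):
    """Cohen's d for a paired design: mean of the differences / SD of the differences."""
    return diff.mean() / diff.std(ddof=1)

# Hypothetical per-participant file with day 1 / day 2 scores.
df = pd.read_csv("study2_days.csv")

# Day 1 vs day 2 classification performance.
t, p = ttest_rel(df["perf_d1"], df["perf_d2"])
d = cohens_d_paired(df["perf_d1"] - df["perf_d2"])
print(f"performance: t = {t:.3f}, p = {p:.3f}, d = {d:.3f}")

# Cognitive fatigue: post-minus-pre change score per day, then the same paired test.
fatigue_d1 = df["fatigue_post_d1"] - df["fatigue_pre_d1"]
fatigue_d2 = df["fatigue_post_d2"] - df["fatigue_pre_d2"]
t, p = ttest_rel(fatigue_d1, fatigue_d2)
print(f"fatigue change: t = {t:.3f}, p = {p:.3f}, d = {cohens_d_paired(fatigue_d1 - fatigue_d2):.3f}")
```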

Fig 7.


(A) Graphical representations of the difference between day 1 and day 2 in the classification performance scores and in the reported cognitive fatigue. All tests were two-tailed. * indicates a p between .01 and .05. ** indicates a p between .001 and .01. (B) Graphical representation of the relationship between the perceived control over the robotic hand and the body-ownership, location and agency scores. All tests were two-tailed.

We took the averaged perceived control for each participant and ran non-parametric correlations (Spearman’s rho, ρ) with body-ownership, location and agency on both sessions combined. We applied a Bonferroni correction for multiple correlations (α/3 = 0.016). We observed a significant positive correlation between perceived control and body-ownership scores (ρ = .453, p = .001) and agency scores (ρ = .899, p < .001). The correlation between perceived control and location scores was marginal (ρ = .319, p = .019), see Fig 7B. These results suggest that a greater perceived control over the action of the robotic hand leads to a greater appropriation of the robotic hand, with higher scores on body-ownership, location and agency.
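A short, assumed sketch of these Bonferroni-corrected Spearman correlations is given below; the per-participant file and its column names are placeholders rather than the authors’ actual data structure.

```python
import pandas as pd
from scipy.stats import spearmanr

# Hypothetical per-participant file: mean perceived control and the three
# questionnaire scores, both sessions combined.
subj = pd.read_csv("study2_questionnaires.csv")

alpha_corrected = 0.05 / 3   # Bonferroni over the three questionnaire scores
for score in ["ownership", "location", "agency"]:
    rho, p = spearmanr(subj["control"], subj[score])
    flag = "significant" if p < alpha_corrected else "n.s."
    print(f"control ~ {score}: rho = {rho:.3f}, p = {p:.4f} ({flag})")
```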

Study 2 – Discussion

The pattern of results of Study 2 was broadly similar to that of Study 1. Indeed, we observed that the desynchronization in the mu band over C3 again predicted classification performance, which itself predicted perceived control over the robotic hand. Again, we observed that perceived control over the robotic hand was a reliable predictor of the appropriation of the robotic hand, as shown by significant correlations with body-ownership, location and agency.

Interestingly, classification performance was higher on day 2 than on day 1. Further, the cognitive fatigue reported was lower on day 2 than on day 1, suggesting a beneficial effect of BMI training. However, this 2-day training did not appear to be sufficient to produce an improvement in the other variables, such as the interval estimates and the scores on body-ownership, location and agency. Future studies could evaluate the impact of longer BMI training on those variables.

Results on interval estimates showed that the implicit sense of agency when using a BMI was modulated both by perceived control over the external device and by the reported difficulty of making it move. We indeed observed that the more subjective control participants reported over the robotic hand, the stronger their implicit sense of agency. Of note, this relationship was not significant in Study 1, probably because of lower statistical power: the Study 2 analyses included twice as many trials as Study 1, since both sessions were combined. We also observed that the more difficulty participants reported in performing the movements, the lower their implicit sense of agency, as shown by longer interval estimates. This is consistent with previous studies suggesting that SoA is reduced by cognitive effort (e.g. [45]).

General discussion

In this paper, we investigated how using a BMI influences the experience of being the author of an action for the user. A BMI allows users to act through an external device without any bodily or muscle movement. Previous theories on the sense of agency had emphasised the importance of sensorimotor cues for generating a sense of agency (SoA) [10, 13, 14, 48]. We found in Study 1 that the absence of sensorimotor information was not detrimental to the feeling of agency, as measured through interval estimates, which did not differ between a condition in which participants used their own hand and a condition in which they used the robotic hand to perform a keypress. It thus appears that participants can experience a ‘disembodied agency’, as long as they feel that they have active control over the device [16]. It had also been suggested that SoA for BMI can be illusory [49]. Here, we observed that this may not be the case. Participants were not aware of the classification performance score that they had reached after the training session. Yet, their own perception of control over the robotic hand correlated with this classification performance score and with their implicit experience of agency, as measured through interval estimates. This result is in line with a former study showing that participants are good at evaluating their ability to control a BMI even when they do not receive online feedback on their performance [50].

In both studies, we observed that experiencing high control over the brain-computer interface predicts a greater appropriation of the robotic hand, as measured through questionnaires assessing body-ownership, location and agency. The strong relationship between the perceived control of the robotic hand during the task and the sense of agency measured in the post-session questionnaire is coherent, since both measures assess the explicit sense of agency. The fact that perceived control of the robotic hand also positively predicts body-ownership has deeper theoretical implications for the relationship between body-ownership and SoA. For instance, in [24], the authors compared ownership scores and proprioceptive drift (i.e. an implicit measure of body-ownership, [19]) for the classic rubber hand illusion (i.e. visuo-tactile) and the active rubber hand illusion. Crucially, they found no differences between these two conditions, neither with the proprioceptive drift nor with the questionnaires. They suggested that different types of sensory information can be combined to elicit ownership perceptions and that agency is not a modulator of body-ownership, because the efferent signals present in the active version of the RHI did not increase the strength of the illusion. However, this conclusion has not been supported by other studies (e.g. [20, 51]). The present study suggests that a strong feeling of control can contribute to a greater feeling of body-ownership, which is an interesting finding for restorative medicine for amputee patients.

A limitation of the present study is that we did not use a passive control condition, as in previous studies on interval estimates. Such passive conditions involve a lack of volition in the realization of the movement, for instance by using TMS over the motor cortex [3] or by having a motoric setup force the participant’s finger to press the key (e.g. [13]). Neither Study 1 nor Study 2 included such a passive condition, which limits our conclusions about the extent to which the BMI-generated action condition involves a high sense of agency. The main reason for not introducing such control conditions was the tiring nature of the task asked of participants (i.e. learning to control a robotic hand through a BMI) and its duration (i.e. 2 hours). Adding further experimental conditions would have strongly increased participants’ fatigue during the task. The existing literature on active, voluntary conditions in which participants use their own index finger to press a key whenever they want is very extensive and consistently points towards lower interval estimates in active conditions than in passive conditions, reflecting a higher sense of agency (see [5–7] for reviews). We thus considered the body-generated action condition used in the present study, which is entirely similar to the active conditions previously reported in the literature, as the baseline associated with a high sense of agency. This condition was then compared to the BMI-generated action condition in order to evaluate whether using a BMI would lead to a sense of agency as high as in the body-generated action condition. In Study 2, we assumed that the results of Study 1 were replicated for the BMI-generated action condition, thus resulting in a sense of agency on both Days 1 and 2. However, to draw more reliable conclusions about the sense of agency for BMI-generated actions, a passive control condition should be added.

Several articles have argued that demand characteristics [52] and expectancies can predict scores on the classical rubber hand illusion (RHI) questionnaires [53]. In the present study, we limited as much as possible the effects of both demand characteristics and expectancies on participants’ answers to the questionnaire on the appropriation of the robotic hand: (1) participants were not told anything about the rubber hand illusion or the fact that some people may experience the robotic hand as a part of their body during the experiment; (2) we did not tell them in advance that they would have to fill in a questionnaire regarding their appropriation of the robotic hand at the end of the experiment; (3) we avoided creating a proper classical ‘rubber hand illusion’: the robotic hand was placed at a congruent location on the table with respect to the participant’s arm, but we did not use a blanket to cover their real arm, nor did we use a paintbrush to induce an illusion. This procedure limited the possibility for participants to guess what we would assess, since in the classical rubber hand illusion placing the blanket on the participant’s arm makes them realize that we are investigating to what extent they appropriate this hand in their body schema. However, we cannot fully rule out the influence of demand characteristics and expectancies, as participants perceiving better control over the hand, even without knowing their own classification performance, could have intuitively given higher scores on the questionnaires. Control studies could thus manipulate demand characteristics in order to evaluate their effect on the appropriation of a robotic hand in a brain-machine interface procedure [53].

Finally, another limitation is that we could not entirely rule out the presence of sub-threshold muscular activity that could have triggered the BMI during the task. The electrodes placed over the flexor digitorum superficialis do not record activity associated with movements of the thumb. In addition, no electrodes were placed on the left arm. Those movements were only monitored visually by the experimenter during the task.

Because the brain-computer interface was connected to an external robotic hand via a USB port, there was a latency between the moment the algorithm detected a hit during motor imagery and the actual movement of the hand. This latency was estimated at about 110 ms: the maximum latency of the classifier (predictions were made every 100 ms) plus the average latency of a USB command (approximately 10 ms). One could thus argue that this delay could have negatively affected SoA when using the robotic hand in comparison with the real hand movement. However, a previous study observed that visual delays below 1 s did not modulate SoA over the movement of two virtual hands controlled through motor imagery [22]. We thus consider that a 110 ms delay did not influence our results.

One of the most important applications of BMIs is for amputees and paralysed patients. It is important to understand which factors lead to a better appropriation of the controlled device in order to improve the well-being of these patients. In the present study, we provide first evidence that using BMI interfaces is unlikely to negatively impact human SoA and that the better participants’ control, the greater their appropriation of the device.

Data Availability

Data are available on OSF with the following DOI: 10.17605/OSF.IO/SN8PJ.

Funding Statement

E.A.C. was supported by the FRS-F.N.R.S (Belgium). A.D.B. was supported by the Research Foundation Flanders (FWO) under grant number G026214N. This work was partially supported by the Flemish Government under the program “Onderzoeksprogramma Artificiële Intelligentie (AI) Vlaanderen”. A.C. is a research director with the F.R.S.-FNRS (Belgium). This work was partially supported by ERC Advanced Grant #340718 “Radical” to A.C.

References

1. Gallagher S. (2000). Philosophical conceptions of the self: implications for cognitive science. Trends in cognitive sciences, 4(1), 14–21. doi: 10.1016/s1364-6613(99)01417-5
2. Synofzik M., Vosgerau G., & Newen A. (2008). Beyond the comparator model: a multifactorial two-step account of agency. Consciousness and cognition, 17(1), 219–239. doi: 10.1016/j.concog.2007.03.010
3. Haggard P., Clark S., & Kalogeras J. (2002). Voluntary action and conscious awareness. Nature neuroscience, 5(4), 382–385. doi: 10.1038/nn827
4. Libet B., Gleason C. A., Wright E. W., & Pearl D. K. (1993). Time of conscious intention to act in relation to onset of cerebral activity (readiness-potential). In Neurophysiology of consciousness (pp. 249–268). Birkhäuser, Boston, MA.
5. Moore J. W., & Obhi S. S. (2012). Intentional binding and the sense of agency: a review. Consciousness and cognition, 21(1), 546–561. doi: 10.1016/j.concog.2011.12.002
6. Moore J. W. (2016). What is the sense of agency and why does it matter? Frontiers in psychology, 7, 1272. doi: 10.3389/fpsyg.2016.01272
7. Haggard P. (2017). Sense of agency in the human brain. Nature Reviews Neuroscience, 18(4), 196. doi: 10.1038/nrn.2017.14
8. Rohde M., & Ernst M. O. (2016). Time, agency, and sensory feedback delays during action. Current Opinion in Behavioral Sciences, 8, 193–199.
9. Limerick H., Coyle D., & Moore J. W. (2014). The experience of agency in human-computer interactions: a review. Frontiers in human neuroscience, 8, 643. doi: 10.3389/fnhum.2014.00643
10. Coyle D., Moore J., Kristensson P. O., Fletcher P., & Blackwell A. (2012, May). I did that! Measuring users’ experience of agency in their own actions. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (pp. 2025–2034).
11. Blakemore S. J., Wolpert D., & Frith C. (2000). Why can’t you tickle yourself? Neuroreport, 11(11), R11–R16. doi: 10.1097/00001756-200008030-00002
12. Chambon V., & Haggard P. (2013). Premotor or ideomotor: How does the experience of action come about? In Action science: Foundations of an emerging discipline (p. 359).
13. Moore J. W., Wegner D. M., & Haggard P. (2009). Modulating the sense of agency with external cues. Consciousness and cognition, 18(4), 1056–1064. doi: 10.1016/j.concog.2009.05.004
14. Moore J. W., & Fletcher P. C. (2012). Sense of agency in health and disease: a review of cue integration approaches. Consciousness and cognition, 21(1), 59–68. doi: 10.1016/j.concog.2011.08.010
15. Lebedev M. A., & Nicolelis M. A. (2006). Brain–machine interfaces: past, present and future. TRENDS in Neurosciences, 29(9), 536–546. doi: 10.1016/j.tins.2006.07.004
16. Steinert S., Bublitz C., Jox R., & Friedrich O. (2019). Doing things with thoughts: Brain-computer interfaces and disembodied agency. Philosophy & Technology, 32(3), 457–482.
17. Haselager P. (2013). Did I do that? Brain–computer interfacing and the sense of agency. Minds and Machines, 23(3), 405–418.
18. Tamburrini G. (2009). Brain to computer communication: ethical perspectives on interaction models. Neuroethics, 2(3), 137–149.
19. Botvinick M., & Cohen J. (1998). Rubber hands ‘feel’ touch that eyes see. Nature, 391(6669), 756. doi: 10.1038/35784
20. Kalckert A., & Ehrsson H. H. (2012). Moving a rubber hand that feels like your own: a dissociation of ownership and agency. Frontiers in human neuroscience, 6, 40. doi: 10.3389/fnhum.2012.00040
21. Caspar E. A., Cleeremans A., & Haggard P. (2015). The relationship between human agency and embodiment. Consciousness and cognition, 33, 226–236. doi: 10.1016/j.concog.2015.01.007
22. Evans N., Gale S., Schurger A., & Blanke O. (2015). Visual feedback dominates the sense of agency for brain-machine actions. PloS one, 10(6). doi: 10.1371/journal.pone.0130019
23. Farrer C., Valentin G., & Hupé J. M. (2013). The time windows of the sense of agency. Consciousness and cognition, 22(4), 1431–1441. doi: 10.1016/j.concog.2013.09.010
24. Kalckert A., & Ehrsson H. H. (2014). The moving rubber hand illusion revisited: Comparing movements and visuotactile stimulation to induce illusory ownership. Consciousness and cognition, 26, 117–132. doi: 10.1016/j.concog.2014.02.003
25. Vidaurre C., & Blankertz B. (2010). Towards a cure for BCI illiteracy. Brain topography, 23(2), 194–198. doi: 10.1007/s10548-009-0121-6
26. De Beir A., Caspar E., Yernaux F., da Gama P. M. D. S., Vanderborght B., & Cleeremans A. (2014, August). Developing new frontiers in the rubber hand illusion: Design of an open source robotic hand to better understand prosthetics. In The 23rd IEEE International Symposium on Robot and Human Interactive Communication (pp. 905–910). IEEE.
27. Beckerle P., De Beir A., Schürmann T., & Caspar E. A. (2016, August). Human body schema exploration: analyzing design requirements of robotic hand and leg illusions. In 2016 25th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN) (pp. 763–768). IEEE.
28. Oostenveld R., Fries P., Maris E., & Schoffelen J. M. (2011). FieldTrip: open source software for advanced analysis of MEG, EEG, and invasive electrophysiological data. Computational intelligence and neuroscience, 2011.
29. Kothe C. A., & Makeig S. (2013). BCILAB: a platform for brain–computer interface development. Journal of neural engineering, 10(5), 056014. doi: 10.1088/1741-2560/10/5/056014
30. Neuper C., Scherer R., Reiner M., & Pfurtscheller G. (2005). Imagery of motor actions: Differential effects of kinesthetic and visual–motor mode of imagery in single-trial EEG. Cognitive brain research, 25(3), 668–677. doi: 10.1016/j.cogbrainres.2005.08.014
31. Longo M. R., Schüür F., Kammers M. P., Tsakiris M., & Haggard P. (2008). What is embodiment? A psychometric approach. Cognition, 107(3), 978–998. doi: 10.1016/j.cognition.2007.12.004
32. Caspar E. A., De Beir A., Magalhaes De Saldanha da Gama P. A., Yernaux F., Cleeremans A., & Vanderborght B. (2015). New frontiers in the rubber hand experiment: when a robotic hand becomes one’s own. Behavior Research Methods, 47(3), 744–755. doi: 10.3758/s13428-014-0498-3
33. JASP Team (2019). JASP (Version 0.9.2) [Mac].
34. Chatrian G. E., Petersen M. C., & Lazarte J. A. (1959). The blocking of the rolandic wicket rhythm and some central changes related to movement. Electroencephalography and clinical neurophysiology, 11(3), 497–510. doi: 10.1016/0013-4694(59)90048-3
35. Pfurtscheller G., Neuper C., & Krausz G. (2000). Functional dissociation of lower and upper frequency mu rhythms in relation to voluntary limb movement. Clinical neurophysiology, 111(10), 1873–1879. doi: 10.1016/s1388-2457(00)00428-4
36. Pineda J. A. (2005). The functional significance of mu rhythms: translating “seeing” and “hearing” into “doing”. Brain research reviews, 50(1), 57–68. doi: 10.1016/j.brainresrev.2005.04.005
37. Hari R., & Salmelin R. (1997). Human cortical oscillations: a neuromagnetic view through the skull. Trends in neurosciences, 20(1), 44–49. doi: 10.1016/S0166-2236(96)10065-5
38. Mary A., Bourguignon M., Wens V., de Beeck M. O., Leproult R., De Tiège X., et al. (2015). Aging reduces experience-induced sensorimotor plasticity. A magnetoencephalographic study. NeuroImage, 104, 59–68. doi: 10.1016/j.neuroimage.2014.10.010
39. Wolpaw J. R. (2007). Brain–computer interfaces as new brain output pathways. The Journal of physiology, 579(3), 613–619. doi: 10.1113/jphysiol.2006.125948
40. Fourneret P., & Jeannerod M. (1998). Limited conscious monitoring of motor performance in normal subjects. Neuropsychologia, 36(11), 1133–1140. doi: 10.1016/s0028-3932(98)00006-2
41. Serby H., Yom-Tov E., & Inbar G. F. (2005). An improved P300-based brain-computer interface. IEEE Transactions on neural systems and rehabilitation engineering, 13(1), 89–98. doi: 10.1109/TNSRE.2004.841878
42. Pfurtscheller G., & Neuper C. (1997). Motor imagery activates primary sensorimotor area in humans. Neuroscience letters, 239(2–3), 65–68. doi: 10.1016/s0304-3940(97)00889-6
43. Roy R. N., Bonnet S., Charbonnier S., & Campagne A. (2013, July). Mental fatigue and working memory load estimation: interaction and implications for EEG-based passive BCI. In 2013 35th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) (pp. 6607–6610). IEEE.
44. Demanet J., Muhle-Karbe P. S., Lynn M. T., Blotenberg I., & Brass M. (2013). Power to the will: how exerting physical effort boosts the sense of agency. Cognition, 129(3), 574–578. doi: 10.1016/j.cognition.2013.08.020
45. Howard E. E., Edwards S. G., & Bayliss A. P. (2016). Physical and mental effort disrupts the implicit sense of agency. Cognition, 157, 114–125. doi: 10.1016/j.cognition.2016.08.018
46. Damen T. G., Dijksterhuis A., & Baaren R. B. V. (2014). On the other hand: nondominant hand use increases sense of agency. Social Psychological and Personality Science, 5(6), 680–683.
47. Minohara R., Wen W., Hamasaki S., Maeda T., Kato M., Yamakawa H., et al. (2016). Strength of intentional effort enhances the sense of agency. Frontiers in psychology, 7, 1165. doi: 10.3389/fpsyg.2016.01165
48. Gallagher S. (2012). Multiple aspects in the sense of agency. New ideas in psychology, 30(1), 15–31.
49. Vlek R., van Acken J. P., Beursken E., Roijendijk L., & Haselager P. (2014). BCI and a user’s judgment of agency. In Brain-Computer-Interfaces in their ethical, social and cultural contexts (pp. 193–202). Springer, Dordrecht.
50. Schurger A., Gale S., Gozel O., & Blanke O. (2017). Performance monitoring for brain-computer-interface actions. Brain and cognition, 111, 44–50. doi: 10.1016/j.bandc.2016.09.009
51. Dummer T., Picot-Annand A., Neal T., & Moore C. (2009). Movement and the rubber hand illusion. Perception, 38(2), 271–280. doi: 10.1068/p5921
52. Orne M. T. (1962). On the social psychology of the psychological experiment: With particular reference to demand characteristics and their implications. American psychologist, 17(11), 776.
53. Lush P. (2020). Demand characteristics confound the rubber hand illusion. Collabra: Psychology.

Decision Letter 0

Jane Elizabeth Aspell

15 Jun 2020

PONE-D-20-06849

How using brain-machine interfaces influences the human sense of agency

PLOS ONE

Dear Dr. Caspar,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Jul 30 2020 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

We look forward to receiving your revised manuscript.

Kind regards,

Jane Elizabeth Aspell, PhD

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. We note that you have stated that you will provide repository information for your data at acceptance. Should your manuscript be accepted for publication, we will hold it until you provide the relevant accession numbers or DOIs necessary to access your data. If you wish to make changes to your Data Availability statement, please describe these changes in your cover letter and we will update your Data Availability statement to reflect the information you provide.

3. Please include a copy of Table 1 which you refer to in your text on page 19.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Partly

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: N/A

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: No

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: PONE-D-20-06849

This study looked at the sense of agency for actions performed using an EEG-based brain-machine interface (BMI). The intentional binding effect was used as the primary dependent measure, where subjects directly estimate the interval between the cause (button press) and effect (auditory click). In two experiments there were no significant differences in interval estimates between motor actions and BMI-mediated actions, suggesting no difference in agency. The authors did, however, find relationships between other variables in the experiment, such as the sense of control, experienced difficulty, ownership of the robotic hand, and agency (as indexed by the interval estimates).

The paper is not very clearly written, and it was very difficult to make sense of it in places, often because crucial information was left out or not made clear, or jargon was undefined. It really reads like a first or second draft of a paper, rather than a paper that has been through a few rounds of internal review before being deemed ready to submit to a journal for peer review. Many of my comments are thus focused on the writing and on making things more clear. The paper does not make any strong claims, and the conclusions that are drawn seem reasonable given the data. The methods involved in controlling the robotic hand and the robotic hand itself were not given in sufficient detail (for example we do not know anything about the features used to discriminate motor imagery from rest conditions, even though this might be relevant to some of the results that involved correlation with the “theoretical accuracy”). The statistics appear to be appropriate and properly reported, except that for correlations with ordinal data (as in figure 6A) it can be better to use a non-parametric statistic like Spearman’s rho.

Specific comments (in no particular order):

p. 4, line 3: what is a “skin-based input”?

p. 6, line 21: “main age” should be “mean age”

p. 7, bottom and p. 12, line 22, and figure 3

So they used bi-polar electrodes to record EMG over the flexor digitorum superficialis, but it seems that they just monitored it on-line or on a single-trial basis (says that they used the EMG traces off-line to remove trials with muscle activity). But they did not, it seems, check for sub-threshold muscle activity during motor imagery by looking at the average EMG power over all trials. To do this you have to take the sqrt of the sum of squares in a sliding window for each trial, before you average the trials together. You have to do this if you want to rule out any muscle activity, including sub-threshold muscle activity.

Also it says that EMG data were high-pass filtered at 10 Hz, but then nothing else. You can’t really see muscle activity without at least rectifying the signal, or (better) converting to power.

p. 9, line 4: They checked to see if, after six training sessions, accuracy of the classifier was below chance. But that’s not what you want to do. You want to check to see if accuracy of the classifier is significantly *different* from chance. Performance of 51% correct might still be chance! You should always say “not different from chance” rather than “below chance”.

p. 10, line 5: "red cross” should be “red arrow”

It would be helpful and useful if you computed AUC for control of the robotic hand, just to give an idea of what level of control your subjects managed to achieve.

What do you mean by the “theoretical” accuracy score? This was not clear enough. On what data set was this theoretical accuracy determined? You should have a test set and a training set, but you did not specify what you used as training data and what you used as test data. Please clarify.

If theoretical accuracy refers to the predicted accuracy on test data given the accuracy of the classifier on then training data, then of course we expect mu and beta power to be predictive, because the classifier is almost certainly using mu and beta power to discriminate between motor imagery and rest. Using CSP and LDA is a very common recipe for EEG-based BCIs, and it tends very often to converge on mu and beta desynchronization. This brings up another point which is that you should specify in your methods section which were the features that the classifier used. You don’t have to go into all of the details of the classifier, but at a minimum you should state what the features were. Did the classifier operate on the raw time series, or (more likely) on a time-frequency representation of the data?

p. 14, label says “FIGURE 3” but I think this is figure 2.

“Location” = electrode location? This was not always clear. Maybe just call it “electrode”.

What does the word “location” refer to in the section on "Perceived control and appropriation of the robotic hand”?

Are you using the word “location” in two different ways. I know that one is the electrode location, but then it seems you are using it for something else as well. This was confusing.

For spectral power calculations on the EEG data, did you normalize, i.e. divide by the standard deviation of the baseline? I realize that you did your power spectral analyses without baseline correction because the time leading up to the action was too heterogeneous. But couldn’t you use a time window before even the start of the trial as your baseline, or perhaps during the two-second waiting period? Power is typically higher at lower frequencies because of 1/f scaling, so it is more meaningful to express your power estimates as a Z score before comparing. It is not really valid to compare power in different frequency bands. I.e. comparing power in the mu band to power in the beta band is kind of meaningless. Mu band will always win, just because there is always more power at lower frequencies.

Figure 2 A and C: what are the units???

p. 18, line 16: What are the two variables precisely? One is the difference in interval estimates between the two tasks, but what is the other one? You just refer to the “difference between the real hand key press and the robotic hand key press”, but the difference in what? Spectral power? And what is each data point? A subject (I presume)? You just need to be more clear and specific.

p. 20, line 10: This is confusing. You write “we observed a greater desynchronization in the beta band than in the mu band.” Do you mean for real hand movements? I thought that you had observed greater desynchronization in the mu band compared to the beta band (see p. 19 line 15).

You state that there was no difference in the interval estimates for real button press versus motor imagery, but did you verify that you even obtained a canonical intentional binding effect in the first place? I did not see this anywhere. Maybe there was an IB effect for both real-hand and robotic-hand keypresses, in which case you might not see any difference. Or maybe neither of them had the effect. Don’t you need a passive viewing or involuntary-motor task to control for that?

p. 24, line 6: You talk about the “difference between the imagery phases and the rest phases”, but the difference in what?? You should always specify what variable you are talking about, even if you think it is obvious. Did you mean the difference in spectral power? If so then how measured? I.e. over what time interval / frequency range? And how did you compute your power estimates? Wavelets? Band-pass filter plus Hilbert transform? In general you have not been precise enough in describing your analyses.

Figure 5: units!!

When you talk about the “mu rhythm” and “beta rhythm” what do you mean? Do you mean power in the mu-band and power in the beta band? If so then you should say that. Just saying “mu rhythm” or “beta rhythm” is not specific enough. It’s unclear.

Figure 6A: I am not sure it is appropriate or optimal to use Pearson’s correlation coefficient on a discrete variable (like perceived control).

No detail was given about the robotic hand itself, and very little detail about the BMI that was controlling it. How long was the latency between the BMI decision and the movement of the robotic hand? This information was given in the discussion, but should appear in the methods section as well. How long did it take for the robotic hand to depress the key? Was it an abrupt movement, or did the robotic hand move slowly until the key was pressed? Or did it move at a speed that was determined by the degree of BMI control? Did the key produce an audible click sound when pressed? These might seem like trivial details, but they may play a role in determining the sense of agency over the robotic hand, and the subsequent sense of agency over the keypress made by the robotic hand.

Does the theoretical accuracy predict perceived control? This was significant in both experiments and simply suggests that better performance on the part of the classifier is associated with a stronger feeling of control over the BCI (which it should). A prior study that is directly relevant here is Schurger et al, Brain & Cognition (2016) which also looked at judgements of control over a BMI.

p. 26, lines 9-11: This sentence is simply confusing. I can’t make any sense of it. Please clarify. And what do you mean by the “estimated interval estimate”? Isn’t it just either the “estimated interval” or the “interval estimate”? And what is "the reported difficulty to make the hand performing the movement”? Does this refer to the robotic hand? Do you perhaps mean “the reported difficulty making the robotic hand perform the movement”?

p. 26, lines 24-25: "Results indicated that the perceived control was a better predictor of interval estimates than the reported difficulty.” Isn’t there guaranteed to be some collinearity in this regression since we know already that perceived control and reported difficulty are correlated? And how did you determine that perceived control was the better predictor? If they are collinear then this would be difficult to ascertain. In part it could just be that this was not clearly expressed in writing. The writing definitely could stand to be improved.

p. 27, line 5: “p = -2.360”???

Again, what are “location scores”? This was unclear to me given that you used the word “location” to refer to which electrode you were looking at.

Reviewer #2: In two studies the authors lay out the importance of one’s sense of agency (SoA) over one’s actions and in particular actions performed by assistive technology via a brain-computer interface. Furthermore, they explore how the performance of the BCI as well as one’s perceived control or fatigue affect one’s SoA for the observed actions and how this may affect one’s sense of ownership over the assistive technology.

• What are the main claims of the paper and how significant are they for the discipline?

- The main claims of the paper are that i) the performance of the BCI has immediate influence over one’s explicit SoA (judgment of agency - JoA), even though the BCI actions do not produce reafferences; ii) similarly, the performance of the BCI predicts one’s implicit SoA (or feeling of Agency – FoA), as determined using intentional binding (IB) as dependent variable; iii) the perceived effort or difficulty of completing the task with the BCI had a negative effect on the FoA, and that iv) an increased SoA also aids embodiment or ownership of the device.

Generally, these claims are very interesting for the discipline, as they attempt to disentangle different aspects of one’s SoA by combining implicit and explicit judgments and linking them to movement-related cortical desynchronization.

Overall, I am not convinced the authors have controlled for all aspects of their study to fully support these claims.

• Are the claims properly placed in the context of the previous literature? Have the authors treated the literature fairly?

- The introduction of the manuscript provides an overview of relevant literature on the sense of agency, intentional binding, and BCIs. While the authors aim to disentangle these different concepts in order to motivate their study design, I disagree with some of their key arguments here. I would appreciate some clarification on these points, as they are important for the study design, the interpretation of their results, as well as their implications.

Intentional binding

- I agree that it is important to separate the FoA from the JoA, as the authors do. However, I am not convinced that the “pre-reflective experience of” the FoA can be assessed using Intentional Binding. Indeed, I would think that e.g. Moore and Obhi (2012 Consciousness & Cognition) or Ebert and Wegner (2010, also C&C) would argue that IB is rather compatible with the JoA. (I have more comments on this for the analysis.) The FoA describes an implicit, on-going aspect of performing an action or making a movement. Neither of these points is true for IB, where the action has already been completed with the press of the button. I understand that this is a general discussion point, not specific to this study, but I think it should be considered.

- Comparing the JoA and IB also seems difficult, as the former starts with one’s movement intention and ends with the press of the button (“the hand moved according to my will”), whereas the latter only reflects the cause-and-effect of the button-press followed by a tone (“the delay that occurred between the robotic hand keypress and the tone”). It could therefore be argued that the JoA and the FoA measured here concern different processes and should not directly be compared (without further justification).

- With respect to further literature on the IB paradigm, Suzuki et al.’s findings (Psychological Science 2019) suggesting that IB may simply reflect “multisensory causal binding” would be relevant to include, particularly as the current paper includes neither baseline estimations for action- or outcome-binding nor a passive control condition. The work by Rohde and Ernst (e.g. Behavioural Science, 2016) is also relevant.

Questionnaire Data

- Table 1 seems to be missing from my documents, so I am not entirely sure my comments on these data are completely accurate. However, a general point to consider for the questionnaire is – what is the control condition and is there a control question? (also see my comment in the data analysis section.) Recently, the questionnaires used in the rubber hand illusion and comparable studies have come under (even more) scrutiny. It would be great if the authors could include a brief discussion of their questionnaire results with respect to the points raised by Peter Lush (2020) “Demand Characteristics Confound the Rubber Hand Illusion” Collabra: Psychology.

• Do the data and analyses fully support the claims? If not, what other evidence is required?

- IB: Was there any evidence of an intentional binding effect? It would be interesting to see the distribution of the reported intervals – is it trimodal? (Cf. figures in Suzuki et al.) Was the interval actually reduced? As there is apparently no passive or control condition and no difference between human or robot hand movements, it is not clear if there was any effect at all. Perhaps this is something that can be explored in the existing data set. If there is no effect of binding, what does that mean with respect to the findings of study 2?

- Control condition: Moore and Obhi 2012 argue that “Out of all the factors that have been linked to intentional binding, it appears that the presence of efferent information is most central to the manifestation of intentional binding as, when efferent information is not present, such as in passive movement conditions, or passive observation of others, intentional binding is reduced or absent.” Should you not have included such a condition in order to verify an effect of IB?

- Agency Question: Were there any false positives or catch trials, in which the robotic hand moved without the user’s intention? If the hand only ever moves when participants “intend” to move, then the hand can never actually “move by its own”. Furthermore, if the question(s) are only asked after a successful triggering of the hand movement, is the latter answer not ~unattainable?

- Ownership Question: Is there a way to distinguish between ownership and agency in the current paradigm? Are the scores highly correlated? Can you exclude a counterargument such as demand characteristics being responsible for the ownership scores?

- Theoretical accuracy: The EEG (and EMG) methods are clearly described. The results with respect to mu- and beta-band suppression and electrode location are in-line with prior findings and it is good to see this BCI approach being applied to agency and ownership questions. I have a couple of questions which you can probably quite easily clarify. How does “theoretical accuracy” differ from (actual) classifier performance? Could you also briefly explain why you do not use cross-validation to measure the performance?

- On the same point, can you calculate the classifier performance during the actual experimental block? Does this match the training performance and is this a better or poorer indicator of perceived control?

- Theoretical accuracy and correlation to mu/beta oscillations: If I understand correctly, the classifier is trained to detect desynchronization in the mu/beta frequency bands. Performance is then quantified as theoretical accuracy. What does the correlation between theoretical accuracy and mu/beta oscillation actually tell us? Is this simply the confirmation that these are the trained criteria? (Apologies, if I simply missed the point here.)

- Sample size: The justification of the sample size is not very clear. Would it not be better to either calculate it based on prior or expected effect sizes and add a percentage of participants in case of BCI illiteracy? Alternatively, you could use a Bayesian approach and determine a cut-off criterion.

• PLOS ONE encourages authors to publish detailed protocols and algorithms as supporting information online. Do any particular methods used in the manuscript warrant such treatment? If a protocol is already provided, for example for a randomized controlled trial, are there any important deviations from it? If so, have the authors explained adequately why the deviations occurred?

- No issues.

• If the paper is considered unsuitable for publication in its present form, does the study itself show sufficient potential that the authors should be encouraged to resubmit a revised version?

- Not applicable.

• Are original data deposited in appropriate repositories and accession/version numbers provided for genes, proteins, mutants, diseases, etc.?

- Yes

• Are details of the methodology sufficient to allow the experiments to be reproduced?

- Yes

• Is the manuscript well organized and written clearly enough to be accessible to non-specialists?

- Yes

**********

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Aaron Schurger

Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2021 Jan 7;16(1):e0245191. doi: 10.1371/journal.pone.0245191.r002

Author response to Decision Letter 0


11 Aug 2020

Dear Editor,

We would like to thank you for your time on this manuscript. We have revised the manuscript according to the comments received from the two reviewers. You will find attached a point-by-point response to the reviewers’ comments. We hope that the manuscript has been improved and will be suitable for publication in PLOS ONE.

Sincerely,

Emilie A. Caspar, Albert De Beir, Gil Lauwers, Axel Cleeremans and Bram Vanderborght

Reviewer #1: PONE-D-20-06849

This study looked at the sense of agency for actions performed using an EEG-based brain-machine interface (BMI). The intentional binding effect was used as the primary dependent measure, where subjects directly estimate the interval between the cause (button press) and effect (auditory click). In two experiments there were no significant differences in interval estimates between motor actions and BMI-mediated actions, suggesting no difference in agency. The authors did, however, find relationships between other variables in the experiment, such as the sense of control, experienced difficulty, ownership of the robotic hand, and agency (as indexed by the interval estimates).

The paper is not very clearly written, and it was very difficult to make sense of it in places, often because crucial information was left out or not made clear, or jargon was undefined. It really reads like a first or second draft of a paper, rather than a paper that has been through a few rounds of internal review before being deemed ready to submit to a journal for peer review. Many of my comments are thus focused on the writing and on making things more clear. The paper does not make any strong claims, and the conclusions that are drawn seem reasonable given the data. The methods involved in controlling the robotic hand and the robotic hand itself were not given in sufficient detail (for example we do not know anything about the features used to discriminate motor imagery from rest conditions, even though this might be relevant to some of the results that involved correlation with the “theoretical accuracy”). The statistics appear to be appropriate and properly reported, except that for correlations with ordinal data (as in figure 6A) it can be better to use a non-parametric statistic like Spearman’s rho.

 → We thank the referee for this frank but sympathetic assessment and apologize for the quality of the language. We have revised the manuscript extensively, improving both language quality and presentation. We have also added information about the decoding algorithms we used and now hope the manuscript feels more straightforward. We reply to the referee’s specific comments below: 

Specific comments (in no particular order):

C1. p. 4, line 3: what is a “skin-based input”?

→ R1. In the paper by Coyle and colleagues, the authors used a system that detected when participants were tapping on their arm to produce an outcome (i.e. a tone). We have now added this description in the relevant passage, as follows: “For instance, Coyle, Moore, Kristensson, Fletcher and Blackwell (2012) showed that intentional binding was stronger in a condition in which participants used a skin-based input system that detected when participants were tapping on their arm to produce a resulting tone rather than in a condition in which they were tapping on a button to produce a similar tone.”

C2. p. 6, line 21”main age” should be “mean age”

→ R2. Corrected.

C3. p. 7, bottom and p. 12, line 22, and figure 3

So they used bi-polar electrodes to record EMG over the flexor digitorum superficialis, but it seems that they just monitored it on-line or on a single-trial basis (says that they used the EMG traces off-line to remove trials with muscle activity). But they did not, it seems, check for sub-threshold muscle activity during motor imagery by looking at the average EMG power over all trials. To do this you have to take the sqrt of the sum of squares in a sliding window for each trial, before you average the trials together. You have to do this if you want to rule out any muscle activity, including sub-threshold muscle activity.

 Also it says that EMG data were high-pass filtered at 10 Hz, but then nothing else. You can’t really see muscle activity without at least rectifying the signal, or (better) converting to power.

→ R3. We agree with this comment. An important aspect of the methods that we did not mention in the manuscript was that there was a baseline correction for the analysis of the EMG. This information has now been added in the relevant section. However, even using a different approach to analyse the EMG data, we cannot entirely rule out the presence of muscular activity. The flexor digitorum superficialis does not capture movements of the thumb, for instance, which we could only check visually for overt movements. There were also no electrodes placed on the left arm. We agree that these shortcomings constitute a limitation and have now added a discussion paragraph regarding this aspect, as follows (page 42): “Finally, another limitation is that we could not entirely rule out the presence of sub-threshold muscular activity that could have triggered the BMI during the task. The electrodes placed over the flexor digitorum superficialis do not record activity associated with movements of the thumb. In addition, no electrodes were placed on the left arm. Those movements were only monitored visually by the experimenter during the task.”
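For illustration only, a minimal NumPy sketch of the sliding-window RMS procedure described by the reviewer is given below; the window length, sampling rate and file name are assumptions and were not part of the original analysis.

```python
import numpy as np

def rms_envelope(emg, fs, win_ms=50):
    """Sliding-window RMS of one EMG trial: square, average within the window, square root."""
    win = max(1, int(fs * win_ms / 1000))
    kernel = np.ones(win) / win
    return np.sqrt(np.convolve(emg ** 2, kernel, mode="same"))

fs = 1000                               # assumed EMG sampling rate (Hz)
emg_trials = np.load("emg_trials.npy")  # hypothetical (n_trials, n_samples) array,
                                        # already high-pass filtered
envelopes = np.array([rms_envelope(trial, fs) for trial in emg_trials])
mean_envelope = envelopes.mean(axis=0)  # average across trials, per time point
```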

C4. p. 9, line 4: They checked to see if, after six training sessions, accuracy of the classifier was below chance. But that’s not what you want to do. You want to check to see if accuracy of the classifier is significantly *different* from chance. Performance of 51% correct might still be chance! You should always say “not different from chance” rather than “below chance”.

→ R4. We were not interested in including only participants who were above chance level, as is usually done in the classic BCI literature. Our aim was to include even people at chance level in order to have variability in our sample, that is, a distribution ranging from people only able to control the BCI at chance level to people having good control of the BCI. This variability was of interest to us in order to evaluate how different degrees of control over the BCI modulate the sense of agency as well as the embodiment of the robotic hand. We further clarified this choice on page 12 as follows: “We chose to also accept participants who were at the chance level in order to create variability in the control of the BMI, to be able to study how this variability influences other factors.” Our only criterion was to exclude participants whose performance was lower than 50%. We agree that performance should be significantly lower than 50% to truly consider that our exclusion criterion was respected, since a score of 48 or 49% could still be considered chance level. We have now modified the sentence on page 12 as follows: “At most, six training sessions were performed. If the level of classification performance failed to reach chance level (i.e. 50%) after six sessions, the experiment was then terminated.”
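To make the reviewer's point concrete, testing a participant's classification accuracy against the 50% chance level can be done with a binomial test; the counts below are made up for the example:

from scipy.stats import binomtest

n_correct, n_trials = 31, 60   # placeholder counts for one participant
result = binomtest(n_correct, n_trials, p=0.5, alternative="two-sided")
print(result.pvalue)           # a large p-value means "not different from chance"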

C5. p. 10, line 5: "red cross” should be “red arrow”

 → R5. Thank you for noticing this typo, we have now corrected it.

C6. It would be helpful and useful if you computed AUC for control of the robotic hand, just to give an idea of what level of control your subjects managed to achieve.

 → R6. We thank the referee for this suggestion. Computing the area under the curve indeed makes it possible to better describe a binary classifier. Although often used in data science, we found very few examples of this metric in our field. As the goal of this paper was to use the BCI merely as a method in behavioural research  rather than to develop a better BCI, we decided not to include the suggested AUC analyses.
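Should readers nonetheless wish to compute it, a minimal sketch of the AUC calculation with scikit-learn would look as follows; the labels and decision scores are random placeholders, not our data:

import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=100)             # placeholder labels: 0 = rest, 1 = motor imagery
decision_scores = rng.normal(size=100) + y_true   # placeholder LDA decision values
print(roc_auc_score(y_true, decision_scores))     # area under the ROC curve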

C7. What do you mean by the “theoretical” accuracy score? This was not clear enough. On what data set was this theoretical accuracy determined? You should have a test set and a training set, but you did not specify what you used as training data and what you used as test data. Please clarify.

→ R7. Thank you for this comment. We agree that the term “theoretical accuracy score” could be confusing, as pointed out by Reviewer 2 as well. We have now decided to use the terminology “classification performance” for more clarity. We have also added more information about the training sessions, as follows, on page 12: “These training sessions were performed at least twice. The first training session was used to build an initial data set containing the EEG data of the subject and the known associated condition: “motor imagery of the right hand” or “being at rest”. This data set was then used to train the classifier. During the training of the classifier, a first theoretical indication of performance was obtained by cross-validation, which consisted in training the classifier on a subset of the data set and testing the obtained classifier on the remaining data. The second training session was used to properly evaluate the classification performance of the classifier. When the participant performed the training session a second time, the classifier predicted in real time whether the participant was at rest or in the right-hand motor imagery condition. If classification performance was still low (around 50-60%), a new model was trained based on the additional training session, combined with all previous training sessions. At most, six training sessions were performed.”

C8. If theoretical accuracy refers to the predicted accuracy on test data given the accuracy of the classifier on the training data, then of course we expect mu and beta power to be predictive, because the classifier is almost certainly using mu and beta power to discriminate between motor imagery and rest. Using CSP and LDA is a very common recipe for EEG-based BCIs, and it tends very often to converge on mu and beta desynchronization. This brings up another point, which is that you should specify in your methods section which features the classifier used. You don’t have to go into all of the details of the classifier, but at a minimum you should state what the features were. Did the classifier operate on the raw time series, or (more likely) on a time-frequency representation of the data?

→ R8. Thank you for this comment. We have now added more information about the classification procedure, as follows, on page 11: “The feature extraction that followed each training session was carried out with a variant of Common Spatial Patterns (CSP), and the classifier used Linear Discriminant Analysis (LDA, see Kothe & Makeig, 2013). We used the Filter Bank Common Spatial Pattern (FBCSP) variant, which first separates the signal into different frequency bands before applying CSP on each band. Results in each frequency band are then automatically weighted in such a way that the two conditions are optimally discriminated. Two frequency bands were selected: 8-12 Hz and 13-30 Hz. FBCSP and LDA were applied on data collected within a 1 s time window. The position of this time window within the 3 s of rest or right-hand motor imagery was automatically optimized for each participant by sliding the time window, training the classifier, comparing the accuracies achieved by cross-validation, and keeping the classifier giving the best classification accuracy.”
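As an illustration of this kind of pipeline (and not the exact implementation used in the study), a minimal sketch with MNE and scikit-learn on random placeholder epochs could be:

import numpy as np
from mne.decoding import CSP
from mne.filter import filter_data
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

sfreq = 500.0
rng = np.random.default_rng(0)
epochs_data = rng.standard_normal((60, 32, 1000))   # placeholder: 60 epochs x 32 channels x 2 s
labels = rng.integers(0, 2, size=60)                # placeholder: 0 = rest, 1 = motor imagery
bands = [(8.0, 12.0), (13.0, 30.0)]                 # mu and beta bands

feats = []
for l_freq, h_freq in bands:
    filtered = filter_data(epochs_data, sfreq, l_freq, h_freq, verbose=False)
    csp = CSP(n_components=4, log=True)
    feats.append(csp.fit_transform(filtered, labels))   # log band power of CSP components
X = np.hstack(feats)

# Note: for an unbiased performance estimate, CSP should be re-fit inside each cross-validation fold.
scores = cross_val_score(LinearDiscriminantAnalysis(), X, labels, cv=5)
print(scores.mean())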

C9. p. 14, label says “FIGURE 3” but I think this is figure 2.

 → R9. We have now corrected this typo. 

C10. “Location” = electrode location? This was not always clear. Maybe just call it “electrode”. What does the word “location” refer to in the section on "Perceived control and appropriation of the robotic hand”? Are you using the word “location” in two different ways. I know that one is the electrode location, but then it seems you are using it for something else as well. This was confusing.

→ R10. We agree with this suggestion and we have now used ‘electrode’ instead of ‘location’ when necessary, since location could also refer to the ‘location’ score from the rubber hand illusion questionnaire measuring the degree of appropriation of the robotic hand.  ‘Location’ when referring to the appropriation of the robotic hand was left as before since this is in accordance with the wording used in other papers on the RHI. 

C11. For spectral power calculations on the EEG data, did you normalize, i.e. divide by the standard deviation of the baseline? I realize that you did your power spectral analyses without baseline correction because the time leading up to the action was too heterogenous. But couldn’t you use a time window before even the start of the trial as your baseline, or perhaps during the two-second waiting period? Power is typically higher at lower frequencies because of 1/f scaling, so it is more meaningful to express your power estimates as a Z score before comparing. It is not really valid to compare power in different frequency bands. I.e. comparing power in the mu band to power in the beta band is kind of meaningless. Mu band will always win, just because there is always more power at lower frequencies.

→ R11. Thank you for this insightful comment. Those two options are unfortunately not reliable. Regarding the 2 s waiting period, even though we asked participants to relax, we did not specify exactly when the waiting period finished (e.g. with a sentence appearing on the screen explicitly instructing them to relax). In the BMI-generated action condition, we cannot be entirely sure of when participants started to use motor imagery. Some participants could, for instance, have started trying to use motor imagery to make the robotic hand move after 1 s or 1.5 s. Since we do not have this information, we doubt we can use this period as a baseline. Regarding the start of the trial, participants finished the previous trial by validating their answer with an ENTER keypress. They did not have to wait for a fixed period of time before starting the next trial by pressing the ENTER key again, and most participants started the next trial right away. Therefore, this cannot constitute a proper baseline period either. We agree that future studies should include a fixed and clearly defined time period in order to obtain a proper baseline correction.

We have followed the suggestion to normalize the data by transforming the data associated with each frequency band into z-scores. We carried out all analyses again on the normalized data and report these analyses in our revision.
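Concretely, this normalisation can be sketched as follows on a placeholder array (participants x frequency bands x electrodes); each band is z-scored separately before any comparison between bands:

import numpy as np

rng = np.random.default_rng(0)
power = rng.random((23, 2, 3))   # placeholder: 23 participants x 2 bands (mu, beta) x 3 electrodes
z_power = np.empty_like(power)
for b in range(power.shape[1]):
    band = power[:, b, :]
    z_power[:, b, :] = (band - band.mean()) / band.std()   # z-score within each frequency band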

C12. Figure 2 A and C: what are the units???

 → R12. This has now been added. 

C13. p. 18, line 16: What are the two variables precisely? One is the difference in interval estimates between the two tasks, but what is the other one? You just refer to the “difference between the real hand key press and the robotic hand key press”, but the difference in what? Spectral power? And what is each data point? A subject (I presume)? You just need to be more clear and specific.

→ R13. We apologize for the confusion. We have now modified the sentence as follows: “We additionally investigated, with Pearson correlations, whether or not the difference in power in the mu- and the beta-bands between the real hand keypress and the robotic hand keypress on C3, Cz and C4 for each participant could predict the difference in interval estimates between the same two tasks.”

C14. p. 20, line 10: This is confusing. You write “we observed a greater desynchronization in the beta band than in the mu band.” Do you mean for real hand movements? I thought that you had observed greater desynchronization in the mu band compared to the beta band (see p. 19 line 15).

→ R14. Those results have changed because we have now normalized the data within each frequency band. The new results and the adapted discussion are reported in the revised version of the manuscript. Overall, power in the mu- and beta-bands no longer differs significantly.

C15. You state that there was no difference in the interval estimates for real button press versus motor imagery, but did you verify that you even obtained a canonical intentional binding effect in the first place? I did not see this anywhere. Maybe there was an IB effect for both real-hand and robotic-hand keypresses, in which case you might not see any difference. Or maybe neither of them had the effect. Don’t you need a passive viewing or involuntary-motor task to control for that?

→ R15. We agree with this comment, which is similar to comment C31 from Reviewer 2. In the classic IB literature, a comparison between two experimental conditions is necessary, generally an active and a passive condition, in order to detect changes in the sense of agency. In our study, we did not introduce a passive condition for the following reason: the experiment was already quite long and exhausting for participants, and adding experimental conditions would have led to even higher cognitive fatigue, which could have been detrimental to the interval estimate task. Our main aim was to investigate whether agency would differ between the body-generated action condition and the BMI-generated action condition, and hence we did not introduce a passive control condition, which would have been critical to know whether agency was ‘high’ in both conditions or ‘low’ in both conditions. Given the extensive existing literature on body-generated actions and interval estimates, we considered that, as in that literature, this condition reflected a ‘high agency condition’, since movements were voluntarily produced with participants’ own keypresses. Based on this, we extrapolated that agency was also high in the BMI-generated action condition. We fully agree that this extrapolation may not be sufficient, and we have now added this point as a limitation in the general discussion of the present paper, as follows: “A limitation of the present study is that we did not use a passive control condition, unlike previous studies on interval estimates. Such passive conditions involve a lack of volition in the realization of the movement, for instance by using TMS over the motor cortex (Haggard et al., 2002) or by having a motoric setup force the participant’s finger to press the key (e.g. Moore, Wegner & Haggard, 2009). Neither Study 1 nor Study 2 includes such passive conditions, thus limiting our conclusions on the extent to which a BMI-generated action involves a high sense of agency. The main reason for not introducing such control conditions stemmed from the tiring nature of the task asked of participants (i.e. learning to control a robotic hand through a BMI) and its duration (i.e. 2 hours). Adding experimental conditions would have strongly increased the fatigue of participants during the task. The existing literature on active, voluntary conditions in which participants use their own index finger to press a key whenever they want is very extensive and consistently points towards lower interval estimates in active conditions than in passive conditions, reflecting a higher sense of agency (see Moore & Obhi, 2012; Moore, 2016; Haggard, 2017 for reviews). We thus considered the body-generated action condition used in the present study, which is entirely similar to the active conditions previously reported in the literature, as the baseline associated with a high sense of agency. This condition was then compared to the BMI-generated action condition in order to evaluate whether using a BMI would lead to a similarly high sense of agency as in the body-generated action condition. In Study 2, we assumed that the results of Study 1 were replicated for the BMI-generated action condition, thus resulting in a high sense of agency on both Days 1 and 2. However, to draw more reliable conclusions on the sense of agency for BMI-generated actions, a passive control condition should be added.”

C16. p. 24, line 6: You talk about the “difference between the imagery phases and the rest phases”, but the difference in what?? You should always specify what variable you are talking about, even if you think it is obvious. Did you mean the difference in spectral power? If so then how measured? I.e. over what time interval / frequency range? And how did you compute your power estimates? Wavelets? Band-pass filter plus Hilbert transform? In general you have not been precise enough in describing your analyses.

→ R16. We agree that we should have been more precise in reporting our analyses. We have now added the following information: “We conducted a repeated measures ANOVA with Rhythm (Mu, Beta) and Electrode (C3, Cz, C4) as within-subject factors on the difference in spectral power in the mu- and beta-bands between the imagery phases and the rest phases during the training session. Data were normalized by subtracting the global mean from each data point and by dividing the result by the global SD.” Also, in the section entitled ‘Electroencephalography recordings and processing’, we have now added more information on how the power estimates were computed.
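For illustration, such a two-way repeated measures ANOVA could be run as sketched below (using the pingouin package and simulated long-format data; the column names are placeholders, not our actual file):

import numpy as np
import pandas as pd
import pingouin as pg

rng = np.random.default_rng(0)
rows = [{"subject": s, "rhythm": r, "electrode": e, "power_diff": rng.normal()}
        for s in range(23)              # placeholder participants
        for r in ("mu", "beta")
        for e in ("C3", "Cz", "C4")]    # power_diff = normalised imagery-minus-rest difference
df = pd.DataFrame(rows)

aov = pg.rm_anova(data=df, dv="power_diff", within=["rhythm", "electrode"], subject="subject")
print(aov)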

C17. Figure 5: units!!

 → R17. We have added the units.

C18. When you talk about the “mu rhythm” and “beta rhythm” what do you mean? Do you mean power in the mu-band and power in the beta band? If so then you should say that. Just saying “mu rhythm” or “beta rhythm” is not specific enough. It’s unclear.

→ R18. We have now changed the wording in the entire manuscript, as suggested. We hope that the manuscript is now clearer. 

C19. Figure 6A: I am not sure it is appropriate or optimal to use Pearson’s correlation coefficient on a discrete variable (like perceived control).

→ R19. We agree that there is an ongoing debate about whether Pearson correlations are appropriate for discrete variables with six or more levels, or whether nonparametric tests should be preferred. However, R1 is right that Spearman’s rho may be better suited here, and we have now changed the manuscript where necessary by reporting results with Spearman’s rho instead of Pearson’s r.
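For reference, the switch from Pearson's r to Spearman's rho is a one-line change with SciPy; the vectors below are random placeholders:

import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
perceived_control = rng.integers(1, 8, size=60)      # placeholder discrete ratings (1-7)
interval_estimates = rng.normal(400, 80, size=60)    # placeholder interval estimates (ms)
rho, p = spearmanr(perceived_control, interval_estimates)
print(rho, p)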

C20. No detail was given about the robotic hand itself, and very little detail about the BMI that was controlling it. How long was the latency between the BMI decision and the movement of the robotic hand? This information was given in the discussion, but should appear in the methods section as well. How long did it take for the robotic hand to depress the key? Was it an abrupt movement, or did the robotic hand move slowly until the key was pressed? Or did it move at a speed that was determined by the degree of BMI control? Did the key produce an audible click sound when pressed? These might seem like trivial details, but they may play a role in determining the sense of agency over the robotic hand, and the subsequent sense of agency over the keypress made by the robotic hand.

→ R20. We have now added a section in the Methods regarding the robotic hand. We mention its features, the publications that detail how to build it, and the time it takes to press the key. The time the robotic hand takes to press the key is indeed very important for the time estimations that participants have to make. We set this timing to 900 ms so that the robotic hand would raise the finger again only after the beep had occurred on each trial, thus avoiding temporal distortions in the interval estimates.

C21. Does the theoretical accuracy predict perceived control? This was significant in both experiments and simply suggests that better performance on the part of the classifier is associated with a stronger feeling of control over the BCI (which it should). A prior study that is directly relevant here is Schurger et al, Brain & Cognition (2016) which also looked at judgements of control over a BMI.

→ R21. There was indeed a significant correlation between classification performance and perceived control. We have now added the requested reference in the discussion, since it is indeed relevant: “This result is in line with a former study, which showed that participants are good at evaluating their ability to control a BMI even if they do not receive online feedback on their performance (Schurger, Gale, Gozel, & Blanke, 2017).”

C22. p. 26, lines 9-11: This sentence is simply confusing. I can’t make any sense of it. Please clarify. And what do you mean by the “estimated interval estimate”? Isn’t it just either the “estimated interval” or the “interval estimate”? And what is "the reported difficulty to make the hand performing the movement”? Does this refer to the robotic hand? Do you perhaps mean “the reported difficulty making the robotic hand perform the movement”?

→ R22. We have now changed these sentences as follows: “We then performed non-parametric correlations with Spearman's rho (⍴) between the reported interval estimates, the perceived control over the movement of the robotic hand, and the reported difficulty to make the robotic hand perform the movement, on a trial-by-trial basis.”

C23. p. 26, lines 24-25: "Results indicated that the perceived control was a better predictor of interval estimates than the reported difficulty.” Isn’t there guaranteed to be some collinearity in this regression since we know already that perceived control and reported difficulty are correlated? And how did you determine that perceived control was the better predictor? If they are collinear then this would be difficult to ascertain. In part it could just be that this was not clearly expressed in writing. The writing definitely could stand to be improved.

→ R23. We now report the VIFs, which rule out a collinearity effect (all VIFs ≤ 2). However, we do agree that both factors (i.e. perceived control and reported difficulty) could play antagonistic roles on the implicit sense of agency: while higher perceived control could boost the implicit sense of agency, higher difficulty could reduce it.
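For transparency, the VIF check can be sketched as follows with statsmodels; the two predictors below are simulated stand-ins for perceived control and reported difficulty:

import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(0)
perceived_control = rng.normal(size=60)                                 # placeholder predictor
reported_difficulty = 0.5 * perceived_control + rng.normal(size=60)    # correlated placeholder predictor
X = sm.add_constant(np.column_stack([perceived_control, reported_difficulty]))
vifs = [variance_inflation_factor(X, i) for i in range(1, X.shape[1])]  # skip the intercept column
print(vifs)   # values at or below ~2 suggest no problematic collinearity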

C24. p. 27, line 5: “p = -2.360”???

 → R24. We have now corrected this typo. 

C25. Again, what are “location scores”? This was unclear to me given that you used the word “location” to refer to which electrode you were looking at.

→ R25. We have now used the word ‘electrode’ when it referred to the electrode location. The word ‘location’ now only refers to the location scores from the questionnaire assessing the appropriation of the robotic hand. 

------

Reviewer #2: In two studies the authors lay out the importance of one’s sense of agency (SoA) over one’s actions and in particular actions performed by assistive technology via a brain-computer interface. Furthermore, they explore how the performance of the BCI as well as one’s perceived control or fatigue affect one’s SoA for the observed actions and how this may affect one’s sense of ownership over the assistive technology.

• What are the main claims of the paper and how significant are they for the discipline?

The main claims of the paper are that i) the performance of the BCI has immediate influence over one’s explicit SoA (judgment of agency - JoA), even though the BCI actions do not produce reafferences; ii) similarly, the performance of the BCI predicts one’s implicit SoA (or feeling of Agency – FoA), as determined using intentional binding (IB) as dependent variable; iii) the perceived effort or difficulty of completing the task with the BCI had a negative effect on the FoA, and that iv) an increased SoA also aids embodiment or ownership of the device.

Generally, these claims are very interesting for the discipline, as they attempt to disentangle different aspects of one’s SoA by combining implicit and explicit judgments and linking them to movement-related cortical desynchronization.

Overall, I am not convinced the authors have controlled for all aspects of their study to fully support these claims.

→ We thank the reviewer for judging our work and claims interesting. We hope our revision further supports these claims.

• Are the claims properly placed in the context of the previous literature? Have the authors treated the literature fairly?

C27. The introduction of the manuscript provides an overview of relevant literature on the sense of agency, intentional binding, and BCIs. While the authors aim to disentangle these different concepts in order to motivate their study design, I disagree with some of their key arguments here. I would appreciate some clarification on these points, as they are important for the study design, the interpretation of their results, as well as their implications.

Intentional binding

C26. I agree that it is important to separate the FoA from the JoA, as the authors do. However, I am not convinced that the “pre-reflective experience of” the FoA can be assessed using Intentional Binding. Indeed, I would think that e.g. Moore and Obhi (2012 Consciousness & Cognition) or Ebert and Wegner (2010, also C&C) would argue that IB is rather compatible with the JoA. (I have more comments on this for the analysis.) The FoA describes an implicit, on-going aspect of performing an action or making a movement. Neither of these points is true for IB, where the action has already been completed with the press of the button. I understand that this is a general discussion point, not specific to this study, but I think it should be considered.

→ R26. This is indeed a fair point, and many explicit measures have also been criticized in terms of how much they relate to agency, since authors classically use different questions adapted to their experimental paradigm (e.g. “Do you feel in control of the action?”, “Do you feel that you are the author of the action?”, “Do you feel that you caused the outcome?”, etc.). However, since this point is somewhat derivative with respect to our main goals, we chose not to develop this discussion point further.

C27.  Comparing the JoA and IB also seems difficult, as the former starts with one’s movement intention and ends with the press of the button (“the hand moved according to my will”), whereas the latter only reflects the cause-and-effect of the button-press followed by a tone (“the delay that occurred between the robotic hand keypress and the tone”). It could therefore be argued that the JoA and the FoA measured here concern different processes and should not directly be compared (without further justification).

→ R27. We agree with the reviewer’s comment, and that is why we do not directly compare them. We nonetheless evaluate the correlations between these two experiences of the self when one is producing an action. Several earlier papers have evaluated (through correlations) how these two experiences of the self, despite measuring different aspects, relate to each other, and that is the procedure we have used here.

C28. With respect to further literature on the IB paradigm, Suzuki et al.’s findings (Psychological Science 2019) suggesting that IB may simply reflect “multisensory causal binding” would be relevant to include, particularly as the current paper does neither includes baseline estimations for action- or outcome-binding nor a passive control condition. The work by Rohde and Ernst (e.g. Behavioural Science, 2016) is also relevant.

→ R28. According to Suzuki et al. (2019), any study wishing to draw conclusions specific to voluntary action, or similar theoretical constructs, needs to be based on the comparison of carefully matched conditions that ideally differ in just one critical aspect. Here, our conditions only differed in the sensorimotor information coming from the keypress; they were similar in terms of causality and volition. We are aware of the current debate over the effect of causality on IB. While some authors have argued that causality plays an important role (e.g. Buehner & Humphrey, 2009) in explaining variability in IB between two experimental conditions, other authors have shown that when causality and other features are controlled between two experimental conditions, so that only intentionality differs, a difference in IB still occurs (Caspar, Cleeremans & Haggard, 2018, Study 2). Very recent studies conclude that even if several factors, such as causality, play a role in the modulation of binding, intentionality also plays a key role (e.g., Lorimer, … , Buehner, 2020). However, since this debate is not the aim of the present paper, we do not discuss this literature further here.

Questionnaire Data

C29. Table 1 seems to be missing in my documents, so I am not entirely sure my comments on these data are completely accurate. However, a general point to consider for the questionnaire is – what is the control condition and is there a control question? (also see my comment in the data analysis section.) Recently, the questionnaires used in the rubber hand illusion and comparable studies have come under (even more) scrutiny. It would be great if the authors could include a brief discussion of their questionnaire results with respect to the points raised by Peter Lush (2020) “Demand Characteristics Confound the Rubber Hand Illusion” Collabra: Psychology.

→ R29. We apologize for the omission. Table 1 was indeed missing and has now been added to the manuscript. As far as we understand it, the critical difference between our design and that of Lush (2020) is that we never mentioned anything to our participants regarding a ‘robotic hand illusion’, or even that they would be given a questionnaire at the end of the experiment evaluating their appropriation of the robotic hand. We even avoided putting the classical blanket between the robotic hand and the participant to cover the participant’s arm, in order to avoid them thinking that we would evaluate the robotic hand as part of their own body. Our participants thus had limited room for creating expectancies regarding the questionnaires and the robotic hand. However, we agree that controlling for expectancies is relevant in the case of the rubber hand illusion. We have now integrated this discussion in the general discussion section, as follows: “Several articles have argued that demand characteristics (Orne, 1962) and expectancies can predict scores on the classical rubber hand illusion (RHI) questionnaires (Lush, 2020). In the present study, we limited as much as possible the effects of both demand characteristics and expectancies on participants’ answers to the questionnaire related to the appropriation of the robotic hand: (1) Participants were not told anything about the rubber hand illusion or the fact that some people may experience the robotic hand as a part of their body during the experiment; (2) We did not tell them in advance that they would have to fill in a questionnaire regarding their appropriation of the robotic hand at the end of the experiment; (3) We avoided creating a proper classical ‘rubber hand illusion’: the robotic hand was placed at a congruent position on the table with respect to the participant’s arm, but we did not use a blanket to cover their real arm, nor did we use a paintbrush to induce an illusion. This procedure limited the possibility for participants to guess what we would assess, since in the classical rubber hand illusion, placing the blanket over the participant’s arm makes them realize that we are investigating to what extent they appropriate this hand into their body schema. However, we cannot fully rule out the influence of demand characteristics and expectancies, as participants perceiving better control over the hand, even without knowing their own classification performance, could have intuitively given higher scores on the questionnaires. Control studies could thus be performed to manipulate demand characteristics in order to evaluate their effect on the appropriation of a robotic hand in a brain-machine interface procedure (Lush, 2020).”

• Do the data and analyses fully support the claims? If not, what other evidence is required?

C30. IB: Was there any evidence of an intentional binding effect? It would be interesting to see the distribution of the reported intervals – is it trimodal? (Cf. figures in Suzuki et al.) Was the interval actually reduced? As there is apparently no passive or control condition and no difference between human or robot hand movements, it is not clear if there was any effect at all. Perhaps this is something that can be explored in the existing data set. If there is no effect of binding, what does that mean with respect to the findings of study 2?

→ R30. With respect to Study 2, we cannot infer that the sense of agency was ‘strong’ or ‘weak’ on either training day, and we did not argue for this in our discussion. Rather, we can say that having good control over the BMI boosts the sense of agency, as measured through the method of interval estimates. We had three intervals: we ran those analyses with delay as an additional within-subject factor and observed that interval estimates were longer in the BMI condition than in the real hand condition for the 100 ms interval, not statistically different between the two conditions for the 400 ms interval, and shorter in the BMI condition than in the real hand condition for the 700 ms interval. In our opinion, nothing reliable can be drawn from those analyses regarding binding, and we therefore do not mention these data in the paper. We do agree, however, that this is a critical point, which is now further elaborated on in the discussion section.

C31.  Control condition: Moore and Obhi 2012 argue that “Out of all the factors that have been linked to intentional binding, it appears that the presence of efferent information is most central to the manifestation of intentional binding as, when efferent information is not present, such as in passive movement conditions, or passive observation of others, intentional binding is reduced or absent.” Should you not have included such a condition in order to verify an effect of IB?

→ R31. We agree with R2 that we should have included such control conditions. The main reason for not including them was the exhausting nature of the task (learning to control a BMI) in addition to its duration (about 2 hours). We thus considered the body-generated action condition, which was similar to that used in a host of previous temporal binding studies, as our baseline condition, corresponding to a high sense of agency. We have now added this point as a limitation in the discussion section: “A limitation of the present study is that we did not use a passive control condition, unlike previous studies on interval estimates. Such passive conditions involve a lack of volition in the realization of the movement, for instance by using TMS over the motor cortex (Haggard et al., 2002) or by having a motoric setup force the participant’s finger to press the key (e.g. Moore, Wegner & Haggard, 2009). Neither Study 1 nor Study 2 includes such passive conditions, thus limiting our conclusions on the extent to which a BMI-generated action involves a high sense of agency. The main reason for not introducing such control conditions stemmed from the tiring nature of the task asked of participants (i.e. learning to control a robotic hand through a BMI) and its duration (i.e. 2 hours). Adding experimental conditions would have strongly increased the fatigue of participants during the task. The existing literature on active, voluntary conditions in which participants use their own index finger to press a key whenever they want is very extensive and consistently points towards lower interval estimates in active conditions than in passive conditions, reflecting a higher sense of agency (see Moore & Obhi, 2012; Moore, 2016; Haggard, 2017 for reviews). We thus considered the body-generated action condition used in the present study, which is entirely similar to the active conditions previously reported in the literature, as the baseline associated with a high sense of agency. This condition was then compared to the BMI-generated action condition in order to evaluate whether using a BMI would lead to a similarly high sense of agency as in the body-generated action condition. In Study 2, we assumed that the results of Study 1 were replicated for the BMI-generated action condition, thus resulting in a high sense of agency on both Days 1 and 2. However, to draw more reliable conclusions on the sense of agency for BMI-generated actions, a passive control condition should be added.”

C32. Agency Question: Were there any false positives or catch trials, in which the robotic hand moved without the user’s intention? If the hand only ever moves when participants “intend” to move, then the hand can never actually “move by its own”. Furthermore, if the question(s) are only asked after a successful triggering of the hand movement, is the latter answer not ~unattainable?

→ R32. There were no catch trials per se, in the sense that we did not introduce trials in which the robotic hand moved on its own in the experimental design. There were, however, false positives, induced by the participants’ lack of control over the robotic hand. The explicit question about control was developed to find out when those false positives occurred: with this question, participants could tell us on each trial whether it was a false positive (i.e. the robotic hand moved ‘on its own’) or an intended trial. We thus indeed rely on participants’ perception to know whether a trial was a ‘false positive’ or not.

C33.  Ownership Question: Is there a way to distinguish between ownership and agency in the current paradigm? Are the scores highly correlated? Can you exclude a counterargument such as demand characteristics being responsible for the ownership scores?

→ R33. Agency and ownership scores were indeed correlated, as in several previous studies. We cannot fully rule out that demand characteristics contributed to the ownership scores, but we tried to limit their impact more than in former studies (see also R29). This has been added in the discussion section.

C34.  Theoretical accuracy: The EEG (and EMG) methods are clearly described. The results with respect to mu- and beta-band suppression and electrode location are in-line with prior findings and it is good to see this BCI approach being applied to agency and ownership questions. I have a couple of questions which you can probably quite easily clarify. How does “theoretical accuracy” differ from (actual) classifier performance? Could you also briefly explain why you do not use cross-validation to measure the performance?

→ R34. In fact, the theoretical accuracy score was the name that we gave to the classifier performance. We agree that this may have been confusing and, in line with Reviewer 1, we have now replaced the term ‘theoretical accuracy’ with ‘classification performance’ throughout the manuscript for the sake of clarity. More information has now been added in the manuscript regarding how the classification performance was calculated and regarding cross-validation: “We used the Filter Bank Common Spatial Pattern (FBCSP) variant, which first separates the signal into different frequency bands before applying CSP on each band. Results in each frequency band are then automatically weighted in such a way that the two conditions are optimally discriminated. Two frequency bands were selected: 8-12 Hz and 13-30 Hz. FBCSP and LDA were applied on data collected within a 1 s time window. The position of this time window within the 3 s of rest or right-hand motor imagery was automatically optimized for each participant by sliding the time window, training the classifier, comparing the accuracies achieved by cross-validation, and keeping the classifier giving the best classification accuracy.” and “These training sessions were performed at least twice. The first training session was used to build an initial data set containing the EEG data of the subject and the known associated condition: “motor imagery of the right hand” or “being at rest”. This data set was then used to train the classifier. During the training of the classifier, a first theoretical indication of performance was obtained by cross-validation, which consisted in training the classifier on a subset of the data set and testing the obtained classifier on the remaining data. The second training session was used to properly evaluate the classification performance of the classifier. When the participant performed the training session a second time, the classifier predicted in real time whether the participant was at rest or in the right-hand motor imagery condition. If classification performance was still low (around 50-60%), a new model was trained based on the additional training session, combined with all previous training sessions. At most, six training sessions were performed. If the level of classification performance failed to reach chance level (i.e. 50%) after six sessions, the experiment was then terminated.”

C35.  On the same point, can you calculate the classifier performance during the actual experimental block? Does this match the training performance and is this a better or poorer indicator of perceived control?

→ R35. The classification performance was calculated from the match between the classifier’s predictions and the participant’s known state during the training session. In the training session, the expected state was known because we alternated a cross appearing on the screen (corresponding to the rest phase) and an arrow appearing on that cross (corresponding to the motor imagery phase). This is what is represented in Figure 1A and Figure 1B. In the experimental block, participants could manipulate the robotic hand as they wanted, meaning that there was no expected match between a specific stimulus appearing on the screen and participants’ willingness to make the robotic hand move or not. It was thus not possible to calculate a classification performance score during the experimental block, since we could not ask participants to report continuously when they were imagining the ‘rest phase’ to avoid making the robotic hand move.

C36. Theoretical accuracy and correlation to mu/beta oscillations: If I understand correctly, the classifier is trained to detect desynchronization in the mu/beta frequency bands. Performance is then quantified as theoretical accuracy. What does the correlation between theoretical accuracy and mu/beta oscillation actually tell us? Is this simply the confirmation that these are the trained criteria? (Apologies, if I simply missed the point here.)

→ R36. As a reminder, we have now changed the term ‘theoretical accuracy’ into ‘classification performance’ for more clarity. In fact, the feature extraction with CSP and LDA is not directly linked to mu and beta oscillations. However, since we filter the data into the mu- and beta-bands, the classification performance score can be related to oscillations in these bands. The correlation thus suggests that CSP+LDA correctly identifies the changes in mu and beta oscillations. Following the comments of Reviewer 1 (see R7 and R8), we have now added more information regarding the classification performance in the manuscript. We hope that this section is now clearer.

C37. Sample size: The justification of the sample size is not very clear. Would it not be better to either calculate it based on prior or expected effect sizes and add a percentage of participants in case of BCI illiteracy? Alternatively, you could use a Bayesian approach and determine a cut-off criterion.

→ R37. We agree with R2, but we did not have a strong prior regarding the expected effect sizes, since no previous studies used a similar method and since we had no idea how many participants would be lower or higher than chance level with our BMI. In the end, no participant had a classification performance score lower than 50%, so no one had to be excluded on this basis. Future studies using a similar approach will include an a priori computation of the sample size.

Dear Editor,

We would like to thank you for your time on this manuscript. We have revised the manuscript according to the comments received from the two reviewers. You will find attached a point-by-point response to the reviewers’ comments. We hope that the manuscript has been improved and will be suitable for publication in PLOS One.

Sincerely,

Emilie A. Caspar, Albert De Beir, Gil Lauwers, Axel Cleeremans and Bram Vanderborght

Reviewer #1: PONE-D-20-06849

This study looked at the sense of agency for actions performed using an EEG-based brain-machine interface (BMI). The intentional binding effect was used as the primary dependent measure, where subjects directly estimate the interval between the cause (button press) and effect (auditory click). In two experiments there were no significant differences in interval estimates between motor actions and BMI-mediated actions, suggesting no difference in agency. The authors did, however, find relationships between other variables in the experiment, such as the sense of control, experienced difficulty, ownership of the robotic hand, and agency (as indexed by the interval estimates).

The paper is not very clearly written, and it was very difficult to make sense of it in places, often because crucial information was left out or not made clear, or jargon was undefined. It really reads like a first or second draft of a paper, rather than a paper that has been through a few rounds of internal review before being deemed ready to submit to a journal for peer review. Many of my comments are thus focused on the writing and on making things more clear. The paper does not make any strong claims, and the conclusions that are drawn seem reasonable given the data. The methods involved in controlling the robotic hand and the robotic hand itself were not given in sufficient detail (for example we do not know anything about the features used to discriminate motor imagery from rest conditions, even though this might be relevant to some of the results that involved correlation with the “theoretical accuracy”). The statistics appear to be appropriate and properly reported, except that for correlations with ordinal data (as in figure 6A) it can be better to use a non-parametric statistic like Spearman’s rho.

 → We thank the referee for this frank but sympathetic assessment and apologize for the quality of the language. We have revised the manuscript extensively, improving both language quality and presentation. We have also added information about the decoding algorithms we used and now hope the manuscript feels more straightforward. We reply to the referee’s specific comments below: 

Specific comments (in no particular order):

C1. p. 4, line 3: what is a “skin-based input”?

→ R1. In the paper of Coyle and colleagues, the authors used a system that was detecting when participants were tapping on their arm to produce an outcome (i.e. a tone). We have now added this description in the relevant passage, as follows: For instance, Coyle, Moore, Kristensson, Fletcher and Blackwell (2012) showed that intentional binding was stronger in a condition in which participants used a skin-based input system that detected when participants were tapping on their arm to produce a resulting tone rather than in a condition in which they were tapping on a button to produce a similar tone

C2. p. 6, line 21”main age” should be “mean age”

→ R2. Corrected.

C3. p. 7, bottom and p. 12, line 22, and figure 3

So they used bi-polar electrodes to record EMG over the flexor digitorum superficialis, but it seems that they just monitored it on-line or on a single-trial basis (says that they used the EMG traces off-line to remove trials with muscle activity). But they did not, it seems, check for sub-threshold muscle activity during motor imagery by looking at the average EMG power over all trials. To do this you have to take the sqrt of the sum of squares in a sliding window for each trial, before you average the trials together. You have to do this if you want to rule out any muscle activity, including sub-threshold muscle activity.

 Also it says that EMG data were high-pass filtered at 10 Hz, but then nothing else. You can’t really see muscle activity without at least rectifying the signal, or (better) converting to power.

→ R3. We agree with this comment. An important aspect of the methods that we did not mention in the manuscript was that there was a baseline correction for the analysis of the EMG. This information has now been added in the relevant section. However, even using a different approach to analyse the EMG data, we cannot entirely rule out the presence of muscular activity. The flexor digitorum superficialis does not control for the movement of the thumb for instance - a movement that we could only check with visual control for overts mouvements. There were also no electrodes placed on the left arm for instance. We agree that these shortcomings constitute a limitation and have now added a discussion paragraph regarding this aspect, as follows (page 42): “Finally, another limitation is that we could not entirely rule out the presence of sub-thresholded muscular activity that could have triggered the BMI during the task. The flexor digitorum superficialis where the electrodes were placed does not allow to record activity associated with the movement of the thumb. In addition, no electrodes were placed on the left arm. Those movements were only controled visually by the experimenter during the task.”. 

C4. p. 9, line 4: They checked to see if, after six training sessions, accuracy of the classifier was below chance. But that’s not what you want to do. You want to check to see if accuracy of the classifier is significantly *different* from chance. Performance of 51% correct might still be chance! You should always say “not different from chance” rather than “below chance”.

→ R4. We were not interested in including only participants that were above chance level, as is indeed usually done in the classic literature regarding BCI. Our aim was to include even people at the chance level in order to have variability in our sample, that is a distribution ranging from people only able to control the BCI at chance level to people having a good control of the BCI. This variability was of interest to us in order to evaluate how different degrees of control over the BCI result in a modulation of the sense of agency as well as in embodiment of the robotic hand. We further clarified this choice on page 12 as follows: “We chose to also accept participants who were at the chance level in order to create variability in the control of the BMI to be able to study how this variability influences other factors.” Our only criterion was excluding participants if they were lower than 50%. We agreed that performance should be significantly lower than 50% to truly consider that our exclusion criteria were respected, since indeed a score of 48 or 49% could still be considered as the chance level. We have now modified the sentence on page 12 as follows: “At most, six training sessions were performed. If the level of classification performance failed to reach chance level (i.e. 50%) after six sessions, the experiment was then terminated. ”

C5. p. 10, line 5: "red cross” should be “red arrow”

 → R5. Thank you for noticing this typo, we have now corrected it.

C6. It would be helpful and useful if you computed AUC for control of the robotic hand, just to give an idea of what level of control your subjects managed to achieve.

 → R6. We thank the referee for this suggestion. Computing the area under the curve indeed makes it possible to better describe a binary classifier. Although often used in data science, we found very few examples of this metric in our field. As the goal of this paper was to use the BCI merely as a method in behavioural research  rather than to develop a better BCI, we decided not to include the suggested AUC analyses.

C7. What do you mean by the “theoretical” accuracy score? This was not clear enough. On what data set was this theoretical accuracy determined? You should have a test set and a training set, but you did not specify what you used as training data and what you used as test data. Please clarify.

→ R7. Thank you for this comment. We agree that the theoretical accuracy score could be confusing, such as pointed out by Reviewer 2 as well. We have now decided to use the terminology “classification performance” for more clarity. We have also added more information about the training session, as follows, on page 12: “These training sessions were performed at least twice. The first training session was used to build an initial data set containing the EEG data of the subject and the known associated condition: “motor imagery of the right hand” or “being at rest”. This data set was then used to train the classifier. During the training of the classifier, a first theoretical indication of performance was obtained by cross-validation, which consisted in training the classifier using a subset of the data set, and of testing the obtained classifier on the remaining data. The second training session was used to properly evaluate the classification performance of the classifier. When the participant was asked to perform the training session a second time, the classifier was predicting in real time whether the participant was at rest or in right hand motor imagery condition. If classification performance was still low (around 50-60%), a new model was trained based on the additional training session, and combined with all previous training sessions. At most, six training sessions were performed.”

C8. If theoretical accuracy refers to the predicted accuracy on test data given the accuracy of the classifier on then training data, then of course we expect mu and beta power to be predictive, because the classifier is almost certainly using mu and beta power to discriminate between motor imagery and rest. Using CSP and LDA is a very common recipe for EEG-based BCIs, and it tends very often to converge on mu and beta desynchronization. This brings up another point which is that you should specify in your methods section which were the features that the classifier used. You don’t have to go into all of the details of the classifier, but at a minimum you should state what the features were. Did the classifier operate on the raw time series, or (more likely) on a time-frequency representation of the data?

→ R8. Thank you for this comment. We have now added more information about the classification performance, as follows, on page 11: “The feature extraction that followed each training session was carried out with a Variant of Common Spatial Pattern (CSP), and the classifier used  Linear Discriminant Analysis (LDA, see Kothe & Makeig, 2013). We used the Filter Bank Common Spatial Pattern (FBCSP) variant, which first separates the signal into different frequency bands before applying CSP on each band. Results in each frequency band are then automatically weighted in such a way that the two conditions are optimally discriminated. Two frequency bands were selected: 8-12Hz and 13-30Hz. FBCSP and LDA were applied on data collected within a 1s a time window. The position of this time window inside the - 3s of rest or right hand motor imagery was automatically optimized for each participant by sliding the time window, training the classifier, comparing achieved accuracy by cross-validation, and keeping the classifier giving the best classification accuracy. ”

C9. p. 14, label says “FIGURE 3” but I think this is figure 2.

 → R9. We have now corrected this typo. 

C10. “Location” = electrode location? This was not always clear. Maybe just call it “electrode”. What does the word “location” refer to in the section on "Perceived control and appropriation of the robotic hand”? Are you using the word “location” in two different ways. I know that one is the electrode location, but then it seems you are using it for something else as well. This was confusing.

→ R10. We agree with this suggestion and we have now used ‘electrode’ instead of ‘location’ when necessary, since location could also refer to the ‘location’ score from the rubber hand illusion questionnaire measuring the degree of appropriation of the robotic hand.  ‘Location’ when referring to the appropriation of the robotic hand was left as before since this is in accordance with the wording used in other papers on the RHI. 

C11. For spectral power calculations on the EEG data, did you normalize, i.e. divide by the standard deviation of the baseline? I realize that you did your power spectral analyses without baseline correction because the time leading up to the action was too heterogenous. But couldn’t you use a time window before even the start of the trial as your baseline, or perhaps during the two-second waiting period? Because power is typically higher at lower frequencies because of 1/f scaling, so it is more meaningful to express your power estimates as a Z score before comparing. It is not really valid to compare power at in different frequency bands. I.e. you comparing power in the mu band to power in the beta band in kind of meaningless. Mu band will always win, just because there is always more power at lower frequencies.

→ R11. Thank you for this insightful comment.Those two options are unfortunately not reliable. Regarding the 2s waiting period, even though we asked participants to relax, we did not specify exactly when the waiting period finished (e.g. with a sentence appearing on the screen for instance, explicitly instructing them to relax). In the BMI-generated action condition, we cannot be entirely sure of when participants start to use motor imagery. Some participants could for instance start trying to use motor imagery to make the robotic hand move after 1s or 1.5 s. Since we don’t have this information, we doubt we can use it  as a baseline. Regarding the start of the trial option, participants finished the former trial by validating their answer by pressing the ENTER keypress. They did not have to wait for a fixed period of time before starting the next trial by pressing the ENTER key again. Most of the participants started the next trial right away. Therefore, this cannot constitute a proper baseline period either. But we agree that for future studies we should integrate a fixed and more clear time period to obtain a proper baseline correction. 

We have followed the suggestion to normalize the data by transforming the values associated with each frequency band into z-scores. We carried out all analyses again on the normalized data and report these analyses in our revision.
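
Concretely, this normalization amounts to a z-score computed within each frequency band; a minimal sketch follows (Python; the exact grouping over which the mean and SD are taken is simplified here).

    import numpy as np

    def zscore_within_band(power):
        """z = (x - mean) / SD, computed within one frequency band so that
        mu- and beta-band power become comparable despite 1/f scaling."""
        power = np.asarray(power, dtype=float)
        return (power - power.mean()) / power.std(ddof=1)

    # mu_z = zscore_within_band(mu_power); beta_z = zscore_within_band(beta_power)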

C12. Figure 2 A and C: what are the units???

 → R12. This has now been added. 

C13. p. 18, line 16: What are the two variables precisely? One is the difference in interval estimates between the two tasks, but what is the other one? You just refer to the “difference between the real hand key press and the robotic hand key press”, but the difference in what? Spectral power? And what is each data point? A subject (I presume)? You just need to be more clear and specific.

→ R13. We apologize for the confusion. We have now modified the sentence as follows: “We additionally investigated, with Pearson correlations, whether or not the difference in power in the mu- and the beta-bands between the real hand keypress and the robotic hand keypress on C3, Cz and C4 for each participant could predict the difference in interval estimates between the same two tasks.”

C14. p. 20, line 10: This is confusing. You write “we observed a greater desynchronization in the beta band than in the mu band.” Do you mean for real hand movements? I thought that you had observed greater desynchronization in the mu band compared to the beta band (see p. 19 line 15).

→ R14. Those results have now changed because we have normalized the data within each frequency band. The new results and the adapted discussion are reported in the revised version of the manuscript. Overall, power in the mu- and beta-bands no longer differs significantly.

C15. You state that there was no difference in the interval estimates for real button press versus motor imagery, but did you verify that you even obtained a canonical intentional binding effect in the first place? I did not see this anywhere. Maybe there was an IB effect for both real-hand and robotic-hand keypresses, in which case you might not see any difference. Or maybe neither of them had the effect. Don’t you need a passive viewing or involuntary-motor task to control for that?

→ R15. We agree with this comment, which is similar to comment C31 from Reviewer 2. In the classic IB literature, a comparison between two experimental conditions, generally an active and a passive condition, is necessary in order to detect changes in the sense of agency. In our study, we did not introduce a passive condition for the following reason: the experiment was already quite long and exhausting for participants, and adding experimental conditions would have led to even higher cognitive fatigue, which could have been detrimental to the interval estimate task. Our main aim was to investigate whether agency would differ between the body-generated action condition and the BMI-generated action condition, and hence we did not introduce a passive control condition, which would have been needed to establish whether agency was ‘high’ or ‘low’ in both conditions. Given the extensive existing literature on body-generated actions and interval estimates, we considered that, as in that literature, this condition reflected a ‘high agency’ condition, since movements were voluntarily produced with participants’ own keypresses. On this basis, we extrapolated that agency was also high in the BMI-generated action condition. We fully agree that this extrapolation may not be sufficient, and we have now added this point as a limitation in the general discussion of the paper, as follows: “A limitation of the present study is that we did not include a passive control condition, as previous studies on interval estimates have done. Such passive conditions involve a lack of volition in the realization of the movement, for instance by using TMS over the motor cortex (Haggard et al., 2002) or by having a motoric setup force the participant’s finger to press the key (e.g. Moore, Wegner & Haggard, 2009). Neither Study 1 nor Study 2 includes such passive conditions, thus limiting our conclusions on the extent to which a BMI-generated action involves a high sense of agency. The main reason for not introducing such control conditions stemmed from the tiring nature of the task asked of participants (i.e. learning to control a robotic hand through a BMI) and its duration (i.e. 2 hours). Adding experimental conditions would have strongly increased participants’ fatigue during the task. The existing literature on active, voluntary conditions in which participants use their own index finger to press a key whenever they want is very extensive and consistently points towards lower interval estimates in active conditions than in passive conditions, reflecting a higher sense of agency (see Moore & Obhi, 2012; Moore, 2016; Haggard, 2017 for reviews). We thus considered that the body-generated action condition used in the present study, which is entirely similar to the active conditions previously reported in the literature, would be the baseline associated with a high sense of agency. This condition was then compared to the BMI-generated action condition in order to evaluate whether using a BMI would lead to a sense of agency as high as in the body-generated action condition. In Study 2, we assumed that the results of Study 1 were replicated for the BMI-generated action condition, thus resulting in a sense of agency on both Days 1 and 2. However, to draw more reliable conclusions on the sense of agency for BMI-generated actions, a passive control condition should be added.”

C16. p. 24, line 6: You talk about the “difference between the imagery phases and the rest phases”, but the difference in what?? You should always specify what variable you are talking about, even if you think it is obvious. Did you mean the difference in spectral power? If so then how measured? I.e. over what time interval / frequency range? And how did you compute your power estimates? Wavelets? Band-pass filter plus Hilbert transform? In general you have not been precise enough in describing your analyses.

→ R16. We agree that we should have been more precise in the way we report our analyses. We have now added the following information: “We conducted a repeated measures ANOVA with Rhythm (Mu, Beta) and Electrode (C3, Cz, C4) as within-subject factors on the difference in spectral power in the mu- and beta-bands between the imagery phases and the rest phases during the training session. Data were normalized by subtracting the global mean from each data point and dividing the result by the global SD.” In addition, in the section entitled ‘Electroencephalography recordings and processing’, we have added more information on how the power estimates were computed.
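
As an illustration of this analysis, a sketch of the repeated measures ANOVA follows (Python with statsmodels; the data frame below uses random stand-in values and our own column names, not the study’s variables).

    import numpy as np
    import pandas as pd
    from statsmodels.stats.anova import AnovaRM

    rng = np.random.default_rng(0)
    # long-format table: one row per participant x rhythm x electrode;
    # 'desync' stands for the normalized imagery-minus-rest power difference
    cells = [(p, r, e) for p in range(1, 21)
             for r in ('mu', 'beta') for e in ('C3', 'Cz', 'C4')]
    df = pd.DataFrame(cells, columns=['participant', 'rhythm', 'electrode'])
    df['desync'] = rng.normal(size=len(df))        # random stand-in values
    print(AnovaRM(data=df, depvar='desync', subject='participant',
                  within=['rhythm', 'electrode']).fit())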

C17. Figure 5: units!!

 → R17. We have added the units.

C18. When you talk about the “mu rhythm” and “beta rhythm” what do you mean? Do you mean power in the mu-band and power in the beta band? If so then you should say that. Just saying “mu rhythm” or “beta rhythm” is not specific enough. It’s unclear.

→ R18. We have now changed the wording in the entire manuscript, as suggested. We hope that the manuscript is now clearer. 

C19. Figure 6A: I am not sure it is appropriate or optimal to use Pearson’s correlation coefficient on a discrete variable (like perceived control).

→ R19. We agree that there is an ongoing debate about whether Pearson correlations can be used for discrete variables with six or more points, or whether nonparametric tests are more appropriate. R1 is right that Spearman’s rho may be better suited here, and we have now changed the manuscript where necessary by reporting results with Spearman’s rho instead of Pearson’s r.
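
For illustration, the rank-based statistic is straightforward to compute (Python with SciPy; the values below are made up solely to show the call and are not our data).

    from scipy.stats import spearmanr

    control = [1, 3, 5, 7, 2, 6, 4, 7, 3, 5]          # ordinal perceived-control ratings
    interval_ms = [620, 540, 430, 380, 600, 410, 500, 350, 560, 470]
    rho, p = spearmanr(control, interval_ms)           # Spearman's rho and its p-value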

C20. No detail was given about the robotic hand itself, and very little detail about the BMI that was controlling it. How long was the latency between the BMI decision and the movement of the robotic hand? This information was given in the discussion, but should appear in the methods section as well. How long did it take for the robotic hand to depress the key? Was it an abrupt movement, or did the robotic hand move slowly until the key was pressed? Or did it move at a speed that was determined by the degree of BMI control? Did the key produce an audible click sound when pressed? These might seem like trivial details, but they may play a role in determining the sense of agency over the robotic hand, and the subsequent sense of agency over the keypress made by the robotic hand.

→ R20. We have now added a section on the robotic hand in the methods. We mention its features, the publications that detail how to build it, and the time it takes to press the key. The time the robotic hand takes to press the key is indeed very important for the time estimation that participants have to make. We set this timing to 900 ms so that the robotic hand raised the finger again only after the beep occurred on each trial, thereby avoiding temporal distortions in the interval estimates.

C21. Does the theoretical accuracy predict perceived control? This was significant in both experiments and simply suggests that better performance on the part of the classifier is associated with a stronger feeling of control over the BCI (which it should). A prior study that is directly relevant here is Schurger et al, Brain & Cognition (2016) which also looked at judgements of control over a BMI.

→ R21. There was indeed a significant correlation between classification performance and perceived control. We have now added the requested reference in the discussion, since it is indeed relevant: “This result is in line with a former study, which showed that participants are good at evaluating their ability to control a BMI even when they do not receive online feedback on their performance (Schurger, Gale, Gozel, & Blanke, 2017).”

C22. p. 26, lines 9-11: This sentence is simply confusing. I can’t make any sense of it. Please clarify. And what do you mean by the “estimated interval estimate”? Isn’t it just either the “estimated interval” or the “interval estimate”? And what is "the reported difficulty to make the hand performing the movement”? Does this refer to the robotic hand? Do you perhaps mean “the reported difficulty making the robotic hand perform the movement”?

→ R22. We have now changed these sentences as follows: “We then performed non-parametric correlations with Spearman’s rho (⍴) between the reported interval estimates, the perceived control over the movement of the robotic hand, and the reported difficulty of making the robotic hand perform the movement, on a trial-by-trial basis.”

C23. p. 26, lines 24-25: "Results indicated that the perceived control was a better predictor of interval estimates than the reported difficulty.” Isn’t there guaranteed to be some collinearity in this regression since we know already that perceived control and reported difficulty are correlated? And how did you determine that perceived control was the better predictor? If they are collinear then this would be difficult to ascertain. In part it could just be that this was not clearly expressed in writing. The writing definitely could stand to be improved.

→ R23. We now report the VIFs, which rule out a collinearity effect (all VIFs are ≤ 2). However, we do agree that the two factors (i.e. perceived control and reported difficulty) could play antagonistic roles on the implicit sense of agency: while higher perceived control could boost the implicit sense of agency, higher difficulty could reduce it.
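
For reference, a sketch of how such VIFs can be computed (Python with statsmodels; this is an illustration rather than our analysis code, and the predictor values are made up).

    import pandas as pd
    import statsmodels.api as sm
    from statsmodels.stats.outliers_influence import variance_inflation_factor

    # the two predictors entered in the regression (illustrative values only)
    X = pd.DataFrame({'perceived_control':   [1, 3, 5, 7, 2, 6, 4, 7, 3, 5],
                      'reported_difficulty': [7, 5, 4, 2, 6, 3, 5, 1, 5, 4]})
    X = sm.add_constant(X)
    vifs = {col: variance_inflation_factor(X.values, i)
            for i, col in enumerate(X.columns) if col != 'const'}
    print(vifs)    # values at or below about 2 suggest collinearity is not a concern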

C24. p. 27, line 5: “p = -2.360”???

 → R24. We have now corrected this typo. 

C25. Again, what are “location scores”? This was unclear to me given that you used the word “location” to refer to which electrode you were looking at.

→ R25. We have now used the word ‘electrode’ when it referred to the electrode location. The word ‘location’ now only refers to the location scores from the questionnaire assessing the appropriation of the robotic hand. 

------

Reviewer #2: In two studies the authors lay out the importance of one’s sense of agency (SoA) over one’s actions and in particular actions performed by assistive technology via a brain-computer interface. Furthermore, they explore how the performance of the BCI as well as one’s perceived control or fatigue affect one’s SoA for the observed actions and how this may affect one’s sense of ownership over the assistive technology.

• What are the main claims of the paper and how significant are they for the discipline?

The main claims of the paper are that i) the performance of the BCI has immediate influence over one’s explicit SoA (judgment of agency - JoA), even though the BCI actions do not produce reafferences; ii) similarly, the performance of the BCI predicts one’s implicit SoA (or feeling of Agency – FoA), as determined using intentional binding (IB) as dependent variable; iii) the perceived effort or difficulty of completing the task with the BCI had a negative effect on the FoA, and that iv) an increased SoA also aids embodiment or ownership of the device.

Generally, these claims are very interesting for the discipline, as they attempt to disentangle different aspects of one’s SoA by combining implicit and explicit judgments and linking them to movement-related cortical desynchronization.

Overall, I am not convinced the authors have controlled for all aspects of their study to fully support these claims.

→ Thank you for finding our work and claims interesting. We hope our revision further supports these claims.

• Are the claims properly placed in the context of the previous literature? Have the authors treated the literature fairly?

C27. The introduction of the manuscript provides an overview of relevant literature on the sense of agency, intentional binding, and BCIs. While the authors aim to disentangle these different concepts in order to motivate their study design, I disagree with some of their key arguments here. I would appreciate some clarification on these points, as they are important for the study design, the interpretation of their results, as well as their implications.

Intentional binding

C26. I agree that it is important to separate the FoA from the JoA, as the authors do. However, I am not convinced that the “pre-reflective experience of” the FoA can be assessed using Intentional Binding. Indeed, I would think that e.g. Moore and Obhi (2012 Consciousness & Cognition) or Ebert and Wegner (2010, also C&C) would argue that IB is rather compatible with the JoA. (I have more comments on this for the analysis.) The FoA describes an implicit, on-going aspect of performing an action or making a movement. Neither of these points is true for IB, where the action has already been completed with the press of the button. I understand that this is a general discussion point, not specific to this study, but I think it should be considered.

→ R26. This is indeed a fair point, and many explicit measures have also been criticized in terms of how closely they relate to agency, since authors classically use different questions adapted to their experimental paradigm (e.g. “Do you feel in control of the action?”, “Do you feel that you are the author of the action?”, “Who do you feel caused the outcome?”, etc.). However, since this point is somewhat tangential to our main goals, we chose not to develop it further.

C27.  Comparing the JoA and IB also seems difficult, as the former starts with one’s movement intention and ends with the press of the button (“the hand moved according to my will”), whereas the latter only reflects the cause-and-effect of the button-press followed by a tone (“the delay that occurred between the robotic hand keypress and the tone”). It could therefore be argued that the JoA and the FoA measured here concern different processes and should not directly be compared (without further justification).

→ R27. We agree with the reviewer’s comment, and that is why we do not compare them directly. We nonetheless evaluate the correlations between these two experiences of the self when one is producing an action. Several earlier papers have evaluated (through correlations) how these two experiences of the self, despite measuring different aspects, relate to each other, and that is the procedure we have used here.

C28. With respect to further literature on the IB paradigm, Suzuki et al.’s findings (Psychological Science 2019) suggesting that IB may simply reflect “multisensory causal binding” would be relevant to include, particularly as the current paper includes neither baseline estimations for action- or outcome-binding nor a passive control condition. The work by Rohde and Ernst (e.g. Behavioural Science, 2016) is also relevant.

→ R28. According to the study of Suzuki et al. (2019), any study wishing to draw conclusions specific to voluntary action, or similar theoretical constructs, needs to be based on the comparison of carefully matched conditions that should ideally differ in just one critical aspect. Here, our conditions only differed in the sensorimotor information coming from the keypress; they were similar in terms of causality and volition. We are aware of the current debate over the effect of causality on IB. While some authors have argued that causality plays an important role in explaining variability in IB between two experimental conditions (e.g. Buehner & Humphrey, 2009), others have shown that when causality and other features are matched between two experimental conditions, so that only intentionality differs, a difference in IB still occurs (Caspar, Cleeremans & Haggard, 2018, Study 2). Very recent studies come to the conclusion that even if several factors, such as causality, play a role in the modulation of binding, intentionality also plays a key role (e.g., Lorimer, … , Buehner, 2020). However, since this debate is not the aim of the present paper, we do not discuss this literature further here.

Questionnaire Data

C29. Table 1 seems to be missing in my documents, so I am not entirely sure my comments on these data are completely accurate. However, a general point to consider for the questionnaire is: what is the control condition, and is there a control question? (Also see my comment in the data analysis section.) Recently, the questionnaires used in the rubber hand illusion and comparable studies have come under (even more) scrutiny. It would be great if the authors could include a brief discussion of their questionnaire results with respect to the points raised by Peter Lush (2020), “Demand Characteristics Confound the Rubber Hand Illusion”, Collabra: Psychology.

→ R29. We apologize for the omission. Table 1 was indeed missing and has now been added to the manuscript. As far as we understand it, the critical difference between our design and that of Lush (2020) is that we never mentioned anything to our participants regarding a ‘robotic hand illusion’, or even that they would be given a questionnaire at the end of the experiment evaluating their appropriation of the robotic hand. We even avoided placing the classical blanket between the robotic hand and the participant to cover the participant’s arm, in order to avoid them thinking that we would evaluate the robotic hand as part of their own body. Our participants thus had limited room for forming expectancies regarding the questionnaires and the robotic hand. However, we agree that controlling for expectancies is relevant in the case of the rubber hand illusion. We have now integrated this discussion in the general discussion section, as follows: “Several articles have argued that demand characteristics (Orne, 1962) and expectancies can predict scores on the classical rubber hand illusion (RHI) questionnaires (Lush, 2020). In the present study, we limited as much as possible the effects of both demand characteristics and expectancies on participants’ answers to the questionnaire related to the appropriation of the robotic hand: (1) participants were not told anything about the rubber hand illusion or the fact that some people may experience the robotic hand as a part of their body during the experiment; (2) we did not tell them in advance that they would have to fill in a questionnaire regarding their appropriation of the robotic hand at the end of the experiment; (3) we avoided recreating the classical ‘rubber hand illusion’ setup: the robotic hand was placed at a congruent location on the table with respect to the participant’s arm, but we did not use a blanket to cover their real arm, nor did we use a paintbrush to induce an illusion. This procedure limited the possibility for participants to guess what we would assess, since in the classical rubber hand illusion, placing the blanket over the participant’s arm makes them realize that we are investigating the extent to which they incorporate this hand into their body schema. However, we cannot fully rule out the influence of demand characteristics and expectancies, as participants perceiving better control over the hand, even without knowing their own classification performance, could have intuitively given higher scores on the questionnaires. Control studies could thus be performed to manipulate demand characteristics in order to evaluate their effect on the appropriation of a robotic hand in a brain-machine interface procedure (Lush, 2020).”

• Do the data and analyses fully support the claims? If not, what other evidence is required?

C30. IB: Was there any evidence of an intentional binding effect? It would be interesting to see the distribution of the reported intervals – is it trimodal? (Cf. figures in Suzuki et al.) Was the interval actually reduced? As there is apparently no passive or control condition and no difference between human or robot hand movements, it is not clear if there was any effect at all. Perhaps this is something that can be explored in the existing data set. If there is no effect of binding, what does that mean with respect to the findings of study 2?

→ R30. With respect to Study 2, we cannot infer that the sense of agency was ‘strong’ or ‘weak’ on either training day, and we did not argue this in our discussion. Rather, we can say that having good control over the BMI boosts the sense of agency, as measured through the method of interval estimates. We used three intervals: we ran those analyses with Delay as an additional within-subject factor and observed that interval estimates were longer for the BMI condition than for the real-hand condition at the 100 ms interval, did not differ statistically between the two conditions at the 400 ms interval, and were shorter for the BMI condition than for the real-hand condition at the 700 ms interval. In our opinion, nothing reliable regarding binding can be drawn from those analyses, and we therefore do not report these data in the paper. We do agree, however, that this is a critical point, which is now further elaborated on in the discussion section.

C31.  Control condition: Moore and Obhi 2012 argue that “Out of all the factors that have been linked to intentional binding, it appears that the presence of efferent information is most central to the manifestation of intentional binding as, when efferent information is not present, such as in passive movement conditions, or passive observation of others, intentional binding is reduced or absent.” Should you not have included such a condition in order to verify an effect of IB?

→ R31. We agree with R2 that we should have included such control conditions. The main reason for not including them was the exhausting nature of the task (learning to control a BMI) in addition to its duration (about 2 hours). We thus considered that the body-generated action condition, which was similar to that used in a host of previous studies on temporal binding, was our baseline condition, corresponding to a high sense of agency. We have now added this point as a limitation of our study in the discussion section: “A limitation of the present study is that we did not include a passive control condition, as previous studies on interval estimates have done. Such passive conditions involve a lack of volition in the realization of the movement, for instance by using TMS over the motor cortex (Haggard et al., 2002) or by having a motoric setup force the participant’s finger to press the key (e.g. Moore, Wegner & Haggard, 2009). Neither Study 1 nor Study 2 includes such passive conditions, thus limiting our conclusions on the extent to which a BMI-generated action involves a high sense of agency. The main reason for not introducing such control conditions stemmed from the tiring nature of the task asked of participants (i.e. learning to control a robotic hand through a BMI) and its duration (i.e. 2 hours). Adding experimental conditions would have strongly increased participants’ fatigue during the task. The existing literature on active, voluntary conditions in which participants use their own index finger to press a key whenever they want is very extensive and consistently points towards lower interval estimates in active conditions than in passive conditions, reflecting a higher sense of agency (see Moore & Obhi, 2012; Moore, 2016; Haggard, 2017 for reviews). We thus considered that the body-generated action condition used in the present study, which is entirely similar to the active conditions previously reported in the literature, would be the baseline associated with a high sense of agency. This condition was then compared to the BMI-generated action condition in order to evaluate whether using a BMI would lead to a sense of agency as high as in the body-generated action condition. In Study 2, we assumed that the results of Study 1 were replicated for the BMI-generated action condition, thus resulting in a sense of agency on both Days 1 and 2. However, to draw more reliable conclusions on the sense of agency for BMI-generated actions, a passive control condition should be added.”

C32. Agency Question: Were there any false positives or catch trials, in which the robotic hand moved without the user’s intention? If the hand only ever moves when participants “intend” to move, then the hand can never actually “move by its own”. Furthermore, if the question(s) are only asked after a successful triggering of the hand movement, is the latter answer not ~unattainable?

→ R32. There were no catch trials per se, in the sense that we did not include trials in which the robotic hand moved on its own in the experimental design. There were, however, false positives, induced by the participants’ lack of control over the robotic hand. The explicit question about control was developed to identify when those false positives occurred: with this question, participants could tell us whether each trial was a false positive (i.e. the robotic hand moved ‘on its own’) or an intended trial. We thus indeed rely on participants’ perception to determine whether a trial was a ‘false positive’ or not.

C33.  Ownership Question: Is there a way to distinguish between ownership and agency in the current paradigm? Are the scores highly correlated? Can you exclude a counterargument such as demand characteristics being responsible for the ownership scores?

→ R33. Agency and ownership scores were indeed correlated, as in several previous studies. We cannot fully rule out that demand characteristics played a role in the ownership scores, but we tried to limit their impact more than in former studies (see also R29). This has been added in the discussion section.

C34.  Theoretical accuracy: The EEG (and EMG) methods are clearly described. The results with respect to mu- and beta-band suppression and electrode location are in-line with prior findings and it is good to see this BCI approach being applied to agency and ownership questions. I have a couple of questions which you can probably quite easily clarify. How does “theoretical accuracy” differ from (actual) classifier performance? Could you also briefly explain why you do not use cross-validation to measure the performance?

→ R34. In fact, the theoretical accuracy score was the name we gave to the classifier performance. We agree that this may have been confusing and, in line with Reviewer 1, we have now replaced the term ‘theoretical accuracy’ with ‘classification performance’ throughout the manuscript for the sake of clarity. More information has now been added in the manuscript regarding how the classification performance was calculated and regarding cross-validation: “We used the Filter Bank Common Spatial Pattern (FBCSP) variant, which first separates the signal into different frequency bands before applying CSP to each band. Results in each frequency band are then automatically weighted in such a way that the two conditions are optimally discriminated. Two frequency bands were selected: 8-12 Hz and 13-30 Hz. FBCSP and LDA were applied to data collected within a 1 s time window. The position of this time window inside the 3 s of rest or right-hand motor imagery was automatically optimized for each participant by sliding the time window, training the classifier, comparing the achieved accuracies by cross-validation, and keeping the classifier with the best classification accuracy.” and “These training sessions were performed at least twice. The first training session was used to build an initial data set containing the EEG data of the subject and the known associated condition: ‘motor imagery of the right hand’ or ‘being at rest’. This data set was then used to train the classifier. During the training of the classifier, a first theoretical indication of performance was obtained by cross-validation, which consisted of training the classifier on a subset of the data set and testing the obtained classifier on the remaining data. The second training session was used to properly evaluate the classification performance of the classifier. When the participant performed the training session a second time, the classifier predicted in real time whether the participant was at rest or in the right-hand motor imagery condition. If classification performance was still low (around 50-60%), a new model was trained based on the additional training session, combined with all previous training sessions. At most, six training sessions were performed. If the level of classification performance failed to reach chance level (i.e. 50%) after six sessions, the experiment was terminated.”
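
To make the session-by-session logic of the quoted passage concrete, a simplified sketch follows (Python with scikit-learn, using simulated features in place of the FBCSP output; the 60% retraining threshold and trial counts are illustrative assumptions, not the exact values used).

    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(1)

    def simulate_session(n_trials=40, n_features=8):
        """Stand-in for one training session of alternating rest/imagery trials;
        real features would come from the FBCSP step described above."""
        y = np.tile([0, 1], n_trials // 2)
        X = rng.normal(size=(n_trials, n_features)) + 0.4 * y[:, None]
        return X, y

    X, y = simulate_session()                     # session 1: initial data set
    for session in range(2, 7):                   # at most six sessions in total
        acc = cross_val_score(LinearDiscriminantAnalysis(), X, y, cv=5).mean()
        print(f'after session {session - 1}: cross-validated accuracy = {acc:.2f}')
        if acc >= 0.60:                           # adequate control: stop retraining
            break
        X_new, y_new = simulate_session()         # otherwise run another session
        X, y = np.vstack([X, X_new]), np.concatenate([y, y_new])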

C35.  On the same point, can you calculate the classifier performance during the actual experimental block? Does this match the training performance and is this a better or poorer indicator of perceived control?

→ R35. The classification performance was calculated based on the match between the model’s predictions and the participant’s instructed condition during the training session. In the training session, the classifier knew the expected condition because we alternated a cross appearing on the screen (corresponding to the rest phase) with an arrow appearing on that cross (corresponding to the motor imagery phase). This is what is represented in Figure 1A and Figure 1B. In the experimental block, participants could operate the robotic hand as they wished, meaning that there was no expected match between a specific stimulus appearing on the screen and participants’ willingness to make the robotic hand move or not. It was thus not possible to calculate a classification performance score during the experimental block, since we could not ask participants to report continuously when they were imagining the ‘rest phase’ to avoid making the robotic hand move.

C36. Theoretical accuracy and correlation to mu/beta oscillations: If I understand correctly, the classifier is trained to detect desynchronization in the mu/beta frequency bands. Performance is then quantified as theoretical accuracy. What does the correlation between theoretical accuracy and mu/beta oscillation actually tell us? Is this simply the confirmation that these are the trained criteria? (Apologies, if I simply missed the point here.)

→ R36. As a reminder, we have now changed the term ‘theoretical accuracy’ into ‘classification performance’ for clarity. The feature extraction with CSP and LDA is not directly tied to mu and beta oscillations. However, since we filter the data in the mu- and beta-bands, the classification performance score can be related to oscillations in those bands. The correlation thus suggests that CSP + LDA correctly captures the changes in mu and beta oscillations. In line with the comments of Reviewer 1 (see R7 and R8), we have now added more information regarding the classification performance in the manuscript. We hope that this section is now clearer.

C37. Sample size: The justification of the sample size is not very clear. Would it not be better to either calculate it based on prior or expected effect sizes and add a percentage of participants in case of BCI illiteracy? Alternatively, you could use a Bayesian approach and determine a cut-off criterion.

→ R37. We agree with R2, but we did not have a strong prior regarding the expected effect sizes, since no previous study has used a similar method and since we had no idea how many participants would perform below or above chance level with our BMI. In the end, no participant had a classification performance score below 50%, so none had to be excluded. Future studies using a similar approach will include an a priori computation of the sample size.

Dear Editor,

We would like to thank you for your time on this manuscript. We have revised the manuscript according to the comments received from the two reviewers. You will find attached a point-by-point response to the reviewers’ comments. We hope that the manuscript has been improved and is now suitable for publication in PLOS One.

Sincerely,

Emilie A. Caspar, Albert De Beir, Gil Lauwers, Axel Cleeremans and Bram Vanderborght

Reviewer #1: PONE-D-20-06849

This study looked at the sense of agency for actions performed using an EEG-based brain-machine interface (BMI). The intentional binding effect was used as the primary dependent measure, where subjects directly estimate the interval between the cause (button press) and effect (auditory click). In two experiments there were no significant differences in interval estimates between motor actions and BMI-mediated actions, suggesting no difference in agency. The authors did, however, find relationships between other variables in the experiment, such as the sense of control, experienced difficulty, ownership of the robotic hand, and agency (as indexed by the interval estimates).

The paper is not very clearly written, and it was very difficult to make sense of it in places, often because crucial information was left out or not made clear, or jargon was undefined. It really reads like a first or second draft of a paper, rather than a paper that has been through a few rounds of internal review before being deemed ready to submit to a journal for peer review. Many of my comments are thus focused on the writing and on making things more clear. The paper does not make any strong claims, and the conclusions that are drawn seem reasonable given the data. The methods involved in controlling the robotic hand and the robotic hand itself were not given in sufficient detail (for example we do not know anything about the features used to discriminate motor imagery from rest conditions, even though this might be relevant to some of the results that involved correlation with the “theoretical accuracy”). The statistics appear to be appropriate and properly reported, except that for correlations with ordinal data (as in figure 6A) it can be better to use a non-parametric statistic like Spearman’s rho.

 → We thank the referee for this frank but sympathetic assessment and apologize for the quality of the language. We have revised the manuscript extensively, improving both language quality and presentation. We have also added information about the decoding algorithms we used and now hope the manuscript feels more straightforward. We reply to the referee’s specific comments below: 

Specific comments (in no particular order):

C1. p. 4, line 3: what is a “skin-based input”?

→ R1. In the paper by Coyle and colleagues, the authors used a system that detected when participants tapped on their own arm to produce an outcome (i.e. a tone). We have now added this description in the relevant passage, as follows: “For instance, Coyle, Moore, Kristensson, Fletcher and Blackwell (2012) showed that intentional binding was stronger in a condition in which participants used a skin-based input system that detected when they tapped on their arm to produce a resulting tone than in a condition in which they tapped on a button to produce a similar tone.”

C2. p. 6, line 21: “main age” should be “mean age”

→ R2. Corrected.

C3. p. 7, bottom and p. 12, line 22, and figure 3

So they used bi-polar electrodes to record EMG over the flexor digitorum superficialis, but it seems that they just monitored it on-line or on a single-trial basis (says that they used the EMG traces off-line to remove trials with muscle activity). But they did not, it seems, check for sub-threshold muscle activity during motor imagery by looking at the average EMG power over all trials. To do this you have to take the sqrt of the sum of squares in a sliding window for each trial, before you average the trials together. You have to do this if you want to rule out any muscle activity, including sub-threshold muscle activity.

 Also it says that EMG data were high-pass filtered at 10 Hz, but then nothing else. You can’t really see muscle activity without at least rectifying the signal, or (better) converting to power.

→ R3. We agree with this comment. An important aspect of the methods that we did not mention in the manuscript is that a baseline correction was applied for the analysis of the EMG. This information has now been added in the relevant section. However, even with a different approach to analysing the EMG data, we cannot entirely rule out the presence of muscular activity. The flexor digitorum superficialis does not control the movement of the thumb, for instance, a movement that we could only check visually for overt movements. There were also no electrodes placed on the left arm. We agree that these shortcomings constitute a limitation and have now added a discussion paragraph on this aspect, as follows (page 42): “Finally, another limitation is that we could not entirely rule out the presence of sub-threshold muscular activity that could have triggered the BMI during the task. The flexor digitorum superficialis, where the electrodes were placed, does not allow recording activity associated with the movement of the thumb. In addition, no electrodes were placed on the left arm. Those movements were only controlled visually by the experimenter during the task.”
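
For completeness, a sketch of the sub-threshold EMG check the reviewer describes, which future work could apply, is given below (Python with NumPy/SciPy; the window length and sampling rate are illustrative assumptions).

    import numpy as np
    from scipy.signal import butter, filtfilt

    def emg_rms_envelope(emg, sfreq, highpass=10.0, win_s=0.05):
        """High-pass filter the EMG, then take the root-mean-square in a sliding
        window for each trial before averaging across trials (emg: trials x samples)."""
        b, a = butter(4, highpass / (sfreq / 2.0), btype='highpass')
        filtered = filtfilt(b, a, np.asarray(emg, dtype=float), axis=-1)
        win = max(int(win_s * sfreq), 1)
        kernel = np.ones(win) / win
        squared = filtered ** 2
        rms = np.sqrt(np.apply_along_axis(
            lambda x: np.convolve(x, kernel, mode='same'), -1, squared))
        return rms.mean(axis=0)                   # mean RMS envelope across trials

    # example: env = emg_rms_envelope(trials_by_samples_array, sfreq=1024)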

C4. p. 9, line 4: They checked to see if, after six training sessions, accuracy of the classifier was below chance. But that’s not what you want to do. You want to check to see if accuracy of the classifier is significantly *different* from chance. Performance of 51% correct might still be chance! You should always say “not different from chance” rather than “below chance”.

→ R4. We were not interested in including only participants who were above chance level, as is indeed usually done in the classic BCI literature. Our aim was to include even people at chance level in order to have variability in our sample, that is, a distribution ranging from people able to control the BCI only at chance level to people with good control of the BCI. This variability was of interest to us in order to evaluate how different degrees of control over the BCI modulate the sense of agency as well as the embodiment of the robotic hand. We further clarified this choice on page 12 as follows: “We chose to also accept participants who were at chance level in order to create variability in the control of the BMI, to be able to study how this variability influences other factors.” Our only exclusion criterion was a classification performance below 50%. We agree that performance should be significantly lower than 50% to truly consider that our exclusion criterion was met, since a score of 48% or 49% could still correspond to chance level. We have now modified the sentence on page 12 as follows: “At most, six training sessions were performed. If the level of classification performance failed to reach chance level (i.e. 50%) after six sessions, the experiment was terminated.”
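
As an illustration of the reviewer’s point about testing against chance (not an analysis we report), a binomial test on, say, 61 correct classifications out of 120 training trials shows that 50.8% accuracy is not significantly different from the 50% chance level.

    from scipy.stats import binomtest

    # illustrative numbers only: 61 correct out of 120 trials against p = 0.5
    result = binomtest(61, n=120, p=0.5, alternative='two-sided')
    print(result.pvalue)      # well above .05: not distinguishable from chance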

C5. p. 10, line 5: "red cross” should be “red arrow”

 → R5. Thank you for noticing this typo, we have now corrected it.

C6. It would be helpful and useful if you computed AUC for control of the robotic hand, just to give an idea of what level of control your subjects managed to achieve.

→ R6. We thank the referee for this suggestion. Computing the area under the curve (AUC) indeed makes it possible to better describe a binary classifier. Although this metric is often used in data science, we found very few examples of its use in our field. As the goal of this paper was to use the BCI merely as a method in behavioural research rather than to develop a better BCI, we decided not to include the suggested AUC analyses.
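
For completeness, should future work wish to report it, the AUC is straightforward to compute once per-trial classifier scores are available; a sketch follows (Python with scikit-learn; the arrays are illustrative only).

    from sklearn.metrics import roc_auc_score

    y_true  = [0, 0, 1, 1, 0, 1, 1, 0, 1, 0]           # 0 = rest, 1 = imagery
    y_score = [0.2, 0.4, 0.7, 0.6, 0.3, 0.8, 0.55, 0.45, 0.9, 0.35]  # P(imagery) per trial
    print(roc_auc_score(y_true, y_score))               # area under the ROC curve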

C7. What do you mean by the “theoretical” accuracy score? This was not clear enough. On what data set was this theoretical accuracy determined? You should have a test set and a training set, but you did not specify what you used as training data and what you used as test data. Please clarify.

→ R7. Thank you for this comment. We agree that the term ‘theoretical accuracy score’ could be confusing, as also pointed out by Reviewer 2. We have now decided to use the terminology ‘classification performance’ for clarity. We have also added more information about the training sessions, as follows, on page 12: “These training sessions were performed at least twice. The first training session was used to build an initial data set containing the EEG data of the subject and the known associated condition: ‘motor imagery of the right hand’ or ‘being at rest’. This data set was then used to train the classifier. During the training of the classifier, a first theoretical indication of performance was obtained by cross-validation, which consisted of training the classifier on a subset of the data set and testing the obtained classifier on the remaining data. The second training session was used to properly evaluate the classification performance of the classifier. When the participant performed the training session a second time, the classifier predicted in real time whether the participant was at rest or in the right-hand motor imagery condition. If classification performance was still low (around 50-60%), a new model was trained based on the additional training session, combined with all previous training sessions. At most, six training sessions were performed.”

C8. If theoretical accuracy refers to the predicted accuracy on test data given the accuracy of the classifier on the training data, then of course we expect mu and beta power to be predictive, because the classifier is almost certainly using mu and beta power to discriminate between motor imagery and rest. Using CSP and LDA is a very common recipe for EEG-based BCIs, and it tends very often to converge on mu and beta desynchronization. This brings up another point, which is that you should specify in your methods section which features the classifier used. You don’t have to go into all of the details of the classifier, but at a minimum you should state what the features were. Did the classifier operate on the raw time series, or (more likely) on a time-frequency representation of the data?

→ R8. Thank you for this comment. We have now added more information about the classification performance, as follows, on page 11: “The feature extraction that followed each training session was carried out with a Variant of Common Spatial Pattern (CSP), and the classifier used  Linear Discriminant Analysis (LDA, see Kothe & Makeig, 2013). We used the Filter Bank Common Spatial Pattern (FBCSP) variant, which first separates the signal into different frequency bands before applying CSP on each band. Results in each frequency band are then automatically weighted in such a way that the two conditions are optimally discriminated. Two frequency bands were selected: 8-12Hz and 13-30Hz. FBCSP and LDA were applied on data collected within a 1s a time window. The position of this time window inside the - 3s of rest or right hand motor imagery was automatically optimized for each participant by sliding the time window, training the classifier, comparing achieved accuracy by cross-validation, and keeping the classifier giving the best classification accuracy. ”

C9. p. 14, label says “FIGURE 3” but I think this is figure 2.

 → R9. We have now corrected this typo. 

C10. “Location” = electrode location? This was not always clear. Maybe just call it “electrode”. What does the word “location” refer to in the section on "Perceived control and appropriation of the robotic hand”? Are you using the word “location” in two different ways. I know that one is the electrode location, but then it seems you are using it for something else as well. This was confusing.

→ R10. We agree with this suggestion and we have now used ‘electrode’ instead of ‘location’ when necessary, since location could also refer to the ‘location’ score from the rubber hand illusion questionnaire measuring the degree of appropriation of the robotic hand.  ‘Location’ when referring to the appropriation of the robotic hand was left as before since this is in accordance with the wording used in other papers on the RHI. 

C11. For spectral power calculations on the EEG data, did you normalize, i.e. divide by the standard deviation of the baseline? I realize that you did your power spectral analyses without baseline correction because the time leading up to the action was too heterogenous. But couldn’t you use a time window before even the start of the trial as your baseline, or perhaps during the two-second waiting period? Because power is typically higher at lower frequencies because of 1/f scaling, so it is more meaningful to express your power estimates as a Z score before comparing. It is not really valid to compare power at in different frequency bands. I.e. you comparing power in the mu band to power in the beta band in kind of meaningless. Mu band will always win, just because there is always more power at lower frequencies.

→ R11. Thank you for this insightful comment.Those two options are unfortunately not reliable. Regarding the 2s waiting period, even though we asked participants to relax, we did not specify exactly when the waiting period finished (e.g. with a sentence appearing on the screen for instance, explicitly instructing them to relax). In the BMI-generated action condition, we cannot be entirely sure of when participants start to use motor imagery. Some participants could for instance start trying to use motor imagery to make the robotic hand move after 1s or 1.5 s. Since we don’t have this information, we doubt we can use it  as a baseline. Regarding the start of the trial option, participants finished the former trial by validating their answer by pressing the ENTER keypress. They did not have to wait for a fixed period of time before starting the next trial by pressing the ENTER key again. Most of the participants started the next trial right away. Therefore, this cannot constitute a proper baseline period either. But we agree that for future studies we should integrate a fixed and more clear time period to obtain a proper baseline correction. 

We have followed the suggestion to normalize the data by transforming  the data associated with each frequency band into z scores. We carried all analyses again on the normalized data, and reported these analyses in our revision. 

C12. Figure 2 A and C: what are the units???

 → R12. This has now been added. 

C13. p. 18, line 16: What are the two variables precisely? One is the difference in interval estimates between the two tasks, but what is the other one? You just refer to the “difference between the real hand key press and the robotic hand key press”, but the difference in what? Spectral power? And what is each data point? A subject (I presume)? You just need to be more clear and specific.

→ R13. We apologize for the confusion. We have now modified the sentence as follows: “We additionally investigated, with Pearson correlations, whether or not the difference in power in the mu- and the beta-bands between the real hand keypress and the robotic hand keypress on C3, Cz and C4 for each participant could predict the difference in interval estimates between the same two tasks.”

C14. p. 20, line 10: This is confusing. You write “we observed a greater desynchronization in the beta band than in the mu band.” Do you mean for real hand movements? I thought that you had observed greater desynchronization in the mu band compared to the beta band (see p. 19 line 15).

→ R14. Those results have now changed because we have normalized the data within each frequency band. The new results and the adapted discussion are now reported in the revised version of the manuscript. Overall, power in the mu and the beta-band now do not significantly differ between each other. 

C15. You state that there was no difference in the interval estimates for real button press versus motor imagery, but did you verify that you even obtained a canonical intentional binding effect in the first place? I did not see this anywhere. Maybe there was an IB effect for both real-hand and robotic-hand keypresses, in which case you might not see any difference. Or maybe neither of them had the effect. Don’t you need a passive viewing or involuntary-motor task to control for that?

→ R15. We agree with this comment, which is similar to comment C31 from reviewer 2. In the classic IB literature, a comparison between two experimental conditions is necessary, generally an active and a passive condition, in order to detect changes in the sense of agency. In our study, we did not introduce a passive condition for the following reason: the experiment was already quite long and exhausting for participants. Adding additional experimental conditions would have led to even higher cognitive fatigue, which could have been detrimental for the interval estimate task. Our main aim was to investigate if agency would differ between the body-generated action condition and the BMI-generated action condition, and hence we did not introduce a control, passive condition which would have been critical to know if agency was ‘high’ in both conditions or ‘low’ in both conditions. Given the extensive existing literature on body-generated action and interval estimates, we considered that, like in the existing literature, this condition reflected a sort of ‘high agency condition’, since movements were voluntarily produced with participants’ own keypress. Based on this, we thus extrapolated that agency was also high in the BMI-generated action condition. We fully agree that this extrapolation may not be sufficient, and we have now added this comment as a limitation in the general discussion of the present paper, as follows: “A limitation of the present study is that we did not use a control, passive condition similarly to previous studies on interval estimates. Such passive conditions involve a lack of volition in the realization of the movement, for instance by using a TMS over the motor cortex (Haggard et al., 2002) or by having a motoric setup forcing the participant’s finger to press the key (e.g. Moore, Wegner & Haggard, 2009). Neither Study 1 nor Study 2, includes such passive conditions, thus precluding our conclusions on the extent to which a BMI-generated action condition does involve a high sense of agency. The main reason for not introducing such control conditions stemmed from the tiring nature of the task asked from participants (i.e. learning to control a robotic hand through a BMI) and its duration (i.e. 2 hours). Adding additional experimental conditions would have strongly increased the fatigue  of participants during the task. The existing literature on active, voluntary conditions in which participants use their own index finger to press a key whenever they want is very extensive and consistently points towards lower interval estimates in active conditions than in passive conditions, involving a higher sense of agency (see Moore & Obhi, 2012; Moore, 2016; Haggard, 2017 for reviews). We thus considered that the body-generated action condition used in the present study, which is entirely similar to the active conditions previously reported in the literature, would be the baseline associated with a high sense of agency. This condition was then compared to the BMI-generated action condition in order to evaluate if using a BMI would lead to a similarly high sense of agency than in the body-generated action condition. In Study 2, we assumed that results of Study 1 were replicated for the BMI-generated action condition, thus resulting in a sense of agency on both Days 1 and 2. However, to draw more reliable conclusions on the sense of agency for MBI-generated actions, a passive control condition should be added.”. 

C16. p. 24, line 6: You talk about the “difference between the imagery phases and the rest phases”, but the difference in what?? You should always specify what variable you are talking about, even if you think it is obvious. Did you mean the difference in spectral power? If so then how measured? I.e. over what time interval / frequency range? And how did you compute your power estimates? Wavelets? Band-pass filter plus Hilbert transform? In general you have not been precise enough in describing your analyses.

→ R16. We agree that we should have been more precise in the way we report our analyses. We have now added the following information: “We conducted a repeated measures ANOVA with Rhythm (Mu, Beta) and Electrode (C3, Cz, C4) as within-subject factors on the difference in spectral power in the mu- and beta-bands between the imagery phases and the rest phases during the training session. Data were normalized by subtracting the global mean from each data point and by dividing the result  by the global SD. ” Also, in the section entitled ‘electroencephalography recordings and processing’, we have now added more information on how the power estimates were computed. 

C17. Figure 5: units!!

 → R17. We have added the units.

C18. When you talk about the “mu rhythm” and “beta rhythm” what do you mean? Do you mean power in the mu-band and power in the beta band? If so then you should say that. Just saying “mu rhythm” or “beta rhythm” is not specific enough. It’s unclear.

→ R18. We have now changed the wording in the entire manuscript, as suggested. We hope that the manuscript is now clearer. 

C19. Figure 6A: I am not sure it is appropriate or optimal to use Pearson’s correlation coefficient on a discrete variable (like perceived control).

→ R19. We agree that there is an ongoing debate about the use of Pearson correlations for discrete variables that have 6 points or more or if the use of nonparametric tests is more appropriate. However, R1 is right that the use of Spearman’s Rho could be more adapted and we have now changed the manuscript where necessary by reporting results with Spearman’s Rho instead of Pearson r. 

C20. No detail was given about the robotic hand itself, and very little detail about the BMI that was controlling it. How long was the latency between the BMI decision and the movement of the robotic hand? This information was given in the discussion, but should appear in the methods section as well. How long did it take for the robotic hand to depress the key? Was it an abrupt movement, or did the robotic hand move slowly until the key was pressed? Or did it move at a speed that was determined by the degree of BMI control? Did the key produce an audible click sound when pressed? These might seem like trivial details, but they may play a role in determining the sense of agency over the robotic hand, and the subsequent sense of agency over the keypress made by the robotic hand.

→ R20. We have now added an additional section in the method section regarding the robotic hand. We have mentioned its features, the publications that detail how to build it, and the time it takes to press the keys. The time the robotic hand takes to press the key is indeed very important for the time estimations that participants have to make. We set this timing to 900 ms so that the robotic hand raises the finger again only after the beep has occurred on each trial, thereby avoiding temporal distortions in the interval estimates.

C21. Does the theoretical accuracy predict perceived control? This was significant in both experiments and simply suggests that better performance on the part of the classifier is associated with a stronger feeling of control over the BCI (which it should). A prior study that is directly relevant here is Schurger et al, Brain & Cognition (2016) which also looked at judgements of control over a BMI.

→ R21. There was indeed a significant correlation between classification performance and perceived control. We have now added the requested reference in the discussion, since it is indeed relevant: “This result is in line with a former study, which showed that participants are good at evaluating their ability to control a BMI even if they do not receive online feedback on their performance (Schurger, Gale, Gozel, & Blanke, 2017).”

C22. p. 26, lines 9-11: This sentence is simply confusing. I can’t make any sense of it. Please clarify. And what do you mean by the “estimated interval estimate”? Isn’t it just either the “estimated interval” or the “interval estimate”? And what is "the reported difficulty to make the hand performing the movement”? Does this refer to the robotic hand? Do you perhaps mean “the reported difficulty making the robotic hand perform the movement”?

→ R22. We have now changed these sentences as follows: “We then performed non-parametric correlations with Spearman’s rho (ρ) between the reported interval estimates, the perceived control over the movement of the robotic hand, and the reported difficulty of making the robotic hand perform the movement, on a trial-by-trial basis.”

C23. p. 26, lines 24-25: "Results indicated that the perceived control was a better predictor of interval estimates than the reported difficulty.” Isn’t there guaranteed to be some collinearity in this regression since we know already that perceived control and reported difficulty are correlated? And how did you determine that perceived control was the better predictor? If they are collinear then this would be difficult to ascertain. In part it could just be that this was not clearly expressed in writing. The writing definitely could stand to be improved.

→ R23. We have now reported the VIFs, which rule out a collinearity effect (all VIFs ≤ 2). However, we do agree that both factors (i.e. perceived control and reported difficulty) could play antagonistic roles in the implicit sense of agency: while higher perceived control could boost the implicit sense of agency, higher difficulty could reduce it.
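A minimal sketch of such a collinearity check (VIF) alongside the regression is shown below; it is not the authors' code, and the file and column names are hypothetical.

```python
# Hypothetical sketch: variance inflation factors for the two predictors,
# then a multiple regression of interval estimates on both.
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

df = pd.read_csv("trial_data.csv")  # placeholder: interval_estimate, control, difficulty
X = sm.add_constant(df[["control", "difficulty"]])

# VIF values around 1-2 indicate negligible collinearity between predictors.
for i, name in enumerate(X.columns):
    if name != "const":
        print(name, variance_inflation_factor(X.values, i))

# Multiple regression of interval estimates on both predictors.
model = sm.OLS(df["interval_estimate"], X).fit()
print(model.summary())
```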

C24. p. 27, line 5: “p = -2.360”???

 → R24. We have now corrected this typo. 

C25. Again, what are “location scores”? This was unclear to me given that you used the word “location” to refer to which electrode you were looking at.

→ R25. We have now used the word ‘electrode’ when it referred to the electrode location. The word ‘location’ now only refers to the location scores from the questionnaire assessing the appropriation of the robotic hand. 

------

Reviewer #2: In two studies the authors lay out the importance of one’s sense of agency (SoA) over one’s actions and in particular actions performed by assistive technology via a brain-computer interface. Furthermore, they explore how the performance of the BCI as well as one’s perceived control or fatigue affect one’s SoA for the observed actions and how this may affect one’s sense of ownership over the assistive technology.

• What are the main claims of the paper and how significant are they for the discipline?

The main claims of the paper are that i) the performance of the BCI has immediate influence over one’s explicit SoA (judgment of agency - JoA), even though the BCI actions do not produce reafferences; ii) similarly, the performance of the BCI predicts one’s implicit SoA (or feeling of Agency – FoA), as determined using intentional binding (IB) as dependent variable; iii) the perceived effort or difficulty of completing the task with the BCI had a negative effect on the FoA, and that iv) an increased SoA also aids embodiment or ownership of the device.

Generally, these claims are very interesting for the discipline, as they attempt to disentangle different aspects of one’s SoA by combining implicit and explicit judgments and linking them to movement-related cortical desynchronization.

Overall, I am not convinced the authors have controlled for all aspects of their study to fully support these claims.

→ Thanks for judging our work and claims as interesting. We hope our revision further supports these claims. 

• Are the claims properly placed in the context of the previous literature? Have the authors treated the literature fairly?

C27. The introduction of the manuscript provides an overview of relevant literature on the sense of agency, intentional binding, and BCIs. While the authors aim to disentangle these different concepts in order to motivate their study design, I disagree with some of their key arguments here. I would appreciate some clarification on these points, as they are important for the study design, the interpretation of their results, as well as their implications.

Intentional binding

C26. I agree that it is important to separate the FoA from the JoA, as the authors do. However, I am not convinced that the “pre-reflective experience of” the FoA can be assessed using Intentional Binding. Indeed, I would think that e.g. Moore and Obhi (2012 Consciousness & Cognition) or Ebert and Wegner (2010, also C&C) would argue that IB is rather compatible with the JoA. (I have more comments on this for the analysis.) The FoA describes an implicit, on-going aspect of performing an action or making a movement. Neither of these points is true for IB, where the action has already been completed with the press of the button. I understand that this is a general discussion point, not specific to this study, but I think it should be considered.

→ R26. This is indeed a fair point, and many explicit measures have also been criticized in terms of how much they relate to agency, since authors classically use different questions adapted to their experimental paradigm (e.g. “Do you feel in control of the action?”, “Do you feel that you are the author of the action?”, “Do you feel that you caused the outcomes?”, etc.). However, since the point is somewhat derivative with respect to our main goals, we chose not to further develop this discussion point.

C27.  Comparing the JoA and IB also seems difficult, as the former starts with one’s movement intention and ends with the press of the button (“the hand moved according to my will”), whereas the latter only reflects the cause-and-effect of the button-press followed by a tone (“the delay that occurred between the robotic hand keypress and the tone”). It could therefore be argued that the JoA and the FoA measured here concern different processes and should not directly be compared (without further justification).

→ R27. We agree with the reviewer’s comment, and that is why we do not directly compare them. We nonetheless evaluate the correlations between these two experiences of the self when one is producing an action. Several earlier papers have evaluated (through correlations) how these two experiences of the self, despite measuring different aspects, relate to each other, and that is the procedure we have used here.

C28. With respect to further literature on the IB paradigm, Suzuki et al.’s findings (Psychological Science 2019) suggesting that IB may simply reflect “multisensory causal binding” would be relevant to include, particularly as the current paper includes neither baseline estimations for action- or outcome-binding nor a passive control condition. The work by Rohde and Ernst (e.g. Behavioural Science, 2016) is also relevant.

→ R28. According to the study of Suzuki et al. (2019), any study wishing to draw conclusions specific to voluntary action, or similar theoretical constructs, needs to be based on the comparison of carefully matched conditions that should ideally differ in just one critical aspect. Here, our conditions only differed in the sensorimotor information coming from the keypress; they were similar in terms of causality and volition. We are aware of the current debate over the effect of causality on IB. While some authors have argued that causality plays an important role in explaining variability in IB between two experimental conditions (e.g. Buehner & Humphrey, 2009), others have shown that when causality and other features are matched between two experimental conditions, so that only intentionality differs, a difference in IB still occurs (Caspar, Cleeremans & Haggard, 2018, Study 2). Very recent studies come to the conclusion that even if several factors, such as causality, play a role in the modulation of binding, intentionality also plays a key role (e.g., Lorimer, … , Buehner, 2020). However, since this debate is not the aim of the present paper, we do not discuss this literature further here.

Questionnaire Data

C29. Table 1 seems to be missing in my documents, so I am not entirely sure my comments on these data are completely accurate. However, a general point to consider for the questionnaire is: what is the control condition and is there a control question? (Also see my comment in the data analysis section.) Recently, the questionnaires used in the rubber hand illusion and comparable studies have come under (even more) scrutiny. It would be great if the authors could include a brief discussion of their questionnaire results with respect to the points raised by Peter Lush (2020), “Demand Characteristics Confound the Rubber Hand Illusion”, Collabra: Psychology.

→ R29. We apologize for the omission. Table 1 was indeed missing and has now been added to the manuscript. As far as we understand it, the critical difference between our design and that of Lush (2020) is that we never mentioned anything to our participants regarding the ‘robotic’ hand illusion, or even that they would be given a questionnaire at the end of the experiment evaluating their appropriation of the robotic hand. We even avoided putting the classical blanket between the robotic hand and the participant to cover the participant’s arm, in order to avoid them thinking that we would evaluate the robotic hand as part of their own body. Our participants thus had limited room for forming expectancies regarding the questionnaires and the robotic hand. However, we agree that controlling for expectancies is relevant in the case of the rubber hand illusion. We have now integrated this discussion in the general discussion section, as follows: “Several articles have argued that demand characteristics (Orne, 1962) and expectancies can predict scores on the classical rubber hand illusion (RHI) questionnaires (Lush, 2020). In the present study, we limited as much as possible the effects of both demand characteristics and expectancies on participants’ answers to the questionnaire related to the appropriation of the robotic hand: (1) Participants were not told anything about the rubber hand illusion or the fact that some people may experience the robotic hand as a part of their body during the experiment; (2) We did not tell them in advance that they would have to fill in a questionnaire regarding their appropriation of the robotic hand at the end of the experiment; (3) We avoided creating a proper classical ‘rubber hand illusion’: the robotic hand was placed at a congruent position on the table relative to the participant’s arm, but we did not use a blanket to cover their real arm, nor did we use a paintbrush to induce an illusion. This procedure limited the possibility for participants to guess what we would assess, since in the classical rubber hand illusion, placing the blanket over the participant’s arm makes them realize that we are investigating to what extent they appropriate this hand into their body schema. However, we cannot fully rule out the influence of demand characteristics and expectancies, as participants perceiving better control over the hand, even without knowing their own classification performance, could have intuitively given higher scores on the questionnaires. Control studies could thus be performed to manipulate demand characteristics in order to evaluate their effect on the appropriation of a robotic hand in a brain-machine interface procedure (Lush, 2020).”

• Do the data and analyses fully support the claims? If not, what other evidence is required?

C30. IB: Was there any evidence of an intentional binding effect? It would be interesting to see the distribution of the reported intervals – is it trimodal? (Cf. figures in Suzuki et al.) Was the interval actually reduced? As there is apparently no passive or control condition and no difference between human or robot hand movements, it is not clear if there was any effect at all. Perhaps this is something that can be explored in the existing data set. If there is no effect of binding, what does that mean with respect to the findings of study 2?

→ R30. With respect to Study 2, we cannot infer that the sense of agency was ‘strong’ or ‘weak’ on either training day, and we did not argue this in our discussion. Rather, we can say that having good control over the BMI boosts the sense of agency, as measured through the method of interval estimates. We had three intervals: we ran those analyses with Delay as an additional within-subject factor and observed that IEs in the BMI condition were longer than in the real hand condition for the 100 ms interval, not statistically different between the two conditions for the 400 ms interval, and shorter than in the real hand condition for the 700 ms interval. In our opinion, nothing reliable can be drawn from those analyses regarding binding, and we therefore do not mention these data in the paper. But we indeed agree that this is a critical point in our paper, which is now further elaborated on in the discussion section.

C31.  Control condition: Moore and Obhi 2012 argue that “Out of all the factors that have been linked to intentional binding, it appears that the presence of efferent information is most central to the manifestation of intentional binding as, when efferent information is not present, such as in passive movement conditions, or passive observation of others, intentional binding is reduced or absent.” Should you not have included such a condition in order to verify an effect of IB?

→ R31. We agree with R2 that we should have included such control conditions. The main reason for not including them was the exhausting nature of the task (learning to control a BMI) in addition to its duration (about 2 hours). We thus considered that the body-generated action condition, which was similar to that used in a host of previous studies on temporal binding, was our baseline condition, corresponding to a high sense of agency. We have now added this point as a limitation of our study in the discussion section: “A limitation of the present study is that we did not use a control, passive condition, as in previous studies on interval estimates. Such passive conditions involve a lack of volition in the realization of the movement, for instance by using TMS over the motor cortex (Haggard et al., 2002) or by having a motoric setup forcing the participant’s finger to press the key (e.g. Moore, Wegner & Haggard, 2009). Neither Study 1 nor Study 2 includes such passive conditions, which limits our conclusions on the extent to which a BMI-generated action condition involves a high sense of agency. The main reason for not introducing such control conditions stemmed from the tiring nature of the task asked of participants (i.e. learning to control a robotic hand through a BMI) and its duration (i.e. 2 hours). Adding additional experimental conditions would have strongly increased the fatigue of participants during the task. The existing literature on active, voluntary conditions in which participants use their own index finger to press a key whenever they want is very extensive and consistently points towards lower interval estimates in active conditions than in passive conditions, reflecting a higher sense of agency (see Moore & Obhi, 2012; Moore, 2016; Haggard, 2017 for reviews). We thus considered that the body-generated action condition used in the present study, which is entirely similar to the active conditions previously reported in the literature, would be the baseline associated with a high sense of agency. This condition was then compared to the BMI-generated action condition in order to evaluate whether using a BMI would lead to a similarly high sense of agency as in the body-generated action condition. In Study 2, we assumed that the results of Study 1 were replicated for the BMI-generated action condition, thus resulting in a sense of agency on both Days 1 and 2. However, to draw more reliable conclusions on the sense of agency for BMI-generated actions, a passive control condition should be added.”

C32. Agency Question: Were there any false positives or catch trials, in which the robotic hand moved without the user’s intention? If the hand only ever moves when participants “intend” to move, then the hand can never actually “move by its own”. Furthermore, if the question(s) are only asked after a successful triggering of the hand movement, is the latter answer not ~unattainable?

→ R32. There were no catch trials per se, in the sense that we did not introduce trials in which the robotic hand moved on its own in the experimental design. There were, however, false positives, induced by the participants’ lack of control over the robotic hand. The explicit question about control was developed to find out when those false positives occurred: with this question, participants could tell us whether each trial was a false positive (i.e. the robotic hand moved ‘on its own’) or an intended trial. We thus indeed rely on participants’ perception to know whether a trial was a ‘false positive’ or not.

C33.  Ownership Question: Is there a way to distinguish between ownership and agency in the current paradigm? Are the scores highly correlated? Can you exclude a counterargument such as demand characteristics being responsible for the ownership scores?

→ R33. Agency and ownership scores were indeed correlated, as in several previous studies. We cannot entirely rule out that demand characteristics contributed to the ownership scores, but we tried to limit their impact more than in former studies (see also R29). This has been added in the discussion section.

C34.  Theoretical accuracy: The EEG (and EMG) methods are clearly described. The results with respect to mu- and beta-band suppression and electrode location are in-line with prior findings and it is good to see this BCI approach being applied to agency and ownership questions. I have a couple of questions which you can probably quite easily clarify. How does “theoretical accuracy” differ from (actual) classifier performance? Could you also briefly explain why you do not use cross-validation to measure the performance?

→ R34. In fact, the theoretical accuracy score was the name that we gave to the classifier performance. We agree that this may have been confusing and, in line with Reviewer 1, we have now replaced the term ‘theoretical accuracy’ with ‘classification performance’ throughout the entire manuscript for the sake of clarity. More information has now been added in the manuscript regarding how the classification performance was calculated and regarding cross-validation: “We used the Filter Bank Common Spatial Pattern (FBCSP) variant, which first separates the signal into different frequency bands before applying CSP on each band. Results in each frequency band are then automatically weighted in such a way that the two conditions are optimally discriminated. Two frequency bands were selected: 8-12 Hz and 13-30 Hz. FBCSP and LDA were applied on data collected within a 1 s time window. The position of this time window inside the 3 s of rest or right-hand motor imagery was automatically optimized for each participant by sliding the time window, training the classifier, comparing the accuracy achieved by cross-validation, and keeping the classifier giving the best classification accuracy.” and “These training sessions were performed at least twice. The first training session was used to build an initial data set containing the EEG data of the subject and the known associated condition: “motor imagery of the right hand” or “being at rest”. This data set was then used to train the classifier. During the training of the classifier, a first theoretical indication of performance was obtained by cross-validation, which consisted of training the classifier on a subset of the data set and testing the obtained classifier on the remaining data. The second training session was used to properly evaluate the classification performance of the classifier. When the participant was asked to perform the training session a second time, the classifier predicted in real time whether the participant was at rest or in the right-hand motor imagery condition. If classification performance was still low (around 50-60%), a new model was trained based on the additional training session, combined with all previous training sessions. At most, six training sessions were performed. If classification performance failed to reach chance level (i.e. 50%) after six sessions, the experiment was terminated.”
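For readers unfamiliar with this kind of pipeline, a minimal sketch of a filter-bank CSP + LDA classifier with cross-validated accuracy is shown below. It is not the authors' implementation: the synthetic data, sampling rate, epoch length (3 s, matching the rest/imagery phases) and number of CSP components are placeholder assumptions, and for brevity the CSP filters are fitted once on all epochs rather than within each cross-validation fold.

```python
# Hypothetical sketch of a filter-bank CSP + LDA classifier (in the spirit of FBCSP).
import numpy as np
from mne.decoding import CSP
from mne.filter import filter_data
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

sfreq = 250.0                         # sampling rate in Hz (assumed)
bands = [(8.0, 12.0), (13.0, 30.0)]   # mu and beta bands

# Placeholder data: 40 epochs of 3 s, 16 channels; labels 0 = rest, 1 = motor imagery.
rng = np.random.default_rng(0)
epochs_data = rng.standard_normal((40, 16, int(3 * sfreq)))
labels = np.array([0, 1] * 20)

def filter_bank_csp_features(data, y, bands, sfreq):
    """Apply CSP separately in each frequency band and concatenate the log-power features."""
    feats = []
    for l_freq, h_freq in bands:
        filtered = filter_data(data, sfreq, l_freq, h_freq, verbose=False)
        csp = CSP(n_components=4, log=True)
        feats.append(csp.fit_transform(filtered, y))
    return np.concatenate(feats, axis=1)

features = filter_bank_csp_features(epochs_data, labels, bands, sfreq)

# Cross-validated LDA accuracy as an estimate of classification performance.
scores = cross_val_score(LinearDiscriminantAnalysis(), features, labels, cv=5)
print(f"Cross-validated accuracy: {scores.mean():.2f}")
```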

C35.  On the same point, can you calculate the classifier performance during the actual experimental block? Does this match the training performance and is this a better or poorer indicator of perceived control?

→ R35. The classifier performance was calculated based on the discrepancy/predictability between the model and the participant’s performance during the training session. This means that, in the training session, the classifier knew the expected condition on each trial, because we alternated a cross appearing on the screen (corresponding to the rest phase) and an arrow appearing on that cross (corresponding to the motor imagery phase). This is what is represented in Figure 1A and Figure 1B. In the experimental block, participants could manipulate the robotic hand as they wanted, meaning that there was no expected match between a specific stimulus appearing on the screen and participants’ willingness to make the robotic hand move or not. It was thus not possible to calculate a classification performance score during the experimental block, since we could not ask participants to report continuously when they were imagining the ‘rest phase’ to avoid making the robotic hand move.

C36. Theoretical accuracy and correlation to mu/beta oscillations: If I understand correctly, the classifier is trained to detect desynchronization in the mu/beta frequency bands. Performance is then quantified as theoretical accuracy. What does the correlation between theoretical accuracy and mu/beta oscillation actually tell us? Is this simply the confirmation that these are the trained criteria? (Apologies, if I simply missed the point here.)

→ R36. As a reminder, we have now changed the term ‘theoretical accuracy’ to ‘classification performance’ for more clarity. In fact, the feature extraction with CSP and LDA is not directly linked to mu and beta oscillations. However, since we filter the data in the mu- and beta-bands, the classification performance score could be related to oscillations in the mu- and beta-bands. The correlation thus suggests that CSP+LDA correctly identifies the changes in mu and beta oscillations. Following the comments of Reviewer 1 (see R7 and R8), we have now added more information regarding the classification performance in the manuscript. We hope that this section is now clearer.

C37. Sample size: The justification of the sample size is not very clear. Would it not be better to either calculate it based on prior or expected effect sizes and add a percentage of participants in case of BCI illiteracy? Alternatively, you could use a Bayesian approach and determine a cut-off criterion.

→ R37. We agree with R2, but we did not have a strong prior regarding the expected effect sizes, since no previous studies used a similar method and we had no idea how many participants would perform below or above chance level with our BMI. In the end, no participant had a classification performance score lower than 50%, so none had to be excluded. Future studies using a similar approach will include an a priori computation of the sample size.

Decision Letter 1

Jane Elizabeth Aspell

30 Oct 2020

PONE-D-20-06849R1

How using brain-machine interfaces influences the human sense of agency

PLOS ONE

Dear Dr. Caspar,

Thank you for submitting your manuscript to PLOS ONE. The reviewers only require some additional small changes to the manuscript. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Dec 14 2020 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

We look forward to receiving your revised manuscript.

Kind regards,

Jane Elizabeth Aspell, PhD

Academic Editor

PLOS ONE


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #2: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: On page 48 of the pdf file, line 21, it reads that "each training session lasted 2.5 minutes..." This seems a bit short for 40 trials. Are you sure you didn't mean 25 minutes?

Also you noted that for the time-frequency data you normalized the power estimates by subtracting the global mean and dividing the result by the global SD. What did you mean by "global" here? You should subtract the mean power within each frequency band and also divide by the standard deviation within each frequency band. This should be clarified.

Reviewer #2: Dear authors, thank you for your detailed feedback.

The current version of the manuscript is much easier to follow and a number of items have been clarified. Overall, I appreciate the BCI-approach to questions of embodiment and the effort that has gone into these two studies. Unfortunately, the lack of a control condition and of control items for the questionnaire limit the interpretation of the findings. However, I agree with reviewer 1 that the claims you are making are not very strong, and given the revised manuscript, are supported by your findings.

In my opinion, it would be worth briefly discussing the actual interval estimates. I think the fact that IEs for shorter delays are smaller for "biological" movement than for robotic movements, whereas the reverse is true for IEs after longer delays, is quite interesting.

Along these lines - have you checked whether the reported SoA differs depending on the delay? Even though the agency question is phrased clearly, it would be worth checking if the responses are affected by the subsequent interval.

**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Aaron Schurger

Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2021 Jan 7;16(1):e0245191. doi: 10.1371/journal.pone.0245191.r004

Author response to Decision Letter 1


7 Nov 2020

Reviewer #1: On page 48 of the pdf file, line 21, it reads that "each training session lasted 2.5 minutes..." This seems a bit short for 40 trials. Are you sure you didn't mean 25 minutes?

For each trial, there was 3 s for motor imagery, 3 s for “thinking about nothing” and 2 s for a break (see Figure 1A). We thus have 8 s per trial × 20 trials = 160 s ≈ 2.5 min. But we agree that this section is confusing as currently written, and we have made some modifications, as follows:

Each training session lasted 2.5 minutes and was composed of 20 trials. Each trial was composed of 3s for the rest phase (i.e. thinking about nothing), 3s for the motor imagery phase and a 2s-break (see Figure 1A).

Also you noted that for the time-frequency data you normalized the power estimates by subtracting the global mean and dividing the result by the global SD. What did you mean by "global" here? You should subtract the mean power within each frequency band and also divide by the standard deviation within each frequency band. This should be clarified.

This is indeed what we did; we are sorry that the word “global” was not clear enough. We have now modified this sentence.

Data were normalized by subtracting the global mean within each frequency band from each data point and by dividing the result by the global SD within the same frequency band.

Reviewer #2: Dear authors, thank you for your detailed feedback.

The current version of the manuscript is much easier to follow and a number of items have been clarified. Overall, I appreciate the BCI-approach to questions of embodiment and the effort that has gone into these two studies. Unfortunately, the lack of a control condition and of control items for the questionnaire limit the interpretation of the findings. However, I agree with reviewer 1 that the claims you are making are not very strong, and given the revised manuscript, are supported by your findings.

In my opinion, it would be worth briefly discussing the actual interval estimates. I think the fact that IEs for shorter delays are smaller for "biological" movement than for robotic movements, whereas the reverse is true for IEs after longer delays, is quite interesting.

Along these lines - have you checked whether the reported SoA differs depending on the delay? Even though the agency question is phrased clearly, it would be worth checking if the responses are affected by the subsequent interval.

We have now added the statistical result regarding interval estimates for each delay in the result section and briefly discuss this finding in the discussion section.

We also explored whether the reported interval estimates varied with the actual action-tone intervals, depending on the experimental condition. We thus ran a repeated-measures ANOVA with Action (real hand, robotic hand) and Delay (100, 400, 700 ms) as within-subject factors on the reported interval estimates. We observed a significant interaction between Action and Delay (F(2,52)=12.956, p < .001, η²partial = .358). Paired comparisons indicated that interval estimates were shorter when participants performed the action with the real hand (210 ms, SD=163) compared to when they used the robotic hand (290 ms, SD=137) for the 100 ms action-tone delay (t(26)=-2.717, p=.012, Cohen’s d=-.523). We also observed that interval estimates were longer when participants used their real hand (624 ms, SD=133) compared to the robotic hand (541 ms, SD=142) for the 700 ms action-tone delay (t(26)=3.339, p=.003, Cohen’s d=.643). The difference was not significant for the 400 ms action-tone delay.

We also observed that, for short action-tone delays, the reported interval estimates were smaller for biological movements executed with one’s own hand than for movements executed through the robotic hand, whereas this pattern reversed for longer action-tone delays. This suggests that different mechanisms may drive agency for biological and non-biological movements, an aspect of our findings that warrants further research.

The perceived control over a movement was not statistically influenced by the action-tone delays.
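For illustration, a minimal sketch of such a paired comparison, with Cohen's d for paired samples computed as the mean difference divided by the SD of the differences, is given below; the values are placeholders, not the study data.

```python
# Hypothetical sketch: paired t-test (real hand vs. robotic hand at one delay)
# with a paired-samples effect size.
import numpy as np
from scipy.stats import ttest_rel

ie_real_hand_100ms = np.array([180., 240., 150., 300., 210., 190., 260., 220.])
ie_robot_hand_100ms = np.array([310., 280., 250., 330., 270., 300., 260., 320.])

t, p = ttest_rel(ie_real_hand_100ms, ie_robot_hand_100ms)

# Cohen's d for paired samples: mean of the differences / SD of the differences.
diff = ie_real_hand_100ms - ie_robot_hand_100ms
d = diff.mean() / diff.std(ddof=1)

print(f"t = {t:.3f}, p = {p:.3f}, Cohen's d = {d:.3f}")
```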

Decision Letter 2

Jane Elizabeth Aspell

15 Dec 2020

PONE-D-20-06849R2

How using brain-machine interfaces influences the human sense of agency

PLOS ONE

Dear Dr. Caspar,

Thank you for submitting your manuscript to PLOS ONE. Reviewer 2 has some remaining minor issues that s/he would like to be addressed. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised.

Please submit your revised manuscript by Jan 29 2021 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

We look forward to receiving your revised manuscript.

Kind regards,

Jane Elizabeth Aspell, PhD

Academic Editor

PLOS ONE


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #2: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: (No Response)

Reviewer #2: Thank you for the revisions. I only have a few comments that should be addressed.

General Discussion (page 36, lines 22/24): You state that “[agency] did not differ between a condition in which participants used their own hand and a condition in which they used the robotic hand to perform a keypress”

This is not true as stated and should be removed/edited. As before, you should distinguish between the JoA and the FoA here. For the JoA, no comparison between the real hand and the BCI was made. So this statement is simply not supported by any data. One would have to use the same questionnaire for the real hand condition to determine if the JoA differed between the two conditions. Similarly, FoA (via delay estimates) was only compared in study 1, and there you have an interesting interaction between the factors delay and condition.

You could rephrase this by stating that participants report a high/strong JoA for BCIs (but leave out a comparison which was not part of the study design). FoA, in study 1, was comparable in the two conditions. (You could add a Bayesian test to see if there is evidence to “support the null”).

Study 2 – Interval estimates. Please include the actual estimate results here, as for study 1. What are the means ±SDs and do participants distinguish between the levels? These will be relevant values to report for subsequent studies investigating Intentional Binding paradigms for BCIs/prostheses, too.

Page 15, ownership question – “I felt as if the robotic hand belonged to me”. (missing -ed)

Page 27, line 20 – I am not sure that “volition” can be used in this way. Maybe something like a greater cognitive effort/more concentration etc. would be better.

**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Aaron Schurger

Reviewer #2: Yes: Oliver A Kannape

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2021 Jan 7;16(1):e0245191. doi: 10.1371/journal.pone.0245191.r006

Author response to Decision Letter 2


17 Dec 2020

Reviewer #2: Thank you for the revisions. I only have a few comments that should be addressed.

General Discussion (page 36, lines 22/24): You state that “[agency] did not differ between a condition in which participants used their own hand and a condition in which they used the robotic hand to perform a keypress”

This is not true as stated and should be removed/edited. As before, you should distinguish between the JoA and the FoA here. For the JoA, no comparison between the real hand and the BCI was made. So this statement is simply not supported by any data. One would have to use the same questionnaire for the real hand condition to determine if the JoA differed between the two conditions. Similarly, FoA (via delay estimates) was only compared in study 1, and there you have an interesting interaction between the factors delay and condition.

You could rephrase this by stating that participants report a high/strong JoA for BCIs (but leave out a comparison which was not part of the study design). FoA, in study 1, was comparable in the two conditions. (You could add a Bayesian test to see if there is evidence to “support the null”).

We have now reported the BF for this comparison, which indeed supported H0 (BF10=.207). We have added this result in the results section and we have also added a note regarding how this value was obtained, using JASP and its default priors.
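For illustration, a minimal sketch of how such a default-prior Bayes factor can be obtained outside JASP is given below; pingouin's default JZS Cauchy prior (r = 0.707) corresponds to the JASP default, and the values are placeholders, not the study data.

```python
# Hypothetical sketch: paired t-test with a default-prior Bayes factor (BF10).
# BF10 well below 1/3 is conventionally read as evidence for the null.
import numpy as np
import pingouin as pg

ie_real_hand = np.array([410., 380., 450., 360., 500., 420., 390., 470.])
ie_robot_hand = np.array([400., 395., 440., 370., 505., 415., 400., 460.])

res = pg.ttest(ie_real_hand, ie_robot_hand, paired=True)
print(res[["T", "p-val", "BF10"]])
```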

In the general discussion section, we have modified the sentence as follows: “We found in Study 1 that the absence of sensorimotor information was not detrimental to the feeling of agency, as measured through interval estimates, which did not differ between a condition in which participants used their own hand and a condition in which they used the robotic hand to perform a keypress.”

Study 2 – Interval estimates. Please include the actual estimate results here, as for study 1. What are the means ±SDs and do participants distinguish between the levels? These will be relevant values to report for subsequent studies investigating Intentional Binding paradigms for BCIs/prostheses, too.

Participants who did not discriminate between levels were excluded, as reported in the participants section with the exclusion criteria. Out of 30 participants, 4 did not show a significant trend of discriminating the three intervals. The 26 remaining participants in this experiment thus did significantly discriminate between the 3 delays.

We agree it could be interesting, but it is not the same comparison as in Study 1, since here there was no real hand condition, so we did not run the same analysis as in Study 1. We have nonetheless added the means and SDs for the 3 intervals in the “interval estimates” section.

Page 15, ownership question – “I felt as if the robotic hand belonged to me”. (missing -ed)

This has been corrected.

Page 27, line 20 – I am not sure that “volition” can be used in this way. Maybe something like a greater cognitive effort/more concentration etc. would be better.

We have replaced the word “volition” by “a greater cognitive effort”

Attachment

Submitted filename: BCI_Paper_ResponseToReviewers#3.docx

Decision Letter 3

Jane Elizabeth Aspell

26 Dec 2020

How using brain-machine interfaces influences the human sense of agency

PONE-D-20-06849R3

Dear Dr. Caspar,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Jane Elizabeth Aspell, PhD

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #2: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #2: Thank you for taking my feedback into account and for clarifying the statement in the discussion.

**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #2: Yes: Oliver A Kannape

Acceptance letter

Jane Elizabeth Aspell

30 Dec 2020

PONE-D-20-06849R3

How using brain-machine interfaces influences the human sense of agency

Dear Dr. Caspar:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Jane Elizabeth Aspell

Academic Editor

PLOS ONE

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    Attachment

    Submitted filename: BCI_Paper_ResponseToReviewers#3.docx

    Data Availability Statement

    Data are available on OSF with the following DOI: 10.17605/OSF.IO/SN8PJ.

