Author manuscript; available in PMC: 2011 Oct 1.
Published in final edited form as: J Neural Eng. 2010 Sep 21;7(5):056013. doi: 10.1088/1741-2560/7/5/056013

Does the “P300” Speller Depend on Eye Gaze?

P Brunner 1,2,3, S Joshi 1,3, S Briskin 1, JR Wolpaw 1,5, H Bischof 2, G Schalk 1,3,4,5,6
PMCID: PMC2992970  NIHMSID: NIHMS241977  PMID: 20858924

Abstract

Many people affected by debilitating neuromuscular disorders such as amyotrophic lateral sclerosis (ALS), brainstem stroke, or spinal cord injury, are impaired in their ability to, or even unable to, communicate. A Brain-Computer Interface (BCI) uses brain signals, rather than muscles, to re-establish communication with the outside world. One particular BCI approach is the so-called “P300 matrix speller” that was first described by Farwell and Donchin in 1988. It has been widely assumed that this method does not depend on the ability to focus on the desired character, because it was thought that it relies primarily on the P300 evoked potential and minimally if at all on other EEG features such as the visual evoked potential (VEP). This issue is highly relevant for clinical application of this BCI method, because eye movements may be impaired or lost in the relevant user population.

This study investigated to what extent performance in a “P300” speller BCI depends on eye gaze. We evaluated the performance of 17 healthy subjects using a “P300” matrix speller during two conditions. In one condition (“letter”), the subjects focused their eye gaze on the intended letter, while in the second condition (“center”), subjects focused eye gaze on a fixation cross that was located in the center of the matrix.

The results show that the performance of the “P300” matrix speller in normal subjects depends in considerable measure on gaze direction. They thereby disprove a widespread assumption in BCI research, and suggest that this BCI might function more effectively for people who retain some eye-movement control. The applicability of these findings to people with severe neuromuscular disabilities (particularly in eye-movements) remains to be determined.

1. Introduction

Many people affected by debilitating neuromuscular disorders such as amyotrophic lateral sclerosis (ALS), brainstem stroke, or spinal cord injury are impaired in their ability to or even unable to communicate with their family and caregivers. A Brain-Computer Interface (BCI) uses brain signals directly, rather than muscles, to re-establish communication with the outside world. One well-known BCI approach is the so-called “P300 matrix speller” that was first described by Farwell and Donchin in 1988. In this system, the user pays attention to a character in a matrix while each row and column is intensified in a random sequence. The brain produces a response to the row or column that contains the intended character (i.e., the oddball); this response is not present for the other rows or columns. The BCI typically averages several responses, detects the row and column with the strongest responses, and thereby identifies the character the user wants to select.

The individual parameters of the “P300” matrix speller have each been studied and optimized extensively. This includes the matrix size (Allison and Pineda 2003), stimulation frequency (Sellers, Krusienski, McFarland, Vaughan and Wolpaw 2006), stimulation intensity (Takano et al. 2009), classification algorithm (Krusienski et al. 2006) and electrode locations (Krusienski et al. 2008). It has been recently shown that more than 80% of the population can use such a BCI (Guger et al. 2009). The “P300” speller has also been used for a variety of applications, such as web browser navigation (Mugler et al. 2008), control of ambient environment (Edlinger et al. 2009), wheelchair navigation (Rebsamen et al. 2007), and mouse movement (Citi et al. 2008), which demonstrates the broad utility of this approach. Most important to the eventual goal of BCI research, several studies have also begun to show mounting evidence that the “P300” speller is a feasible, practical, and useful method to restore function in severely disabled individuals (Nijboer et al. 2008; Sellers, Kübler and Donchin 2006; Sellers et al. 2010; Vaughan et al. 2006; see Donchin and Arbel 2009 for a comprehensive review). Interestingly, clinical studies with ALS patients (e.g., Nijboer et al. 2008; Sellers, Kübler and Donchin 2006; Sellers et al. 2010; Vaughan et al. 2006) show lower spelling performance (i.e., 1.4–3 selections per minute, 79–83% accuracy) than laboratory demonstrations with healthy subjects (Lenhardt et al. 2008; Serby et al. 2005, 4–4.6 selections per minute, 79–83% accuracy).

Since the original description of the “P300” speller in 1988, it has been unclear whether this method relies primarily on the P300 evoked potential, and minimally if at all on other EEG features, such as the visual evoked potential (VEP), that strongly depend on eye-gaze direction (Donchin et al. 2000; Sellers, Kübler and Donchin 2006; Serby et al. 2005). Setting aside visual crowding (Korte 1923; Strasburger 2005), a P300 is not markedly affected by whether the target is foveated, whereas a VEP is larger when the target is foveated. This distinction is important for clinical application of this BCI method, because eye movements are often impaired or lost in the target population. For example, although some people with ALS maintain residual eye movement for years (Birbaumer and Cohen 2007; Cohen and Caroscio 1983; Palmowski et al. 1995), others progress to near-complete and complete paralysis. It has been shown that the distance to foveation influences visual acuity (see Fig. 1) and also VEP amplitude (De Keyser et al. 1990; Sherman 1979).

Figure 1. The effects of the distance to the center (eccentricity) on visual acuity.

This figure shows the degradation of visual acuity at increasing angles from a centered focus point. The visual acuity (expressed as the Snellen fraction, i.e., 20/20 equals 100%) quickly declines to approximately 20% at 10 degrees eccentricity. (Modified from Westheimer 1965.)

The goal of this study was to determine to what extent performance in a “P300” speller depends on eye gaze. We hypothesized that fixation of the target item would produce both a P300 and a VEP, while fixation of a location other than that of the target would produce a P300 and a much smaller VEP. We also hypothesized that the 8-channel montage that had previously been optimized for target fixation is suboptimal when the eyes do not fixate the target. Furthermore, we hypothesized that the richer information (i.e., about target or non-target stimuli) in the target fixation condition would result in better speller performance (i.e., higher accuracy). Our results from 15 subjects unequivocally support these hypotheses, and thereby disprove the assumption that the performance of the “P300” speller does not depend on the subject’s ability to fixate the target character.

2. Methods

2.1. Human Subjects

We collected a total of 17 datasets of 8- or 64-channel EEG from 15 right-handed subjects (two subjects participated twice) using the general-purpose BCI software platform BCI2000 (Mellinger and Schalk 2007; Schalk et al. 2004; Schalk and Mellinger 2010). The subjects were 6 females and 11 males, aged 20 to 62. All subjects had normal or corrected-to-normal vision, and gave informed consent through a protocol reviewed and approved by the Wadsworth Center Institutional Review Board.

2.2. Experimental Paradigm

Subjects sat 60 cm (± 6 cm) in front of a flat-screen monitor. They were presented with a 6×6 matrix of 36 alphanumeric characters that was centered on the screen (see Fig. 2). At this distance, the matrix subtended ± 7.1 degrees of the visual field both horizontally and vertically. Eye gaze was measured 60 times per second by an eye tracker (Tobii T60, Tobii Technology, Inc., Sweden) that was integrated with the flat-screen monitor. These eye-gaze coordinates were acquired by BCI2000 along with the ongoing EEG and stored to disk. In addition, they were also used online to control for gaze direction as described below.
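As a rough check on the stated geometry, the relation between viewing distance, on-screen offset, and visual angle can be sketched as follows. The function names are illustrative only and are not part of BCI2000 or the eye-tracker software; the sketch assumes a flat screen viewed head-on.

```python
import math


def half_extent_cm(viewing_distance_cm: float, half_angle_deg: float) -> float:
    """On-screen distance from the center that corresponds to a visual half-angle."""
    return viewing_distance_cm * math.tan(math.radians(half_angle_deg))


def visual_angle_deg(viewing_distance_cm: float, offset_cm: float) -> float:
    """Visual angle (degrees) of a point offset_cm away from the line of sight."""
    return math.degrees(math.atan2(offset_cm, viewing_distance_cm))


# At the nominal 60 cm viewing distance, +/- 7.1 degrees corresponds to
# roughly 7.5 cm on either side of the screen center.
half = half_extent_cm(60.0, 7.1)
print(f"half extent: {half:.2f} cm")

# The same on-screen extent seen from the ends of the stated +/- 6 cm range.
for distance in (54.0, 66.0):
    print(f"{distance:.0f} cm -> +/- {visual_angle_deg(distance, half):.2f} deg")
```

Under these assumptions, the stated tolerance in viewing distance changes the subtended angle by somewhat less than one degree in either direction.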

Figure 2. Experimental setup.

Subjects were presented with a matrix on a computer screen. There were two experimental conditions. In condition 1 (“letter”), the subject was free to gaze at the target (e.g., the letter F). In condition 2 (“center”), the subject was asked to gaze at a fixation cross in the center of the matrix. Fixation was verified in real time by an eye tracker.

Each subject participated in one two-hour session. In this session, we collected data during two experimental conditions. In Condition 1, the “letter” condition, the subject was asked to gaze at the target item. In Condition 2, the “center” condition, the subject was asked to gaze only at a fixation cross located in the center of the screen while paying attention to the target item. In both conditions, the subject was asked to note every time the target flashed. The fixation cross was color- and intensity-matched to the matrix elements and, in Condition 2, it rotated by 45 degrees if the subject shifted eye gaze more than 2.8 deg from the cross for more than 300 msec. Appropriate eye gaze in these two conditions was also verified offline as described later.
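The online fixation check described above can be sketched as a simple run-length rule on the gaze samples. This is a minimal illustration, not the BCI2000 implementation: the function name is hypothetical, and it assumes that "more than 300 msec" at the 60 Hz eye-tracker rate means more than 18 consecutive samples beyond the 2.8 deg threshold.

```python
def fixation_violations(distances_deg, fs_hz=60.0, max_deg=2.8, max_ms=300.0):
    """Flag samples at which gaze has stayed beyond max_deg for longer than max_ms.

    distances_deg: per-sample distance of gaze from the fixation cross (degrees).
    Returns a list of booleans, one per sample (True = feedback would be shown).
    """
    limit_samples = round(max_ms * fs_hz / 1000.0)  # 18 samples at 60 Hz
    flags = []
    run = 0  # length of the current run of off-target samples
    for distance in distances_deg:
        run = run + 1 if distance > max_deg else 0
        flags.append(run > limit_samples)
    return flags


# A brief glance away (10 samples, about 167 ms) does not trigger feedback,
# while a sustained deviation (25 samples, about 417 ms) does.
brief = [0.5] * 5 + [4.0] * 10 + [0.5] * 5
sustained = [0.5] * 5 + [4.0] * 25 + [0.5] * 5
print(any(fixation_violations(brief)), any(fixation_violations(sustained)))
```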

The subjects performed a total of 24 runs – 12 for each task – in an alternating fashion. Each run presented four different target items in succession (i.e., four trials), using 15 flashes (i.e., 15 flashes of each row and each column for each of the four targets). Each intensification lasted 125 ms and was followed by an interval of 125 ms at a contrast ratio of 5:1. An 8-sec pause between trials gave the subject time to shift attention (and, in Condition 1, eye gaze also) to the new target, which was presented in the center (i.e., instead of the fixation cross) for the first 5 sec of the 8-sec pause, and was also present throughout the trial on the top left of the screen.
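The timing parameters above imply the following approximate durations. This is a back-of-the-envelope sketch that assumes one 8-sec pause per trial and no other overhead; the constant names are chosen here for illustration.

```python
FLASH_MS = 125        # duration of each intensification
GAP_MS = 125          # interval after each intensification
ROWS, COLS = 6, 6     # matrix dimensions
REPETITIONS = 15      # flashes of each row and column per trial
PAUSE_S = 8           # pause associated with each trial (assumed one per trial)
TRIALS_PER_RUN = 4
RUNS = 24

flashes_per_trial = (ROWS + COLS) * REPETITIONS            # 180 flashes
trial_s = flashes_per_trial * (FLASH_MS + GAP_MS) / 1000   # 45.0 s of stimulation
run_s = TRIALS_PER_RUN * (trial_s + PAUSE_S)               # 212.0 s per run
session_min = RUNS * run_s / 60                            # about 85 minutes

print(flashes_per_trial, trial_s, run_s, round(session_min, 1))
```

Under these assumptions, stimulation and pauses account for about 85 minutes, which fits within the two-hour session.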

Each run was balanced and block randomized such that it contained one target from each matrix quadrant and one target at each of four of the six possible distances from the center (i.e., the fixation cross). The sequence of 12 runs was presented in the opposite direction for the two experimental conditions (i.e., the first run of the first set of four targets for the “letter” condition was the last run for the “center” condition). All subjects that participated in this study had successfully used the “P300” matrix speller prior to this study. Because Condition 2, the “center” condition, may be a more complicated task than that in Condition 1, one practice run familiarized them with both conditions prior to the actual data collection.
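The "six possible distances from the center" follow from the geometry of a 6×6 grid centered on the fixation cross, as the sketch below illustrates. Distances are first computed in units of the spacing between neighboring items; the conversion to degrees uses a spacing of 2.8 deg, which is an assumption made here (it is not stated in the text, but it is consistent with the 2 and 9.9 degree distances mentioned in the caption of Fig. A2).

```python
import math
from itertools import product


def squared_distances(n=6):
    """Distinct squared distances (in units of item spacing) from the grid center."""
    coords = [i - (n - 1) / 2 for i in range(n)]  # -2.5, -1.5, ..., 2.5
    return sorted({x * x + y * y for x, y in product(coords, repeat=2)})


ASSUMED_SPACING_DEG = 2.8  # assumption, see the paragraph above

d2 = squared_distances()
degrees = [math.sqrt(v) * ASSUMED_SPACING_DEG for v in d2]

print(len(d2))                          # number of distinct distances
print([round(d, 1) for d in degrees])   # approximate eccentricities in degrees
```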

2.3. Data Collection

In 10 subjects (Group A), we recorded EEG from 8 scalp locations (Fz, Cz, P3, Pz, P4, PO7, Oz, PO8) using an 8-channel analog amplifier (g.MOBIlab, g.tec, Austria). In 7 subjects (Group B), we recorded EEG from 64 scalp locations (extended 10–20 montage (Sharbrough et al. 1991)) using a 64-channel digital amplifier (g.USBamp, g.tec, Austria). For both groups, the left and right mastoids served as ground and reference, respectively (see Fig. 2). The 8-channel Group A montage had previously been shown to provide performance on the “P300” speller similar to that of the full 64-channel Group B montage (Krusienski et al. 2006, 2008). The 64-channel data of Group B allowed us to define the topographies of the responses to the flashing stimuli. Both amplifiers sampled the signal at 256 Hz and used a high pass filter and a notch filter to remove frequency components below 0.1 Hz and at 60 Hz, respectively. In addition to the 8 or 64 EEG channels, eye gaze coordinates were independently acquired 60 times per second for the left and right eyes, aligned with the EEG data, and stored.

2.4. Feature Extraction

In offline analyses, we first filtered the signal between 0.1 and 20 Hz and downsampled it to 40 Hz. We then extracted the stimulus response, which was defined as the 750 ms of EEG after stimulus onset from all eight channels of the optimized montage (i.e., the same channels whether 8 or 64 channels were recorded). This yielded 30 features (i.e., 40 × 0.75 = 30) per channel or a total of 240 features for all 8 channels. Each sequence had 12 stimuli, i.e., flashes of 6 rows and 6 columns of the matrix. Of these 12 flashes, two included the target and thus elicited a target evoked potential (EP), while the other ten did not include the target and thus elicited a non-target EP. The 15 sequences in each trial (i.e., with each target) yielded 30 target EPs and 180 non-target EPs. Because a subject performed 48 trials in each of the two conditions, we had a total of 1320 target EPs and 7920 non-target EPs from each subject.
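The construction of the 240-element feature vector can be sketched as follows. This is a minimal illustration that assumes the EEG has already been band-pass filtered and downsampled to 40 Hz; the function name and the data layout (a list of channels, each a list of samples) are choices made here, not those of the original analysis code.

```python
def extract_features(eeg, onset_sample, fs_hz=40, window_s=0.75):
    """Concatenate the post-stimulus window of every channel into one feature vector.

    eeg: list of channels, each a list of samples (already filtered, at fs_hz).
    onset_sample: index of the stimulus onset in the downsampled signal.
    """
    n_samples = int(round(fs_hz * window_s))  # 40 Hz x 0.75 s = 30 samples
    features = []
    for channel in eeg:
        segment = channel[onset_sample:onset_sample + n_samples]
        if len(segment) < n_samples:
            raise ValueError("not enough samples after stimulus onset")
        features.extend(segment)
    return features


# Eight channels of synthetic data, 100 samples each, stimulus at sample 10.
synthetic = [[ch * 1000 + t for t in range(100)] for ch in range(8)]
vector = extract_features(synthetic, onset_sample=10)
print(len(vector))  # 8 channels x 30 samples
```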

2.5. Modeling and Evaluation

We used previously established methods (Krusienski et al. 2006) to discriminate target EPs from non-target EPs. In particular, we used a stepwise regression (p_enter = 0.1, p_remove = 0.15, Jennrich 1977) to reduce the 240 features to a maximum of 60 features. The regression established a linear model that predicted from the selected features whether a particular row or column did or did not contain the target. This model was constructed and evaluated using a leave-one-out cross validation scheme. Thus, in each of the 12 folds of this cross validation, a model was constructed using 11 out of 12 runs (i.e., 44 targets) and was tested on the remaining run. Each run served once as the test run. For each trial, the intersection of the row and column that analysis indicated produced a target EP defined the predicted target. Chance accuracy was 1/36, or 2.8%. We calculated the average classification accuracy for the 12 cross validation folds.
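The evaluation scheme described above can be sketched in two parts: the fold structure (each of the 12 runs held out once) and the selection of the predicted target as the intersection of the best-scoring row and column. The sketch below leaves out the stepwise regression itself and simply assumes that a trained linear model has already produced one accumulated score per row and per column. The function names and the matrix layout are hypothetical.

```python
def leave_one_run_out(n_runs=12):
    """Yield (training_runs, test_run) pairs, holding out each run once."""
    for test_run in range(n_runs):
        training_runs = [run for run in range(n_runs) if run != test_run]
        yield training_runs, test_run


def predict_target(row_scores, col_scores, matrix):
    """Return the item at the intersection of the best-scoring row and column."""
    best_row = max(range(len(row_scores)), key=row_scores.__getitem__)
    best_col = max(range(len(col_scores)), key=col_scores.__getitem__)
    return matrix[best_row][best_col]


# Hypothetical 6x6 layout, used only for illustration.
MATRIX = ["ABCDEF", "GHIJKL", "MNOPQR", "STUVWX", "YZ1234", "56789_"]

folds = list(leave_one_run_out())
print(len(folds), len(folds[0][0]) * 4)  # folds, training targets per fold

# Scores accumulated over stimulus repetitions: first row and last column win.
rows = [2.1, 0.3, -0.2, 0.1, 0.0, 0.4]
cols = [0.2, -0.1, 0.3, 0.0, 0.1, 1.8]
print(predict_target(rows, cols, MATRIX))
```

With 36 items and one predicted item per trial, a classifier with uninformative scores would be correct in 1 of 36 trials on average, which is the chance level quoted above.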

2.6. Verification of Behavioral Compliance

Because the two experimental conditions in this study were set up to assess differences in the EEG that were related to the gaze location, it was critical to verify that the subjects actually fixated on the target in the “letter” condition and on the fixation cross in the “center” condition. As described above, in the “center” condition the subjects received immediate visual feedback if they looked away from the fixation cross for more than 300 ms, but the trial was not aborted. To verify that the subjects did maintain gaze as instructed, we also analyzed the gaze data offline. The results are summarized in Fig. 4. The traces show the distributions of the horizontal (for the six columns) or vertical (for the six rows) distances of gaze location from the fixation cross for the two conditions. The red trace (“letter” condition) shows six peaks for the six rows/columns, while the blue trace (“center” condition) shows only one peak sharply focused on the fixation cross. These data show that the subjects did follow the instructions, that is, they looked at the target in the “letter” condition and at the fixation cross in the “center” condition. It is relevant to note that the subject’s behavior during the “letter” condition (i.e., looking at the target) using the instructions used in this study (i.e., to fixate on the target) was comparable to that using the common instructions (i.e., to focus attention on the target, see Supplementary Fig. A3).

Figure 4. Distributions of the distance of eye gaze from the center during the two conditions.

The traces show the distributions of the horizontal (for the six columns) or vertical (for the six rows) distances of gaze location from the fixation cross for the “letter” condition (red) and the “center” condition (blue). Shading shows standard deviation across subjects.

3. Results

3.1. Effect of Condition on Classification Accuracy

The main results of this study are shown in Fig. 5. This figure shows the classification accuracy (i.e., the accuracy in identifying the target) for the two conditions as a function of stimulus repetitions. All subjects performed significantly better (pairwise t-test, p < 0.001) for Condition 1, the “letter” condition, than for Condition 2, the “center” condition. The final classification accuracy after 15 stimulus repetitions (i.e., the right-most data point in each trace) ranged from 80% to 100% for the “letter” condition and from 2.8% (i.e., chance level) to 90% for the “center” condition. These offline analyses showed that the target could be identified with 100% accuracy for the majority (53%) of subjects for the “letter” condition. In contrast, accuracy did not reach 100% for any subject during the “center” condition, and reached at least 50% in only 47% of the subjects.

Figure 5. Classification accuracy as a function of the number of stimulus repetitions.

As expected, classification accuracy steadily increases with number of stimulus repetitions. Accuracy is substantially greater for the “letter” condition than for the “center” condition.

3.2. Effect on Accuracy of Gaze Distance from the Center

Expanding on the results shown in the previous section, we determined whether accuracy depended on the distance between the target and the fixation cross. We hypothesized that this distance would not affect classification accuracy when the subjects fixated on the target (Condition 1), but would adversely affect accuracy when the subject fixated on the center (Condition 2). The red (Condition 1) and blue (Condition 2) traces in Fig. 6 show accuracy as a function of the distance of eye gaze from the center, for all subjects whose accuracy with 15 stimulus repetitions was > 50%. The results confirm our hypothesis: in Condition 2 only, accuracy declined as the distance of the target from the center increased (i.e., as the target moved from near the center of the visual field toward the periphery).

Figure 6. Accuracy as a function of the distance of eye gaze from the center.

Red and blue traces show results for Conditions 1 and 2, respectively. Shading indicates standard deviation across subjects.

3.3. Effect of Electrode Montage on Classification Accuracy

We also determined whether accuracy would be increased by a larger number of electrodes. As described above, in the 10 subjects of Group A we recorded EEG using an 8-channel montage that had previously been optimized for the “P300” speller (Krusienski et al. 2006, 2008), while in the 7 subjects of Group B we used a full 64-channel extended 10–20 montage (Sharbrough et al. 1991). In offline analysis of the Group B data, we compared accuracies for the optimized 8-channel montage and the full 64-channel montage (see Fig. 3). The results are shown in Fig. 7 for the two montages and the two conditions. For both conditions, the 64 channel montage consistently yielded higher accuracies. Consistent with a previous study (Krusienski et al. 2008), the superiority of the 64-channel montage over the optimized 8-channel montage was modest (4.4%, p = 0.34, pairwise t-test) for Condition 1 (red). In contrast, the improvement with the larger montage was much greater for Condition 2 (18.7%, p < 0.01, pairwise t-test). These results suggest that when a subject does not fixate the target, a different (or larger) montage may be helpful.

Figure 3. Electrode montage for groups A and B.

EEG from group B was recorded from the 64 locations shown here (extended 10–20 montage (Sharbrough et al. 1991)). EEG from group A was recorded from an optimized subset of 8 electrodes (shown in blue) (Krusienski et al. 2006, Krusienski et al. 2008).

Figure 7. Accuracy versus stimulus repetitions for the 8-channel (red) and 64-channel (blue) montages for Condition 1 (down-pointing triangles) and Condition 2 (up-pointing triangles) for the subset of subjects with 64-channel recordings.

3.4. Effect of Fixation Task on EEG Responses

The results presented in Section 3.1 show that all subjects performed significantly better in Condition 1 than in Condition 2 (pairwise t-test, p < 0.001). We were interested in the physiological basis for this difference. To assess the effect of condition on the actual responses to the target and non-target stimuli, we calculated, for each subject’s data under each condition, signed squared correlation coefficient (r²) values for each time segment of the target and non-target responses at each of the 8 electrodes of the optimized montage. We then calculated the average Condition 1 and Condition 2 results across all subjects. The results are shown in Fig. 8. Fig. 8A and Fig. 8B show the signed r² time courses and raw EEG time courses, respectively, for Condition 1 (red) and Condition 2 (blue). The Condition 1 traces show early components around 180 ms after stimulus onset that are absent in the Condition 2 traces. The P3 components appear to be delayed and smaller in amplitude for the “center” task. Fig. 8C–D show color-coded topographies for the 8-channel and 64-channel datasets, respectively. The topographies are consistent for the 8-channel and 64-channel datasets. They show an early VEP component that is focused on visual/occipital areas, with polarity reversal over central and frontal locations, as well as a following P3 component that is focused on central-parietal areas. The early VEP component is missing for the “center” task. We quantified the impact of this early VEP component on classification accuracy by running similar analyses as before, except that we excluded all data between 0 and 250 ms post stimulus. Compared to the results that included all data, the results show a significant reduction in classification accuracy (16%, p < 0.01) for Condition 1, the “letter” condition, and no change (0%, p = 0.9) for Condition 2, the “center” condition.
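One common way to compute a signed r² value for a single channel and time point is to correlate the response amplitudes with a binary class label (1 for target, 0 for non-target) and to keep the sign of the correlation after squaring it. The sketch below follows that definition; it is an illustration under this assumption and is not taken from the analysis code used in the study.

```python
import math


def signed_r2(target_values, nontarget_values):
    """Signed squared Pearson correlation between amplitude and class label.

    Positive values mean that target responses are larger than non-target
    responses at this channel and time point; negative values mean the opposite.
    """
    x = list(target_values) + list(nontarget_values)
    y = [1.0] * len(target_values) + [0.0] * len(nontarget_values)
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # no variance in the amplitudes or only one class present
    r = sxy / math.sqrt(sxx * syy)
    return math.copysign(r * r, r)


# Amplitudes (arbitrary units) at one channel and one time point.
print(signed_r2([2.0, 2.0], [1.0, 1.0]))   # targets consistently larger
print(signed_r2([1.0, 1.0], [2.0, 2.0]))   # targets consistently smaller
```

Applying this function to every channel and every sample of the 750 ms response window would yield time courses of the kind shown in Fig. 8A.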

Figure 8. Average traces and topographies for the two tasks.

Average signed r² traces (A) and wave forms (B) for the two tasks. The traces show negative early VEP components around 180 ms post stimulus for the “letter” task (red traces). P3 components appear to be delayed and smaller in amplitude for the “center” task (blue traces). (C,D): Topographies show an early VEP component (topographies at 180 and 300 ms) for the “letter” task that is absent for the “center” task. Topographies also show a classical P300 response (topographies at 420 ms).

4. Discussion

This study shows that accuracy of the “P300” speller is affected by gaze direction: fixating on the target (as in Condition 1) produces substantially better classification than fixating on a center point (as in Condition 2). These results suggest that online performance of a “P300” speller-based BCI can be expected to be substantially reduced when subjects do not gaze at the desired item. We also found that accuracy decreases as the distance between the gaze fixation point and the target increases. Finally, we found that the 8-channel montage, which focuses on central parietal and occipital areas and has previously been optimized for the “P300” matrix speller (Krusienski et al. 2008), is suboptimal when subjects do not gaze at the target. Detailed analysis of the target and non-target responses indicates that the decreased performance when the subject does not gaze at the target is due mainly to the lack of an early response over posterior (i.e., visual) cortex. These findings are in general alignment with a recently performed study (Treder and Blankertz 2010). In Figure A1, we demonstrate that task-related ERPs between 50–400 ms are negatively correlated with distance to the center point. Together with the fact that P300 evoked responses have not been reported to occur around 180 ms and over visual areas, these results suggest that the matrix speller BCI usually depends, as has been previously shown, on the P300 ERP that is evoked by the recognition of the desired stimulus, but also on a visual ERP that is evoked by the flashing target stimulus.

Our results may explain in part why “P300” speller performance in ALS patients tends to be lower than that in healthy subjects (Nijboer et al. 2008; Sellers, Kübler and Donchin 2006; Vaughan et al. 2006). At the same time, further studies are needed to determine the relationship of gaze and performance in ALS patients, and to optimize the montage for this population. In summary, our findings suggest that the clinical applicability of the “P300” matrix speller in subjects with impaired gaze may be limited. In such subjects, an auditory “P300” matrix speller (e.g., Furdea et al. 2009; Klobassa et al. 2009; Schreuder et al. 2010) may prove useful.

The percent of subjects (94%) with high accuracy (i.e., 80–100% correct) in Condition 1 was similar to that found in a recently published study (Guger et al. 2009) that allowed the subjects to gaze directly at the target.

The accuracy shown in this study declines with increasing distance of the target from the center of foveation due mainly to the lack of an early response over posterior (i.e., visual) cortex. Because of the relationship between the visual acuity and the VEP amplitude (De Keyser et al. 1990; Sherman 1979; Westheimer 1965), we expected a more extensive decline in accuracy. This was not the case. Supplementary Fig. A1 shows a decline over the distance for the VEP (i.e., 50–200 ms) and the P300 (i.e., 250–400 ms) responses, while the amplitude of the late ERP (i.e., 400–600 ms) response increases. This increase in amplitude of the late ERP may explain the less extensive decline in accuracy with increasing distance of the target from the center of foveation. As a side note, this change of the ERP component amplitude with increasing distance of the target from the center of foveation did not affect the generalization of the classifier, as Fig. A2 shows.

Hubel and Wiesel (1959, 1962) showed that the visual cortex performs neuronal processing of spatial frequency, orientation, motion, direction, speed, and many other spatiotemporal features. A recent study (Martens et al. 2009) showed that these properties of the visual system can be exploited to increase the amplitude of the EEG response, and thereby the overall classification accuracy. Di Russo et al. (2002) showed the same polarity reversal of the early VEP components between visual/occipital cortex and central and frontal locations that we observed in Condition 1 (see Section 3.4).

The optimization of “P300” stimulation parameters is usually based on data from normal subjects obtained in Condition-1-like circumstances (i.e., the subject is allowed to gaze at the target). Our results suggest that such optimization is determined more by the VEP than by P300 (Gonsalvez and Polich 2002). Thus, the lack of early VEP components over visual/occipital cortex, with polarity reversal over central and frontal locations, in Condition 2 (see Section 3.4) suggests that optimization based on data from normal subjects (Krusienski et al. 2006, 2008; Sellers, Krusienski, McFarland, Vaughan and Wolpaw 2006; Takano et al. 2009) may not generalize well to subjects in whom gaze control is impaired.

Two aspects of the study methodology may have exaggerated the actual difference in accuracy between the two conditions. That is, the improvement produced by gazing at the target rather than at a central fixation point may not be as great as the present data imply. First, Condition 2, the “center” condition, is a more demanding task than is Condition 1, the “letter” condition. In Condition 1, the subject has only to look at the target and pay attention to it, while in Condition 2 s/he has to look at the fixation cross and pay attention to the target. Tasks that require greater amounts of attentional resources have been shown to elicit smaller and delayed P300 responses than tasks that require lesser attentional resources (Kok 2001; Polich 1987). This is consistent with our results shown in Section 3.4. This could account for much of the difference in accuracy between the conditions summarized in Fig. 5. Furthermore, it is possible that with continued practice, the subjects might improve their performance on the more difficult task of Condition 2, and thereby reduce the difference in accuracy between the two conditions.

Second, given the inverse relationship between visual acuity and distance from the point of gaze (e.g., Westheimer 1965), under Condition 2, 32 of the 36 possible targets (see Figure 2) were at a disadvantage because some of their non-target competitors (i.e., some of the possible mistakes) were closer to the point of gaze. This finding suggests that, if the impact of having non-targets closer to the fixation point than the target were eliminated (e.g., by having all possible targets in a circle centered on the fixation point), Condition 2 accuracy would improve, and the superiority of Condition 1 would be less marked.
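The count of 32 disadvantaged targets can be reproduced from the grid geometry alone: in a 6×6 matrix centered on the fixation point, only the four innermost items have no other item closer to the center. The short sketch below counts the remaining items; the function name is illustrative.

```python
from itertools import product


def disadvantaged_targets(n=6):
    """Count grid items that have at least one other item closer to the center."""
    coords = [i - (n - 1) / 2 for i in range(n)]  # -2.5, -1.5, ..., 2.5
    cells = list(product(coords, repeat=2))
    squared = {cell: cell[0] ** 2 + cell[1] ** 2 for cell in cells}
    nearest = min(squared.values())
    # An item is at a disadvantage if some other item is strictly closer
    # to the fixation point, i.e., if it is not among the nearest items.
    return sum(1 for cell in cells if squared[cell] > nearest)


print(disadvantaged_targets())
```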

To verify that these aspects did not affect the main result of this paper, i.e., that the performance of the “P300” matrix speller in normal subjects not only depends on the P300 evoked potential, but also on other EEG features such as the visual evoked potential (VEP) that strongly depend on eye-gaze direction, we conducted the following analysis. We hypothesized that the classification accuracy in Condition 1, the “letter” condition, which is unaffected by both aforementioned aspects, would significantly decrease if the visually evoked potential (VEP) were not used. Thus, we compared the classification accuracy within Condition 1, the “letter” condition, when we used either all data (i.e., 0–800 ms after stimulus presentation) or data that excludes VEP components (i.e., 300–800 ms). ERPs that depend on eye gaze such as VEPs are known to occur 150–350 ms post stimulus, while P300 ERPs are known to occur 300–600 ms post stimulus. The results shown in Fig. A4 demonstrate that 14/17 subjects performed significantly worse when the ERPs used for classification were restricted to 300 to 800 ms compared to when they were not (29.6%, p<0.05, pairwise t-test). This confirms our hypothesis and proves that the main result of this paper is not affected by the two aspects mentioned above. At the same time, it is important to remember that this analysis was performed in healthy subjects, and its results may not be identical to those for people with impaired eye gaze.

Visual crowding (Korte 1923; Strasburger 2005), i.e., the impaired recognition of a suprathreshold target due to the presence of distractor elements in the neighborhood of that target, may have also had an adverse effect on the classification accuracy in Condition 2, the “center” condition. Crowding is inevitable in the matrix “P300” speller and can only be avoided by arranging the letters in a circle rather than a matrix.

Finally, while we controlled for eye movements in this study, we did not control for or eliminate very small or very brief eye movements. Such movements, e.g., micro-saccades (Cornsweet 1956), contribute to maintaining foveal visibility by continuously stimulating neurons in primary visual areas (Rolfs 2009). While it is known that ALS can impair eye-gaze (Cohen and Caroscio 1983; Palmowski et al. 1995), the effect of ALS on such saccades has not yet been studied. Thus, the difference between the results shown here and the results that can be expected in people with ALS is currently unclear.

Further studies are needed to optimize P300 recording montages and stimulation and analysis parameters and to evaluate the effect of online feedback and extended training in this user population.

In summary, this study shows in normal subjects that the classification accuracy of the “P300” matrix speller BCI is substantially improved when the subject gazes directly at the target. Thus, the study disproves the widespread assumption that the performance of the “P300” speller does not depend on fixating the target. Further research is needed to determine whether this effect is similarly prominent in the potential user population (e.g., people severely disabled by ALS), and whether their performance can be improved by modifications in montage selection, algorithm parameterization, or other aspects of BCI operation.

Supplementary Material

Supplementary figure 1. Figure A1. Average squared correlation coefficient (r²) values of the target and non-target responses as a function of the distance of gaze from the center.

The traces show the r² averaged over all 8 electrodes of the optimized montage and all subjects that achieved more than 50% in Condition 2, the “center” condition. The traces show a decline of the r² value with distance for the 50–200 ms (i.e., VEP) and the 250–400 ms (i.e., P300) periods, while the r² value for the 400–600 ms (i.e., late ERP) period increases.
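One common way to obtain such an r² value is to correlate a signal amplitude with a binary class label (target versus non-target) and square the result. The Python sketch below assumes that definition; the function name `r_squared` and the input values are hypothetical, and the study's own implementation may differ in detail.

```python
def r_squared(target, nontarget):
    """Squared Pearson correlation between amplitude and class label.

    target, nontarget: amplitudes at one electrode and one time point,
    one value per response. Labels are 1 for target, 0 for non-target.
    """
    values = list(target) + list(nontarget)
    labels = [1.0] * len(target) + [0.0] * len(nontarget)
    n = len(values)
    mean_v = sum(values) / n
    mean_l = sum(labels) / n
    cov = sum((v - mean_v) * (l - mean_l) for v, l in zip(values, labels))
    var_v = sum((v - mean_v) ** 2 for v in values)
    var_l = sum((l - mean_l) ** 2 for l in labels)
    if var_v == 0 or var_l == 0:
        return 0.0  # no variance, so nothing can be explained by the label
    return cov * cov / (var_v * var_l)


# Hypothetical amplitudes, NOT data from the study.
separable = r_squared([2.0, 2.0], [0.0, 0.0])      # classes fully separated
overlapping = r_squared([1.0, 0.0], [1.0, 0.0])    # classes identical
```

The value lies between 0 and 1 and can be read as the fraction of amplitude variance accounted for by whether the stimulus was a target.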

Supplementary figure 2. Figure A2. Classification accuracy, based on distance specific classifiers, as a function of the distance of gaze from the center.

The bars show the classification accuracy in Condition 2, the “center” condition, for classifiers specifically trained on each of 6 distances of gaze from the center. The distance-specific classifiers generalize poorly to other distances (e.g., trained on 2 degrees, tested on 9.9 degrees), while the generalized classifier (dark blue bar) results in the best overall and best generalizing classification accuracy.

Supplementary figure 3. Figure A3. Distributions of the distance of eye gaze from the center when given normal “P300” matrix speller usage instructions.

The traces show the distributions of the horizontal (for the six columns) or vertical (for the six rows) distances of gaze location from the fixation cross for one subject who was given the instruction to “focus attention” on the intended letter.

Supplementary figure 4. Figure A4. Classification accuracy in the “letter” condition as a function of the number of stimulus repetitions and data period.

The left panel shows results when we used all data (i.e., 0–800 ms post stimulus). The right panel shows results when we used only data after 300 ms (i.e., 300–800 ms post stimulus). See text for details.

Acknowledgments

We would like to acknowledge Dr. Dennis McFarland for his helpful comments. This work was supported by the NIH (EB006356 (GS), EB00856 (JRW and GS)) and the US Army Research Office (W911NF-07-1-0415 (GS) and W911NF-08-1-0216 (GS)).

References

  1. Allison BZ, Pineda JA. ERPs evoked by different matrix sizes: implications for a brain computer interface (BCI) system. IEEE Trans Neural Syst Rehabil Eng. 2003;11(2):110–113. doi: 10.1109/TNSRE.2003.814448.
  2. Birbaumer N, Cohen LG. Brain-computer interfaces: communication and restoration of movement in paralysis. J Physiol. 2007;579(Pt 3):621–636. doi: 10.1113/jphysiol.2006.125633.
  3. Citi L, Poli R, Cinel C, Sepulveda F. P300-based BCI mouse with genetically-optimized analogue control. IEEE Trans Neural Syst Rehabil Eng. 2008;16(1):51–61. doi: 10.1109/TNSRE.2007.913184.
  4. Cohen B, Caroscio J. Eye movements in amyotrophic lateral sclerosis. J Neural Transm Suppl. 1983;19:305–315.
  5. Cornsweet TN. Determination of the stimuli for involuntary drifts and saccadic eye movements. J Opt Soc Am. 1956;46(11):987–993. doi: 10.1364/josa.46.000987.
  6. De Keyser M, Vissenberg I, AN Are visually evoked potentials (VEP) useful for determination of visual acuity?: A clinical trial. Neuro-Ophthalmology. 1990;10(3):153–163.
  7. Di Russo F, Martínez A, Sereno MI, Pitzalis S, Hillyard SA. Cortical sources of the early components of the visual evoked potential. Hum Brain Mapp. 2002;15(2):95–111. doi: 10.1002/hbm.10010.
  8. Donchin E, Arbel Y. Neuroergonomics and Operational Neuroscience; FAC ‘09: Proceedings of the 5th International Conference on Foundations of Augmented Cognition; Berlin, Heidelberg: Springer-Verlag; 2009. pp. 724–731.
  9. Donchin E, Spencer KM, Wijesinghe R. The mental prosthesis: assessing the speed of a P300-based brain-computer interface. IEEE Trans Rehabil Eng. 2000;8(2):174–179. doi: 10.1109/86.847808.
  10. Edlinger G, Holzner C, Groenegress C, Guger C, Slater M. Neuroergonomics and Operational Neuroscience; FAC ‘09: Proceedings of the 5th International Conference on Foundations of Augmented Cognition; Berlin, Heidelberg: Springer-Verlag; 2009. pp. 732–740.
  11. Farwell LA, Donchin E. Talking off the top of your head: toward a mental prosthesis utilizing event-related brain potentials. Electroencephalogr Clin Neurophysiol. 1988;70(6):510–523. doi: 10.1016/0013-4694(88)90149-6.
  12. Furdea A, Halder S, Krusienski DJ, Bross D, Nijboer F, Birbaumer N, Kübler A. An auditory oddball (P300) spelling system for brain-computer interfaces. Psychophysiology. 2009;46(3):617–625. doi: 10.1111/j.1469-8986.2008.00783.x.
  13. Gonsalvez CL, Polich J. P300 amplitude is determined by target-to-target interval. Psychophysiology. 2002;39(3):388–396. doi: 10.1017/s0048577201393137.
  14. Guger C, Daban S, Sellers E, Holzner C, Krausz G, Carabalona R, Gramatica F, Edlinger G. How many people are able to control a P300-based brain-computer interface (BCI)? Neurosci Lett. 2009;462(1):94–98. doi: 10.1016/j.neulet.2009.06.045.
  15. Hubel DH, Wiesel TN. Receptive fields of single neurones in the cat’s striate cortex. J Physiol. 1959;148:574–591. doi: 10.1113/jphysiol.1959.sp006308.
  16. Hubel DH, Wiesel TN. Receptive fields, binocular interaction and functional architecture in the cat’s visual cortex. J Physiol. 1962;160:106–154. doi: 10.1113/jphysiol.1962.sp006837.
  17. Jennrich RI. John Wiley and Sons. New York: 1977. pp. 58–75.
  18. Klobassa DS, Vaughan TM, Brunner P, Schwartz NE, Wolpaw JR, Neuper C, Sellers EW. Toward a high-throughput auditory P300-based brain-computer interface. Clin Neurophysiol. 2009;120(7):1252–1261. doi: 10.1016/j.clinph.2009.04.019.
  19. Kok A. On the utility of P3 amplitude as a measure of processing capacity. Psychophysiology. 2001;38(3):557–577. doi: 10.1017/s0048577201990559.
  20. Korte W. Über die Gestaltauffassung im indirekten Sehen. Zeitschrift für Psychologie. 1923;93:17–82.
  21. Krusienski DJ, Sellers EW, Cabestaing F, Bayoudh S, McFarland DJ, Vaughan TM, Wolpaw JR. A comparison of classification techniques for the P300 Speller. J Neural Eng. 2006;3(4):299–305. doi: 10.1088/1741-2560/3/4/007.
  22. Krusienski DJ, Sellers EW, McFarland DJ, Vaughan TM, Wolpaw JR. Toward enhanced P300 speller performance. J Neurosci Methods. 2008;167(1):15–21. doi: 10.1016/j.jneumeth.2007.07.017.
  23. Lenhardt A, Kaper M, Ritter HJ. An adaptive P300-based online brain-computer interface. IEEE Trans Neural Syst Rehabil Eng. 2008;16(2):121–130. doi: 10.1109/TNSRE.2007.912816.
  24. Martens SM, Hill NJ, Farquhar J, Schölkopf B. Overlap and refractory effects in a brain-computer interface speller based on the visual P300 event-related potential. J Neural Eng. 2009;6(2):026003–026003. doi: 10.1088/1741-2560/6/2/026003.
  25. Mellinger J, Schalk G. In: Toward Brain-Computer Interfacing. Dornhege G, del J, Millan R, Hinterberger T, McFarland D, Müller K, editors. MIT Press; 2007. pp. 359–367.
  26. Mugler E, Bensch M, Halder S, Rosenstiel W, Bogdan M, Birbaumer N, Kübler A. Control of an Internet Browser Using the P300 Event Related Potential. 2008;10(1):56–63.
  27. Nijboer F, Sellers EW, Mellinger J, Jordan MA, Matuz T, Furdea A, Halder S, Mochty U, Krusienski DJ, Vaughan TM, Wolpaw JR, Birbaumer N, Kübler A. A P300-based brain-computer interface for people with amyotrophic lateral sclerosis. Clin Neurophysiol. 2008;119(8):1909–1916. doi: 10.1016/j.clinph.2008.03.034.
  28. Palmowski A, Jost WH, Prudlo J, Osterhage J, Käsmann B, Schimrigk K, Ruprecht KW. Eye movement in amyotrophic lateral sclerosis: a longitudinal study. Ger J Ophthalmol. 1995;4(6):355–362.
  29. Polich J. Task difficulty, probability, and inter-stimulus interval as determinants of P300 from auditory stimuli. Electroencephalogr Clin Neurophysiol. 1987;68(4):311–320. doi: 10.1016/0168-5597(87)90052-9.
  30. Rebsamen B, Burdet E, Guan C, Zhang H, Teo CL, Zeng Q, Laugier C, Ang MH Jr. Controlling a Wheelchair Indoors Using Thought. IEEE Intelligent Systems. 2007;22(2):18–24.
  31. Rolfs M. Microsaccades: small steps on a long way. Vision Res. 2009;49(20):2415–2441. doi: 10.1016/j.visres.2009.08.010.
  32. Schalk G, McFarland DJ, Hinterberger T, Birbaumer N, Wolpaw JR. BCI2000: a general-purpose brain-computer interface (BCI) system. IEEE Trans Biomed Eng. 2004;51(6):1034–1043. doi: 10.1109/TBME.2004.827072.
  33. Schalk G, Mellinger J. A Practical Guide to Brain-Computer Interfacing with BCI2000. 1. Springer; 2010.
  34. Schreuder M, Blankertz B, Tangermann M. A new auditory multi-class brain-computer interface paradigm: spatial hearing as an informative cue. PLoS One. 2010;5(4). doi: 10.1371/journal.pone.0009813.
  35. Sellers EW, Krusienski DJ, McFarland DJ, Vaughan TM, Wolpaw JR. A P300 event-related potential brain-computer interface (BCI): the effects of matrix size and inter stimulus interval on performance. Biol Psychol. 2006;73(3):242–252. doi: 10.1016/j.biopsycho.2006.04.007.
  36. Sellers EW, Kübler A, Donchin E. Brain-computer interface research at the University of South Florida Cognitive Psychophysiology Laboratory: the P300 Speller. IEEE Trans Neural Syst Rehabil Eng. 2006;14(2):221–224. doi: 10.1109/TNSRE.2006.875580.
  37. Sellers EW, Vaughan TM, Wolpaw JR. A brain-computer interface for long-term independent home use. Amyotrophic Lateral Sclerosis. 2010 (in press). doi: 10.3109/17482961003777470.
  38. Serby H, Yom-Tov E, Inbar GF. An improved P300-based brain-computer interface. IEEE Trans Neural Syst Rehabil Eng. 2005;13(1):89–98. doi: 10.1109/TNSRE.2004.841878.
  39. Sharbrough F, Chatrian GE, Lesser RP, Luders H, Nuwer M, Picton TW. American Electroencephalographic Society guidelines for standard electrode position nomenclature. Electroenceph Clin Neurophysiol. 1991;8:200–202.
  40. Sherman J. Visual evoked potential (VEP): basic concepts and clinical applications. J Am Optom Assoc. 1979;50(1):19–30.
  41. Strasburger H. Unfocused spatial attention underlies the crowding effect in indirect form vision. J Vis. 2005;5(11):1024–1037. doi: 10.1167/5.11.8.
  42. Takano K, Komatsu T, Hata N, Nakajima Y, Kansaku K. Visual stimuli for the P300 brain-computer interface: a comparison of white/gray and green/blue flicker matrices. Clin Neurophysiol. 2009;120(8):1562–1566. doi: 10.1016/j.clinph.2009.06.002.
  43. Treder MS, Blankertz B. (C)overt attention and visual speller design in an ERP-based brain-computer interface. Behav Brain Funct. 2010;6(1):28–28. doi: 10.1186/1744-9081-6-28.
  44. Vaughan TM, McFarland DJ, Schalk G, Sarnacki WA, Krusienski DJ, Sellers EW, Wolpaw JR. The Wadsworth BCI Research and Development Program: at home with BCI. IEEE Trans Neural Syst Rehabil Eng. 2006;14(2):229–233. doi: 10.1109/TNSRE.2006.875577.
  45. Westheimer G. Visual acuity. Annu Rev Psychol. 1965;16:359–380. doi: 10.1146/annurev.ps.16.020165.002043.
