Cognitive Neurodynamics. 2021 Sep 17;16(2):365–377. doi: 10.1007/s11571-021-09717-7

Categorizing objects from MEG signals using EEGNet

Ran Shi 1, Yanyu Zhao 1, Zhiyuan Cao 1, Chunyu Liu 1, Yi Kang 1, Jiacai Zhang 1,2
PMCID: PMC8934895  PMID: 35401863

Abstract

Magnetoencephalography (MEG) signals have demonstrated their practical application to reading human minds. Current neural decoding studies have made great progress in building subject-wise decoding models that extract and discriminate the temporal/spatial features in neural signals. In this paper, we used a compact convolutional neural network, EEGNet, to build a common decoder across subjects that deciphered the categories of objects (faces, tools, animals, and scenes) from MEG data. This study investigated the influence of the spatiotemporal structure of MEG on EEGNet's classification performance. Furthermore, we replaced the convolution layers of EEGNet with two sets of parallel convolution structures to extract the spatial and temporal features simultaneously. Our results showed that the organization of the MEG data fed into EEGNet affects its classification accuracy, and that the parallel convolution structures are beneficial for extracting and fusing spatial and temporal MEG features. The classification accuracy demonstrated that EEGNet succeeds in building a common decoder model across subjects and outperforms several state-of-the-art feature fusion methods.

Keywords: Neural decoding, Magnetoencephalography, Deep learning, Feature fusion

Introduction

It remains a challenge to build a map between external visual stimuli and internal neural representations. Previous human neuroimaging studies have provided strong evidence for "brain reading" or neural decoding. Neural decoding reads out the detailed contents of a person's mental state (such as the stimulus of a human face) from the underlying brain activity, which opens a window for understanding the correspondence between a simple cognitive state and human neuroimaging (Blair et al. 2015; Lee et al. 2020; Simanova et al. 2012; Wang et al. 2012).

Recent progress in neurotechnology has made it possible to study neural activation maps for specific stimuli and to discriminate stimuli from various modalities of neural signals, such as functional magnetic resonance imaging (fMRI), electroencephalography (EEG), and magnetoencephalography (MEG) (Gross 2019). Kamitani and colleagues reliably predicted which of 8 edge orientations a subject was seeing using ensemble fMRI signals in early visual areas (Kamitani and Tong 2005). Kay and Gallant summarized several advances in fMRI-based brain decoders of visual stimuli, including distinguishing among a handful of predetermined images (faces or houses), identifying the specific image that the subject saw out of a set of potential images, and even reconstructing the contents of stimulus images (Kay and Gallant 2009). A research group at Beijing Normal University deciphered three disparity conditions (crossed disparity, uncrossed disparity, and zero disparity) of 3-dimensional image stimuli from functional connectivity (FC) patterns in fMRI data (Liu et al. 2020b). Mitchell's work has shown that spatial patterns of neural activation in fMRI are associated with thinking about specific semantic categories of pictures and words (for example, tools, buildings, and animals), and he presented a computational model that predicts the fMRI neural activation associated with thousands of other concrete nouns in a text corpus for which fMRI data are not yet available (Mitchell et al. 2008). A brain-computer interface (BCI) study at Columbia University employed single-trial EEG activity to detect target images among images presented in rapid serial visual presentation (Gerson et al. 2006). Recently, Willett et al. used a recurrent neural network to develop an intracortical BCI that decodes attempted handwriting movements from neural activity in the motor cortex and translates them to text in real time (Willett et al. 2020). Liu's study showed that a multivariate decoding algorithm based on FC patterns from MEG data can categorize image stimuli into four categories (faces, scenes, animals, and tools) (Liu et al. 2020a).

Neural mapping between visual stimuli and brain signals is a lasting research topic in the neuroscience community (Horikawa et al. 2013; Lee et al. 2020). Early studies used fMRI signals to identify the brain areas corresponding to specific objects. For example, Okada et al. (2000) demonstrated that naming animals caused more activation of the primary visual cortex bilaterally and the ventral occipital cortex, while naming tools caused more activation of the temporal area, the parietal lobule, and the left inferior frontal cortex. Although these studies are important for understanding the visual information processing mechanism, researchers turned to EEG data to better understand the temporal dynamics of the object recognition process, because of EEG's high temporal resolution and ability to record fast fluctuations of the brain response. Philiastides (2006) successfully classified face and vehicle images from waveforms of event-related potential (ERP) components (early 170 ms and late > 300 ms components) in single-trial EEG. Wang et al. (2012) demonstrated that a combination of ERP components improved the classification accuracy of four categories (faces, buildings, animals, and cars). Although EEG has provided significant findings in neural encoding/decoding studies, EEG studies have advanced little in understanding the spatial neural patterns (spatial connections, neural pathways) of visual information processing within the human brain because of their poor spatial resolution. Therefore, the non-invasive, whole-head neuroimaging modality MEG, which has both high temporal resolution and good scalp-level spatial resolution (Dash et al. 2019), holds the potential for further investigation of visual neural decoding. MEG signals provide a new technology to investigate cognitive task-specific activations that is not influenced by abnormalities of blood flow, variable volume conduction, or cerebral structural lesions (Qadri et al. 2021). MEG allows us to record the magnetic fields produced by the electrical activity of the brain and to explore important questions in neuroscience, neural engineering, and clinical studies. Scientists have successfully used spike events from raw MEG data and the statistical dependencies between MEG time series to diagnose epilepsy (Zheng et al. 2019) and Alzheimer's disease (Lopez-Martin et al. 2020).

MEG spatiotemporal features have demonstrated great potential for neural decoding (Alexander et al. 2011; Dash et al. 2020; Huang et al. 2019; Muukkonen et al. 2020; Seeliger et al. 2017; Van et al. 2013). Kostas's work showed that deep neural networks trained on raw MEG data can predict the age of children performing a verb-generation task, a monosyllabic speech-elicitation task, and a multi-syllabic speech-elicitation task (Kostas et al. 2019). Dash et al. (2020) used a unique representation of spatial, spectral, and temporal features from MEG data to represent whole-brain dynamics and decode imagined/spoken phrases. The event-related potential (ERP) and short-time dynamic FC patterns extracted from MEG have also been used to classify different visual images (Liu et al. 2020a; Qin et al. 2016; Wang et al. 2012). Previous studies demonstrate that MEG provides timing as well as spatial information about brain activity and expands our understanding of the neural processes underlying visual object recognition (Contini et al. 2017). Therefore, this study focuses on applying machine learning to MEG feature extraction and the consequent decoding model for stimulus image categorization.

Machine learning approaches have been widely used in MEG decoding. Sen et al. (2018) used a complex Morlet wavelet transform to extract two types of features and then compared them using SVM and ANN classifiers; they found that local features achieved higher accuracy. To improve MEG motion decoding accuracy, Zhang et al. (2011) introduced a clustering linear discriminant analysis (CLDA) algorithm, which can accurately capture correlation information between features. Recent studies have suggested that deep neural networks outperform conventional machine learning methods because deep learning allows for learning representations of data with multiple levels of abstraction and generating features from raw features (Cetiner et al. 2016). Dash et al. demonstrated that ANNs and CNNs can effectively decode imagined and spoken phrases directly from non-invasive MEG signals (Dash et al. 2020). Zubarev et al. (2019) introduced a convolutional neural network model that follows a generative model to classify evoked and oscillatory MEG data across subjects; they demonstrated that incorporating prior knowledge about the process generating MEG observations can effectively reduce model complexity while maintaining high accuracy and interpretability. Kostas's results show that end-to-end deep learning approaches significantly outperformed feature-based models in MEG classification tasks (Kostas et al. 2019). Although this result is not a unanimous demonstration, it indicates the promising performance of end-to-end models. Huang et al. (2019) proposed a 3D-CNN architecture and an improved self-training method to use the spatial-temporal information of raw MEG data, which require cheaper pre-processing while improving the accuracy of cross-subject MEG decoding. All the examples mentioned above demonstrate the potential of deep learning applications in MEG neural decoding.

However, the application of deep neural networks to MEG decoding faces several challenges. First, the features extracted automatically by deep networks have poor interpretability; how to introduce prior knowledge into the network so that it learns the spatiotemporal features of MEG signals needs further investigation. Second, high inter-individual variability is another key challenge. Decoding models at the individual level have been widely explored in previous literature, and some MEG features, such as typical ERP components and FC patterns, have proved useful for building within-subject decoding models (Liu et al. 2020a; Van et al. 2013; Wang et al. 2012). However, current MEG decoding often calls for skilled human intervention to achieve MEG feature extraction (Gross 2019). Manually selected features may be limited in capturing the rich variability between individuals and are therefore not effective for building a common decoding model across subjects. Thus, such studies remain scant. Is it possible to extract common features across subjects with deep learning methods?

Herein, we employed a compact convolutional neural network called EEGNet to implement end-to-end classification of MEG signals (Lawhern et al. 2016). EEGNet has been used to identify predictive neurophysiological processes (such as attentional and motor response selection processes) under action control tasks in single-trial EEG data (Vahid et al. 2020). These studies inspired us to investigate the classification performance of EEGNet on MEG data. We also replaced the convolution layer in EEGNet with two parallel convolution networks to extract the MEG spatial and temporal information respectively, and fused the resulting features by concatenation before feeding them into the classification layer. The contributions of this study are as follows:

First, we applied EEGNet, a compact convolutional neural network commonly used in EEG classification, to MEG data and built neural decoding models at both the individual level and the cross-subject level.

Second, we demonstrated the category-related neural representation in the spatiotemporal distribution and investigated the influence of the spatiotemporal structure on decoding accuracy.

Finally, we modified EEGNet with two sets of parallel convolution structures to fuse the spatial and temporal features in MEG, one convolution branch for the original MEG matrix and the other for its transpose, and examined the effect of feature fusion on performance.

Methods

Ethics statement

This study protocol was approved by the Centre for Functional Imaging Review Board on Experimental Ethics of Peking University. Prior to the experiment, all subjects provided written informed consent to participate.

Subjects and tasks

Nineteen healthy right-handed students from Beijing Normal University were recruited for this study. All subjects (23 ± 2 years, 9 females) had normal hearing, normal or corrected-to-normal vision, and no neurological disorders. All participants provided written informed consent, and the experimental procedures were approved by the Peking University Institutional Review Board. Because of excessive head movements by 2 participants during the experiment, only the data of 17 participants were used in this study. During the MEG recording, participants viewed visual images displayed by a projector with their arms relaxed.

The image stimuli comprised 640 images from four categories (160 images per category): faces with neutral expressions, scenes, animals, and tools. Face images with neutral expressions were selected from the Chinese affective picture system (Bai et al. 2005) and included 80 unique male and female neutral faces. The scene stimuli were all outdoor scenes (http://cvcl.mit.edu/database.htm), including mountains, countryside scenes, streets, and buildings, with 40 unique pictures of each type. The animal and tool materials were collected from the Internet with some modifications. Animal stimuli included 40 items from mammals, birds, insects, and reptiles, each item with four exemplars. The tool stimuli also comprised 40 items covering kitchen utensils, farm implements, and other common indoor tools, each with four exemplars. All 640 images were converted to greyscale and cropped to the central 300 × 300 pixels, corresponding to a visual angle of 7.92° × 7.92°.

The experiment consisted of 10 sessions. In each session, 64 visual stimuli from the 4 categories (16 faces, 16 scenes, 16 animals, and 16 tools) were presented to subjects in random order (one image per trial). In each trial, a visual stimulus image was displayed for 2000 ms, followed by a blank-screen interstimulus interval (ISI) ranging randomly from 1500 to 2000 ms. A green cross at the center of each image and of the blank screen helped the subjects maintain fixation. Subjects were required to press a key when they saw a repeated picture within the same session, and their reaction time was recorded.

Data recording and preprocessing

MEG signals were acquired in the Centre for Functional Imaging of Peking University using a 306-channel whole-head MEG system (Elekta Neuromag, Helsinki, Finland; the 306 data channels are distributed over 102 locations, each with 2 gradiometer sensors and 1 magnetometer sensor) in a magnetically shielded room at a sampling rate of 1000 Hz (Liu et al. 2020a). We lined the Dewar helmet with foam wedges before placing it over the subject's head to ensure stability and used a wooden head supporter to keep head displacements between runs below 5 mm.

The preprocessing of the MEG data was performed using MaxFilter software (Elekta Neuromag, Helsinki, Finland) and the Brainstorm software package. All 306 channels were checked by visual inspection, and no significant abnormal noise was found. To interpolate values missing in bad sensors and suppress magnetic interference, the temporal extension of Signal Space Separation (tSSS) in MaxFilter (Taulu and Simola 2006) was applied to the raw data. The tSSS, with a subspace correlation limit of 0.9 and a sliding window of 10 s, was used to remove artifacts from external interference. The head positions of each pair of runs were co-registered to a reference position using MaxMove (a sub-component of MaxFilter). Based on visual inspection, data containing high-amplitude artifacts were removed. In the MEG preprocessing step using Brainstorm (Tadel et al. 2011), the MEG data were band-pass filtered between 0.08 and 40 Hz with an FIR filter provided by the Brainstorm software. The first one or two eye-movement components were further removed using independent component analysis (ICA) without affecting the rest of the recordings. Afterward, the data were scaled to the [10^-3, 10^3] interval to avoid exceeding the numerical representation range of the computer. Previous visual neural decoding studies have shown that the discriminative information of visual stimulation is temporally distributed across typical ERP components appearing within 600 ms after stimulus onset (Dalponte et al. 2007; Contini et al. 2020; Qin et al. 2016; Simanova et al. 2012; Wang et al. 2012). Therefore, we segmented the MEG data into trials from −100 to 600 ms. The prestimulus data were used as the baseline (Liu et al. 2020a), and the poststimulus data were used for further analysis.
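The authors performed these steps with MaxFilter and Brainstorm; purely as an illustration, the filtering, ICA, and epoching stages could be reproduced along the following lines in MNE-Python (the file name, trigger channel, event codes, and excluded ICA component indices below are hypothetical):

```python
# Minimal sketch of the epoching pipeline described above, implemented with
# MNE-Python instead of Brainstorm; names and codes are placeholders only.
import mne

raw = mne.io.read_raw_fif("subject01_tsss.fif", preload=True)  # tSSS already applied by MaxFilter

# Band-pass filter 0.08-40 Hz (FIR), as in the Brainstorm preprocessing step
raw.filter(l_freq=0.08, h_freq=40.0, picks="meg")

# Remove the first one or two eye-movement ICA components
ica = mne.preprocessing.ICA(n_components=20, random_state=0)
ica.fit(raw)
ica.exclude = [0, 1]          # indices chosen here only for illustration
ica.apply(raw)

# Epoch from -100 to 600 ms around stimulus onset; prestimulus window as baseline
events = mne.find_events(raw, stim_channel="STI101")           # trigger channel name is an assumption
event_id = {"face": 1, "scene": 2, "animal": 3, "tool": 4}     # hypothetical trigger codes
epochs = mne.Epochs(raw, events, event_id=event_id, tmin=-0.1, tmax=0.6,
                    baseline=(None, 0.0), preload=True)
X = epochs.get_data(picks="meg")   # shape: (n_trials, 306, n_times)
y = epochs.events[:, 2]            # category labels
```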

EEGNet

The deep learning architecture we used is a compact convolutional neural network called EEGNet. Its good performance on paradigms such as P300 visual-evoked potentials, error-related negativity responses (ERN), movement-related cortical potentials (MRCP), and sensorimotor rhythms (SMR) has been demonstrated previously (Lawhern et al. 2016). The structure of EEGNet and the parameters used in this study are shown in Table 1. The EEGNet architecture consists of three parts: a regular 2D convolutional layer, a depthwise convolution, and a separable convolution consisting of a depthwise convolution followed by a pointwise convolution. The advantage of the depthwise convolution is that it is not fully connected to all previous feature maps but is connected to only one previous feature map. Separable convolution has fewer parameters than ordinary convolution, so it is possible to train a model that is less prone to over-fitting. EEGNet can not only extract interpretable abstract features from neurophysiological signals but also build reliable models from small datasets, and it has demonstrated great application value (Huang et al. 2020; Riyad et al. 2020; Vahid et al. 2019). More details about EEGNet can be found in Lawhern et al. (2016).

Table 1.

Parameters in EEGNet for our MEG data

| Module | Layer type | Filters | Size | Parameters | Output dimension | Activation | Dropout |
| Frequency domain filtering | Input | – | – | – | (306, 600) | – | – |
|  | Reshape | – | – | – | (1, 306, 600) | – | – |
|  | 2D convolution | 16 | (1, 500) | 500 × 16 | (16, 306, 600) | Linear | – |
|  | Batch normalization | – | – | 2 × 16 | (16, 306, 600) | – | – |
| Space domain filtering | Depthwise convolution | 2 × 16 | (306, 1) | 306 × 2 × 16 | (2 × 16, 1, 600) | Linear | – |
|  | Batch normalization | – | – | 2 × 2 × 16 | (2 × 16, 1, 600) | – | – |
|  | Activation | – | – | – | (2 × 16, 1, 600) | ELU | – |
|  | Pooling | – | (1, 4) | – | (2 × 16, 1, 600/4) | – | – |
|  | Dropout | – | – | – | (2 × 16, 1, 600/4) | – | p = 0.25 |
| Time domain filtering | Separable convolution | 32 | (1, 16) | (16 + 32) × 2 × 16 | (32, 1, 600/4) | Linear | – |
|  | Batch normalization | – | – | 2 × 32 | (32, 1, 600/4) | – | – |
|  | Activation | – | – | – | (32, 1, 600/4) | ELU | – |
|  | Pooling | – | (1, 8) | – | (32, 1, 600/32) | – | – |
|  | Dropout | – | – | – | (32, 1, 600/32) | – | p = 0.25 |
| Classification | Reshape | – | – | – | (1, 32 × 600/32) | – | – |
|  | Dense layer | 4 | – | – | (4) | SoftMax | – |
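For illustration, the configuration in Table 1 maps onto a model along the lines of the following minimal PyTorch sketch. The paper does not state which framework was used; the "same" padding choice, the omission of max-norm constraints, and the omission of an explicit SoftMax (applied internally by a cross-entropy loss) are simplifications of this sketch, not details taken from the paper.

```python
# Minimal PyTorch sketch of the Table 1 EEGNet configuration
# (F1 = 16 temporal filters, D = 2 spatial filters per temporal filter,
#  F2 = 32 pointwise filters, dropout 0.25, 306 channels x 600 time points).
import torch
import torch.nn as nn

class EEGNet(nn.Module):
    def __init__(self, n_channels=306, n_times=600, n_classes=4,
                 F1=16, D=2, F2=32, dropout=0.25, sfreq=1000):
        super().__init__()
        self.temporal = nn.Sequential(                        # frequency-domain filtering
            nn.Conv2d(1, F1, (1, sfreq // 2), padding="same", bias=False),
            nn.BatchNorm2d(F1),
        )
        self.spatial = nn.Sequential(                         # space-domain filtering
            nn.Conv2d(F1, F1 * D, (n_channels, 1), groups=F1, bias=False),  # depthwise
            nn.BatchNorm2d(F1 * D),
            nn.ELU(),
            nn.AvgPool2d((1, 4)),
            nn.Dropout(dropout),
        )
        self.separable = nn.Sequential(                       # time-domain filtering
            nn.Conv2d(F1 * D, F1 * D, (1, 16), groups=F1 * D, padding="same", bias=False),
            nn.Conv2d(F1 * D, F2, (1, 1), bias=False),        # pointwise convolution
            nn.BatchNorm2d(F2),
            nn.ELU(),
            nn.AvgPool2d((1, 8)),
            nn.Dropout(dropout),
        )
        self.classifier = nn.Linear(F2 * (n_times // 4 // 8), n_classes)  # SoftMax applied by the loss

    def forward(self, x):                                     # x: (batch, 1, n_channels, n_times)
        x = self.separable(self.spatial(self.temporal(x)))
        return self.classifier(torch.flatten(x, start_dim=1))

model = EEGNet()
logits = model(torch.randn(8, 1, 306, 600))                   # -> (8, 4) class scores
```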

Chance level estimation

Because of our finite number of MEG samples, we used the bootstrap method (Felsenstein 1985; Johnson 1998; Kohavi 1995) to estimate the chance level instead of applying the theoretical 25% chance level for 4-class classification (Combrisson and Jerbi 2015).

Firstly, we randomly permuted the class labels of the original MEG samples.

Secondly, for a dataset with sample size n, we drew n samples with replacement to generate a training set. The probability of any given instance not being chosen after n draws is (1 − 1/n)^n ≈ e^(−1) ≈ 0.368 (Kohavi 1995). Thus, about 36.8% of the samples in the original dataset do not appear in the new dataset and were used as the testing set, while the remaining 63.2% of the original samples were used as the training set.

Thirdly, the accuracy was evaluated by the following formula:

Acc = 0.632 × Acc_test_set + 0.368 × Acc_train_set

Lastly, we repeated steps 2–3 100 times and took the 95th percentile of the empirical performance distribution established by randomly permuting the data as our chance level; one can then assert that the original classification is significant at p < 0.05 (Combrisson and Jerbi 2015).
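A minimal numerical sketch of this bootstrap chance-level procedure is given below, using a logistic-regression stand-in for the actual EEGNet classifier and random placeholder data.

```python
# Sketch of the bootstrap chance-level estimate described above.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

def bootstrap_chance_level(X, y, n_permutations=100):
    accs = []
    n = len(y)
    for _ in range(n_permutations):
        y_perm = rng.permutation(y)                       # step 1: shuffle labels
        train_idx = rng.choice(n, size=n, replace=True)   # step 2: sample with replacement
        test_idx = np.setdiff1d(np.arange(n), train_idx)  # ~36.8% of samples left out
        clf = LogisticRegression(max_iter=1000)
        clf.fit(X[train_idx], y_perm[train_idx])
        acc_train = accuracy_score(y_perm[train_idx], clf.predict(X[train_idx]))
        acc_test = accuracy_score(y_perm[test_idx], clf.predict(X[test_idx]))
        accs.append(0.632 * acc_test + 0.368 * acc_train)  # step 3: .632 estimate
    return np.percentile(accs, 95)                         # step 4: 95th percentile = chance level

# Placeholder data: 200 trials, 50 features, 4 classes
X_demo = rng.standard_normal((200, 50))
y_demo = rng.integers(0, 4, size=200)
print(bootstrap_chance_level(X_demo, y_demo, n_permutations=20))
```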

Spatiotemporal features

To investigate the MEG features in the temporal domain, we first compared the classification accuracies of MEG data segments at different time stages. Poststimulus MEG data (0–600 ms) in each trial were separated into 30 equal segments by a non-overlapping sliding window of 20 ms. Then, we identified the intervals corresponding to the components P1 (100–140 ms), N1/N170 (160–220 ms), P2a (225–295 ms), and P2b (320–400 ms) (see Fig. 1). The lengths of the above ERP intervals were determined by visual inspection of the mean ERP in the occipital sensors across subjects (Qin et al. 2016; Wang et al. 2012). The temporal waveforms were then used as temporal features and fed into the classifier.

Fig. 1 Durations of different ERP components (gray blocks represent the durations of P1, N1, P2a, and P2b) (Qin et al. 2016)
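For illustration, with the 1000 Hz sampling rate one sample corresponds to 1 ms, so the segmentation and component extraction described above reduce to simple array slicing; the data array in the sketch below is a random placeholder.

```python
# Sketch of the temporal segmentation: slicing 0-600 ms poststimulus data into
# 20-ms windows and extracting the ERP component intervals (1000 Hz sampling).
import numpy as np

X = np.random.randn(640, 306, 600)   # placeholder poststimulus trials: (trials, channels, times)

# 30 non-overlapping 20-ms segments
segments = [X[:, :, t:t + 20] for t in range(0, 600, 20)]

# ERP component intervals (ms) used as temporal features
components = {"P1": (100, 140), "N1": (160, 220), "P2a": (225, 295), "P2b": (320, 400)}
component_features = {name: X[:, :, start:end] for name, (start, end) in components.items()}
```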

To investigate the MEG spatial features, we grouped the 306 channels of the whole scalp into four regional sets (see Fig. 2) according to the sensor chip layout and the channel distribution model provided by the Brainstorm software: occipital lobe signals (72 channels), temporal lobe signals (78 channels), frontal lobe signals (78 channels), and parietal lobe signals (78 channels). These regional signal groups and the whole-scalp signals were used as spatial features for classification.

Fig. 2 Layout of sensor chips provided in the Elekta Neuromag TRIUX user's manual. Each number in the figure corresponds to a group of sensors including 1 magnetometer and 2 gradiometers

Spatiotemporal feature fusion

We used two parallel convolution architectures to fuse the spatial and temporal features in MEG (see Fig. 3). The first branch retained the original EEGNet structure. We fed it with MEG trials in the form of a C × T matrix, where C and T represent the number of channels and time points in the MEG data, respectively. A convolutional kernel of size 1 × Fs/2 (where Fs is the sampling frequency) was applied to obtain temporal feature maps corresponding to different frequencies. After that, a depthwise convolution kernel of size C × 1 was applied to extract spatial information and learn frequency-specific spatial filters. A separable convolution was finally applied to obtain temporal features.

Fig. 3 The proposed spatiotemporal feature fusion architecture

The second branch was constructed with the same sequence of operations as the first: 2D convolution, depthwise convolution, and separable convolution. The difference was that we transposed the input signal into a T × C matrix. Accordingly, we used a convolution kernel of size 1 × C/2 to obtain spatial feature maps corresponding to different frequencies. After that, we applied a depthwise convolution kernel of size T × 1 to extract temporal information and learn frequency-specific temporal filters. Then, we used a separable convolution to obtain spatial features.

After the features were extracted by the two convolution networks in parallel, we concatenated the temporal features and spatial features obtained from the two branches and fed the fused feature vector into a classifier consisting of a dense layer with a SoftMax activation function.
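A minimal PyTorch sketch of this two-branch fusion is given below. The kernel sizes follow the description above (1 × Fs/2 and C × 1 in the first branch; 1 × C/2 and T × 1 in the second), but the omission of the separable-convolution stage, the pooling size, and the filter counts are simplified assumptions rather than the authors' exact configuration.

```python
# Sketch of the parallel spatiotemporal fusion: one branch convolves the C x T
# matrix, the other its T x C transpose; flattened features are concatenated
# before a dense classifier (SoftMax applied by the loss function).
import torch
import torch.nn as nn

def branch(in_height, freq_kernel, F1=16, D=2, pool=8):
    """Temporal/spectral conv + depthwise conv over the full height + pooling + flatten."""
    return nn.Sequential(
        nn.Conv2d(1, F1, (1, freq_kernel), padding="same", bias=False),
        nn.BatchNorm2d(F1),
        nn.Conv2d(F1, F1 * D, (in_height, 1), groups=F1, bias=False),  # depthwise
        nn.BatchNorm2d(F1 * D),
        nn.ELU(),
        nn.AvgPool2d((1, pool)),
        nn.Flatten(),
    )

class FusionNet(nn.Module):
    def __init__(self, n_channels=306, n_times=600, n_classes=4, F1=16, D=2, pool=8):
        super().__init__()
        self.time_branch = branch(n_channels, 500, F1, D, pool)             # input C x T, kernel 1 x Fs/2
        self.space_branch = branch(n_times, n_channels // 2, F1, D, pool)   # input T x C, kernel 1 x C/2
        feat = F1 * D * (n_times // pool) + F1 * D * (n_channels // pool)
        self.classifier = nn.Linear(feat, n_classes)

    def forward(self, x):                                   # x: (batch, 1, C, T)
        f_time = self.time_branch(x)
        f_space = self.space_branch(x.transpose(2, 3))      # transposed input, T x C
        return self.classifier(torch.cat([f_time, f_space], dim=1))

model = FusionNet()
logits = model(torch.randn(4, 1, 306, 600))                 # -> (4, 4)
```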

Results

Classification accuracy at the individual level and across subjects

Classification accuracy at the individual level

The MEG data of 640 trials (160 animals, 160 tools, 160 scenes, and 160 faces) from each subject were pooled together and randomly divided into a training set and a testing set at a ratio of 7:3. Five-fold cross-validation was performed on the training data to select the best model, which was then evaluated on the test set. The test results of the binary and four-category classification for each subject are presented in Fig. 4. For each individual, the decoder obtained an average accuracy of 67.40% for 4-class classification, far above the mean single-subject chance level (25.39% ± 0.2%) estimated with the bootstrap method. However, Fig. 4 also shows differences across subjects: the accuracy varied over a large range, from 41.48% to 79.98%, when discriminating the four categories.
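A minimal sketch of this split-and-validate procedure is shown below, with a logistic-regression stand-in for EEGNet and random placeholder data, assuming scikit-learn is available.

```python
# Sketch of the individual-level evaluation: stratified 7:3 train/test split,
# then 5-fold cross-validation on the training set for model selection.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split, StratifiedKFold

rng = np.random.default_rng(0)
X = rng.standard_normal((640, 300))      # placeholder features (e.g., flattened MEG trials)
y = np.repeat(np.arange(4), 160)         # 160 trials per category

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
val_scores = []
for tr, va in skf.split(X_train, y_train):
    clf = LogisticRegression(max_iter=1000).fit(X_train[tr], y_train[tr])
    val_scores.append(clf.score(X_train[va], y_train[va]))

final = LogisticRegression(max_iter=1000).fit(X_train, y_train)   # refit selected model
print(np.mean(val_scores), final.score(X_test, y_test))           # CV score, held-out test accuracy
```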

Fig. 4 Binary and four-class classification results at the individual level and across-subject level

In binary classification, the average classification accuracy across the 17 subjects was 84.21%. The accuracies between faces and the other three visual stimulus categories (Face vs. Animal: 85.07%, Face vs. Scene: 90.07%, Face vs. Tool: 89.48%) were higher than the accuracies among the other three category pairs (Tool vs. Animal: 72.72%, Tool vs. Scene: 84.74%, Scene vs. Animal: 83.15%). The average accuracy of the tool-animal pair was the lowest.

Classification accuracy at the across-subject level

We grouped all the MEG samples of the 17 subjects (10,880 samples in total) into 17 subsets by subject (640 samples per subject). We implemented the leave-one-subject-out (LOOS) approach with an extended nested cross-validation procedure (Hosseini et al. 2020). Here, the inner cross-validations are run within an outer cross-validation procedure, with a different portion of the data (the MEG samples from one subject) serving as the outer hold-out set on each outer iteration. The best parameters and hyperparameters were determined in each inner cross-validation. In the inner cross-validation, one of the remaining 16 subjects was selected as the validation set, and the MEG samples of the other 15 subjects were pooled to train the classifier. The whole procedure thus comprised 17 outer iterations (each subject was tested once), each with a 16-fold inner cross-validation. Model performance was evaluated as the average accuracy computed across all of the hold-out sets.
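The nested leave-one-subject-out loop can be sketched as follows; `train_eegnet` and `evaluate` are hypothetical stand-ins for model training and accuracy evaluation, and `subject_data` is assumed to map subject IDs to (X, y) arrays.

```python
# Sketch of the LOOS procedure with a nested inner loop, as described above.
import numpy as np

def loos_evaluation(subject_data, train_eegnet, evaluate):
    subjects = sorted(subject_data)
    outer_scores = []
    for test_subj in subjects:                               # 17 outer iterations
        inner_subjects = [s for s in subjects if s != test_subj]
        inner_scores = []
        for val_subj in inner_subjects:                      # 16 inner folds
            train_subjs = [s for s in inner_subjects if s != val_subj]
            X_tr = np.concatenate([subject_data[s][0] for s in train_subjs])
            y_tr = np.concatenate([subject_data[s][1] for s in train_subjs])
            model = train_eegnet(X_tr, y_tr)                 # hyperparameters tuned here
            inner_scores.append(evaluate(model, *subject_data[val_subj]))
        # refit on all 16 inner subjects with the configuration chosen above
        X_all = np.concatenate([subject_data[s][0] for s in inner_subjects])
        y_all = np.concatenate([subject_data[s][1] for s in inner_subjects])
        final_model = train_eegnet(X_all, y_all)
        outer_scores.append(evaluate(final_model, *subject_data[test_subj]))
    return float(np.mean(outer_scores))                      # average over hold-out subjects
```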

The LOOS results are shown in Fig. 4. The across-subject decoding model using EEGNet obtained an average accuracy of 74.75% for binary classification and 52.39% for four-class classification. Furthermore, from the results of the binary classifications, we noted that the classifier performed better when discriminating faces from other objects, with an average accuracy of 77.57%. It obtained the lowest accuracy, 60.97%, when distinguishing animals from tools. This finding was consistent with the classification results at the individual level.

Comparison of spatiotemporal features

Temporal feature results

To explore the MEG temporal features, we plotted the fluctuation of the averaged 4-class decoding performance across the 17 subjects over time in Fig. 5. The decoding accuracy started to rise rapidly at approximately 100 ms after stimulus onset, peaked at 150–200 ms, and then decreased over time.

Fig. 5 Classification accuracies for different time intervals from 0 to 600 ms after stimulus onset

This suggested that brain activities associated with object category perception may start as early as the ERP component N1. The highest accuracy of 48.46% at 160 ms after stimulus onset indicated that the classifier effectively used the discriminative information contained in MEG features from the N170 component.

Spatial feature results

To explore the contribution of different brain regions to image categorization, we compared the classification accuracy of each brain region (frontal, temporal, parietal, and occipital lobes) and of the whole brain. As shown in Table 2, the occipital lobe, which is related to visual processing, achieved the best accuracy second only to the whole brain, while the frontal lobe, far from the visual areas, had the lowest decoding accuracy. In addition, decoding the occipital lobe features took the least time.

Table 2.

Classification performance of different brain regions

| Brain regions | Classification accuracy | Computation time |
| Frontal lobe | 42.78% (± 0.91%) | 0 h, 23 min, 58 s |
| Parietal lobe | 47.89% (± 0.53%) | 0 h, 23 min, 54 s |
| Temporal lobe | 53.89% (± 0.52%) | 0 h, 23 min, 23 s |
| Occipital lobe | 59.56% (± 0.49%) | 0 h, 21 min, 35 s |
| Whole-brain | 62.19% (± 0.47%) | 3 h, 19 min, 43 s |

Classification with spatiotemporal feature fusion

The 4-class recognition accuracy of the fusion method was higher than that obtained using raw data or transposed data alone as input, which extracted only temporal features or only spatial features, respectively (Table 3).

Table 3.

Classification performance of spatiotemporal feature fusion

| Feature | Classification accuracy | Computing time |
| Raw data | 60.62% (± 0.20%) | 3 h, 15 min, 01 s |
| Transpose data | 66.25% (± 0.54%) | 1 h, 38 min, 32 s |
| Spatiotemporal feature fusion | 69.19% (± 0.38%) | 1 h, 55 min, 16 s |

Discussion

Further discussion on classification results

Previous studies have used multivariate pattern analysis (MVPA) to decode visual categories from fMRI data and from ERPs in EEG (Simanova et al. 2012) or MEG data. Wang et al. (2012) employed Fisher discriminant analysis and LDA to classify spatial features extracted from single ERP components. Qin et al. (2016) used a multiple-kernel support vector machine (SVM) to design classifiers for single and concatenated ERP features. Although these methods decoded visual image categories with good accuracy at the individual level, their performance depended heavily on handcrafted feature extraction. Beyond that, it is a major challenge to build an across-subject model for neural decoding.

Our study shows that the deep learning method has excellent capabilities for neural decoding. First, EEGNet succeeded in building a common decoder model for the 17 subjects, covering both feature extraction and classification. In addition, as a deep learning network, EEGNet extracts abstract features from raw MEG data to decode visual categories.

In the intersubject analysis, the classification results of the leave-one-subject-out validation showed that EEGNet can effectively learn generic discriminating features across subjects rather than subject-specific features, although individual differences between subjects are inevitable because each subject has particular characteristics. Moreover, our results verified that neural responses to face images and non-face images can be discriminated with high classification accuracy. Tools and scenes were difficult to discriminate from each other. We hypothesized that scene-related neural activities are mainly distributed in the parahippocampal area (Jonas et al. 2017), which lies deep in the cortex, so that its MEG response is weak. The consistency of our results with previous literature (Dima et al. 2018a, 2018b; Marti and Dehaene 2017; Ramkumar et al. 2013) indicates that our common model is effective for compensating for differences between subjects and capturing globally discriminating information. With stacked layers, the deep neural network allows learning hierarchical representations and complex nonlinear transformations from the input data in an adaptive way (Lecun et al. 2015). Moreover, the parameters of the convolutional layers in our neural network are shared across neurons and inputs, which is beneficial for learning shared features. Therefore, the deep neural network is a promising tool for identifying common components and extracting high-level abstractions across different subjects.

The capability of extracting features from raw data is attributable to the hierarchical nature of the deep neural network. It allows the iterative fine-tuning of learned parameters; thus, less prior expert knowledge is required. This automatic, end-to-end fashion does not rely on handcrafted features and simplifies the processing pipeline. In addition, as the amount of data increases, this data-driven network training allows learning more features than manual selection would. In neural decoding, deep neural networks are usually designed to learn temporal and spatial filters in successive convolution layers, either a spatial filter following a temporal filter or vice versa; EEGNet belongs to the first type.

Regarding abstract feature representation, the EEGNet model provides an EEG-like way to understand the process of MEG feature extraction. The compact and interpretable CNN architecture projects the outputs of each step into well-known EEG feature spaces, such as time-frequency and spatial filtering (Lawhern et al. 2016). The two-step convolutional sequence imitates the filter-bank common spatial pattern (FBCSP) algorithm. In the first layer, convolution is conducted solely along the time axis, where the kernel acts as a temporal filter that extracts different bandpass frequencies of the MEG signals. Then, the depthwise convolution in the second layer is conducted along the sensors to learn several spatial filters for each feature map.

MEG signals are characterized by a low signal-to-noise ratio (SNR) and high-dimensional spatiotemporal structure (Norman et al. 2006), so it is difficult to decipher visual categories from single-trial MEG with a common neural decoder across different subjects; subject differences have often been used to explain the poor performance of cross-subject classifiers. Here, our cross-subject decoder illustrates that a deep neural network can automatically learn robust and generic feature representations across subjects. This may open up new windows for MEG-based neural decoding.

Hyperparameters and model performance

We explored the influence of hyperparameters on the performance of EEGNet using several commonly used metrics, including classification accuracy, model robustness, and computation time. First, our experiments demonstrated that the learning rate affected model stability: the classification accuracy fluctuated drastically when EEGNet was trained with a large learning rate (0.05 to 0.1), and stability improved greatly when the learning rate was relatively small (less than 0.01).

In addition, we compared the performance of the model with batch sizes of 8, 16, 32, 64, and 128. The results showed that our model was not sensitive to batch size. Because of the small size of the MEG dataset, the candidate batch sizes were limited to a certain range, and our model achieved the best performance with a batch size of 32.

Moreover, we investigated configurations of the EEGNet architecture by varying three hyperparameters: the number of temporal filters F1, the number of pointwise filters F2, and the number of spatial filters per temporal filter D. We compared the performance of the model when F1 was set to 4, 8, 16, and 32. According to Table 4, the classification accuracy gradually improved as F1 increased and reached its highest value of 60.65% (± 0.51%); when the number of convolution kernels was greater than 16, the accuracy stopped growing. We also observed that increasing the number of convolution kernels led to a longer computation time. The model with 16 frequency-domain filtering convolution kernels was therefore the best-performing model in this study. In most studies on neural decoding, the brain is divided into the left and right hemispheres or into the occipital, temporal, parietal, and frontal lobes for analysis. Therefore, we set the number of spatial filters to 2, 4, and 8. The results suggested that the number of spatial filters had little effect on classification accuracy, yet increasing it caused a longer calculation time. Here, we set D to 2. Although F2 can take any value in principle, F2 < D ∗ F1 denotes a compressed representation, whereas F2 > D ∗ F1 denotes an overcomplete representation (Lawhern et al. 2016); therefore, we set F2 to 32 (F2 = D ∗ F1). By testing several common dropout rates (0.5, 0.25, 0.125), we found that the smallest value was suitable for our small MEG dataset. To improve both within-subject and cross-subject classification accuracy, we will have to refine the model further.

Table 4.

The effect of the number of 2D convolution kernels on model performance

| F1 (D = 2) | Classification accuracy | Calculation time |
| 4 | 57.23% (± 0.49%) | 1 h, 16 min, 01 s |
| 8 | 59.68% (± 0.50%) | 1 h, 54 min, 16 s |
| 16 | 60.65% (± 0.51%) | 3 h, 15 min, 01 s |
| 32 | 58.74% (± 0.7%) | 5 h, 54 min, 49 s |
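For illustration, such a sweep can be organized as below, assuming a parametrized EEGNet implementation like the sketch in the Methods section is importable; `train_and_score` is a hypothetical helper wrapping the training and evaluation of one configuration.

```python
# Sketch of the hyperparameter sweep described above; both imports from
# eegnet_sketch are hypothetical stand-ins for the actual implementation.
import itertools
from eegnet_sketch import EEGNet, train_and_score   # hypothetical module

F1_values = [4, 8, 16, 32]          # number of 2D (temporal) convolution kernels
D_values = [2, 4, 8]                # spatial filters per temporal filter
dropout_values = [0.5, 0.25, 0.125]

results = {}
for F1, D, dropout in itertools.product(F1_values, D_values, dropout_values):
    model = EEGNet(F1=F1, D=D, F2=F1 * D, dropout=dropout)   # F2 = D * F1
    results[(F1, D, dropout)] = train_and_score(model)       # accuracy of this configuration
best_F1, best_D, best_dropout = max(results, key=results.get)
```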

Spatial and temporal features in classification

Over the past decade, several studies have investigated the neural response to visual stimuli over time. Van et al. (2013) found that classification accuracy peaked within the first 100–200 ms after visual stimulation. Marti and Dehaene (2017) revealed that a target picture could be detected from MEG data with maximum classification accuracy at 160 ms in a rapid serial visual presentation task. Liu et al. (2020a) showed that decoding accuracies of brain activity patterns peaked in the 0–200 ms interval using short-term dynamic FC patterns of MEG data. In addition, the aforementioned studies all found that brain activity related to category perception starts approximately 100 ms after the visual stimulus. In line with these conclusions, our MEG results showed that EEGNet classification accuracy started to rise at approximately 100 ms after stimulus onset and peaked at 160 ms.

In addition, we investigated the spatial distribution of category-related features by classifying neural representations in different brain regions. Our 4-category classification results demonstrate that the deep learning method can extract spatial patterns distributed in local brain regions from MEG responses and discriminate the categories of stimulus images. The whole-scalp sensors achieved the best accuracy but required the longest computation time. Moreover, the sensors over the occipital lobe achieved the highest accuracy among the four separate brain regions, suggesting that our model could classify the MEG response features in this region more effectively than those in other regions. Previous studies have identified areas in the occipital lobe that are biologically plausible for category perception. The inferior occipital gyrus has been shown to be related to face processing (David 2010; Gauthier et al. 2000), and the superior occipital gyrus has been shown to be activated by tools (Almeida et al. 2010) or at least tool-shaped objects (Sakuraba et al. 2012). Given its central involvement in object recognition (Liu et al. 2020a; Van et al. 2013; Wang et al. 2012), it is likely that this region contains the most substantial information for discriminating the 4-category objects. Moreover, the occipital sensors not only achieved satisfactory accuracy but also greatly reduced the computation time.

In conclusion, we have confirmed the existence of specific neural patterns over time and space for visual object categorization (Wang et al. 2011). Visual object categorization is a dynamic process originating from a variety of cortical regions, which highlights that the related information lies not only in temporal variation but also in spatial correlations among sensors. Thus, the extraction and fusion of stimulus-related latent neural feature representations in the temporal and spatial domains are of great significance for neural decoding. Liu et al. (2020b) used FC patterns at three different spatial scales (whole brain, visual cortex, and single ROI), extracted from fMRI time courses within time intervals, to decode disparity categories in 3D images; the results showed that FC patterns at different spatial scales contain different amounts of information. Gerson et al. (2006) described a real-time EEG-based brain-computer interface (BCI) system that can detect the stereotypical spatiotemporal response of visual recognition events elicited during rapid serial visual presentation (RSVP). They used a spatial filter to extract spatial features and averaged the data in the time dimension, which essentially extracts only spatial features and does not make full use of the temporal features of EEG data. While previous studies on visual decoding have investigated the spatial and temporal representations in the human brain, the implicit temporal and spatial relationships need further identification. Our study used EEGNet to extract both spatial and temporal features and explore their relationships.

To further investigate discriminative information over time and space, we examined the influence of signal separation and combination on decoding accuracy. The classification results for MEG features whose spatial or temporal structure was disrupted indicated that either type of disruption led to lower classification accuracy. This can be explained by the destruction of discriminative information and further verified that category-related information is distributed in both time and space. Moreover, the time series corresponding to four ERP components (N100: 80–120 ms, N170: 150–190 ms, P2a: 200–240 ms, and P2b: 280–320 ms) from the 72 sensors over the occipital lobe were selected, concatenated along the channel dimension, and then fed into the model in the intersubject analysis. The results showed that the model using the fused features could discriminate the four types of visual objects with higher accuracy and in less time than the model using raw data. This indicates that spatiotemporal features are complementary and that their combination provides more effective discriminatory information.

Inspired by this, we further fused spatial and temporal features using two parallel architectures based on EEGNet. In this fusion architecture, convolution operations are performed in time and space simultaneously, and high-level temporal and spatial features are then obtained and fused automatically. Compared with using the temporal or spatial features alone, this method combines temporal and spatial features at the same time. The fusion method uses the raw MEG data and its transpose as two inputs, which contain intrinsic and complete information in both the time and space dimensions. Moreover, the parallel convolution architecture allows spatial and temporal features to be extracted without interfering with each other, which helps to preserve the distribution of distinguishable information over time and space. These reasons are perhaps why the fusion method improved the classification results. Altogether, the fusion of temporal and spatial features is a promising way to improve the decoding accuracy of visual object categories. Presently, we still do not have a clear understanding of how spatiotemporal features complement each other and how they relate to object representations. In future studies, more effort is needed to investigate the relationship between the spatial and temporal features of MEG data so as to fuse them more effectively.

Conclusion

In this study, we successfully used a deep neural network to decode different natural image categories from MEG data and provided reliable classifiers for both intrasubject and intersubject analyses. Moreover, we identified category-related features over time and space and showed that the MEG spatial-temporal structure influences classification performance. Finally, we proposed a parallel architecture to fuse spatial and temporal features, which improves decoding accuracy. Our study confirms the potential of deep learning in MEG-based neural decoding, and advanced deep neural networks could be designed to further improve classification performance. In the future, more tests should be carried out to validate the performance of our model in a wider range of MEG-related applications.

In the current study, it is hard to explain which of the spatial or temporal features extracted by EEGNet contribute most to the classification performance. Our results in Table 2 showed that the occipital lobe surpasses other scalp regions in classification accuracy, and according to the classification results in Fig. 5, the classifier reaches its peak when the MEG time courses contain the interval of 150–200 ms after stimulus onset. Taken together, these results suggest that the N1 component plays an important role in the task of image categorization. In the future, it is worth further exploring the classification results obtained from component features (such as P1, N1, and P3) extracted from specific time intervals and scalp regions in single-trial MEG.

Another limitation of our study is that our results only validate that fusing the temporal and spatial features extracted by EEGNet from the MEG matrix and its transpose can improve classification accuracy. We will also explore other spatial and temporal feature fusion methods, for example, employing the generic additive network (gaNet-C) approach to fuse the temporal and spatial features in MEG signals (Lopez-Martin et al. 2019).

Another interesting finding is that the 4-category classification accuracy decreased by 15–20 percentage points compared with the binary classification task. We speculate that the classification results are affected by which categories of stimulus images must be discriminated. For example, face images have been demonstrated to be easy to discriminate from non-face images and thus attain higher classification accuracies. Hence, the number of categories and the object categories of the images likely have a significant impact on classification accuracy. The potential classification performance of EEGNet on MEG data elicited by images of other common objects still needs to be investigated.

Acknowledgements

This work was funded by the General Program (61977010) of the National Natural Science Foundation of China. The authors would also like to thank all anonymous participants of the MEG experiments.


References

1. Alexander CA, Halgren E, Marinkovic K, Cash SS. Decoding word and category-specific spatiotemporal representations from MEG and EEG. Neuroimage. 2011;54(4):3028–3039. doi: 10.1016/j.neuroimage.2010.10.073.
2. Almeida J, Mahon BZ, Caramazza A. The role of the dorsal visual processing stream in tool identification. Psychol Sci. 2010;21(6):772–778. doi: 10.1177/0956797610371343.
3. Bai L, Ma H, Huang Y, Luo Y. The development of native Chinese affective picture system–a pretest in 46 college students. Chin Ment Health J. 2005 doi: 10.1016/j.molcatb.2005.02.001.
4. Blair K, Marcos PG, Hyung-Suk K, Norcia AM, Patrick S, Joseph N. A representational similarity analysis of the dynamics of object processing using single-trial EEG classification. PLoS ONE. 2015 doi: 10.1371/journal.pone.0135697.
5. Cetiner I, Var AA, Cetiner H. Classification of knot defect types using wavelets and KNN. Elektron Elektrotech. 2016;22(6):67–72. doi: 10.5755/j01.eie.22.6.17227.
6. Combrisson E, Jerbi K. Exceeding chance level by chance: the caveat of theoretical chance levels in brain signal classification and statistical assessment of decoding accuracy. J Neurosci Methods. 2015;250:126–136. doi: 10.1016/j.jneumeth.2015.01.010.
7. Contini EW, Wardle SG, Carlson TA. Decoding the time-course of object recognition in the human brain: from visual features to categorical decisions. Neuropsychologia. 2017 doi: 10.1016/j.neuropsychologia.2017.02.013.
8. Contini EW, Goddard E, Grootswagers T, Williams M, Carlson T. A humanness dimension to visual object coding in the brain. Neuroimage. 2020 doi: 10.1016/j.neuroimage.2020.117139.
9. Dalponte M, Bovolo F, Bruzzone L. Automatic selection of frequency and time intervals for classification of EEG signals. Electron Lett. 2007;43:1406–1408. doi: 10.1049/el:20072428.
10. Dash D, Ferrari P, Heitzman D, Wang J (2019) Decoding speech from single trial MEG signals using convolutional neural networks and transfer learning. In: International conference of the IEEE engineering in medicine and biology society (EMBC)
11. Dash D, Ferrari P, Wang J. Decoding imagined and spoken phrases from non-invasive neural (MEG) signals. Front Neurosci. 2020 doi: 10.3389/fnins.2020.00290.
12. David FN. Decoding of faces and face components in face-sensitive human visual cortex. Front Psychol. 2010 doi: 10.3389/fpsyg.2010.00028.
13. Dima DC, Gavin P, Eirini M, Jiaxiang Z, Singh KD (2018a) Spatiotemporal dynamics in human visual cortex rapidly encode the emotional content of faces. Hum Brain Mapp
14. Dima DC, Gavin P, Singh KD. Spatial frequency supports the emergence of categorical representations in visual cortex during natural scene perception. Neuroimage. 2018;179:102–116. doi: 10.1016/j.neuroimage.2018.06.033.
15. Felsenstein J. Confidence limits on phylogenies: an approach using the bootstrap. Evolution. 1985;39:783–791. doi: 10.1111/j.1558-5646.1985.tb00420.x.
16. Gauthier I, Tarr MJ, Moylan J, Skudlarski P, Gore JC, Anderson AW. The fusiform "face area" is part of a network that processes faces at the individual level. J Cogn Neurosci. 2000;12:495–504. doi: 10.1162/089892900562165.
17. Gerson AD, Parra LC, Sajda P. Cortically coupled computer vision for rapid image search. IEEE Trans Neural Syst Rehabil Eng. 2006;14(2):174–179. doi: 10.1109/TNSRE.2006.875550.
18. Gross J. Magnetoencephalography in cognitive neuroscience: a primer. Neuron. 2019;104(2):189–204. doi: 10.1016/j.neuron.2019.07.001.
19. Horikawa T, Tamaki M, Miyawaki Y, Kamitani Y. Neural decoding of visual imagery during sleep. Science. 2013;340(6132):639–642. doi: 10.1126/science.1234330.
20. Huang Z, Yu T (2019) Cross-subject MEG decoding using 3D convolutional neural networks. In: 2019 world robot conference symposium on advanced robotics and automation
21. Huang W, Xue Y, Hu L, Liuli H. S-EEGNet: electroencephalogram signal classification based on a separable convolution neural network with bilinear interpolation. IEEE Access. 2020 doi: 10.1109/ACCESS.2020.3009665.
22. Johnson RW (1998) An introduction to the bootstrap. Teach Stat
23. Jonas J, Brissart H, Hossu G, Colnat-Coulbois S, Maillard L. A face identity hallucination (palinopsia) generated by intracerebral stimulation of the face-selective right lateral fusiform cortex. Cortex. 2017;99:296–310. doi: 10.1016/j.cortex.2017.11.022.
24. Kamitani Y, Tong F. Decoding the visual and subjective contents of the human brain. Nat Neurosci. 2005;8(5):679–685. doi: 10.1038/nn1444.
25. Kay KN, Gallant JL. I can see what you see. Nat Neurosci. 2009;12(3):245. doi: 10.1038/nn0309-245.
26. Kohavi R (1995) A study of cross-validation and bootstrap for accuracy estimation and model selection. In: International joint conference on artificial intelligence
27. Kostas D, Pang EW, Rudzicz F. Machine learning for MEG during speech tasks. Sci Rep. 2019 doi: 10.1038/s41598-019-38612-9.
28. Lawhern VJ, Solon AJ, Waytowich NR, Gordon SM, Hung CP, Lance BJ. EEGNet: a compact convolutional network for EEG-based brain-computer interfaces. J Neural Eng. 2016 doi: 10.1088/1741-2552/aace8c.
29. Lecun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521:436. doi: 10.1038/nature14539.
30. Lee SH, Lee M, Lee SW. Neural decoding of imagined speech and visual imagery as intuitive paradigms for BCI communication. IEEE Trans Neural Syst Rehabil Eng. 2020;28:2647–2659. doi: 10.1109/TNSRE.2020.3040289.
31. Liu C, Kang Y, Zhang L, Zhang J. Rapidly decoding image categories from MEG data using a multivariate short-time FC pattern analysis approach. IEEE J Biomed Health Inform. 2020 doi: 10.1109/JBHI.2020.3008731.
32. Liu C, Li Y, Song S, Zhang J. Decoding disparity categories in 3-dimensional images from fMRI data using functional connectivity patterns. Cogn Neurodyn. 2020 doi: 10.1007/s11571-019-09557-6.
33. Lopez-Martin M, Carro B, Sanchez-Esguevillas A. IoT type-of-traffic forecasting method based on gradient boosting neural networks. Futur Gener Comput Syst. 2019 doi: 10.1016/j.future.2019.12.013.
34. Lopez-Martin M, Nevado A, Carro B. Detection of early stages of Alzheimer's disease based on MEG activity with a randomized convolutional neural network. Artif Intell Med. 2020;107:101924. doi: 10.1016/j.artmed.2020.101924.
35. Marti S, Dehaene S. Discrete and continuous mechanisms of temporal selection in rapid visual streams. Nat Commun. 2017 doi: 10.1038/s41467-017-02079-x.
36. Hosseini M, Powell M, Collins J, Callahan-Flintoft C, Jones W, Bowman H, Wyble B. I tried a bunch of things: the dangers of unexpected overfitting in classification of brain data. Neurosci Biobehav Rev. 2020;119:456–467. doi: 10.1016/j.neubiorev.2020.09.036.
37. Mitchell TM, Shinkareva SV, Carlson A, Chang KM, Malave VL, Mason RA, Just MA. Predicting human brain activity associated with the meanings of nouns. Science. 2008 doi: 10.1126/science.1152876.
38. Muukkonen I, Ölander K, Numminen J, Salmela VR. Spatio-temporal dynamics of face perception. Neuroimage. 2020 doi: 10.1101/550038.
39. Norman KA, Polyn SM, Detre GJ, Haxby JV. Beyond mind-reading: multi-voxel pattern analysis of fMRI data. Trends Cogn Sci. 2006;10(9):424–430. doi: 10.1016/j.tics.2006.07.005.
40. Okada T, Tanaka S, Nakai T, Nishizawa S, Konishi J. Naming of animals and tools: a functional magnetic resonance imaging study of categorical differences in the human brain areas commonly used for naming visually presented objects. Neurosci Lett. 2000;296:33–36. doi: 10.1016/S0304-3940(00)01612-8.
41. Philiastides MG. Neural representation of task difficulty and decision making during perceptual categorization: a timing diagram. J Neurosci. 2006;26(35):8965–8975. doi: 10.1523/jneurosci.1655-06.2006.
42. Qadri S, Dave H, Das R, Alick-Lindstrom S (2021) Beyond the wada: an updated approach to pre-surgical language and memory testing. Epilepsy Res
43. Qin Y, et al. Classifying four-category visual objects using multiple ERP components in single-trial ERP. Cogn Neurodyn. 2016;10:275–285. doi: 10.1007/s11571-016-9378-0.
44. Ramkumar P, Jas M, Pannasch S, Hari R, Parkkonen L. Feature-specific information processing precedes concerted activation in human visual cortex. J Neurosci. 2013;33(18):7691–7699. doi: 10.1523/JNEUROSCI.3905-12.2013.
45. Riyad M, Khalil M, Adib A. MI-EEGNET: a novel convolutional neural network for motor imagery classification. J Neurosci Methods. 2020 doi: 10.1016/j.jneumeth.2020.109037.
46. Sakuraba S, Sakai S, Yamanaka M, Yokosawa K, Hirayama K. Does the human dorsal stream really process a category for tools? J Neurosci. 2012;32(11):3949–3953. doi: 10.1523/JNEUROSCI.3973-11.2012.
47. Seeliger K, Fritsche M, Güçlü U, Schoenmakers S, Gerven M. Convolutional neural network-based encoding and decoding of visual object recognition in space and time. Neuroimage. 2017 doi: 10.1016/j.neuroimage.2017.07.018.
48. Sen S, Daimi SN, Watanabe K, Bhattacharya J, Saha G (2018) A machine learning approach to decode mental states in bistable perception. In: 2017 international conference on information technology (ICIT)
49. Simanova I, Gerven MV, Oostenveld R, Hagoort P. Identifying object categories from event-related EEG: toward decoding of conceptual representations. PLoS ONE. 2012;5(12):e14465. doi: 10.1371/journal.pone.0014465.
50. Tadel F, Baillet S, Mosher JC, Pantazis D, Leahy RM. Brainstorm: a user-friendly application for MEG/EEG analysis. Comput Intell Neurosci. 2011 doi: 10.1155/2011/879716.
51. Taulu S, Simola J. Spatiotemporal signal space separation method for rejecting nearby interference in MEG measurements. Phys Med Biol. 2006;51(7):1759–1768. doi: 10.1289/ehp.114-a403a.
52. Vahid A, Bluschke A, Roessner V, Stober S, Beste C. Deep learning based on event-related EEG differentiates children with ADHD from healthy controls. J Clin Med. 2019;8(7):1055. doi: 10.3390/jcm8071055.
53. Vahid A, Mückschel M, Stober S, Stock AK, Beste C. Applying deep learning to single-trial EEG data provides evidence for complementary theories on action control. Commun Biol. 2020 doi: 10.1038/s42003-020-0846-z.
54. Van de Nieuwenhuijzen ME, Backus AR, Bahramisharif A, Doeller CF, Jensen O, Van Gerven MA. MEG-based decoding of the spatiotemporal dynamics of visual category perception. Neuroimage. 2013;83:1063–1073. doi: 10.1016/j.neuroimage.2013.07.075.
55. Wang C, Xiaoping HU, Li Y, Shi X, Jiacai Z. Spatio-temporal pattern analysis of single-trial EEG signals recorded during visual object recognition. Sci China. 2011;54(012):2499–2507. doi: 10.1007/s11432-011-4507-1.
56. Wang C, Xiong S, Hu X, Yao L, Zhang J. Combining features from ERP components in single-trial EEG for discriminating four-category visual objects. J Neural Eng. 2012;9(5):056013. doi: 10.1088/1741-2560/9/5/056013.
57. Willett FR, Avansino DT, Hochberg LR, Henderson JM, Shenoy KV (2020) High-performance brain-to-text communication via imagined handwriting. BioRxiv 10.1101/2020.07.01.183384
58. Zhang J, Sudre G, Xin L, Wei W, Bagic A. Clustering linear discriminant analysis for MEG-based brain computer interfaces. IEEE Trans Neural Syst Rehabil Eng. 2011;19(3):221–231. doi: 10.1109/TNSRE.2011.2116125.
59. Zheng L, Liao P, Shen L, Sheng J, Gao JH. EMS-Net: a deep learning method for autodetecting epileptic magnetoencephalography spikes. IEEE Trans Med Imaging. 2019 doi: 10.1109/TMI.2019.2958699.
60. Zubarev I, Zetter R, Halme HL, Parkkonen L. Adaptive neural network classifier for decoding MEG signals. Neuroimage. 2019 doi: 10.1016/j.neuroimage.2019.04.068.
