Table 1.
Design aspect | Our choice | Variants | Motivation |
---|---|---|---|
Activation functions; pooling mode | ELU; max pooling | Square or ReLU; mean pooling | We expected these choices to be sensitive to the type of feature (e.g., signal phase or power), as squaring followed by mean pooling yields the mean power (given a zero-mean signal; a numeric check follows the table). Different features may play different roles in the low-frequency components vs. the higher frequencies (see the section "Datasets and Preprocessing"). |
Regularization and intermediate normalization | Dropout + batch normalization + a new tied loss function (see text for explanations) | Only batch normalization; only dropout; neither; without the tied loss | We wanted to investigate whether recent deep learning advances improve accuracies and to check how much regularization is required. |
Factorized temporal convolutions | One 10 × 1 convolution per convolutional layer | Two 6 × 1 convolutions per convolutional layer | Factorized convolutions are used by other successful ConvNets [Szegedy et al., 2015]. |
Split vs. one-step convolution | Split convolution in the first layer (see the section "Deep ConvNet for raw EEG signals"; sketched after the table) | One-step convolution in the first layer | Factorizing the convolution into spatial and temporal parts may improve accuracies for the large number of EEG input channels (compared with the three RGB color channels of typical image datasets). |
Design choices we evaluated for our convolutional networks. "Our choice" is the option we used when evaluating ConvNets in the remainder of this article, for example, against FBCSP. Note that these design choices had not been evaluated in prior work; see Supporting Information, Section A.1.
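To make the motivation in the first row concrete, the following minimal NumPy check (our illustration, not code from the study) verifies that a squaring nonlinearity followed by mean pooling computes the mean power of a zero-mean signal, which equals its variance:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(1000)
x -= x.mean()                 # enforce a zero-mean signal, as in the motivation

squared = x ** 2              # squaring "activation"
mean_pooled = squared.mean()  # mean pooling over the full window

# For a zero-mean signal, mean power equals variance.
print(mean_pooled, x.var())
print(np.allclose(mean_pooled, x.var()))  # True
```

With ELU and max pooling instead, this equivalence to band power no longer holds, which is why the two choices were expected to favor different feature types.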
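The PyTorch sketch below (again our illustration, not the authors' released implementation; the filter count of 25, pooling width of 3, and dropout rate of 0.5 are assumptions for the example) shows how the options in the "Our choice" column combine in the first block of such a network, including the split of the first-layer convolution into a temporal part and a spatial part across electrodes:

```python
import torch
import torch.nn as nn

n_chans = 44  # assumed number of EEG electrodes

# Input layout assumed here: (batch, 1, time, EEG channels).
first_block = nn.Sequential(
    # Split first-layer convolution: temporal filtering per electrode ...
    nn.Conv2d(1, 25, kernel_size=(10, 1)),
    # ... followed by a spatial filter across all electrodes
    # (the one-step variant would use a single (10, n_chans) kernel).
    nn.Conv2d(25, 25, kernel_size=(1, n_chans), bias=False),
    nn.BatchNorm2d(25),                # intermediate normalization
    nn.ELU(),                          # chosen activation (variants: square, ReLU)
    nn.MaxPool2d(kernel_size=(3, 1)),  # chosen pooling (variant: mean)
    nn.Dropout(p=0.5),                 # dropout regularization
)

x = torch.randn(2, 1, 1000, n_chans)   # 2 trials, 1000 time samples
print(first_block(x).shape)            # torch.Size([2, 25, 330, 1])
```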