Table 1.
Design aspect | Our choice | Variants | Motivation |
---|---|---|---|
Activation functions; pooling mode | ELU; max pooling | Square or ReLU; mean pooling | We expected these choices to be sensitive to the type of feature (e.g., signal phase or power), as squaring followed by mean pooling yields the mean power (given a zero-mean signal; a numeric check follows the table). Different features may play different roles in the low-frequency components vs. the higher frequencies (see the section "Datasets and Preprocessing"). |
Regularization and intermediate normalization | Dropout + batch normalization + a new tied loss function (see text for explanations) | Only batch normalization; only dropout; neither; without the tied loss | We wanted to investigate whether recent deep learning advances improve accuracies and to check how much regularization is required. |
Factorized temporal convolutions | One 10 × 1 convolution per convolutional layer | Two 6 × 1 convolutions per convolutional layer | Factorized convolutions are used by other successful ConvNets [Szegedy et al., 2015]. |
Split vs. one-step convolution | Split convolution in the first layer (see the section "Deep ConvNet for raw EEG signals"; sketched after the table) | One-step convolution in the first layer | Factorizing the convolution into spatial and temporal parts may improve accuracies for the large number of EEG input channels (compared with the three RGB color channels of typical image datasets). |
Design choices we evaluated for our convolutional networks. "Our choice" is the option we used when evaluating ConvNets in the remainder of this article, for example, against FBCSP. Note that these design choices had not been evaluated in prior work; see Supporting Information, Section A.1.
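To make the motivation in the first row concrete, the following minimal NumPy check (our illustration, not code from the study) verifies that a squaring nonlinearity followed by mean pooling computes the mean power of a zero-mean signal, which equals its variance:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(1000)
x -= x.mean()                 # enforce a zero-mean signal, as in the motivation

squared = x ** 2              # squaring "activation"
mean_pooled = squared.mean()  # mean pooling over the full window

# For a zero-mean signal, mean power equals variance.
print(mean_pooled, x.var())
print(np.allclose(mean_pooled, x.var()))  # True
```

With ELU and max pooling instead, this equivalence to band power no longer holds, which is why the two choices were expected to favor different feature types.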
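The PyTorch sketch below (again our illustration, not the authors' released implementation; the filter count of 25, pooling width of 3, and dropout rate of 0.5 are assumptions for the example) shows how the options in the "Our choice" column combine in the first block of such a network, including the split of the first-layer convolution into a temporal part and a spatial part across electrodes:

```python
import torch
import torch.nn as nn

n_chans = 44  # assumed number of EEG electrodes

# Input layout assumed here: (batch, 1, time, EEG channels).
first_block = nn.Sequential(
    # Split first-layer convolution: temporal filtering per electrode ...
    nn.Conv2d(1, 25, kernel_size=(10, 1)),
    # ... followed by a spatial filter across all electrodes
    # (the one-step variant would use a single (10, n_chans) kernel).
    nn.Conv2d(25, 25, kernel_size=(1, n_chans), bias=False),
    nn.BatchNorm2d(25),                # intermediate normalization
    nn.ELU(),                          # chosen activation (variants: square, ReLU)
    nn.MaxPool2d(kernel_size=(3, 1)),  # chosen pooling (variant: mean)
    nn.Dropout(p=0.5),                 # dropout regularization
)

x = torch.randn(2, 1, 1000, n_chans)   # 2 trials, 1000 time samples
print(first_block(x).shape)            # torch.Size([2, 25, 330, 1])
```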