Table 1.
1/2 chip      | 1 chip       | 2 chip        | 4 chip        |
S-12          | S-16         | S-32          | S-64          |
P4-128 (4)    | P4-252 (2)   | S-128 (4)     | S-256 (8)     |
D             | N-256 (2)    | N-128 (1)     | N-256 (2)     |
S-256 (16)    | P-256 (8)    | P-128 (4)     | P-256 (8)     |
N-256 (2)     | S-512 (32)   | S-256 (16)    | S-512 (32)    |
P-512 (16)    | N-512 (4)    | N-256 (2)     | N-512 (4)     |
S-1020 (4)    | N-512 (4)    | P-256 (8)     | P-512 (16)    |
(6,528/class) | N-512 (4)    | S-512 (32)    | S-1024 (64)   |
              | P-512 (16)   | N-512 (4)     | N-1024 (8)    |
              | S-1024 (64)  | P-512 (16)    | P-1024 (32)   |
              | N-1024 (8)   | S-2048 (64)   | S-2048 (128)  |
              | P-1024 (32)  | N-2048 (16)   | N-2048 (16)   |
              | N-1024 (8)   | N-2048 (16)   | N-2048 (16)   |
              | N-1024 (8)   | N-2048 (16)   | N-2048 (16)   |
              | N-2040 (8)   | N-4096 (16)   | N-4096 (16)   |
              | (816/class)  | (6,553/class) | (6,553/class) |
Each layer is described as type-features (groups), where the type can be S for spatial filter layers with stride 1, N for network-in-network layers with filter size 1 and stride 1, P for convolutional pooling layers with filter size 2 and stride 2, P4 for convolutional pooling layers with filter size 4 and stride 2, or D for dropout layers. The number of output features assigned to each of the 10 CIFAR10 classes is indicated below the final layer of each column as (features/class). The eight-chip network is the same as the four-chip network with twice as many features per layer.
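To make the type-features (groups) notation concrete, the sketch below reads the 1/2-chip column as a stack of ordinary grouped convolutions in PyTorch. This is an illustrative reading under stated assumptions, not the hardware implementation: the filter size of the S layers (3), the padding, the dropout rate, and the ReLU stand-in for the spiking neurons are all assumptions, and any chip-specific constraints on weights and connectivity are ignored. With those choices the last layer produces an 8 x 8 x 1020 feature map, i.e. 65,280 outputs, matching the (6,528/class) figure listed under that column.

```python
import torch
from torch import nn

def conv(in_ch, out_ch, k, stride, groups):
    """One table entry: a grouped convolution followed by a nonlinearity."""
    # Padding chosen so S layers (k=3) preserve the map size and the
    # pooling layers (k=4 or k=2, stride 2) halve it: 32 -> 16 -> 8.
    pad = {4: 1, 3: 1, 2: 0, 1: 0}[k]
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, k, stride=stride, padding=pad, groups=groups),
        nn.ReLU(inplace=True),  # stand-in for the spiking neurons (assumption)
    )

# 1/2-chip column of Table 1, read as type-features (groups).
half_chip = nn.Sequential(
    conv(3,   12,   3, 1, 1),   # S-12
    conv(12,  128,  4, 2, 4),   # P4-128 (4): filter 4, stride 2
    nn.Dropout2d(0.5),          # D (dropout rate is an assumption)
    conv(128, 256,  3, 1, 16),  # S-256 (16)
    conv(256, 256,  1, 1, 2),   # N-256 (2): 1x1 network-in-network
    conv(256, 512,  2, 2, 16),  # P-512 (16): filter 2, stride 2
    conv(512, 1020, 3, 1, 4),   # S-1020 (4)
)

x = torch.randn(1, 3, 32, 32)       # one CIFAR10-sized input
y = half_chip(x)                    # shape (1, 1020, 8, 8)
print(y.shape, y[0].numel() // 10)  # 65,280 outputs -> 6,528 per class
```

The other columns can be read the same way; in every column each layer's input and output feature counts are divisible by its listed group count, as a grouped convolution requires.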