Table 3.
Dataset | Data augmentation | Architecture | Activation | MLP Accuracy [%] | MLP nW | MLPFixProb Accuracy [%] | MLPFixProb nW | SET-MLP Accuracy [%] | SET-MLP nW
---|---|---|---|---|---|---|---|---|---
MNIST | No | 784-1000-1000-1000-10 | SReLU | 98.55 | 2,794,000 | 97.68 | 89,797 | 98.74 | 89,797 |
CIFAR10 | Yes | 3072-4000-1000-4000-10 | SReLU | 68.70 | 20,328,000 | 62.19 | 278,630 | 74.84 | 278,630 |
HIGGS | No | 28-1000-1000-1000-2 | SReLU | 78.44 | 2,038,000 | 76.69 | 80,614 | 78.47 | 80,614 |
On each dataset, we report the best classification accuracy obtained by each model on the test data. nW denotes the number of weights in the model. The only difference between the three models is the network topology: MLP has fully connected layers, MLPFixProb has sparse layers with a fixed Erdős–Rényi topology, and SET-MLP has sparse layers whose topology evolves during training with SET.
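To illustrate why the sparse models have far fewer weights, the following sketch builds Erdős–Rényi connection masks for the MNIST architecture in the table (784-1000-1000-1000-10). The sparsity hyperparameter `epsilon` and the density formula `epsilon * (n_in + n_out) / (n_in * n_out)` are assumptions for illustration; the exact value used to reach 89,797 weights is not given in this excerpt, so the sparse count below will not match the table exactly.

```python
import numpy as np

def erdos_renyi_mask(n_in, n_out, epsilon=20, rng=None):
    """Bernoulli mask where each of the n_in*n_out possible weights
    exists with probability epsilon * (n_in + n_out) / (n_in * n_out)
    (an assumed Erdős–Rényi density parameterization)."""
    rng = np.random.default_rng(0) if rng is None else rng
    density = min(1.0, epsilon * (n_in + n_out) / (n_in * n_out))
    return rng.random((n_in, n_out)) < density

# Layer sizes of the MNIST network from the table.
sizes = [784, 1000, 1000, 1000, 10]
masks = [erdos_renyi_mask(a, b) for a, b in zip(sizes, sizes[1:])]

n_dense = sum(a * b for a, b in zip(sizes, sizes[1:]))
n_sparse = int(sum(m.sum() for m in masks))
print(n_dense)   # 2,794,000 dense weights, matching the MLP column
print(n_sparse)  # orders of magnitude fewer connections
```

MLPFixProb would keep such a mask fixed throughout training, while SET-MLP would periodically prune the smallest-magnitude connections and regrow new ones, keeping the total connection count constant.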