Table 3.
Test errors for MNIST-trained convolutional neural networks (CNNs) and CIFAR-100-trained “Network in Network” (NiN) models.
Method | Error (%) |
---|---|
MNIST | |
Baseline CNN | 0.63 |
Teacher | 0.56 |
Teacher with finetuning | 0.48 |
Student with deep supervision | 0.55 |
Student with hints | 0.56 |
Student with RDL | 0.49 |
CIFAR-100 | |
Baseline NiN | 30.68 |
Teacher with finetuning | 38.75 |
Student with deep supervision | 29.46 |
Student with hints | 29.37 |
Student with RDL | 28.77 |
The performance of the teacher without finetuning on CIFAR-100 classification is not shown: it was trained on CIFAR-10 and therefore predicts over 10 classes rather than 100, so it cannot perform the CIFAR-100 task.
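To make the rows of Table 3 easier to compare, one can express each error as a relative reduction over the corresponding baseline, computed as (baseline − error) / baseline. The following is a minimal sketch that uses only the values reported in the table; the names `relative_reduction`, `summarize`, `errors`, and `BASELINES` are illustrative assumptions and do not come from the original work.

```python
# Test errors (%) copied verbatim from Table 3.
errors = {
    "MNIST": {
        "Baseline CNN": 0.63,
        "Teacher": 0.56,
        "Teacher with finetuning": 0.48,
        "Student with deep supervision": 0.55,
        "Student with hints": 0.56,
        "Student with RDL": 0.49,
    },
    "CIFAR-100": {
        "Baseline NiN": 30.68,
        "Teacher with finetuning": 38.75,
        "Student with deep supervision": 29.46,
        "Student with hints": 29.37,
        "Student with RDL": 28.77,
    },
}

# Which row serves as the reference point for each dataset (an assumption
# made for this sketch, based on the "Baseline" labels in the table).
BASELINES = {"MNIST": "Baseline CNN", "CIFAR-100": "Baseline NiN"}


def relative_reduction(baseline: float, error: float) -> float:
    """Percentage reduction in error relative to the baseline.

    Positive values mean the method beats the baseline; negative values
    mean it is worse.
    """
    return 100.0 * (baseline - error) / baseline


def summarize(errors: dict, baselines: dict) -> list:
    """Return (dataset, method, error, relative reduction) for each non-baseline row."""
    rows = []
    for dataset, results in errors.items():
        base_name = baselines[dataset]
        base = results[base_name]
        for method, err in results.items():
            if method == base_name:
                continue
            rows.append((dataset, method, err, relative_reduction(base, err)))
    return rows


if __name__ == "__main__":
    for dataset, method, err, reduction in summarize(errors, BASELINES):
        print(f"{dataset:10s} {method:32s} {err:6.2f} {reduction:+7.2f}%")
```

Under this measure, the student with RDL reduces the error by roughly 22% relative to the baseline CNN on MNIST and by roughly 6% relative to the baseline NiN on CIFAR-100, while the finetuned teacher on CIFAR-100 has a negative value because its error is higher than the baseline's.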