Table 1.
Test set performance of end models trained on subsamples (%) of the labeled training data, not including validation splits, using various data augmentation approaches. None indicates performance with no augmentation. All tasks are measured in accuracy, except ACE, which is measured by F1 score.
Task | % | None | Basic | Heur. | MF | LSTM |
---|---|---|---|---|---|---|
MNIST | 1 | 90.2 | 95.3 | 95.9 | 96.5 | 96.7 |
MNIST | 10 | 97.3 | 98.7 | 99.0 | 99.2 | 99.1 |
CIFAR-10 | 10 | 66.0 | 73.1 | 77.5 | 79.8 | 81.5 |
CIFAR-10 | 100 | 87.8 | 91.9 | 92.3 | 94.4 | 94.0 |
ACE (F1) | 100 | 62.7 | 59.9 | 62.8 | 62.9 | 64.2 |
DDSM | 10 | 57.6 | 58.8 | 59.3 | 58.2 | 61.0 |
DDSM + DS | | | | 53.7 | 59.9 | 62.7 |