Author manuscript; available in PMC: 2018 Jan 26.

Published in final edited form as: Adv Neural Inf Process Syst. 2017 Dec;30:3239–3249.

Table 1.

Test set performance of end models trained on subsamples of the labeled training data (the % column gives the subsample size, not including validation splits) using various data augmentation approaches. None indicates performance with no augmentation. All tasks are measured in accuracy, except ACE, which is measured by F1 score.

Task	%	None	Basic	Heur.	MF	LSTM
MNIST	1	90.2	95.3	95.9	96.5	96.7
	10	97.3	98.7	99.0	99.2	99.1

CIFAR-10	10	66.0	73.1	77.5	79.8	81.5
	100	87.8	91.9	92.3	94.4	94.0

ACE (F1)	100	62.7	59.9	62.8	62.9	64.2

DDSM	10	57.6	58.8	59.3	58.2	61.0
DDSM + DS	10			53.7	59.9	62.7