2019 Jul 5;35(14):i31–i40. doi: 10.1093/bioinformatics/btz394

Fig. 3.

Results on the E2-fix dataset. The training dataset is randomly subsampled to create class imbalance: healthy (for IBD) and overweight (for BMI) samples constitute 1/10 (10-versus-90), 1/5 (20-versus-80) or 1/3 (33-versus-66) of the samples in both the training and testing sets. We compare AUC for the original training set (no augmentation); the over-represented class down-sampled to match the size of the under-represented class (down-sampling); and augmentation using SMOTE, ADASYN and TADA. Methods are run in two ways: TADA-Balance only adds samples to the healthy class to balance the labels; TADA-Balance++ adds both healthy and unhealthy samples so that the classes are balanced and the total number of samples increases by 50×.
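The subsampling and down-sampling baselines described in the caption can be illustrated with a minimal NumPy sketch. This is not the authors' code: the function names `subsample_minority` and `downsample_majority`, the toy label vector, and the choice to keep every majority-class sample while thinning the minority class are all assumptions made for illustration.

```python
import numpy as np


def subsample_minority(y, minority_label, fraction, rng):
    """Return sorted indices in which `minority_label` makes up about
    `fraction` of the samples. All other samples are kept (an assumption)."""
    y = np.asarray(y)
    minority = np.flatnonzero(y == minority_label)
    majority = np.flatnonzero(y != minority_label)
    # Solve n_min / (n_min + n_maj) = fraction for n_min.
    n_min = int(round(fraction * len(majority) / (1.0 - fraction)))
    n_min = min(n_min, len(minority))  # cannot keep more than we have
    kept = rng.choice(minority, size=n_min, replace=False)
    return np.sort(np.concatenate([kept, majority]))


def downsample_majority(y, rng):
    """Return sorted indices in which every class is randomly reduced
    to the size of the smallest class (the down-sampling baseline)."""
    y = np.asarray(y)
    labels, counts = np.unique(y, return_counts=True)
    n = counts.min()
    picks = [rng.choice(np.flatnonzero(y == lab), size=n, replace=False)
             for lab in labels]
    return np.sort(np.concatenate(picks))


rng = np.random.default_rng(0)
# Hypothetical toy labels: 0 = healthy, 1 = IBD.
y = np.array([0] * 100 + [1] * 90)

# 10-versus-90 setting: healthy samples become 1/10 of the data.
unbalanced_idx = subsample_minority(y, minority_label=0, fraction=0.1, rng=rng)
y_unbalanced = y[unbalanced_idx]

# Down-sampling baseline applied to the unbalanced labels.
balanced_idx = downsample_majority(y_unbalanced, rng)
y_balanced = y_unbalanced[balanced_idx]
```

In practice the same index arrays would be used to select rows of the feature matrix alongside the labels. The augmentation alternatives (SMOTE, ADASYN, TADA) add synthetic samples instead of discarding real ones and are not reproduced here.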