Author manuscript; available in PMC: 2020 Nov 13.
Published in final edited form as: Proc ACM Conf Health Inference Learn (2020). 2020 Apr;2020:151–159. doi: 10.1145/3368555.3384468

Table 1:

Accuracy of a ResNeXt-29 (8x64d) trained on the full CIFAR-100 dataset (“Baseline”) and in two synthetic experiments with altered datasets: “Subsample” drops 75% of the examples in the dolphin and mountain subclasses from the training dataset, and “Random Noise” assigns a random superclass label to 25% of the examples in these subclasses. Results are reported on superclass labels for the validation set. Numbers in parentheses are reductions in performance, in percentage points, relative to the baseline model for each experimental condition.

| Subclass | Baseline Superclass | Baseline Subclass | Subsample Superclass | Subsample Subclass | Random Noise Superclass | Random Noise Subclass |
|---|---|---|---|---|---|---|
| Dolphin | 0.69 | 0.78 | 0.65 (−4) | 0.64 (−14) | 0.67 (−2) | 0.73 (−5) |
| Mountain | 0.87 | 0.90 | 0.82 (−5) | 0.71 (−19) | 0.82 (−5) | 0.73 (−17) |
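
The two dataset alterations described in the Table 1 caption, and the parenthesized reductions, can be illustrated with a minimal sketch. This is not the authors' code: the function names, the list-based label representation, the fixed seeds, and the choice to draw the noisy label uniformly from all 20 CIFAR-100 superclasses (so it may coincide with the true label) are assumptions made here for illustration only.

```python
import random


def subsample_indices(subclasses, targets, drop_frac=0.75, seed=0):
    """Return the indices kept after dropping `drop_frac` of each target subclass.

    `subclasses` is a list of subclass names, one per training example.
    (Hypothetical helper; the sampling scheme is an assumption.)
    """
    rng = random.Random(seed)
    dropped = set()
    for target in targets:
        idx = [i for i, s in enumerate(subclasses) if s == target]
        dropped.update(rng.sample(idx, round(drop_frac * len(idx))))
    return [i for i in range(len(subclasses)) if i not in dropped]


def noisy_superclasses(subclasses, superclasses, targets,
                       noise_frac=0.25, num_superclasses=20, seed=0):
    """Return a copy of `superclasses` with random labels on part of each target subclass.

    Assumption: the random label is drawn uniformly from all superclass ids,
    so some "noisy" examples may keep their true label by chance.
    """
    rng = random.Random(seed)
    noisy = list(superclasses)
    for target in targets:
        idx = [i for i, s in enumerate(subclasses) if s == target]
        for i in rng.sample(idx, round(noise_frac * len(idx))):
            noisy[i] = rng.randrange(num_superclasses)
    return noisy


def reduction_points(baseline, altered):
    """Accuracy drop relative to the baseline, in whole percentage points."""
    return round((baseline - altered) * 100)


# Reductions for the dolphin subclass column of Table 1.
print(reduction_points(0.78, 0.64))  # Subsample: 14
print(reduction_points(0.78, 0.73))  # Random Noise: 5
```

In this sketch, a training set with 500 examples per subclass (as in CIFAR-100) would keep 125 dolphin and 125 mountain examples under “Subsample”, while “Random Noise” would relabel 125 examples in each of the two subclasses and leave all other labels untouched.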