Table 4.

| Pretext dataset | Augmentation | Pretext training loss ↓ | Downstream training loss ↓ | Downstream test loss ↓ | Downstream test accuracy (%) ↑ |
|---|---|---|---|---|---|
| House100K | – | 4.08 ± 0.003 | 6.13 ± 0.229 | 6.20 ± 0.18 | 9.70 ± 0.16 |
| House100KLighting | – | 4.07 ± 0.005 | 5.77 ± 0.314 | 5.92 ± 0.35 | 14.09 ± 0.25 |
| House100K | ✓ | 4.00 ± 0.005 | 4.67 ± 0.002 | 4.75 ± 0.01 | 18.05 ± 0.04 |
| House100KLighting | ✓ | 4.03 ± 0.001 | 4.49 ± 0.013 | 4.51 ± 0.01 | 20.74 ± 0.17* |
The “Augmentation” column indicates whether pretext training uses the data augmentation method from the original MoCo; the downstream columns report ImageNet classification performance. The best downstream classification result across the datasets is marked with an asterisk.
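For reference, the sketch below reconstructs the augmentation stack from the publicly released MoCo (v1) code using torchvision. This is an assumption about the pipeline referenced in the table; the exact recipe used for these runs is not specified in the text.

```python
import torchvision.transforms as transforms

# Sketch of the MoCo v1 augmentation stack (per the public MoCo release);
# whether Table 4 used exactly these parameters is an assumption.
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])

moco_v1_augmentation = transforms.Compose([
    transforms.RandomResizedCrop(224, scale=(0.2, 1.0)),  # crop covering 20-100% of the image area
    transforms.RandomGrayscale(p=0.2),                    # occasional grayscale conversion
    transforms.ColorJitter(0.4, 0.4, 0.4, 0.4),           # brightness/contrast/saturation/hue jitter
    transforms.RandomHorizontalFlip(),                    # mirror with probability 0.5
    transforms.ToTensor(),
    normalize,                                            # standard ImageNet statistics
])

# In MoCo pretraining, the same image is augmented twice to produce the
# query/key pair: q, k = moco_v1_augmentation(img), moco_v1_augmentation(img)
```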