Table 5. Computer Vision Datasets Used for Domain Adaptation

Dataset | Description |
---|---|
MNIST[130]a | This is a handwritten digit dataset (digits 0–9) whose name stands for “modified NIST.” The images are nearly binary (black and white) but are actually grayscale due to anti-aliasing. It is based on the National Institute of Standards and Technology’s (NIST) Special Database 1 and Special Database 3. Because one of these was easier than the other, MNIST combines the two. The digits are size normalized to fit in a 20×20 box, preserving the aspect ratio, and centered in a 28×28 pixel image. |
MNIST-M[73]b | This is a modification of MNIST in which the digits are blended with random patches taken from color photos in the BSDS500 dataset. |
USPS[131]c | This is another handwritten digit dataset (digits 0–9). It consists of handwritten zip codes scanned and segmented by the U.S. Postal Service (USPS). The digits were size normalized to 16×16 pixels, preserving the aspect ratio, and the pixel values are normalized to lie between −1 and 1. |
SVHN[175]d | The Street View House Numbers (SVHN) dataset consists of single digits extracted from images of urban house numbers in Google Street View. The digits have been size normalized to 32×32 pixels. |
SYNN[73]b | Ganin et al. [73] used Microsoft Windows fonts to create a synthetic digit dataset (“Syn Numbers”) consisting of 1–3 digit numbers with varying positions, orientations, background colors, stroke colors, and amounts of blur. |
SYNS[168]e | This is a synthetic sign dataset created from modifications to Wikipedia pictograms of traffic signs. It consists of 100,000 images and 43 classes of signs. |
GTSRB[224]f | The German Traffic Sign Recognition Benchmark (GTSRB) is a dataset created from video recorded while driving around Germany. It consists of about 50,000 images and 43 classes of signs. |
Office[202]g | This dataset consists of 31 classes of objects in three different domains: Amazon (images taken from the Amazon website; medium resolution with studio lighting), DSLR (images taken with a digital SLR camera; high resolution in a real-world environment), and Webcam (images taken with a 640×480 computer webcam; these exhibit noise, artifacts, and white balance issues). Note: because Office is small, some networks [73, 199, 226] were pre-trained on ImageNet. |
b. See Ganin’s website http://yaroslav.ganin.net/ for download links.
c. This can be found on various sites and in some GitHub repositories. One such place: https://web.stanford.edu/~hastie/ElemStatLearn/data.html
e. The synthetic dataset is linked from: http://graphics.cs.msu.ru/en/research/projects/imagerecognition/trafficsign
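
The size normalization described for MNIST (fit to a 20×20 box with the aspect ratio preserved, then place in a 28×28 image) and the value range described for USPS (−1 to 1) can be illustrated with a minimal, dependency-free sketch. The function names `fit_and_center` and `rescale_to_signed_unit` are hypothetical, and two simplifying assumptions are made that the table does not specify: resampling is nearest-neighbor, and centering uses the geometric center of the resized digit. This is not the original dataset authors' preprocessing code.

```python
def fit_and_center(image, box=20, canvas=28):
    """Scale a 2D grayscale image (a list of rows) to fit inside a
    box x box square, preserving the aspect ratio, then center it on a
    canvas x canvas background of zeros."""
    height, width = len(image), len(image[0])
    scale = box / max(height, width)  # the longer side maps to `box` pixels
    new_h = max(1, round(height * scale))
    new_w = max(1, round(width * scale))

    # Nearest-neighbor resampling (an assumption made for simplicity;
    # it does not produce the anti-aliased edges seen in MNIST).
    resized = [
        [image[min(height - 1, int(r / scale))][min(width - 1, int(c / scale))]
         for c in range(new_w)]
        for r in range(new_h)
    ]

    # Geometric centering of the resized digit (also an assumption).
    top = (canvas - new_h) // 2
    left = (canvas - new_w) // 2
    out = [[0] * canvas for _ in range(canvas)]
    for r, row in enumerate(resized):
        out[top + r][left:left + new_w] = row
    return out


def rescale_to_signed_unit(image, max_value=255):
    """Map pixel values from [0, max_value] to the range [-1, 1]."""
    return [[2.0 * p / max_value - 1.0 for p in row] for row in image]
```

For example, a tall 40×10 stroke passed to `fit_and_center` is shrunk to 20×5 and placed in the middle of a 28×28 image, and `rescale_to_signed_unit` then maps its 0–255 values to the signed range used by USPS. Applying the same resizing and value range to both source and target images is one way to make digit datasets of different native sizes comparable.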