Author manuscript; available in PMC: 2021 Jul 30.
Published in final edited form as: ACM Trans Intell Syst Technol. 2020 Jul 5;11(5):1–46. doi: 10.1145/3400066

Table 5.

List and description of computer vision datasets from Tables 2 and 3

Computer Vision Datasets used for Domain Adaptation
MNIST[130]a MNIST (“modified NIST”) is a handwritten digit dataset (digits 0–9). The images are nominally binary (black and white) but are actually grayscale because of anti-aliasing. It is based on the National Institute of Standards and Technology’s (NIST) Special Databases 1 and 3, one of which was easier than the other, so MNIST combines the two. The digits are size normalized to fit in a 20×20 box, preserving the aspect ratio, and centered in a 28×28 pixel image.
MNIST-M[73]b This is a modification of MNIST in which the digits are blended with random patches from color photos in the BSDS500 dataset.
USPS[131]c This is another handwritten digit dataset (digits 0–9). It consists of handwritten ZIP codes scanned and segmented by the U.S. Postal Service (USPS). The digits were size normalized to 16×16 pixels, preserving the aspect ratio, and the pixel values are normalized to lie between −1 and 1.
SVHN[175]d The Street View House Numbers (SVHN) dataset consists of single digits extracted from images of urban house numbers in Google Street View. The digits have been size normalized to 32×32 pixels.
SYNN[73]b Ganin et al.[73] used Microsoft Windows fonts to create a synthetic digit dataset (“Syn Numbers”) consisting of 1–3 digit numbers with varying positions, orientations, background colors, stroke colors, and amounts of blur.
SYNS[168]e This is a synthetic sign dataset created from modifications to Wikipedia pictograms of traffic signs. It consists of 100,000 images and 43 classes of signs.
GTSRB[224]f The German Traffic Sign Recognition Benchmark (GTSRB) is a dataset created from video recorded while driving around Germany. It consists of about 50,000 images and 43 classes of signs.
Office[202]g This dataset consists of 31 classes of objects in three different domains: Amazon (images taken from the Amazon website; medium resolution and studio lighting), DSLR (images taken with a digital SLR camera; high resolution and in a real-world environment), and Webcam (images taken with a 640×480 computer webcam; these have noise, artifacts, and white balance issues). Note: due to Office’s small size, some networks [73, 199, 226] were pre-trained on ImageNet.
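
The size normalization described for MNIST (fit in a 20×20 box preserving the aspect ratio, then center in a 28×28 image) and the value normalization described for USPS (values between −1 and 1) can be sketched roughly as follows. This is an illustrative sketch only, not the preprocessing code used by the dataset authors. The function names are hypothetical, nearest-neighbor resizing stands in for the anti-aliased resizing of the original, centering is assumed to be geometric (bounding-box based), and input pixels are assumed to lie in 0–255.

```python
def fit_in_box(height, width, box=20):
    """Return (new_h, new_w) so the longer side equals `box`, keeping the aspect ratio."""
    scale = box / max(height, width)
    return max(1, round(height * scale)), max(1, round(width * scale))


def resize_nearest(img, new_h, new_w):
    """Nearest-neighbor resize of a 2D list of pixels (assumption: no anti-aliasing)."""
    h, w = len(img), len(img[0])
    return [
        [img[min(h - 1, int(r * h / new_h))][min(w - 1, int(c * w / new_w))]
         for c in range(new_w)]
        for r in range(new_h)
    ]


def center_on_canvas(img, canvas=28, fill=0):
    """Paste `img` into the middle of a canvas x canvas image filled with `fill`."""
    h, w = len(img), len(img[0])
    top, left = (canvas - h) // 2, (canvas - w) // 2
    out = [[fill] * canvas for _ in range(canvas)]
    for r in range(h):
        out[top + r][left:left + w] = img[r]
    return out


def rescale_to_signed_unit(img, max_value=255):
    """Map pixel values from [0, max_value] to [-1, 1] (USPS-style value range)."""
    return [[2.0 * v / max_value - 1.0 for v in row] for row in img]


# Example: a tall 40x10 all-white crop becomes a 20x5 digit centered in 28x28.
crop = [[255] * 10 for _ in range(40)]
new_h, new_w = fit_in_box(len(crop), len(crop[0]))
digit = center_on_canvas(resize_nearest(crop, new_h, new_w))
signed = rescale_to_signed_unit(digit)
```

Sketches like this matter mainly when pairing datasets of different sizes, for example MNIST (28×28) with USPS (16×16) or SVHN (32×32), where the images must be brought to a common resolution and value range before adaptation.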