Table 1.
Name | Description | Size (rows) | \|A\| | \|M\| | Views | Size (MB) |
---|---|---|---|---|---|---|
Synthetic Datasets | ||||||
SYN | Randomly distributed, varying # distinct values | 1M | 50 | 20 | 1000 | 411 |
SYN*-10 | Randomly distributed, 10 distinct values/dim | 1M | 20 | 1 | 20 | 21 |
SYN*-100 | Randomly distributed, 100 distinct values/dim | 1M | 20 | 1 | 20 | 21 |
Real Datasets | ||||||
BANK | Customer Loan dataset | 40K | 11 | 7 | 77 | 6.7 |
DIAB | Hospital data about diabetic patients | 100K | 11 | 8 | 88 | 23 |
AIR | Airline delays dataset | 6M | 12 | 9 | 108 | 974 |
AIR10 | Airline dataset scaled 10X | 60M | 12 | 9 | 108 | 9737 |
Real Datasets - User Study | ||||||
CENSUS | Census data | 21K | 10 | 4 | 40 | 2.7 |
HOUSING | Housing prices | 0.5K | 4 | 10 | 40 | <1 |
MOVIES | Movie sales | 1K | 8 | 8 | 64 | 1.2 |
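
In every row of Table 1, the Views column equals the product of the |A| and |M| columns (for example, BANK: 11 × 7 = 77). The sketch below checks that arithmetic against the values copied from the table. The names `TABLE_1` and `view_count` are illustrative and do not come from the original, and the rule Views = |A| × |M| is an assumption inferred from the numbers, not a definition stated here.

```python
# Rows copied from Table 1: (name, |A|, |M|, views).
TABLE_1 = [
    ("SYN", 50, 20, 1000),
    ("SYN*-10", 20, 1, 20),
    ("SYN*-100", 20, 1, 20),
    ("BANK", 11, 7, 77),
    ("DIAB", 11, 8, 88),
    ("AIR", 12, 9, 108),
    ("AIR10", 12, 9, 108),
    ("CENSUS", 10, 4, 40),
    ("HOUSING", 4, 10, 40),
    ("MOVIES", 8, 8, 64),
]


def view_count(num_a: int, num_m: int) -> int:
    """Assumed rule: one view per (A, M) pair, so the count is the product."""
    return num_a * num_m


# Collect any dataset whose listed view count disagrees with |A| * |M|.
mismatches = [
    name for name, num_a, num_m, views in TABLE_1
    if view_count(num_a, num_m) != views
]
print(mismatches)  # prints [] because every row is consistent
```

Under this reading, scaling a dataset (AIR to AIR10) changes the row count and size on disk but leaves the number of views unchanged, since |A| and |M| stay the same.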