Table 1. A summary of the datasets used in the experiments.
| | | Set of experiments | |
| | | Large-scale evaluation | Real-world demonstration |
| Targeted genomic dataset (D ) | Dataset | Simulated genomic dataset | Craig Venter’s data |
| | Attributes | Year of birth, U.S. state, 16 STRs | Year of birth, U.S. state, 50 STRs |
| | Records | 1,000 | 1 |
| Genetic genealogy dataset (DG) | Dataset | Simulated genetic genealogy dataset | Ysearch |
| | Attributes | Surname, 16 STRs | Surname, 50 STRs |
| | Records | 20,000 | 58,218 |
| Public identified dataset (DI) | Dataset | Simulated demographic dataset | PeopleFinders |
| | Attributes | ID, name, year of birth, U.S. state | Name, age, U.S. state |
| | Records | 20,000 | ~250 million |