Table 1.
Category | Number of Samples | Percentage of Samples |
---|---|---|
Starting MVP sample set for analysis | 514,383 | − |
Intentionally duplicated samples | 25,291 | − |
Uniquely genotyped individuals | 485,856 | 100.00% |
Samples with call rates below 98.5% | 15,436 | 3.18% |
Positive-control samples | 3,236 | 0.66% |
Samples with sex misclassification | 1,450 | 0.29% |
Samples on plates containing 4 or more sex misclassifications | 2,619 | 0.53% |
Unintentionally duplicated samples | 1,149 | 0.23% |
Samples on plates containing an intentional duplicate with high discordance | 9,975 | 2.05% |
Samples with high heterozygosity | 248 | 0.05% |
Samples with no or multiple unique participant identifiers | 71 | 0.01% |
Intentionally duplicated samples with high discordance | 413 | 0.08% |
Samples with 7 or more “relatives” | 466 | 0.09% |
Samples excluded from the dataset | 28,527 | 5.87% |
Samples quarantined from the dataset | 31,836 | 6.55% |
Sample set in current data release | 459,777 | − |
Percentages are calculated relative to the total number of uniquely genotyped individuals (485,856). Categories are not mutually exclusive: a sample that meets more than one removal criterion is counted in every applicable category in the table.
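The percentage column can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name `percent_of_unique` and the choice of rows are assumptions made for this example, not part of the original analysis. It divides each count by the 485,856 uniquely genotyped individuals and formats the result to two decimals. Some published values may differ from this calculation by 0.01 in the last digit, depending on whether values were rounded or truncated.

```python
# Denominator taken from the table: uniquely genotyped individuals.
UNIQUE_INDIVIDUALS = 485_856


def percent_of_unique(count: int, total: int = UNIQUE_INDIVIDUALS) -> float:
    """Return `count` as an unrounded percentage of `total`."""
    return 100.0 * count / total


# A few rows copied from the table (category -> number of samples).
rows = {
    "Samples with high heterozygosity": 248,
    "Samples on plates containing an intentional duplicate with high discordance": 9_975,
    "Samples excluded from the dataset": 28_527,
    "Samples quarantined from the dataset": 31_836,
}

for category, count in rows.items():
    # Format to two decimals, matching the precision used in the table.
    print(f"{category}: {percent_of_unique(count):.2f}%")
```

Because categories overlap, the per-category counts should not be summed to obtain a total number of removed samples.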