Table 1. Quarantine and Exclusion Criteria for MVP Samples and Sample Count per Category

| Category | Number of Samples | Percentage of Samples |
|---|---|---|
| Starting MVP sample set for analysis | 514,383 | − |
| Intentionally duplicated samples | 25,291 | − |
| Uniquely genotyped individuals | 485,856 | 100.00% |
| Samples with call rates below 98.5% | 15,436 | 3.18% |
| Positive-control samples | 3,236 | 0.66% |
| Samples with sex misclassification | 1,450 | 0.29% |
| Samples on plates containing 4 or more sex misclassifications | 2,619 | 0.53% |
| Unintentionally duplicated samples | 1,149 | 0.23% |
| Samples on plates containing an intentional duplicate with high discordance | 9,975 | 2.05% |
| Samples with high heterozygosity | 248 | 0.05% |
| Samples with no or multiple unique participant identifiers | 71 | 0.01% |
| Intentionally duplicated samples with high discordance | 413 | 0.08% |
| Samples with 7 or more “relatives” | 466 | 0.09% |
| Samples excluded from the dataset | 28,527 | 5.87% |
| Samples quarantined from the dataset | 31,836 | 6.55% |
| Sample set in current data release | 459,777 | − |

Percentages are calculated relative to the total number of uniquely genotyped individuals (485,856). Categories are not mutually exclusive: a sample that meets more than one criterion is counted in each applicable category in the table.
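
The footnote's two rules can be illustrated with a minimal sketch. It divides a few counts copied from Table 1 by the stated denominator, then uses a toy example to show how overlapping categories inflate per-category totals. The function name, the sample identifiers, and the half-up rounding to two decimals are assumptions made for illustration. The table does not state its rounding convention, so computed values may differ from the printed ones in the last digit.

```python
from decimal import Decimal, ROUND_HALF_UP

# Denominator stated in the table footnote.
UNIQUE_INDIVIDUALS = 485_856

# A few counts copied from Table 1.
counts = {
    "Samples on plates containing an intentional duplicate with high discordance": 9_975,
    "Samples with high heterozygosity": 248,
    "Samples excluded from the dataset": 28_527,
    "Samples quarantined from the dataset": 31_836,
}


def percent_of_unique(count, total=UNIQUE_INDIVIDUALS):
    """Return count as a percentage of total, rounded half-up to two decimals.

    The rounding rule is an assumption; the table does not specify one.
    """
    pct = Decimal(count) * 100 / Decimal(total)
    return pct.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


for category, count in counts.items():
    print(f"{category}: {count:,} ({percent_of_unique(count)}%)")

# Toy illustration of non-exclusive categories (hypothetical sample IDs):
# sample "S3" fails two criteria and is counted once in each category.
low_call_rate = {"S1", "S2", "S3"}
sex_misclassified = {"S3", "S4"}

per_category_total = len(low_call_rate) + len(sex_misclassified)
distinct_removed = len(low_call_rate | sex_misclassified)
print(per_category_total, distinct_removed)
```

Because a sample can appear in several categories, summing the per-category counts overstates the number of distinct samples removed, which is why the category rows should not be added together.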