Table 2. Variant-level quality metrics of high-quality variants in the BP dataset processed by different methods.
Metric | No QC | ABHet | VQSR | ForestQC |
---|---|---|---|---|
Total SNVs | 25081636 | 22415368 | 24239357 | 22227503 |
Known SNVs | 21165051 | 19665276 | 20675746 | 19361635 |
Known SNVs (%) | 84.38% | 87.73% | 85.30% | 87.11% |
Total indels | 3976710 | 2670647 | 3212886 | 2789037 |
Known indels | 3094271 | 2188996 | 2758783 | 2237002 |
Known indels (%) | 77.81% | 81.97% | 85.87% | 80.21% |
Multi-allelic SNVs | 153836 | 26549 | 128894 | 77693 |
Multi-allelic SNVs (%) | 0.61% | 0.12% | 0.53% | 0.35% |
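Four methods are compared: no QC, the ABHet approach, VQSR, and ForestQC. “Known” denotes variants present in dbSNP (version 150). Percentages of known variants are relative to the total number of variants of the same type; the percentage of multi-allelic SNVs is relative to total SNVs.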
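
The percentage rows can be recomputed from the count rows. The following is a minimal sketch, not part of the original analysis: the counts are transcribed from Table 2, and the names `TABLE2`, `pct`, `derived_metrics`, and `DERIVED` are hypothetical, introduced here for illustration only. It assumes each percentage is a simple ratio of two counts in the same column, rounded to two decimals.

```python
# Counts transcribed from Table 2, keyed by QC method.
# All identifiers in this sketch are illustrative, not from the source.
TABLE2 = {
    "No QC": {
        "total_snvs": 25081636, "known_snvs": 21165051,
        "total_indels": 3976710, "known_indels": 3094271,
        "multiallelic_snvs": 153836,
    },
    "ABHet": {
        "total_snvs": 22415368, "known_snvs": 19665276,
        "total_indels": 2670647, "known_indels": 2188996,
        "multiallelic_snvs": 26549,
    },
    "VQSR": {
        "total_snvs": 24239357, "known_snvs": 20675746,
        "total_indels": 3212886, "known_indels": 2758783,
        "multiallelic_snvs": 128894,
    },
    "ForestQC": {
        "total_snvs": 22227503, "known_snvs": 19361635,
        "total_indels": 2789037, "known_indels": 2237002,
        "multiallelic_snvs": 77693,
    },
}


def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100.0 * part / whole, 2)


def derived_metrics(counts: dict) -> dict:
    """Compute the three percentage rows for one method's counts."""
    return {
        # Known variants as a share of all variants of the same type.
        "known_snvs_pct": pct(counts["known_snvs"], counts["total_snvs"]),
        "known_indels_pct": pct(counts["known_indels"], counts["total_indels"]),
        # Multi-allelic SNVs as a share of all SNVs.
        "multiallelic_snvs_pct": pct(counts["multiallelic_snvs"], counts["total_snvs"]),
    }


DERIVED = {method: derived_metrics(counts) for method, counts in TABLE2.items()}

if __name__ == "__main__":
    for method, metrics in DERIVED.items():
        print(
            f"{method:<9} known SNVs {metrics['known_snvs_pct']:.2f}%  "
            f"known indels {metrics['known_indels_pct']:.2f}%  "
            f"multi-allelic SNVs {metrics['multiallelic_snvs_pct']:.2f}%"
        )
```

Under these assumptions, the recomputed values match the percentages reported in the table for all four methods, which confirms that the known-variant percentages use the per-type totals as denominators and the multi-allelic percentage uses total SNVs.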