Table 4:
| Variant Type | Metric | BGISEQ-500 PE50 | BGISEQ-500 PE100 | HiSeq2500 PE150 |
|---|---|---|---|---|
| SNPs | True positive | 3 006 132 | 3 071 579 | 3 084 449 |
| | False positive | 15 203 | 6907 | 4318 |
| | False negative | 186 825 | 121 379 | 108 508 |
| | Precision | 99.50% | 99.78% | 99.86% |
| | Sensitivity | 94.15% | 96.20% | 96.60% |
| | FPR | 0.00060% | 0.00020% | 0.00017% |
| | FNR | 5.85% | 3.80% | 3.40% |
| Indels | True positive | 261 867 | 326 810 | 355 728 |
| | False positive | 16 931 | 22 246 | 7981 |
| | False negative | 107 311 | 42 391 | 13 751 |
| | Precision | 93.93% | 93.63% | 97.81% |
| | Sensitivity | 70.93% | 88.52% | 96.28% |
| | FPR | 0.00067% | 0.00069% | 0.00032% |
| | FNR | 29.07% | 11.48% | 3.72% |
*The first four metrics are calculated using the rtg-tools software. True positive (TP) is the number of called variants that are found in the high-confidence reference dataset, false positive (FP) is the number of called variants that are not found in the reference dataset, and false negative (FN) is the number of variants that are found in the high-confidence reference dataset but not in the call set. Precision is TP/(TP+FP)*100. Sensitivity is TP/(TP+FN)*100. FPR is FP/(high-confidence region length-TP-FN)*100, where the high-confidence region length is 2 529 164 928 bp, taken from the GIAB high-confidence variant datasets [19]. FNR is FN/(FN+TP)*100.
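
As an illustration of the formulas in the footnote, the following minimal Python sketch derives the four percentage metrics from the TP, FP, and FN counts. The function name `variant_metrics` and the constant name are hypothetical and are not part of rtg-tools or of the original analysis pipeline; only the formulas and the region length come from the footnote.

```python
# High-confidence region length in bp, as stated in the table footnote.
HIGH_CONF_REGION_BP = 2_529_164_928


def variant_metrics(tp, fp, fn, region_bp=HIGH_CONF_REGION_BP):
    """Return precision, sensitivity, FPR and FNR as percentages.

    Hypothetical helper: it only restates the footnote formulas.
    """
    return {
        "precision": 100 * tp / (tp + fp),
        "sensitivity": 100 * tp / (tp + fn),
        # Denominator approximates the true-negative positions in the region.
        "fpr": 100 * fp / (region_bp - tp - fn),
        "fnr": 100 * fn / (fn + tp),
    }


# Counts from the SNP rows of the BGISEQ-500 PE50 column.
snp_pe50 = variant_metrics(tp=3_006_132, fp=15_203, fn=186_825)
for name, value in snp_pe50.items():
    print(f"{name}: {value:.5f}%")
```

With the BGISEQ-500 PE50 SNP counts, the results agree with the tabulated precision, sensitivity, FPR, and FNR at the displayed rounding. Sensitivity and FNR always sum to 100% because they share the denominator TP+FN.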