2017 Apr 1;6(5):1–9. doi: 10.1093/gigascience/gix024

Table 4:

Variant-calling performance for each dataset*

| Variant type | Metric | BGISEQ-500 PE50 | BGISEQ-500 PE100 | HiSeq2500 PE150 |
|---|---|---|---|---|
| SNPs | True positive | 3 006 132 | 3 071 579 | 3 084 449 |
| | False positive | 15 203 | 6907 | 4318 |
| | False negative | 186 825 | 121 379 | 108 508 |
| | Precision | 99.50% | 99.78% | 99.86% |
| | Sensitivity | 94.15% | 96.20% | 96.60% |
| | FPR | 0.00060% | 0.00020% | 0.00017% |
| | FNR | 5.85% | 3.80% | 3.40% |
| Indels | True positive | 261 867 | 326 810 | 355 728 |
| | False positive | 16 931 | 22 246 | 7981 |
| | False negative | 107 311 | 42 391 | 13 751 |
| | Precision | 93.93% | 93.63% | 97.81% |
| | Sensitivity | 70.93% | 88.52% | 96.28% |
| | FPR | 0.00067% | 0.00069% | 0.00032% |
| | FNR | 29.07% | 11.48% | 3.72% |

*The first four metrics above are calculated using the rtg-tools software. True positive (TP) is the number of called variants (SNPs or indels) that are found in the high-confidence reference dataset, false positive (FP) is the number of called variants that are not found in the high-confidence reference dataset, and false negative (FN) is the number of variants that are found in the high-confidence reference dataset but are not found in the called variant set. Precision is TP/(TP+FP)*100. Sensitivity is TP/(TP+FN)*100. FPR is FP/(high-confidence region length-TP-FN)*100, where the high-confidence region length equals 2 529 164 928 bp, which comes from the GIAB released high-confidence variant datasets [19]. FNR is FN/(FN+TP)*100.
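
The formulas in the footnote can be checked with a short calculation. The sketch below is illustrative only and is not the rtg-tools implementation: the names `Counts`, `metrics`, and `HIGH_CONF_REGION_BP` are hypothetical, and the input counts are taken from the HiSeq2500 PE150 SNP column of the table.

```python
from dataclasses import dataclass

# High-confidence region length in bp, as stated in the table footnote.
HIGH_CONF_REGION_BP = 2_529_164_928


@dataclass(frozen=True)
class Counts:
    """TP/FP/FN counts for one variant type on one dataset (hypothetical helper)."""
    tp: int
    fp: int
    fn: int


def metrics(c: Counts, region_bp: int = HIGH_CONF_REGION_BP) -> dict:
    """Apply the footnote formulas; every value is a percentage."""
    return {
        "precision": 100 * c.tp / (c.tp + c.fp),
        "sensitivity": 100 * c.tp / (c.tp + c.fn),
        # Denominator: high-confidence positions that are not true variants.
        "fpr": 100 * c.fp / (region_bp - c.tp - c.fn),
        "fnr": 100 * c.fn / (c.fn + c.tp),
    }


# Counts from the HiSeq2500 PE150 SNP rows of the table.
hiseq_snps = Counts(tp=3_084_449, fp=4_318, fn=108_508)
m = metrics(hiseq_snps)

for name, value in m.items():
    print(f"{name}: {value:.5f}%")
```

With these inputs the rounded results agree with the HiSeq2500 PE150 SNP column (precision 99.86%, sensitivity 96.60%, FPR 0.00017%, FNR 3.40%). Because sensitivity and FNR share the denominator TP+FN, they always sum to 100%.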