2017 Apr 1;6(5):1–9. doi: 10.1093/gigascience/gix024

Table 4:

Performance of variant calling on the dataset*

Variant Type	Metrics	BGISEQ-500 PE50	BGISEQ-500 PE100	HiSeq2500 PE150
SNPs	True positive	3 006 132	3 071 579	3 084 449
	False positive	15 203	6907	4318
	False negative	186 825	121 379	108 508
	Precision	99.50%	99.78%	99.86%
	Sensitivity	94.15%	96.20%	96.60%
	FPR	0.00060%	0.00020%	0.00017%
	FNR	5.85%	3.80%	3.40%
Indels	True positive	261 867	326 810	355 728
	False positive	16 931	22 246	7981
	False negative	107 311	42 391	13 751
	Precision	93.93%	93.63%	97.81%
	Sensitivity	70.93%	88.52%	96.28%
	FPR	0.00067%	0.00069%	0.00032%
	FNR	29.07%	11.48%	3.72%

*The first four metrics were calculated using the rtg-tools software. True positive (TP) is the number of called variants (SNPs or indels) that are found in the high-confidence reference dataset, false positive (FP) is the number of called variants that are not found in the reference dataset, and false negative (FN) is the number of variants that are found in the high-confidence reference dataset but are not found in the call set. Precision is TP/(TP+FP)*100. Sensitivity is TP/(TP+FN)*100. FPR is FP/(total high-confidence region length-TP-FN)*100, where the high-confidence region length equals 2 529 164 928 bp, taken from the GIAB released high-confidence variant datasets [19]. FNR is FN/(FN+TP)*100.
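
The footnote formulas can be checked against the table with a few lines of Python. This is a minimal sketch, not the authors' pipeline: the function name `calling_metrics` and the constant `HIGH_CONF_REGION_BP` are hypothetical names introduced here, and the counts are the BGISEQ-500 PE50 SNP values from Table 4.

```python
# High-confidence region length in bp, as given in the table footnote.
# (The constant name is an assumption made for this illustration.)
HIGH_CONF_REGION_BP = 2_529_164_928


def calling_metrics(tp, fp, fn, region_bp=HIGH_CONF_REGION_BP):
    """Return precision, sensitivity, FPR and FNR as percentages,
    following the formulas in the Table 4 footnote."""
    return {
        "precision": tp / (tp + fp) * 100,
        "sensitivity": tp / (tp + fn) * 100,
        # Negatives are approximated by the region length minus true variants.
        "fpr": fp / (region_bp - tp - fn) * 100,
        "fnr": fn / (fn + tp) * 100,
    }


# SNP counts for BGISEQ-500 PE50, copied from Table 4.
snp_pe50 = calling_metrics(tp=3_006_132, fp=15_203, fn=186_825)
for name, value in snp_pe50.items():
    print(f"{name}: {value:.5f}%")
```

Rounded to the precision used in the table, these values match the BGISEQ-500 PE50 SNP column (99.50%, 94.15%, 0.00060%, 5.85%). Because sensitivity and FNR share the denominator TP+FN, they always sum to 100%.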