2014 Jun 26;15(6):R84. doi: 10.1186/gb-2014-15-6-r84

Table 1.

Long-read validation rates for each tool relative to randomly permuted data

Method                       Total calls   Observed validations (fraction)   Expected validations (fraction)
50X coverage
  LUMPY (pe + sr)            4,347         2,653 (0.61)                      37.9 ± 1.2 (0.009)
  LUMPY (pe + sr + prior)    4,809         2,706 (0.563)                     41.1 ± 1.3 (0.009)
  LUMPY trio (pe + sr)       5,108         2,660 (0.521)                     31.5 ± 1.1 (0.006)
  LUMPY (pe + sr&rd)         1,355         1,114 (0.822)                     5.4 ± 0.5 (0.001)
  GASVPro                    3,929         2,249 (0.572)                     61.1 ± 1.5 (0.016)
  DELLY                      12,272        3,127 (0.255)                     219.2 ± 2.9 (0.018)
  Pindel                     7,219         2,208 (0.306)                     0.7 ± 0.2 (~0)
5X coverage
  LUMPY (pe + sr)            643           619 (0.963)                       4.9 ± 0.4 (0.008)
  LUMPY (pe + sr + prior)    840           785 (0.935)                       4.3 ± 0.4 (0.005)
  LUMPY trio (pe + sr)       1,006         958 (0.952)                       4.1 ± 0.4 (0.004)
  LUMPY (pe + sr&rd)         73            66 (0.904)                        0.01 ± 0.02 (~0)
  GASVPro                    356           338 (0.949)                       10.2 ± 0.6 (0.029)
  DELLY                      798           698 (0.875)                       4.5 ± 0.4 (0.006)
  Pindel                     640           521 (0.814)                       0.04 ± 0.04 (~0)

Monte Carlo simulations were performed to assess the rate at which false-positive SV calls are validated purely by chance in the split-read mapping analysis of PacBio and Moleculo data. For each NA12878 deletion callset shown in Figures 5 and 6, deletion coordinates were shuffled 100 times (retaining the breakpoint interval sizes and total span of each deletion call), and validation experiments were conducted exactly as for the real data. For each callset, we show the total number of deletion calls, the number of validated calls (with the fraction validated in parentheses), and the number of validations expected by chance with its 95% confidence interval (and the expected fraction in parentheses), based on the Monte Carlo simulations. pe, paired-end; rd, read-depth; sr, split-read.
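The permutation procedure described in the legend can be illustrated with a minimal sketch. This is not the authors' pipeline. It assumes a single chromosome of known length and represents each deletion as a simple (start, end) pair. It also replaces the long-read split-read validation with a hypothetical coordinate-matching function, `is_validated`. The names `shuffle_calls`, `expected_validations`, `chrom_length`, and `slop` are illustrative, and the normal-approximation confidence interval is an assumption, since the legend does not say how the interval was computed.

```python
import random
import statistics


def shuffle_calls(calls, chrom_length, rng):
    """Move each (start, end) call to a random position, keeping its span."""
    shuffled = []
    for start, end in calls:
        span = end - start
        # +1 so that a call may end exactly at the chromosome end.
        new_start = rng.randrange(0, chrom_length - span + 1)
        shuffled.append((new_start, new_start + span))
    return shuffled


def is_validated(call, truth, slop):
    """Hypothetical stand-in for long-read validation: a call counts as
    validated if both breakpoints lie within `slop` bp of a truth interval."""
    start, end = call
    return any(
        abs(start - t_start) <= slop and abs(end - t_end) <= slop
        for t_start, t_end in truth
    )


def expected_validations(calls, truth, chrom_length, n_iter=100, slop=50, seed=0):
    """Estimate how many calls validate by chance under random placement.

    Returns (mean count, 95% CI half-width, expected fraction). The CI uses
    a normal approximation on the mean, which is an assumption of this sketch.
    """
    rng = random.Random(seed)
    counts = []
    for _ in range(n_iter):
        shuffled = shuffle_calls(calls, chrom_length, rng)
        counts.append(sum(is_validated(c, truth, slop) for c in shuffled))
    mean = statistics.mean(counts)
    half_width = 1.96 * statistics.stdev(counts) / len(counts) ** 0.5
    return mean, half_width, mean / len(calls)
```

Calling `expected_validations(calls, truth, chrom_length)` with `n_iter=100` mirrors the 100 shuffles per callset. The returned mean, half-width, and fraction correspond to the three numbers in the "Expected validations (fraction)" column, under the simplifications stated above.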