Table 1.
Method | Total calls | Observed validations (fraction) | Expected validations (fraction) |
---|---|---|---|
50X coverage | |||
LUMPY (pe + sr) | 4,347 | 2,653 (0.61) | 37.9 ± 1.2 (0.009) |
LUMPY (pe + sr + prior) | 4,809 | 2,706 (0.563) | 41.1 ± 1.3 (0.009) |
LUMPY trio (pe + sr) | 5,108 | 2,660 (0.521) | 31.5 ± 1.1 (0.006) |
LUMPY (pe + sr&rd) | 1,355 | 1,114 (0.822) | 5.4 ± 0.5 (0.001) |
GASVPro | 3,929 | 2,249 (0.572) | 61.1 ± 1.5 (0.016) |
DELLY | 12,272 | 3,127 (0.255) | 219.2 ± 2.9 (0.018) |
Pindel | 7,219 | 2,208 (0.306) | 0.7 ± 0.2 (~0) |
5X coverage | |||
LUMPY (pe + sr) | 643 | 619 (0.963) | 4.9 ± 0.4 (0.008) |
LUMPY (pe + sr + prior) | 840 | 785 (0.935) | 4.3 ± 0.4 (0.005) |
LUMPY trio (pe + sr) | 1,006 | 958 (0.952) | 4.1 ± 0.4 (0.004) |
LUMPY (pe + sr&rd) | 73 | 66 (0.904) | 0.01 ± 0.02 (~0) |
GASVPro | 356 | 338 (0.949) | 10.2 ± 0.6 (0.029) |
DELLY | 798 | 698 (0.875) | 4.5 ± 0.4 (0.006) |
Pindel | 640 | 521 (0.814) | 0.04 ± 0.04 (~0) |
Monte Carlo simulations were performed to assess the rate at which false positive SV calls are validated purely by chance using split-read mapping analysis of PacBio and Moleculo data. For each NA12878 deletion callset shown in Figures 5 and 6, deletion coordinates were shuffled 100 times (retaining the breakpoint interval sizes and total span of each deletion call), and validation experiments were conducted precisely as for real data. For each callset, we show the total number of deletion calls, the number of validated calls (with the fraction validated in parentheses), and the number of validations expected by chance with its 95% confidence interval (with the expected fraction in parentheses), based on the Monte Carlo simulations. pe, paired-end; rd, read-depth; sr, split-read.
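
The shuffling procedure described above can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the single-chromosome coordinate space, the `slop` tolerance, the endpoint-matching validation rule (a stand-in for split-read mapping of long reads), and the normal-approximation confidence interval are all assumptions made for illustration. The sketch preserves only the total span of each call and does not model breakpoint interval sizes.

```python
import random
import statistics


def shuffle_call(span, chrom_length, rng):
    """Place a deletion of the same span at a uniformly random start (assumed scheme)."""
    start = rng.randrange(0, chrom_length - span + 1)
    return (start, start + span)


def validates(call, evidence, slop):
    """Hypothetical validation rule: both endpoints lie within `slop` bp
    of the endpoints of some evidence interval."""
    start, end = call
    return any(abs(start - s) <= slop and abs(end - e) <= slop
               for s, e in evidence)


def expected_validations(calls, evidence, chrom_length,
                         n_shuffles=100, slop=50, seed=0):
    """Estimate how many calls would validate by chance.

    Returns (mean count, 95% CI half-width, expected fraction).
    The half-width uses a normal approximation on the mean, which is
    an assumption; the source does not state how its interval was computed.
    """
    rng = random.Random(seed)
    counts = []
    for _ in range(n_shuffles):
        # Shuffle every call, keeping its span, then validate as for real data.
        shuffled = [shuffle_call(e - s, chrom_length, rng) for s, e in calls]
        counts.append(sum(validates(c, evidence, slop) for c in shuffled))
    mean = statistics.mean(counts)
    half_width = 1.96 * statistics.stdev(counts) / len(counts) ** 0.5
    return mean, half_width, mean / len(calls)


if __name__ == "__main__":
    # Toy inputs, not data from the study.
    calls = [(1_000, 1_500), (20_000, 23_000), (50_000, 50_400)]
    evidence = [(1_010, 1_495), (20_020, 22_980)]
    mean, hw, frac = expected_validations(calls, evidence, chrom_length=1_000_000)
    print(f"{mean:.2f} ± {hw:.2f} ({frac:.3f})")
```

Dividing the mean count by the number of calls gives the expected fraction reported in parentheses in the last column of the table.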