2021 Feb 17;12:1077. doi: 10.1038/s41467-021-21395-x

Fig. 2. Characteristics and validation statistics of SV calls by Aquila.

a Size frequency distributions of insertion calls and deletion calls in both individuals (L3 for NA12878 and L5 for NA24385). Black areas represent indels of which at least 80% are a close match to Alu-element consensus sequences. b Call validation rates by three validation strategies for two libraries (L2 and L3; L5 and L6) per individual; SVs called in both libraries are in the overlap, flanked by SVs unique to each library. c Overlap analysis and comparison of three validation strategies, by call and individual; numbers inside the Venn diagrams are counts of SVs. SVs are validated by PacBio data from the same individual (PacBio), by presence in the other individual (Both Individuals), or by presence in the chimp or orang genome (Apes). Overlaps represent two or more of these criteria fulfilled. d, e Comparative precision of SVs present in both individuals, as a function of validation by three validation strategies (d) or sequence class (e). Bar graphs depict counts of SVs that have precisely the same breakpoint coordinates in both individuals (0 bp), that differ by fewer than 10 bp (1–9 bp), or that differ by 10 bp or more (≥10 bp). “Repeats” class includes simple sequence and tandem repeats but not mobile elements; “Other” class includes all SVs that do not overlap more than 80% with Alus and are not part of the Repeats class. f Inference of the actual molecular mechanism that produced the SV by expanding the alignment between the reference sequence (Ref) and the alternate allele call from the individual (Ind) to include chimp or orang sequences; the sequence (reference or alternate) that matches the ape is the ancestral allele. “Actual insertion” and “Actual deletion” refer to the molecular mechanism that produced the derived allele. Approximately 45% of deletion and 24% of insertion calls are thus ‘inverted’ (blue arrows). g Size frequency distributions of actual insertions and actual deletions in both individuals. Black areas represent indels of which at least 80% are a close match to the Alu-element consensus sequence. The peak at around 330 base pairs captures nearly all Alu SVs.
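
The inversion logic described for panel f can be illustrated with a minimal Python sketch. It is not taken from Aquila; the function name `infer_mechanism` and its parameters `call_type` and `ape_matches` are hypothetical. It assumes a prior alignment step has already decided whether the ape sequence matches the reference or the alternate allele.

```python
def infer_mechanism(call_type: str, ape_matches: str) -> str:
    """Return the actual mechanism ('insertion' or 'deletion') behind an SV call.

    call_type   -- 'insertion' or 'deletion', as called relative to the reference
    ape_matches -- 'ref' if the ape sequence matches the reference allele,
                   'alt' if it matches the individual's alternate allele
    Both arguments are illustrative assumptions, not part of any published API.
    """
    if call_type not in ("insertion", "deletion"):
        raise ValueError(f"unknown call type: {call_type!r}")
    if ape_matches not in ("ref", "alt"):
        raise ValueError(f"unknown ape match: {ape_matches!r}")

    if ape_matches == "ref":
        # The reference carries the ancestral allele, so the individual's
        # allele is derived and the call already names the actual mechanism.
        return call_type

    # The alternate allele is ancestral, so the event happened on the
    # reference side and the call must be inverted.
    return "deletion" if call_type == "insertion" else "insertion"


if __name__ == "__main__":
    # A deletion call whose alternate allele matches the ape is an actual insertion.
    print(infer_mechanism("deletion", "alt"))
```

Under these assumptions, a call is ‘inverted’ exactly when the ape matches the alternate allele; calls with no usable ape alignment are outside the scope of this sketch.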