Author manuscript; available in PMC: 2022 Mar 9.
Published in final edited form as: Curr Protoc Hum Genet. 2020 Sep;107(1):e102. doi: 10.1002/cphg.102

Figure 2.

Identification of transposable element (TE) insertions from short-read sequencing data. Paired-end short reads from an individual carrying a TE insertion are aligned to the reference genome. A TE insertion is detected by identifying two types of read clusters near the insertion breakpoints: (i) discordant reads (reads 1–4), which align uniquely to the flanking regions while their mates align to one of the many reference TE copies located far from the breakpoints; and (ii) clipped or split reads (reads 5–8), which span the insertion breakpoints and therefore show soft-clipped or split alignments to the reference (shown in dotted blue boxes). The change in read depth at a non-reference insertion site is shown at the bottom. Gray dashed lines indicate the boundaries of the target site duplication (TSD).
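
The two read classes in the caption can be illustrated with a minimal Python sketch. It is not the method of any particular TE caller. The `Read` record, the function names, and the thresholds `min_clip` and `max_insert` are hypothetical and chosen only for illustration. The sketch assumes each read is already reduced to a few SAM-style fields (1-based leftmost position, CIGAR string, mate position, proper-pair flag).

```python
import re
from dataclasses import dataclass

# Matches SAM CIGAR operations such as "60M" or "40S".
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


@dataclass
class Read:
    """Hypothetical, simplified alignment record (not a pysam object)."""
    name: str
    chrom: str
    pos: int           # 1-based leftmost aligned reference position
    cigar: str
    mate_chrom: str
    mate_pos: int
    proper_pair: bool  # corresponds to SAM flag 0x2


def clipped_breakpoint(read, min_clip=10):
    """Return the aligned reference base adjacent to a soft clip, or None.

    min_clip is an assumed threshold, not a value from the source.
    """
    ops = [(int(n), op) for n, op in CIGAR_RE.findall(read.cigar)]
    if not ops:
        return None
    # Clip at the left end: the breakpoint is the first aligned base.
    if ops[0][1] == "S" and ops[0][0] >= min_clip:
        return read.pos
    # Clip at the right end: the breakpoint is the last aligned base.
    if ops[-1][1] == "S" and ops[-1][0] >= min_clip:
        ref_len = sum(n for n, op in ops if op in "MDN=X")
        return read.pos + ref_len - 1
    return None


def is_discordant(read, max_insert=1000):
    """True if the mate maps to another chromosome or implausibly far away.

    max_insert is an assumed cutoff; real tools derive it from the library.
    """
    if read.proper_pair:
        return False
    if read.mate_chrom != read.chrom:
        return True
    return abs(read.mate_pos - read.pos) > max_insert


def classify(read):
    """Label a read as 'clipped', 'discordant', or 'concordant'."""
    if clipped_breakpoint(read) is not None:
        return "clipped"
    if is_discordant(read):
        return "discordant"
    return "concordant"


if __name__ == "__main__":
    reads = [
        Read("read1", "chr1", 1000, "100M", "chr5", 50000, False),
        Read("read5", "chr1", 1050, "60M40S", "chr1", 1200, True),
        Read("read6", "chr1", 1095, "40S60M", "chr1", 1300, True),
    ]
    for r in reads:
        print(r.name, classify(r), clipped_breakpoint(r))
```

In this toy input, the right-clipped read ends a few bases downstream of where the left-clipped read begins. That overlap is the kind of signature a TSD would be expected to leave, although a real caller would cluster many reads and check the clipped sequence against TE consensus sequences before reporting an insertion.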