2016 Aug 2;33(10):2759–2764. doi: 10.1093/molbev/msw137

Fig. 1.

Overview of PoPoolationTE2. (A) TE insertions (black arrow) result in paired ends (yellow) with one read mapping to a reference chromosome (X) and the other to a TE (copia). One group of such discordantly mapped reads is located to the left of the insertion (forward signature) and one to the right (reverse signature). (B) The absence of a TE insertion results in proper pairs spanning a putative insertion site (green). (C) Mapped paired-end reads may be used to generate a base coverage track (gray) and a physical coverage track (green). Base coverage considers the positions covered by the reads themselves, whereas physical coverage considers the region between the two reads of a pair. (D) TE insertions result in paired ends that support a TE insertion (yellow). This can be translated into an additional type of physical coverage (yellow track). The median distance of proper pairs is used to estimate the distance between such discordant pairs. (E) Increasing the inner distance between paired ends compared with panel D results in more reads supporting a TE insertion (copia) and a higher physical coverage. If paired ends overlap, the physical coverage of the individual paired ends is summed, contributing to the total height of the physical coverage track. Physical coverage supporting the presence (yellow) and absence (green) of a TE may overlap (central region). (F) Combining the information of all paired ends for each genomic position results in a physical coverage track. (G) To homogenize the power to identify TEs, the physical coverage is randomly sampled to equal levels for each genomic position. (H) The position of signatures of TE insertions is determined using a sliding-window approach (black lines on top), and the window with the maximal physical coverage supporting a TE (the red line indicates the window with the highest copia coverage) is used for further analysis.
(I) The population frequency of a TE signature is estimated from the ratio of the average physical coverage supporting a TE to the total physical coverage in a window (copia = 72/(72+18) = 0.8). (J) Matching pairs of TE signatures (forward and reverse) of the same TE family within a given distance are joined, yielding a final set of TE insertions. Final population frequency and position estimates are obtained by averaging the estimates for the forward and reverse signatures. (K) Accuracy of the population frequency estimates for 1,000 TEs in a simulated pooled population. PoPoolationTE2 has a slight upward bias for intermediate-frequency TEs and a slight downward bias for high-frequency TEs. (L) Accuracy of insertion position estimates for 1,000 TEs in a simulated pooled population.
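
The steps in panels F, H, I, and J can be illustrated with a minimal Python sketch. This is not the PoPoolationTE2 implementation: all function names, the half-open span representation, and the toy spans are assumptions made for illustration, and the subsampling step of panel G is omitted. The toy data are chosen only to reproduce the worked ratio 72/(72+18) from panel I.

```python
from statistics import mean

def physical_coverage(spans, length):
    """Per-position count of half-open [start, end) spans that cover it."""
    track = [0] * length
    for start, end in spans:
        for pos in range(max(start, 0), min(end, length)):
            track[pos] += 1
    return track

def best_window(te_track, window):
    """Start of the first window with the highest summed TE-supporting coverage."""
    starts = range(len(te_track) - window + 1)
    return max(starts, key=lambda s: sum(te_track[s:s + window]))

def signature_frequency(te_track, absence_track, start, window):
    """Mean TE-supporting coverage divided by mean total coverage in the window."""
    te = mean(te_track[start:start + window])
    absence = mean(absence_track[start:start + window])
    total = te + absence
    return te / total if total else 0.0

def join_signatures(forward, reverse):
    """Average the (position, frequency) estimates of a matched signature pair."""
    (fpos, ffreq), (rpos, rfreq) = forward, reverse
    return (fpos + rpos) / 2, (ffreq + rfreq) / 2

# Hypothetical toy data: 72 pairs supporting a TE, 18 proper pairs spanning the site.
te_track = physical_coverage([(10, 20)] * 72, length=30)
absence_track = physical_coverage([(5, 25)] * 18, length=30)

start = best_window(te_track, window=5)
freq = signature_frequency(te_track, absence_track, start, window=5)
print(start, freq)  # 10 0.8
```

Summing coverage over a fixed-width window ranks windows the same way as averaging it, so `best_window` uses the sum. In this sketch, `join_signatures` would be applied once to each matched forward and reverse signature of the same TE family.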