Figure 2. Intron gain in response to low-complexity sequence in the gene CG42594.
Although the exact sequence of this highly variable region in the common ancestor of D. melanogaster and D. ananassae is not known, it is plausible that a single-nucleotide deletion within the QSGQSG amino acid repeat (blue shading) generated the canonical 5′ splice site CAG | GTGAGT used by this phase 0 intron. Similarly, the CAG repeat (encoding poly-Q sequence) is a potent 3′ splice acceptor site [45]. Sequence conservation across all species is indicated with light shading. The novel intron (denoted by < >) is highly variable in length across all species of the melanogaster group.