letter
. 2004 Jan;14(1):67–78. doi: 10.1101/gr.1715204

Figure 2.

Donor and acceptor sites flanking long introns have more information. (A) The graphs show the information profiles for regions around the donor and acceptor sites of two sets of introns: those with lengths between 64 and 80 nt and those with lengths >8191 nt; N denotes the number of introns in each set. The set of longer introns had elevated levels of information, in particular at donor position D + 4 and acceptor positions A - 6 and A - 5 (indicated by arrows). The average standard deviation at each nucleotide position is 0.01 bits (64-80 nt) and 0.05 bits (>8191 nt; see Appendix 2 for information standard deviation calculations). The cumulative information over positions -32 to +32 for introns of 64-80 nt is 9.57 ± 0.08 bits (donor) and 10.00 ± 0.08 bits (acceptor), and for introns >8191 nt is 11.80 ± 0.34 bits (donor) and 14.27 ± 0.40 bits (acceptor; also see Table 3; Appendix 3). The cumulative information for donor and acceptor sites of the longer introns is significantly higher than that of the shorter introns (p > 0.99 by one-tailed t-test). (B) The graphs show the nucleotide compositions for the sets of introns used in A. These illustrate that the elevated information resulted from stronger preferences for A at D + 4 and for U at A - 6 and A - 5 (indicated by arrows). The pyrimidine (C or U) tract upstream of the acceptor site was broader and more pronounced in the set of longer introns. The maximum standard deviation in the frequency at each nucleotide position is 0.011 (64-80 nt) and 0.041 (>8191 nt; see Appendix 2 for standard deviation calculations for nucleotide distributions).
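
For readers unfamiliar with information profiles, the following is a minimal sketch of how per-position information and its cumulative sum could be computed from a set of aligned splice-site sequences. It assumes the common Shannon formulation with a uniform background (2 bits minus the positional entropy for a four-letter alphabet) and applies no small-sample correction; the article's own calculation (see its appendices) may differ. The function name `information_profile` and the toy sequences are hypothetical and are not data from the figure.

```python
import math
from collections import Counter


def information_profile(sites, alphabet="ACGU"):
    """Return per-position information (bits) for equal-length aligned sites.

    Assumption: information = log2(|alphabet|) - Shannon entropy at each
    position, with a uniform background and no small-sample correction.
    """
    if not sites:
        raise ValueError("at least one aligned site is required")
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all aligned sites must have the same length")

    max_bits = math.log2(len(alphabet))
    profile = []
    for pos in range(length):
        counts = Counter(site[pos] for site in sites)
        total = sum(counts[base] for base in alphabet)
        if total == 0:
            raise ValueError(f"no valid nucleotides at position {pos}")
        entropy = 0.0
        for base in alphabet:
            if counts[base]:
                freq = counts[base] / total
                entropy -= freq * math.log2(freq)
        profile.append(max_bits - entropy)
    return profile


# Hypothetical toy alignment of donor-like sites (not taken from the article).
toy_sites = ["GUAAGU", "GUAAGU", "GUGAGU", "GUAAGA"]
profile = information_profile(toy_sites)

# Cumulative information is the sum of the per-position values.
cumulative = sum(profile)
```

A fully conserved position contributes the maximum of 2 bits, while a position with equal frequencies of all four nucleotides contributes 0 bits, so a stronger preference for a single nucleotide (such as A at D + 4) shows up as a higher value in the profile.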