Table 2.
Count (a) | Motif (number of sites in genome (b) / site usage (c)) |
---|---|
78 | ttAAAA (6844770/1.14) |
61 | atAAAA (6115727/0.97) |
37 | ctAAAA (3427545/1.08) |
36 | atAAGA (2040242/1.76) |
35 | ttAAGA (2248938/1.55) |
32 | aaAAAA (8006008/0.40) taAAAA (7446893/0.43) |
23 | gtAAGA (1266992/1.82) |
20 | ttAAAG (2636960/0.75) caAAAA (5905656/0.34) |
19 | aaAAGA (5553248/0.34) |
18 | gtAAAA (2525766/0.58) |
17 | (2940505/0.58) ctAAGA (1418507/1.20) |
13 | caAAGA (2875842/0.45) |
12 | gaAAAA (5451601/0.22) |
11 | aaAGAA (6061956/0.18) |
10 | taAAAG (3079886/0.32) atAAAG (2801071/0.36) atAGAA (2476795/0.40) tgAAAA (4505577/0.22) |
9 | gaAAGA (3257715/0.28) ctAGAA (1925300/0.47) agAAAA (6851406/0.13) |
8 | ctAAAG (1465413/0.48) aaAAAG (5450737/0.15) |
7 | taAGAA (2719394/0.26) gaAGAA (3200992/0.22) atGAAA (3793799/0.18) |
6 | ccAAAA (2952060/0.20) tgAAGA (2585632/0.23) ttAGAA (2603330/0.23) |
5 | gaAAAG gtAAAG acAAAA tcAAAA atAATA ttGAAA gtAGAA |
4 | tcAGAA gcAAAG ctAAAT aaAAAT ttAAAT acAAGA agAAGA tcAAAG tgAAAG ttTAAA caAAAG |
3 | ggAAAA tgAGAA ggAAAG gtAATA ccAAGA atAAAT ggAGAA atAACA tcAAGA ctGAAA aaAATG acAAAG ttAATA acAGAA |
2 | aaAAGT gcAAAA gaGAAA gtTAAA taAAAT caAATA cgAAAA aaATAT ttAACA aaGTTA aaTAAA aaGAAA ggGAAA atTAAA ggAAGA |
1 | taAAGC atACAA taAAGG aaATTG cgCTTT ttGGGA aaAGTT agAGAA aaTGAT acCTTC aaAAAC agGAGA gaGCCC taAAAC acTAAA gtGAAA caAGAA atAAAC tcAATA agAATA aaATGA ttAAAC aaGTCA agGAAA atAGGA atAGGC gcAGAA tgAGCA tcGAAT aaCCAC caAATC gtAAGG aaAACT acAAGC aaAGAG agCTGT agTTGT aaGCAG caGAAA gaAAAC ccAAAT tgGGGG ctAATA aaGGTC atTAGA ctTAAA taTTTA agATTC atAGAT gtAAAC aaAATA ttATAA ttTAGA taAATA aaAATT aaATCA caAAGG agAAAG ccAGAA taGAAA taAATT agAAAT aaATCT aaTTGG gcAAGA ttCAAA atTAAT aaGTGC aaCACA aaAAGC atGCCT ggCCTA agATGT tgTATT aaACAT |
(a) Occurrence of each motif among the 800 polymorphic Alu loci.
(b) Occurrence of the motif in the human genome based on UCSC hg15, with both strands considered. The second and third bases in the motif mark the first nick site by EN. For the motif “aaAAAA”, the genome count does not include every position obtained by shifting 1 bp at a time within a run of “A”. Instead, for “A” runs, the count is the number of positions obtained by shifting 6 bp at a time. The eight sites following the “NT-AARA” motif are underlined.
(c) Site usage is the observed occurrence per 1 × 10⁵ genomic sites. Site counts and site usage are shown only for motifs with more than 5 occurrences among the 800 polymorphic Alu loci.
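
The arithmetic behind the site-usage column, and the 6-bp-shift counting rule described for “A” runs, can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names `reverse_complement`, `count_sites`, and `site_usage` are hypothetical, and the use of non-overlapping string matching on the forward strand plus its reverse complement is an assumption about how "both strands" and "shifting 6 bp at a time" might be implemented. The counts and genome totals are taken from the table above.

```python
# Hypothetical helpers for illustration only; not taken from the source.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(motif: str) -> str:
    """Return the reverse complement of a DNA motif (upper-cased)."""
    return motif.upper().translate(_COMPLEMENT)[::-1]


def count_sites(sequence: str, motif: str) -> int:
    """Count motif sites on both strands of `sequence`.

    Assumption: str.count() scans for non-overlapping matches, so a run of
    twelve A's contributes two "AAAAAA" sites (6-bp shifts) rather than
    seven (1-bp shifts).
    """
    seq = sequence.upper()
    forward = motif.upper()
    return seq.count(forward) + seq.count(reverse_complement(forward))


def site_usage(observed: int, genome_sites: int, per_sites: int = 100_000) -> float:
    """Observed insertions per `per_sites` genomic sites."""
    return observed / genome_sites * per_sites


if __name__ == "__main__":
    # (motif, count among the 800 loci, number of sites in genome), from the table.
    rows = [
        ("ttAAAA", 78, 6844770),
        ("ctAAAA", 37, 3427545),
        ("atAAGA", 36, 2040242),
        ("gtAAGA", 23, 1266992),
    ]
    for motif, observed, sites in rows:
        print(f"{motif}: {site_usage(observed, sites):.2f}")

    # A run of 12 A's holds two non-overlapping "aaAAAA" sites.
    print(count_sites("A" * 12, "aaAAAA"))
```

For the four rows used here, the computed values round to the site usage listed in the table (1.14, 1.08, 1.76, and 1.82).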