Table 1.
Motif Type | Count (obs) | End (exp) | p-value | Starting Pref. | Motif Type | Count (obs) | End (exp) | p-value | Starting Pref. |
---|---|---|---|---|---|---|---|---|---|
Dimers | | | | | | | | | |
TA | 48814 | 42186 | 0 | T | GC | 8444 | 7623 | 1E-39 | G |
AT | 35558 | 42186 | | | CG | 6802 | 7623 | | |
GA | 33951 | 31919 | 0 | G/T | AC | 33773 | 29999 | 0 | A/T |
AG | 30185 | 31919 | | | CA | 26535 | 29999 | | |
TC | 40029 | 31919 | | | TG | 35249 | 29999 | | |
CT | 23511 | 31919 | | | GT | 24437 | 29999 | | |
Trimers | | | | | | | | | |
AAC+ | 2728 | 2821 | 1E-78 | T/C | ACT | 598 | 695 | 7E-15 | G/T |
ACA* | 2431 | 2821 | | | CTA+ | 734 | 695 | | |
CAA* | 3380 | 2821 | | | TAC+ | 791 | 695 | | |
GTT* | 2339 | 2821 | | | AGT | 665 | 695 | | |
TGT* | 2657 | 2821 | | | TAG+ | 564 | 695 | | |
TTG* | 3390 | 2821 | | | GTA+ | 815 | 695 | | |
AAG | 5734 | 4486 | 0 | T/A | AGC+ | 1839 | 2823 | 1E-226 | C/G |
AGA* | 3657 | 4486 | | | GCA* | 2363 | 2823 | | |
GAA* | 4278 | 4486 | | | CAG+ | 4131 | 2823 | | |
CTT | 3393 | 4486 | | | GCT | 3115 | 2823 | | |
TCT | 3692 | 4486 | | | TGC+ | 2725 | 2823 | | |
TTC+ | 6161 | 4486 | | | CTG | 2767 | 2823 | | |
AAT* | 4099 | 2937 | 5E-216 | A | AGG+ | 971 | 1039 | 2E-18 | G/T |
ATA+ | 2260 | 2937 | | | GGA* | 1222 | 1039 | | |
TAA* | 2406 | 2937 | | | GAG+ | 988 | 1039 | | |
ATT* | 3533 | 2937 | | | CCT | 905 | 1039 | | |
TAT* | 2233 | 2937 | | | TCC | 1207 | 1039 | | |
TTA | 3093 | 2937 | | | CTC | 940 | 1039 | | |
ACC+ | 855 | 1089 | 4E-40 | C/T | ATC | 1153 | 1404 | 8E-46 | T |
CAC+ | 1057 | 1089 | | | TCA* | 1703 | 1404 | | |
CCA* | 1383 | 1089 | | | CAT* | 1342 | 1404 | | |
GGT | 1034 | 1089 | | | GAT* | 1452 | 1404 | | |
GTG | 921 | 1089 | | | TGA | 1661 | 1404 | | |
TGG* | 1285 | 1089 | | | ATG* | 1111 | 1404 | | |
ACG | 1229 | 1373 | 8E-11 | G | CCG | 703 | 742 | 1E-47 | G |
CGA | 1406 | 1373 | | | CGC+ | 519 | 742 | | |
GAC+ | 1561 | 1373 | | | GGC | 916 | 742 | | |
CGT | 1325 | 1373 | | | CGG | 731 | 742 | | |
TCG | 1267 | 1373 | | | GCG+ | 585 | 742 | | |
GTC | 1452 | 1373 | | | GCC | 995 | 742 | | |
This table shows nonrandom starting nucleotides in both dimers and trimers. Motif type indicates the largest possible repeat identified for each staggered SSR set. P-values were calculated using Pearson's chi-square test against a random expectation, based on the observed and expected frequencies.
The preferential starting base is determined by the highest-frequency SSR in each motif grouping. * indicates highly used codons; + indicates rarely used codons.
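
The test described in the caption can be illustrated with a minimal sketch. In the table, the expected count for each motif group matches the mean of that group's observed counts (for example, (48814 + 35558) / 2 = 42186), so the sketch uses the group mean as the random expectation. The function names `chi_square_sf` and `group_test` are hypothetical, and the choice of degrees of freedom (number of motifs in the group minus one) is an assumption. The source does not state the degrees of freedom or the exact grouping used, so the p-values this produces need not match the published ones exactly.

```python
import math


def chi_square_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df (closed form)."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + sqrt(2/pi) * exp(-x/2)
    #         * sum_{k=1}^{(df-1)/2} x^(k - 1/2) / (1 * 3 * ... * (2k - 1))
    total = 0.0
    term = math.sqrt(x)  # k = 1 term
    for k in range(1, (df - 1) // 2 + 1):
        if k > 1:
            term *= x / (2 * k - 1)
        total += term
    tail = math.sqrt(2.0 / math.pi) * math.exp(-half) * total
    return math.erfc(math.sqrt(half)) + tail


def group_test(counts):
    """Pearson chi-square test for one staggered motif group.

    counts maps each motif in the group to its observed count.
    Returns (expected count, chi-square statistic, p-value, first base of top motif).
    """
    observed = list(counts.values())
    expected = sum(observed) / len(observed)  # random expectation: equal use of each motif
    stat = sum((o - expected) ** 2 / expected for o in observed)
    df = len(observed) - 1  # assumption: not stated in the source
    top_motif = max(counts, key=counts.get)  # highest-frequency SSR in the group
    return expected, stat, chi_square_sf(stat, df), top_motif[0]


if __name__ == "__main__":
    for group in ({"TA": 48814, "AT": 35558}, {"GC": 8444, "CG": 6802}):
        print(group, group_test(group))
```

For the TA/AT group the statistic is so large that the p-value underflows to zero in double precision, which is consistent with the "0" entries in the table. Note that the sketch reports only the first base of the single most frequent motif, whereas the table sometimes lists two bases (for example G/T).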