2010 Dec 3;11:691. doi: 10.1186/1471-2164-11-691

Table 1.

Abundance and Starting Nucleotide Preference for dimer and trimer loci in D. pulex

Dimers
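| Motif type | Count (obs) | Count (exp) | p-value | Starting pref. | Motif type | Count (obs) | Count (exp) | p-value | Starting pref. |
|---|---|---|---|---|---|---|---|---|---|
| TA | 48814 | 42186 | 0 | T | GC | 8444 | 7623 | 1E-39 | G |
| AT | 35558 | 42186 | | | CG | 6802 | 7623 | | |
| GA | 33951 | 31919 | 0 | G/T | AC | 33773 | 29999 | 0 | A/T |
| AG | 30185 | 31919 | | | CA | 26535 | 29999 | | |
| TC | 40029 | 31919 | | | TG | 35249 | 29999 | | |
| CT | 23511 | 31919 | | | GT | 24437 | 29999 | | |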

Trimers

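| Motif type | Count (obs) | Count (exp) | p-value | Starting pref. | Motif type | Count (obs) | Count (exp) | p-value | Starting pref. |
|---|---|---|---|---|---|---|---|---|---|
| AAC+ | 2728 | 2821 | 1E-78 | T/C | ACT | 598 | 695 | 7E-15 | G/T |
| ACA* | 2431 | 2821 | | | CTA+ | 734 | 695 | | |
| CAA* | 3380 | 2821 | | | TAC+ | 791 | 695 | | |
| GTT* | 2339 | 2821 | | | AGT | 665 | 695 | | |
| TGT* | 2657 | 2821 | | | TAG+ | 564 | 695 | | |
| TTG* | 3390 | 2821 | | | GTA+ | 815 | 695 | | |
| AAG | 5734 | 4486 | 0 | T/A | AGC+ | 1839 | 2823 | 1E-226 | C/G |
| AGA* | 3657 | 4486 | | | GCA* | 2363 | 2823 | | |
| GAA* | 4278 | 4486 | | | CAG+ | 4131 | 2823 | | |
| CTT | 3393 | 4486 | | | GCT | 3115 | 2823 | | |
| TCT | 3692 | 4486 | | | TGC+ | 2725 | 2823 | | |
| TTC+ | 6161 | 4486 | | | CTG | 2767 | 2823 | | |
| AAT* | 4099 | 2937 | 5E-216 | A | AGG+ | 971 | 1039 | 2E-18 | G/T |
| ATA+ | 2260 | 2937 | | | GGA* | 1222 | 1039 | | |
| TAA* | 2406 | 2937 | | | GAG+ | 988 | 1039 | | |
| ATT* | 3533 | 2937 | | | CCT | 905 | 1039 | | |
| TAT* | 2233 | 2937 | | | TCC | 1207 | 1039 | | |
| TTA | 3093 | 2937 | | | CTC | 940 | 1039 | | |
| ACC+ | 855 | 1089 | 4E-40 | C/T | ATC | 1153 | 1404 | 8E-46 | T |
| CAC+ | 1057 | 1089 | | | TCA* | 1703 | 1404 | | |
| CCA* | 1383 | 1089 | | | CAT* | 1342 | 1404 | | |
| GGT | 1034 | 1089 | | | GAT* | 1452 | 1404 | | |
| GTG | 921 | 1089 | | | TGA | 1661 | 1404 | | |
| TGG* | 1285 | 1089 | | | ATG* | 1111 | 1404 | | |
| ACG | 1229 | 1373 | 8E-11 | G | CCG | 703 | 742 | 1E-47 | G |
| CGA | 1406 | 1373 | | | CGC+ | 519 | 742 | | |
| GAC+ | 1561 | 1373 | | | GGC | 916 | 742 | | |
| CGT | 1325 | 1373 | | | CGG | 731 | 742 | | |
| TCG | 1267 | 1373 | | | GCG+ | 585 | 742 | | |
| GTC | 1452 | 1373 | | | GCC | 995 | 742 | | |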

This table shows nonrandom starting nucleotides in both dimers and trimers. Motif type indicates the largest possible repeat identified for each staggered SSR set. P-values were calculated using Pearson's chi-square test against the random expectation, based on the observed and expected frequencies.

The preferential starting base is determined by the highest-frequency SSR in each motif grouping. * indicates highly used codons; + indicates rarely used codons.
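
As a rough illustration of the test described in the footnote, the Python sketch below recomputes the expected count and the Pearson chi-square statistic for the two two-motif dimer groups in the table. It assumes the random expectation is an equal count for every motif in a group (the group mean). That assumption reproduces the expected values listed in the table, but it is an inference rather than a stated method. The function names are hypothetical, and the sketch stops at the statistic and degrees of freedom without converting them to a p-value.

```python
# Observed counts for the two-motif dimer groups, copied from Table 1.
DIMER_GROUPS = {
    "TA": {"TA": 48814, "AT": 35558},
    "GC": {"GC": 8444, "CG": 6802},
}


def chi_square_uniform(observed):
    """Return (expected, statistic, dof) for a test against equal counts.

    Assumption: every motif in a group is expected to occur equally often,
    so the expected count is the mean of the observed counts.
    """
    counts = list(observed.values())
    expected = sum(counts) / len(counts)
    statistic = sum((c - expected) ** 2 / expected for c in counts)
    return expected, statistic, len(counts) - 1


def preferred_start(observed):
    """Return the first base of the most frequent motif in the group."""
    return max(observed, key=observed.get)[0]


for name, observed in DIMER_GROUPS.items():
    expected, statistic, dof = chi_square_uniform(observed)
    print(name, round(expected), round(statistic, 1), dof,
          preferred_start(observed))
```

For groups of four or six motifs the table sometimes lists two starting bases (for example G/T); `preferred_start` returns only the first base of the single most frequent motif, so it is a simplification of the rule in the footnote.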