Table 4.
Top 20 descriptors of 4-mer motifs. The top 20 descriptors of the 4-mer motifs are contained in the reference set of 167 DNASDs. The descriptor of the TATA motif is ranked 199th and 98th for the HPL and DPL datasets, respectively.
Rank | HPL: 4-mer motif | HPL: Score | HPL: Included (m = 99) | DPL: 4-mer motif | DPL: Score | DPL: Included (m = 74) |
---|---|---|---|---|---|---|
1 | TGAA | 1000 | + | AAAG | 1000 | + |
2 | TGAT | 941 | + | AAGA | 956 | + |
3 | CCGG | 878 | − | TTCG | 948 | + |
4 | TATG | 843 | + | AGAA | 922 | − |
5 | TGGA | 817 | − | GAAA | 866 | − |
6 | GATG | 770 | + | AAGG | 791 | + |
7 | TCAA | 739 | + | CGCC | 787 | − |
8 | TACA | 702 | + | AGAT | 777 | − |
9 | AGGC | 697 | − | AATA | 759 | − |
10 | ATGA | 694 | + | TCGC | 747 | − |
11 | TTGA | 672 | + | TGAT | 744 | + |
12 | CGGC | 662 | − | TGAA | 732 | − |
13 | CAGG | 651 | − | ATCG | 732 | + |
14 | ATGT | 634 | − | TCGA | 724 | + |
15 | AGCG | 633 | − | CGGT | 724 | − |
16 | CGCG | 629 | − | ATAA | 712 | + |
17 | AGCC | 618 | − | CGAT | 710 | − |
18 | TCAT | 595 | − | CGCG | 703 | − |
19 | GAGC | 592 | + | GAAG | 699 | + |
20 | AGGG | 582 | − | ATAG | 697 | − |
⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
25 (DPL) | | | | AAGT | 642 | |
⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
199 (HPL) / 98 (DPL) | TATA | 111 | − | TATA | 365 | − |
+: included in the set of m DNASDs.
−: not included in the set of m DNASDs.