Table 1.
Subtype | Feature | Gain | Cover | Frequency | Subtype | Feature | Gain | Cover | Frequency |
---|---|---|---|---|---|---|---|---|---|
I-A | length | 0.16 | 0.18 | 0.11 | III-F | AAACA | 0.16 | 0.62 | 0.21 |
I-A | AATTG | 0.15 | 0.12 | 0.02 | III-F | ACAAG | 0.14 | 0.02 | 0.13 |
I-A | gc | 0.07 | 0.01 | 0.10 | III-F | CTTCC | 0.15 | 0.01 | 0.12 |
I-A | AGAAT | 0.05 | 0.00 | 0.01 | III-F | CTGGA | 0.17 | 0.00 | 0.13 |
I-A | CGATA | 0.03 | 0.00 | 0.01 | III-F | CCAGC | 0.13 | 0.00 | 0.04 |
I-B | gc | 0.19 | 0.12 | 0.08 | IV-A | CCCCC | 0.20 | 0.15 | 0.05 |
I-B | length | 0.31 | 0.07 | 0.08 | IV-A | GGTTA | 0.06 | 0.09 | 0.04 |
I-B | CATCA | 0.03 | 0.04 | 0.01 | IV-A | CGATA | 0.18 | 0.08 | 0.03 |
I-B | GGTAC | 0.02 | 0.04 | 0.00 | IV-A | length | 0.05 | 0.00 | 0.08 |
I-B | AGGCG | 0.02 | 0.01 | 0.00 | IV-A | gc | 0.06 | 0.00 | 0.09 |
I-C | GCGAC | 0.36 | 0.09 | 0.01 | IV-C | CTAGA | 0.07 | 0.29 | 0.10 |
I-C | length | 0.12 | 0.08 | 0.11 | IV-C | TTGCA | 0.23 | 0.13 | 0.14 |
I-C | GTGGA | 0.06 | 0.04 | 0.01 | IV-C | CCTAG | 0.06 | 0.08 | 0.05 |
I-C | ATCCA | 0.05 | 0.03 | 0.01 | IV-C | TGCAA | 0.41 | 0.07 | 0.33 |
I-C | gc | 0.06 | 0.02 | 0.10 | IV-C | palIdx | 0.16 | 0.03 | 0.19 |
I-D | length | 0.12 | 0.11 | 0.07 | V-A | GTAGA | 0.57 | 0.30 | 0.14 |
I-D | AATCC | 0.06 | 0.08 | 0.02 | V-A | GTCTA | 0.04 | 0.21 | 0.07 |
I-D | CGGGA | 0.03 | 0.02 | 0.01 | V-A | AAATT | 0.16 | 0.01 | 0.12 |
I-D | gc | 0.06 | 0.01 | 0.07 | V-A | CTAAG | 0.05 | 0.00 | 0.06 |
I-D | ATCCC | 0.06 | 0.01 | 0.02 | V-A | TTAAA | 0.02 | 0.00 | 0.07 |
I-E | length | 0.24 | 0.09 | 0.09 | V-B1 | AAGCT | 0.18 | 0.34 | 0.11 |
I-E | TCCCC | 0.51 | 0.08 | 0.01 | V-B1 | AAAGC | 0.10 | 0.26 | 0.06 |
I-E | CGGAG | 0.04 | 0.07 | 0.01 | V-B1 | TGCCA | 0.10 | 0.02 | 0.06 |
I-E | gc | 0.03 | 0.07 | 0.08 | V-B1 | AACGG | 0.11 | 0.01 | 0.07 |
I-E | CCCGC | 0.04 | 0.05 | 0.02 | V-B1 | gc | 0.09 | 0.00 | 0.12 |
I-F | CTGCC | 0.58 | 0.15 | 0.03 | V-B2 | CAACC | 0.12 | 0.88 | 0.26 |
I-F | TCATC | 0.05 | 0.09 | 0.01 | V-B2 | AACCC | 0.11 | 0.05 | 0.10 |
I-F | CCATC | 0.03 | 0.08 | 0.00 | V-B2 | GCGAA | 0.08 | 0.01 | 0.05 |
I-F | length | 0.14 | 0.08 | 0.10 | V-B2 | CGCGA | 0.26 | 0.00 | 0.18 |
I-F | TCTAA | 0.03 | 0.01 | 0.01 | V-B2 | GCACA | 0.06 | 0.00 | 0.03 |
I-G | CAATG | 0.24 | 0.10 | 0.02 | V-F | gc | 0.21 | 0.12 | 0.13 |
I-G | length | 0.08 | 0.09 | 0.08 | V-F | GTTAA | 0.06 | 0.06 | 0.04 |
I-G | gc | 0.05 | 0.01 | 0.09 | V-F | length | 0.08 | 0.01 | 0.08 |
I-G | CTTCA | 0.15 | 0.01 | 0.02 | V-F | palIdx | 0.08 | 0.00 | 0.13 |
I-G | CCTCA | 0.06 | 0.00 | 0.02 | V-F | CATTC | 0.07 | 0.00 | 0.02 |
II-A | AAAAC | 0.29 | 0.11 | 0.04 | V-K | GTTGA | 0.22 | 0.22 | 0.07 |
II-A | length | 0.12 | 0.07 | 0.05 | V-K | length | 0.09 | 0.04 | 0.07 |
II-A | TCTAA | 0.06 | 0.04 | 0.02 | V-K | CTTTC | 0.10 | 0.01 | 0.08 |
II-A | gc | 0.06 | 0.04 | 0.08 | V-K | CCTCC | 0.09 | 0.01 | 0.06 |
II-A | ACTCT | 0.05 | 0.03 | 0.01 | V-K | gc | 0.06 | 0.00 | 0.11 |
II-B | ATAAT | 0.08 | 0.35 | 0.06 | V-U1 | ATGAG | 0.23 | 0.71 | 0.26 |
II-B | ACTGA | 0.12 | 0.16 | 0.05 | V-U1 | GGTTA | 0.13 | 0.01 | 0.08 |
II-B | length | 0.11 | 0.09 | 0.10 | V-U1 | CATTA | 0.13 | 0.01 | 0.08 |
II-B | CCCTC | 0.11 | 0.01 | 0.02 | V-U1 | AGCAG | 0.13 | 0.00 | 0.08 |
II-B | AATAA | 0.09 | 0.00 | 0.03 | V-U1 | ATTAA | 0.13 | 0.00 | 0.16 |
II-C | length | 0.39 | 0.24 | 0.06 | V-U2 | AAGCT | 0.06 | 0.10 | 0.09 |
II-C | TAAAA | 0.06 | 0.05 | 0.02 | V-U2 | TCGAA | 0.07 | 0.05 | 0.08 |
II-C | CTACA | 0.04 | 0.04 | 0.01 | V-U2 | CCAAG | 0.13 | 0.03 | 0.08 |
II-C | AAATG | 0.02 | 0.02 | 0.01 | V-U2 | GAATC | 0.27 | 0.03 | 0.13 |
II-C | AAAAT | 0.02 | 0.01 | 0.02 | V-U2 | palIdx | 0.05 | 0.00 | 0.06 |
III-A | AGGGG | 0.06 | 0.08 | 0.02 | V-U4 | CGGAC | 0.11 | 0.28 | 0.07 |
III-A | gc | 0.08 | 0.08 | 0.07 | V-U4 | CGGTC | 0.12 | 0.16 | 0.12 |
III-A | CCGTC | 0.12 | 0.05 | 0.01 | V-U4 | palIdx | 0.05 | 0.01 | 0.06 |
III-A | CGAGA | 0.03 | 0.00 | 0.00 | V-U4 | gc | 0.19 | 0.00 | 0.15 |
III-A | CGGAA | 0.05 | 0.00 | 0.00 | V-U4 | length | 0.06 | 0.00 | 0.05 |
III-B | gc | 0.08 | 0.08 | 0.07 | VI-A | ACCTC | 0.04 | 0.18 | 0.06 |
III-B | GGCCA | 0.04 | 0.07 | 0.01 | VI-A | AGTCC | 0.05 | 0.03 | 0.05 |
III-B | TCCGA | 0.05 | 0.05 | 0.01 | VI-A | ATAAT | 0.04 | 0.01 | 0.04 |
III-B | length | 0.05 | 0.04 | 0.05 | VI-A | GGATA | 0.32 | 0.01 | 0.05 |
III-B | ATTAA | 0.04 | 0.03 | 0.01 | VI-A | GATAA | 0.04 | 0.00 | 0.02 |
III-C | AGGAT | 0.07 | 0.08 | 0.03 | VI-B | GGGTA | 0.13 | 0.25 | 0.05 |
III-C | gc | 0.07 | 0.01 | 0.10 | VI-B | TGCAA | 0.10 | 0.12 | 0.02 |
III-C | palIdx | 0.09 | 0.01 | 0.11 | VI-B | CCAAC | 0.07 | 0.03 | 0.05 |
III-C | CAAGG | 0.09 | 0.01 | 0.02 | VI-B | CTTCA | 0.06 | 0.00 | 0.04 |
III-C | AGATA | 0.04 | 0.00 | 0.01 | VI-B | AGAGC | 0.09 | 0.00 | 0.02 |
III-D | length | 0.11 | 0.11 | 0.05 | VI-C | TCCAA | 0.39 | 0.70 | 0.34 |
III-D | GCACC | 0.03 | 0.04 | 0.00 | VI-C | AAACG | 0.07 | 0.12 | 0.09 |
III-D | gc | 0.06 | 0.02 | 0.09 | VI-C | GACTA | 0.14 | 0.10 | 0.14 |
III-D | palIdx | 0.02 | 0.01 | 0.06 | VI-C | CCCTC | 0.07 | 0.00 | 0.02 |
III-D | ATTGA | 0.01 | 0.00 | 0.01 | VI-C | CCTCG | 0.09 | 0.00 | 0.05 |
III-E | CTAGA | 0.15 | 0.08 | 0.16 | VI-D | ACTAG | 0.35 | 1.00 | 0.50 |
III-E | CTAGC | 0.10 | 0.03 | 0.09 | VI-D | gc | 0.32 | 0.00 | 0.21 |
III-E | CAATC | 0.21 | 0.00 | 0.13 | VI-D | GTCTA | 0.14 | 0.00 | 0.13 |
III-E | ATGCC | 0.10 | 0.00 | 0.06 | VI-D | CTAAA | 0.13 | 0.00 | 0.08 |
III-E | GCGGA | 0.10 | 0.00 | 0.06 | VI-D | palIdx | 0.03 | 0.00 | 0.04 |
For each subtype, the five highest-gain features are listed. The three highest gains overall were “CTGCC” for I-F (0.58), “GTAGA” for V-A (0.57), and “TCCCC” for I-E (0.51).
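
The overall ranking by gain can be checked with a short script. The following is a minimal sketch, assuming a handful of rows from Table 1 have been transcribed by hand into tuples; the helper name `top_by_gain` and the tuple layout are illustrative choices, not part of the original analysis.

```python
# A few rows copied from Table 1: (subtype, feature, gain, cover, frequency).
# Only a subset is transcribed here; the full table would be handled the same way.
rows = [
    ("I-E", "length", 0.24, 0.09, 0.09),
    ("I-E", "TCCCC", 0.51, 0.08, 0.01),
    ("I-F", "CTGCC", 0.58, 0.15, 0.03),
    ("I-F", "length", 0.14, 0.08, 0.10),
    ("IV-C", "TGCAA", 0.41, 0.07, 0.33),
    ("V-A", "GTAGA", 0.57, 0.30, 0.14),
    ("V-A", "AAATT", 0.16, 0.01, 0.12),
]


def top_by_gain(rows, n=3):
    """Return the n rows with the largest gain, highest first (hypothetical helper)."""
    return sorted(rows, key=lambda r: r[2], reverse=True)[:n]


top3 = top_by_gain(rows)
for subtype, feature, gain, _cover, _freq in top3:
    print(f"{subtype}\t{feature}\t{gain:.2f}")
```

On this subset the ranking is CTGCC (I-F), GTAGA (V-A), then TCCCC (I-E), which matches the caption. The next-largest value among the transcribed rows is TGCAA for IV-C at 0.41.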