2023 May 10;6(7):e202301962. doi: 10.26508/lsa.202301962

Table 2.

The top 21 6-mers: Z-DNABERT attention rank versus the 6-mer frequency rank in the experimental datasets tested for tuning the model.

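| Attention rank | 6-mer (hg38, Kouzine et al) | Frequency rank (hg38, Kouzine et al) | 6-mer (hg38, Shin et al) | Frequency rank (hg38, Shin et al) |
|---|---|---|---|---|
| 1 | GCGCGC | 1 | TGTGTG | 1 |
| 2 | GTGTGT | 5 | GTGTGT | 2 |
| 3 | CGCGCG | 2 | CGCGCG | 4 |
| 4 | ACACAC | 6 | GCGCGC | 3 |
| 5 | TGTGTG | 3 | CACACA | 5 |
| 6 | GCGCGG | 7 | ACACAC | 6 |
| 7 | CACACA | 4 | GGGGAA | 40 |
| 8 | CCGCGC | 10 | AAAAAA | 17 |
| 9 | GGGCGC | 11 | CAGGGA | 43 |
| 10 | GCGCCC | 12 | GTGCGC | 11 |
| 11 | GTGCGC | 17 | TGGGGA | 331 |
| 12 | GGCGCG | 9 | GGGGGA | 39 |
| 13 | GTGTGC | 14 | GCTGGG | 9 |
| 14 | GCGCAC | 19 | GTGTGC | 7 |
| 15 | GCACAC | 15 | TGCGCG | 8 |
| 16 | GCCCGC | 20 | TGCATG | 21 |
| 17 | GCGGGC | 16 | GGGAAG | 33 |
| 18 | CGCGCC | 8 | AGGGAG | 429 |
| 19 | GCGTGC | 25 | GGGAGC | 458 |
| 20 | GCACGC | 26 | AGAAAG | 38 |
| 21 | CCCGCG | 18 | GGGAAA | 80 |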

The model based on the experimental data of Kouzine et al was used in the paper, rather than the one based on the much smaller, 150-bp-resolution ChIP-seq data of Shin et al.
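To make the "frequency rank" columns concrete, the following is a minimal Python sketch of one way such a rank could be computed from a set of sequences. It is not the authors' procedure, which is not described in this table. The function name `kmer_frequency_ranks`, the use of overlapping windows on a single strand, the skipping of windows that contain non-ACGT characters, and the alphabetical tie-breaking are all assumptions made for illustration.

```python
from collections import Counter


def kmer_frequency_ranks(sequences, k=6):
    """Rank k-mers by how often they occur (rank 1 = most frequent).

    Hypothetical helper: counts overlapping k-mers on the given strand only.
    """
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        # Slide a window of width k one base at a time (overlapping k-mers).
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # Assumption: ignore windows with ambiguous bases such as N.
            if set(kmer) <= set("ACGT"):
                counts[kmer] += 1
    # Sort by descending count; ties are broken alphabetically (arbitrary choice).
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return {kmer: rank for rank, (kmer, _) in enumerate(ordered, start=1)}


# Toy input, not data from the paper.
toy_regions = ["GCGCGCGC", "TGTGTG"]
print(kmer_frequency_ranks(toy_regions))
# {'GCGCGC': 1, 'CGCGCG': 2, 'TGTGTG': 3}
```

Under these assumptions, a 6-mer's frequency rank could be read from the returned dictionary and placed next to its attention rank, as in the table above. A large gap between the two ranks (for example TGGGGA, attention rank 11 versus frequency rank 331 in the Shin et al columns) marks a 6-mer whose attention rank is far higher than its frequency rank in that dataset.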