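Table 1. Training time, input dimensions, and classification performance (Acc: accuracy; Sn: sensitivity; Sp: specificity; MCC: Matthews correlation coefficient) for each encoding technique and k-mer size.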
Technique | Training time (min) | Input vector size | Acc | Sn | Sp | MCC |
---|---|---|---|---|---|---|
**1-mer** | | | | | | |
One Hot Encoding | 28 | 4 × 1000 | 0.95 | 0.98 | 0.90 | 0.90 |
Frequency Based Tokenization | 14 | 1 × 1000 | 0.97 | 0.98 | 0.99 | 0.97 |
**2-mer** | | | | | | |
One Hot Encoding | 240 | 16 × 999 | 0.96 | 0.97 | 0.95 | 0.93 |
Frequency Based Tokenization | 14.3 | 1 × 999 | 0.96 | 0.98 | 0.93 | 0.89 |
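
The input vector sizes in Table 1 follow from the k-mer size: a sequence of length 1000 yields 1000 − k + 1 overlapping k-mers, and a one-hot encoding needs one row per possible k-mer (4^k for a DNA alphabet), whereas tokenization needs a single row of integer ids. The sketch below illustrates where these shapes come from. It is not the authors' implementation: the function names, the stride of 1, the `ACGT` alphabet, and the rule that the most frequent k-mer receives id 1 are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def kmers(seq, k):
    """Overlapping k-mers with stride 1 (assumed): length L gives L - k + 1 k-mers."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def one_hot(seq, k, alphabet="ACGT"):
    """Return a (4**k) x (L - k + 1) matrix as a list of rows.

    Each column holds a single 1 in the row of the k-mer at that position.
    """
    vocab = ["".join(p) for p in product(alphabet, repeat=k)]
    index = {kmer: row for row, kmer in enumerate(vocab)}
    tokens = kmers(seq, k)
    matrix = [[0] * len(tokens) for _ in vocab]
    for col, kmer in enumerate(tokens):
        matrix[index[kmer]][col] = 1
    return matrix


def frequency_tokens(seq, k):
    """Return a 1 x (L - k + 1) vector of integer ids.

    Assumption: ids are assigned by descending frequency (most frequent
    k-mer gets id 1), with ties broken alphabetically.
    """
    tokens = kmers(seq, k)
    counts = Counter(tokens)
    ranked = sorted(counts, key=lambda t: (-counts[t], t))
    ids = {kmer: rank for rank, kmer in enumerate(ranked, start=1)}
    return [ids[t] for t in tokens]


# Hypothetical 1000-nucleotide sequence, used only to check the shapes.
seq = "ACGT" * 250
for k in (1, 2):
    m = one_hot(seq, k)
    v = frequency_tokens(seq, k)
    print(k, (len(m), len(m[0])), (1, len(v)))
```

For k = 1 and k = 2 the loop reports one-hot shapes of (4, 1000) and (16, 999) and token vector shapes of (1, 1000) and (1, 999), matching the "Input vector size" column. The one-hot input grows by a factor of four with each increment of k, while the tokenized input stays a single row.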