Table 8. Performance comparison of models with different sequence lengths under global training.
The models are compared at sequence lengths of 300 and 600 bp under global training, using loss, accuracy (Acc), AUC, F1-score (F1), specificity (Sp), and Matthews correlation coefficient (MCC). The DNABERT-2_CM_BL model performs strongly under this strategy, particularly with 300 bp sequences, where it achieves the highest accuracy (0.909), AUC (0.970), F1-score (0.909), and MCC (0.818); it also retains the highest AUC (0.970) at 600 bp. The best result for each metric at each sequence length is indicated in bold.
| Length | Model | Loss | Acc | AUC | F1 | Sp | MCC |
|---|---|---|---|---|---|---|---|
| 300 bp | DNABERT-2_BASE | 0.258 | 0.894 | 0.964 | 0.894 | 0.883 | 0.789 |
| | DNABERT-2_CNN | **0.232** | 0.907 | 0.969 | 0.907 | 0.910 | 0.814 |
| | DNABERT-2_BiLSTM | 0.296 | 0.898 | 0.940 | 0.898 | 0.905 | 0.796 |
| | DNABERT-2_CM_BL | 0.243 | **0.909** | **0.970** | **0.909** | 0.923 | **0.818** |
| | DNABERT-2_CA_BL | 0.242 | 0.905 | 0.965 | 0.904 | **0.935** | 0.811 |
| 600 bp | DNABERT-2_BASE | 0.257 | 0.905 | 0.963 | 0.905 | 0.891 | 0.811 |
| | DNABERT-2_CNN | **0.239** | 0.910 | 0.966 | 0.910 | 0.905 | 0.819 |
| | DNABERT-2_BiLSTM | 0.275 | **0.911** | 0.950 | **0.911** | **0.913** | **0.822** |
| | DNABERT-2_CM_BL | 0.255 | 0.910 | **0.970** | 0.910 | 0.908 | 0.819 |
| | DNABERT-2_CA_BL | 0.273 | 0.904 | 0.962 | 0.904 | 0.889 | 0.808 |
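
For readers less familiar with the threshold-based metrics in the table, the following is a minimal sketch of how Acc, F1, Sp, and MCC can be derived from binary confusion-matrix counts. It is not the evaluation code used to produce these results: the function name `binary_metrics` and the example counts are hypothetical. The averaging scheme for F1 is not stated here, so the sketch assumes the F1-score of the positive class.

```python
import math


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute Acc, F1, Sp, and MCC from binary confusion-matrix counts.

    Illustrative only: assumes a two-class problem and positive-class F1.
    Any ratio with a zero denominator is reported as 0.0.
    """
    total = tp + fp + tn + fn
    acc = (tp + tn) / total if total else 0.0

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )

    # Specificity: fraction of true negatives among all actual negatives.
    sp = tn / (tn + fp) if (tn + fp) else 0.0

    # Matthews correlation coefficient, ranging from -1 to 1.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {"acc": acc, "f1": f1, "sp": sp, "mcc": mcc}


if __name__ == "__main__":
    # Hypothetical counts, not taken from the experiments in this table.
    metrics = binary_metrics(tp=90, fp=10, tn=90, fn=10)
    for name, value in metrics.items():
        print(f"{name}: {value:.3f}")
```

Loss and AUC are omitted from the sketch because they depend on the predicted probabilities rather than on thresholded counts.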