Table 2. Benchmarking machine learning methods for coding potential prediction based on trinucleotide counts.
F1-score for each of the 15 species on which the algorithms were tested. The other metrics used to compare the algorithms' performance (sensitivity, specificity, precision and accuracy), together with the confusion matrices, are available in the Extended data: Supplementary File S2 [19].
| Species | ANN | CNN | K-NN | Naive Bayes | Random Forest | SVM | XGBoost |
|---|---|---|---|---|---|---|---|
| Anolis carolinensis | 98.47 | 98.31 | 93.55 | 95.50 | 98.30 | 98.03 | 98.79 |
| Chrysemys picta bellii | 96.54 | 96.02 | 93.54 | 93.13 | 96.89 | 96.04 | 98.00 |
| Crocodylus porosus | 96.74 | 96.48 | 93.67 | 93.93 | 97.26 | 96.35 | 98.15 |
| Danio rerio | 97.54 | 97.77 | 95.44 | 94.55 | 97.56 | 97.27 | 97.98 |
| Eptatretus burgeri | 94.88 | 95.69 | 92.24 | 94.57 | 97.35 | 95.82 | 97.56 |
| Gallus gallus | 98.47 | 98.27 | 96.87 | 95.11 | 98.91 | 98.06 | 99.24 |
| Homo sapiens | 98.01 | 97.66 | 96.63 | 86.00 | 98.30 | 96.83 | 98.50 |
| Latimeria chalumnae | 99.05 | 98.72 | 91.61 | 98.23 | 99.56 | 99.24 | 99.57 |
| Monodelphis domestica | 98.39 | 98.09 | 97.11 | 95.31 | 98.67 | 98.01 | 98.84 |
| Mus musculus | 96.67 | 96.96 | 95.95 | 91.56 | 97.66 | 96.10 | 97.73 |
| Notechis scutatus | 95.90 | 94.10 | 87.77 | 89.81 | 94.94 | 95.73 | 96.51 |
| Ornithorhynchus anatinus | 97.23 | 96.59 | 93.59 | 91.45 | 96.99 | 96.38 | 97.61 |
| Petromyzon marinus | 98.40 | 98.26 | 88.10 | 95.99 | 98.79 | 97.49 | 99.42 |
| Sphenodon punctatus | 97.83 | 96.97 | 78.41 | 96.70 | 96.46 | 95.29 | 99.20 |
| Xenopus tropicalis | 98.28 | 98.81 | 85.53 | 97.14 | 98.88 | 97.20 | 99.13 |
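
For readers who want to relate the F1-scores above to the other metrics named in the caption, the following is a minimal sketch of how each one can be derived from a binary confusion matrix. It is not the authors' evaluation code. The function name `binary_metrics`, the choice of "coding" as the positive class, the scaling to percentages, and the example counts are all assumptions made for illustration.

```python
def binary_metrics(tp, fp, tn, fn):
    """Derive standard metrics (as percentages) from a 2x2 confusion matrix.

    Assumption: the positive class is "coding", so
    tp = coding predicted as coding, fn = coding predicted as non-coding,
    tn = non-coding predicted as non-coding, fp = non-coding predicted as coding.
    All denominators are assumed to be non-zero.
    """
    sensitivity = tp / (tp + fn)          # also called recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # F1 is the harmonic mean of precision and sensitivity.
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": 100 * sensitivity,
        "specificity": 100 * specificity,
        "precision": 100 * precision,
        "accuracy": 100 * accuracy,
        "f1": 100 * f1,
    }


# Hypothetical counts, not taken from the paper or its supplementary files.
example = binary_metrics(tp=90, fp=10, tn=85, fn=15)
for name, value in example.items():
    print(f"{name}: {value:.2f}")
```

Because F1 combines precision and sensitivity and ignores true negatives, two classifiers with the same F1-score can still differ in specificity and accuracy, which is why the full set of metrics in the supplementary file is worth consulting alongside this table.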