Table 4.
Prediction accuracy of the different methods on the COALA90 test sequences, stratified by identity score against the training database
Identity score | None | ≤50% | >50% |
---|---|---|---|
Number of sequences | 142 | 338 | 1223 |
BLAST best hit | 0.0000 | 0.6243 | 0.9542 |
DIAMOND best hit | 0.0000 | 0.5740 | 0.9534 |
DeepARG | 0.0000 | 0.5266 | 0.9419 |
HMMER | 0.0563 | 0.2751 | 0.6051 |
TRAC | 0.3521 | 0.6124 | 0.9199 |
ARG-CNN | 0.4577 | 0.6538 | 0.9452 |
ARG-InterPro | 0.4085 | 0.6509 | 0.9141 |
ARG-KNN | 0.0000 | 0.6361 | 0.9542 |
ARG-SHINE | 0.4648 | 0.6864 | 0.9558 |
The best results among all the methods and the best results among the stand-alone methods are in bold. The lowest identity score among the test data is 21.32%. ‘None’ means that the sequence has no alignment against the training database with an e-value of at most 1e-3.
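
The stratification described in the note can be illustrated with a minimal sketch. It assumes each test sequence comes with the percent identity and e-value of its best alignment against the training database (or no alignment at all). The function names `identity_stratum` and `stratified_accuracy` and the input layout are hypothetical and are not taken from the paper; only the 1e-3 e-value cutoff and the 50% split come from the table.

```python
from typing import Dict, Iterable, Optional, Tuple

E_VALUE_CUTOFF = 1e-3   # cutoff stated in the table note
IDENTITY_SPLIT = 50.0   # percent identity, from the column headers


def identity_stratum(best_identity: Optional[float],
                     best_evalue: Optional[float]) -> str:
    """Assign one test sequence to 'None', '<=50%' or '>50%' (hypothetical helper)."""
    # No alignment, or only alignments above the e-value cutoff: 'None' column.
    if best_identity is None or best_evalue is None or best_evalue > E_VALUE_CUTOFF:
        return "None"
    return "<=50%" if best_identity <= IDENTITY_SPLIT else ">50%"


def stratified_accuracy(records: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Compute accuracy per stratum from (stratum, is_correct) pairs."""
    totals: Dict[str, int] = {}
    hits: Dict[str, int] = {}
    for stratum, is_correct in records:
        totals[stratum] = totals.get(stratum, 0) + 1
        hits[stratum] = hits.get(stratum, 0) + int(is_correct)
    return {s: hits[s] / totals[s] for s in totals}
```

Each row of the table would then correspond to one call to `stratified_accuracy` on a single method's predictions, with the strata fixed by the alignment results rather than by the method being evaluated.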