2020 Jun 24;36(18):4699–4705. doi: 10.1093/bioinformatics/btaa586

Table 1.

Detected misclassified taxonomic proteins in the NR database

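| Taxa | Total | Root | Kingdom | Phylum | Class | Order | Family |
|---|---|---|---|---|---|---|---|
| **2** | **17 496 167** | **30 237** | **47 271** | **202 205** | **59 606** | **177 132** | **290 065** |
| **3** | **5 921 066** | **14 376** | **19 666** | **107 705** | **38 575** | **104 709** | **236 515** |
| 4 | 2 132 971 | 4673 | 21 587 | 64 801 | 17 662 | 47 914 | 94 054 |
| 5 | 1 022 482 | 3143 | 9469 | 34 322 | 10 050 | 27 295 | 53 276 |
| 6 | 642 760 | 2509 | 5662 | 24 136 | 7333 | 23 324 | 37 998 |
| 7 | 388 794 | 1572 | 3959 | 12 972 | 5905 | 13 488 | 27 221 |
| 8 | 262 682 | 1121 | 2803 | 5988 | 5375 | 10 075 | 16 340 |
| 9 | 190 756 | 783 | 2647 | 3825 | 3173 | 7557 | 12 681 |
| 10 | 156 767 | 667 | 1843 | 3805 | 2451 | 6413 | 11 327 |
| >10 | 960 891 | 10 940 | 23 232 | 30 048 | 38 679 | 46 391 | 107 679 |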

The first two rows (two and three taxa), set in bold, show the highest potential for misassignment: a protein that has only two or three taxonomic assignments and shows a root or kingdom violation is more likely to be misclassified.
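As a rough illustration of the note above, the root and kingdom counts in the first two rows can be added up and compared with each row's total. This is only a sketch. The names `ROWS`, `high_level_violations` and `high_level_share` are hypothetical. It also assumes that the Total column counts all proteins with that number of taxonomic assignments, which the table does not state explicitly.

```python
# Counts copied from the first two rows of Table 1.
# Keys are the number of taxonomic assignments per protein.
ROWS = {
    2: {"total": 17_496_167, "root": 30_237, "kingdom": 47_271},
    3: {"total": 5_921_066, "root": 14_376, "kingdom": 19_666},
}


def high_level_violations(row):
    """Return the combined root and kingdom violation count for one row."""
    return row["root"] + row["kingdom"]


def high_level_share(row):
    """Return root + kingdom violations as a fraction of the row total.

    Assumption: 'total' is the number of all proteins with that many
    taxonomic assignments (not stated explicitly in the table).
    """
    return high_level_violations(row) / row["total"]


if __name__ == "__main__":
    for n_taxa, row in ROWS.items():
        count = high_level_violations(row)
        share = high_level_share(row)
        print(f"taxa={n_taxa}: {count} root/kingdom violations ({share:.4%} of total)")
```

For these two rows the combined root and kingdom counts are 77 508 and 34 042 proteins, respectively.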