Table 1. Comparison of tools for automated CAZyme annotation.
Tools + databases | Accuracy (F-score), bacteria | Accuracy (F-score), eukaryotes | Subfamily | Multi-family proteins | Domain repeats | Domain positions | Speed<sup>c</sup> (s) |
---|---|---|---|---|---|---|---|
HMMER+dbCAN | 0.88 | 0.86 | Yes<sup>a</sup> | Yes | Yes | Yes | 69 |
DIAMOND+CAZy | 0.89 | 0.84 | Yes<sup>a</sup> | No | No | No | 4 |
Hotpep+PPR | 0.80 | 0.94 | Yes<sup>b</sup> | Yes | No | No | 7 |
Predicted by ≥2 tools | 0.93 | 0.92 | | | | | |

<sup>a</sup>Twenty-four CAZyme families are classified into 207 subfamilies by phylogenetic clustering and CAZy expert curation (10).
<sup>b</sup>No correspondence has been established between PPR groups and CAZy subfamilies; the dbCAN web server reports only the CAZy subfamily annotation, whenever it is available.
<sup>c</sup>Time is in seconds, measured on the Escherichia coli K-12 MG1655 proteome (4140 proteins). Detailed calculations of accuracy and speed are available in Supplementary Table S1.
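
The "Predicted by ≥2 tools" row can be read as a simple voting rule. The following is a minimal sketch of that idea, not the dbCAN implementation: it assumes each tool yields a set of protein IDs called as CAZymes and that accuracy is the standard F1 score (harmonic mean of precision and recall). The exact calculation used for the table is in Supplementary Table S1 and may differ. The function names `consensus` and `f_score`, and the toy protein IDs, are hypothetical.

```python
from collections import Counter


def consensus(predictions, min_tools=2):
    """Return protein IDs called as CAZymes by at least `min_tools` tools.

    `predictions` maps a tool name to the set of protein IDs it reported.
    """
    votes = Counter()
    for hits in predictions.values():
        votes.update(set(hits))  # each tool votes at most once per protein
    return {pid for pid, n in votes.items() if n >= min_tools}


def f_score(predicted, truth):
    """Standard F1 score (assumed definition) for two sets of protein IDs."""
    true_positives = len(predicted & truth)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(truth)
    return 2 * precision * recall / (precision + recall)


# Toy data for illustration only; these are not results from the paper.
predictions = {
    "HMMER+dbCAN": {"p1", "p2", "p3"},
    "DIAMOND+CAZy": {"p2", "p3", "p4"},
    "Hotpep+PPR": {"p3", "p5"},
}
truth = {"p2", "p3", "p6"}

kept = consensus(predictions)
score = f_score(kept, truth)
```

In this toy case only `p2` and `p3` receive two or more votes. Requiring agreement removes the single-tool calls (`p1`, `p4`, `p5`), which raises precision at the possible cost of recall.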