2015 Jun 11;16(Suppl 7):S16. doi: 10.1186/1471-2164-16-S7-S16

Table 4.

Ten-fold cross validation and testing accuracy for enzyme identification and enzyme classification.

              Enzyme Identification (EC L0)     Enzyme Classification (EC L1)
Classifier    Ten-fold         Testing          Ten-fold         Testing
              accuracy* (%)    accuracy (%)     accuracy* (%)    accuracy (%)
DS            66.39            66.39            39.12            39.31
NBC           92.60            92.46            96.11            95.88
KNN           94.38            94.38            97.80            97.56
SVM           95.69            94.86            98.34            98.39
RFC           98.42            94.60            97.50            97.28

*Ten-fold cross validation accuracy. Results are reported at EC L0 and EC L1 for five ML classifiers: Decision Stump (DS), Naïve Bayes Classifier (NBC), K-Nearest Neighbor (KNN), Support Vector Machine (SVM), and Random Forest Classifier (RFC). At EC L0, the train and test sets contain 154,592 and 38,648 sequences, respectively; at EC L1, they contain 50,139 and 12,535 sequences, respectively.