Figure 1. Training and Test Set Classification Accuracy for Naïve Bayes Method Using Motif Scores Only.
Training set classification accuracy increases with the number of top-ranked motifs included in the model, whereas test set accuracy increases only while the model is small. Including too many features overfits the training set and thereby lowers test set accuracy. Five-fold cross-validation (CV) was repeated 100 times with random partitions; the curves show the mean accuracy, and the error bars mark the minimum and maximum accuracy across the 100 repeats.
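
The evaluation protocol summarized in this caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used to produce the figure: the Gaussian form of the naïve Bayes likelihood, the feature-ranking criterion (standardized difference of class means, computed on the training folds only), binary class labels, and all function names (`rank_features`, `fit_nb`, `repeated_cv`) are hypothetical choices made for the example.

```python
import math
import random
from statistics import mean, pstdev


def rank_features(X, y):
    # Assumed ranking criterion: standardized difference of class means.
    # Assumes exactly two class labels are present in y.
    a, b = sorted(set(y))
    scores = []
    for j, col in enumerate(zip(*X)):
        ma = mean(v for v, t in zip(col, y) if t == a)
        mb = mean(v for v, t in zip(col, y) if t == b)
        scores.append((abs(ma - mb) / (pstdev(col) + 1e-9), j))
    return [j for _, j in sorted(scores, reverse=True)]


def fit_nb(X, y):
    # Gaussian naive Bayes: per-class log prior, feature means and variances.
    model = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        cols = list(zip(*rows))
        mus = [mean(c) for c in cols]
        variances = [pstdev(c) ** 2 + 1e-9 for c in cols]  # avoid zero variance
        model[label] = (math.log(len(rows) / len(y)), mus, variances)
    return model


def predict(model, x):
    def log_posterior(params):
        log_prior, mus, variances = params
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
            for xi, m, v in zip(x, mus, variances)
        )
    return max(model, key=lambda label: log_posterior(model[label]))


def accuracy(model, X, y):
    return sum(predict(model, x) == t for x, t in zip(X, y)) / len(y)


def repeated_cv(X, y, k_top, n_repeats=100, n_folds=5, seed=0):
    # Returns (mean, min, max) accuracy over repeats for train and test sets.
    rng = random.Random(seed)
    train_acc, test_acc = [], []
    for _ in range(n_repeats):
        idx = list(range(len(y)))
        rng.shuffle(idx)  # a new random partition for every repeat
        folds = [idx[i::n_folds] for i in range(n_folds)]
        fold_train, fold_test = [], []
        for held_out in folds:
            held = set(held_out)
            tr = [i for i in idx if i not in held]
            X_tr, y_tr = [X[i] for i in tr], [y[i] for i in tr]
            X_te, y_te = [X[i] for i in held_out], [y[i] for i in held_out]
            # Rank features on the training folds only, then keep the top k.
            top = rank_features(X_tr, y_tr)[:k_top]
            select = lambda rows: [[r[j] for j in top] for r in rows]
            model = fit_nb(select(X_tr), y_tr)
            fold_train.append(accuracy(model, select(X_tr), y_tr))
            fold_test.append(accuracy(model, select(X_te), y_te))
        train_acc.append(mean(fold_train))
        test_acc.append(mean(fold_test))
    summarize = lambda a: (mean(a), min(a), max(a))
    return summarize(train_acc), summarize(test_acc)
```

Calling `repeated_cv` once per model size (`k_top = 1, 2, ...`) yields one point per curve, with the mean as the plotted value and the minimum and maximum as the error bar limits.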