Figure 4.
Best classification performances for different layers, using sets of features that define spaces of different dimensions (1d–3d feature sets; blue, green and red, respectively). As expected, a higher dimensionality yields a better classification accuracy. Interestingly, on the first convolutional layer (conv1) of the AlexNet model, a two-dimensional set of features performs almost as well as a three-dimensional one. The values represent means and the error bars SDs from a 5-fold cross-validation experiment. Two baselines as provided for comparison. First, a linear SVM on the raw CNN responses of each respective layer (cyan), and second, a linear SVM on the raw image data (black).
