2015 Jan 15;8:168. doi: 10.3389/fncom.2014.00168

Figure 1.


Computational models. HMAX model: Gabor filters at 4 orientations and 10 scales are convolved with the image in the S1 layer, and the responses are pooled to form the C1 layer. For the S2 layer, random samples of the pooled C1 responses to images from the PASCAL dataset are used to form a visual dictionary of dimension 4096. These template patches are matched against the pooled responses to form the S2 layer. A global max pooling operation then yields the final C2 layer, which has dimension 4096. BoW model: SIFT features are extracted densely over the image. A visual dictionary of dimension 4000 is learned by k-means clustering on SIFT features extracted from PASCAL dataset images. Each SIFT descriptor from an image is encoded as the nearest element of the visual dictionary. Average pooling is then applied to form the 4000-dimensional visual dictionary representation.
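
The two pooling steps named in the caption can be illustrated with a minimal sketch. This is not the authors' code: the function names (`bow_encode`, `c2_global_max`) and the toy sizes are hypothetical, and average pooling is assumed here to mean a histogram of nearest-codeword assignments normalized by the number of descriptors. Dense SIFT extraction, k-means dictionary learning, and the S1 to S2 stages are not shown.

```python
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def bow_encode(descriptors: Sequence[Vector],
               dictionary: Sequence[Vector]) -> List[float]:
    """Hard-assign each descriptor to its nearest codeword, then average-pool.

    Returns one bin per codeword. The bins sum to 1 when at least one
    descriptor is given (assumed reading of "average pooling").
    """
    histogram = [0.0] * len(dictionary)
    if not descriptors:
        return histogram
    for descriptor in descriptors:
        # Index of the codeword closest to this descriptor.
        nearest = min(range(len(dictionary)),
                      key=lambda k: squared_distance(descriptor, dictionary[k]))
        histogram[nearest] += 1.0
    return [count / len(descriptors) for count in histogram]


def c2_global_max(s2_responses: Sequence[Sequence[float]]) -> List[float]:
    """Global max pooling: keep, per template, the maximum S2 response.

    s2_responses[t] is the flattened response map of template t over all
    positions and scales (hypothetical layout for this sketch).
    """
    return [max(responses) for responses in s2_responses]


# Toy sizes for illustration only; the figure uses 4000 codewords (BoW)
# and 4096 templates (HMAX).
toy_dictionary = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
toy_descriptors = [(0.1, 0.0), (0.9, 0.1), (1.1, -0.1), (0.0, 0.8)]
toy_histogram = bow_encode(toy_descriptors, toy_dictionary)

toy_s2 = [[0.2, 0.7, 0.1], [0.5, 0.4, 0.9]]
toy_c2 = c2_global_max(toy_s2)
```

In this sketch both outputs have one entry per dictionary element, which matches the caption's statement that the final representations have the same dimension as their dictionaries (4000 for BoW, 4096 for HMAX C2).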