2011 Jun 30;2:151. doi: 10.3389/fpsyg.2011.00151

Figure 1.

Overview of the five-layer feedforward spiking neural network used in Masquelier and Thorpe (2007). As in HMAX (Riesenhuber and Poggio, 1999; Serre et al., 2007), we alternate simple cells, which gain selectivity through a sum operation, and complex cells, which gain shift and scale invariance through a max operation (which in our framework simply consists of propagating the first received spike). Cells are organized in retinotopic maps up to and including the S2 layer. S1 cells detect edges. C1 maps subsample S1 maps by taking the maximum response over a square neighborhood. S2 cells are selective to visual features of intermediate complexity, defined as combinations of oriented edges (here, an eye detector and a mouth detector are represented symbolically). There is one S1–C1–S2 pathway for each processing scale (not shown). C2 cells then take the maximum response of S2 cells over all positions and scales, and are thus shift- and scale-invariant. Finally, a classification is made based on the C2 cells’ responses (here, a face/non-face classifier is represented symbolically). In the brain, equivalents of S1 cells may be in V1, C1 cells in V1–V2, S2 cells in V4–PIT, C2 cells in AIT, and the final classifier in PFC. Here STDP shapes the C1-to-S2 connectivity. Figure 2 shows an example of the resulting selectivities after exposing the network to face images. Figure modified from Masquelier and Thorpe (2007).
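The max operation described in the caption can be illustrated with a minimal sketch. It assumes a latency code in which a stronger response produces an earlier spike, so that "propagating the first received spike" amounts to taking the minimum spike time over the pooled cells. The function names, the use of non-overlapping neighborhoods, and the toy spike times are assumptions made for illustration; they are not taken from Masquelier and Thorpe (2007).

```python
import numpy as np


def pool_first_spike(spike_times, size):
    """C1-style pooling (illustrative): earliest spike in each square block.

    spike_times: 2D array of first-spike times, np.inf where a cell never fired.
    size: side length of the (assumed non-overlapping) square neighborhood.
    """
    h, w = spike_times.shape
    h_out, w_out = h // size, w // size
    # Drop any rows/columns that do not fill a complete block.
    trimmed = spike_times[:h_out * size, :w_out * size]
    # Split into blocks: axes are (block row, row in block, block col, col in block).
    blocks = trimmed.reshape(h_out, size, w_out, size)
    # The earliest spike time in a block stands in for the maximum response.
    return blocks.min(axis=(1, 3))


def global_first_spike(maps_per_scale):
    """C2-style pooling (illustrative): earliest spike over all positions and scales."""
    return min(float(m.min()) for m in maps_per_scale)


# Toy S1 map of first-spike times (arbitrary units); np.inf means no spike.
s1 = np.array([
    [5.0, np.inf, 9.0, 2.0],
    [3.0, 7.0, np.inf, np.inf],
    [np.inf, np.inf, 4.0, 6.0],
    [8.0, 1.0, np.inf, np.inf],
])

c1 = pool_first_spike(s1, size=2)
print(c1)  # 2 x 2 map holding the earliest spike time of each 2 x 2 block

# Two hypothetical scales pooled together, as a C2 cell would do.
c2 = global_first_spike([c1, np.array([[0.5]])])
print(c2)
```

Because only the earliest spike in each neighborhood is passed on, the output does not depend on where within the neighborhood that spike originated, which is the source of the shift invariance the caption attributes to complex cells.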