Figure 7.
A spiking neural network that accepts input from the BreakoutDeterministic-v4 gym Atari environment. Observations from the environment are downsampled and binarized. The history and delta keyword arguments are used to form difference images from successive observations, and each difference image is then converted into Bernoulli-distributed vectors of spikes, one per time step. The output layer of the network has four neurons, each representing a different action in the Breakout game. An action is selected at each time step by the select_softmax feedback function, which sums the spikes of each output-layer neuron and converts these sums, via a softmax, into a probability distribution over actions.
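
The pipeline described in this caption can be sketched in plain NumPy. This is an illustrative sketch only and not the library's implementation: the function names (preprocess, difference_image, bernoulli_encode, select_action_softmax), the downsampling stride, the binarization threshold, and the use of an absolute frame difference are all assumptions made for the example. It shows how a binarized difference image might become a spike train and how summed output spikes might be turned into a sampled action.

```python
import numpy as np


def preprocess(frame, stride=2, threshold=0.0):
    """Downsample a grayscale frame by striding, then binarize it.

    The stride and threshold are assumed values chosen for illustration.
    """
    small = np.asarray(frame, dtype=np.float32)[::stride, ::stride]
    return (small > threshold).astype(np.float32)


def difference_image(current, previous):
    """Absolute difference of two binarized frames (an assumed definition)."""
    return np.abs(current - previous)


def bernoulli_encode(image, time, rng=None):
    """Return a (time, *image.shape) array of 0/1 spikes.

    Each pixel spikes independently at each time step with probability
    equal to its intensity, clipped to [0, 1].
    """
    rng = np.random.default_rng() if rng is None else rng
    probs = np.clip(image, 0.0, 1.0)
    return (rng.random((time,) + image.shape) < probs).astype(np.uint8)


def select_action_softmax(spike_counts, rng=None):
    """Sample an action index from softmax(spike_counts).

    Returns the sampled index and the probability vector.
    """
    rng = np.random.default_rng() if rng is None else rng
    counts = np.asarray(spike_counts, dtype=np.float64)
    exp = np.exp(counts - counts.max())  # subtract max for numerical stability
    probs = exp / exp.sum()
    return int(rng.choice(len(probs), p=probs)), probs


# Example with synthetic frames standing in for Atari observations.
rng = np.random.default_rng(0)
prev_frame = preprocess(rng.integers(0, 2, size=(8, 8)))
curr_frame = preprocess(rng.integers(0, 2, size=(8, 8)))
spikes = bernoulli_encode(difference_image(curr_frame, prev_frame), time=5, rng=rng)

# Hypothetical spike counts for the four output neurons over one time window.
action, probs = select_action_softmax([3, 0, 1, 0], rng=rng)
```

Because the softmax is taken over raw spike counts, an output neuron that fires only a few more times than the others receives a much larger share of the probability mass, while a silent output layer yields a uniform choice among the four actions.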