Figure 9.13.
Comparison of to 0 for pairs of non-input neurons that are not directly connected to each other in a MNIST classifier network, before and after training. As shown in the previous figure, provides a better approximation. Histograms are obtained by taking 100,000 pairs of unconnected neurons, uniformly at random, and aggregating the results over 10 random input vectors.