Pre-training of DNN layer-1 weights. Here PT denotes a feature map with
pre-trained weights and RND a feature map with randomly initialized weights;
pre-training was done with a denoising auto-encoder (DAE) architecture. The blue
histogram counts the maxima of the mutual information (M) over all pairs
(PT, RND): for each PT map, the maximal-M match is searched among the RND maps.
The yellow histogram shows the counts of M in the opposite case (RND, PT), where
for each randomly initialized feature map the maximal-M match is searched among
the PT maps.
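
The matching procedure behind the two histograms can be sketched as follows. This is only a minimal illustration, not the authors' implementation. It assumes a simple histogram-based (plug-in) estimate of mutual information, and the function names, the bin count, and the random stand-in arrays pt_maps and rnd_maps are all hypothetical.

```python
import numpy as np

def mutual_information(x, y, bins=16):
    """Plug-in (histogram) estimate of mutual information in nats.

    Assumption: the source does not state which estimator was used.
    """
    joint, _, _ = np.histogram2d(x.ravel(), y.ravel(), bins=bins)
    pxy = joint / joint.sum()                # joint distribution
    px = pxy.sum(axis=1, keepdims=True)      # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)      # marginal of y, shape (1, bins)
    nz = pxy > 0                             # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def best_match_mi(maps_a, maps_b, bins=16):
    """For each map in maps_a, return the maximal M over all maps in maps_b."""
    return np.array([
        max(mutual_information(a, b, bins) for b in maps_b)
        for a in maps_a
    ])

# Hypothetical stand-ins for the real feature maps (random data only).
rng = np.random.default_rng(0)
pt_maps = rng.normal(size=(4, 8, 8))    # "pre-trained" maps
rnd_maps = rng.normal(size=(6, 8, 8))   # "randomly initialized" maps

m_pt_to_rnd = best_match_mi(pt_maps, rnd_maps)   # blue histogram: (PT, RND)
m_rnd_to_pt = best_match_mi(rnd_maps, pt_maps)   # yellow histogram: (RND, PT)

# Shared bin edges so the two histograms are directly comparable.
edges = np.histogram_bin_edges(np.concatenate([m_pt_to_rnd, m_rnd_to_pt]), bins=10)
blue_counts, _ = np.histogram(m_pt_to_rnd, bins=edges)
yellow_counts, _ = np.histogram(m_rnd_to_pt, bins=edges)
```

The two searches are not symmetric: each PT map contributes one maximum to the first histogram and each RND map contributes one maximum to the second, so the two histograms generally differ even though M itself is symmetric in its arguments.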