Two-stage training causes the DSC to extract most of the target information content of the primary inputs, especially when the DSC contains a mixture of unimodal and multisensory units. The percentage of multisensory DSC units is varied by manipulating the primary weight threshold θu. Ten networks receive stage-one training for 5000 iterations (px0 = 0.1, px1 = 0.6, ps = 0.34, and pc = 0.17). Each network is then pruned using θu varying from 0 to 1 in steps of 0.005. This produces mean percentages of multisensory DSC units over the 10 networks ranging from 0 to 100%. Each pruned network receives stage-two training for 5000 iterations (py0 = 0, py1 = 0.1, θx = 6, θy = 0, and θz = 0.2). For each of the 10 networks associated with each θu value, both before and after stage-two training, the mutual information between the target and the number of suprathreshold DSC unit responses is computed (Eqs. 17 and 18; θI = 0.3). The mean information gain after stage-one and stage-two training is plotted against the mean percentages of multisensory units. The mutual information between the target and the primary inputs (2.27 bits; dashed line; Eq. 5) is nearly as high as the information content of the target (2.32 bits; dot-dashed line; Eq. 4). Stage-one training causes the DSC to extract a large amount of target information (triangles), and stage-two (stars) causes a small increase in this amount. The increase is significant when the percentage of multisensory units is 60% or larger (t test, 0.05 significance level). The mutual information between target and DSC is nearly as large as the mutual information between target and primary inputs, but only for percentages of multisensory DSC units between ∼10 and 50%. The mutual information between target and DSC decreases steadily as the percentage of multisensory DSC units increases above 50%. 
As it decreases, the mutual information between target and DSC approaches that of a uniformly trimodal DSC, with (0.80 bits; square) and without (0.77 bits; circle) modulatory connections. Variability in the DSC response after two-stage training keeps DSC information content above that of the uniformly trimodal DSC.
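The quantity plotted here, the mutual information between the target and the number of suprathreshold DSC unit responses, can be illustrated with a standard plug-in estimate from paired samples. The sketch below is not a reproduction of Eqs. 17 and 18; it assumes discrete target labels, a strict "greater than θI" criterion for counting a unit as suprathreshold, and the identity I(T;N) = H(T) + H(N) − H(T,N). The function names are hypothetical. The reported target information of 2.32 bits matches log2 5, which would correspond to five equally likely target states; the example treats that reading as an assumption.

```python
import math
from collections import Counter


def entropy_bits(values):
    """Plug-in Shannon entropy, in bits, of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def mutual_information_bits(targets, counts):
    """Estimate I(T;N) = H(T) + H(N) - H(T,N) from paired samples."""
    joint = list(zip(targets, counts))
    return entropy_bits(targets) + entropy_bits(counts) - entropy_bits(joint)


def suprathreshold_count(responses, theta_i=0.3):
    """Number of unit responses strictly above theta_i (assumed criterion)."""
    return sum(1 for r in responses if r > theta_i)


# Assumed toy data: five equally likely target states, four trials each.
targets = [0, 1, 2, 3, 4] * 4

# A count that identifies the target perfectly carries all of H(T) = log2(5).
perfect = mutual_information_bits(targets, targets)

# A count that never changes carries no target information.
uninformative = mutual_information_bits(targets, [3] * len(targets))
```

In this toy case the perfectly informative count recovers the full target entropy (log2 5 ≈ 2.32 bits) and the constant count yields zero, which brackets the range within which the plotted DSC values fall. A plug-in estimate of this kind is biased upward for small sample sizes, so many trials per condition would be needed in practice.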