
Fig. 10.

Impact of label noise on the performance of wide networks: We plot the generalization error decomposition, as in Fig. 9, for (A) an under-parameterized ($N_h = 10$) and (B) an over-parameterized ($N_h = 200$) network. Because output noise is added to the targets, the function being learned is not deterministic, so approximating it is difficult even for the larger network. However, the fitting error is larger in the larger network, owing to the eigenvalue spectrum of its hidden-layer covariance. This contrasts with our finding that larger networks perform better in the deterministic, low-noise setting depicted in Fig. 9. Other parameters: $P = 50$, $N_t = 20$, $\sigma_{w_0} = 0$, $\sigma_{\bar{w}} = 1$.
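The paper's exact model and training procedure are not reproduced here, but the qualitative effect the caption describes can be illustrated with a minimal, hypothetical sketch: least-squares regression on random ReLU features of a noisy linear teacher, comparing a narrow ($N_h = 10$) and a wide ($N_h = 200$) student at $P = 50$ training points. All names, the teacher architecture, and the noise level below are illustrative assumptions, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the figure's experiment: a noisy linear teacher
# y = w_bar . x + noise, and a student doing least-squares regression on
# Nh random ReLU features. D, P, and sigma_noise are illustrative choices.
D, P, P_test, sigma_noise = 20, 50, 2000, 0.5

w_bar = rng.normal(size=D) / np.sqrt(D)          # teacher weights, unit scale
X_train = rng.normal(size=(P, D))
X_test = rng.normal(size=(P_test, D))
y_train = X_train @ w_bar + sigma_noise * rng.normal(size=P)  # noisy labels
y_test = X_test @ w_bar                                       # clean targets

def errors(Nh):
    W = rng.normal(size=(D, Nh)) / np.sqrt(D)    # fixed random first layer
    H_train = np.maximum(X_train @ W, 0)         # hidden-layer activations
    H_test = np.maximum(X_test @ W, 0)
    # Minimum-norm least-squares readout; interpolates the noisy labels
    # once Nh exceeds P.
    a, *_ = np.linalg.lstsq(H_train, y_train, rcond=None)
    train_mse = np.mean((H_train @ a - y_train) ** 2)
    test_mse = np.mean((H_test @ a - y_test) ** 2)
    return train_mse, test_mse

for Nh in (10, 200):
    tr, te = errors(Nh)
    print(f"Nh={Nh:4d}  train MSE: {tr:.3f}  test MSE: {te:.3f}")
```

In this sketch the wide student drives its training error to (near) zero by fitting the label noise through the minimum-norm solution, which, depending on the random draw, can leave it with a higher test error than the narrow student, mirroring the caption's point that label noise can erase the advantage of width seen in the noiseless setting.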