2020 Jul 9;10:11360. doi: 10.1038/s41598-020-68169-x

Figure 5.

Effect of activation function on recovering time parameters. Networks were equipped with either a sigmoid activation function (left) or a ReLU activation function (right). For two combinations of target rate constants (α_s=0.68, α_r=0.34, upper panels; α_s=0.34, α_r=0.68, lower panels), a grid search was performed over fixed rate constants, and networks with adaptive rate constants were trained; their final learned rate constants are indicated as blue crosses (40 repetitions). Target rate constants are indicated with a green circle. The dashed lines indicate the approximation where either α_s=1 or α_r=1. The intersection of both dashed lines indicates the Elman solution (blue circle). The networks with a sigmoid activation function have a region of lowest loss concentrated around the target values. However, networks with a ReLU activation function have a much wider basin of rate constants associated with minimal loss. For the ReLU activation function, the learned rate constants do not uniquely recover the target rate constants but form a band symmetric around the diagonal α_s=α_r. The grid search loss region for the ReLU activation function is also symmetric around the diagonal, indicating an interchangeability of the rate constants α_s and α_r. This could be the result of the interchangeability in the linear regime of the ReLU activation function (see Appendix S2).
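
The interchangeability argument in the caption can be illustrated with a small numerical sketch. It is not the authors' model or code: the scalar update rule below (a leaky state s driven by the input and the previous rate, followed by a leaky rate r), the weights `w` and `u`, and the test input are all assumptions chosen for illustration. With a non-negative input and non-negative weights, the ReLU unit never leaves its linear regime, so swapping the two rate constants should leave the output unchanged, whereas the sigmoid unit should respond differently.

```python
import numpy as np


def simulate(alpha_s, alpha_r, x, activation, w=0.5, u=1.0):
    """Run one scalar unit with a leaky state s and a leaky rate r.

    The update form and the weights w, u are illustrative assumptions,
    not taken from the figure caption.
    """
    s, r = 0.0, 0.0
    rates = []
    for x_t in x:
        # Leaky integration of the recurrent and external input.
        s = (1.0 - alpha_s) * s + alpha_s * (w * r + u * x_t)
        # Leaky integration of the activated state.
        r = (1.0 - alpha_r) * r + alpha_r * activation(s)
        rates.append(r)
    return np.array(rates)


def relu(s):
    return max(s, 0.0)


def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))


# Non-negative input keeps s >= 0, so ReLU acts as the identity here.
x = np.abs(np.sin(0.3 * np.arange(200)))


def swap_gap(activation):
    """Largest output difference when alpha_s and alpha_r are exchanged."""
    a = simulate(0.68, 0.34, x, activation)
    b = simulate(0.34, 0.68, x, activation)
    return float(np.max(np.abs(a - b)))


relu_gap = swap_gap(relu)
sigmoid_gap = swap_gap(sigmoid)
print("ReLU swap gap:   ", relu_gap)
print("sigmoid swap gap:", sigmoid_gap)
```

Under these assumptions the two leaky filters act in cascade inside a linear loop, and a cascade of linear filters does not depend on their order, which is one way to read the symmetric loss region in the ReLU panels. The sketch says nothing about negative inputs, where the ReLU unit would leave the linear regime.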