Table 3. Impact of different activation functions on network test results.
| Activation function | Sigmoid | Tanh | ReLU | ELU |
|---|---|---|---|---|
| Characteristics | Gradient vanishes | Converges faster than Sigmoid; gradient still vanishes | Gradient does not vanish for positive inputs; gradient is zero for negative inputs | Combines characteristics of Sigmoid and ReLU; gradient vanishes only for strongly negative inputs (soft saturation) |
| Test accuracy | 70.00% | 56.67% | 76.67% | 83.33% |
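
For reference, the sketch below uses the standard definitions of the four activation functions and their derivatives to illustrate the gradient behaviour summarized in Table 3; the ELU constant `alpha = 1.0` is an assumption, as the paper does not state the value used.

```python
import numpy as np

# Standard definitions of the four activation functions compared in Table 3.
# alpha = 1.0 for ELU is an assumed default, not taken from the paper.

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    return np.tanh(x)

def relu(x):
    return np.maximum(0.0, x)

def elu(x, alpha=1.0):
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

# Derivatives illustrate the "vanishing gradient" behaviour noted in the table:
# Sigmoid and Tanh saturate for large |x|, ReLU has an exactly zero gradient
# for negative inputs, and ELU saturates softly on the negative side only.
def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)

def relu_grad(x):
    return (x > 0).astype(float)

def elu_grad(x, alpha=1.0):
    return np.where(x > 0, 1.0, alpha * np.exp(x))

x = np.array([-5.0, -1.0, 0.5, 5.0])
print(sigmoid_grad(x))  # near zero at both extremes (vanishing gradient)
print(relu_grad(x))     # exactly zero for negative inputs, one for positive
print(elu_grad(x))      # small but nonzero for negative inputs
```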