2022 Mar 4;12(3):159. doi: 10.3390/bios12030159

Table 5.

Details of the hyperparameters used in the Long Short-Term Memory network.

Hyperparameter	Parameter Value
Initial Learn Rate	$5 \times 10^{-3}$
Gradient Decay Factor	0.9000
Squared Gradient Decay Factor	0.9990
Epsilon ($\epsilon$)	$1 \times 10^{-8}$
Learn Rate Schedule	piecewise
Learn Rate Drop Factor	0.0100
Learn Rate Drop Period	125,000
L2 Regularization	$1 \times 10^{-4}$
Gradient Threshold Method	L2 norm
Gradient Threshold	1
Maximum Epochs	7000
Mini Batch Size	2
Input and Label Shuffle	once
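
To make two of the table's settings concrete, the following is a minimal Python sketch of a piecewise learning-rate schedule and L2-norm gradient clipping using the values in Table 5. It is an illustration only, not the authors' implementation: the dictionary keys and function names are hypothetical, the drop period is assumed to be counted in epochs (the table does not state the unit), and the clipping is shown on a single flat gradient vector because the table does not say whether the threshold is applied per parameter or globally.

```python
import math

# Values copied from Table 5. The key names are hypothetical,
# chosen only for this illustration.
HYPERPARAMS = {
    "initial_learn_rate": 5e-3,
    "gradient_decay_factor": 0.9,
    "squared_gradient_decay_factor": 0.999,
    "epsilon": 1e-8,
    "learn_rate_drop_factor": 0.01,
    "learn_rate_drop_period": 125_000,
    "l2_regularization": 1e-4,
    "gradient_threshold": 1.0,
    "max_epochs": 7000,
    "mini_batch_size": 2,
}


def piecewise_learn_rate(epoch, hp=HYPERPARAMS):
    """Learning rate at a 1-based epoch.

    Assumption: the drop period is measured in epochs, and the rate is
    multiplied by the drop factor each time a full period has elapsed.
    """
    drops = (epoch - 1) // hp["learn_rate_drop_period"]
    return hp["initial_learn_rate"] * hp["learn_rate_drop_factor"] ** drops


def clip_by_l2_norm(grad, threshold=HYPERPARAMS["gradient_threshold"]):
    """Rescale a gradient vector so its L2 norm does not exceed the threshold."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm <= threshold:
        return list(grad)  # already within the threshold, leave unchanged
    scale = threshold / norm
    return [g * scale for g in grad]
```

Under the epoch-based assumption, the drop period (125,000) is larger than the maximum number of epochs (7000), so `piecewise_learn_rate` returns the initial rate for the whole run. If the period were instead counted in iterations, whether a drop occurs would depend on the number of mini-batches per epoch, which the table does not give.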