Tuning Network Hyperparameters and Structure for Optimal KiDNN Performance
(A) A schematic showing the progressive, multi-phase approach to optimizing the network hyperparameters and structure for peak network performance.
(B) A plot showing the network validation MSE (k = 26) and training MSE as a function of the number of epochs. A polynomial fit (n = 5) of the validation error is also shown and the range of overfitting is indicated.
(C) A heatmap showing the respective LOOCV MSE of 42 networks built with selected combinations of batch sizes and epochs. Yellow regions indicate low relative errors.
(D) A 3D scatterplot illustrating the respective LOOCV MSE of 300 networks built with selected combinations of activation functions, weight initializations, and optimizers. Darker spheres indicate low relative errors corresponding to specific combinations of hyperparameters.
(E) A 3D scatterplot illustrating the respective LOOCV MSE of 120 networks built with selected combinations of hidden layers, nodes per hidden layer, and dropout rate. Darker blue spheres indicate low relative errors corresponding to specific combinations of hyperparameters. A complete list of the tuned hyperparameters is provided in Table S1, and additional trials of the top five hyperparameters are evaluated in Tables S2–S4.
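
The grid evaluation summarized in panel C (one LOOCV MSE per combination of batch size and number of epochs) could be sketched as follows. This is a minimal illustration, not the authors' implementation: a linear model trained with mini-batch SGD stands in for the actual network, the data are synthetic, the 26 samples assume that k = 26 in panel B denotes the number of leave-one-out folds, and the grid values, learning rate, and function names (`train_sgd`, `loocv_mse`) are hypothetical.

```python
import itertools
import numpy as np


def train_sgd(X, y, batch_size, epochs, lr=0.05, seed=0):
    """Fit a linear model with mini-batch SGD (stand-in for the real network)."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    b = 0.0
    n = len(y)
    for _ in range(epochs):
        order = rng.permutation(n)  # reshuffle the samples every epoch
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            err = X[idx] @ w + b - y[idx]
            w -= lr * X[idx].T @ err / len(idx)
            b -= lr * err.mean()
    return w, b


def loocv_mse(X, y, batch_size, epochs):
    """Leave-one-out CV: train on n-1 samples, test on the held-out sample,
    and average the squared errors over all n folds."""
    sq_errors = []
    for i in range(len(y)):
        mask = np.arange(len(y)) != i
        w, b = train_sgd(X[mask], y[mask], batch_size, epochs)
        pred = X[i] @ w + b
        sq_errors.append((pred - y[i]) ** 2)
    return float(np.mean(sq_errors))


# Synthetic data for illustration only (assumed 26 samples, 3 features).
rng = np.random.default_rng(42)
X = rng.normal(size=(26, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=26)

# Hypothetical grid; the grid actually used is given in Table S1.
batch_sizes = [1, 5, 25]
epoch_counts = [10, 50]

# One LOOCV MSE per (batch size, epochs) pair, i.e. one heatmap cell each.
grid = {
    (bs, ep): loocv_mse(X, y, bs, ep)
    for bs, ep in itertools.product(batch_sizes, epoch_counts)
}
best = min(grid, key=grid.get)
print("lowest LOOCV MSE at (batch size, epochs) =", best)
```

The same loop extends to the three-factor grids of panels D and E by adding further hyperparameter lists to `itertools.product`.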