PLoS ONE. 2018 Aug 23;13(8):e0202214. doi: 10.1371/journal.pone.0202214

Table 2. List of hyper-parameters identified for each model using random grid search (Bergstra & Bengio, 2012), their optimized values, and argument descriptions (Pedregosa et al., 2011).

| Model | Parameter | Value | Argument description |
|-------|-----------|-------|----------------------|
| LDA | n_components | 3 | Number of components for dimensionality reduction |
| | solver | svd | Solver to use |
| LR | multi_class | multinomial | Class type; either 'one-versus-rest' or 'multinomial' |
| | C | 973.755518841459 | Inverse of regularization strength |
| | solver | lbfgs | Algorithm to use in the optimization problem |
| | fit_intercept | False | Specifies if a constant should be added to the decision function |
| | class_weight | None | Weights associated with classes |
| NB | alpha | 0.97375551884146 | Smoothing parameter |
| | fit_prior | True | Whether to learn class prior probabilities or not |
| | class_prior | None | Prior probabilities of the classes |
| KNN | n_neighbors | 6 | Number of neighbors to use |
| | weights | distance | Weight function used in prediction |
| | algorithm | brute | Algorithm used to compute the nearest neighbors |
| | p | 1 | Power parameter for the Minkowski metric |
| CDT | max_features | sqrt | Number of features to consider when looking for the best split |
| | min_samples_split | 0.031313293 | Minimum number of samples required to split an internal node |
| | splitter | random | Strategy used to choose the split at each node |
| | criterion | entropy | Function measuring the quality of a split |
| | class_weight | None | Weights associated with classes |
| RF | max_features | sqrt | Number of features to consider when looking for the best split |
| | min_samples_split | 0.007066305 | Minimum number of samples required to split an internal node |
| | class_weight | balanced_subsample | Weights associated with classes |
| | criterion | entropy | Function measuring the quality of a split |
| | n_estimators | 98 | Number of trees in the forest |
| SVM | kernel | poly | Kernel type to be used in the algorithm |
| | C | 21.234911067828 | Penalty parameter C of the error term |
| | gamma | 617.482509627716 | Kernel coefficient |
| | degree | 1 | Degree of the polynomial kernel function |
| NN | hidden_layer_sizes | 200 | The n-th element represents the number of neurons in the n-th hidden layer |
| | alpha | 0.017436642900 | Regularization term |
| | activation | relu | Activation function for the hidden layer |
| | solver | adam | Solver for weight optimization |
| | batch_size | 32 | Size of minibatches for stochastic optimizers |
| | learning_rate | adaptive | Learning rate schedule for weight updates |
| | learning_rate_init | 0.0001 | The initial learning rate used |
| | max_iter | 123 | Maximum number of iterations |
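To illustrate how these tuned values map back onto scikit-learn estimators, the RF row of the table could be reproduced as follows (a minimal sketch of one model only; the original training data are not reproduced here):

```python
from sklearn.ensemble import RandomForestClassifier

# Random forest configured with the optimized values reported in Table 2.
# min_samples_split is a float, so scikit-learn interprets it as a
# fraction of the training samples rather than an absolute count.
rf = RandomForestClassifier(
    n_estimators=98,
    max_features="sqrt",
    min_samples_split=0.007066305,
    class_weight="balanced_subsample",
    criterion="entropy",
)
```

The same pattern applies to the other rows: each `Parameter` column entry is a keyword argument of the corresponding scikit-learn estimator.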

Models were fitted to 10 folds for each of 50 candidates, totaling 500 fits. Acronyms denote: LDA for Linear Discriminant Analysis, LR for Logistic Regression, NB for Naïve Bayes, SVM for Support Vector Machines, KNN for K-Nearest Neighbors, CDT for Classification Decision Tree, RF for Random Forest and NN for Neural Networks.
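The search protocol above (50 sampled candidates, each scored with 10-fold cross-validation, for 500 fits per model) can be sketched with scikit-learn's `RandomizedSearchCV`. The parameter distributions below are illustrative assumptions for the LR model, not the ranges used in the study:

```python
from scipy.stats import loguniform
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RandomizedSearchCV

# 50 candidates x 10 CV folds = 500 fits, matching the protocol above.
# The sampling ranges for C and fit_intercept are assumed for illustration.
search = RandomizedSearchCV(
    LogisticRegression(solver="lbfgs", max_iter=500),
    param_distributions={
        "C": loguniform(1e-3, 1e3),
        "fit_intercept": [True, False],
    },
    n_iter=50,
    cv=10,
    random_state=0,
)
```

After calling `search.fit(X, y)`, `search.best_params_` holds the tuned values, analogous to those reported in Table 2.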