2022 Mar 11;4:782756. doi: 10.3389/fmedt.2022.782756

Table 1.

Hyper-parameter settings and Python library used for implementation.

| Algorithm type | Classifiers | Train-test split | Hyper-parameters | Python library |
| --- | --- | --- | --- | --- |
| Supervised machine learning algorithm | Logistic regression | | Solver = “lbfgs”; Penalty = “l2” | sklearn.linear_model |
| | Gaussian Naïve Bayes | | Variance smoothing = 1e-09 | sklearn.naive_bayes |
| | Decision tree | | Quality of split criterion = “gini”; max_depth varied from 1 to 11 in increments of 1; Maximum number of features to consider = “auto” | sklearn.tree |
| | Random forest | | Quality of split criterion = “gini”; Maximum depth of trees = 11; Maximum number of features to consider = “auto”; Number of trees in the forest = 10 | sklearn.ensemble |
| | AdaBoost | | Learning rate varied from 0.01 to 1.1 in increments of 0.01; Maximum number of estimators at which boosting is terminated varied from 50 to 200 in increments of 10; Algorithm = “SAMME.R” | sklearn.ensemble |
| | K-nearest neighbors | | Number of neighbors set to 2 | sklearn.neighbors |
| | K-nearest neighbors | 70-30% and 10-fold cross validation | Number of neighbors set to 5 | sklearn.neighbors |
| Unsupervised machine learning algorithm | Affinity propagation | | Damping factor = 0.8 (extent to which the current value is maintained relative to the incoming value, which is weighted 1 - damping); Maximum iterations = 200; Maximum number of iterations with no change in the number of estimated clusters = 15 | sklearn.cluster |
| | BIRCH | | Threshold that the subcluster radius must stay below = 0.5; Number of clusters = number of unique IDs in the training set (default = 2) | sklearn.cluster |
| | DBSCAN | | Maximum distance between two samples for consideration as neighbors (eps) = 0.50; Minimum samples in the neighborhood of a point to consider it a core point = 9; Distance calculation method = “euclidean” | sklearn.cluster |
| | K-means | | Number of neighbors set to 2 | sklearn.cluster |
| | Mini-batch K-means | | Number of neighbors set to 2 | sklearn.cluster |
| | Mean shift | | Number of clusters = number of unique IDs in the training set (default = 2) | sklearn.cluster |
| | OPTICS | | Maximum distance between two samples for consideration as neighbors (eps) = 0.80; Minimum samples in the neighborhood of a point to consider it a core point = 10 | sklearn.cluster |
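
The three settings in Table 1 that were varied rather than fixed (decision-tree max_depth, AdaBoost learning rate, AdaBoost number of estimators) can be written out as explicit grids. The following is a minimal sketch, not the authors' code. The helper names (inclusive_range, adaboost_candidates, evaluate_tree_depths) are hypothetical, and the mapping from the table's wording to the scikit-learn keywords learning_rate, n_estimators, criterion and max_depth is an assumption. The table also does not say whether the two AdaBoost ranges were searched jointly, or whether the 10-fold cross-validation was run on the 70% training portion; both are assumed here. Maximum number of features = “auto” is left out because recent scikit-learn releases no longer accept that value.

```python
from itertools import product


def inclusive_range(start, stop, step, ndigits=2):
    """Evenly spaced values from start to stop, both ends included."""
    count = int(round((stop - start) / step)) + 1
    return [round(start + i * step, ndigits) for i in range(count)]


# Decision tree: max_depth varied from 1 to 11 in increments of 1.
TREE_DEPTHS = list(range(1, 12))

# AdaBoost: learning rate 0.01-1.1 (step 0.01), estimators 50-200 (step 10).
ADABOOST_GRID = {
    "learning_rate": inclusive_range(0.01, 1.1, 0.01),
    "n_estimators": list(range(50, 201, 10)),
}


def adaboost_candidates(grid=ADABOOST_GRID):
    """Yield one keyword dict per (learning_rate, n_estimators) pair.

    Assumption: the two ranges are combined as a full Cartesian product.
    """
    for rate, n in product(grid["learning_rate"], grid["n_estimators"]):
        yield {"learning_rate": rate, "n_estimators": n}


def evaluate_tree_depths(X, y, depths=TREE_DEPTHS, seed=0):
    """Score each depth with a 70-30 split and 10-fold cross-validation.

    Requires scikit-learn; it is imported here so the grids above remain
    usable without it. Assumption: cross-validation runs on the training
    portion only, and the held-out 30% is scored separately.
    """
    from sklearn.model_selection import cross_val_score, train_test_split
    from sklearn.tree import DecisionTreeClassifier

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=seed
    )
    results = {}
    for depth in depths:
        model = DecisionTreeClassifier(
            criterion="gini", max_depth=depth, random_state=seed
        )
        # cross_val_score fits clones, so `model` itself is still unfitted here.
        cv_accuracy = cross_val_score(model, X_train, y_train, cv=10).mean()
        test_accuracy = model.fit(X_train, y_train).score(X_test, y_test)
        results[depth] = (cv_accuracy, test_accuracy)
    return results
```

Each dictionary produced by adaboost_candidates() could be passed as keyword arguments to sklearn.ensemble.AdaBoostClassifier. Under the joint-search assumption the grid holds 110 learning rates times 16 estimator counts, so an exhaustive search is considerably more expensive than the 11-value decision-tree sweep.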