Biol Direct. 2020 Jan 15;15:1. doi: 10.1186/s13062-019-0257-6

Table 1.

Summary of classification algorithms evaluated on the training set

Classification algorithm | scikit-learn implementation | Parameters selected after optimization

Multilayer Perceptron | sklearn.neural_network.MLPClassifier | activation = ‘relu’; alpha = 0.0001; batch_size = ‘auto’; beta_1 = 0.9; beta_2 = 0.999; early_stopping = False; epsilon = 1e-08; hidden_layer_sizes = (30,30,30,30,30,30,30,30,30,30); learning_rate = ‘constant’; learning_rate_init = 0.0376; max_iter = 200; momentum = 0.9; nesterovs_momentum = True; power_t = 0.5; random_state = None; shuffle = True; solver = ‘adam’; tol = 0.0001; validation_fraction = 0.1; warm_start = False

Gradient Boosting | sklearn.ensemble.GradientBoostingClassifier | criterion = ‘friedman_mse’; init = None; learning_rate = 0.31; loss = ‘deviance’; max_depth = 3; max_features = None; max_leaf_nodes = None; min_impurity_decrease = 0.0; min_impurity_split = None; min_samples_leaf = 1; min_samples_split = 2; min_weight_fraction_leaf = 0.0; n_estimators = 100; presort = ‘auto’; subsample = 1.0; warm_start = False

K-nearest Neighbor | sklearn.neighbors.KNeighborsClassifier | algorithm = ‘auto’; leaf_size = 30; metric = ‘minkowski’; metric_params = None; n_neighbors = 8; p = 2; weights = ‘distance’

Logistic Regression | sklearn.linear_model.LogisticRegression | C = 1.0; class_weight = None; dual = False; fit_intercept = True; intercept_scaling = 1; max_iter = 100; multi_class = ‘ovr’; penalty = ‘l2’; solver = ‘lbfgs’; tol = 0.0001; warm_start = False

Gaussian Naïve Bayes | sklearn.naive_bayes.GaussianNB | priors = None

Random Forest | sklearn.ensemble.RandomForestClassifier | bootstrap = False; class_weight = None; criterion = ‘gini’; max_depth = 9; min_samples_split = 2; min_samples_leaf = 1; min_weight_fraction_leaf = 0.0; max_features = ‘auto’; max_leaf_nodes = 25; min_impurity_decrease = 0.0; min_impurity_split = None; n_estimators = 25; oob_score = False; warm_start = False

Support Vector Machines | sklearn.svm.SVC | C = 1.0; class_weight = None; coef0 = 0.0; decision_function_shape = ‘ovr’; degree = 3; gamma = ‘auto’; kernel = ‘rbf’; max_iter = -1; probability = False; shrinking = True; tol = 0.001

Voting-based Ensemble | sklearn.ensemble.VotingClassifier | flatten_transform = True; voting = ‘soft’; weights = None

In Phase I, we employed seven classification algorithms and a voting-based method that integrated the predictions of the individual classifiers. The first two columns indicate the name of each algorithm and the scikit-learn implementation we used for it. Using an ad hoc approach, we evaluated many hyperparameter combinations via cross-validation on the training set and selected the best-performing combination for each algorithm. Non-default parameters are bolded. Hyperparameters that do not fundamentally affect algorithm behavior, such as the number of parallel jobs, are not shown.
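
For illustration, the following Python sketch (our reconstruction, not code published with the article) shows how these classifiers could be instantiated in scikit-learn with the selected hyperparameters and combined into the soft-voting ensemble listed in the last row. The names X_train and y_train, the cv=5 setting, and the probability=True argument for the SVC (needed so that soft voting can average class probabilities, whereas the table lists probability = False for the standalone SVC) are assumptions made for this example; parameters not passed explicitly fall back to scikit-learn defaults, which may differ between library versions from the values listed above.

# Minimal sketch (assumed, not the authors' published code): instantiate the
# Phase I classifiers with the selected hyperparameters and combine them into
# the soft-voting ensemble.
from sklearn.neural_network import MLPClassifier
from sklearn.ensemble import (GradientBoostingClassifier, RandomForestClassifier,
                              VotingClassifier)
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

estimators = [
    ("mlp", MLPClassifier(hidden_layer_sizes=(30,) * 10, activation="relu",
                          solver="adam", learning_rate_init=0.0376)),
    ("gb", GradientBoostingClassifier(learning_rate=0.31, n_estimators=100,
                                      max_depth=3)),
    ("knn", KNeighborsClassifier(n_neighbors=8, weights="distance", p=2)),
    ("lr", LogisticRegression(penalty="l2", solver="lbfgs", C=1.0)),
    ("gnb", GaussianNB(priors=None)),
    ("rf", RandomForestClassifier(n_estimators=25, bootstrap=False, max_depth=9,
                                  max_leaf_nodes=25, criterion="gini")),
    # probability=True is an assumption so the SVC can return class probabilities
    # for soft voting; the table lists probability = False for the standalone SVC.
    ("svm", SVC(kernel="rbf", gamma="auto", C=1.0, probability=True)),
]

# Soft voting averages the class probabilities predicted by the member classifiers.
ensemble = VotingClassifier(estimators=estimators, voting="soft", weights=None)

# Assumed usage: X_train and y_train are the training features and labels;
# cv=5 is an illustrative choice, not taken from the table.
# scores = cross_val_score(ensemble, X_train, y_train, cv=5)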