Table 1.
ML technique | Median training time in ms (5th–95th perc.) | Median prediction time in ms (5th–95th perc.) | Hyper-parameter (range considered for optimization)
---|---|---|---
Logistic regression (LR) | 9.51 (5.50, 12.58) | 0.06 (0.05, 0.11) | C: Inverse of the regularization strength (10⁻³–10¹⁰)
Support vector machine (SVM) | 208.69 (96.01, 1,745.5) | 12.94 (5.19, 29.81) | C: Penalty parameter that favors smoother decision boundaries when set to a smaller value (10⁻³–10⁵)
Neural network (NN) | 412.23 (47.72, 465.35) | 0.24 (0.22, 0.33) | α: An L2-regularization parameter that attempts to reduce over-fitting; larger values give smoother decision boundaries (10⁻⁸–10⁵)
Naïve-Bayes (NB) | 0.73 (0.70, 1.26) | 0.69 (0.66, 1.23) | None |
Random forest (RF) | 399.98 (7.35, 7,304.26) | 4.74 (0.23, 86.40) | N estimators: The number of trees in the forest (10–1,000)
k-Nearest neighbor (kNN) | 1.73 (1.64, 2.39) | 6.00 (1.20, 67.10) | N: The number of nearest training samples (by Euclidean distance) treated as neighbors of the point being predicted (10–1,000)
Kernel density estimation (KDE) | 1.57 (1.44, 2.55) | 11.33 (6.15, 55.62) | Bandwidth: The standard deviation of the Gaussian kernel used for fitting a KDE model (10⁻⁴–10¹)
Automatic KDE (aKDE) | 1.81 (1.54, 2.82) | 30.84 (26.83, 36.94) | None: Bandwidth is calculated automatically using Silverman's approximation
Training times are estimated for 1,350–3,000 samples in each case, whilst prediction times are for 135–300 samples (validation step). A brief description of the hyper-parameter used in each case is provided (if applicable), with the range tested given in parentheses. Computation times are from a 3.5 GHz personal machine with 16 GB of memory and an Intel Iris Plus graphics card.
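To illustrate the aKDE entry, the sketch below shows one common form of Silverman's rule-of-thumb bandwidth and a one-dimensional Gaussian KDE built from it. This is a minimal NumPy illustration, not the implementation used to produce the timings above; the function names and the robust-spread variant (min of standard deviation and IQR/1.34) are assumptions for this example.

```python
import numpy as np

def silverman_bandwidth(x):
    """Rule-of-thumb bandwidth for a 1-D Gaussian KDE (Silverman's approximation).

    Uses the robust variant: 0.9 * min(std, IQR/1.34) * n^(-1/5).
    """
    n = x.size
    sigma = np.std(x, ddof=1)
    iqr = np.subtract(*np.percentile(x, [75, 25]))
    spread = min(sigma, iqr / 1.34)  # guard against heavy tails / multimodality
    return 0.9 * spread * n ** (-0.2)

def gaussian_kde(x_train, x_query, h):
    """Evaluate a Gaussian KDE with bandwidth h at each query point."""
    z = (x_query[:, None] - x_train[None, :]) / h
    kernel = np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)
    return kernel.sum(axis=1) / (x_train.size * h)

rng = np.random.default_rng(0)
samples = rng.normal(size=1000)           # standard-normal training data
h = silverman_bandwidth(samples)          # no hyper-parameter search needed
density = gaussian_kde(samples, np.array([0.0]), h)
```

Because the bandwidth comes directly from the data, aKDE needs no hyper-parameter optimization, which is why the table lists "None" for it; the trade-off is that the rule-of-thumb bandwidth is only well suited to roughly unimodal, near-Gaussian data.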