ST |
Standard training. This method uses the ℓ1-penalized regression techniques outlined in Section 2.2, training one model on the entire training set. The regularization parameter λ is chosen through cross-validation. |
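As an illustration (not code from the paper), a minimal sketch of the ST baseline using scikit-learn's `LassoCV`, which fits one ℓ1-penalized model to the full training set and selects λ (called `alpha` in scikit-learn) by cross-validation. The data here are synthetic and purely for demonstration.

```python
import numpy as np
from sklearn.linear_model import LassoCV

# Toy data: 200 observations, 50 predictors, sparse true signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))
beta = np.zeros(50)
beta[:5] = 2.0                       # only the first 5 predictors matter
y = X @ beta + rng.normal(size=200)

# One l1-penalized model on all training data; lambda chosen by 5-fold CV.
st = LassoCV(cv=5).fit(X, y)
# st.alpha_ holds the CV-selected regularization parameter;
# st.coef_ holds the (sparse) fitted coefficients.
```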
SVM |
Support vector machine. The cost-tuning parameter is chosen through cross-validation. |
KSVM |
K-means + SVM. We cluster the training data into K clusters via the K-means algorithm and fit an SVM to each training cluster. Test data are assigned to the nearest cluster centroid. This method is a simpler, special case of the clustered SVMs proposed by Gu and Han (2013), whose recommendation of K = 8 we use. |
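A minimal sketch of the KSVM baseline (illustrative, not the paper's implementation): cluster the training set with K-means, fit one SVM per cluster, and route each test point to the SVM of its nearest centroid. K = 3 is used here instead of the paper's K = 8 only to keep the toy per-cluster sample sizes reasonable.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

K = 3

# Toy data: three well-separated clusters along the first coordinate;
# the class label depends only on the second coordinate, so every
# cluster contains both classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 2))
X[:, 0] += rng.integers(0, K, size=300) * 6.0
y = (X[:, 1] > 0).astype(int)

# Step 1: cluster the training data via K-means.
km = KMeans(n_clusters=K, n_init=10, random_state=0).fit(X)

# Step 2: fit one SVM per training cluster.
svms = {c: SVC().fit(X[km.labels_ == c], y[km.labels_ == c])
        for c in range(K)}

# Step 3: assign test points to the nearest centroid's SVM.
def ksvm_predict(X_new):
    cluster = km.predict(X_new)          # nearest-centroid assignment
    pred = np.empty(len(X_new), dtype=int)
    for c in range(K):
        mask = cluster == c
        if mask.any():
            pred[mask] = svms[c].predict(X_new[mask])
    return pred

acc = (ksvm_predict(X) == y).mean()      # training accuracy of the sketch
```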
RF |
Random forests. At each split we consider √p of the p predictor variables (classification) or p/3 of the p predictor variables (regression). |
KNN |
k-nearest neighbors. This simple technique for classification and regression serves as another “local” method against which to contrast the performance of customized training. The parameter k is chosen via cross-validation. |
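As a final illustrative sketch (again assuming scikit-learn, on synthetic data), choosing k for the KNN baseline by cross-validation can be done with a small grid search; the candidate grid below is an assumption, not taken from the paper.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

# Toy data: label determined by the sum of the first two predictors.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Choose k by 5-fold cross-validation over an illustrative grid of odd values.
grid = {"n_neighbors": [1, 3, 5, 7, 9, 11]}
knn = GridSearchCV(KNeighborsClassifier(), grid, cv=5).fit(X, y)
# knn.best_params_["n_neighbors"] holds the CV-selected k.
```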