Table 1.
Machine Learning Algorithm | Principle | Advantages | Drawbacks |
---|---|---|---|
Linear regression | It assumes a linear relationship between the input variables and the output and models this relationship by fitting a linear equation to the observed data. Of its several implementations, the most commonly used is ordinary least squares, which minimizes the residual sum of squares between the observed and predicted targets (see the sketch after the table). | | |
Linear discriminant analysis (LDA) | It is used to identify the class to which samples belong. Certain statistical properties of the data are first calculated and then substituted into the LDA equation: the mean and variance in the case of a single input, or the means and covariance matrix in the case of multiple inputs (see the sketch after the table). | | |
Random Forest | It builds a number of decision trees on bootstrapped training sets and, to overcome the problem of high variance, considers only a random sample of m predictors out of the full set of p predictors as split candidates. On average, the strongest predictor is therefore not always chosen, and the other predictors have a better chance of being selected. This process can be thought of as decorrelating the trees, which makes the average of the resulting trees less variable and hence more accurate and reliable (see the sketch after the table). | | |
Support vector machine | It converts a non-linearly separable problem into a linearly separable one by transforming the data into a higher-dimensional space, which is accomplished using various types of so-called kernel functions. Classification is then performed by finding the hyperplane that best separates the classes of samples (see the sketch after the table). | | |
Discriminant analysis via mixed integer programming (DAMIP) | It is a classification model based on a powerful supervised-learning approach used primarily in the biomedical field. It is a discrete support vector machine coupled with an embedded feature-selection module [176]. | | |
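
As a minimal sketch of the ordinary least squares principle described for linear regression, the following Python snippet (NumPy and synthetic toy data are assumptions for illustration, not taken from the source) fits an intercept and slope by minimizing the residual sum of squares:

```python
import numpy as np

# Hypothetical toy data: y is approximately 2*x + 1 plus noise.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(50, 1))
y = 2.0 * X[:, 0] + 1.0 + rng.normal(scale=0.5, size=50)

# Ordinary least squares: append an intercept column and solve for the
# coefficients that minimize the residual sum of squares ||y - Xb||^2.
X_design = np.column_stack([np.ones(len(X)), X])
coef, residuals, rank, _ = np.linalg.lstsq(X_design, y, rcond=None)

y_pred = X_design @ coef
rss = np.sum((y - y_pred) ** 2)
print(f"intercept={coef[0]:.3f}, slope={coef[1]:.3f}, RSS={rss:.3f}")
```

The same fit could equally be obtained with a library estimator such as scikit-learn's LinearRegression.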
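For LDA, a brief sketch using scikit-learn's LinearDiscriminantAnalysis (the iris dataset and the train/test split are illustrative assumptions): the estimator computes the per-class means and a shared covariance matrix from the training data and assigns each sample to the class with the highest discriminant score.

```python
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Illustrative data: class means and a shared covariance matrix are
# estimated from the training set and plugged into the discriminant rule.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

lda = LinearDiscriminantAnalysis()
lda.fit(X_train, y_train)
print("test accuracy:", lda.score(X_test, y_test))
```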
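For the random forest row, an illustrative sketch (synthetic data and scikit-learn are assumptions) showing how the number of bootstrapped trees and the size of the random predictor subset m are controlled:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score

# Synthetic data standing in for a real feature matrix.
X, y = make_classification(n_samples=300, n_features=20, n_informative=5,
                           random_state=0)

# n_estimators trees are grown on bootstrapped samples; max_features="sqrt"
# restricts each split to a random subset of m = sqrt(p) predictors, which
# decorrelates the trees before their predictions are aggregated.
forest = RandomForestClassifier(n_estimators=200, max_features="sqrt",
                                bootstrap=True, random_state=0)
print("CV accuracy:", cross_val_score(forest, X, y, cv=5).mean())
```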
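For the support vector machine row, a short sketch on a deliberately non-linearly separable toy problem (concentric circles, chosen here purely for illustration), where a kernel implicitly provides the higher-dimensional mapping in which a separating hyperplane exists:

```python
from sklearn.svm import SVC
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split

# Concentric circles are not linearly separable in the original 2-D space.
X, y = make_circles(n_samples=300, factor=0.3, noise=0.05, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# The RBF kernel implicitly maps the data into a higher-dimensional space in
# which a separating hyperplane can be found; a linear kernel would fail here.
clf = SVC(kernel="rbf", C=1.0, gamma="scale")
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```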