2017 Mar 9;18:160. doi: 10.1186/s12859-017-1565-4

Table 1.

Prominent options for choosing loss function and regularizer in feature extraction algorithms

Name                                     Loss function (L)           Regularizer (R)
AIC/BIC                                  ∥y − ⟨ω,x⟩∥²                ∥ω∥₀
Lasso                                    ∥y − ⟨ω,x⟩∥²                ∥ω∥₁
Elastic Net                              ∥y − ⟨ω,x⟩∥²                ∥ω∥₂² + ∥ω∥₁
Regularized Least Absolute
  Deviations Regression                  ∥y − ⟨ω,x⟩∥₁                ∥ω∥₁
Classic SVM                              max(0, 1 − y⟨ω,x⟩)ᵃ         ½∥ω∥₂²
ℓ1-SVM                                   max(0, 1 − y⟨ω,x⟩)ᵃ         ½∥ω∥₁
Logistic Regression                      log(1 + exp(−y⟨ω,x⟩))       ½∥ω∥₁

ᵃThis is the so-called hinge loss
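To make the table concrete, here is a minimal sketch (not from the paper) of a few of these loss functions and one full regularized objective. All rows of Table 1 share the form L + c·R for some trade-off parameter c > 0; the function names and the parameter name `c` are our own choices for illustration.

```python
import numpy as np

def squared_loss(y, score):
    """Squared loss, used by the AIC/BIC, Lasso, and Elastic Net rows."""
    return (y - score) ** 2

def hinge_loss(y, score):
    """Hinge loss max(0, 1 - y*score), used by the classic and l1 SVM rows."""
    return max(0.0, 1.0 - y * score)

def logistic_loss(y, score):
    """Logistic loss log(1 + exp(-y*score)) from the last row."""
    return np.log1p(np.exp(-y * score))

def lasso_objective(w, X, y, c=1.0):
    """The Lasso row: squared loss of the linear fit plus c times the l1 norm of w."""
    residual = y - X @ w
    return residual @ residual + c * np.sum(np.abs(w))
```

For example, with a perfect fit the Lasso objective reduces to c times the ℓ1 norm of the weight vector, which is what drives small coefficients exactly to zero.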

The ℓ1- and ℓ2-norm of a vector z = (z₁, …, z_d) ∈ ℝᵈ are defined by ∥z∥₁ = Σⱼ₌₁ᵈ |zⱼ| and ∥z∥₂ = (Σⱼ₌₁ᵈ |zⱼ|²)^½, respectively. The “ℓ0-norm” ∥z∥₀ simply counts the number of non-zero entries of z
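These three quantities are straightforward to compute directly from the definitions; a short sketch (the example vector is our own):

```python
import numpy as np

z = np.array([3.0, 0.0, -4.0])

l1 = np.sum(np.abs(z))        # ||z||_1 = 3 + 0 + 4 = 7
l2 = np.sqrt(np.sum(z ** 2))  # ||z||_2 = sqrt(9 + 0 + 16) = 5
l0 = np.count_nonzero(z)      # "l0-norm": number of non-zero entries = 2
```

Note that, unlike the ℓ1- and ℓ2-norms, the ℓ0 count is not a true norm (it is not homogeneous), which is why the paper sets it in quotes.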