Algorithm 2. Stochastic Gradient Descent (SGD) with Endogenous Learning Rate.
    t := 1
    α̂i := 0 and Vi := −1 for i = 1, …, NL
    do {
        Choose one index k ∈ {1, …, NL}.
        αnew := min{C, αnew} and αnew := max{−C, αnew}
        Initialize the KKT distance: KKT := 0
        loop over all i = 1, …, NL
            Vi := Vi + yi(αnew − α̂k)Gik
            KKT := KKT + KKT_distance(Vi, yiα̂i)
        end loop
        KKT := KKT/(NL)
        α̂k := αnew
        t := t + 1
    } while (KKT > θ)
Note: N is the number of data points, L is the number of classes, ηeff is the learning rate, and θ is the stopping threshold on the mean KKT distance. Note that this algorithm needs to maintain the Vi values; from the initialization Vi := −1 (at α̂ = 0) and the incremental update in the inner loop, the invariant Vi = yi Σj α̂j Gij − 1 holds throughout.
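Below is a minimal Python sketch of this loop. Two pieces are assumptions, since the listing does not spell them out: the step computing αnew from ηeff (taken here as αnew := α̂k − ηeff yk Vk, i.e. a gradient step on the quadratic dual, whose partial derivative with respect to αk equals yk Vk given the invariant above and a symmetric G) and the function kkt_distance, written here as one plausible per-point measure of KKT violation. Both are illustrative stand-ins rather than the paper's exact definitions.

    import numpy as np

    def kkt_distance(v, y_alpha, C):
        # Hypothetical per-point KKT violation: V_i should be >= 0 when
        # y_i * alpha_i sits at its lower end, <= 0 at the upper bound C,
        # and exactly 0 for interior values.
        if y_alpha <= 0.0:
            return max(0.0, -v)
        if y_alpha >= C:
            return max(0.0, v)
        return abs(v)

    def sgd_endogenous(G, y, C, eta_eff, theta, max_iter=1_000_000, seed=0):
        rng = np.random.default_rng(seed)
        NL = len(y)                  # NL = N * L flattened point-class pairs
        alpha = np.zeros(NL)         # dual variables, initialized to 0
        V = -np.ones(NL)             # invariant: V_i = y_i * sum_j alpha_j * G[i, j] - 1
        for t in range(1, max_iter + 1):   # safety cap, not in the listing
            k = int(rng.integers(NL))      # choose one index k uniformly
            # Assumed SGD step; the listing does not show how alpha_new
            # is computed from eta_eff.
            alpha_new = alpha[k] - eta_eff * y[k] * V[k]
            # Clip alpha_new to the box [-C, C].
            alpha_new = min(C, max(-C, alpha_new))
            # Update every V_i for the change in alpha_k and accumulate
            # the KKT distance over all NL points, as in the inner loop.
            kkt = 0.0
            for i in range(NL):
                V[i] += y[i] * (alpha_new - alpha[k]) * G[i, k]
                kkt += kkt_distance(V[i], y[i] * alpha[i], C)
            kkt /= NL
            alpha[k] = alpha_new
            if kkt <= theta:         # stop once the mean KKT distance is small
                break
        return alpha, V

With a precomputed NL × NL matrix G and labels y ∈ {−1, +1}, a call such as sgd_endogenous(G, y, C=1.0, eta_eff=0.1, theta=1e-3) iterates until the mean KKT distance falls below θ; the inner Python loop mirrors the listing line by line rather than being vectorized.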