Algorithm 2.

Stochastic Gradient Descent (SGD) with Endogenous Learning Rate.

t := 1
α_i := 0 and V_i := −1 for i = 1, …, NL
do {
 Choose one index k ∈ {1, …, NL} at random
 α_new := α̂_k − η_eff V_k(t) y_k / G_kk
 α_new := min{C, α_new} and α_new := max{−C, α_new}   (clip α_new to [−C, C])
 Initialize the KKT distance: KKT := 0
 loop over all i = 1, …, NL
  V_i := V_i + y_i (α_new − α̂_k) G_ik
  KKT := KKT + KKT distance(V_i, y_i α̂_i)
 end loop
 KKT := KKT/(NL)
 α̂_k := α_new
 t := t + 1
 } while (KKT > θ)
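Unwinding the updates makes the role of V_i explicit (a reading of this excerpt, not a statement quoted from the paper): V_i starts at −1 and gains y_i (α_new − α̂_k) G_ik each time coordinate k moves, so the loop maintains the invariant

 V_i(t) = y_i Σ_k α̂_k(t) G_ik − 1,

that is, the margin of point i minus one. The KKT distance then checks the sign of this quantity against the position of y_i α̂_i relative to its bounds.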

Note: N is the number of data points, L is the number of classes, η_eff is the learning rate, and θ is the stopping threshold. Note that this algorithm must maintain the V_i values: the inner loop updates all NL of them after every coordinate step.
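As a concrete illustration, below is a minimal NumPy sketch of Algorithm 2. The coordinate update, the clipping, the incremental V_i bookkeeping, and the stopping test follow the pseudocode above; the function name sgd_endogenous, the specific per-point KKT distance kkt_dist, and all default parameter values are illustrative assumptions, not definitions taken from the paper.

    import numpy as np

    def sgd_endogenous(G, y, C=1.0, eta_eff=0.5, theta=1e-3,
                       max_iter=100_000, seed=0):
        """SGD with endogenous learning rate on the box-constrained dual.

        G : (NL, NL) positive semidefinite kernel matrix (G_ik above)
        y : (NL,) labels in {-1, +1}
        """
        rng = np.random.default_rng(seed)
        NL = len(y)
        alpha = np.zeros(NL)   # alpha_i := 0
        V = -np.ones(NL)       # V_i := -1, i.e. margin minus one at alpha = 0

        def kkt_dist(v, ya):
            # Assumed per-point KKT violation: at the lower bound of y_i*alpha_i
            # we need v >= 0, at the upper bound v <= 0, inside the box v = 0.
            if ya <= 0.0:
                return max(0.0, -v)
            if ya >= C:
                return max(0.0, v)
            return abs(v)

        for _ in range(max_iter):
            k = rng.integers(NL)  # choose one index k at random
            # Step scaled by the curvature G_kk (our reading of the
            # "endogenous" learning rate in the update rule).
            a_new = alpha[k] - eta_eff * V[k] * y[k] / G[k, k]
            a_new = min(C, max(-C, a_new))       # clip to [-C, C]
            # Incremental refresh of every V_i (the O(NL) inner loop).
            V += y * (a_new - alpha[k]) * G[:, k]
            alpha[k] = a_new
            kkt = sum(kkt_dist(V[i], y[i] * alpha[i]) for i in range(NL)) / NL
            if kkt <= theta:   # stop once the mean KKT distance is small
                break
        return alpha

With a linear kernel, for instance, one could pass G = X @ X.T for a design matrix X. Each step costs O(NL), dominated by the V refresh and the KKT average, which is the bookkeeping cost the note above alludes to.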