We propose a new stochastic gradient method for optimizing the sum of a finite set of smooth functions, where the sum is strongly convex. While standard stochastic gradient methods converge at sublinear rates for this problem, the proposed method incorporates a memory of previous gradient values in order to achieve a linear convergence rate. In a machine learning context, numerical experiments indicate that the new algorithm can dramatically outperform standard algorithms, both in terms of optimizing the training error and reducing the test error quickly.
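The gradient-memory idea can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`sag`, `grad_i`), the toy least-squares data, and the constant step size of 1/32 are all hypothetical choices made for this example. Each iteration samples one index, recomputes that function's gradient, overwrites the stored copy, and steps along the average of all stored gradients.

```python
import random


def sag(grad_i, n, dim, step, iters, seed=0):
    """Sketch of a stochastic average gradient loop with per-function memory."""
    rng = random.Random(seed)
    x = [0.0] * dim
    memory = [[0.0] * dim for _ in range(n)]  # last gradient seen for each f_i
    total = [0.0] * dim                       # running sum of the stored gradients
    for _ in range(iters):
        i = rng.randrange(n)                  # pick one function uniformly at random
        g = grad_i(i, x)                      # fresh gradient of f_i at the current x
        for j in range(dim):
            total[j] += g[j] - memory[i][j]   # swap the old stored gradient for the new one
            x[j] -= step * total[j] / n       # step along the average stored gradient
        memory[i] = g
    return x


# Hypothetical toy problem: f_i(x) = 0.5 * (a_i . x - b_i)^2, with b = A x_true.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]
x_true = [2.0, -1.0]
b = [sum(a_j * t_j for a_j, t_j in zip(a, x_true)) for a in A]


def grad_i(i, x):
    """Gradient of the i-th squared residual."""
    r = sum(a_j * x_j for a_j, x_j in zip(A[i], x)) - b[i]
    return [r * a_j for a_j in A[i]]


# Assumed conservative constant step for this toy data; not a tuned value.
x_hat = sag(grad_i, n=len(A), dim=2, step=1.0 / 32.0, iters=5000)
```

The running sum `total` keeps each iteration's cost independent of the number of functions, at the price of storing one gradient per function. In this example each gradient is a scalar residual times `a_i`, so storing only that scalar per function would suffice.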
* The notable changes over the current version:
  - worked example of convergence rates showing SAG can be faster than first-order methods
  - pointing out that the storage cost is O(n) for linear models
  - the more-stable line-search
  - comparison to additional optimal SG methods
  - comparison to rates of coordinate descent methods in quadratic case