
SGD with shuffling: optimal rates without component convexity and large epoch requirements

Jun 12, 2020
Kwangjun Ahn, Chulhee Yun, Suvrit Sra



We study without-replacement SGD for solving finite-sum optimization problems. Depending on how the indices of the finite sum are shuffled, we consider two algorithms: SingleShuffle (shuffle only once) and RandomShuffle (shuffle at the beginning of each epoch). First, we establish minimax optimal convergence rates for these algorithms, up to poly-log factors. Notably, our analysis is general enough to cover gradient-dominated nonconvex costs and, unlike existing optimal convergence results, does not rely on the convexity of individual component functions. Second, assuming convexity of the individual components, we further sharpen the tight convergence results for RandomShuffle by removing two drawbacks common to all prior work: the large number of epochs required for the results to hold, and the extra poly-log factor gaps to the lower bound.
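To make the difference between the two shuffling schemes concrete, here is a minimal Python sketch. It is an illustration only, not the authors' code: the function name `shuffled_sgd`, the constant step size, and the toy scalar quadratic components are all assumptions chosen for brevity, and the step-size choices analyzed in the paper are not reproduced here.

```python
import random


def shuffled_sgd(grads, x0, step_size, epochs, mode="random", seed=0):
    """Without-replacement SGD on F(x) = (1/n) * sum_i f_i(x).

    grads is a list of per-component gradient functions (hypothetical interface).
    mode="single": draw one permutation and reuse it in every epoch (SingleShuffle).
    mode="random": draw a fresh permutation at the start of each epoch (RandomShuffle).
    """
    if mode not in ("single", "random"):
        raise ValueError("mode must be 'single' or 'random'")
    rng = random.Random(seed)
    order = list(range(len(grads)))
    if mode == "single":
        rng.shuffle(order)  # shuffled once, before the first epoch
    x = x0
    for _ in range(epochs):
        if mode == "random":
            rng.shuffle(order)  # reshuffled at the beginning of each epoch
        for i in order:
            # Each component is visited exactly once per epoch.
            x = x - step_size * grads[i](x)
    return x


# Toy problem (an assumption for illustration): f_i(x) = 0.5 * (x - a_i)^2,
# so the minimizer of the average is the mean of the a_i.
targets = [1.0, 2.0, 3.0, 4.0]
grads = [lambda x, a=a: x - a for a in targets]

x_single = shuffled_sgd(grads, x0=0.0, step_size=0.01, epochs=200, mode="single")
x_random = shuffled_sgd(grads, x0=0.0, step_size=0.01, epochs=200, mode="random")
```

With a small constant step size, both runs end close to the mean of `targets` (2.5). The only difference between the two variants is where the permutation is drawn.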

* 49 pages; this work completely replaces the previous version and adds many new results. Please see Table 1 for an overall summary.

