Poster
SGD with shuffling: optimal rates without component convexity and large epoch requirements
Kwangjun Ahn · Chulhee Yun · Suvrit Sra

Thu Dec 10 09:00 PM -- 11:00 PM (PST) @ Poster Session 6 #1865

We study without-replacement SGD for solving finite-sum optimization problems. Specifically, depending on how the indices of the finite sum are shuffled, we consider the RandomShuffle (shuffle at the beginning of each epoch) and SingleShuffle (shuffle only once) algorithms. First, we establish minimax optimal convergence rates of these algorithms up to poly-log factors. Notably, our analysis is general enough to cover gradient-dominated nonconvex costs, and, unlike existing optimal convergence results, it does not rely on the convexity of the individual component functions. Second, assuming convexity of the individual components, we further sharpen the tight convergence results for RandomShuffle by removing two drawbacks common to all prior work: the large number of epochs required for the results to hold, and the extra poly-log factor gap to the lower bound.
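As a concrete illustration of the two shuffling schemes compared in the abstract, below is a minimal NumPy sketch of without-replacement SGD. This is not the authors' implementation: the function name shuffled_sgd, the constant step size, and the toy least-squares usage are illustrative assumptions.

    import numpy as np

    def shuffled_sgd(grads, x0, lr, epochs, scheme="RandomShuffle", seed=0):
        """Without-replacement SGD on f(x) = (1/n) * sum_i f_i(x).

        grads: list of n component gradient functions; grads[i](x) returns grad f_i(x).
        scheme: "RandomShuffle" redraws the permutation at the start of every epoch;
                "SingleShuffle" draws one permutation and reuses it in all epochs.
        """
        rng = np.random.default_rng(seed)
        n = len(grads)
        perm = rng.permutation(n)  # fixed permutation, used throughout by SingleShuffle
        x = np.asarray(x0, dtype=float)
        for _ in range(epochs):
            if scheme == "RandomShuffle":
                perm = rng.permutation(n)  # fresh shuffle each epoch
            for i in perm:  # one full pass over the n components, without replacement
                x = x - lr * grads[i](x)
        return x

    # Toy usage on synthetic least squares: f_i(x) = 0.5 * (a_i^T x - b_i)^2.
    rng = np.random.default_rng(1)
    A, b = rng.standard_normal((100, 5)), rng.standard_normal(100)
    grads = [lambda x, a=a_i, bi=b_i: a * (a @ x - bi) for a_i, b_i in zip(A, b)]
    x_hat = shuffled_sgd(grads, np.zeros(5), lr=0.01, epochs=50)

The only difference between the two algorithms is whether the permutation is redrawn per epoch; the default-argument trick in the lambdas simply pins each component's data at construction time.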

Author Information

Kwangjun Ahn (MIT)
Chulhee Yun (MIT)
Suvrit Sra (MIT)

Suvrit Sra is a faculty member in the EECS department at MIT, where he is also a core faculty member of IDSS, LIDS, the MIT-ML Group, and the Statistics and Data Science Center. His research spans topics in optimization, matrix theory, differential geometry, and probability theory, which he connects with machine learning; a key focus of his research is the theme "Optimization for Machine Learning" (http://opt-ml.org).
