On Variance Reduction in Stochastic Gradient Descent and its Asynchronous Variants
Sashank J. Reddi · Ahmed Hefny · Suvrit Sra · Barnabas Poczos · Alexander Smola

Tue Dec 08 04:00 PM -- 08:59 PM (PST) @ 210 C #77

We study optimization algorithms based on variance reduction for stochastic gradient descent (SGD). Remarkable recent progress has been made in this direction through development of algorithms like SAG, SVRG, SAGA. These algorithms have been shown to outperform SGD, both theoretically and empirically. However, asynchronous versions of these algorithms, a crucial requirement for modern large-scale applications, have not been studied. We bridge this gap by presenting a unifying framework that captures many variance reduction techniques. Subsequently, we propose an asynchronous algorithm grounded in our framework, with fast convergence rates. An important consequence of our general approach is that it yields asynchronous versions of variance reduction algorithms such as SVRG, SAGA as a byproduct. Our method achieves near linear speedup in sparse settings common to machine learning. We demonstrate the empirical performance of our method through a concrete realization of asynchronous SVRG.
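For readers unfamiliar with the variance-reduced update that the abstract refers to, the following is a minimal sketch of plain serial SVRG. It is not the asynchronous algorithm or the framework proposed in the paper. The least-squares objective, the synthetic data, the step size, the epoch length, and all function names (`make_data`, `grad_i`, `full_grad`, `svrg`, `loss`) are assumptions chosen only for illustration.

```python
import random


def make_data(n=50, seed=1):
    """Hypothetical noise-free least-squares problem: y = w_true . x."""
    rng = random.Random(seed)
    w_true = [1.0, -2.0, 0.5]
    X = [[rng.uniform(-1.0, 1.0) for _ in w_true] for _ in range(n)]
    Y = [sum(w * x for w, x in zip(w_true, row)) for row in X]
    return X, Y, w_true


def grad_i(w, x, y):
    """Gradient of 0.5 * (w . x - y)^2 with respect to w for one example."""
    residual = sum(wj * xj for wj, xj in zip(w, x)) - y
    return [residual * xj for xj in x]


def full_grad(w, X, Y):
    """Average of the per-example gradients over the whole data set."""
    n = len(X)
    g = [0.0] * len(w)
    for x, y in zip(X, Y):
        g = [a + b / n for a, b in zip(g, grad_i(w, x, y))]
    return g


def loss(w, X, Y):
    """Mean of 0.5 * squared residual over the data set."""
    total = 0.0
    for x, y in zip(X, Y):
        total += 0.5 * (sum(wj * xj for wj, xj in zip(w, x)) - y) ** 2
    return total / len(X)


def svrg(X, Y, step=0.05, epochs=40, inner=500, seed=0):
    """Serial SVRG: each epoch takes a snapshot and its full gradient,
    then runs `inner` stochastic steps that use the corrected direction
    grad_i(w) - grad_i(snapshot) + full_grad(snapshot)."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(epochs):
        snapshot = list(w)
        mu = full_grad(snapshot, X, Y)
        for _ in range(inner):
            i = rng.randrange(n)
            g_w = grad_i(w, X[i], Y[i])
            g_s = grad_i(snapshot, X[i], Y[i])
            w = [wj - step * (a - b + m)
                 for wj, a, b, m in zip(w, g_w, g_s, mu)]
    return w


if __name__ == "__main__":
    X, Y, w_true = make_data()
    w = svrg(X, Y)
    print("estimated weights:", w)
    print("final loss:", loss(w, X, Y))
```

The corrected direction has the same expectation as the full gradient, and its variance shrinks as both the iterate and the snapshot approach the optimum. That is what allows a constant step size, in contrast to the decaying step sizes plain SGD typically needs.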

Author Information

Sashank J. Reddi (Carnegie Mellon University)
Ahmed Hefny (Carnegie Mellon University)
Suvrit Sra (MIT)

Suvrit Sra is a Research Faculty member at the Laboratory for Information and Decision Systems (LIDS) at the Massachusetts Institute of Technology (MIT). He obtained his PhD in Computer Science from the University of Texas at Austin in 2007. Before moving to MIT, he was a Senior Research Scientist at the Max Planck Institute for Intelligent Systems in Tübingen, Germany. He also held visiting faculty positions at UC Berkeley (EECS) and Carnegie Mellon University (Machine Learning Department) during 2013-2014. His research is dedicated to bridging a number of mathematical areas, such as metric geometry, matrix analysis, convex analysis, probability theory, and optimization, with machine learning; more broadly, his work involves algorithmically grounded topics within engineering and science. He has been a co-chair of the OPT2008-2015 NIPS workshops on "Optimization for Machine Learning," and has also edited a volume of the same name (MIT Press, 2011).

Barnabas Poczos (Carnegie Mellon University)
Alexander Smola (Carnegie Mellon University)

**AWS Machine Learning**

More from the Same Authors