Timezone: »

Variance Reduction for Stochastic Gradient Optimization
Chong Wang · Xi Chen · Alexander Smola · Eric Xing

Fri Dec 06 07:00 PM -- 11:59 PM (PST) @ Harrah's Special Events Center, 2nd Floor

Stochastic gradient optimization is a class of widely used algorithms for training machine learning models. To optimize an objective, it uses the noisy gradient computed from the random data samples instead of the true gradient computed from the entire dataset. However, when the variance of the noisy gradient is large, the algorithm might spend much time bouncing around, leading to slower convergence and worse performance. In this paper, we develop a general approach of using control variate for variance reduction in stochastic gradient. Data statistics such as low-order moments (pre-computed or estimated online) is used to form the control variate. We demonstrate how to construct the control variate for two practical problems using stochastic gradient optimization. One is convex---the MAP estimation for logistic regression, and the other is non-convex---stochastic variational inference for latent Dirichlet allocation. On both problems, our approach shows faster convergence and better performance than the classical approach.

Author Information

Chong Wang (CMU)
Xi Chen (NYU)

Xi Chen is an associate professor with tenure at Stern School of Business at New York University, who is also an affiliated professor to Computer Science and Center for Data Science. Before that, he was a Postdoc in the group of Prof. Michael Jordan at UC Berkeley. He obtained his Ph.D. from the Machine Learning Department at Carnegie Mellon University (CMU). He studies high-dimensional statistical learning, online learning, large-scale stochastic optimization, and applications to operations. He has published more than 20 journal articles in statistics, machine learning, and operations, and 30 top machine learning peer-reviewed conference proceedings. He received NSF Career Award, ICSA Outstanding Young Researcher Award, Faculty Research Awards from Google, Adobe, Alibaba, and Bloomberg, and was featured in Forbes list of “30 Under30 in Science”.

Alexander Smola (Amazon)

**AWS Machine Learning**

Eric Xing (Petuum Inc. / Carnegie Mellon University)

More from the Same Authors