Stochastic gradient methods are the workhorse algorithms of large-scale optimization problems in machine learning, signal processing, and other computational sciences and engineering. This paper studies Markov chain gradient descent, a variant of stochastic gradient descent in which the random samples are taken along the trajectory of a Markov chain. Existing results on this method assume convex objectives and a reversible Markov chain and are therefore limited. We establish new non-ergodic convergence under wider step sizes, for nonconvex problems, and for non-reversible finite-state Markov chains. Nonconvexity makes our method applicable to broader problem classes, while non-reversible finite-state Markov chains can mix substantially faster. To obtain these results, we introduce a new technique that varies the mixing levels of the Markov chains. The reported numerical results validate our contributions.
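To make the sampling scheme concrete, here is a minimal Python sketch of Markov chain gradient descent on a toy least-squares problem: the sample index is not drawn i.i.d. as in vanilla SGD but follows the trajectory of a finite-state Markov chain. The data, transition matrix, and step-size schedule are illustrative assumptions, not the paper's experiments or analyzed schedule.

```python
import numpy as np

# Sketch of Markov chain gradient descent (MCGD) for a finite-sum objective
# f(x) = (1/n) * sum_i f_i(x), where the sample index j_k follows a
# finite-state Markov chain with row-stochastic transition matrix P.
# All quantities below are hypothetical, for illustration only.

rng = np.random.default_rng(0)
n, d = 5, 3                         # number of components, problem dimension
A = rng.standard_normal((n, d))     # toy least-squares data
b = rng.standard_normal(n)

def grad_i(x, i):
    # gradient of the i-th component f_i(x) = 0.5 * (a_i^T x - b_i)^2
    return A[i] * (A[i] @ x - b[i])

# A (generally non-reversible) transition matrix over the n states,
# generated at random here purely for illustration.
P = rng.random((n, n))
P /= P.sum(axis=1, keepdims=True)

x = np.zeros(d)
j = 0                               # initial state of the chain
for k in range(1, 10001):
    step = 1.0 / k ** 0.75          # example diminishing step size
    x -= step * grad_i(x, j)        # gradient step at the current Markov state
    j = rng.choice(n, p=P[j])       # advance the chain to the next state

print("final iterate:", x)
```

The only change relative to standard SGD is the last line of the loop: successive sample indices are correlated through the chain, which is the setting the paper's convergence analysis addresses.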
Author Information
Tao Sun (National University of Defense Technology)
College of Science, National University of Defense Technology, PRC.
Yuejiao Sun (University of California, Los Angeles)
Wotao Yin (University of California, Los Angeles)
More from the Same Authors
- 2021 Spotlight: Closing the Gap: Tighter Analysis of Alternating Stochastic Gradient Methods for Bilevel Problems (Tianyi Chen · Yuejiao Sun · Wotao Yin)
- 2022 Poster: Finite-Time Analysis of Adaptive Temporal Difference Learning with Deep Neural Networks (Tao Sun · Dongsheng Li · Bao Wang)
- 2021 Poster: Closing the Gap: Tighter Analysis of Alternating Stochastic Gradient Methods for Bilevel Problems (Tianyi Chen · Yuejiao Sun · Wotao Yin)
- 2019 Poster: General Proximal Incremental Aggregated Gradient Algorithms: Better and Novel Results under General Scheme (Tao Sun · Yuejiao Sun · Dongsheng Li · Qing Liao)
- 2018 Poster: LAG: Lazily Aggregated Gradient for Communication-Efficient Distributed Learning (Tianyi Chen · Georgios Giannakis · Tao Sun · Wotao Yin)
- 2018 Spotlight: LAG: Lazily Aggregated Gradient for Communication-Efficient Distributed Learning (Tianyi Chen · Georgios Giannakis · Tao Sun · Wotao Yin)
- 2018 Poster: Breaking the Span Assumption Yields Fast Finite-Sum Minimization (Robert Hannah · Yanli Liu · Daniel O'Connor · Wotao Yin)
- 2018 Poster: Theoretical Linear Convergence of Unfolded ISTA and Its Practical Weights and Thresholds (Xiaohan Chen · Jialin Liu · Zhangyang Wang · Wotao Yin)
- 2018 Spotlight: Theoretical Linear Convergence of Unfolded ISTA and Its Practical Weights and Thresholds (Xiaohan Chen · Jialin Liu · Zhangyang Wang · Wotao Yin)
- 2017 Poster: Straggler Mitigation in Distributed Optimization Through Data Encoding (Can Karakus · Yifan Sun · Suhas Diggavi · Wotao Yin)
- 2017 Poster: Asynchronous Coordinate Descent under More Realistic Assumptions (Tao Sun · Robert Hannah · Wotao Yin)
- 2017 Spotlight: Straggler Mitigation in Distributed Optimization Through Data Encoding (Can Karakus · Yifan Sun · Suhas Diggavi · Wotao Yin)