Poster
SSRGD: Simple Stochastic Recursive Gradient Descent for Escaping Saddle Points
Zhize Li
East Exhibition Hall B, C #125
Keywords: [ Optimization ] [ Stochastic Optimization ] [ Algorithms; Optimization -> Non-Convex Optimization; Optimization ]
[
Abstract
]
Abstract:
We analyze stochastic gradient algorithms for optimizing nonconvex problems.
In particular, our goal is to find local minima (second-order stationary points) instead of just finding first-order stationary points which may be some bad unstable saddle points.
We show that a simple perturbed version of stochastic recursive gradient descent algorithm (called SSRGD) can find an (ϵ,δ)-second-order stationary point with ˜O(√n/ϵ2+√n/δ4+n/δ3) stochastic gradient complexity for nonconvex finite-sum problems.
As a by-product, SSRGD finds an ϵ-first-order stationary point with O(n+√n/ϵ2) stochastic gradients. These results are almost optimal since Fang et al. [2018] provided a lower bound Ω(√n/ϵ2) for finding even just an ϵ-first-order stationary point.
We emphasize that SSRGD algorithm for finding second-order stationary points is as simple as for finding first-order stationary points just by adding a uniform perturbation sometimes, while all other algorithms for finding second-order stationary points with similar gradient complexity need to combine with a negative-curvature search subroutine (e.g., Neon2 [Allen-Zhu and Li, 2018]).
Moreover, the simple SSRGD algorithm gets a simpler analysis.
Besides, we also extend our results from nonconvex finite-sum problems to nonconvex online (expectation) problems, and prove the corresponding convergence results.
Live content is unavailable. Log in and register to view live content