## A Geometric Structure of Acceleration and Its Role in Making Gradients Small Fast

### Jongmin Lee · Chanwoo Park · Ernest Ryu

Keywords: [ Optimization ]

Wed 8 Dec 4:30 p.m. PST — 6 p.m. PST

Abstract: Since Nesterov's seminal 1983 work, many accelerated first-order optimization methods have been proposed, but their analyses lack a common unifying structure. In this work, we identify a geometric structure satisfied by a wide range of first-order accelerated methods. Using this geometric insight, we present several novel generalizations of accelerated methods. Most interesting among them is a method that reduces the squared gradient norm with $\mathcal{O}(1/K^4)$ rate in the prox-grad setup, faster than the $\mathcal{O}(1/K^3)$ rates of Nesterov's FGM or Kim and Fessler's FPGM-m.
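For orientation, the baseline that the abstract compares against can be sketched in a few lines. The block below is a minimal illustration of Nesterov's FGM in the smooth, unconstrained case, using the standard $t_k$ momentum schedule, and it records the squared gradient norm at each iterate. It is not the new $\mathcal{O}(1/K^4)$ method from the paper, and it does not cover the prox-grad setup. The test problem (a diagonal quadratic) and all names such as `fgm`, `diag`, and `sq_grad_norms` are assumptions chosen for illustration.

```python
import math

def fgm(grad, x0, L, K):
    """Nesterov's fast gradient method for an L-smooth convex function.

    Returns the final iterate and the squared gradient norm at each x_k.
    """
    x = list(x0)
    y = list(x0)
    t = 1.0
    sq_grad_norms = []
    for _ in range(K):
        g = grad(y)
        # Gradient step from the extrapolated point y_k with step size 1/L.
        x_next = [yi - gi / L for yi, gi in zip(y, g)]
        # Standard momentum schedule: t_{k+1} = (1 + sqrt(1 + 4 t_k^2)) / 2.
        t_next = (1.0 + math.sqrt(1.0 + 4.0 * t * t)) / 2.0
        beta = (t - 1.0) / t_next
        # Extrapolation: y_{k+1} = x_{k+1} + beta * (x_{k+1} - x_k).
        y = [xn + beta * (xn - xo) for xn, xo in zip(x_next, x)]
        x, t = x_next, t_next
        sq_grad_norms.append(sum(gi * gi for gi in grad(x)))
    return x, sq_grad_norms

# Hypothetical test problem: f(x) = 0.5 * sum_i d_i * x_i^2, minimized at x = 0.
diag = [1.0 / (i + 1) for i in range(20)]
L = max(diag)  # smoothness constant of f

def f(x):
    return 0.5 * sum(d * xi * xi for d, xi in zip(diag, x))

def grad_f(x):
    return [d * xi for d, xi in zip(diag, x)]

x0 = [1.0] * 20
K = 100
x_K, sq_grad_norms = fgm(grad_f, x0, L, K)
print("f(x_K) =", f(x_K))
print("||grad f(x_K)||^2 =", sq_grad_norms[-1])
```

Tracking `sq_grad_norms` rather than only `f(x_K)` mirrors the quantity whose rate the abstract discusses; on a single small quadratic the printed values illustrate the iteration and should not be read as evidence for any particular rate.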
