In this talk, we review some of the recent advances in first-order methods for convex and non-convex optimization as well as their universality properties. We say an algorithm is universal if it does not require to know whether the optimization objective is smooth or not.
We first recall the AdaGrad method and show that AdaGrad is a universal algorithm without any modifications: It implicitly exploits the smoothness of the problem to achieve the standard O(1/k) rate in the presence of smooth objectives, where k is the iteration count.
To this end, we introduce an accelerated, universal variant of AdaGrad, dubbed as AcceleGrad, that in addition obtains the optimal convergence rate of O(1/k^2) in the smooth setting with deterministic oracles. We then introduce UniXGrad, which is the first algorithm that simultaneously achieves optimal rates for smooth or non-smooth problems with either deterministic or stochastic first-order oracles in the constrained convex setting.
We conclude the presentation with results in non-convex optimization revolving around ADAM-type algorithms, including new convergence rates.