

Poster in Workshop: OPT 2023: Optimization for Machine Learning

Parameter-Agnostic Optimization under Relaxed Smoothness

Florian Hübler · Junchi Yang · Xiang Li · Niao He


Abstract: In training machine learning models, tuning hyperparameters such as the stepsize is both time-consuming and intricate. To address this challenge, many adaptive optimization algorithms have been developed that achieve near-optimal complexity even when the stepsize is chosen independently of problem parameters, provided the objective is L-smooth. However, once this assumption is relaxed to the more realistic (L0,L1)-smoothness, all existing convergence results still require tuning the stepsize. In this work, we show that Normalized Stochastic Gradient Descent with Momentum achieves near-optimal complexity without prior knowledge of any problem parameter, at the cost of an exponential term depending on L1. We further show that this exponential term is unavoidable for such schemes. Interestingly, in the deterministic setting, the exponential factor can be eliminated by Gradient Descent with a Backtracking Line Search. To our knowledge, these are the first parameter-agnostic convergence results under this generalized smoothness condition.
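To make the method concrete, below is a minimal sketch of Normalized SGD with Momentum in the parameter-agnostic regime: the stepsize is set from the horizon T alone (here gamma = 1/sqrt(T), a common parameter-free choice assumed for illustration), with no knowledge of L0, L1, or the noise level. The gradient oracle, initial point, and momentum value are illustrative placeholders, not the paper's exact configuration.

```python
import math

def nsgd_momentum(grad_fn, x0, T, beta=0.9):
    """Normalized SGD with Momentum (sketch).

    grad_fn: (stochastic) gradient oracle, maps a point to a gradient.
    The stepsize gamma = 1/sqrt(T) uses no problem parameters
    (no L0, L1, or noise level) -- the parameter-agnostic setting.
    """
    x = list(x0)
    m = [0.0] * len(x)
    gamma = 1.0 / math.sqrt(T)  # problem-parameter-free stepsize
    for _ in range(T):
        g = grad_fn(x)
        # exponential moving average of gradients (momentum)
        m = [beta * mi + (1 - beta) * gi for mi, gi in zip(m, g)]
        # step along the *normalized* momentum direction
        norm = math.sqrt(sum(mi * mi for mi in m)) + 1e-12
        x = [xi - gamma * mi / norm for xi, mi in zip(x, m)]
    return x

# usage: minimize the quadratic f(x) = 0.5 * ||x||^2, whose gradient is x
x_star = nsgd_momentum(lambda x: list(x), [3.0, -2.0], T=2000)
```

Normalizing the momentum bounds the per-step movement by gamma regardless of the gradient magnitude, which is what lets the method cope with (L0,L1)-smooth objectives whose curvature grows with the gradient norm.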
