
Workshop: OPT 2023: Optimization for Machine Learning

Parameter-Agnostic Optimization under Relaxed Smoothness

Florian Hübler · Junchi YANG · Xiang Li · Niao He

Abstract: In training machine learning models, tuning hyperparameters such as the stepsize is both time-consuming and intricate. To address this challenge, many adaptive optimization algorithms have been developed that achieve near-optimal complexities even when stepsizes are independent of problem parameters, provided the function is $L$-smooth. However, when the assumption is relaxed to the more realistic $(L_0, L_1)$-smoothness, all current convergence results still require tuning the stepsize. In this study, we demonstrate that Normalized Stochastic Gradient Descent with Momentum can achieve a near-optimal complexity without prior knowledge of any problem parameter, though this introduces an exponential term dependent on $L_1$. We further establish that this term is unavoidable for such schemes. Interestingly, in deterministic settings, this exponential factor can be eliminated by using Gradient Descent with a Backtracking Line Search. To our knowledge, these are the first parameter-agnostic convergence results for this generalized smoothness setting.
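The two methods named in the abstract can be illustrated with a minimal Python sketch. The abstract gives no pseudocode, so everything below is an assumption made for illustration: the function names `nsgd_momentum` and `gd_backtracking`, the stepsize schedule $\eta_t = t^{-3/4}$, the momentum schedule $\beta_t = 1 - t^{-1/2}$, and the Armijo constants are hypothetical choices and may differ from the authors' exact algorithms. The point is only that neither routine takes $L_0$, $L_1$, or a noise level as input.

```python
import numpy as np


def nsgd_momentum(grad_fn, x0, num_steps):
    """Normalized SGD with momentum, using schedules that depend only on t.

    grad_fn(x) returns a (possibly stochastic) gradient estimate at x.
    The schedules below are illustrative assumptions, not taken from the abstract.
    """
    x = np.asarray(x0, dtype=float).copy()
    m = np.zeros_like(x)
    for t in range(1, num_steps + 1):
        g = grad_fn(x)
        beta = 1.0 - t ** -0.5   # assumed momentum schedule (beta = 0 at t = 1)
        eta = t ** -0.75         # assumed stepsize schedule
        m = beta * m + (1.0 - beta) * g
        norm = np.linalg.norm(m)
        if norm > 0.0:
            # Normalization: every update has length exactly eta,
            # regardless of how large the gradient is.
            x = x - eta * m / norm
    return x


def gd_backtracking(f, grad_f, x0, num_steps, eta0=1.0, shrink=0.5, c=0.5):
    """Deterministic gradient descent with an Armijo backtracking line search.

    eta0, shrink and c are generic defaults (assumptions), not problem parameters.
    """
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(num_steps):
        g = grad_f(x)
        g_sq = float(np.dot(g, g))
        if g_sq == 0.0:
            break  # stationary point reached
        eta = eta0
        fx = f(x)
        # Shrink the trial stepsize until the sufficient-decrease test passes.
        while f(x - eta * g) > fx - c * eta * g_sq:
            eta *= shrink
        x = x - eta * g
    return x
```

As a usage illustration, $f(x) = \sum_i x_i^4$ is a simple function whose Hessian is unbounded (so it is not $L$-smooth globally) yet grows no faster than the gradient norm, which is the kind of behavior $(L_0, L_1)$-smoothness is meant to capture. Calling `gd_backtracking(lambda x: np.sum(x**4), lambda x: 4 * x**3, [3.0, -2.0], 200)` runs the line-search variant on it without supplying any smoothness constant.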
