
Workshop: OPT 2023: Optimization for Machine Learning

MSL: An Adaptive Momentum-based Stochastic Line-search Framework

Chen Fan · Sharan Vaswani · Christos Thrampoulidis · Mark Schmidt


Various adaptive step sizes have been proposed recently to reduce the amount of tedious manual tuning. A popular example is back-tracking line-search based on a stochastic Armijo condition. But the success of this strategy relies crucially on the search direction being a descent direction. Importantly, this condition is violated by both SGD with momentum (SGDM) and Adam, which are common choices in deep-net training. Adaptively choosing the step size in this setting is thus non-trivial and less explored despite its practical relevance. In this work, we propose two frameworks, namely, momentum correction and restart, that allow the use of stochastic line-search in conjunction with a generalized Armijo condition, and apply them to both SGDM and Adam. We empirically verify that the proposed algorithms are robust to the choice of the momentum parameter and other hyperparameters.
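To make the starting point concrete, below is a minimal one-dimensional sketch of the back-tracking line-search on an Armijo condition that the abstract refers to — not the paper's MSL method. The function names, the shrink factor beta, and the sufficient-decrease constant c are illustrative choices, and in the stochastic setting f and the gradient g would be evaluated on the same minibatch.

```python
def armijo_backtracking(f, x, g, eta_max=1.0, c=0.1, beta=0.8, max_tries=50):
    """Shrink the step size eta until the Armijo condition
        f(x - eta * g) <= f(x) - c * eta * g**2
    holds, where g is the (stochastic) gradient at x. This sufficient-decrease
    test only makes sense when -g is a descent direction, which is exactly
    the assumption that momentum-based updates (SGDM, Adam) can violate.
    """
    fx = f(x)
    eta = eta_max
    for _ in range(max_tries):
        if f(x - eta * g) <= fx - c * eta * g * g:
            return eta  # sufficient decrease achieved
        eta *= beta  # back-track: shrink the candidate step
    return eta  # fall back to the smallest step tried

# Toy 1-D quadratic f(x) = x^2, gradient g = 2x, evaluated at x = 3.
f = lambda x: x * x
x, g = 3.0, 6.0
eta = armijo_backtracking(f, x, g)  # eta = 1.0 fails the test, 0.8 passes
```

On this toy problem the full step eta = 1.0 overshoots (f(3 - 6) = 9 gives no decrease), so one back-tracking step to eta = 0.8 is accepted.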
