In this work, we propose an adaptive variation on the classical heavy-ball method for convex quadratic minimization. The adaptivity crucially relies on so-called "Polyak step-sizes", which consist in using the knowledge of the optimal value of the optimization problem at hand instead of problem parameters such as a few eigenvalues of the Hessian. This method turns out to be equivalent to a variation of the classical conjugate gradient method, and thereby inherits many of its attractive features, including finite-time convergence, instance optimality, and worst-case convergence rates. The classical gradient method with Polyak step-sizes is known to behave very well in situations where it can be used, and whether momentum can be incorporated into it, and whether doing so improves the method, appeared to be an open question. We provide a definitive answer to this question for the minimization of convex quadratic functions, an arguably necessary first step toward developing such methods in more general setups.
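For reference, below is a minimal sketch of the classical gradient method with Polyak step-sizes mentioned in the abstract, applied to a convex quadratic f(x) = 0.5 x^T A x - b^T x with known optimal value f*. This is not the paper's adaptive heavy-ball method (which additionally carries a momentum term with adaptive coefficients); the function and parameter names here are illustrative.

```python
# Minimal sketch (assumption: illustrative names, not the paper's method) of the
# classical gradient method with Polyak step-sizes on a convex quadratic
# f(x) = 0.5 * x^T A x - b^T x, where the optimal value f_star is known.
import numpy as np

def polyak_gradient_descent(A, b, f_star, x0, num_iters=100, tol=1e-12):
    """Gradient descent with Polyak step-sizes:
    gamma_t = (f(x_t) - f_star) / ||grad f(x_t)||^2."""
    f = lambda x: 0.5 * x @ A @ x - b @ x
    x = x0.copy()
    for _ in range(num_iters):
        grad = A @ x - b           # gradient of the quadratic
        grad_sq = grad @ grad
        if grad_sq < tol:          # (near-)stationary point reached
            break
        gamma = (f(x) - f_star) / grad_sq  # Polyak step-size
        x = x - gamma * grad
    return x

# Usage on a random positive-definite quadratic.
rng = np.random.default_rng(0)
M = rng.standard_normal((20, 20))
A = M @ M.T + np.eye(20)           # symmetric positive definite
b = rng.standard_normal(20)
x_star = np.linalg.solve(A, b)     # minimizer of the quadratic
f_star = 0.5 * x_star @ A @ x_star - b @ x_star
x = polyak_gradient_descent(A, b, f_star, np.zeros(20))
print(np.linalg.norm(x - x_star))  # distance to the minimizer
```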
Author Information
Baptiste Goujaud (Ecole Polytechnique)
Adrien Taylor (Inria)
Aymeric Dieuleveut (Ecole Polytechnique, IP Paris)
More from the Same Authors
- 2022 Poster: Gradient Descent Is Optimal Under Lower Restricted Secant Inequality And Upper Error Bound
  Charles Guille-Escuret · Adam Ibrahim · Baptiste Goujaud · Ioannis Mitliagkas
- 2022 Poster: FLamby: Datasets and Benchmarks for Cross-Silo Federated Learning in Realistic Healthcare Settings
  Jean Ogier du Terrail · Samy-Safwan Ayed · Edwige Cyffers · Felix Grimberg · Chaoyang He · Regis Loeb · Paul Mangold · Tanguy Marchand · Othmane Marfoq · Erum Mushtaq · Boris Muzellec · Constantin Philippenko · Santiago Silva · Maria Teleńczuk · Shadi Albarqouni · Salman Avestimehr · Aurélien Bellet · Aymeric Dieuleveut · Martin Jaggi · Sai Praneeth Karimireddy · Marco Lorenzi · Giovanni Neglia · Marc Tommasi · Mathieu Andreux
- 2022 Poster: Fast Stochastic Composite Minimization and an Accelerated Frank-Wolfe Algorithm under Parallelization
  Benjamin Dubois-Taine · Francis Bach · Quentin Berthet · Adrien Taylor
- 2022 Poster: Last-Iterate Convergence of Optimistic Gradient Method for Monotone Variational Inequalities
  Eduard Gorbunov · Adrien Taylor · Gauthier Gidel
- 2021 Oral: Continuized Accelerations of Deterministic and Stochastic Gradient Descents, and of Gossip Algorithms
  Mathieu Even · Raphaël Berthier · Francis Bach · Nicolas Flammarion · Hadrien Hendrikx · Pierre Gaillard · Laurent Massoulié · Adrien Taylor
- 2021 Poster: Continuized Accelerations of Deterministic and Stochastic Gradient Descents, and of Gossip Algorithms
  Mathieu Even · Raphaël Berthier · Francis Bach · Nicolas Flammarion · Hadrien Hendrikx · Pierre Gaillard · Laurent Massoulié · Adrien Taylor
- 2019 Poster: Gradient based sample selection for online continual learning
  Rahaf Aljundi · Min Lin · Baptiste Goujaud · Yoshua Bengio