Timezone: »
Computing the Jacobian of the solution of an optimization problem is a central problem in machine learning, with applications in hyperparameter optimization, meta-learning, optimization as a layer, and dataset distillation, to name a few. Unrolled differentiation is a popular heuristic that approximates the solution using an iterative solver and differentiates it through the computational path. This work provides a non-asymptotic convergence-rate analysis of this approach on quadratic objectives for gradient descent and the Chebyshev method. We show that to ensure convergence of the Jacobian, we can either 1) choose a large learning rate leading to a fast asymptotic convergence but accept that the algorithm may have an arbitrarily long burn-in phase or 2) choose a smaller learning rate leading to an immediate but slower convergence. We refer to this phenomenon as the curse of unrolling.Finally, we discuss open problems relative to this approach, such as deriving a practical update rule for the optimal unrolling strategy and making novel connections with the field of Sobolev orthogonal polynomials.
Author Information
Damien Scieur (Samsung SAIL Montreal)
Gauthier Gidel (Mila)
Quentin Bertrand (Mila)
Fabian Pedregosa (Google AI)
More from the Same Authors
-
2021 Spotlight: A single gradient step finds adversarial examples on random two-layers neural networks »
Sebastien Bubeck · Yeshwanth Cherapanamjeri · Gauthier Gidel · Remi Tachet des Combes -
2022 : Nesterov Meets Optimism: Rate-Optimal Optimistic-Gradient-Based Method for Stochastic Bilinearly-Coupled Minimax Optimization »
Chris Junchi Li · Angela Yuan · Gauthier Gidel · Michael Jordan -
2022 : Momentum Extragradient is Optimal for Games with Cross-Shaped Spectrum »
Junhyung Lyle Kim · Gauthier Gidel · Anastasios Kyrillidis · Fabian Pedregosa -
2022 : A Novel Stochastic Gradient Descent Algorithm for LearningPrincipal Subspaces »
Charline Le Lan · Joshua Greaves · Jesse Farebrother · Mark Rowland · Fabian Pedregosa · Rishabh Agarwal · Marc Bellemare -
2022 : A Second-order Regression Model Shows Edge of Stability Behavior »
Fabian Pedregosa · Atish Agarwala · Jeffrey Pennington -
2022 : Performative Prediction with Neural Networks »
Mehrnaz Mofakhami · Ioannis Mitliagkas · Gauthier Gidel -
2022 Poster: Efficient and Modular Implicit Differentiation »
Mathieu Blondel · Quentin Berthet · Marco Cuturi · Roy Frostig · Stephan Hoyer · Felipe Llinares-Lopez · Fabian Pedregosa · Jean-Philippe Vert -
2022 Poster: Clipped Stochastic Methods for Variational Inequalities with Heavy-Tailed Noise »
Eduard Gorbunov · Marina Danilova · David Dobre · Pavel Dvurechenskii · Alexander Gasnikov · Gauthier Gidel -
2022 Poster: Beyond L1: Faster and Better Sparse Models with skglm »
Quentin Bertrand · Quentin Klopfenstein · Pierre-Antoine Bannier · Gauthier Gidel · Mathurin Massias -
2022 Poster: Last-Iterate Convergence of Optimistic Gradient Method for Monotone Variational Inequalities »
Eduard Gorbunov · Adrien Taylor · Gauthier Gidel -
2021 Poster: Stochastic Gradient Descent-Ascent and Consensus Optimization for Smooth Games: Convergence Analysis under Expected Co-coercivity »
Nicolas Loizou · Hugo Berard · Gauthier Gidel · Ioannis Mitliagkas · Simon Lacoste-Julien -
2021 Poster: A single gradient step finds adversarial examples on random two-layers neural networks »
Sebastien Bubeck · Yeshwanth Cherapanamjeri · Gauthier Gidel · Remi Tachet des Combes -
2019 : Spotlight talks »
Damien Scieur · Konstantin Mishchenko · Rohan Anil -
2019 : Poster Session »
Eduard Gorbunov · Alexandre d'Aspremont · Lingxiao Wang · Liwei Wang · Boris Ginsburg · Alessio Quaglino · Camille Castera · Saurabh Adya · Diego Granziol · Rudrajit Das · Raghu Bollapragada · Fabian Pedregosa · Martin Takac · Majid Jahani · Sai Praneeth Karimireddy · Hilal Asi · Balint Daroczy · Leonard Adolphs · Aditya Rawal · Nicolas Brandt · Minhan Li · Giuseppe Ughi · Orlando Romero · Ivan Skorokhodov · Damien Scieur · Kiwook Bae · Konstantin Mishchenko · Rohan Anil · Vatsal Sharan · Aditya Balu · Chao Chen · Zhewei Yao · Tolga Ergen · Paul Grigas · Chris Junchi Li · Jimmy Ba · Stephen J Roberts · Sharan Vaswani · Armin Eftekhari · Chhavi Sharma -
2017 Poster: Nonlinear Acceleration of Stochastic Algorithms »
Damien Scieur · Francis Bach · Alexandre d'Aspremont -
2017 Poster: Integration Methods and Optimization Algorithms »
Damien Scieur · Vincent Roulet · Francis Bach · Alexandre d'Aspremont -
2016 Poster: Regularized Nonlinear Acceleration »
Damien Scieur · Alexandre d'Aspremont · Francis Bach -
2016 Oral: Regularized Nonlinear Acceleration »
Damien Scieur · Alexandre d'Aspremont · Francis Bach