Poster
A framework for bilevel optimization that enables stochastic and global variance reduction algorithms
Mathieu Dagréou · Pierre Ablin · Samuel Vaiter · Thomas Moreau
Bilevel optimization, the problem of minimizing a value function which involves the arg-minimum of another function, appears in many areas of machine learning. In a large-scale empirical risk minimization setting where the number of samples is huge, it is crucial to develop stochastic methods, which only use a few samples at a time to progress. However, computing the gradient of the value function involves solving a linear system, which makes it difficult to derive unbiased stochastic estimates. To overcome this problem we introduce a novel framework, in which the solution of the inner problem, the solution of the linear system, and the main variable evolve at the same time. These directions are written as a sum, making it straightforward to derive unbiased estimates. The simplicity of our approach allows us to develop global variance reduction algorithms, where the dynamics of all variables are subject to variance reduction. We demonstrate that SABA, an adaptation of the celebrated SAGA algorithm to our framework, has an $O(\frac{1}{T})$ convergence rate, and that it achieves linear convergence under a Polyak-Łojasiewicz assumption. This is the first stochastic algorithm for bilevel optimization that verifies either of these properties. Numerical experiments validate the usefulness of our method.
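The core idea of the abstract — letting the inner solution, the linear-system solution, and the outer variable all take one step at a time instead of solving the subproblems to completion — can be sketched on a toy problem. The quadratic instance, variable names, and step sizes below are my own illustrative choices, not the paper's; the directions here are deterministic (full-gradient) versions of what the framework replaces with unbiased stochastic estimates.

```python
import numpy as np

# Toy quadratic bilevel problem (hypothetical, for illustration):
#   inner:  g(z, x) = 0.5 * ||z - x||^2   ->  z*(x) = x
#   outer:  f(z, x) = 0.5 * ||z - a||^2   ->  h(x) = f(z*(x), x) = 0.5 * ||x - a||^2
# The hypergradient is  grad h(x) = grad_x f - (grad2_zx g)^T v,  where v solves
# the linear system  grad2_zz g @ v = grad_z f.  Rather than solving the inner
# problem and the linear system exactly, z, v, and x evolve at the same time.

rng = np.random.default_rng(0)
d = 5
a = rng.normal(size=d)      # outer target; h is minimized at x = a
x, z, v = np.zeros(d), np.zeros(d), np.zeros(d)
rho, gamma = 0.5, 0.1       # inner and outer step sizes

for _ in range(300):
    D_z = z - x             # grad_z g(z, x): one inner gradient step
    D_v = v - (z - a)       # grad2_zz g @ v - grad_z f  (here grad2_zz g = I)
    D_x = v                 # grad_x f - (grad2_zx g)^T v = 0 - (-I) v
    # all three variables move simultaneously
    z, v, x = z - rho * D_z, v - rho * D_v, x - gamma * D_x

print(np.linalg.norm(x - a))  # close to 0: x reaches the minimizer of h
```

Because each direction is a plain (sum-structured) gradient expression rather than the solution of a nested problem, sampling a few terms of the sum yields an unbiased estimate — which is what makes SAGA-style variance reduction applicable to all three variables at once.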
Author Information
Mathieu Dagréou (Inria Saclay)
Pierre Ablin (Apple)
Samuel Vaiter (CNRS)
Thomas Moreau (Inria)
More from the Same Authors
- 2023: Bilevel Optimization to Learn Training Distributions for Language Modeling under Domain Shift »
  David Grangier · Pierre Ablin · Awni Hannun
- 2023 Poster: How to Scale Your EMA »
  Dan Busbridge · Jason Ramapuram · Pierre Ablin · Tatiana Likhomanenko · Eeshan Gunesh Dhekane · Xavier Suau Cuadros · Russell Webb
- 2022 Panel: Panel 3A-4: Reproducibility in Optimization:… & A framework for… »
  Kwangjun Ahn · Mathieu Dagréou
- 2022 Poster: Benchopt: Reproducible, efficient and collaborative optimization benchmarks »
  Thomas Moreau · Mathurin Massias · Alexandre Gramfort · Pierre Ablin · Pierre-Antoine Bannier · Benjamin Charlier · Mathieu Dagréou · Tom Dupre la Tour · Ghislain DURIF · Cassio F. Dantas · Quentin Klopfenstein · Johan Larsson · En Lai · Tanguy Lefort · Benoît Malézieux · Badr MOUFAD · Binh T. Nguyen · Alain Rakotomamonjy · Zaccharie Ramzi · Joseph Salmon · Samuel Vaiter
- 2022 Poster: Deep invariant networks with differentiable augmentation layers »
  Cédric ROMMEL · Thomas Moreau · Alexandre Gramfort
- 2022 Poster: Do Residual Neural Networks discretize Neural Ordinary Differential Equations? »
  Michael Sander · Pierre Ablin · Gabriel Peyré
- 2022 Poster: Automatic differentiation of nonsmooth iterative algorithms »
  Jerome Bolte · Edouard Pauwels · Samuel Vaiter
- 2021 Poster: Shared Independent Component Analysis for Multi-Subject Neuroimaging »
  Hugo Richard · Pierre Ablin · Bertrand Thirion · Alexandre Gramfort · Aapo Hyvarinen
- 2021 Poster: On the Universality of Graph Neural Networks on Large Random Graphs »
  Nicolas Keriven · Alberto Bietti · Samuel Vaiter
- 2020 Poster: Learning to solve TV regularised problems with unrolled algorithms »
  Hamza Cherkaoui · Jeremias Sulam · Thomas Moreau
- 2020 Poster: Modeling Shared responses in Neuroimaging Studies through MultiView ICA »
  Hugo Richard · Luigi Gresele · Aapo Hyvarinen · Bertrand Thirion · Alexandre Gramfort · Pierre Ablin
- 2020 Poster: Convergence and Stability of Graph Convolutional Networks on Large Random Graphs »
  Nicolas Keriven · Alberto Bietti · Samuel Vaiter
- 2020 Spotlight: Convergence and Stability of Graph Convolutional Networks on Large Random Graphs »
  Nicolas Keriven · Alberto Bietti · Samuel Vaiter
- 2020 Spotlight: Modeling Shared responses in Neuroimaging Studies through MultiView ICA »
  Hugo Richard · Luigi Gresele · Aapo Hyvarinen · Bertrand Thirion · Alexandre Gramfort · Pierre Ablin
- 2020 Poster: NeuMiss networks: differentiable programming for supervised learning with missing values »
  Marine Le Morvan · Julie Josse · Thomas Moreau · Erwan Scornet · Gael Varoquaux
- 2020 Oral: NeuMiss networks: differentiable programming for supervised learning with missing values »
  Marine Le Morvan · Julie Josse · Thomas Moreau · Erwan Scornet · Gael Varoquaux
- 2019 Poster: Learning step sizes for unfolded sparse coding »
  Pierre Ablin · Thomas Moreau · Mathurin Massias · Alexandre Gramfort
- 2019 Poster: Manifold-regression to predict from MEG/EEG brain signals without source modeling »
  David Sabbagh · Pierre Ablin · Gael Varoquaux · Alexandre Gramfort · Denis A. Engemann
- 2018 Poster: Multivariate Convolutional Sparse Coding for Electromagnetic Brain Signals »
  Tom Dupré la Tour · Thomas Moreau · Mainak Jas · Alexandre Gramfort