Poster
On the Almost Sure Convergence of Stochastic Gradient Descent in Non-Convex Problems
Panayotis Mertikopoulos · Nadav Hallak · Ali Kavis · Volkan Cevher
In this paper, we analyze the trajectories of stochastic gradient descent (SGD) with the aim of understanding their convergence properties in non-convex problems. We first show that the sequence of iterates generated by SGD remains bounded and converges with probability $1$ under a very broad range of step-size schedules. Subsequently, we prove that the algorithm's rate of convergence to local minimizers with a positive-definite Hessian is $O(1/n^p)$ if the method is run with a $\Theta(1/n^p)$ step-size. This provides an important guideline for tuning the algorithm's step-size as it suggests that a cool-down phase with a vanishing step-size could lead to significant performance gains; we demonstrate this heuristic using ResNet architectures on CIFAR. Finally, going beyond existing positive probability guarantees, we show that SGD avoids strict saddle points/manifolds with probability $1$ for the entire spectrum of step-size policies considered.
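As a rough illustration of the step-size policy mentioned in the abstract, the sketch below runs SGD with a vanishing step-size $\gamma/n^p$ on a one-dimensional toy problem. Everything in it is an assumption made for illustration and is not taken from the paper: the double-well objective $f(x) = x^4/4 - x^2/2$, the Gaussian gradient noise, and the names `grad`, `sgd`, `gamma`, `p`, `noise_std`, and `seed`. The objective is non-convex, with two local minimizers at $x = \pm 1$ (where the second derivative equals $2 > 0$) and a strict local maximum at $x = 0$.

```python
import random


def grad(x):
    # Gradient of the toy double-well objective f(x) = x**4 / 4 - x**2 / 2
    # (hypothetical example, not the paper's experimental setup).
    return x**3 - x


def sgd(x0, n_steps, gamma=0.5, p=1.0, noise_std=0.1, seed=0):
    """Run SGD with step-size gamma / n**p and additive Gaussian gradient noise."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    x = x0
    for n in range(1, n_steps + 1):
        step = gamma / n**p  # vanishing step-size of order 1 / n**p
        noisy_grad = grad(x) + rng.gauss(0.0, noise_std)
        x -= step * noisy_grad
    return x


x_final = sgd(x0=0.5, n_steps=20000)
# Distance to the nearest local minimizer of the toy objective.
dist = min(abs(x_final - 1.0), abs(x_final + 1.0))
print(f"final iterate: {x_final:.4f}, distance to nearest minimizer: {dist:.2e}")
```

Varying `p` in this sketch is one simple way to see how the decay of the step-size affects how closely the last iterate settles around a minimizer; it says nothing about the deep-learning experiments reported by the authors.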
Author Information
Panayotis Mertikopoulos (CNRS (French National Center for Scientific Research) and Criteo AI Lab)
Nadav Hallak (EPFL)
Ali Kavis (EPFL)
Volkan Cevher (EPFL)
More from the Same Authors
-
2023 Poster: Error Bounds for Score Matching Causal Discovery »
Zhenyu Zhu · Francesco Locatello · Volkan Cevher -
2023 Poster: Exponential Lower Bounds for Fictitious Play in Potential Games »
Ioannis Panageas · Nikolas Patris · Stratis Skoulakis · Volkan Cevher -
2023 Poster: Riemannian stochastic optimization methods avoid strict saddle points »
Ya-Ping Hsieh · Mohammad Reza Karimi Jaghargh · Andreas Krause · Panayotis Mertikopoulos -
2023 Poster: Strategic Stability under Regularized Learning in Games »
Victor Boone · Panayotis Mertikopoulos -
2023 Poster: Maximum independent set: Self-training through dynamic programming »
Lorenzo Brusca · Lars C.P.M. Quaedvlieg · Stratis Skoulakis · Grigorios Chrysos · Volkan Cevher -
2023 Poster: Stable Nonconvex-Nonconcave Training via Linear Interpolation »
Thomas Pethick · WANYUN XIE · Volkan Cevher -
2023 Poster: Alternation makes the adversary weaker in two-player games »
Volkan Cevher · Ashok Cutkosky · Ali Kavis · Georgios Piliouras · Stratis Skoulakis · Luca Viano -
2023 Poster: Initialization Matters: Privacy-Utility Analysis of Overparameterized Neural Networks »
Jiayuan Ye · Zhenyu Zhu · Fanghui Liu · Reza Shokri · Volkan Cevher -
2023 Poster: Payoff-based Learning with Matrix Multiplicative Weights in Quantum Games »
Kyriakos Lotidis · Panayotis Mertikopoulos · Nicholas Bambos · Jose Blanchet -
2023 Poster: On the Convergence of Shallow Transformers »
Yongtao Wu · Fanghui Liu · Grigorios Chrysos · Volkan Cevher -
2023 Poster: Efficient Online Clustering with Moving Costs »
Dimitris Christou · Stratis Skoulakis · Volkan Cevher -
2023 Poster: Exploiting hidden structures in non-convex games for convergence to Nash equilibrium »
Iosif Sakos · Emmanouil-Vasileios Vlatakis-Gkaragkounis · Panayotis Mertikopoulos · Georgios Piliouras -
2022 Poster: Adaptive Stochastic Variance Reduction for Non-convex Finite-Sum Minimization »
Ali Kavis · Stratis Skoulakis · Kimon Antonakopoulos · Leello Tadesse Dadi · Volkan Cevher -
2022 Poster: No-regret learning in games with noisy feedback: Faster rates and adaptivity via learning rate separation »
Yu-Guan Hsieh · Kimon Antonakopoulos · Volkan Cevher · Panayotis Mertikopoulos -
2022 Poster: Generalization Properties of NAS under Activation and Skip Connection Search »
Zhenyu Zhu · Fanghui Liu · Grigorios Chrysos · Volkan Cevher -
2022 Poster: Robustness in deep learning: The good (width), the bad (depth), and the ugly (initialization) »
Zhenyu Zhu · Fanghui Liu · Grigorios Chrysos · Volkan Cevher -
2022 Poster: On the Double Descent of Random Features Models Trained with SGD »
Fanghui Liu · Johan Suykens · Volkan Cevher -
2022 Poster: Identifiability and generalizability from multiple experts in Inverse Reinforcement Learning »
Paul Rolland · Luca Viano · Norman Schürhoff · Boris Nikolov · Volkan Cevher -
2022 Poster: Extrapolation and Spectral Bias of Neural Nets with Hadamard Product: a Polynomial Net Study »
Yongtao Wu · Zhenyu Zhu · Fanghui Liu · Grigorios Chrysos · Volkan Cevher -
2022 Poster: Proximal Point Imitation Learning »
Luca Viano · Angeliki Kamoutsi · Gergely Neu · Igor Krawczuk · Volkan Cevher -
2022 Poster: On the convergence of policy gradient methods to Nash equilibria in general stochastic games »
Angeliki Giannou · Kyriakos Lotidis · Panayotis Mertikopoulos · Emmanouil-Vasileios Vlatakis-Gkaragkounis -
2022 Poster: Understanding Deep Neural Function Approximation in Reinforcement Learning via $\epsilon$-Greedy Exploration »
Fanghui Liu · Luca Viano · Volkan Cevher -
2022 Poster: Sound and Complete Verification of Polynomial Networks »
Elias Abad Rocamora · Mehmet Fatih Sahin · Fanghui Liu · Grigorios Chrysos · Volkan Cevher -
2022 Poster: Extra-Newton: A First Approach to Noise-Adaptive Accelerated Second-Order Methods »
Kimon Antonakopoulos · Ali Kavis · Volkan Cevher -
2021 Poster: The Effect of the Intrinsic Dimension on the Generalization of Quadratic Classifiers »
Fabian Latorre · Leello Tadesse Dadi · Paul Rolland · Volkan Cevher -
2021 Poster: Convergence of adaptive algorithms for constrained weakly convex optimization »
Ahmet Alacaoglu · Yura Malitsky · Volkan Cevher -
2021 Poster: Fast Routing under Uncertainty: Adaptive Learning in Congestion Games via Exponential Weights »
Dong Quan Vu · Kimon Antonakopoulos · Panayotis Mertikopoulos -
2021 Poster: STORM+: Fully Adaptive SGD with Recursive Momentum for Nonconvex Optimization »
Kfir Levy · Ali Kavis · Volkan Cevher -
2021 Poster: Subquadratic Overparameterization for Shallow Neural Networks »
ChaeHwan Song · Ali Ramezani-Kebrya · Thomas Pethick · Armin Eftekhari · Volkan Cevher -
2021 Poster: Sifting through the noise: Universal first-order methods for stochastic variational inequalities »
Kimon Antonakopoulos · Thomas Pethick · Ali Kavis · Panayotis Mertikopoulos · Volkan Cevher -
2021 Poster: Adaptive First-Order Methods Revisited: Convex Minimization without Lipschitz Requirements »
Kimon Antonakopoulos · Panayotis Mertikopoulos -
2021 Poster: Robust Inverse Reinforcement Learning under Transition Dynamics Mismatch »
Luca Viano · Yu-Ting Huang · Parameswaran Kamalaruban · Adrian Weller · Volkan Cevher -
2021 Poster: On the Rate of Convergence of Regularized Learning in Games: From Bandits and Uncertainty to Optimism and Beyond »
Angeliki Giannou · Emmanouil-Vasileios Vlatakis-Gkaragkounis · Panayotis Mertikopoulos -
2021 Poster: A first-order primal-dual method with adaptivity to local smoothness »
Maria-Luiza Vladarean · Yura Malitsky · Volkan Cevher -
2020 : Invited speaker: Adaptation and universality in first-order methods, Volkan Cevher »
Volkan Cevher -
2020 Poster: No-Regret Learning and Mixed Nash Equilibria: They Do Not Mix »
Emmanouil-Vasileios Vlatakis-Gkaragkounis · Lampros Flokas · Thanasis Lianeas · Panayotis Mertikopoulos · Georgios Piliouras -
2020 Spotlight: No-Regret Learning and Mixed Nash Equilibria: They Do Not Mix »
Emmanouil-Vasileios Vlatakis-Gkaragkounis · Lampros Flokas · Thanasis Lianeas · Panayotis Mertikopoulos · Georgios Piliouras -
2020 Poster: Explore Aggressively, Update Conservatively: Stochastic Extragradient Methods with Variable Stepsize Scaling »
Yu-Guan Hsieh · Franck Iutzeler · Jérôme Malick · Panayotis Mertikopoulos -
2020 Poster: Online Non-Convex Optimization with Imperfect Feedback »
Amélie Héliou · Matthieu Martin · Panayotis Mertikopoulos · Thibaud Rahier -
2020 Spotlight: Explore Aggressively, Update Conservatively: Stochastic Extragradient Methods with Variable Stepsize Scaling »
Yu-Guan Hsieh · Franck Iutzeler · Jérôme Malick · Panayotis Mertikopoulos -
2020 Poster: Robust Reinforcement Learning via Adversarial training with Langevin Dynamics »
Parameswaran Kamalaruban · Yu-Ting Huang · Ya-Ping Hsieh · Paul Rolland · Cheng Shi · Volkan Cevher -
2019 : Poster and Coffee Break 2 »
Karol Hausman · Kefan Dong · Ken Goldberg · Lihong Li · Lin Yang · Lingxiao Wang · Lior Shani · Liwei Wang · Loren Amdahl-Culleton · Lucas Cassano · Marc Dymetman · Marc Bellemare · Marcin Tomczak · Margarita Castro · Marius Kloft · Marius-Constantin Dinu · Markus Holzleitner · Martha White · Mengdi Wang · Michael Jordan · Mihailo Jovanovic · Ming Yu · Minshuo Chen · Moonkyung Ryu · Muhammad Zaheer · Naman Agarwal · Nan Jiang · Niao He · Nikolaus Yasui · Nikos Karampatziakis · Nino Vieillard · Ofir Nachum · Olivier Pietquin · Ozan Sener · Pan Xu · Parameswaran Kamalaruban · Paul Mineiro · Paul Rolland · Philip Amortila · Pierre-Luc Bacon · Prakash Panangaden · Qi Cai · Qiang Liu · Quanquan Gu · Raihan Seraj · Richard Sutton · Rick Valenzano · Robert Dadashi · Rodrigo Toro Icarte · Roshan Shariff · Roy Fox · Ruosong Wang · Saeed Ghadimi · Samuel Sokota · Sean Sinclair · Sepp Hochreiter · Sergey Levine · Sergio Valcarcel Macua · Sham Kakade · Shangtong Zhang · Sheila McIlraith · Shie Mannor · Shimon Whiteson · Shuai Li · Shuang Qiu · Wai Lok Li · Siddhartha Banerjee · Sitao Luan · Tamer Basar · Thinh Doan · Tianhe Yu · Tianyi Liu · Tom Zahavy · Toryn Klassen · Tuo Zhao · Vicenç Gómez · Vincent Liu · Volkan Cevher · Wesley Suttle · Xiao-Wen Chang · Xiaohan Wei · Xiaotong Liu · Xingguo Li · Xinyi Chen · Xingyou Song · Yao Liu · YiDing Jiang · Yihao Feng · Yilun Du · Yinlam Chow · Yinyu Ye · Yishay Mansour · Yonathan Efroni · Yongxin Chen · Yuanhao Wang · Bo Dai · Chen-Yu Wei · Harsh Shrivastava · Hongyang Zhang · Qinqing Zheng · SIDDHARTHA SATPATHI · Xueqing Liu · Andreu Vall -
2019 Poster: An Inexact Augmented Lagrangian Framework for Nonconvex Optimization with Nonlinear Constraints »
Mehmet Fatih Sahin · Armin Eftekhari · Ahmet Alacaoglu · Fabian Latorre · Volkan Cevher -
2019 Poster: Stochastic Frank-Wolfe for Composite Convex Minimization »
Francesco Locatello · Alp Yurtsever · Olivier Fercoq · Volkan Cevher -
2019 Poster: On the convergence of single-call stochastic extra-gradient methods »
Yu-Guan Hsieh · Franck Iutzeler · Jérôme Malick · Panayotis Mertikopoulos -
2019 Poster: An adaptive Mirror-Prox method for variational inequalities with singular operators »
Kimon Antonakopoulos · Veronica Belmega · Panayotis Mertikopoulos -
2019 Poster: UniXGrad: A Universal, Adaptive Algorithm with Optimal Guarantees for Constrained Optimization »
Ali Kavis · Kfir Y. Levy · Francis Bach · Volkan Cevher -
2019 Poster: Fast and Provable ADMM for Learning with Generative Priors »
Fabian Latorre · Armin Eftekhari · Volkan Cevher -
2019 Spotlight: UniXGrad: A Universal, Adaptive Algorithm with Optimal Guarantees for Constrained Optimization »
Ali Kavis · Kfir Y. Levy · Francis Bach · Volkan Cevher -
2019 Spotlight: Fast and Provable ADMM for Learning with Generative Priors »
Fabian Latorre · Armin Eftekhari · Volkan Cevher -
2018 : Finding Mixed Nash Equilibria of Generative Adversarial Networks »
Volkan Cevher -
2018 : Poster spotlight »
Tianbao Yang · Pavel Dvurechenskii · Panayotis Mertikopoulos · Hugo Berard -
2018 Poster: Online Adaptive Methods, Universality and Acceleration »
Kfir Y. Levy · Alp Yurtsever · Volkan Cevher -
2018 Poster: Mirrored Langevin Dynamics »
Ya-Ping Hsieh · Ali Kavis · Paul Rolland · Volkan Cevher -
2018 Spotlight: Mirrored Langevin Dynamics »
Ya-Ping Hsieh · Ali Kavis · Paul Rolland · Volkan Cevher -
2018 Poster: Bandit Learning in Concave N-Person Games »
Mario Bravo · David Leslie · Panayotis Mertikopoulos -
2018 Poster: Adversarially Robust Optimization with Gaussian Processes »
Ilija Bogunovic · Jonathan Scarlett · Stefanie Jegelka · Volkan Cevher -
2018 Spotlight: Adversarially Robust Optimization with Gaussian Processes »
Ilija Bogunovic · Jonathan Scarlett · Stefanie Jegelka · Volkan Cevher -
2018 Poster: Learning in Games with Lossy Feedback »
Zhengyuan Zhou · Panayotis Mertikopoulos · Susan Athey · Nicholas Bambos · Peter W Glynn · Yinyu Ye -
2017 Poster: Streaming Robust Submodular Maximization: A Partitioned Thresholding Approach »
Slobodan Mitrovic · Ilija Bogunovic · Ashkan Norouzi-Fard · Jakub M Tarnawski · Volkan Cevher -
2017 Poster: Countering Feedback Delays in Multi-Agent Learning »
Zhengyuan Zhou · Panayotis Mertikopoulos · Nicholas Bambos · Peter W Glynn · Claire Tomlin -
2017 Poster: Learning with Bandit Feedback in Potential Games »
Amélie Héliou · Johanne Cohen · Panayotis Mertikopoulos -
2017 Poster: Fixed-Rank Approximation of a Positive-Semidefinite Matrix from Streaming Data »
Joel A Tropp · Alp Yurtsever · Madeleine Udell · Volkan Cevher -
2017 Poster: Phase Transitions in the Pooled Data Problem »
Jonathan Scarlett · Volkan Cevher -
2017 Poster: Smooth Primal-Dual Coordinate Descent Algorithms for Nonsmooth Convex Optimization »
Ahmet Alacaoglu · Quoc Tran Dinh · Olivier Fercoq · Volkan Cevher -
2017 Poster: Stochastic Mirror Descent in Variationally Coherent Optimization Problems »
Zhengyuan Zhou · Panayotis Mertikopoulos · Nicholas Bambos · Stephen Boyd · Peter W Glynn -
2016 Poster: An Efficient Streaming Algorithm for the Submodular Cover Problem »
Ashkan Norouzi-Fard · Abbas Bazzi · Ilija Bogunovic · Marwa El Halabi · Ya-Ping Hsieh · Volkan Cevher -
2016 Poster: Truncated Variance Reduction: A Unified Approach to Bayesian Optimization and Level-Set Estimation »
Ilija Bogunovic · Jonathan Scarlett · Andreas Krause · Volkan Cevher -
2016 Poster: Stochastic Three-Composite Convex Minimization »
Alp Yurtsever · Bang Cong Vu · Volkan Cevher -
2015 Poster: Preconditioned Spectral Descent for Deep Learning »
David Carlson · Edo Collins · Ya-Ping Hsieh · Lawrence Carin · Volkan Cevher -
2015 Poster: A Universal Primal-Dual Convex Optimization Framework »
Alp Yurtsever · Quoc Tran Dinh · Volkan Cevher -
2014 Workshop: Discrete Optimization in Machine Learning »
Jeffrey A Bilmes · Andreas Krause · Stefanie Jegelka · S Thomas McCormick · Sebastian Nowozin · Yaron Singer · Dhruv Batra · Volkan Cevher -
2014 Poster: Constrained convex minimization via model-based excessive gap »
Quoc Tran-Dinh · Volkan Cevher -
2014 Poster: Time–Data Tradeoffs by Aggressive Smoothing »
John J Bruer · Joel A Tropp · Volkan Cevher · Stephen Becker -
2013 Poster: High-Dimensional Gaussian Process Bandits »
Josip Djolonga · Andreas Krause · Volkan Cevher -
2012 Poster: Active Learning of Multi-Index Function Models »
Hemant Tyagi · Volkan Cevher -
2009 Workshop: Manifolds, sparsity, and structured models: When can low-dimensional geometry really help? »
Richard Baraniuk · Volkan Cevher · Mark A Davenport · Piotr Indyk · Bruno Olshausen · Michael B Wakin -
2009 Poster: Learning with Compressible Priors »
Volkan Cevher -
2008 Poster: Sparse Signal Recovery Using Markov Random Fields »
Volkan Cevher · Marco F Duarte · Chinmay Hegde · Richard Baraniuk -
2008 Spotlight: Sparse Signal Recovery Using Markov Random Fields »
Volkan Cevher · Marco F Duarte · Chinmay Hegde · Richard Baraniuk