Timezone: »
Stochastic first-order methods such as Stochastic Extragradient (SEG) or Stochastic Gradient Descent-Ascent (SGDA) for solving smooth minimax problems and, more generally, variational inequality problems (VIP) have been gaining a lot of attention in recent years due to the growing popularity of adversarial formulations in machine learning. While high-probability convergence bounds are known to more accurately reflect the actual behavior of stochastic methods, most convergence results are provided in expectation. Moreover, the only known high-probability complexity results have been derived under restrictive sub-Gaussian (light-tailed) noise and bounded domain assumptions [Juditsky et al., 2011]. In this work, we prove the first high-probability complexity results with logarithmic dependence on the confidence level for stochastic methods for solving monotone and structured non-monotone VIPs with non-sub-Gaussian (heavy-tailed) noise and unbounded domains. In the monotone case, our results match the best known ones in the light-tails case [Juditsky et al., 2011], and are novel for structured non-monotone problems such as negative comonotone, quasi-strongly monotone, and/or star-cocoercive ones. We achieve these results by studying SEG and SGDA with clipping. In addition, we numerically validate that the gradient noise of many practical GAN formulations is heavy-tailed and show that clipping improves the performance of SEG/SGDA.
Author Information
Eduard Gorbunov (Mohamed bin Zayed University of Artificial Intelligence)
Marina Danilova (ICS RAS)
David Dobre (Mila)
Pavel Dvurechenskii (Weierstrass Institute, Berlin)
Since 2015 Research assistant, Research Group 6 "Stochastic Algorithms and Nonparametric Statistics", Weierstrass Institute for Applied Analysis and Stochastics, Berlin 2014 - 2015 Research assistant, Institute for Information Transmission Problems, Moscow, Russia 2009 - 2015 Junior researcher, Moscow Institute of Physics and Technology, Moscow, Russia 2013 Ph.D., Moscow Institute of Physics and Technology, Moscow, Russia 2010 Master's Diploma, Moscow Institute of Physics and Technology, Moscow, Russia 2008 Bachelor's Diploma, Moscow Institute of Physics and Technology, Moscow, Russia
Alexander Gasnikov (Moscow Institute of Physics and Technology)
Gauthier Gidel (Mila)
More from the Same Authors
-
2021 Spotlight: A single gradient step finds adversarial examples on random two-layers neural networks »
Sebastien Bubeck · Yeshwanth Cherapanamjeri · Gauthier Gidel · Remi Tachet des Combes -
2021 : Decentralized Personalized Federated Learning: Lower Bounds and Optimal Algorithm for All Personalization Modes »
Abdurakhmon Sadiev · Ekaterina Borodich · Darina Dvinskikh · Aleksandr Beznosikov · Alexander Gasnikov -
2021 : Decentralized Personalized Federated Learning: Lower Bounds and Optimal Algorithm for All Personalization Modes »
Abdurakhmon Sadiev · Ekaterina Borodich · Darina Dvinskikh · Aleksandr Beznosikov · Alexander Gasnikov -
2021 : EF21 with Bells & Whistles: Practical Algorithmic Extensions of Modern Error Feedback »
Peter Richtarik · Igor Sokolov · Ilyas Fatkhullin · Eduard Gorbunov · Zhize Li -
2021 : Decentralized Personalized Federated Min-Max Problems »
Ekaterina Borodich · Aleksandr Beznosikov · Abdurakhmon Sadiev · Vadim Sushko · Alexander Gasnikov -
2022 : Nesterov Meets Optimism: Rate-Optimal Optimistic-Gradient-Based Method for Stochastic Bilinearly-Coupled Minimax Optimization »
Chris Junchi Li · Angela Yuan · Gauthier Gidel · Michael Jordan -
2022 : Momentum Extragradient is Optimal for Games with Cross-Shaped Spectrum »
Junhyung Lyle Kim · Gauthier Gidel · Anastasios Kyrillidis · Fabian Pedregosa -
2022 : Stochastic Gradient Descent-Ascent: Unified Theory and New Efficient Methods »
Aleksandr Beznosikov · Eduard Gorbunov · Hugo Berard · Nicolas Loizou -
2022 : Effects of momentum scaling for SGD »
Dmitry A. Pasechnyuk · Alexander Gasnikov · Martin Takac -
2022 : Performative Prediction with Neural Networks »
Mehrnaz Mofakhami · Ioannis Mitliagkas · Gauthier Gidel -
2023 : Adversarial Attacks and Defenses in Large Language Models: Old and New Threats »
Leo Schwinn · David Dobre · Stephan Günnemann · Gauthier Gidel -
2023 Poster: Single-Call Stochastic Extragradient Methods for Structured Non-monotone Variational Inequalities: Improved Analysis under Weaker Conditions »
Sayantan Choudhury · Eduard Gorbunov · Nicolas Loizou -
2023 Poster: Feature Likelihood Score: Evaluating the Generalization of Generative Models Using Samples »
Marco Jiralerspong · Joey Bose · Ian Gemp · Chongli Qin · Yoram Bachrach · Gauthier Gidel -
2023 Poster: First Order Methods with Markovian Noise: from Acceleration to Variational Inequalities »
Aleksandr Beznosikov · Sergey Samsonov · Marina Sheshukova · Alexander Gasnikov · Alexey Naumov · Eric Moulines -
2023 Poster: Byzantine-Tolerant Methods for Distributed Variational Inequalities »
Nazarii Tupitsa · Abdulla Jasem Almansoori · Yanlin Wu · Martin Takac · Karthik Nandakumar · Samuel Horváth · Eduard Gorbunov -
2023 Poster: Accelerated Zeroth-order Method for Non-Smooth Stochastic Convex Optimization Problem with Infinite Variance »
Nikita Kornilov · Ohad Shamir · Aleksandr Lobanov · Darina Dvinskikh · Alexander Gasnikov · Innokentiy Shibaev · Eduard Gorbunov · Samuel Horváth -
2023 Poster: Optimal Extragradient-Based Algorithms for Stochastic Variational Inequalities with Separable Structure »
Angela Yuan · Chris Junchi Li · Gauthier Gidel · Michael Jordan · Quanquan Gu · Simon Du -
2023 Poster: Similarity, Compression and Local Steps: Three Pillars of Efficient Communications for Distributed Variational Inequalities »
Aleksandr Beznosikov · Martin Takac · Alexander Gasnikov -
2022 Spotlight: Accelerated Primal-Dual Gradient Method for Smooth and Convex-Concave Saddle-Point Problems with Bilinear Coupling »
Dmitry Kovalev · Alexander Gasnikov · Peter Richtarik -
2022 Spotlight: The First Optimal Acceleration of High-Order Methods in Smooth Convex Optimization »
Dmitry Kovalev · Alexander Gasnikov -
2022 Spotlight: Distributed Methods with Compressed Communication for Solving Variational Inequalities, with Theoretical Guarantees »
Aleksandr Beznosikov · Peter Richtarik · Michael Diskin · Max Ryabinin · Alexander Gasnikov -
2022 Spotlight: Optimal Algorithms for Decentralized Stochastic Variational Inequalities »
Dmitry Kovalev · Aleksandr Beznosikov · Abdurakhmon Sadiev · Michael Persiianov · Peter Richtarik · Alexander Gasnikov -
2022 Spotlight: Optimal Gradient Sliding and its Application to Optimal Distributed Optimization Under Similarity »
Dmitry Kovalev · Aleksandr Beznosikov · Ekaterina Borodich · Alexander Gasnikov · Gesualdo Scutari -
2022 Spotlight: The First Optimal Algorithm for Smooth and Strongly-Convex-Strongly-Concave Minimax Optimization »
Dmitry Kovalev · Alexander Gasnikov -
2022 Spotlight: Decentralized Local Stochastic Extra-Gradient for Variational Inequalities »
Aleksandr Beznosikov · Pavel Dvurechenskii · Anastasiia Koloskova · Valentin Samokhin · Sebastian Stich · Alexander Gasnikov -
2022 Poster: Optimal Gradient Sliding and its Application to Optimal Distributed Optimization Under Similarity »
Dmitry Kovalev · Aleksandr Beznosikov · Ekaterina Borodich · Alexander Gasnikov · Gesualdo Scutari -
2022 Poster: The First Optimal Algorithm for Smooth and Strongly-Convex-Strongly-Concave Minimax Optimization »
Dmitry Kovalev · Alexander Gasnikov -
2022 Poster: The Curse of Unrolling: Rate of Differentiating Through Optimization »
Damien Scieur · Gauthier Gidel · Quentin Bertrand · Fabian Pedregosa -
2022 Poster: Beyond L1: Faster and Better Sparse Models with skglm »
Quentin Bertrand · Quentin Klopfenstein · Pierre-Antoine Bannier · Gauthier Gidel · Mathurin Massias -
2022 Poster: A Damped Newton Method Achieves Global $\mathcal O \left(\frac{1}{k^2}\right)$ and Local Quadratic Convergence Rate »
Slavomír Hanzely · Dmitry Kamzolov · Dmitry Pasechnyuk · Alexander Gasnikov · Peter Richtarik · Martin Takac -
2022 Poster: The First Optimal Acceleration of High-Order Methods in Smooth Convex Optimization »
Dmitry Kovalev · Alexander Gasnikov -
2022 Poster: Optimal Algorithms for Decentralized Stochastic Variational Inequalities »
Dmitry Kovalev · Aleksandr Beznosikov · Abdurakhmon Sadiev · Michael Persiianov · Peter Richtarik · Alexander Gasnikov -
2022 Poster: Accelerated Primal-Dual Gradient Method for Smooth and Convex-Concave Saddle-Point Problems with Bilinear Coupling »
Dmitry Kovalev · Alexander Gasnikov · Peter Richtarik -
2022 Poster: Last-Iterate Convergence of Optimistic Gradient Method for Monotone Variational Inequalities »
Eduard Gorbunov · Adrien Taylor · Gauthier Gidel -
2022 Poster: Distributed Methods with Compressed Communication for Solving Variational Inequalities, with Theoretical Guarantees »
Aleksandr Beznosikov · Peter Richtarik · Michael Diskin · Max Ryabinin · Alexander Gasnikov -
2022 Poster: Decentralized Local Stochastic Extra-Gradient for Variational Inequalities »
Aleksandr Beznosikov · Pavel Dvurechenskii · Anastasiia Koloskova · Valentin Samokhin · Sebastian Stich · Alexander Gasnikov -
2021 Poster: Stochastic Gradient Descent-Ascent and Consensus Optimization for Smooth Games: Convergence Analysis under Expected Co-coercivity »
Nicolas Loizou · Hugo Berard · Gauthier Gidel · Ioannis Mitliagkas · Simon Lacoste-Julien -
2021 Poster: Distributed Saddle-Point Problems Under Data Similarity »
Aleksandr Beznosikov · Gesualdo Scutari · Alexander Rogozin · Alexander Gasnikov -
2021 Poster: A single gradient step finds adversarial examples on random two-layers neural networks »
Sebastien Bubeck · Yeshwanth Cherapanamjeri · Gauthier Gidel · Remi Tachet des Combes -
2021 Poster: Lower Bounds and Optimal Algorithms for Smooth and Strongly Convex Decentralized Optimization Over Time-Varying Networks »
Dmitry Kovalev · Elnur Gasanov · Alexander Gasnikov · Peter Richtarik -
2020 Poster: Stochastic Optimization with Heavy-Tailed Noise via Accelerated Gradient Clipping »
Eduard Gorbunov · Marina Danilova · Alexander Gasnikov -
2020 Poster: Linearly Converging Error Compensated SGD »
Eduard Gorbunov · Dmitry Kovalev · Dmitry Makarenko · Peter Richtarik -
2020 Spotlight: Linearly Converging Error Compensated SGD »
Eduard Gorbunov · Dmitry Kovalev · Dmitry Makarenko · Peter Richtarik -
2019 : Poster and Coffee Break 1 »
Aaron Sidford · Aditya Mahajan · Alejandro Ribeiro · Alex Lewandowski · Ali H Sayed · Ambuj Tewari · Angelika Steger · Anima Anandkumar · Asier Mujika · Hilbert J Kappen · Bolei Zhou · Byron Boots · Chelsea Finn · Chen-Yu Wei · Chi Jin · Ching-An Cheng · Christina Yu · Clement Gehring · Craig Boutilier · Dahua Lin · Daniel McNamee · Daniel Russo · David Brandfonbrener · Denny Zhou · Devesh Jha · Diego Romeres · Doina Precup · Dominik Thalmeier · Eduard Gorbunov · Elad Hazan · Elena Smirnova · Elvis Dohmatob · Emma Brunskill · Enrique Munoz de Cote · Ethan Waldie · Florian Meier · Florian Schaefer · Ge Liu · Gergely Neu · Haim Kaplan · Hao Sun · Hengshuai Yao · Jalaj Bhandari · James A Preiss · Jayakumar Subramanian · Jiajin Li · Jieping Ye · Jimmy Smith · Joan Bas Serrano · Joan Bruna · John Langford · Jonathan Lee · Jose A. Arjona-Medina · Kaiqing Zhang · Karan Singh · Yuping Luo · Zafarali Ahmed · Zaiwei Chen · Zhaoran Wang · Zhizhong Li · Zhuoran Yang · Ziping Xu · Ziyang Tang · Yi Mao · David Brandfonbrener · Shirli Di-Castro · Riashat Islam · Zuyue Fu · Abhishek Naik · Saurabh Kumar · Benjamin Petit · Angeliki Kamoutsi · Simone Totaro · Arvind Raghunathan · Rui Wu · Donghwan Lee · Dongsheng Ding · Alec Koppel · Hao Sun · Christian Tjandraatmadja · Mahdi Karami · Jincheng Mei · Chenjun Xiao · Junfeng Wen · Zichen Zhang · Ross Goroshin · Mohammad Pezeshki · Jiaqi Zhai · Philip Amortila · Shuo Huang · Mariya Vasileva · El houcine Bergou · Adel Ahmadyan · Haoran Sun · Sheng Zhang · Lukas Gruber · Yuanhao Wang · Tetiana Parshakova -
2019 : Poster Session »
Eduard Gorbunov · Alexandre d'Aspremont · Lingxiao Wang · Liwei Wang · Boris Ginsburg · Alessio Quaglino · Camille Castera · Saurabh Adya · Diego Granziol · Rudrajit Das · Raghu Bollapragada · Fabian Pedregosa · Martin Takac · Majid Jahani · Sai Praneeth Karimireddy · Hilal Asi · Balint Daroczy · Leonard Adolphs · Aditya Rawal · Nicolas Brandt · Minhan Li · Giuseppe Ughi · Orlando Romero · Ivan Skorokhodov · Damien Scieur · Kiwook Bae · Konstantin Mishchenko · Rohan Anil · Vatsal Sharan · Aditya Balu · Chao Chen · Zhewei Yao · Tolga Ergen · Paul Grigas · Chris Junchi Li · Jimmy Ba · Stephen J Roberts · Sharan Vaswani · Armin Eftekhari · Chhavi Sharma -
2018 : Poster spotlight »
Tianbao Yang · Pavel Dvurechenskii · Panayotis Mertikopoulos · Hugo Berard -
2018 Poster: Decentralize and Randomize: Faster Algorithm for Wasserstein Barycenters »
Pavel Dvurechenskii · Darina Dvinskikh · Alexander Gasnikov · Cesar Uribe · Angelia Nedich -
2018 Spotlight: Decentralize and Randomize: Faster Algorithm for Wasserstein Barycenters »
Pavel Dvurechenskii · Darina Dvinskikh · Alexander Gasnikov · Cesar Uribe · Angelia Nedich -
2016 Poster: Learning Supervised PageRank with Gradient-Based and Gradient-Free Optimization Methods »
Lev Bogolubsky · Pavel Dvurechenskii · Alexander Gasnikov · Gleb Gusev · Yurii Nesterov · Andrei M Raigorodskii · Aleksey Tikhonov · Maksim Zhukovskii