Poster
Fast and Furious Learning in Zero-Sum Games: Vanishing Regret with Non-Vanishing Step Sizes
James Bailey · Georgios Piliouras
Wed Dec 11 10:45 AM -- 12:45 PM (PST) @ East Exhibition Hall B + C #217
We show for the first time that it is possible to reconcile, in online learning in zero-sum games, two seemingly contradictory objectives: vanishing time-average regret and non-vanishing step sizes. This phenomenon, which we coin "fast and furious" learning in games, sets a new benchmark for what is possible both in max-min optimization and in multi-agent systems. Our analysis does not depend on introducing a carefully tailored dynamic. Instead, we focus on the most well-studied online dynamic, gradient descent. Similarly, we focus on the simplest textbook class of games, two-agent two-strategy zero-sum games such as Matching Pennies. Even for this simplest of benchmarks, the best known bound for total regret prior to our work was the trivial one of $O(T)$, which applies even to a non-learning agent. Based on a tight understanding of the geometry of the non-equilibrating trajectories in the dual space, we prove a regret bound of $\Theta(\sqrt{T})$, matching the well-known optimal bound for adaptive step sizes in the online setting. This guarantee holds for all fixed step sizes, without having to know the time horizon in advance and adapt the fixed step size accordingly. As a corollary, we establish that even with fixed learning rates, the time-averages of mixed strategies and utilities converge to their exact Nash equilibrium values. We also provide experimental evidence suggesting the stronger regret bound holds for all zero-sum games.
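The abstract's setting is easy to reproduce numerically. Below is a minimal simulation sketch (not the authors' code, and not their exact dual-space dynamic): projected gradient descent/ascent with a fixed step size in Matching Pennies, tracking the time-averaged strategies, the time-averaged utility, and the external regret of the maximizing player. All function and variable names are illustrative assumptions.

```python
# Sketch: fixed-step-size gradient descent/ascent in Matching Pennies.
# Payoff to the x-player: u(p, q) = 4*p*q - 2*p - 2*q + 1, where p and q are
# the probabilities of playing Heads; the unique Nash equilibrium is (1/2, 1/2)
# with value 0. This is an illustration, not the paper's exact update rule.
import numpy as np

def matching_pennies_gda(T=100_000, eta=0.1, p0=0.9, q0=0.2):
    p, q = p0, q0
    ps, qs, utils = [], [], []
    for _ in range(T):
        ps.append(p); qs.append(q)
        utils.append(4*p*q - 2*p - 2*q + 1)
        gp, gq = 4*q - 2, 4*p - 2              # partial derivatives of u
        p = np.clip(p + eta * gp, 0.0, 1.0)    # x ascends its payoff
        q = np.clip(q - eta * gq, 0.0, 1.0)    # y descends (minimizes) it
    ps, qs, utils = map(np.array, (ps, qs, utils))
    # External regret of x: best fixed strategy p' in hindsight (linear in p',
    # so the maximum is attained at an endpoint) minus the realized payoffs.
    best_fixed = max(p_prime * np.sum(4*qs - 2) + np.sum(1 - 2*qs)
                     for p_prime in (0.0, 1.0))
    regret = best_fixed - utils.sum()
    return ps.mean(), qs.mean(), utils.mean(), regret

avg_p, avg_q, avg_u, regret = matching_pennies_gda()
print(f"time-avg strategies ~ ({avg_p:.3f}, {avg_q:.3f}), "
      f"time-avg utility ~ {avg_u:.3f}, regret ~ {regret:.1f}")
```

In runs of this kind one expects the individual iterates to keep cycling (they do not equilibrate), while the time-averaged strategies drift toward (1/2, 1/2), the time-averaged utility approaches the equilibrium value 0, and the regret stays far below the trivial $O(T)$ bound despite the fixed step size, which is the qualitative behavior the abstract describes.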
Author Information
James Bailey (Texas A&M University)
Georgios Piliouras (Singapore University of Technology and Design)
More from the Same Authors
- 2020 Poster: No-Regret Learning and Mixed Nash Equilibria: They Do Not Mix
  Emmanouil-Vasileios Vlatakis-Gkaragkounis · Lampros Flokas · Thanasis Lianeas · Panayotis Mertikopoulos · Georgios Piliouras
- 2020 Poster: Efficient Online Learning of Optimal Rankings: Dimensionality Reduction via Gradient Descent
  Dimitris Fotakis · Thanasis Lianeas · Georgios Piliouras · Stratis Skoulakis
- 2020 Poster: The route to chaos in routing games: When is price of anarchy too optimistic?
  Thiparat Chotibut · Fryderyk Falniowski · Michał Misiurewicz · Georgios Piliouras
- 2020 Spotlight: No-Regret Learning and Mixed Nash Equilibria: They Do Not Mix
  Emmanouil-Vasileios Vlatakis-Gkaragkounis · Lampros Flokas · Thanasis Lianeas · Panayotis Mertikopoulos · Georgios Piliouras
- 2020 Poster: Chaos, Extremism and Optimism: Volume Analysis of Learning in Games
  Yun Kuen Cheung · Georgios Piliouras
- 2019 Poster: Efficiently avoiding saddle points with zero order methods: No gradients required
  Emmanouil-Vasileios Vlatakis-Gkaragkounis · Lampros Flokas · Georgios Piliouras
- 2019 Poster: Poincaré Recurrence, Cycles and Spurious Equilibria in Gradient-Descent-Ascent for Non-Convex Non-Concave Zero-Sum Games
  Emmanouil-Vasileios Vlatakis-Gkaragkounis · Lampros Flokas · Georgios Piliouras
- 2019 Spotlight: Poincaré Recurrence, Cycles and Spurious Equilibria in Gradient-Descent-Ascent for Non-Convex Non-Concave Zero-Sum Games
  Emmanouil-Vasileios Vlatakis-Gkaragkounis · Lampros Flokas · Georgios Piliouras