This paper examines the long-run behavior of learning with bandit feedback in non-cooperative concave games. The bandit framework accounts for extremely low-information environments where the agents may not even know they are playing a game; as such, the agents’ most sensible choice in this setting would be to employ a no-regret learning algorithm. In general, this does not mean that the players' behavior stabilizes in the long run: no-regret learning may lead to cycles, even with perfect gradient information. However, if a standard monotonicity condition is satisfied, our analysis shows that no-regret learning based on mirror descent with bandit feedback converges to Nash equilibrium with probability 1. We also derive an upper bound for the convergence rate of the process that nearly matches the best attainable rate for single-agent bandit stochastic optimization.
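To make the setting concrete, the following is a minimal sketch of bandit learning in a toy two-player game, not the authors' implementation. Everything in it is an assumption chosen for illustration: the quadratic payoffs in `payoffs`, the action interval [-1, 1], the helper names, and the step-size and query-radius schedules `gamma0 / n` and `delta0 / n ** (1 / 3)`. Each player perturbs its current pivot by a random sign, observes only its own realized payoff, forms a one-point gradient estimate from that single number, and takes a projected gradient step (mirror descent with the Euclidean regularizer). The toy game is strictly monotone, with its unique Nash equilibrium at (7/17, -6/17).

```python
import random


def payoffs(x1, x2):
    """Hypothetical two-player game; each payoff is strictly concave in the player's own action."""
    u1 = -(x1 - 0.5) ** 2 + 0.5 * x1 * x2
    u2 = -(x2 + 0.25) ** 2 - 0.5 * x1 * x2
    return u1, u2


def clip(value, lo, hi):
    """Euclidean projection of a scalar onto the interval [lo, hi]."""
    return max(lo, min(hi, value))


def bandit_learning(steps=100_000, gamma0=1.0, delta0=0.5, seed=0):
    """Run one-point bandit gradient play for both players; return the final pivots."""
    rng = random.Random(seed)
    x = [0.0, 0.0]  # pivot actions, both constrained to [-1, 1]
    for n in range(1, steps + 1):
        gamma = gamma0 / n             # assumed step-size schedule
        delta = delta0 / n ** (1 / 3)  # assumed query-radius schedule
        # Keep pivots at least delta inside the interval so queries stay feasible.
        pivots = [clip(xi, -1.0 + delta, 1.0 - delta) for xi in x]
        # Each player independently draws a random sign and plays a perturbed action.
        signs = [rng.choice((-1.0, 1.0)) for _ in pivots]
        played = [p + delta * s for p, s in zip(pivots, signs)]
        # Bandit feedback: each player sees only its own realized payoff.
        rewards = payoffs(*played)
        # One-point gradient estimate built from that single payoff value.
        grads = [r * s / delta for r, s in zip(rewards, signs)]
        # Projected gradient ascent step on each player's own payoff.
        x = [clip(p + gamma * g, -1.0, 1.0) for p, g in zip(pivots, grads)]
    return x


if __name__ == "__main__":
    x1, x2 = bandit_learning()
    print(f"final pivots: {x1:.3f}, {x2:.3f}")
    print(f"Nash equilibrium: {7 / 17:.3f}, {-6 / 17:.3f}")
```

The shrinking query radius trades bias for variance: a smaller `delta` gives a more local gradient estimate, but the estimate scales like `1 / delta`, so the step size has to decay faster than the radius for the noise to average out. Shifting the pivot inward with `clip` before perturbing is a simplification used here only to keep the played actions feasible.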
Author Information
Mario Bravo (University of Santiago, Chile)
David Leslie (Lancaster University and PROWLER.io)
Panayotis Mertikopoulos (CNRS (French National Center for Scientific Research))
More from the Same Authors
- 2022 Poster: No-regret learning in games with noisy feedback: Faster rates and adaptivity via learning rate separation
  Yu-Guan Hsieh · Kimon Antonakopoulos · Volkan Cevher · Panayotis Mertikopoulos
- 2022 Poster: On the convergence of policy gradient methods to Nash equilibria in general stochastic games
  Angeliki Giannou · Kyriakos Lotidis · Panayotis Mertikopoulos · Emmanouil-Vasileios Vlatakis-Gkaragkounis
- 2021 Poster: Fast Routing under Uncertainty: Adaptive Learning in Congestion Games via Exponential Weights
  Dong Quan Vu · Kimon Antonakopoulos · Panayotis Mertikopoulos
- 2021 Poster: Decentralized Q-learning in Zero-sum Markov Games
  Muhammed Sayin · Kaiqing Zhang · David Leslie · Tamer Basar · Asuman Ozdaglar
- 2021 Poster: Sifting through the noise: Universal first-order methods for stochastic variational inequalities
  Kimon Antonakopoulos · Thomas Pethick · Ali Kavis · Panayotis Mertikopoulos · Volkan Cevher
- 2021 Poster: Adaptive First-Order Methods Revisited: Convex Minimization without Lipschitz Requirements
  Kimon Antonakopoulos · Panayotis Mertikopoulos
- 2021 Poster: On the Rate of Convergence of Regularized Learning in Games: From Bandits and Uncertainty to Optimism and Beyond
  Angeliki Giannou · Emmanouil-Vasileios Vlatakis-Gkaragkounis · Panayotis Mertikopoulos
- 2020 Poster: No-Regret Learning and Mixed Nash Equilibria: They Do Not Mix
  Emmanouil-Vasileios Vlatakis-Gkaragkounis · Lampros Flokas · Thanasis Lianeas · Panayotis Mertikopoulos · Georgios Piliouras
- 2020 Spotlight: No-Regret Learning and Mixed Nash Equilibria: They Do Not Mix
  Emmanouil-Vasileios Vlatakis-Gkaragkounis · Lampros Flokas · Thanasis Lianeas · Panayotis Mertikopoulos · Georgios Piliouras
- 2020 Poster: Explore Aggressively, Update Conservatively: Stochastic Extragradient Methods with Variable Stepsize Scaling
  Yu-Guan Hsieh · Franck Iutzeler · Jérôme Malick · Panayotis Mertikopoulos
- 2020 Poster: BOSS: Bayesian Optimization over String Spaces
  Henry Moss · David Leslie · Daniel Beck · Javier González · Paul Rayson
- 2020 Poster: Online Non-Convex Optimization with Imperfect Feedback
  Amélie Héliou · Matthieu Martin · Panayotis Mertikopoulos · Thibaud Rahier
- 2020 Poster: On the Almost Sure Convergence of Stochastic Gradient Descent in Non-Convex Problems
  Panayotis Mertikopoulos · Nadav Hallak · Ali Kavis · Volkan Cevher
- 2020 Spotlight: BOSS: Bayesian Optimization over String Spaces
  Henry Moss · David Leslie · Daniel Beck · Javier González · Paul Rayson
- 2020 Spotlight: Explore Aggressively, Update Conservatively: Stochastic Extragradient Methods with Variable Stepsize Scaling
  Yu-Guan Hsieh · Franck Iutzeler · Jérôme Malick · Panayotis Mertikopoulos
- 2019 Poster: On the convergence of single-call stochastic extra-gradient methods
  Yu-Guan Hsieh · Franck Iutzeler · Jérôme Malick · Panayotis Mertikopoulos
- 2019 Poster: An adaptive Mirror-Prox method for variational inequalities with singular operators
  Kimon Antonakopoulos · Veronica Belmega · Panayotis Mertikopoulos
- 2018: Poster spotlight
  Tianbao Yang · Pavel Dvurechenskii · Panayotis Mertikopoulos · Hugo Berard
- 2018 Poster: Learning in Games with Lossy Feedback
  Zhengyuan Zhou · Panayotis Mertikopoulos · Susan Athey · Nicholas Bambos · Peter W Glynn · Yinyu Ye
- 2017 Poster: Countering Feedback Delays in Multi-Agent Learning
  Zhengyuan Zhou · Panayotis Mertikopoulos · Nicholas Bambos · Peter W Glynn · Claire Tomlin
- 2017 Poster: Learning with Bandit Feedback in Potential Games
  Amélie Héliou · Johanne Cohen · Panayotis Mertikopoulos
- 2017 Poster: Stochastic Mirror Descent in Variationally Coherent Optimization Problems
  Zhengyuan Zhou · Panayotis Mertikopoulos · Nicholas Bambos · Stephen Boyd · Peter W Glynn