This paper presents a general mean-field game (GMFG) framework for simultaneous learning and decision-making in stochastic games with a large population. It first establishes the existence of a unique Nash equilibrium for this GMFG and explains why naively combining Q-learning with the fixed-point approach of classical MFGs yields unstable algorithms. It then proposes a Q-learning algorithm with Boltzmann policy (GMF-Q), together with an analysis of its convergence properties and computational complexity. Experiments on repeated ad-auction problems demonstrate that GMF-Q is efficient and robust in terms of convergence and learning accuracy, and that it outperforms existing multi-agent reinforcement-learning algorithms in convergence, stability, and learning ability.
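The abstract describes GMF-Q as a fixed-point-style scheme that interleaves Q-learning for a representative agent with a Boltzmann (softmax) policy and a mean-field update. The following is a minimal sketch of that idea on a toy tabular problem; the environment, hyperparameters, and all function names are illustrative assumptions, not the paper's exact algorithm.

```python
# Sketch of the GMF-Q idea: alternate (i) tabular Q-learning against the
# current population distribution ("mean field"), (ii) a Boltzmann policy
# derived from Q, and (iii) the mean-field update induced by that policy.
# The toy transition/reward model and all constants below are assumptions.
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 5, 3
# Toy transition kernel P[s, a, s'] and base rewards R[s, a].
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
R = rng.normal(size=(n_states, n_actions))

def reward(s, a, mu):
    # Reward penalized by crowding at the current state (mean-field coupling).
    return R[s, a] - mu[s]

def boltzmann(q_row, temp):
    # Softmax (Boltzmann) policy over actions for one state.
    z = np.exp((q_row - q_row.max()) / temp)
    return z / z.sum()

def gmf_q(outer_iters=50, inner_steps=2000, gamma=0.95, alpha=0.1, temp=0.5):
    mu = np.full(n_states, 1.0 / n_states)      # population state distribution
    Q = np.zeros((n_states, n_actions))
    for _ in range(outer_iters):
        # (i) Q-learning for the representative agent, mean field frozen.
        s = rng.integers(n_states)
        for _ in range(inner_steps):
            a = rng.choice(n_actions, p=boltzmann(Q[s], temp))
            s_next = rng.choice(n_states, p=P[s, a])
            target = reward(s, a, mu) + gamma * Q[s_next].max()
            Q[s, a] += alpha * (target - Q[s, a])
            s = s_next
        # (ii) Boltzmann policy from Q, (iii) induced mean-field update.
        pi = np.vstack([boltzmann(Q[s], temp) for s in range(n_states)])
        P_pi = np.einsum('sa,sat->st', pi, P)   # state transitions under pi
        mu = mu @ P_pi                          # forward flow of the population
    return Q, mu

Q, mu = gmf_q()
print("approximate mean field:", np.round(mu, 3))
```

The softmax smoothing is what distinguishes this from the naive fixed-point combination criticized in the abstract: it regularizes the policy map so the outer iteration does not oscillate between deterministic best responses.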
Author Information
Xin Guo (University of California, Berkeley)
Anran Hu (University of California, Berkeley (UC Berkeley))
Renyuan Xu (University of Oxford)
Renyuan Xu is currently a Hooke Research Fellow in the Mathematical Institute at the University of Oxford. Prior to that, she obtained her Ph.D. from the IEOR Department at UC Berkeley in 2019 and her Bachelor's degree in Mathematics from the University of Science and Technology of China in 2014.
Junzi Zhang (Stanford University)
More from the Same Authors
- 2019 Poster: Learning in Generalized Linear Contextual Bandits with Stochastic Delays
  Zhengyuan Zhou · Renyuan Xu · Jose Blanchet
- 2019 Spotlight: Learning in Generalized Linear Contextual Bandits with Stochastic Delays
  Zhengyuan Zhou · Renyuan Xu · Jose Blanchet
- 2018: Consistency and Computation for Regularized Maximum Likelihood Estimation of Multivariate Hawkes Processes
  Anran Hu