NeurIPS Poster Fast Last-Iterate Convergence of Learning in Games Requires Forgetful Algorithms

Poster

Fast Last-Iterate Convergence of Learning in Games Requires Forgetful Algorithms

Yang Cai · Gabriele Farina · Julien Grand-Clément · Christian Kroer · Chung-Wei Lee · Haipeng Luo · Weiqiang Zheng

West Ballroom A-D #6709

[ Abstract ]

[ Paper] [ OpenReview]

Fri 13 Dec 11 a.m. PST — 2 p.m. PST

Abstract: Self play via online learning is one of the premier ways to solve large-scale zero-sum games, both in theory and practice. Particularly popular algorithms include optimistic multiplicative weights update (OMWU) and optimistic gradient-descent-ascent (OGDA). While both algorithms enjoy

O (1 / T)

$O(1/T)$ ergodic convergence to Nash equilibrium in two-player zero-sum games, OMWU offers several advantages, including logarithmic dependence on the size of the payoff matrix and

\tilde{O} (1 / T)

$\tilde{O}(1/T)$ convergence to coarse correlated equilibria even in general-sum games. However, in terms of last-iterate convergence in two-player zero-sum games, an increasingly popular topic in this area, OGDA guarantees that the duality gap shrinks at a rate of

(1 / \sqrt{T})

$(1/\sqrt{T})$ , while the best existing last-iterate convergence for OMWU depends on some game-dependent constant that could be arbitrarily large. This begs the question: is this potentially slow last-iterate convergence an inherent disadvantage of OMWU, or is the current analysis too loose? Somewhat surprisingly, we show that the former is true. More generally, we prove that a broad class of algorithms that do not forget the past quickly all suffer the same issue: for any arbitrarily small

δ > 0

$\delta>0$ , there exists a

2 \times 2

$2\times 2$ matrix game such that the algorithm admits a constant duality gap even after

1 / δ

$1/\delta$ rounds. This class of algorithms includes OMWU and other standard optimistic follow-the-regularized-leader algorithms.

Chat is not available.