Orals & Spotlights Track 11: Learning Theory
Dylan Foster · Nicolò Cesa-Bianchi
No-Regret Learning Dynamics for Extensive-Form Correlated Equilibrium
Andrea Celli · Alberto Marchesi · Gabriele Farina · Nicola Gatti
The existence of simple, uncoupled no-regret dynamics that converge to correlated equilibria in normal-form games is a celebrated result in the theory of multi-agent systems. Specifically, it has been known for more than 20 years that when all players seek to minimize their internal regret in a repeated normal-form game, the empirical frequency of play converges to a normal-form correlated equilibrium. Extensive-form (that is, tree-form) games generalize normal-form games by modeling both sequential and simultaneous moves, as well as private information. Because of the sequential nature of the game and the presence of partial information, extensive-form correlation has significantly different properties than its normal-form counterpart, many of which are still open research directions. Extensive-form correlated equilibrium (EFCE) has been proposed as the natural extensive-form counterpart to normal-form correlated equilibrium. However, it was previously unknown whether EFCE emerges as the result of uncoupled agent dynamics. In this paper, we give the first uncoupled no-regret dynamics that converge to the set of EFCEs in n-player general-sum extensive-form games with perfect recall. First, we introduce a notion of trigger regret in extensive-form games, which extends that of internal regret in normal-form games. When each player has low trigger regret, the empirical frequency of play is close to an EFCE. Then, we give an efficient no-trigger-regret algorithm. Our algorithm decomposes trigger regret into local subproblems at each decision point for the player, and constructs a global strategy for the player from the local solutions at each decision point.
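To ground the normal-form result recalled in the abstract's first sentence, here is a minimal sketch (our own illustrative code, not the paper's algorithm) of Hart and Mas-Colell's internal-regret-matching dynamics in a 2x2 game; the empirical joint frequency of play converges to the set of correlated equilibria. The payoff matrices, the normalizer mu, and the horizon T are arbitrary choices.

import numpy as np

rng = np.random.default_rng(0)

# Payoffs U[i][a0, a1] for player i in a 2x2 "chicken" game (illustrative).
U = [np.array([[0.0, 7.0], [2.0, 6.0]]),
     np.array([[0.0, 2.0], [7.0, 6.0]])]

n, mu, T = 2, 30.0, 50000      # actions per player, normalizer, horizon
R = [np.zeros((n, n)) for _ in range(2)]   # conditional regrets R[i][a][b]
last = [0, 0]                  # most recent action of each player
joint = np.zeros((n, n))       # empirical frequency of joint action profiles

for t in range(1, T + 1):
    acts = []
    for i in range(2):
        a = last[i]
        # Regret matching: switch to b with probability proportional to the
        # average regret for not having played b whenever a was played;
        # mu larger than 2*n*(payoff range) keeps the leftover mass on a
        # nonnegative.
        p = np.maximum(R[i][a] / t, 0.0) / mu
        p[a] = 0.0
        p[a] = 1.0 - p.sum()
        acts.append(rng.choice(n, p=p))
    a0, a1 = acts
    for b in range(n):         # update conditional (internal) regrets
        R[0][a0][b] += U[0][b, a1] - U[0][a0, a1]
        R[1][a1][b] += U[1][a0, b] - U[1][a0, a1]
    joint[a0, a1] += 1.0
    last = acts

print(joint / T)               # approximates a correlated equilibrium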
Efficient active learning of sparse halfspaces with arbitrary bounded noise
Chicheng Zhang · Jie Shen · Pranjal Awasthi
We study active learning of homogeneous
Learning Parities with Neural Networks
Amit Daniely · Eran Malach
In recent years we have seen a rapidly growing line of research showing learnability of various models via common neural network algorithms. Yet, aside from a very few outliers, these results show learnability of models that can be learned using linear methods. Namely, such results show that learning neural networks with gradient descent is competitive with learning a linear classifier on top of a data-independent representation of the examples. This leaves much to be desired, as neural networks are far more successful than linear methods. Furthermore, on a more conceptual level, linear models don't seem to capture the "deepness" of deep networks. In this paper we make a step towards showing learnability of models that are inherently non-linear. We show that under certain distributions, sparse parities are learnable via gradient descent on a depth-two network. On the other hand, under the same distributions, these parities cannot be learned efficiently by linear methods.
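As a toy illustration of the task (a sketch under assumptions, not the distributions, architecture, or guarantees analyzed in the paper), the following code trains a depth-two ReLU network with plain full-batch gradient descent to learn a 2-sparse parity, i.e. the XOR of two coordinates hidden among 20 input dimensions; all sizes, rates, and step counts are arbitrary.

import numpy as np

rng = np.random.default_rng(1)
d, k, m, n, steps, lr = 20, 2, 128, 4096, 3000, 0.1

X = rng.choice([-1.0, 1.0], size=(n, d))
y = np.prod(X[:, :k], axis=1)          # parity (XOR) of the first k coordinates

W = rng.normal(scale=1.0 / np.sqrt(d), size=(m, d))   # hidden weights
b = np.zeros(m)                                        # hidden biases
v = rng.normal(scale=1.0 / np.sqrt(m), size=m)         # output weights

for _ in range(steps):                 # full-batch gradient descent, square loss
    H = np.maximum(X @ W.T + b, 0.0)   # (n, m) ReLU activations
    out = H @ v                        # (n,) network outputs
    g = (out - y) / n                  # gradient of 0.5 * mean squared error
    gH = g[:, None] * v[None, :] * (H > 0.0)
    v -= lr * (H.T @ g)
    W -= lr * (gH.T @ X)
    b -= lr * gH.sum(axis=0)

X_te = rng.choice([-1.0, 1.0], size=(1000, d))
y_te = np.prod(X_te[:, :k], axis=1)
pred = np.sign(np.maximum(X_te @ W.T + b, 0.0) @ v)
print("test accuracy:", np.mean(pred == y_te))   # typically well above chance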
The Adaptive Complexity of Maximizing a Gross Substitutes Valuation
Ron Kupfer · Sharon Qian · Eric Balkanski · Yaron Singer
In this paper, we study the adaptive complexity of maximizing a monotone gross substitutes function under a cardinality constraint. Our main result is an algorithm that achieves a (1 - epsilon)-approximation in O(log n) adaptive rounds for any constant epsilon > 0, which is an exponential speedup in parallel running time compared to previously studied algorithms for gross substitutes functions. We show that the algorithmic results are tight in the sense that no algorithm obtains a constant-factor approximation in o(log n) rounds. Both the upper and lower bounds hold under the assumption that queries are restricted to feasible sets (i.e., of size at most k). We also show that under a stronger model, where non-feasible queries are allowed, no non-adaptive algorithm obtains an approximation better than 1/2 + epsilon. Both lower bounds extend to the class of OXS functions. Additionally, we conduct experiments on synthetic and real data sets to demonstrate the near-optimal performance and efficiency of the algorithm in practice.
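For context on the adaptivity benchmark (a hypothetical sketch, not the paper's O(log n)-round algorithm): greedy is exactly optimal for gross substitutes valuations under a cardinality constraint, but it inherently uses k sequential rounds of value queries, one per element added; value_oracle below is an assumed set-function query interface.

def greedy_k_rounds(value_oracle, ground_set, k):
    """Greedy is optimal for gross substitutes valuations under a cardinality
    constraint, but needs k sequential (adaptive) rounds of value queries."""
    S = set()
    for _ in range(k):
        # Queries within one round could be issued in parallel; the rounds
        # themselves depend on each other and are inherently sequential.
        best = max(ground_set - S, key=lambda e: value_oracle(S | {e}))
        S.add(best)
    return S

# Example with an additive (hence gross substitutes) valuation:
weights = {0: 5.0, 1: 3.0, 2: 9.0, 3: 1.0}
print(greedy_k_rounds(lambda T: sum(weights[e] for e in T), set(weights), 2))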
Hitting the High Notes: Subset Selection for Maximizing Expected Order Statistics
Aranyak Mehta · Uri Nadav · Alexandros Psomas · Aviad Rubinstein
We consider the fundamental problem of selecting
A Bandit Learning Algorithm and Applications to Auction Design
Kim Thang Nguyen
We consider online bandit learning in which at every time step, an algorithm has to make a decision and then observe only its reward. The goal is to design efficient (polynomial-time) algorithms that achieve a total reward close to that of the best fixed decision in hindsight. In this paper, we introduce a new notion of
An Optimal Elimination Algorithm for Learning a Best Arm
Avinatan Hassidim · Ron Kupfer · Yaron Singer
We consider the classic problem of
Second Order PAC-Bayesian Bounds for the Weighted Majority Vote
Andres Masegosa · Stephan Lorenzen · Christian Igel · Yevgeny Seldin
We present a novel analysis of the expected risk of the weighted majority vote in multiclass classification. The analysis takes correlation of predictions by ensemble members into account and provides a bound that is amenable to efficient minimization, which yields improved weighting for the majority vote. We also provide a specialized version of our bound for binary classification, which makes it possible to exploit additional unlabeled data for tighter risk estimation. In experiments, we apply the bound to improve weighting of trees in random forests and show that, in contrast to the commonly used first-order bound, minimization of the new bound typically does not lead to degradation of the test error of the ensemble.
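A minimal sketch of the empirical quantities such bounds are built from (our own names and shapes; the paper's actual bounds add PAC-Bayesian complexity terms on top): if the weighted vote errs on a point, at least half of the weight mass errs there, so a first-order Markov argument bounds the majority-vote loss by twice the weighted average member loss, while a second-order argument bounds it by four times the "tandem" loss, the weighted probability that two members err simultaneously.

import numpy as np

def mv_bounds(preds, y, rho):
    """preds: (M, n) member predictions, y: (n,) labels, rho: (M,) weights.
    Returns the empirical first-order (2 * Gibbs loss) and second-order
    (4 * tandem loss) bounds on the weighted majority vote's 0-1 loss;
    the PAC-Bayesian complexity terms of the paper are omitted."""
    err = (preds != y).astype(float)           # (M, n) member error indicators
    gibbs = rho @ err.mean(axis=1)             # weighted individual loss
    tandem = rho @ (err @ err.T / err.shape[1]) @ rho   # weighted joint-error rate
    return 2.0 * gibbs, 4.0 * tandem

rng = np.random.default_rng(2)
preds = rng.integers(0, 2, size=(5, 200))      # 5 (random) members, 200 points
y = rng.integers(0, 2, size=200)
print(mv_bounds(preds, y, np.full(5, 0.2)))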
PAC-Bayesian Bound for the Conditional Value at Risk
Zakaria Mhammedi · Benjamin Guedj · Robert Williamson
Conditional Value at Risk (CVaR) is a 'coherent risk measure' which generalizes expectation (recovered at a boundary parameter setting). Widely used in mathematical finance, it is garnering increasing interest in machine learning as an alternative approach to regularization and as a means of ensuring fairness. This paper presents a generalization bound for learning algorithms that minimize the CVaR of the empirical loss. The bound is of PAC-Bayesian type and is guaranteed to be small when the empirical CVaR is small. We achieve this by reducing the problem of estimating CVaR to that of merely estimating an expectation. As a by-product, this enables us to obtain concentration inequalities for CVaR even when the random variable in question is unbounded.
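To make the reduction concrete: by the standard Rockafellar-Uryasev identity, CVaR_alpha(Z) = min over theta of theta + E[(Z - theta)_+]/alpha (with the convention that CVaR_alpha averages the worst alpha-fraction of outcomes), so CVaR is an expectation evaluated at an optimized threshold. A small illustrative sketch, not code from the paper:

import numpy as np

def empirical_cvar(losses, alpha):
    """Empirical CVaR at level alpha via Rockafellar-Uryasev, plugging the
    empirical (1 - alpha)-quantile in for the optimal threshold theta."""
    theta = np.quantile(losses, 1.0 - alpha)
    return theta + np.mean(np.maximum(losses - theta, 0.0)) / alpha

losses = np.random.default_rng(3).standard_normal(100_000)
print(empirical_cvar(losses, 0.05))    # approx. the mean of the worst 5% tail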
Classification Under Misspecification: Halfspaces, Generalized Linear Models, and Evolvability
Sitan Chen · Frederic Koehler · Ankur Moitra · Morris Yau
In this paper, we revisit the problem of distribution-independently learning halfspaces under Massart noise with rate
Hedging in games: Faster convergence of external and swap regrets
Xi Chen · Binghui Peng
We consider the setting where players run the Hedge algorithm or its optimistic variant \cite{syrgkanis2015fast} to play an n-action game repeatedly for T rounds. (1) For two-player games, we show that the regret of optimistic Hedge decays at rate \tilde{O}(1/T^{5/6}), improving on the previous bound of O(1/T^{3/4}) by \cite{syrgkanis2015fast}. (2) In contrast, we show that the convergence rate of vanilla Hedge is no better than \tilde{\Omega}(1/\sqrt{T}), addressing an open question posed in \cite{syrgkanis2015fast}. For general m-player games, we show that the swap regret of each player decays at rate \tilde{O}(m^{1/2}(n/T)^{3/4}) when they combine optimistic Hedge with the classical external-to-internal reduction of Blum and Mansour \cite{blum2007external}. The algorithm can also be modified to achieve the same rate against itself and a rate of \tilde{O}(\sqrt{n/T}) against adversaries. Via standard connections, our upper bounds also imply faster convergence to coarse correlated equilibria in two-player games and to correlated equilibria in multiplayer games.
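For reference, a minimal full-information sketch of the two dynamics in a two-player game (illustrative code; the payoff matrices, step size, and horizon are arbitrary, and the Blum-Mansour swap-regret construction is not shown). Hedge weights each action by the exponentiated negative cumulative loss; the optimistic variant counts the most recent loss vector one extra time, as a prediction of the next loss.

import numpy as np

def softmax(z):
    w = np.exp(z - z.max())
    return w / w.sum()

def play(A, B, T=10000, eta=0.1, optimistic=True):
    """A, B: payoff matrices of the row/column player. Both run (optimistic)
    Hedge on expected losses under full information; the returned average
    joint play approximates a coarse correlated equilibrium."""
    n, m = A.shape
    Lx, lx = np.zeros(n), np.zeros(n)   # cumulative / most recent row losses
    Ly, ly = np.zeros(m), np.zeros(m)
    avg = np.zeros((n, m))
    for _ in range(T):
        x = softmax(-eta * (Lx + (lx if optimistic else 0.0)))
        y = softmax(-eta * (Ly + (ly if optimistic else 0.0)))
        lx, ly = -(A @ y), -(B.T @ x)   # losses = negated expected payoffs
        Lx += lx
        Ly += ly
        avg += np.outer(x, y)
    return avg / T

A = np.array([[1.0, -1.0], [-1.0, 1.0]])   # matching pennies
print(play(A, -A))                          # close to the uniform distribution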
Online Bayesian Persuasion
Matteo Castiglioni · Andrea Celli · Alberto Marchesi · Nicola Gatti
In Bayesian persuasion, an informed sender has to design a signaling scheme that discloses the right amount of information so as to influence the behavior of a self-interested receiver. This kind of strategic interaction is ubiquitous in real economic scenarios. However, the original model by Kamenica and Gentzkow makes some stringent assumptions which limit its applicability in practice. One of the most limiting assumptions is arguably that, in order to compute an optimal signaling scheme, the sender is usually required to know the receiver's utility function. In this paper, we relax this assumption through an online learning framework in which the sender faces a receiver of unknown type. At each round, the receiver's type is chosen adversarially from a finite set of possible types. We are interested in no-regret algorithms prescribing a signaling scheme at each round of the repeated interaction, with performance close to that of the best-in-hindsight signaling scheme. First, we prove a hardness result on the per-iteration running time required to achieve the no-regret property. Then, we provide algorithms for the full and partial information models which exhibit regret sublinear in the number of rounds and polynomial in the parameters of the game.