Talk in Workshop: (Almost) 50 shades of Bayesian Learning: PAC-Bayesian trends and insights
Peter Grünwald - A Tight Excess Risk Bound via a Unified PAC-Bayesian-Rademacher-Shtarkov-MDL Complexity
Peter Grünwald
Abstract:
Over the last 15 years, machine learning theorists have bounded the performance of empirical risk minimization by (localized) Rademacher complexity; Bayesians with frequentist sympathies have studied Bayesian consistency and rate-of-convergence theorems, sometimes under misspecification; and PAC-Bayesians have studied convergence properties of generalized Bayesian and Gibbs posteriors. We show that, amazingly, most such bounds readily follow from essentially a single result that bounds excess risk in terms of a novel complexity COMP(η, w), which depends on a learning rate η and a luckiness function w, the latter generalizing the concept of a 'prior'. Depending on the choice of w, COMP(η, w) specializes to PAC-Bayesian (KL(posterior||prior)) complexity, MDL (normalized maximum likelihood) complexity and Rademacher complexity, and the bounds obtained are optimized for generalized Bayes, ERM, penalized ERM (such as the Lasso) or other methods. Tuning η leads to optimal excess risk convergence rates, even for very large (polynomial entropy) classes, which have always been problematic for the PAC-Bayesian approach; the optimal η depends on 'fast rate' properties of the domain, such as central, Bernstein and Tsybakov conditions.
Joint work with Nishant Mehta, University of Victoria. See https://arxiv.org/abs/1710.07732
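For orientation, the KL(posterior||prior) term mentioned in the abstract is the complexity appearing in standard PAC-Bayesian bounds. The following is a minimal, well-known example of such a bound with an explicit learning rate, not the paper's COMP(η, w) result: assume a loss taking values in [0,1], a prior π on the model class F, a learning rate η > 0 fixed in advance, and any δ in (0,1); then with probability at least 1−δ over an i.i.d. sample of size n, simultaneously for all posteriors ρ,

\[
\mathbb{E}_{f\sim\rho}\!\left[R(f)\right]
\;\le\;
\mathbb{E}_{f\sim\rho}\!\left[\widehat{R}_n(f)\right]
\;+\;
\frac{\mathrm{KL}(\rho\,\|\,\pi) + \ln(1/\delta)}{\eta n}
\;+\;
\frac{\eta}{8},
\]

where R is the population risk, \(\widehat{R}_n\) the empirical risk, and the η/8 slack comes from Hoeffding's lemma for bounded losses. Roughly, the talk's result replaces the KL term by the more general COMP(η, w): particular choices of the luckiness function w recover PAC-Bayesian, MDL/NML and Rademacher-type complexities, and tuning η yields fast excess risk rates under central, Bernstein and Tsybakov conditions.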