Talk in Workshop: (Almost) 50 shades of Bayesian Learning: PAC-Bayesian trends and insights
Peter Grünwald - A Tight Excess Risk Bound via a Unified PAC-Bayesian-Rademacher-Shtarkov-MDL Complexity
Peter Grünwald
Abstract:
Over the last 15 years, machine learning theorists have bounded the performance of empirical risk minimization by (localized) Rademacher complexity; Bayesians with frequentist sympathies have studied Bayesian consistency and rate-of-convergence theorems, sometimes under misspecification; and PAC-Bayesians have studied convergence properties of generalized Bayesian and Gibbs posteriors. We show that, amazingly, most such bounds readily follow from essentially a single result that bounds excess risk in terms of a novel complexity COMP(η, w), which depends on a learning rate η and a luckiness function w, the latter generalizing the concept of a 'prior'. Depending on the choice of w, COMP(η, w) specializes to PAC-Bayesian (KL(posterior||prior)) complexity, MDL (normalized maximum likelihood) complexity and Rademacher complexity, and the bounds obtained are optimized for generalized Bayes, ERM, penalized ERM (such as the Lasso) or other methods. Tuning η leads to optimal excess risk convergence rates, even for very large (polynomial entropy) classes, which have always been problematic for the PAC-Bayesian approach; the optimal η depends on 'fast rate' properties of the domain, such as central, Bernstein and Tsybakov conditions.
Joint work with Nishant Mehta, University of Victoria. See https://arxiv.org/abs/1710.07732
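For orientation, the KL(posterior||prior) term mentioned in the abstract is the complexity appearing in standard PAC-Bayesian bounds. The following is a minimal, well-known example of such a bound with an explicit learning rate, not the paper's COMP(η, w) result: assume a loss taking values in [0,1], a prior π on the model class F, a learning rate η > 0 fixed in advance, and any δ in (0,1); then with probability at least 1−δ over an i.i.d. sample of size n, simultaneously for all posteriors ρ,

\[
\mathbb{E}_{f\sim\rho}\!\left[R(f)\right]
\;\le\;
\mathbb{E}_{f\sim\rho}\!\left[\widehat{R}_n(f)\right]
\;+\;
\frac{\mathrm{KL}(\rho\,\|\,\pi) + \ln(1/\delta)}{\eta n}
\;+\;
\frac{\eta}{8},
\]

where R is the population risk, \(\widehat{R}_n\) the empirical risk, and the η/8 slack comes from Hoeffding's lemma for bounded losses. Roughly, the talk's result replaces the KL term by the more general COMP(η, w): particular choices of the luckiness function w recover PAC-Bayesian, MDL/NML and Rademacher-type complexities, and tuning η yields fast excess risk rates under central, Bernstein and Tsybakov conditions.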