PAC-Bayes Analysis Beyond the Usual Bounds
Omar Rivasplata · Ilja Kuzborskij · Csaba Szepesvari · John Shawe-Taylor

Tue Dec 08 09:00 AM -- 11:00 AM (PST) @ Poster Session 1 #442

We focus on a stochastic learning model where the learner observes a finite set of training examples and the output of the learning process is a data-dependent distribution over a space of hypotheses. The learned data-dependent distribution is then used to make randomized predictions, and the high-level theme addressed here is guaranteeing the quality of predictions on examples that were not seen during training, i.e. generalization. In this setting the unknown quantity of interest is the expected risk of the data-dependent randomized predictor, for which upper bounds can be derived via a PAC-Bayes analysis, leading to PAC-Bayes bounds.
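In standard notation (assumed here for illustration; the abstract itself does not fix symbols), the unknown quantity of interest can be written as the expected risk of the randomized predictor drawn from the learned data-dependent distribution:

```latex
% Expected risk of the randomized predictor, with Q_S the data-dependent
% distribution over hypotheses learned from the sample S, D the unknown
% data-generating distribution, and \ell a loss function (notation assumed):
L(Q_S) \;=\; \mathbb{E}_{h \sim Q_S}\,\mathbb{E}_{(X,Y) \sim D}\!\left[\, \ell\big(h(X), Y\big) \,\right]
```

The generalization guarantee then takes the form of a high-probability upper bound on $L(Q_S)$ in terms of quantities computable from the training sample.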

Specifically, we present a basic PAC-Bayes inequality for stochastic kernels, from which one may derive extensions of various known PAC-Bayes bounds as well as novel bounds. We clarify the role of the requirements of fixed ‘data-free’ priors, bounded losses, and i.i.d. data. We highlight that those requirements were used to upper-bound an exponential moment term, while the basic PAC-Bayes theorem remains valid without those restrictions. We present three bounds that illustrate the use of data-dependent priors, including one for the unbounded square loss.
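For context, a classical PAC-Bayes bound that uses exactly the restrictions named above (a fixed data-free prior $Q^0$, a loss bounded in $[0,1]$, and i.i.d. data) is the kl form due to Seeger and Maurer; this is background, not the paper's more general stochastic-kernel theorem:

```latex
% With probability at least 1 - \delta over an i.i.d. sample S of size n,
% simultaneously for all posteriors Q:
\mathrm{kl}\big(\hat{L}_S(Q) \,\|\, L(Q)\big)
  \;\le\; \frac{\mathrm{KL}(Q \,\|\, Q^0) + \ln(2\sqrt{n}/\delta)}{n},
% where \hat{L}_S(Q) is the empirical risk, L(Q) the expected risk, and
% kl(q || p) = q ln(q/p) + (1-q) ln((1-q)/(1-p)) is the binary KL divergence.
```

The $\ln(2\sqrt{n}/\delta)$ term is precisely where a bound on an exponential moment enters the proof; relaxing the three requirements amounts to controlling that moment by other means.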

Author Information

Omar Rivasplata (DeepMind & UCL)

I did undergraduate maths in Peru (BSc 2000, PUCP) and graduate maths at the University of Alberta (MSc 2005, PhD 2012). In 2016 I started working on machine learning research with Csaba Szepesvari's team at U of A Computing Science. In 2017 I moved to the UK to join UCL Computer Science and work with John Shawe-Taylor's team. Since 2018 I have been affiliated with the Foundations Team at DeepMind. I am interested in machine learning theory, probability and statistics.

Ilja Kuzborskij (DeepMind)
Csaba Szepesvari (DeepMind / University of Alberta)
John Shawe-Taylor (UCL)

John Shawe-Taylor has contributed to fields ranging from graph theory through cryptography to statistical learning theory and its applications. However, his main contributions have been in the analysis and subsequent algorithmic definition of principled machine learning algorithms founded in statistical learning theory. This work has helped to drive a fundamental rebirth in the field of machine learning with the introduction of kernel methods and support vector machines, and to extend these approaches to novel domains including computer vision, document classification, and applications in biology and medicine focussed on brain scan, immunity and proteome analysis. He has published over 300 papers and two books that have together attracted over 60,000 citations. He has also been instrumental in assembling a series of influential European Networks of Excellence. The scientific coordination of these projects has influenced a generation of researchers and promoted the widespread uptake of machine learning in both science and industry that we are currently witnessing.