Modern Nonparametrics 3: Automating the Learning Pipeline
Eric Xing · Mladen Kolar · Arthur Gretton · Samory Kpotufe · Han Liu · Zoltán Szabó · Alan Yuille · Andrew G Wilson · Ryan Tibshirani · Sasha Rakhlin · Damian Kozbur · Bharath Sriperumbudur · David Lopez-Paz · Kirthevasan Kandasamy · Francesco Orabona · Andreas Damianou · Wacha Bounliphone · Yanshuai Cao · Arijit Das · Yingzhen Yang · Giulia DeSalvo · Dmitry Storcheus · Roberto Valerio

Sat Dec 13 05:30 AM -- 03:30 PM (PST) @ Level 5, room 511 e
Event URL: https://sites.google.com/site/nips2014modernnonparametric/

Nonparametric methods (kernel methods, kNN, classification trees, etc.) are designed to handle complex pattern recognition problems. Such problems arise in modern applications such as genomic experiments, climate analysis, robotic control, and social network analysis. In fact, contemporary statistical procedures are making inroads into a variety of modern application areas as part of solutions to larger problems. As such, there is a growing need for statistical procedures that can be used "off-the-shelf", i.e., procedures with as few parameters as possible or, better yet, procedures that can "self-tune" to the application at hand.

The problem of devising 'parameter-free' procedures has been addressed in separate areas of the pattern-recognition literature under various names and with different emphases.

In traditional statistics, much effort has gone into so-called "adaptive" procedures, which can attain optimal risks over large sets of models of increasing complexity. Examples are model selection approaches based on penalized empirical risk minimization, approaches based on stability of estimates (e.g. Lepski’s methods), thresholding approaches under sparsity assumptions, and model averaging approaches. Most of these approaches rely on having tight bounds on the risk of learning procedures (under any parameter setting); hence other approaches concentrate on tight estimates of the actual risks, e.g., Stein’s risk estimators, bootstrapping methods, and data-dependent learning bounds.

In theoretical machine learning, much of the work has focused on proper tuning of the actual optimization procedures used to minimize (penalized) empirical risks. In particular, great effort has gone into the automatic setting of important tuning parameters such as 'learning rates' and 'step sizes'.

Another approach from machine learning arises in the kernel literature under the name 'automatic representation learning'. The aim of the approach, similar to theoretical work on model selection, is to automatically learn an appropriate (kernel) transformation of the data for use with kernel methods such as SVMs or Gaussian processes.

In practice, the simplest self-tuning procedures take the form of cross-validation and its variants. Cross-validation can, however, be expensive, and impractical in various constrained settings, e.g., streaming settings, settings with large numbers of tuning parameters, and unsupervised learning problems in general.
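
As a rough illustration of cross-validation as a self-tuning procedure, the sketch below selects the number of neighbours k for 1-D kNN regression by 5-fold cross-validation. It is a minimal sketch, not a method from the workshop: the function names (knn_predict, cv_error, select_k), the synthetic sine data, and the candidate grid are all assumptions made for the example.

```python
import math
import random


def knn_predict(train, x, k):
    """Average the targets of the k training points nearest to x (1-D inputs)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def cv_error(data, k, n_folds=5):
    """Estimate the mean squared error of k-NN regression by n_folds-fold CV."""
    total = 0.0
    for fold in range(n_folds):
        # Every n_folds-th point (offset by `fold`) is held out; the rest train.
        held_out = data[fold::n_folds]
        train = [p for i, p in enumerate(data) if i % n_folds != fold]
        total += sum((knn_predict(train, x, k) - y) ** 2 for x, y in held_out)
    return total / len(data)


def select_k(data, candidates, n_folds=5):
    """Return the candidate k with the smallest cross-validated error."""
    return min(candidates, key=lambda k: cv_error(data, k, n_folds))


# Hypothetical data: a noisy sine curve on [0, 1].
rng = random.Random(0)
xs = [rng.uniform(0.0, 1.0) for _ in range(200)]
data = [(x, math.sin(6.0 * x) + rng.gauss(0.0, 0.3)) for x in xs]

candidates = [1, 2, 5, 10, 20, 50]
best_k = select_k(data, candidates)
print("selected k:", best_k)
```

The cost is visible even in this toy version: every candidate value requires refitting on every fold, so the work grows with the product of the grid size and the number of folds, and the grid itself grows quickly once there are several tuning parameters.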

More generally, many existing self-tuning or parameter-free methods are unfortunately expensive given large modern data sizes and dimensionality, while the cheaper methods tend to self-tune only to small model classes. Ideally we would want self-tuning procedures that can adapt to easy or difficult (nonparametric) problems, while satisfying the practical constraints of modern applications.

A main aim of this workshop is to cover the various approaches proposed so far towards automating the learning pipeline, and the practicality of these approaches in light of modern constraints. We are particularly interested in understanding whether large data sizes and dimensionality might help the automation effort, since such datasets in fact provide more information on the patterns being learned.

Through a number of invited and contributed talks and a focused panel discussion, we plan to bring together both theoretical and applied researchers to discuss these challenges in detail, share insight on existing solutions, and lay out some of the important future directions towards answering the demands of modern applications.

Author Information

Eric Xing (Petuum Inc. / Carnegie Mellon University)
Mladen Kolar (University of Chicago)
Arthur Gretton (Gatsby Unit, UCL)

Arthur Gretton is a Professor with the Gatsby Computational Neuroscience Unit at UCL. He received degrees in Physics and Systems Engineering from the Australian National University, and a PhD with Microsoft Research and the Signal Processing and Communications Laboratory at the University of Cambridge. He previously worked at the MPI for Biological Cybernetics, and at the Machine Learning Department, Carnegie Mellon University. Arthur's recent research interests in machine learning include the design and training of generative models, both implicit (e.g. GANs) and explicit (high/infinite dimensional exponential family models), nonparametric hypothesis testing, and kernel methods. He was an Associate Editor at IEEE Transactions on Pattern Analysis and Machine Intelligence from 2009 to 2013, and has been an Action Editor for JMLR since April 2013. He was an Area Chair for NeurIPS in 2008 and 2009, a Senior Area Chair for NeurIPS in 2018, an Area Chair for ICML in 2011 and 2012, and a member of the COLT Program Committee in 2013. Arthur was program chair for AISTATS in 2016 (with Christian Robert), tutorials chair for ICML 2018 (with Ruslan Salakhutdinov), workshops chair for ICML 2019 (with Honglak Lee), program chair for the Dali workshop in 2019 (with Krikamol Muandet and Shakir Mohamed), and co-organiser of the Machine Learning Summer School 2019 in London (with Marc Deisenroth).

Samory Kpotufe (Princeton University)
Han Liu (Tencent AI Lab)
Zoltán Szabó (École Polytechnique)
Alan Yuille (JHU)
Andrew G Wilson (Carnegie Mellon University)
Ryan Tibshirani (Carnegie Mellon University)
Sasha Rakhlin (University of Pennsylvania)
Damian Kozbur (ETH Zurich)
Bharath Sriperumbudur (The Pennsylvania State University)
David Lopez-Paz (Meta AI)
Kirthevasan Kandasamy (Carnegie Mellon University)
Francesco Orabona (Boston University)
Andreas Damianou (University of Sheffield)
Yanshuai Cao (BorealisAI)
Arijit Das (Max Planck Institute + University of Cologne)
Yingzhen Yang (Snap Research)
Giulia DeSalvo (New York University)
Dmitry Storcheus (Google)
Roberto Valerio (University of Houston)
