Practical Bayesian Nonparametrics
Nick Foti · Tamara Broderick · Trevor Campbell · Michael Hughes · Jeffrey Miller · Aaron Schein · Sinead Williamson · Yanxun Xu

Thu Dec 08 11:00 PM -- 09:30 AM (PST) @ AC Barcelona Hotel - Barcelona Room
Event URL: https://sites.google.com/site/nipsbnp2016/

In theory, Bayesian nonparametric (BNP) methods are well suited to the large data sets that arise in the sciences, technology, politics, and other applied fields. By making use of infinite-dimensional mathematical structures, BNP methods allow the complexity of a learned model to grow as the size of a data set grows, exhibiting desirable Bayesian regularization properties for small data sets and allowing the practitioner to learn ever more from larger data sets. These properties have resulted in the adoption of BNP methods across a diverse set of application areas, including, but not limited to, biology, neuroscience, the humanities, social sciences, economics, and finance.

In practice, BNP methods present a number of computational and modeling challenges. Recent work has brought a wide range of models to bear on applied problems, going beyond the Dirichlet process and Gaussian process. Meanwhile, advances in accelerated inference are making these models tractable in big data problems.

In this workshop, we will explore new BNP methods for diverse applied problems, including cutting-edge models being developed by application domain experts. We will also discuss the limitations of existing methods and the key problems that remain to be solved. A major focus of the workshop will be to expose participants to practical software tools for performing Bayesian nonparametric analyses. In particular, we plan to host hands-on tutorials to introduce workshop participants to some of the software packages that can be used to easily perform posterior inference for BNP models, e.g. Stan, BNPy, and BNP.jl.

We expect workshop participants to come from a variety of fields, including but not limited to machine learning, statistics, engineering, political science, and various biological sciences. The workshop will be relevant both to BNP experts and to those interested in learning how to apply BNP models. There will be a special emphasis on work that makes BNP methods easy to use in practice and computationally efficient. Participants will leave the workshop with (i) exposure to recent advances in the field, (ii) hands-on experience with software implementing BNP methods, and (iii) an idea of the current challenges that need to be overcome in order to make BNP methods more widespread in practice. These goals will be accomplished through a series of invited and contributed talks, a poster session, and at least one hands-on tutorial session where participants can get their hands dirty with BNP methods.

This workshop builds on the “Bayesian Nonparametrics: The Next Generation” workshop held at NIPS in 2015. While that workshop had a broad remit, spanning theory, applications, and computation, this year’s workshop focuses on the practical aspects of BNP methods. During last year’s panel discussion, there were many questions about computational techniques and practical applications, suggesting that this direction will be of great interest to the many applied machine learning researchers who attend the conference.

Thu 11:15 p.m. - 11:30 p.m.
Welcome and Introductions (Talk)
Thu 11:30 p.m. - 12:00 a.m.
Tamara Broderick: Foundations Talk (Talk)
Fri 12:00 a.m. - 12:30 a.m.
Jennifer Hill: Invited Talk (Talk)
Fri 12:30 a.m. - 12:45 a.m.
Hyunjik Kim: Scaling up the Automatic Statistician: Scalable Structure Discovery in Regression using Gaussian Processes (Talk)
Fri 12:45 a.m. - 1:00 a.m.
Melanie F. Pradier: Sparse Three-parameter Restricted Indian Buffet Process for Understanding International Trade (Talk)
Fri 1:00 a.m. - 1:30 a.m.

Many existing statistical and machine learning tools for social network analysis focus on a single level of analysis. Methods designed for clustering optimize a global partition of the graph, whereas projection-based approaches (e.g. the latent space model in the statistics literature) represent in rich detail the roles of individuals. Many pertinent questions in sociology and economics, however, span multiple scales of analysis. Further, many questions involve comparisons across disconnected graphs that will inevitably be of different sizes, either due to missing data or the inherent heterogeneity in real-world networks. We propose a class of network models that represent network structure on multiple scales and facilitate comparison across graphs with different numbers of individuals. These models differentially invest modeling effort within subgraphs of high density, often termed communities, while maintaining a parsimonious structure between said subgraphs. We show that our model class is projective, highlighting an ongoing discussion in the social network modeling literature on the dependence of inference paradigms on the size of the observed graph. We illustrate the utility of our method using data on household relations from Karnataka, India.

Fri 2:00 a.m. - 2:15 a.m.
Poster Spotlights (Spotlight)
Fri 2:15 a.m. - 3:15 a.m.
Poster Session
Fri 3:15 a.m. - 3:45 a.m.
Lunch Session Intro (Break)
Fri 3:45 a.m. - 4:45 a.m.
Rob Trangucci: Stan Tutorial, with focus on Gaussian Processes (Demonstration)
Fri 4:45 a.m. - 5:45 a.m.
Mike Hughes: BNPy tutorial - Clustering with Dirichlet Processes and extensions in Python (Demonstration)
Fri 6:30 a.m. - 7:00 a.m.
Marc Deisenroth: Invited Talk (Talk)
Fri 7:00 a.m. - 7:15 a.m.

Contributed Talk

Fri 7:30 a.m. - 8:00 a.m.

Dustin Tran (Columbia University), lead developer of Edward

Aki Vehtari (Aalto University), Stan contributor and lead developer of GPstuff

Martin Trapp (Austrian Research Institute for Artificial Intelligence), lead developer of BNP.jl (Julia implementation of BNP methods)

Mike Hughes (Harvard University), lead developer of BNPy

Fri 8:00 a.m. - 8:30 a.m.

Stationary time series models built from parametric distributions are, in general, limited in scope due to the assumptions imposed on the residual distribution and autoregression relationship. We present a modeling approach for univariate time series data, which makes no assumptions of stationarity, and can accommodate complex dynamics and capture non-standard distributions. The model for the transition density arises from the conditional distribution implied by a Bayesian nonparametric mixture of bivariate normals. This results in a flexible autoregressive form for the conditional transition density, defining a time-homogeneous, non-stationary Markovian model for real-valued data indexed in discrete time. To obtain a computationally tractable algorithm for posterior inference, we utilize a square-root-free Cholesky decomposition of the mixture kernel covariance matrix. Results from simulated data suggest that the model is able to recover challenging transition densities and non-linear dynamic relationships. We also illustrate the model on time intervals between eruptions of the Old Faithful geyser. Extensions and open questions about accommodating higher order structure and developing state-space models are also discussed.
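The construction described in this abstract, a transition density obtained by conditioning a mixture of bivariate normals, can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a finite mixture with K components as a stand-in for the nonparametric mixture, takes the parameters as given rather than inferring them, and the function and argument names (`transition_density`, `weights`, `means`, `covs`) are hypothetical. Each bivariate component over (y, x) = (z_t, z_{t-1}) contributes a normal conditional for y with a mean that is linear in x, and the mixture weights are reweighted by how well each component explains x.

```python
import numpy as np


def normal_pdf(x, mean, var):
    """Univariate normal density, vectorized over mean and var."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)


def transition_density(y, x, weights, means, covs):
    """Evaluate f(y | x) implied by a finite mixture of bivariate normals.

    Assumed layout (illustrative only):
      weights: shape (K,), mixture weights summing to 1
      means:   shape (K, 2), each row ordered as (mean of y, mean of x)
      covs:    shape (K, 2, 2), covariance of (y, x) per component
    """
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    covs = np.asarray(covs, dtype=float)

    mu_y, mu_x = means[:, 0], means[:, 1]
    s_yy, s_yx, s_xx = covs[:, 0, 0], covs[:, 0, 1], covs[:, 1, 1]

    # Lag-dependent weights: q_k(x) is proportional to w_k * N(x; mu_x_k, s_xx_k).
    q = weights * normal_pdf(x, mu_x, s_xx)
    q = q / q.sum()

    # Per-component conditional of y given x: mean linear in x, constant variance.
    cond_mean = mu_y + (s_yx / s_xx) * (x - mu_x)
    cond_var = s_yy - s_yx ** 2 / s_xx

    return float(np.sum(q * normal_pdf(y, cond_mean, cond_var)))


if __name__ == "__main__":
    # Two made-up components, purely for illustration.
    w = [0.3, 0.7]
    m = [[0.0, -1.0], [2.0, 1.5]]
    c = [[[1.0, 0.5], [0.5, 1.0]], [[0.5, -0.2], [-0.2, 0.8]]]
    print(transition_density(y=1.0, x=0.3, weights=w, means=m, covs=c))
```

Because the weights q_k(x) change with the previous value x, the resulting autoregression can be non-linear and the conditional density non-Gaussian even though every component is normal.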

Fri 8:30 a.m. - 9:30 a.m.

Invited Panel: Bailey Fosdick (Colorado State University), Maria DeYoreo (Duke University), Suchi Saria (Johns Hopkins University), Jim Griffin (University of Kent), Marc Deisenroth (Imperial College London)

Author Information

Nick Foti (University of Washington)
Tamara Broderick (MIT)
Trevor Campbell (UBC)
Mike Hughes (Tufts University)
Jeff Miller (Harvard University)
Aaron Schein (UMass Amherst)
Sinead Williamson (UT Austin)
Yanxun Xu (Johns Hopkins University)
