The key distinguishing property of a Bayesian approach is marginalization, rather than using a single setting of weights. Bayesian marginalization can particularly improve the accuracy and calibration of modern deep neural networks, which are typically underspecified by the data, and can represent many compelling but different solutions. We show that deep ensembles provide an effective mechanism for approximate Bayesian marginalization, and propose a related approach that further improves the predictive distribution by marginalizing within basins of attraction, without significant overhead. We also investigate the prior over functions implied by a vague distribution over neural network weights, explaining the generalization properties of such models from a probabilistic perspective. From this perspective, we explain results that have been presented as mysterious and distinct to neural network generalization, such as the ability to fit images with random labels, and show that these results can be reproduced with Gaussian processes. We also show that Bayesian model averaging alleviates double descent, resulting in monotonic performance improvements with increased flexibility.
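As a rough illustration of the marginalization idea in the abstract, the Bayesian model average p(y | x, D) = ∫ p(y | x, w) p(w | D) dw can be approximated by an equally weighted average over M sets of weights, p(y | x, D) ≈ (1/M) Σ_m p(y | x, w_m). The sketch below applies this to a deep-ensemble-style setting. The logits, the function names, and the equal weighting are illustrative assumptions, not the authors' implementation, and it does not cover the within-basin marginalization the abstract proposes.

```python
import math

def softmax(logits):
    """Convert a list of logits into class probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def ensemble_predictive(member_logits):
    """Approximate the Bayesian model average by averaging the predictive
    distributions of the ensemble members with equal weights (an assumption)."""
    probs = [softmax(logits) for logits in member_logits]
    n_members = len(probs)
    n_classes = len(probs[0])
    return [sum(p[k] for p in probs) / n_members for k in range(n_classes)]

# Hypothetical logits for one input from three independently trained networks.
member_logits = [
    [2.0, 0.5, -1.0],
    [0.2, 1.8, -0.5],
    [1.5, 1.4, 0.0],
]

single = softmax(member_logits[0])        # one setting of the weights
bma = ensemble_predictive(member_logits)  # averaged predictive distribution

print("single model:", [round(p, 3) for p in single])
print("ensemble avg:", [round(p, 3) for p in bma])
```

Note that the probabilities are averaged, not the logits: the average is taken over predictive distributions, which is what marginalization over weights calls for. In this toy case the members disagree, so the averaged distribution is less confident than the single model.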
Author Information
Andrew Wilson (New York University)

I am a professor of machine learning at New York University.
Pavel Izmailov (New York University)
More from the Same Authors
-
2021 : Robust Reinforcement Learning for Shifting Dynamics During Deployment »
Samuel Stanton · Rasool Fakoor · Jonas Mueller · Andrew Gordon Wilson · Alexander Smola -
2022 : Transfer Learning with Deep Tabular Models »
Roman Levin · Valeriia Cherepanova · Avi Schwarzschild · Arpit Bansal · C. Bayan Bruss · Tom Goldstein · Andrew Wilson · Micah Goldblum -
2022 : On Representation Learning Under Class Imbalance »
Ravid Shwartz-Ziv · Micah Goldblum · Yucen Li · C. Bayan Bruss · Andrew Gordon Wilson -
2022 : Andrew Gordon Wilson: When Bayesian Orthodoxy Can Go Wrong: Model Selection and Out-of-Distribution Generalization »
Andrew Gordon Wilson -
2022 Poster: Chroma-VAE: Mitigating Shortcut Learning with Generative Classifiers »
Wanqian Yang · Polina Kirichenko · Micah Goldblum · Andrew Wilson -
2022 Poster: On Uncertainty, Tempering, and Data Augmentation in Bayesian Classification »
Sanyam Kapoor · Wesley Maddox · Pavel Izmailov · Andrew Wilson -
2022 Poster: Pre-Train Your Loss: Easy Bayesian Transfer Learning with Informative Priors »
Ravid Shwartz-Ziv · Micah Goldblum · Hossein Souri · Sanyam Kapoor · Chen Zhu · Yann LeCun · Andrew Wilson -
2022 Poster: On Feature Learning in the Presence of Spurious Correlations »
Pavel Izmailov · Polina Kirichenko · Nate Gruver · Andrew Wilson -
2022 Poster: PAC-Bayes Compression Bounds So Tight That They Can Explain Generalization »
Sanae Lotfi · Marc Finzi · Sanyam Kapoor · Andres Potapczynski · Micah Goldblum · Andrew Wilson -
2021 Workshop: Bayesian Deep Learning »
Yarin Gal · Yingzhen Li · Sebastian Farquhar · Christos Louizos · Eric Nalisnick · Andrew Gordon Wilson · Zoubin Ghahramani · Kevin Murphy · Max Welling -
2021 : Evaluating Approximate Inference in Bayesian Deep Learning + Q&A »
Andrew Gordon Wilson · Pavel Izmailov · Matthew Hoffman · Yarin Gal · Yingzhen Li · Melanie F. Pradier · Sharad Vikram · Andrew Foong · Sanae Lotfi · Sebastian Farquhar -
2021 Poster: Residual Pathway Priors for Soft Equivariance Constraints »
Marc Finzi · Gregory Benton · Andrew Wilson -
2021 Poster: Does Knowledge Distillation Really Work? »
Samuel Stanton · Pavel Izmailov · Polina Kirichenko · Alexander Alemi · Andrew Wilson -
2021 Poster: Dangers of Bayesian Model Averaging under Covariate Shift »
Pavel Izmailov · Patrick Nicholson · Sanae Lotfi · Andrew Wilson -
2021 Poster: Conditioning Sparse Variational Gaussian Processes for Online Decision-making »
Wesley Maddox · Samuel Stanton · Andrew Wilson -
2021 Poster: Bayesian Optimization with High-Dimensional Outputs »
Wesley Maddox · Maximilian Balandat · Andrew Wilson · Eytan Bakshy -
2020 Poster: Simplifying Hamiltonian and Lagrangian Neural Networks via Explicit Constraints »
Marc Finzi · Ke Alexander Wang · Andrew Wilson -
2020 Spotlight: Simplifying Hamiltonian and Lagrangian Neural Networks via Explicit Constraints »
Marc Finzi · Ke Alexander Wang · Andrew Wilson -
2020 Poster: BoTorch: A Framework for Efficient Monte-Carlo Bayesian Optimization »
Maximilian Balandat · Brian Karrer · Daniel Jiang · Samuel Daulton · Ben Letham · Andrew Wilson · Eytan Bakshy -
2020 Poster: Learning Invariances in Neural Networks from Training Data »
Gregory Benton · Marc Finzi · Pavel Izmailov · Andrew Wilson -
2020 Poster: Improving GAN Training with Probability Ratio Clipping and Sample Reweighting »
Yue Wu · Pan Zhou · Andrew Wilson · Eric Xing · Zhiting Hu -
2020 Poster: Why Normalizing Flows Fail to Detect Out-of-Distribution Data »
Polina Kirichenko · Pavel Izmailov · Andrew Wilson -
2019 Workshop: Learning with Rich Experience: Integration of Learning Paradigms »
Zhiting Hu · Andrew Wilson · Chelsea Finn · Lisa Lee · Taylor Berg-Kirkpatrick · Ruslan Salakhutdinov · Eric Xing -
2019 Poster: Exact Gaussian Processes on a Million Data Points »
Ke Alexander Wang · Geoff Pleiss · Jacob Gardner · Stephen Tyree · Kilian Weinberger · Andrew Gordon Wilson -
2019 Poster: Function-Space Distributions over Kernels »
Gregory Benton · Wesley Maddox · Jayson Salkey · Julio Albinati · Andrew Gordon Wilson -
2019 Poster: A Simple Baseline for Bayesian Uncertainty in Deep Learning »
Wesley Maddox · Pavel Izmailov · Timur Garipov · Dmitry Vetrov · Andrew Gordon Wilson -
2018 Workshop: Bayesian Deep Learning »
Yarin Gal · José Miguel Hernández-Lobato · Christos Louizos · Andrew Wilson · Zoubin Ghahramani · Kevin Murphy · Max Welling -
2018 Poster: Scaling Gaussian Process Regression with Derivatives »
David Eriksson · Kun Dong · Eric Lee · David Bindel · Andrew Wilson -
2018 Poster: GPyTorch: Blackbox Matrix-Matrix Gaussian Process Inference with GPU Acceleration »
Jacob Gardner · Geoff Pleiss · Kilian Weinberger · David Bindel · Andrew Wilson -
2018 Spotlight: GPyTorch: Blackbox Matrix-Matrix Gaussian Process Inference with GPU Acceleration »
Jacob Gardner · Geoff Pleiss · Kilian Weinberger · David Bindel · Andrew Wilson -
2018 Poster: Loss Surfaces, Mode Connectivity, and Fast Ensembling of DNNs »
Timur Garipov · Pavel Izmailov · Dmitrii Podoprikhin · Dmitry Vetrov · Andrew Wilson -
2018 Spotlight: Loss Surfaces, Mode Connectivity, and Fast Ensembling of DNNs »
Timur Garipov · Pavel Izmailov · Dmitrii Podoprikhin · Dmitry Vetrov · Andrew Wilson -
2017 Workshop: Bayesian Deep Learning »
Yarin Gal · José Miguel Hernández-Lobato · Christos Louizos · Andrew Wilson · Diederik Kingma · Zoubin Ghahramani · Kevin Murphy · Max Welling -
2017 Symposium: Interpretable Machine Learning »
Andrew Wilson · Jason Yosinski · Patrice Simard · Rich Caruana · William Herlands -
2017 Poster: Bayesian GAN »
Yunus Saatci · Andrew Wilson -
2017 Spotlight: Bayesian GANs »
Yunus Saatci · Andrew Wilson -
2017 Poster: Bayesian Optimization with Gradients »
Jian Wu · Matthias Poloczek · Andrew Wilson · Peter Frazier -
2017 Poster: Scalable Log Determinants for Gaussian Process Kernel Learning »
Kun Dong · David Eriksson · Hannes Nickisch · David Bindel · Andrew Wilson -
2017 Oral: Bayesian Optimization with Gradients »
Jian Wu · Matthias Poloczek · Andrew Wilson · Peter Frazier -
2017 Poster: Scalable Levy Process Priors for Spectral Kernel Learning »
Phillip Jang · Andrew Loeb · Matthew Davidow · Andrew Wilson -
2016 Workshop: Interpretable Machine Learning for Complex Systems »
Andrew Wilson · Been Kim · William Herlands -
2016 Poster: Stochastic Variational Deep Kernel Learning »
Andrew Wilson · Zhiting Hu · Russ Salakhutdinov · Eric Xing -
2015 Workshop: Nonparametric Methods for Large Scale Representation Learning »
Andrew G Wilson · Alexander Smola · Eric Xing -
2015 Poster: The Human Kernel »
Andrew Wilson · Christoph Dann · Chris Lucas · Eric Xing -
2015 Spotlight: The Human Kernel »
Andrew Wilson · Christoph Dann · Chris Lucas · Eric Xing -
2014 Workshop: Modern Nonparametrics 3: Automating the Learning Pipeline »
Eric Xing · Mladen Kolar · Arthur Gretton · Samory Kpotufe · Han Liu · Zoltán Szabó · Alan Yuille · Andrew G Wilson · Ryan Tibshirani · Sasha Rakhlin · Damian Kozbur · Bharath Sriperumbudur · David Lopez-Paz · Kirthevasan Kandasamy · Francesco Orabona · Andreas Damianou · Wacha Bounliphone · Yanshuai Cao · Arijit Das · Yingzhen Yang · Giulia DeSalvo · Dmitry Storcheus · Roberto Valerio -
2014 Poster: Fast Kernel Learning for Multidimensional Pattern Extrapolation »
Andrew Wilson · Elad Gilboa · John P Cunningham · Arye Nehorai -
2010 Spotlight: Copula Processes »
Andrew Wilson · Zoubin Ghahramani -
2010 Poster: Copula Processes »
Andrew Wilson · Zoubin Ghahramani