Recent work on Bayesian neural networks (BNNs) has highlighted the need to better understand the implications of using Gaussian priors in combination with the compositional structure of the network architecture. Similar in spirit to the analysis developed to devise better initialization schemes for neural networks (cf. He or Xavier initialization), we derive a precise characterization of the prior predictive distribution of finite-width ReLU networks with Gaussian weights. While theoretical results have been obtained for their heavy-tailedness, the full characterization of the prior predictive distribution (i.e., its density, CDF, and moments) remained unknown prior to this work. Our analysis, based on the Meijer-G function, allows us to quantify the influence of architectural choices such as the width or depth of the network on the resulting shape of the prior predictive distribution. We also formally connect our results to previous work in the infinite width setting, demonstrating that the moments of the distribution converge to those of a normal log-normal mixture in the infinite depth limit. Finally, our results provide valuable guidance on prior design: for instance, controlling the predictive variance with depth- and width-informed priors on the weights of the network.
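The dependence of the prior predictive on width and depth can be illustrated with a small Monte Carlo sketch. This is not the paper's Meijer-G analysis; it only samples outputs empirically. The function name `sample_prior_predictive`, the bias-free scalar-output architecture, and the specific prior variances (2/fan_in on hidden layers in the style of He initialization, 1/fan_in on the output layer) are assumptions made for illustration. Under those assumptions the output variance should stay near ||x||²/d regardless of depth, while the kurtosis grows above the Gaussian value of 3, reflecting heavier tails.

```python
import numpy as np


def sample_prior_predictive(x, hidden_widths, n_samples, rng, hidden_gain=2.0):
    """Draw f(x) from a bias-free ReLU MLP, with fresh Gaussian weights per draw.

    Assumed prior (illustrative, not taken from the paper):
      hidden layers: W ~ N(0, hidden_gain / fan_in)
      output layer:  w ~ N(0, 1 / fan_in)
    """
    # Repeat the input once per prior draw: shape (n_samples, d).
    h = np.broadcast_to(x, (n_samples, x.shape[0]))
    for width in hidden_widths:
        fan_in = h.shape[1]
        # One independent weight matrix per draw.
        W = rng.normal(0.0, np.sqrt(hidden_gain / fan_in),
                       size=(n_samples, fan_in, width))
        h = np.maximum(np.einsum("ni,nio->no", h, W), 0.0)  # ReLU
    fan_in = h.shape[1]
    w_out = rng.normal(0.0, np.sqrt(1.0 / fan_in), size=(n_samples, fan_in))
    return np.einsum("ni,ni->n", h, w_out)  # scalar output per draw


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = np.array([1.0, -2.0, 0.5, 3.0])
    target_var = float(x @ x) / x.size  # variance implied by the assumed scaling
    for depth in (1, 3):
        f = sample_prior_predictive(x, [16] * depth, 20000, rng)
        var = f.var()
        kurt = np.mean((f - f.mean()) ** 4) / var ** 2
        print(f"depth={depth}: var/target={var / target_var:.2f}, kurtosis={kurt:.2f}")
```

Changing `hidden_gain` away from 2 makes the variance grow or shrink geometrically with depth, which is the kind of effect a depth- and width-informed prior is meant to control.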
Author Information
Lorenzo Noci (Swiss Federal Institute of Technology)
Gregor Bachmann (ETH Zürich)
Kevin Roth (ETH Zurich)
Sebastian Nowozin (Microsoft Research Cambridge)
Thomas Hofmann (ETH Zurich)
Related Events (a corresponding poster, oral, or spotlight)
-
2021 Spotlight: Precise characterization of the prior predictive distribution of deep ReLU networks »
More from the Same Authors
-
2022 : Cosmology from Galaxy Redshift Surveys with PointNet »
Sotiris Anagnostidis · Arne Thomsen · Alexandre Refregier · Tomasz Kacprzak · Luca Biggio · Thomas Hofmann · Tilman Tröster -
2022 : Achieving a Better Stability-Plasticity Trade-off via Auxiliary Networks in Continual Learning »
Sanghwan Kim · Lorenzo Noci · Antonio Orvieto · Thomas Hofmann -
2022 : Contextual Squeeze-and-Excitation »
Massimiliano Patacchiola · John Bronskill · Aliaksandra Shysheya · Katja Hofmann · Sebastian Nowozin · Richard Turner -
2022 : FiT: Parameter Efficient Few-shot Transfer Learning »
Aliaksandra Shysheya · John Bronskill · Massimiliano Patacchiola · Sebastian Nowozin · Richard Turner -
2023 Poster: Dynamic Context Pruning for Efficient and Interpretable Autoregressive Transformers »
Sotiris Anagnostidis · Dario Pavllo · Luca Biggio · Lorenzo Noci · Aurelien Lucchi · Thomas Hofmann -
2023 Poster: Scaling MLPs: A Tale of Inductive Bias »
Gregor Bachmann · Sotiris Anagnostidis · Thomas Hofmann -
2023 Poster: Shaped Attention Mechanism in the Infinite Depth-and-Width Limit at Initialization »
Lorenzo Noci · Chuning Li · Mufan Li · Bobby He · Thomas Hofmann · Chris Maddison · Dan Roy -
2023 Poster: Timewarp: Transferable Acceleration of Molecular Dynamics by Learning Time-Coarsened Dynamics »
Leon Klein · Andrew Foong · Tor Fjelde · Bruno Mlodozeniec · Marc Brockschmidt · Sebastian Nowozin · Frank Noe · Ryota Tomioka -
2022 Poster: Contextual Squeeze-and-Excitation for Efficient Few-Shot Image Classification »
Massimiliano Patacchiola · John Bronskill · Aliaksandra Shysheya · Katja Hofmann · Sebastian Nowozin · Richard Turner -
2022 Poster: OpenFilter: A Framework to Democratize Research Access to Social Media AR Filters »
Piera Riccio · Bill Psomas · Francesco Galati · Francisco Escolano · Thomas Hofmann · Nuria Oliver -
2021 Poster: Analytic Insights into Structure and Rank of Neural Network Hessian Maps »
Sidak Pal Singh · Gregor Bachmann · Thomas Hofmann -
2021 Poster: Disentangling the Roles of Curation, Data-Augmentation and the Prior in the Cold Posterior Effect »
Lorenzo Noci · Kevin Roth · Gregor Bachmann · Sebastian Nowozin · Thomas Hofmann -
2021 Poster: Memory Efficient Meta-Learning with Large Images »
John Bronskill · Daniela Massiceti · Massimiliano Patacchiola · Katja Hofmann · Sebastian Nowozin · Richard Turner -
2020 Poster: Batch normalization provably avoids ranks collapse for randomly initialised deep networks »
Hadi Daneshmand · Jonas Kohler · Francis Bach · Thomas Hofmann · Aurelien Lucchi -
2020 Poster: Adversarial Training is a Form of Data-dependent Operator Norm Regularization »
Kevin Roth · Yannic Kilcher · Thomas Hofmann -
2020 Spotlight: Adversarial Training is a Form of Data-dependent Operator Norm Regularization »
Kevin Roth · Yannic Kilcher · Thomas Hofmann -
2020 Poster: Convolutional Generation of Textured 3D Meshes »
Dario Pavllo · Graham Spinks · Thomas Hofmann · Marie-Francine Moens · Aurelien Lucchi -
2020 Oral: Convolutional Generation of Textured 3D Meshes »
Dario Pavllo · Graham Spinks · Thomas Hofmann · Marie-Francine Moens · Aurelien Lucchi -
2019 Poster: A Domain Agnostic Measure for Monitoring and Evaluating GANs »
Paulina Grnarova · Kfir Y. Levy · Aurelien Lucchi · Nathanael Perraudin · Ian Goodfellow · Thomas Hofmann · Andreas Krause -
2019 Poster: Icebreaker: Element-wise Efficient Information Acquisition with a Bayesian Deep Latent Gaussian Model »
Wenbo Gong · Sebastian Tschiatschek · Sebastian Nowozin · Richard Turner · José Miguel Hernández-Lobato · Cheng Zhang -
2019 Poster: Fast and Flexible Multi-Task Classification using Conditional Neural Adaptive Processes »
James Requeima · Jonathan Gordon · John Bronskill · Sebastian Nowozin · Richard Turner -
2019 Spotlight: Fast and Flexible Multi-Task Classification using Conditional Neural Adaptive Processes »
James Requeima · Jonathan Gordon · John Bronskill · Sebastian Nowozin · Richard Turner -
2019 Poster: Can you trust your model's uncertainty? Evaluating predictive uncertainty under dataset shift »
Jasper Snoek · Yaniv Ovadia · Emily Fertig · Balaji Lakshminarayanan · Sebastian Nowozin · D. Sculley · Joshua Dillon · Jie Ren · Zachary Nado -
2018 : Sebastian Nowozin »
Sebastian Nowozin -
2018 Workshop: Smooth Games Optimization and Machine Learning »
Simon Lacoste-Julien · Ioannis Mitliagkas · Gauthier Gidel · Vasilis Syrgkanis · Eva Tardos · Leon Bottou · Sebastian Nowozin -
2018 Poster: Hyperbolic Neural Networks »
Octavian Ganea · Gary Becigneul · Thomas Hofmann -
2018 Spotlight: Hyperbolic Neural Networks »
Octavian Ganea · Gary Becigneul · Thomas Hofmann -
2018 Poster: Deep State Space Models for Unconditional Word Generation »
Florian Schmidt · Thomas Hofmann -
2017 Poster: The Numerics of GANs »
Lars Mescheder · Sebastian Nowozin · Andreas Geiger -
2017 Spotlight: The Numerics of GANs »
Lars Mescheder · Sebastian Nowozin · Andreas Geiger -
2017 Poster: Stabilizing Training of Generative Adversarial Networks through Regularization »
Kevin Roth · Aurelien Lucchi · Sebastian Nowozin · Thomas Hofmann -
2016 : Discussion panel »
Ian Goodfellow · Soumith Chintala · Arthur Gretton · Sebastian Nowozin · Aaron Courville · Yann LeCun · Emily Denton -
2016 : Training Generative Neural Samplers using Variational Divergence »
Sebastian Nowozin -
2016 Poster: f-GAN: Training Generative Neural Samplers using Variational Divergence Minimization »
Sebastian Nowozin · Botond Cseke · Ryota Tomioka -
2016 Poster: Adaptive Newton Method for Empirical Risk Minimization to Statistical Accuracy »
Aryan Mokhtari · Hadi Daneshmand · Aurelien Lucchi · Thomas Hofmann · Alejandro Ribeiro -
2016 Poster: DISCO Nets : DISsimilarity COefficients Networks »
Diane Bouchacourt · Pawan K Mudigonda · Sebastian Nowozin -
2015 Poster: Variance Reduced Stochastic Gradient Descent with Neighbors »
Thomas Hofmann · Aurelien Lucchi · Simon Lacoste-Julien · Brian McWilliams -
2014 Workshop: Discrete Optimization in Machine Learning »
Jeffrey A Bilmes · Andreas Krause · Stefanie Jegelka · S Thomas McCormick · Sebastian Nowozin · Yaron Singer · Dhruv Batra · Volkan Cevher -
2014 Poster: Communication-Efficient Distributed Dual Coordinate Ascent »
Martin Jaggi · Virginia Smith · Martin Takac · Jonathan Terhorst · Sanjay Krishnan · Thomas Hofmann · Michael Jordan -
2013 Poster: Decision Jungles: Compact and Rich Models for Classification »
Jamie Shotton · Toby Sharp · Pushmeet Kohli · Sebastian Nowozin · John Winn · Antonio Criminisi -
2011 Workshop: Optimization for Machine Learning »
Suvrit Sra · Stephen Wright · Sebastian Nowozin -
2011 Poster: Higher-Order Correlation Clustering for Image Segmentation »
Sungwoong Kim · Sebastian Nowozin · Pushmeet Kohli · Chang D. D Yoo -
2010 Workshop: Optimization for Machine Learning »
Suvrit Sra · Sebastian Nowozin · Stephen Wright -
2009 Workshop: Optimization for Machine Learning »
Sebastian Nowozin · Suvrit Sra · S.V.N Vishwanathan · Stephen Wright -
2008 Workshop: Optimization for Machine Learning »
Suvrit Sra · Sebastian Nowozin · Vishwanathan S V N