Probabilistic inference of neural network parameters is challenging because the likelihood function is highly multi-modal. Most importantly, the permutation invariance of the neurons in the hidden layers renders the likelihood function unidentifiable, with a factorial number of equivalent (symmetric) modes, independent of the data. We show that variational Bayesian methods that approximate the (multi-modal) posterior by a (uni-modal) Gaussian distribution are biased towards approximations with identical (e.g. zero-centred) weights, resulting in severe underfitting. This explains the common empirical observation that, in contrast to MCMC methods, variational approximations typically collapse most weights to the (zero-centred) prior. We propose a simple modification to the likelihood function that breaks the symmetry using fixed semi-orthogonal matrices as skip connections in each layer. Initial empirical results show improved predictive performance.
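The permutation symmetry described in the abstract, and the way a fixed skip connection can break it, can be illustrated with a small sketch. This is not the authors' implementation: the one-hidden-layer tanh network, the helper names (`forward`, `permute_hidden`), the weight values, and the choice of a rectangular identity as the semi-orthogonal matrix `P` are all assumptions made for illustration only.

```python
import math
import random


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def forward(W1, b1, W2, x, skip=None):
    """One-hidden-layer tanh network (hypothetical toy model).

    If `skip` is given, the fixed linear map skip @ x is added to the
    hidden activations before the output layer.
    """
    h = [math.tanh(a + b) for a, b in zip(matvec(W1, x), b1)]
    if skip is not None:
        h = [hi + si for hi, si in zip(h, matvec(skip, x))]
    return matvec(W2, h)


def permute_hidden(W1, b1, W2, perm):
    """Relabel hidden units: reorder rows of W1 and b1, and columns of W2."""
    W1p = [W1[j] for j in perm]
    b1p = [b1[j] for j in perm]
    W2p = [[row[j] for j in perm] for row in W2]
    return W1p, b1p, W2p


rng = random.Random(0)
n_in, n_hidden = 2, 4
W1 = [[rng.gauss(0, 1) for _ in range(n_in)] for _ in range(n_hidden)]
b1 = [rng.gauss(0, 1) for _ in range(n_hidden)]
W2 = [[0.5, -1.0, 2.0, 1.5]]  # arbitrary illustrative output weights
x = [0.7, -1.3]

# Assumed fixed semi-orthogonal matrix (n_hidden x n_in): its columns are
# orthonormal, so P^T P is the identity. It is never trained or permuted.
P = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [0.0, 0.0]]

perm = [2, 0, 3, 1]
W1p, b1p, W2p = permute_hidden(W1, b1, W2, perm)

# Without the skip connection, the permuted weights give the same output,
# so both weight settings have the same likelihood for any data set.
plain_gap = abs(forward(W1, b1, W2, x)[0] - forward(W1p, b1p, W2p, x)[0])

# With the fixed skip connection, the permuted weights give a different
# output, so the two settings are no longer equivalent modes.
skip_gap = abs(forward(W1, b1, W2, x, P)[0] - forward(W1p, b1p, W2p, x, P)[0])

print(f"gap without skip: {plain_gap:.2e}")
print(f"gap with skip:    {skip_gap:.2e}")
```

In this toy setting the first gap vanishes up to floating-point rounding while the second does not, because the fixed matrix ties each hidden unit to a specific direction of the input and relabelling the units therefore changes the function.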
Author Information
Richard Kurle (AWS AI Labs)
Tim Januschowski (Amazon Research)
- Director Pricing Platform, Zalando SE - Head of Time Series ML at AWS AI
Jan Gasthaus (Amazon / AWS)
Bernie Wang (AWS AI Labs)
More from the Same Authors
- 2021: Modeling Advection on Directed Graphs using Matérn Gaussian Processes for Traffic Flow
  Nadim Saad · Danielle Maddix · Bernie Wang
- 2022 Poster: On the detrimental effect of invariances in the likelihood for variational inference
  Richard Kurle · Ralf Herbrich · Tim Januschowski · Yuyang (Bernie) Wang · Jan Gasthaus
- 2021: Backpropagation through Back substitution with a Backslash
  Ekin Akyürek · Alan Edelman · Bernie Wang
- 2021 Poster: Neural Flows: Efficient Alternative to Neural ODEs
  Marin Biloš · Johanna Sommer · Syama Sundar Rangapuram · Tim Januschowski · Stephan Günnemann
- 2021 Poster: Detecting Anomalous Event Sequences with Temporal Point Processes
  Oleksandr Shchur · Ali Caner Turkmen · Tim Januschowski · Jan Gasthaus · Stephan Günnemann
- 2021 Poster: Probabilistic Forecasting: A Level-Set Approach
  Hilaf Hasson · Bernie Wang · Tim Januschowski · Jan Gasthaus
- 2021 Poster: Online false discovery rate control for anomaly detection in time series
  Quentin Rebjock · Baris Kurt · Tim Januschowski · Laurent Callot
- 2021 Poster: Deep Explicit Duration Switching Models for Time Series
  Abdul Fatir Ansari · Konstantinos Benidis · Richard Kurle · Ali Caner Turkmen · Harold Soh · Alexander Smola · Bernie Wang · Tim Januschowski
- 2021 Poster: Latent Matters: Learning Deep State-Space Models
  Alexej Klushyn · Richard Kurle · Maximilian Soelch · Botond Cseke · Patrick van der Smagt
- 2020 Poster: Deep Rao-Blackwellised Particle Filters for Time Series Forecasting
  Richard Kurle · Syama Sundar Rangapuram · Emmanuel de Bézenac · Stephan Günnemann · Jan Gasthaus
- 2020 Poster: Normalizing Kalman Filters for Multivariate Time Series Analysis
  Emmanuel de Bézenac · Syama Sundar Rangapuram · Konstantinos Benidis · Michael Bohlke-Schneider · Richard Kurle · Lorenzo Stella · Hilaf Hasson · Patrick Gallinari · Tim Januschowski
- 2019 Poster: High-dimensional multivariate forecasting with low-rank Gaussian Copula Processes
  David Salinas · Michael Bohlke-Schneider · Laurent Callot · Roberto Medico · Jan Gasthaus
- 2019 Poster: Learning Hierarchical Priors in VAEs
  Alexej Klushyn · Nutan Chen · Richard Kurle · Botond Cseke · Patrick van der Smagt
- 2019 Spotlight: Learning Hierarchical Priors in VAEs
  Alexej Klushyn · Nutan Chen · Richard Kurle · Botond Cseke · Patrick van der Smagt
- 2018 Poster: Deep State Space Models for Time Series Forecasting
  Syama Sundar Rangapuram · Matthias W Seeger · Jan Gasthaus · Lorenzo Stella · Bernie Wang · Tim Januschowski