Black box variational inference (BBVI) with reparameterization gradients triggered the exploration of divergence measures other than the Kullback-Leibler (KL) divergence, such as alpha divergences. These divergences can be tuned to be more mass-covering (preventing overfitting in complex models), but are also often harder to optimize using Monte Carlo gradients. In this paper, we view BBVI with generalized divergences as a form of biased importance sampling. The choice of divergence determines a bias-variance tradeoff between the tightness of the bound (low bias) and the variance of its gradient estimators. Drawing on variational perturbation theory from statistical physics, we use these insights to construct a new variational bound that is tighter than the KL bound and more mass-covering. Compared to alpha divergences, its reparameterization gradients have lower variance. We show in several experiments on Gaussian processes and variational autoencoders that the resulting posterior covariances are closer to the true posterior and lead to higher likelihoods on held-out data.
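As background for the abstract above, the following is a minimal sketch of BBVI with the reparameterization trick (the standard KL/ELBO setting the paper generalizes), not the paper's proposed bound. It fits a Gaussian q(z) = N(mu, sigma^2) to a toy unnormalized target log p(z) = -z^2/2 (a standard normal), so the optimum is mu = 0, sigma = 1. All names and hyperparameters here are illustrative assumptions.

```python
import numpy as np

# Toy unnormalized target log-density: log p(z) = -z^2 / 2.
def log_p(z):
    return -0.5 * z ** 2

rng = np.random.default_rng(0)
mu, log_sigma = 2.0, 1.0   # deliberately poor initialization
lr, n_samples = 0.05, 64   # step size and Monte Carlo sample count

for step in range(2000):
    sigma = np.exp(log_sigma)
    eps = rng.standard_normal(n_samples)
    z = mu + sigma * eps                 # reparameterized sample: z = mu + sigma * eps
    # Pathwise (reparameterization) gradients of the ELBO
    #   ELBO = E_q[log p(z)] + entropy(q),
    # where for this target d(log p)/dz = -z, and the entropy of
    # N(mu, sigma^2) has derivative 1 with respect to log_sigma.
    dlogp_dz = -z
    grad_mu = dlogp_dz.mean()                               # dz/dmu = 1
    grad_log_sigma = (dlogp_dz * eps).mean() * sigma + 1.0  # dz/dlog_sigma = sigma * eps
    mu += lr * grad_mu                   # gradient ascent on the ELBO
    log_sigma += lr * grad_log_sigma

# After training, (mu, sigma) should be close to the true posterior (0, 1),
# up to Monte Carlo noise from the stochastic gradients.
print(mu, np.exp(log_sigma))
```

The same estimator structure carries over when the KL divergence is replaced by an alpha divergence or the perturbative bound discussed in the abstract; what changes is the objective whose pathwise gradient is taken, and with it the variance of the estimator.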
Author Information
Robert Bamler (Disney Research)
Robert Bamler is a Postdoctoral Associate at Disney Research. He works on scalable methods for approximate Bayesian inference and on applications to natural language processing. Robert received his PhD in theoretical condensed matter physics from the University of Cologne, Germany, in 2016.
Cheng Zhang (Disney Research)
Manfred Opper (TU Berlin)
Stephan Mandt (Disney Research)
More from the Same Authors
-
2021 : Accurate Imputation and Efficient Data Acquisition with Transformer-based VAEs »
Sarah Lewis · Tatiana Matejovicova · Yingzhen Li · Angus Lamb · Yordan Zaykov · Miltiadis Allamanis · Cheng Zhang -
2021 : Deterministic particle flows for constraining SDEs »
Dimitra Maoutsa · Manfred Opper -
2021 Poster: Sparse Uncertainty Representation in Deep Learning with Inducing Weights »
Hippolyt Ritter · Martin Kukla · Cheng Zhang · Yingzhen Li -
2021 Poster: Identifiable Generative models for Missing Not at Random Data Imputation »
Chao Ma · Cheng Zhang -
2017 : Introduction »
Cheng Zhang · Francisco Ruiz · Dustin Tran · James McInerney · Stephan Mandt -
2017 Workshop: Advances in Approximate Bayesian Inference »
Francisco Ruiz · Stephan Mandt · Cheng Zhang · James McInerney · Dustin Tran · David Blei · Max Welling · Tamara Broderick · Michalis Titsias -
2015 Workshop: Modelling and inference for dynamics on complex interaction networks: joining up machine learning and statistical physics »
Manfred Opper · Yasser Roudi · Peter Sollich -
2014 Poster: Poisson Process Jumping between an Unknown Number of Rates: Application to Neural Spike Data »
Florian Stimberg · Andreas Ruttor · Manfred Opper -
2014 Spotlight: Poisson Process Jumping between an Unknown Number of Rates: Application to Neural Spike Data »
Florian Stimberg · Andreas Ruttor · Manfred Opper -
2014 Poster: Optimal Neural Codes for Control and Estimation »
Alex K Susemihl · Ron Meir · Manfred Opper -
2013 Poster: Approximate inference in latent Gaussian-Markov models from continuous time observations »
Botond Cseke · Manfred Opper · Guido Sanguinetti -
2013 Spotlight: Approximate inference in latent Gaussian-Markov models from continuous time observations »
Botond Cseke · Manfred Opper · Guido Sanguinetti -
2013 Poster: Approximate Gaussian process inference for the drift function in stochastic differential equations »
Andreas Ruttor · Philipp Batz · Manfred Opper -
2011 Poster: Analytical Results for the Error in Filtering of Gaussian Processes »
Alex K Susemihl · Ron Meir · Manfred Opper