A recently proposed formulation of the stochastic planning and control problem as one of parameter estimation for suitable artificial statistical models has led to the adoption of inference algorithms for this notoriously hard problem. At the algorithmic level, the focus has been on developing Expectation-Maximization (EM) algorithms. In this paper, we begin by making the crucial observation that the stochastic control problem can be reinterpreted as one of trans-dimensional inference. With this new understanding, we are able to propose a novel reversible jump Markov chain Monte Carlo (MCMC) algorithm that is more efficient than its EM counterparts. Moreover, it enables us to carry out full Bayesian policy search, without the need for gradients and with a single Markov chain. The new approach involves sampling directly from a distribution that is proportional to the reward and, consequently, performs better than classic simulation methods in situations where the reward is a rare event.
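The idea of sampling policy parameters from a distribution proportional to the reward can be illustrated with a much simpler sampler than the one the abstract describes. The sketch below is not the paper's reversible jump algorithm: it is a plain fixed-dimension random-walk Metropolis sampler on a one-dimensional toy problem. The Gaussian prior, the Gaussian-shaped reward, and all names (`log_prior`, `log_reward`, `sample_policy_parameters`) are assumptions made for illustration only.

```python
import math
import random


def log_prior(theta):
    # Assumed standard normal prior over a scalar policy parameter.
    return -0.5 * theta ** 2


def log_reward(theta, goal=2.0, width=0.2):
    # Assumed toy reward: a narrow bump around `goal`, so high reward is
    # rare for parameters drawn from the prior.
    return -((theta - goal) ** 2) / (2.0 * width ** 2)


def sample_policy_parameters(n_steps=20000, step_size=0.5, seed=0):
    # Random-walk Metropolis targeting p(theta) proportional to
    # prior(theta) * reward(theta). No gradients are used.
    rng = random.Random(seed)
    theta = 0.0
    log_p = log_prior(theta) + log_reward(theta)
    samples = []
    for _ in range(n_steps):
        proposal = theta + rng.gauss(0.0, step_size)
        log_p_new = log_prior(proposal) + log_reward(proposal)
        # Accept with probability min(1, p_new / p_old).
        if rng.random() < math.exp(min(0.0, log_p_new - log_p)):
            theta, log_p = proposal, log_p_new
        samples.append(theta)
    return samples


samples = sample_policy_parameters()
kept = samples[2000:]  # discard an initial burn-in portion
posterior_mean = sum(kept) / len(kept)
print(f"estimated posterior mean of theta: {posterior_mean:.3f}")
```

Under these assumptions the target is a product of two Gaussians, so its exact mean is 50/26 (about 1.92), which gives a simple check on the chain. The chain spends its time where the reward is high, rather than drawing parameters from the prior and hoping to hit the rare rewarding region.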
Author Information
Matthew Hoffman (DeepMind)
Arnaud Doucet (Oxford)
Nando de Freitas (University of Oxford)
Ajay Jasra (Imperial College London)
Related Events (a corresponding poster, oral, or spotlight)
- 2007 Poster: Bayesian Policy Learning with Trans-Dimensional MCMC
  Mon. Dec 3rd 06:30 -- 06:40 PM
More from the Same Authors
- 2021 Test Of Time: Online Learning for Latent Dirichlet Allocation
  Matthew Hoffman · Francis Bach · David Blei
- 2020 Poster: Modular Meta-Learning with Shrinkage
  Yutian Chen · Abram Friesen · Feryal Behbahani · Arnaud Doucet · David Budden · Matthew Hoffman · Nando de Freitas
- 2020 Spotlight: Modular Meta-Learning with Shrinkage
  Yutian Chen · Abram Friesen · Feryal Behbahani · Arnaud Doucet · David Budden · Matthew Hoffman · Nando de Freitas
- 2020 Poster: RL Unplugged: A Suite of Benchmarks for Offline Reinforcement Learning
  Caglar Gulcehre · Ziyu Wang · Alexander Novikov · Thomas Paine · Sergio Gómez · Konrad Zolna · Rishabh Agarwal · Josh Merel · Daniel Mankowitz · Cosmin Paduraru · Gabriel Dulac-Arnold · Jerry Li · Mohammad Norouzi · Matthew Hoffman · Nicolas Heess · Nando de Freitas
- 2019 Poster: Augmented Neural ODEs
  Emilien Dupont · Arnaud Doucet · Yee Whye Teh
- 2018 Poster: Hamiltonian Variational Auto-Encoder
  Anthony Caterini · Arnaud Doucet · Dino Sejdinovic
- 2017 Poster: Filtering Variational Objectives
  Chris Maddison · John Lawson · George Tucker · Nicolas Heess · Mohammad Norouzi · Andriy Mnih · Arnaud Doucet · Yee Teh
- 2017 Poster: Clone MCMC: Parallel High-Dimensional Gaussian Gibbs Sampling
  Andrei-Cristian Barbos · Francois Caron · Jean-François Giovannelli · Arnaud Doucet
- 2016 Poster: Learning to learn by gradient descent by gradient descent
  Marcin Andrychowicz · Misha Denil · Sergio Gómez · Matthew Hoffman · David Pfau · Tom Schaul · Nando de Freitas
- 2016 Poster: Learning to Communicate with Deep Multi-Agent Reinforcement Learning
  Jakob Foerster · Yannis Assael · Nando de Freitas · Shimon Whiteson
- 2015: Information based methods for Black-box Optimization
  Matthew Hoffman
- 2015 Workshop: Scalable Monte Carlo Methods for Bayesian Analysis of Big Data
  Babak Shahbaba · Yee Whye Teh · Max Welling · Arnaud Doucet · Christophe Andrieu · Sebastian J. Vollmer · Pierre Jacob
- 2015 Poster: Expectation Particle Belief Propagation
  Thibaut Lienart · Yee Whye Teh · Arnaud Doucet
- 2014 Workshop: Bayesian Optimization in Academia and Industry
  Zoubin Ghahramani · Ryan Adams · Matthew Hoffman · Kevin Swersky · Jasper Snoek
- 2014 Poster: Asynchronous Anytime Sequential Monte Carlo
  Brooks Paige · Frank Wood · Arnaud Doucet · Yee Whye Teh
- 2014 Oral: Asynchronous Anytime Sequential Monte Carlo
  Brooks Paige · Frank Wood · Arnaud Doucet · Yee Whye Teh
- 2014 Poster: Predictive Entropy Search for Efficient Global Optimization of Black-box Functions
  José Miguel Hernández-Lobato · Matthew Hoffman · Zoubin Ghahramani
- 2014 Spotlight: Predictive Entropy Search for Efficient Global Optimization of Black-box Functions
  José Miguel Hernández-Lobato · Matthew Hoffman · Zoubin Ghahramani
- 2014 Poster: Distributed Parameter Estimation in Probabilistic Graphical Models
  Yariv D Mizrahi · Misha Denil · Nando de Freitas
- 2013 Workshop: Bayesian Optimization in Theory and Practice
  Matthew Hoffman · Jasper Snoek · Nando de Freitas · Michael A Osborne · Ryan Adams · Sebastien Bubeck · Philipp Hennig · Remi Munos · Andreas Krause
- 2013 Workshop: Deep Learning
  Yoshua Bengio · Hugo Larochelle · Russ Salakhutdinov · Tomas Mikolov · Matthew D Zeiler · David Mcallester · Nando de Freitas · Josh Tenenbaum · Jian Zhou · Volodymyr Mnih
- 2011 Workshop: Bayesian optimization, experimental design and bandits: Theory and applications
  Nando de Freitas · Roman Garnett · Frank R Hutter · Michael A Osborne
- 2010 Session: Spotlights Session 10
  Nando de Freitas
- 2010 Session: Oral Session 12
  Nando de Freitas
- 2009 Workshop: Adaptive Sensing, Active Learning, and Experimental Design
  Rui M Castro · Nando de Freitas · Ruben Martinez-Cantin
- 2009 Poster: Bayesian Nonparametric Models on Decomposable Graphs
  Francois Caron · Arnaud Doucet
- 2009 Tutorial: Sequential Monte-Carlo Methods
  Arnaud Doucet · Nando de Freitas
- 2008 Poster: An interior-point stochastic approximation method and an L1-regularized delta rule
  Peter Carbonetto · Mark Schmidt · Nando de Freitas
- 2008 Oral: An interior-point stochastic approximation method and an L1-regularized delta rule
  Peter Carbonetto · Mark Schmidt · Nando de Freitas
- 2008 Demonstration: Worio: A Web-Scale Machine Learning System
  Nando de Freitas · Ali Davar
- 2007 Poster: Active Preference Learning with Discrete Choice Data
  Eric Brochu · Nando de Freitas · Abhijeet Ghosh
- 2006 Poster: Conditional mean field
  Peter Carbonetto · Nando de Freitas