Timezone: »
Many of the recent triumphs in machine learning are dependent on well-tuned hyperparameters. This is particularly prominent in reinforcement learning (RL) where a small change in the configuration can lead to failure. Despite the importance of tuning hyperparameters, it remains expensive and is often done in a naive and laborious way. A recent solution to this problem is Population Based Training (PBT) which updates both weights and hyperparameters in a \emph{single training run} of a population of agents. PBT has been shown to be particularly effective in RL, leading to widespread use in the field. However, PBT lacks theoretical guarantees since it relies on random heuristics to explore the hyperparameter space. This inefficiency means it typically requires vast computational resources, which is prohibitive for many small and medium sized labs. In this work, we introduce the first provably efficient PBT-style algorithm, Population-Based Bandits (PB2). PB2 uses a probabilistic model to guide the search in an efficient way, making it possible to discover high performing hyperparameter configurations with far fewer agents than typically required by PBT. We show in a series of RL experiments that PB2 is able to achieve high performance with a modest computational budget.
Author Information
Jack Parker-Holder (University of Oxford)
Vu Nguyen (Amazon Research Adelaide)
Stephen J Roberts (University of Oxford)
More from the Same Authors
-
2021 : MiniHack the Planet: A Sandbox for Open-Ended Reinforcement Learning Research »
Mikayel Samvelyan · Robert Kirk · Vitaly Kurin · Jack Parker-Holder · Minqi Jiang · Eric Hambro · Fabio Petroni · Heinrich Kuttler · Edward Grefenstette · Tim Rocktäschel -
2021 : HumBugDB: A Large-scale Acoustic Mosquito Dataset »
Ivan Kiskin · Marianne Sinka · Adam Cobb · Waqas Rafique · Lawrence Wang · Davide Zilli · Benjamin Gutteridge · Rinita Dam · Theodoros Marinos · Yunpeng Li · Dickson Msaky · Emmanuel Kaindoa · Gerard Killeen · Eva Herreros-Moya · Kathy Willis · Stephen J Roberts -
2021 : Grounding Aleatoric Uncertainty in Unsupervised Environment Design »
Minqi Jiang · Michael Dennis · Jack Parker-Holder · Andrei Lupu · Heinrich Kuttler · Edward Grefenstette · Tim Rocktäschel · Jakob Foerster -
2021 : That Escalated Quickly: Compounding Complexity by Editing Levels at the Frontier of Agent Capabilities »
Jack Parker-Holder · Minqi Jiang · Michael Dennis · Mikayel Samvelyan · Jakob Foerster · Edward Grefenstette · Tim Rocktäschel -
2021 : Return Dispersion as an Estimator of Learning Potential for Prioritized Level Replay »
Iryna Korshunova · Minqi Jiang · Jack Parker-Holder · Tim Rocktäschel · Edward Grefenstette -
2021 : Relaxed-Responsibility Hierarchical Discrete VAEs »
Matthew Willetts · Xenia Miscouridou · Stephen J Roberts · Chris C Holmes -
2021 : On-the-fly Strategy Adaptation for ad-hoc Agent Coordination »
Jaleh Zand · Jack Parker-Holder · Stephen J Roberts -
2022 : Distributionally Robust Bayesian Optimization with φ-divergences »
Hisham Husain · Vu Nguyen · Anton van den Hengel -
2022 : Panel on Open Problems in Machine Learning Systems »
Ivana Dusparic · Stephen J Roberts · Morine Amutorine · Jerome White · Murtuza Shergadwala -
2021 : HumBugDB: A Large-scale Acoustic Mosquito Dataset »
Ivan Kiskin · Marianne Sinka · Adam Cobb · Waqas Rafique · Lawrence Wang · Davide Zilli · Benjamin Gutteridge · Rinita Dam · Theodoros Marinos · Yunpeng Li · Dickson Msaky · Emmanuel Kaindoa · Gerard Killeen · Eva Herreros-Moya · Kathy Willis · Stephen J Roberts -
2021 : The NetHack Challenge + Q&A »
Eric Hambro · Sharada Mohanty · Dipam Chakrabroty · Edward Grefenstette · Minqi Jiang · Robert Kirk · Vitaly Kurin · Heinrich Kuttler · Vegard Mella · Nantas Nardelli · Jack Parker-Holder · Roberta Raileanu · Tim Rocktäschel · Danielle Rothermel · Mikayel Samvelyan -
2021 Poster: Replay-Guided Adversarial Environment Design »
Minqi Jiang · Michael Dennis · Jack Parker-Holder · Jakob Foerster · Edward Grefenstette · Tim Rocktäschel -
2021 Poster: Tactical Optimism and Pessimism for Deep Reinforcement Learning »
Ted Moskovitz · Jack Parker-Holder · Aldo Pacchiano · Michael Arbel · Michael Jordan -
2021 Poster: Tuning Mixed Input Hyperparameters on the Fly for Efficient Population Based AutoRL »
Jack Parker-Holder · Vu Nguyen · Shaan Desai · Stephen J Roberts -
2020 Poster: Ridge Rider: Finding Diverse Solutions by Following Eigenvectors of the Hessian »
Jack Parker-Holder · Luke Metz · Cinjon Resnick · Hengyuan Hu · Adam Lerer · Alistair Letcher · Alexander Peysakhovich · Aldo Pacchiano · Jakob Foerster -
2020 Poster: Effective Diversity in Population Based Reinforcement Learning »
Jack Parker-Holder · Aldo Pacchiano · Krzysztof M Choromanski · Stephen J Roberts -
2020 Poster: Gaussian Process Bandit Optimization of the Thermodynamic Variational Objective »
Vu Nguyen · Vaden Masrani · Rob Brekelmans · Michael A Osborne · Frank Wood -
2020 Spotlight: Effective Diversity in Population Based Reinforcement Learning »
Jack Parker-Holder · Aldo Pacchiano · Krzysztof M Choromanski · Stephen J Roberts -
2020 Poster: Explicit Regularisation in Gaussian Noise Injections »
Alexander Camuto · Matthew Willetts · Umut Simsekli · Stephen J Roberts · Chris C Holmes -
2020 Poster: Bayesian Optimization for Iterative Learning »
Vu Nguyen · Sebastian Schulze · Michael A Osborne -
2019 : Poster Session »
Gergely Flamich · Shashanka Ubaru · Charles Zheng · Josip Djolonga · Kristoffer Wickstrøm · Diego Granziol · Konstantinos Pitas · Jun Li · Robert Williamson · Sangwoong Yoon · Kwot Sin Lee · Julian Zilly · Linda Petrini · Ian Fischer · Zhe Dong · Alexander Alemi · Bao-Ngoc Nguyen · Rob Brekelmans · Tailin Wu · Aditya Mahajan · Alexander Li · Kirankumar Shiragur · Yair Carmon · Linara Adilova · SHIYU LIU · Bang An · Sanjeeb Dash · Oktay Gunluk · Arya Mazumdar · Mehul Motani · Julia Rosenzweig · Michael Kamp · Marton Havasi · Leighton P Barnes · Zhengqing Zhou · Yi Hao · Dylan Foster · Yuval Benjamini · Nati Srebro · Michael Tschannen · Paul Rubenstein · Sylvain Gelly · John Duchi · Aaron Sidford · Robin Ru · Stefan Zohren · Murtaza Dalal · Michael A Osborne · Stephen J Roberts · Moses Charikar · Jayakumar Subramanian · Xiaodi Fan · Max Schwarzer · Nicholas Roberts · Simon Lacoste-Julien · Vinay Prabhu · Aram Galstyan · Greg Ver Steeg · Lalitha Sankar · Yung-Kyun Noh · Gautam Dasarathy · Frank Park · Ngai-Man (Man) Cheung · Ngoc-Trung Tran · Linxiao Yang · Ben Poole · Andrea Censi · Tristan Sylvain · R Devon Hjelm · Bangjie Liu · Jose Gallego-Posada · Tyler Sypherd · Kai Yang · Jan Nikolas Morshuis -
2019 : Poster Session »
Eduard Gorbunov · Alexandre d'Aspremont · Lingxiao Wang · Liwei Wang · Boris Ginsburg · Alessio Quaglino · Camille Castera · Saurabh Adya · Diego Granziol · Rudrajit Das · Raghu Bollapragada · Fabian Pedregosa · Martin Takac · Majid Jahani · Sai Praneeth Karimireddy · Hilal Asi · Balint Daroczy · Leonard Adolphs · Aditya Rawal · Nicolas Brandt · Minhan Li · Giuseppe Ughi · Orlando Romero · Ivan Skorokhodov · Damien Scieur · Kiwook Bae · Konstantin Mishchenko · Rohan Anil · Vatsal Sharan · Aditya Balu · Chao Chen · Zhewei Yao · Tolga Ergen · Paul Grigas · Chris Junchi Li · Jimmy Ba · Stephen J Roberts · Sharan Vaswani · Armin Eftekhari · Chhavi Sharma -
2019 Poster: From Complexity to Simplicity: Adaptive ES-Active Subspaces for Blackbox Optimization »
Krzysztof M Choromanski · Aldo Pacchiano · Jack Parker-Holder · Yunhao Tang · Vikas Sindhwani -
2017 : Cost-sensitive detection with variational autoencoders for environmental acoustic sensing »
Yunpeng Li · Stephen J Roberts -
2017 : Contributed talk: Safe Policy Search with Gaussian Process Models »
Kyriakos Polymenakos · Stephen J Roberts -
2014 Poster: Sampling for Inference in Probabilistic Models with Fast Bayesian Quadrature »
Tom Gunter · Michael A Osborne · Roman Garnett · Philipp Hennig · Stephen J Roberts -
2012 Poster: Active Learning of Model Evidence Using Bayesian Quadrature »
Michael A Osborne · David Duvenaud · Roman Garnett · Carl Edward Rasmussen · Stephen J Roberts · Zoubin Ghahramani -
2006 Poster: Bayesian Image Super-resolution, Continued »
Lyndsey C Pickup · David Capel · Stephen J Roberts · Andrew Zisserman -
2006 Spotlight: Bayesian Image Super-resolution, Continued »
Lyndsey C Pickup · David Capel · Stephen J Roberts · Andrew Zisserman