As machine learning permeates more industries and models become more expensive and time-consuming to train, the need for efficient automated hyperparameter optimization (HPO) has never been more pressing. Multi-step, planning-based approaches to hyperparameter optimization promise improved efficiency over myopic alternatives by more effectively balancing exploration and exploitation. However, the potential of these approaches has not been fully realized due to their technical complexity and computational intensity. In this work, we leverage recent advances in Transformer-based, natural-language-interfaced hyperparameter optimization to circumvent these barriers. We build on the recently proposed OptFormer, which casts both hyperparameter suggestion and target function approximation as autoregressive generation, making planning via rollouts simple and efficient. We conduct an extensive exploration of different strategies for performing multi-step planning on top of the OptFormer model to highlight its potential for use in constructing non-myopic HPO strategies.
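To make the idea of planning via rollouts concrete, the following is a minimal sketch of a generic rollout-based planner. It is not the authors' method and not the OptFormer API: `ToyModel`, `rollout_value`, `plan_next`, the one-dimensional search space, and the toy surrogate objective are all hypothetical stand-ins. The only property assumed from the abstract is that a single model can both suggest the next hyperparameter and predict its function value, so imagined trajectories can be generated by alternating the two.

```python
import random
from typing import List, Tuple

History = List[Tuple[float, float]]  # (hyperparameter, observed value) pairs


class ToyModel:
    """Hypothetical stand-in for a learned sequence model (not OptFormer)."""

    def suggest(self, history: History, rng: random.Random) -> float:
        # "Policy" role: perturb the best point so far, or sample uniformly.
        if not history:
            return rng.uniform(0.0, 1.0)
        best_x = max(history, key=lambda pair: pair[1])[0]
        return min(1.0, max(0.0, best_x + rng.gauss(0.0, 0.1)))

    def predict(self, history: History, x: float, rng: random.Random) -> float:
        # "Function" role: noisy guess of the objective (assumed toy surrogate).
        return -(x - 0.7) ** 2 + rng.gauss(0.0, 0.01)


def rollout_value(model: ToyModel, history: History, first_x: float,
                  horizon: int, rng: random.Random) -> float:
    """Simulate `horizon` imagined trials starting with `first_x`.

    Returns the best predicted value seen along the imagined trajectory.
    """
    simulated = list(history)  # copy so the real history is never mutated
    x = first_x
    for _ in range(horizon):
        y = model.predict(simulated, x, rng)
        simulated.append((x, y))
        x = model.suggest(simulated, rng)
    return max(y for _, y in simulated[len(history):])


def plan_next(model: ToyModel, history: History, n_candidates: int = 8,
              n_rollouts: int = 16, horizon: int = 3, seed: int = 0) -> float:
    """Pick the candidate whose rollouts have the highest average value."""
    rng = random.Random(seed)
    candidates = [model.suggest(history, rng) for _ in range(n_candidates)]

    def score(x: float) -> float:
        total = sum(rollout_value(model, history, x, horizon, rng)
                    for _ in range(n_rollouts))
        return total / n_rollouts

    return max(candidates, key=score)


if __name__ == "__main__":
    observed = [(0.2, -0.25)]  # one made-up past trial
    print(plan_next(ToyModel(), observed))
```

With `horizon=1` the planner scores each candidate only by its own predicted value, which corresponds to a myopic choice; larger horizons let later imagined trials influence which first step is preferred. The scoring rule (best value along the rollout, averaged over rollouts) is one simple option among many.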
Author Information
Lucio M Dery (Carnegie Mellon University)
Abram Friesen (DeepMind)
Nando de Freitas (DeepMind)
Marc'Aurelio Ranzato (DeepMind)
Yutian Chen (DeepMind)
More from the Same Authors
-
2021 : Introducing Symmetries to Black Box Meta Reinforcement Learning »
Louis Kirsch · Sebastian Flennerhag · Hado van Hasselt · Abram Friesen · Junhyuk Oh · Yutian Chen -
2022 Poster: Towards Learning Universal Hyperparameter Optimizers with Transformers »
Yutian Chen · Xingyou Song · Chansoo Lee · Zi Wang · Richard Zhang · David Dohan · Kazuya Kawakami · Greg Kochanski · Arnaud Doucet · Marc'Aurelio Ranzato · Sagi Perel · Nando de Freitas -
2021 : Retrospective Panel »
Sergey Levine · Nando de Freitas · Emma Brunskill · Finale Doshi-Velez · Nan Jiang · Rishabh Agarwal -
2021 Poster: Active Offline Policy Selection »
Ksenia Konyushova · Yutian Chen · Thomas Paine · Caglar Gulcehre · Cosmin Paduraru · Daniel Mankowitz · Misha Denil · Nando de Freitas -
2020 : Panel »
Emma Brunskill · Nan Jiang · Nando de Freitas · Finale Doshi-Velez · Sergey Levine · John Langford · Lihong Li · George Tucker · Rishabh Agarwal · Aviral Kumar -
2020 : Offline RL »
Nando de Freitas -
2020 Poster: Critic Regularized Regression »
Ziyu Wang · Alexander Novikov · Konrad Zolna · Josh Merel · Jost Tobias Springenberg · Scott Reed · Bobak Shahriari · Noah Siegel · Caglar Gulcehre · Nicolas Heess · Nando de Freitas -
2020 Poster: Modular Meta-Learning with Shrinkage »
Yutian Chen · Abram Friesen · Feryal Behbahani · Arnaud Doucet · David Budden · Matthew Hoffman · Nando de Freitas -
2020 Spotlight: Modular Meta-Learning with Shrinkage »
Yutian Chen · Abram Friesen · Feryal Behbahani · Arnaud Doucet · David Budden · Matthew Hoffman · Nando de Freitas -
2020 Poster: RL Unplugged: A Suite of Benchmarks for Offline Reinforcement Learning »
Caglar Gulcehre · Ziyu Wang · Alexander Novikov · Thomas Paine · Sergio Gómez · Konrad Zolna · Rishabh Agarwal · Josh Merel · Daniel Mankowitz · Cosmin Paduraru · Gabriel Dulac-Arnold · Jerry Li · Mohammad Norouzi · Matthew Hoffman · Nicolas Heess · Nando de Freitas -
2019 Workshop: Science meets Engineering of Deep Learning »
Levent Sagun · Caglar Gulcehre · Adriana Romero Soriano · Negar Rostamzadeh · Nando de Freitas -
2019 : Welcoming remarks and introduction »
Levent Sagun · Caglar Gulcehre · Adriana Romero Soriano · Negar Rostamzadeh · Nando de Freitas -
2019 Poster: Large Memory Layers with Product Keys »
Guillaume Lample · Alexandre Sablayrolles · Marc'Aurelio Ranzato · Ludovic Denoyer · Herve Jegou -
2019 Spotlight: Large Memory Layers with Product Keys »
Guillaume Lample · Alexandre Sablayrolles · Marc'Aurelio Ranzato · Ludovic Denoyer · Herve Jegou -
2019 Poster: Learning Compositional Neural Programs with Recursive Tree Search and Planning »
Thomas PIERROT · Guillaume Ligner · Scott Reed · Olivier Sigaud · Nicolas Perrin · Alexandre Laterre · David Kas · Karim Beguir · Nando de Freitas -
2019 Spotlight: Learning Compositional Neural Programs with Recursive Tree Search and Planning »
Thomas PIERROT · Guillaume Ligner · Scott Reed · Olivier Sigaud · Nicolas Perrin · Alexandre Laterre · David Kas · Karim Beguir · Nando de Freitas -
2018 : TBA 5 »
Nando de Freitas -
2018 : Invited Talk 5: Nando de Freitas »
Nando de Freitas -
2018 : Invited Speaker #3 Marc'Aurelio Ranzato »
Marc'Aurelio Ranzato -
2018 Poster: Submodular Field Grammars: Representation, Inference, and Application to Image Parsing »
Abram Friesen · Pedro Domingos -
2018 Poster: Playing hard exploration games by watching YouTube »
Yusuf Aytar · Tobias Pfaff · David Budden · Thomas Paine · Ziyu Wang · Nando de Freitas -
2018 Spotlight: Playing hard exploration games by watching YouTube »
Yusuf Aytar · Tobias Pfaff · David Budden · Thomas Paine · Ziyu Wang · Nando de Freitas -
2018 Tutorial: Unsupervised Deep Learning »
Alex Graves · Marc'Aurelio Ranzato -
2017 : Invited talk: Learning to learn without gradient descent by gradient descent. »
Yutian Chen -
2017 Poster: Robust Imitation of Diverse Behaviors »
Ziyu Wang · Josh Merel · Scott Reed · Nando de Freitas · Gregory Wayne · Nicolas Heess -
2017 Poster: Fader Networks:Manipulating Images by Sliding Attributes »
Guillaume Lample · Neil Zeghidour · Nicolas Usunier · Antoine Bordes · Ludovic DENOYER · Marc'Aurelio Ranzato -
2017 Poster: Gradient Episodic Memory for Continual Learning »
David Lopez-Paz · Marc'Aurelio Ranzato -
2017 Tutorial: Deep Learning: Practice and Trends »
Nando de Freitas · Scott Reed · Oriol Vinyals -
2016 Workshop: Neural Abstract Machines & Program Induction »
Matko Bošnjak · Nando de Freitas · Tejas Kulkarni · Arvind Neelakantan · Scott E Reed · Sebastian Riedel · Tim Rocktäschel -
2016 : Nando De Freitas »
Nando de Freitas -
2016 : Learning To Optimize »
Nando de Freitas -
2016 Poster: Learning to learn by gradient descent by gradient descent »
Marcin Andrychowicz · Misha Denil · Sergio Gómez · Matthew Hoffman · David Pfau · Tom Schaul · Nando de Freitas -
2015 Workshop: Bayesian Optimization: Scalability and Flexibility »
Bobak Shahriari · Ryan Adams · Nando de Freitas · Amar Shah · Roberto Calandra -
2015 Symposium: Deep Learning Symposium »
Yoshua Bengio · Marc'Aurelio Ranzato · Honglak Lee · Max Welling · Andrew Y Ng -
2014 Session: Oral Session 4 »
Marc'Aurelio Ranzato -
2013 Poster: DeViSE: A Deep Visual-Semantic Embedding Model »
Andrea Frome · Greg Corrado · Jonathon Shlens · Samy Bengio · Jeff Dean · Marc'Aurelio Ranzato · Tomas Mikolov -
2013 Poster: Predicting Parameters in Deep Learning »
Misha Denil · Babak Shakibi · Laurent Dinh · Marc'Aurelio Ranzato · Nando de Freitas -
2012 Poster: How Prior Probability Influences Decision Making: A Unifying Probabilistic Model »
Yanping Huang · Abram Friesen · Timothy Hanks · Michael N Shadlen · Rajesh PN Rao -
2012 Poster: Large Scale Distributed Deep Networks »
Jeff Dean · Greg Corrado · Rajat Monga · Kai Chen · Matthieu Devin · Quoc V Le · Mark Mao · Marc'Aurelio Ranzato · Andrew Senior · Paul Tucker · Ke Yang · Andrew Y Ng -
2011 Workshop: Challenges in Learning Hierarchical Models: Transfer Learning and Optimization »
Quoc V. Le · Marc'Aurelio Ranzato · Russ Salakhutdinov · Josh Tenenbaum · Andrew Y Ng -
2011 Poster: An ideal observer model for identifying the reference frame of objects »
Joseph L Austerweil · Abram Friesen · Tom Griffiths -
2010 Workshop: Deep Learning and Unsupervised Feature Learning »
Honglak Lee · Marc'Aurelio Ranzato · Yoshua Bengio · Geoffrey E Hinton · Yann LeCun · Andrew Y Ng -
2010 Poster: Generating more realistic images using gated MRF's »
Marc'Aurelio Ranzato · Volodymyr Mnih · Geoffrey E Hinton -
2010 Poster: Phone Recognition with the Mean-Covariance Restricted Boltzmann Machine »
George Dahl · Marc'Aurelio Ranzato · Abdel-rahman Mohamed · Geoffrey E Hinton -
2007 Poster: Sparse Feature Learning for Deep Belief Networks »
Marc'Aurelio Ranzato · Y-Lan Boureau · Yann LeCun -
2006 Poster: Efficient Learning of Sparse Representations with an Energy-Based Model »
Marc'Aurelio Ranzato · Christopher Poultney · Sumit Chopra · Yann LeCun -
2006 Spotlight: Efficient Learning of Sparse Representations with an Energy-Based Model »
Marc'Aurelio Ranzato · Christopher Poultney · Sumit Chopra · Yann LeCun