Timezone: »
Meta reinforcement learning (RL) attempts to discover new RL algorithms automatically from environment interaction. In so-called black-box approaches, the policy and the learning algorithm are jointly represented by a single neural network. These methods are very flexible, but they tend to underperform in terms of generalisation to new, unseen environments. In this paper, we explore the role of symmetries in meta-generalisation. We show that a recent successful meta RL approach that meta-learns an objective for backpropagation-based learning exhibits certain symmetries (specifically the reuse of the learning rule, and invariance to input and output permutations) that are not present in typical black-box meta RL systems. We hypothesise that these symmetries can play an important role in meta-generalisation. Building off recent work in black-box supervised meta learning, we develop a black-box meta RL system that exhibits these same symmetries. We show through careful experimentation that incorporating these symmetries can lead to algorithms with a greater ability to generalise to unseen action & observation spaces, tasks, and environments.
Author Information
Louis Kirsch (The Swiss AI Lab IDSIA)
Sebastian Flennerhag (DeepMind)
Ph.D. candidate in Deep Learning, focusing on network adaptation in transfer learning, meta learning and sequence learning.
Hado van Hasselt (DeepMind)
Abram Friesen (DeepMind)
Junhyuk Oh (DeepMind)
Yutian Chen (DeepMind)
More from the Same Authors
-
2020 : Meta-Learning Backpropagation And Improving It »
Louis Kirsch -
2021 : Exploring through Random Curiosity with General Value Functions »
Aditya Ramesh · Louis Kirsch · Sjoerd van Steenkiste · Jürgen Schmidhuber -
2021 : Introducing Symmetries to Black Box Meta Reinforcement Learning »
Louis Kirsch · Sebastian Flennerhag · Hado van Hasselt · Abram Friesen · Junhyuk Oh · Yutian Chen -
2022 : Meta-Learning General-Purpose Learning Algorithms with Transformers »
Louis Kirsch · Luke Metz · James Harrison · Jascha Sohl-Dickstein -
2022 : Optimistic Meta-Gradients »
Sebastian Flennerhag · Tom Zahavy · Brendan O'Donoghue · Hado van Hasselt · András György · Satinder Singh -
2022 : Meta-Learning General-Purpose Learning Algorithms with Transformers »
Louis Kirsch · Luke Metz · James Harrison · Jascha Sohl-Dickstein -
2022 : The Benefits of Model-Based Generalization in Reinforcement Learning »
Kenny Young · Aditya Ramesh · Louis Kirsch · Jürgen Schmidhuber -
2022 Poster: Exploring through Random Curiosity with General Value Functions »
Aditya Ramesh · Louis Kirsch · Sjoerd van Steenkiste · Jürgen Schmidhuber -
2021 : Bootstrapped Meta-Learning »
Sebastian Flennerhag · Yannick Schroecker · Tom Zahavy · Hado van Hasselt · David Silver · Satinder Singh -
2021 Poster: Discovery of Options via Meta-Learned Subgoals »
Vivek Veeriah · Tom Zahavy · Matteo Hessel · Zhongwen Xu · Junhyuk Oh · Iurii Kemaev · Hado van Hasselt · David Silver · Satinder Singh -
2021 Poster: Active Offline Policy Selection »
Ksenia Konyushova · Yutian Chen · Thomas Paine · Caglar Gulcehre · Cosmin Paduraru · Daniel Mankowitz · Misha Denil · Nando de Freitas -
2021 Poster: Meta Learning Backpropagation And Improving It »
Louis Kirsch · Jürgen Schmidhuber -
2021 Poster: Self-Consistent Models and Values »
Greg Farquhar · Kate Baumli · Zita Marinho · Angelos Filos · Matteo Hessel · Hado van Hasselt · David Silver -
2020 : Q/A for invited talk #4 »
Louis Kirsch -
2020 : General meta-learning »
Louis Kirsch -
2020 Poster: Discovering Reinforcement Learning Algorithms »
Junhyuk Oh · Matteo Hessel · Wojciech Czarnecki · Zhongwen Xu · Hado van Hasselt · Satinder Singh · David Silver -
2020 Poster: Meta-Gradient Reinforcement Learning with an Objective Discovered Online »
Zhongwen Xu · Hado van Hasselt · Matteo Hessel · Junhyuk Oh · Satinder Singh · David Silver -
2020 Poster: Modular Meta-Learning with Shrinkage »
Yutian Chen · Abram Friesen · Feryal Behbahani · Arnaud Doucet · David Budden · Matthew Hoffman · Nando de Freitas -
2020 Spotlight: Modular Meta-Learning with Shrinkage »
Yutian Chen · Abram Friesen · Feryal Behbahani · Arnaud Doucet · David Budden · Matthew Hoffman · Nando de Freitas -
2020 Poster: A Self-Tuning Actor-Critic Algorithm »
Tom Zahavy · Zhongwen Xu · Vivek Veeriah · Matteo Hessel · Junhyuk Oh · Hado van Hasselt · David Silver · Satinder Singh -
2020 Poster: Forethought and Hindsight in Credit Assignment »
Veronica Chelu · Doina Precup · Hado van Hasselt -
2019 : Coffee/Poster session 1 »
Shiro Takagi · Khurram Javed · Johanna Sommer · Amr Sharaf · Pierluca D'Oro · Ying Wei · Sivan Doveh · Colin White · Santiago Gonzalez · Cuong Nguyen · Mao Li · Tianhe Yu · Tiago Ramalho · Masahiro Nomura · Ahsan Alvi · Jean-Francois Ton · W. Ronny Huang · Jessica Lee · Sebastian Flennerhag · Michael Zhang · Abram Friesen · Paul Blomstedt · Alina Dubatovka · Sergey Bartunov · Subin Yi · Iaroslav Shcherbatyi · Christian Simon · Zeyuan Shang · David MacLeod · Lu Liu · Liam Fowl · Diego Mesquita · Deirdre Quillen -
2019 : Poster session »
Sebastian Farquhar · Erik Daxberger · Andreas Look · Matt Benatan · Ruiyi Zhang · Marton Havasi · Fredrik Gustafsson · James A Brofos · Nabeel Seedat · Micha Livne · Ivan Ustyuzhaninov · Adam Cobb · Felix D McGregor · Patrick McClure · Tim R. Davidson · Gaurush Hiranandani · Sanjeev Arora · Masha Itkina · Didrik Nielsen · William Harvey · Matias Valdenegro-Toro · Stefano Peluchetti · Riccardo Moriconi · Tianyu Cui · Vaclav Smidl · Taylan Cemgil · Jack Fitzsimons · He Zhao · · mariana vargas vieyra · Apratim Bhattacharyya · Rahul Sharma · Geoffroy Dubourg-Felonneau · Jonathan Warrell · Slava Voloshynovskiy · Mihaela Rosca · Jiaming Song · Andrew Ross · Homa Fashandi · Ruiqi Gao · Hooshmand Shokri Razaghi · Joshua Chang · Zhenzhong Xiao · Vanessa Boehm · Giorgio Giannone · Ranganath Krishnan · Joe Davison · Arsenii Ashukha · Jeremiah Liu · Sicong (Sheldon) Huang · Evgenii Nikishin · Sunho Park · Nilesh Ahuja · Mahesh Subedar · · Artyom Gadetsky · Jhosimar Arias Figueroa · Tim G. J. Rudner · Waseem Aslam · Adrián Csiszárik · John Moberg · Ali Hebbal · Kathrin Grosse · Pekka Marttinen · Bang An · Hlynur Jónsson · Samuel Kessler · Abhishek Kumar · Mikhail Figurnov · Omesh Tickoo · Steindor Saemundsson · Ari Heljakka · Dániel Varga · Niklas Heim · Simone Rossi · Max Laves · Waseem Gharbieh · Nicholas Roberts · Luis Armando Pérez Rey · Matthew Willetts · Prithvijit Chakrabarty · Sumedh Ghaisas · Carl Shneider · Wray Buntine · Kamil Adamczewski · Xavier Gitiaux · Suwen Lin · Hao Fu · Gunnar Rätsch · Aidan Gomez · Erik Bodin · Dinh Phung · Lennart Svensson · Juliano Tusi Amaral Laganá Pinto · Milad Alizadeh · Jianzhun Du · Kevin Murphy · Beatrix Benkő · Shashaank Vattikuti · Jonathan Gordon · Christopher Kanan · Sontje Ihler · Darin Graham · Michael Teng · Louis Kirsch · Tomas Pevny · Taras Holotyak -
2019 Poster: Discovery of Useful Questions as Auxiliary Tasks »
Vivek Veeriah · Matteo Hessel · Zhongwen Xu · Janarthanan Rajendran · Richard L Lewis · Junhyuk Oh · Hado van Hasselt · David Silver · Satinder Singh -
2019 Poster: When to use parametric models in reinforcement learning? »
Hado van Hasselt · Matteo Hessel · John Aslanides -
2019 Poster: Hindsight Credit Assignment »
Anna Harutyunyan · Will Dabney · Thomas Mesnard · Mohammad Gheshlaghi Azar · Bilal Piot · Nicolas Heess · Hado van Hasselt · Gregory Wayne · Satinder Singh · Doina Precup · Remi Munos -
2019 Spotlight: Hindsight Credit Assignment »
Anna Harutyunyan · Will Dabney · Thomas Mesnard · Mohammad Gheshlaghi Azar · Bilal Piot · Nicolas Heess · Hado van Hasselt · Gregory Wayne · Satinder Singh · Doina Precup · Remi Munos -
2018 : Transferring Knowledge across Learning Processes »
Sebastian Flennerhag -
2018 Poster: Modular Networks: Learning to Decompose Neural Computation »
Louis Kirsch · Julius Kunze · David Barber -
2018 Poster: Meta-Gradient Reinforcement Learning »
Zhongwen Xu · Hado van Hasselt · David Silver -
2018 Poster: Breaking the Activation Function Bottleneck through Adaptive Parameterization »
Sebastian Flennerhag · Hujun Yin · John Keane · Mark Elliot -
2017 : Invited talk: Learning to learn without gradient descent by gradient descent. »
Yutian Chen -
2017 Poster: Natural Value Approximators: Learning when to Trust Past Estimates »
Zhongwen Xu · Joseph Modayil · Hado van Hasselt · Andre Barreto · David Silver · Tom Schaul -
2017 Poster: Successor Features for Transfer in Reinforcement Learning »
Andre Barreto · Will Dabney · Remi Munos · Jonathan Hunt · Tom Schaul · David Silver · Hado van Hasselt -
2017 Spotlight: Successor Features for Transfer in Reinforcement Learning »
Andre Barreto · Will Dabney · Remi Munos · Jonathan Hunt · Tom Schaul · David Silver · Hado van Hasselt -
2017 Spotlight: Natural Value Approximators: Learning when to Trust Past Estimates »
Zhongwen Xu · Joseph Modayil · Hado van Hasselt · Andre Barreto · David Silver · Tom Schaul -
2016 Poster: Learning values across many orders of magnitude »
Hado van Hasselt · Arthur Guez · Arthur Guez · Matteo Hessel · Volodymyr Mnih · David Silver -
2014 Poster: Weighted importance sampling for off-policy learning with linear function approximation »
Rupam Mahmood · Hado P van Hasselt · Richard Sutton -
2010 Poster: Double Q-learning »
Hado P van Hasselt