Alchemy: A benchmark and analysis toolkit for meta-reinforcement learning agents
Jane Wang · Michael King · Nicolas Porcel · Zeb Kurth-Nelson · Tina Zhu · Charles Deck · Peter Choy · Mary Cassin · Malcolm Reynolds · Francis Song · Gavin Buttimore · David Reichert · Neil Rabinowitz · Loic Matthey · Demis Hassabis · Alexander Lerchner · Matt Botvinick

There has been rapidly growing interest in meta-learning as a method for increasing the flexibility and sample efficiency of reinforcement learning. One problem in this area of research, however, has been a scarcity of adequate benchmark tasks. In general, the structure underlying past benchmarks has either been too simple to be inherently interesting, or too ill-defined to support principled analysis. In the present work, we introduce a new benchmark for meta-RL research, emphasizing transparency and potential for in-depth analysis as well as structural richness. Alchemy is a 3D video game, implemented in Unity, which involves a latent causal structure that is resampled procedurally from episode to episode, affording structure learning, online inference, hypothesis testing and action sequencing based on abstract domain knowledge. We evaluate a pair of powerful RL agents on Alchemy and present an in-depth analysis of one of these agents. Results clearly indicate a frank and specific failure of meta-learning, providing validation for Alchemy as a challenging benchmark for meta-RL. Concurrent with this report, we are releasing Alchemy as a public resource, together with a suite of analysis tools and sample agent trajectories.

Author Information

Jane Wang (DeepMind)

Jane Wang is a research scientist at DeepMind on the neuroscience team, working on meta-reinforcement learning and neuroscience-inspired artificial agents. Her background is in physics, complex systems, and computational and cognitive neuroscience.

Michael King (DeepMind)
Nicolas Porcel (DeepMind)
Zeb Kurth-Nelson (University College London)
Tina Zhu (DeepMind)
Charles Deck (DeepMind)
Peter Choy (Google)
Mary Cassin (Ringling College of Art and Design)
Malcolm Reynolds (DeepMind)
Francis Song (DeepMind)
Gavin Buttimore (DeepMind)
David Reichert (DeepMind)
Neil Rabinowitz (DeepMind)
Loic Matthey (DeepMind)
Demis Hassabis (DeepMind Technologies Ltd)
Alexander Lerchner (DeepMind)
Matt Botvinick (Google DeepMind / University College London)
