Latent dynamics models learn an abstract representation of an environment based on collected experience. Such models are at the core of recent advances in model-based reinforcement learning. For example, world models can imagine unseen trajectories, potentially improving sample efficiency. Planning in the real world requires agents to understand long-term dependencies between actions and events, and to account for varying degrees of change, e.g., due to a change in background or viewpoint. Moreover, in a typical scene, only a subset of objects change their state. These changes are often quite sparse, which suggests incorporating such an inductive bias into a dynamics model. In this work, we introduce the variational sparse gating mechanism, which enables an agent to sparsely update a latent dynamics model state. We also present a simplified version, which, unlike prior models, has a single stochastic recurrent state. Finally, we introduce a new ShapeHerd environment, in which an agent needs to push shapes into a goal area. This environment is partially observable and requires models to remember previously observed objects and explore the environment to discover unseen objects. Our experiments show that the proposed methods significantly outperform leading model-based reinforcement learning methods on this environment, while also yielding competitive performance on tasks from the DeepMind Control Suite.
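The idea of sparsely updating a latent state can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `sparse_gated_update`, the per-dimension Bernoulli gate, and the toy inputs are all assumptions made for illustration. The sketch samples a binary gate for each state dimension and overwrites only the dimensions whose gate is open, leaving the rest of the previous state unchanged.

```python
import numpy as np


def sparse_gated_update(h_prev, candidate, gate_logits, rng):
    """Hypothetical sparse update: change only dimensions whose sampled gate is 1.

    h_prev      -- previous latent state, shape (d,)
    candidate   -- proposed new state, shape (d,)
    gate_logits -- unnormalized log-odds of opening each gate, shape (d,)
    rng         -- a numpy.random.Generator used to sample the binary gates
    """
    # Convert logits to per-dimension probabilities of updating.
    gate_probs = 1.0 / (1.0 + np.exp(-gate_logits))
    # Sample a binary gate for each dimension (assumed Bernoulli gating).
    gate = (rng.random(h_prev.shape) < gate_probs).astype(h_prev.dtype)
    # Open gates take the candidate value; closed gates keep the old state.
    h_next = gate * candidate + (1.0 - gate) * h_prev
    return h_next, gate


rng = np.random.default_rng(0)
h_prev = np.zeros(8)
candidate = np.ones(8)
# Negative logits give a low probability of opening each gate,
# so on average only a few dimensions change per step.
gate_logits = np.full(8, -2.0)
h_next, gate = sparse_gated_update(h_prev, candidate, gate_logits, rng)
```

In this toy setup the previous state is all zeros and the candidate is all ones, so `h_next` equals the sampled gate and directly shows which dimensions were updated. A learned model would produce `candidate` and `gate_logits` from the previous state and inputs, and would need a training objective that encourages sparsity; neither is shown here.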
Author Information
Arnav Kumar Jain (Mila, ETS)
Shivakanth Sujit (École de technologie supérieure)
Shruti Joshi (Mila, ETS Montreal)
Vincent Michalski (Université de Montréal)
Danijar Hafner (Google)
Samira Ebrahimi Kahou (McGill University)
More from the Same Authors
-
2021 : Shift and Scale is Detrimental To Few-Shot Transfer »
Moslem Yazdanpanah · Christian Desrosiers · Mohammad Havaei · Eugene Belilovsky · Samira Ebrahimi Kahou -
2021 : Benchmarking the Spectrum of Agent Capabilities »
Danijar Hafner -
2022 : BERT on a Data Diet: Finding Important Examples by Gradient-Based Pruning »
Mohsen Fayyaz · Ehsan Aghazadeh · Seyed MohammadAli Modarressi · Mohammad Taher Pilehvar · Yadollah Yaghoobzadeh · Samira Ebrahimi Kahou -
2022 : Bridging the Gap Between Offline and Online Reinforcement Learning Evaluation Methodologies »
Shivakanth Sujit · Pedro Braga · Jörg Bornschein · Samira Ebrahimi Kahou -
2022 : Learning from uncertain concepts via test time interventions »
Ivaxi Sheth · Aamer Abdul Rahman · Laya Rafiee Sevyeri · Mohammad Havaei · Samira Ebrahimi Kahou -
2022 : Locally Constrained Representations in Reinforcement Learning »
Somjit Nath · Samira Ebrahimi Kahou -
2022 : Prioritizing Samples in Reinforcement Learning with Reducible Loss »
Shivakanth Sujit · Somjit Nath · Pedro Braga · Samira Ebrahimi Kahou -
2022 : Pitfalls of conditional computation for multi-modal learning »
Ivaxi Sheth · Mohammad Havaei · Samira Ebrahimi Kahou -
2023 Poster: Maximum State Entropy Exploration using Predecessor and Successor Representations »
Arnav Kumar Jain · Lucas Lehnert · Irina Rish · Glen Berseth -
2023 Poster: Auxiliary Losses for Learning Generalizable Concept-based Model »
Ivaxi Sheth · Samira Ebrahimi Kahou -
2023 Poster: Prioritizing Samples in Reinforcement Learning with Reducible Loss »
Shivakanth Sujit · Somjit Nath · Pedro Braga · Samira Ebrahimi Kahou -
2022 Poster: Learning Robust Dynamics through Variational Sparse Gating »
Arnav Kumar Jain · Shivakanth Sujit · Shruti Joshi · Vincent Michalski · Danijar Hafner · Samira Ebrahimi Kahou -
2021 : Benchmarking the Spectrum of Agent Capabilities Q&A »
Danijar Hafner -
2021 : From model compression to self-distillation: a review »
Samira Ebrahimi Kahou -
2021 Poster: Dynamic Inference with Neural Interpreters »
Nasim Rahaman · Muhammad Waleed Gondal · Shruti Joshi · Peter Gehler · Yoshua Bengio · Francesco Locatello · Bernhard Schölkopf -
2021 Poster: Discovering and Achieving Goals via World Models »
Russell Mendonca · Oleh Rybkin · Kostas Daniilidis · Danijar Hafner · Deepak Pathak -
2021 Poster: Clockwork Variational Autoencoders »
Vaibhav Saxena · Jimmy Ba · Danijar Hafner -
2021 Poster: Information is Power: Intrinsic Control via Information Capture »
Nicholas Rhinehart · Jenny Wang · Glen Berseth · John Co-Reyes · Danijar Hafner · Chelsea Finn · Sergey Levine -
2020 : Spotlight Talk: Ebrahimi Kahou »
Samira Ebrahimi Kahou -
2019 : Lunch Break and Posters »
Xingyou Song · Elad Hoffer · Wei-Cheng Chang · Jeremy Cohen · Jyoti Islam · Yaniv Blumenfeld · Andreas Madsen · Jonathan Frankle · Sebastian Goldt · Satrajit Chatterjee · Abhishek Panigrahi · Alex Renda · Brian Bartoldson · Israel Birhane · Aristide Baratin · Niladri Chatterji · Roman Novak · Jessica Forde · YiDing Jiang · Yilun Du · Linara Adilova · Michael Kamp · Berry Weinstein · Itay Hubara · Tal Ben-Nun · Torsten Hoefler · Daniel Soudry · Hsiang-Fu Yu · Kai Zhong · Yiming Yang · Inderjit Dhillon · Jaime Carbonell · Yanqing Zhang · Dar Gilboa · Johannes Brandstetter · Alexander R Johansen · Gintare Karolina Dziugaite · Raghav Somani · Ari Morcos · Freddie Kalaitzis · Hanie Sedghi · Lechao Xiao · John Zech · Muqiao Yang · Simran Kaur · Qianli Ma · Yao-Hung Hubert Tsai · Ruslan Salakhutdinov · Sho Yaida · Zachary Lipton · Daniel Roy · Michael Carbin · Florent Krzakala · Lenka Zdeborová · Guy Gur-Ari · Ethan Dyer · Dilip Krishnan · Hossein Mobahi · Samy Bengio · Behnam Neyshabur · Praneeth Netrapalli · Kris Sankaran · Julien Cornebise · Yoshua Bengio · Vincent Michalski · Samira Ebrahimi Kahou · Md Rifat Arefin · Jiri Hron · Jaehoon Lee · Jascha Sohl-Dickstein · Samuel Schoenholz · David Schwab · Dongyu Li · Sang Choe · Henning Petzka · Ashish Verma · Zhichao Lin · Cristian Sminchisescu -
2019 Poster: Bayesian Layers: A Module for Neural Network Uncertainty »
Dustin Tran · Mike Dusenberry · Mark van der Wilk · Danijar Hafner -
2018 Poster: Towards Deep Conversational Recommendations »
Raymond Li · Samira Ebrahimi Kahou · Hannes Schulz · Vincent Michalski · Laurent Charlin · Chris Pal -
2017 Demonstration: A Deep Reinforcement Learning Chatbot »
Iulian Vlad Serban · Chinnadhurai Sankar · Mathieu Germain · Saizheng Zhang · Zhouhan Lin · Sandeep Subramanian · Taesup Kim · Michael Pieper · Sarath Chandar · Nan Rosemary Ke · Sai Rajeswar Mudumba · Alexandre de Brébisson · Jose Sotelo · Dendi A Suhubdy · Vincent Michalski · Joelle Pineau · Yoshua Bengio -
2014 Poster: Modeling Deep Temporal Dependencies with Recurrent "Grammar Cells" »
Vincent Michalski · Roland Memisevic · Kishore Konda