`

Timezone: »

 
Poster
Deep Reinforcement and InfoMax Learning
Bogdan Mazoure · Remi Tachet des Combes · Thang Long Doan · Philip Bachman · R Devon Hjelm

Tue Dec 08 09:00 AM -- 11:00 AM (PST) @ Poster Session 1 #548

We posit that a reinforcement learning (RL) agent will perform better when it uses representations that are better at predicting the future, particularly in terms of few-shot learning and domain adaptation. To test that hypothesis, we introduce an objective based on Deep InfoMax (DIM) which trains the agent to predict the future by maximizing the mutual information between its internal representation of successive timesteps. We provide an intuitive analysis of the convergence properties of our approach from the perspective of Markov chain mixing times, and argue that convergence of the lower bound on mutual information is related to the inverse absolute spectral gap of the transition model. We test our approach in several synthetic settings, where it successfully learns representations that are predictive of the future. Finally, we augment C51, a strong distributional RL agent, with our temporal DIM objective and demonstrate on a continual learning task (inspired by Ms.~PacMan) and on the recently introduced Procgen environment that our approach improves performance, which supports our core hypothesis.

Author Information

Bogdan Mazoure (McGill University)

Ph.D. student at MILA / McGill University, supervised by Doina Precup and Devon Hjelm. Interested in reinforcement learning, representation learning, mathematical statistics and density estimation.

Remi Tachet des Combes (Microsoft Research Montreal)
Thang Doan (McGill)
Philip Bachman (Microsoft Research)
R Devon Hjelm (Microsoft Research)

More from the Same Authors

  • 2021 Poster: Pretraining Representations for Data-Efficient Reinforcement Learning »
    Max Schwarzer · Nitarshan Rajkumar · Michael Noukhovitch · Ankesh Anand · Laurent Charlin · R Devon Hjelm · Philip Bachman · Aaron Courville
  • 2020 Poster: Domain Adaptation with Conditional Distribution Matching and Generalized Label Shift »
    Remi Tachet des Combes · Han Zhao · Yu-Xiang Wang · Geoffrey Gordon
  • 2019 : Poster Session »
    Gergely Flamich · Shashanka Ubaru · Charles Zheng · Josip Djolonga · Kristoffer Wickstrøm · Diego Granziol · Konstantinos Pitas · Jun Li · Robert Williamson · Sangwoong Yoon · Kwot Sin Lee · Julian Zilly · Linda Petrini · Ian Fischer · Zhe Dong · Alexander Alemi · Bao-Ngoc Nguyen · Rob Brekelmans · Tailin Wu · Aditya Mahajan · Alexander Li · Kirankumar Shiragur · Yair Carmon · Linara Adilova · SHIYU LIU · Bang An · Sanjeeb Dash · Oktay Gunluk · Arya Mazumdar · Mehul Motani · Julia Rosenzweig · Michael Kamp · Marton Havasi · Leighton P Barnes · Zhengqing Zhou · Yi Hao · Dylan Foster · Yuval Benjamini · Nati Srebro · Michael Tschannen · Paul Rubenstein · Sylvain Gelly · John Duchi · Aaron Sidford · Robin Ru · Stefan Zohren · Murtaza Dalal · Michael A Osborne · Stephen J Roberts · Moses Charikar · Jayakumar Subramanian · Xiaodi Fan · Max Schwarzer · Nicholas Roberts · Simon Lacoste-Julien · Vinay Prabhu · Aram Galstyan · Greg Ver Steeg · Lalitha Sankar · Yung-Kyun Noh · Gautam Dasarathy · Frank Park · Ngai-Man (Man) Cheung · Ngoc-Trung Tran · Linxiao Yang · Ben Poole · Andrea Censi · Tristan Sylvain · R Devon Hjelm · Bangjie Liu · Jose Gallego-Posada · Tyler Sypherd · Kai Yang · Jan Nikolas Morshuis
  • 2019 Poster: Learning Representations by Maximizing Mutual Information Across Views »
    Philip Bachman · R Devon Hjelm · William Buchwalter
  • 2019 Poster: Unsupervised State Representation Learning in Atari »
    Ankesh Anand · Evan Racah · Sherjil Ozair · Yoshua Bengio · Marc-Alexandre Côté · R Devon Hjelm
  • 2019 Poster: On Adversarial Mixup Resynthesis »
    Christopher Beckham · Sina Honari · Alex Lamb · Vikas Verma · Farnoosh Ghadiri · R Devon Hjelm · Yoshua Bengio · Chris Pal
  • 2017 : Poster session »
    Xun Zheng · Tim G. J. Rudner · Christopher Tegho · Patrick McClure · Yunhao Tang · ASHWIN D'CRUZ · Juan Camilo Gamboa Higuera · Chandra Sekhar Seelamantula · Jhosimar Arias Figueroa · Andrew Berlin · Maxime Voisin · Alexander Amini · Thang Long Doan · Hengyuan Hu · Aleksandar Botev · Niko Sünderhauf · CHI ZHANG · John Lambert
  • 2016 Poster: An Architecture for Deep, Hierarchical Generative Models »
    Philip Bachman
  • 2015 Poster: Data Generation as Sequential Decision Making »
    Philip Bachman · Doina Precup
  • 2015 Spotlight: Data Generation as Sequential Decision Making »
    Philip Bachman · Doina Precup
  • 2014 Poster: Learning with Pseudo-Ensembles »
    Philip Bachman · Ouais Alsharif · Doina Precup