BLAST: Latent Dynamics Models from Bootstrapping
Keiran Paster · Lev McKinney · Sheila McIlraith · Jimmy Ba
Event URL: https://openreview.net/forum?id=VwA_hKnX_kR

State-of-the-art world models such as DreamerV2 have significantly improved the capabilities of model-based reinforcement learning. However, these approaches typically rely on reconstruction losses to shape their latent representations of the environment, and such losses are known to fail in environments with high-fidelity visual observations. When latent dynamics models are instead learned without a reconstruction loss, using only the reward signal, performance also drops dramatically. We present a simple modification to DreamerV2 without reconstruction loss, inspired by the recent self-supervised learning method Bootstrap Your Own Latent (BYOL). The modification, which we call BLAST, combines three changes: adding a stop-gradient to the posterior, using a powerful auto-regressive model for the prior, and using a slowly updating target encoder. Together these allow the world model to learn from signals present in both the reward and the observations, improving efficiency on our tested environment and making the model significantly more robust to visual distractors.
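
To make the "slowly updating target encoder" concrete, here is a minimal sketch of an exponential-moving-average (EMA) target update of the kind used in BYOL-style methods. It is an illustration under stated assumptions, not the authors' implementation: the function name `ema_update`, the `decay` value, and the toy scalar parameters are all hypothetical, and the abstract does not specify how BLAST's target encoder is actually updated.

```python
def ema_update(target_params, online_params, decay=0.99):
    """Return target parameters moved a small step toward the online ones.

    Hypothetical helper: `decay` close to 1.0 means the target changes slowly.
    Parameters are plain dicts of floats standing in for encoder weights.
    """
    if set(target_params) != set(online_params):
        raise ValueError("target and online parameter names must match")
    return {
        name: decay * target_params[name] + (1.0 - decay) * online_params[name]
        for name in target_params
    }


# Toy scalar "parameters"; a real encoder would hold weight tensors.
online = {"w": 1.0, "b": -2.0}
target = {"w": 0.0, "b": 0.0}

# Repeated updates pull the target toward the (here fixed) online values.
for _ in range(100):
    target = ema_update(target, online, decay=0.9)
```

In a full training loop the online encoder would change at every gradient step while the target trails it. The target's outputs would be treated as constants when computing the loss, which is the role a stop-gradient plays in an autodiff framework.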

Author Information

Keiran Paster (University of Toronto)
Lev McKinney (University of Toronto)
Sheila McIlraith (University of Toronto / Vector Institute)
Jimmy Ba (University of Toronto / Vector Institute)

More from the Same Authors

  • 2020 : Poster #6 »
    Sheila McIlraith
  • 2021 : Avoiding Negative Side Effects by Considering Others »
    Parand Alizadeh Alamdari · Toryn Klassen · Rodrigo Toro Icarte · Sheila McIlraith
  • 2021 Poster: Clockwork Variational Autoencoders »
    Vaibhav Saxena · Jimmy Ba · Danijar Hafner
  • 2021 Poster: Learning Domain Invariant Representations in Goal-conditioned Block MDPs »
    Beining Han · Chongyi Zheng · Harris Chan · Keiran Paster · Michael Zhang · Jimmy Ba
  • 2021 Poster: How does a Neural Network's Architecture Impact its Robustness to Noisy Labels? »
    Jingling Li · Mozhi Zhang · Keyulu Xu · John Dickerson · Jimmy Ba
  • 2020 : Contributed Talk #2: Evaluating Agents Without Rewards »
    Brendon Matusch · Danijar Hafner · Jimmy Ba
  • 2020 : Contributed Talk: Planning from Pixels using Inverse Dynamics Models »
    Keiran Paster · Sheila McIlraith · Jimmy Ba
  • 2020 Session: Orals & Spotlights Track 34: Deep Learning »
    Tuo Zhao · Jimmy Ba
  • 2019 : Poster and Coffee Break 2 »
    Karol Hausman · Kefan Dong · Ken Goldberg · Lihong Li · Lin Yang · Lingxiao Wang · Lior Shani · Liwei Wang · Loren Amdahl-Culleton · Lucas Cassano · Marc Dymetman · Marc Bellemare · Marcin Tomczak · Margarita Castro · Marius Kloft · Marius-Constantin Dinu · Markus Holzleitner · Martha White · Mengdi Wang · Michael Jordan · Mihailo Jovanovic · Ming Yu · Minshuo Chen · Moonkyung Ryu · Muhammad Zaheer · Naman Agarwal · Nan Jiang · Niao He · Nikolaus Yasui · Nikos Karampatziakis · Nino Vieillard · Ofir Nachum · Olivier Pietquin · Ozan Sener · Pan Xu · Parameswaran Kamalaruban · Paul Mineiro · Paul Rolland · Philip Amortila · Pierre-Luc Bacon · Prakash Panangaden · Qi Cai · Qiang Liu · Quanquan Gu · Raihan Seraj · Richard Sutton · Rick Valenzano · Robert Dadashi · Rodrigo Toro Icarte · Roshan Shariff · Roy Fox · Ruosong Wang · Saeed Ghadimi · Samuel Sokota · Sean Sinclair · Sepp Hochreiter · Sergey Levine · Sergio Valcarcel Macua · Sham Kakade · Shangtong Zhang · Sheila McIlraith · Shie Mannor · Shimon Whiteson · Shuai Li · Shuang Qiu · Wai Lok Li · Siddhartha Banerjee · Sitao Luan · Tamer Basar · Thinh Doan · Tianhe Yu · Tianyi Liu · Tom Zahavy · Toryn Klassen · Tuo Zhao · Vicenç Gómez · Vincent Liu · Volkan Cevher · Wesley Suttle · Xiao-Wen Chang · Xiaohan Wei · Xiaotong Liu · Xingguo Li · Xinyi Chen · Xingyou Song · Yao Liu · YiDing Jiang · Yihao Feng · Yilun Du · Yinlam Chow · Yinyu Ye · Yishay Mansour · · Yonathan Efroni · Yongxin Chen · Yuanhao Wang · Bo Dai · Chen-Yu Wei · Harsh Shrivastava · Hongyang Zhang · Qinqing Zheng · SIDDHARTHA SATPATHI · Xueqing Liu · Andreu Vall
  • 2019 : Poster Session »
    Eduard Gorbunov · Alexandre d'Aspremont · Lingxiao Wang · Liwei Wang · Boris Ginsburg · Alessio Quaglino · Camille Castera · Saurabh Adya · Diego Granziol · Rudrajit Das · Raghu Bollapragada · Fabian Pedregosa · Martin Takac · Majid Jahani · Sai Praneeth Karimireddy · Hilal Asi · Balint Daroczy · Leonard Adolphs · Aditya Rawal · Nicolas Brandt · Minhan Li · Giuseppe Ughi · Orlando Romero · Ivan Skorokhodov · Damien Scieur · Kiwook Bae · Konstantin Mishchenko · Rohan Anil · Vatsal Sharan · Aditya Balu · Chao Chen · Zhewei Yao · Tolga Ergen · Paul Grigas · Chris Junchi Li · Jimmy Ba · Stephen J Roberts · Sharan Vaswani · Armin Eftekhari · Chhavi Sharma
  • 2019 Poster: Lookahead Optimizer: k steps forward, 1 step back »
    Michael Zhang · James Lucas · Jimmy Ba · Geoffrey E Hinton
  • 2019 Poster: Graph Normalizing Flows »
    Jenny Liu · Aviral Kumar · Jimmy Ba · Jamie Kiros · Kevin Swersky
  • 2019 Poster: Learning Reward Machines for Partially Observable Reinforcement Learning »
    Rodrigo Toro Icarte · Ethan Waldie · Toryn Klassen · Rick Valenzano · Margarita Castro · Sheila McIlraith
  • 2019 Spotlight: Learning Reward Machines for Partially Observable Reinforcement Learning »
    Rodrigo Toro Icarte · Ethan Waldie · Toryn Klassen · Rick Valenzano · Margarita Castro · Sheila McIlraith
  • 2018 : Poster Session »
    Carl Trimbach · Mennatullah Siam · Rodrigo Toro Icarte · Zhongtian Dai · Sheila McIlraith · Matthew Rahtz · Robert Sheline · Christopher MacLellan · Carolin Lawrence · Stefan Riezler · Dylan Hadfield-Menell · Fang-I Hsiao
  • 2018 : Teaching Multiple Tasks to an RL Agent using LTL »
    Rodrigo Toro Icarte · Sheila McIlraith
  • 2018 : Poster Session 1 + Coffee »
    Tom Van de Wiele · Rui Zhao · J. Fernando Hernandez-Garcia · Fabio Pardo · Xian Yeow Lee · Xiaolin Andy Li · Marcin Andrychowicz · Jie Tang · Suraj Nair · Juhyeon Lee · Cédric Colas · S. M. Ali Eslami · Yen-Chen Wu · Stephen McAleer · Ryan Julian · Yang Xue · Matthia Sabatelli · Pranav Shyam · Alexandros Kalousis · Giovanni Montana · Emanuele Pesce · Felix Leibfried · Zhanpeng He · Chunxiao Liu · Yanjun Li · Yoshihide Sawada · Alexander Pashevich · Tejas Kulkarni · Keiran Paster · Luca Rigazio · Quan Vuong · Hyunggon Park · Minhae Kwon · Rivindu Weerasekera · Shamane Siriwardhanaa · Rui Wang · Ozsel Kilinc · Keith Ross · Yizhou Wang · Simon Schmitt · Thomas Anthony · Evan Cater · Forest Agostinelli · Tegg Sung · Shirou Maruyama · Alexander Shmakov · Devin Schwab · Mohammad Firouzi · Glen Berseth · Denis Osipychev · Jesse Farebrother · Jianlan Luo · William Agnew · Peter Vrancx · Jonathan Heek · Catalin Ionescu · Haiyan Yin · Megumi Miyashita · Nathan Jay · Noga H. Rotman · Sam Leroux · Shaileshh Bojja Venkatakrishnan · Henri Schmidt · Jack Terwilliger · Ishan Durugkar · Jonathan Sauder · David Kas · Arash Tavakoli · Alain-Sam Cohen · Philip Bontrager · Adam Lerer · Thomas Paine · Ahmed Khalifa · Ruben Rodriguez · Avi Singh · Yiming Zhang
  • 2018 Poster: Reversible Recurrent Neural Networks »
    Matthew MacKay · Paul Vicol · Jimmy Ba · Roger Grosse