Poster
Provable Representation Learning for Imitation with Contrastive Fourier Features
Ofir Nachum · Mengjiao Yang

Thu Dec 09 08:30 AM -- 10:00 AM (PST)

In imitation learning, it is common to learn a behavior policy to match an unknown target policy via maximum-likelihood training on a collected set of target demonstrations. In this work, we consider using offline experience datasets, potentially far from the target distribution, to learn low-dimensional state representations that provably improve the sample efficiency of downstream imitation learning. A central challenge in this setting is that the unknown target policy itself may not exhibit low-dimensional behavior, so the representation learning objective may alias states in which the target policy acts differently. To circumvent this challenge, we derive a representation learning objective that upper-bounds the performance difference between the target policy and a low-dimensional policy trained with maximum likelihood; this bound is tight regardless of whether the target policy itself exhibits low-dimensional structure. On the practical side, we show that our objective can be implemented as contrastive learning, in which the transition dynamics are approximated by either an implicit energy-based model or, in some special cases, an implicit linear model with representations given by random Fourier features. Experiments on both tabular environments and high-dimensional Atari games provide quantitative evidence for the practical benefits of our proposed objective.
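To make two of the ingredients named in the abstract concrete, the following is a minimal sketch, not the authors' objective or implementation. It shows the standard random Fourier feature construction for approximating a Gaussian kernel, and a generic InfoNCE-style contrastive loss over a batch of paired embeddings. All function names, the bandwidth parameter, and the choice of a Gaussian kernel are assumptions made for illustration; the code uses only the Python standard library so it runs without extra dependencies.

```python
import math
import random


def make_rff_params(dim, num_features, bandwidth, rng):
    """Sample frequencies w ~ N(0, 1/bandwidth^2) and phases b ~ U[0, 2*pi)."""
    W = [[rng.gauss(0.0, 1.0 / bandwidth) for _ in range(dim)]
         for _ in range(num_features)]
    b = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    return W, b


def rff(x, W, b):
    """Random Fourier features: z(x) = sqrt(2/D) * cos(W x + b)."""
    scale = math.sqrt(2.0 / len(W))
    return [scale * math.cos(sum(wi * xi for wi, xi in zip(w, x)) + bj)
            for w, bj in zip(W, b)]


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def gaussian_kernel(x, y, bandwidth):
    """Exact Gaussian kernel exp(-||x - y||^2 / (2 * bandwidth^2))."""
    sq_dist = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def contrastive_loss(anchors, positives):
    """InfoNCE-style loss: positives[i] is the positive for anchors[i];
    every other row of `positives` acts as a negative for that anchor."""
    total = 0.0
    for i, a in enumerate(anchors):
        logits = [dot(a, p) for p in positives]
        m = max(logits)  # subtract the max for numerical stability
        log_norm = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_norm - logits[i]
    return total / len(anchors)
```

In this sketch, `dot(rff(x, W, b), rff(y, W, b))` approaches `gaussian_kernel(x, y, bandwidth)` as the number of features grows, which is the sense in which a linear model on such features can stand in for a kernel-based one. In a setting like the one described above, the anchors and positives would plausibly be embeddings of a state-action pair and of its observed next state, but that pairing is an assumption here rather than something the abstract specifies.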

Author Information

Ofir Nachum (Google Brain)
Sherry Yang (Google Brain)

More from the Same Authors

  • 2021 Spotlight: Combiner: Full Attention Transformer with Sparse Computation Cost
    Hongyu Ren · Hanjun Dai · Zihang Dai · Mengjiao Yang · Jure Leskovec · Dale Schuurmans · Bo Dai
  • 2021 : Offline Policy Selection under Uncertainty
    Mengjiao Yang · Bo Dai · Ofir Nachum · George Tucker · Dale Schuurmans
  • 2021 : Targeted Environment Design from Offline Data
    Izzeddin Gur · Ofir Nachum · Aleksandra Faust
  • 2021 : Policy Gradients Incorporating the Future
    David Venuto · · Doina Precup · Ofir Nachum
  • 2021 Poster: Near Optimal Policy Optimization via REPS
    Aldo Pacchiano · Jonathan Lee · Peter Bartlett · Ofir Nachum
  • 2021 Poster: Combiner: Full Attention Transformer with Sparse Computation Cost
    Hongyu Ren · Hanjun Dai · Zihang Dai · Mengjiao Yang · Jure Leskovec · Dale Schuurmans · Bo Dai
  • 2020 Poster: CoinDICE: Off-Policy Confidence Interval Estimation
    Bo Dai · Ofir Nachum · Yinlam Chow · Lihong Li · Csaba Szepesvari · Dale Schuurmans
  • 2020 Poster: Off-Policy Evaluation via the Regularized Lagrangian
    Mengjiao Yang · Ofir Nachum · Bo Dai · Lihong Li · Dale Schuurmans
  • 2020 Spotlight: CoinDICE: Off-Policy Confidence Interval Estimation
    Bo Dai · Ofir Nachum · Yinlam Chow · Lihong Li · Csaba Szepesvari · Dale Schuurmans
  • 2019 : Poster and Coffee Break 2
    Karol Hausman · Kefan Dong · Ken Goldberg · Lihong Li · Lin Yang · Lingxiao Wang · Lior Shani · Liwei Wang · Loren Amdahl-Culleton · Lucas Cassano · Marc Dymetman · Marc Bellemare · Marcin Tomczak · Margarita Castro · Marius Kloft · Marius-Constantin Dinu · Markus Holzleitner · Martha White · Mengdi Wang · Michael Jordan · Mihailo Jovanovic · Ming Yu · Minshuo Chen · Moonkyung Ryu · Muhammad Zaheer · Naman Agarwal · Nan Jiang · Niao He · Nikolaus Yasui · Nikos Karampatziakis · Nino Vieillard · Ofir Nachum · Olivier Pietquin · Ozan Sener · Pan Xu · Parameswaran Kamalaruban · Paul Mineiro · Paul Rolland · Philip Amortila · Pierre-Luc Bacon · Prakash Panangaden · Qi Cai · Qiang Liu · Quanquan Gu · Raihan Seraj · Richard Sutton · Rick Valenzano · Robert Dadashi · Rodrigo Toro Icarte · Roshan Shariff · Roy Fox · Ruosong Wang · Saeed Ghadimi · Samuel Sokota · Sean Sinclair · Sepp Hochreiter · Sergey Levine · Sergio Valcarcel Macua · Sham Kakade · Shangtong Zhang · Sheila McIlraith · Shie Mannor · Shimon Whiteson · Shuai Li · Shuang Qiu · Wai Lok Li · Siddhartha Banerjee · Sitao Luan · Tamer Basar · Thinh Doan · Tianhe Yu · Tianyi Liu · Tom Zahavy · Toryn Klassen · Tuo Zhao · Vicenç Gómez · Vincent Liu · Volkan Cevher · Wesley Suttle · Xiao-Wen Chang · Xiaohan Wei · Xiaotong Liu · Xingguo Li · Xinyi Chen · Xingyou Song · Yao Liu · YiDing Jiang · Yihao Feng · Yilun Du · Yinlam Chow · Yinyu Ye · Yishay Mansour · · Yonathan Efroni · Yongxin Chen · Yuanhao Wang · Bo Dai · Chen-Yu Wei · Harsh Shrivastava · Hongyang Zhang · Qinqing Zheng · SIDDHARTHA SATPATHI · Xueqing Liu · Andreu Vall
  • 2019 : Poster Session
    Matthia Sabatelli · Adam Stooke · Amir Abdi · Paulo Rauber · Leonard Adolphs · Ian Osband · Hardik Meisheri · Karol Kurach · Johannes Ackermann · Matt Benatan · GUO ZHANG · Chen Tessler · Dinghan Shen · Mikayel Samvelyan · Riashat Islam · Murtaza Dalal · Luke Harries · Andrey Kurenkov · Konrad Żołna · Sudeep Dasari · Kristian Hartikainen · Ofir Nachum · Kimin Lee · Markus Holzleitner · Vu Nguyen · Francis Song · Christopher Grimm · Felipe Leno da Silva · Yuping Luo · Yifan Wu · Alex Lee · Thomas Paine · Wei-Yang Qu · Daniel Graves · Yannis Flet-Berliac · Yunhao Tang · Suraj Nair · Matthew Hausknecht · Akhil Bagaria · Simon Schmitt · Bowen Baker · Paavo Parmas · Benjamin Eysenbach · Lisa Lee · Siyu Lin · Daniel Seita · Abhishek Gupta · Riley Simmons-Edler · Yijie Guo · Kevin Corder · Vikash Kumar · Scott Fujimoto · Adam Lerer · Ignasi Clavera Gilaberte · Nicholas Rhinehart · Ashvin Nair · Ge Yang · Lingxiao Wang · Sungryull Sohn · J. Fernando Hernandez-Garcia · Xian Yeow Lee · Rupesh Srivastava · Khimya Khetarpal · Chenjun Xiao · Luckeciano Carvalho Melo · Rishabh Agarwal · Tianhe Yu · Glen Berseth · Devendra Singh Chaplot · Jie Tang · Anirudh Srinivasan · Tharun Kumar Reddy Medini · Aaron Havens · Misha Laskin · Asier Mujika · Rohan Saphal · Joseph Marino · Alex Ray · Joshua Achiam · Ajay Mandlekar · Zhuang Liu · Danijar Hafner · Zhiwen Tang · Ted Xiao · Michael Walton · Jeff Druce · Ferran Alet · Zhang-Wei Hong · Stephanie Chan · Anusha Nagabandi · Hao Liu · Hao Sun · Ge Liu · Dinesh Jayaraman · John Co-Reyes · Sophia Sanborn
  • 2019 : Poster Spotlight 2
    Aaron Sidford · Mengdi Wang · Lin Yang · Yinyu Ye · Zuyue Fu · Zhuoran Yang · Yongxin Chen · Zhaoran Wang · Ofir Nachum · Bo Dai · Ilya Kostrikov · Dale Schuurmans · Ziyang Tang · Yihao Feng · Lihong Li · Denny Zhou · Qiang Liu · Rodrigo Toro Icarte · Ethan Waldie · Toryn Klassen · Rick Valenzano · Margarita Castro · Simon Du · Sham Kakade · Ruosong Wang · Minshuo Chen · Tianyi Liu · Xingguo Li · Zhaoran Wang · Tuo Zhao · Philip Amortila · Doina Precup · Prakash Panangaden · Marc Bellemare
  • 2019 : Contributed Talks
    Kevin Lu · Matthew Hausknecht · Ofir Nachum
  • 2019 : Poster session
    Jindong Gu · Alice Xiang · Atoosa Kasirzadeh · Zhiwei Han · Omar U. Florez · Frederik Harder · An-phi Nguyen · Amir Hossein Akhavan Rahnama · Michele Donini · Dylan Slack · Junaid Ali · Paramita Koley · Michiel Bakker · Anna Hilgard · Hailey James-Sorenson · Gonzalo Ramos · Jialin Lu · Jingying Yang · Margarita Boyarskaya · Martin Pawelczyk · Kacper Sokol · Mimansa Jaiswal · Umang Bhatt · David Alvarez-Melis · Aditya Grover · Charles Marx · Mengjiao Yang · Jingyan Wang · Gökhan Çapan · Hanchen Wang · Steffen Grünewälder · Moein Khajehnejad · Gourab Patro · Russell Kunes · Samuel Deng · Yuanting Liu · Luca Oneto · Mengze Li · Thomas Weber · Stefan Matthes · Duy Patrick Tu
  • 2019 Poster: DualDICE: Behavior-Agnostic Estimation of Discounted Stationary Distribution Corrections
    Ofir Nachum · Yinlam Chow · Bo Dai · Lihong Li
  • 2019 Spotlight: DualDICE: Behavior-Agnostic Estimation of Discounted Stationary Distribution Corrections
    Ofir Nachum · Yinlam Chow · Bo Dai · Lihong Li
  • 2018 Poster: A Lyapunov-based Approach to Safe Reinforcement Learning
    Yinlam Chow · Ofir Nachum · Edgar Duenez-Guzman · Mohammad Ghavamzadeh
  • 2018 Poster: Data-Efficient Hierarchical Reinforcement Learning
    Ofir Nachum · Shixiang (Shane) Gu · Honglak Lee · Sergey Levine
  • 2017 Poster: Bridging the Gap Between Value and Policy Based Reinforcement Learning
    Ofir Nachum · Mohammad Norouzi · Kelvin Xu · Dale Schuurmans