Reinforcement learning (RL) has shown great promise, with algorithms learning in environments with large state and action spaces purely from scalar reward signals. A crucial challenge for current deep RL algorithms is that they require a tremendous number of environment interactions to learn. This can be infeasible where such interactions are expensive, such as in robotics. Offline RL algorithms address this issue by bootstrapping the learning process from existing logged data, without needing to interact with the environment from the very beginning. While online RL algorithms are typically evaluated as a function of the number of environment interactions, there is no single established protocol for evaluating offline RL methods. In this paper, we propose a sequential approach that evaluates offline RL algorithms as a function of the training set size, and thus by their data efficiency. Sequential evaluation provides valuable insights into the data efficiency of the learning process and the robustness of algorithms to distribution changes in the dataset, while also harmonizing the visualization of the offline and online learning phases. Our approach is generally applicable and easy to implement. We compare several existing offline RL algorithms using this approach and present insights from a variety of tasks and offline datasets.
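The idea of evaluating as a function of training set size can be illustrated with a minimal sketch. This is not the authors' implementation: the function `sequential_evaluation`, its parameters (`initial_size`, `step_size`, `updates_per_step`, `batch_size`), and the toy `RunningMean` learner are all hypothetical names chosen for illustration. The sketch assumes the logged data is kept in its original order, that the learner only samples from the prefix revealed so far, and that performance is recorded each time the prefix grows.

```python
import random
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def sequential_evaluation(
    dataset: Sequence[T],
    update: Callable[[List[T]], None],
    evaluate: Callable[[], float],
    initial_size: int,
    step_size: int,
    updates_per_step: int,
    batch_size: int,
    seed: int = 0,
) -> List[Tuple[int, float]]:
    """Return (visible dataset size, score) pairs for a growing data prefix.

    Hypothetical helper: only the first `visible` samples can be drawn for
    training, so the curve reports performance against training set size.
    """
    if len(dataset) == 0:
        raise ValueError("dataset must be non-empty")
    if initial_size < 1 or step_size < 1:
        raise ValueError("initial_size and step_size must be positive")

    rng = random.Random(seed)
    visible = min(initial_size, len(dataset))
    curve: List[Tuple[int, float]] = []
    while True:
        # Train only on the portion of the log revealed so far.
        for _ in range(updates_per_step):
            batch = [dataset[rng.randrange(visible)] for _ in range(batch_size)]
            update(batch)
        curve.append((visible, evaluate()))
        if visible == len(dataset):
            return curve
        # Reveal the next chunk of logged data, preserving its order.
        visible = min(visible + step_size, len(dataset))


class RunningMean:
    """Toy stand-in for an offline RL learner (assumption, not an RL agent)."""

    def __init__(self) -> None:
        self.value = 0.0
        self.count = 0

    def update(self, batch: List[float]) -> None:
        for x in batch:
            self.count += 1
            self.value += (x - self.value) / self.count

    def score(self) -> float:
        return self.value


learner = RunningMean()
logged = [float(i % 5) for i in range(10)]  # placeholder for logged transitions
curve = sequential_evaluation(
    logged, learner.update, learner.score,
    initial_size=2, step_size=4, updates_per_step=3, batch_size=2,
)
print([size for size, _ in curve])  # sizes visited: [2, 6, 10]
```

In a real setting, `update` would be a gradient step of an offline RL algorithm and `evaluate` would roll out the current policy in the environment. Plotting the returned pairs gives a learning curve against dataset size, comparable in form to an online learning curve against environment interactions.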
Author Information
Shivakanth Sujit (École de technologie supérieure)
Pedro Braga (UFPE, ÉTS/Mila)
Jörg Bornschein (DeepMind)
Samira Ebrahimi Kahou (McGill University)
More from the Same Authors
- 2021: Shift and Scale is Detrimental To Few-Shot Transfer
  Moslem Yazdanpanah · Christian Desrosiers · Mohammad Havaei · Eugene Belilovsky · Samira Ebrahimi Kahou
- 2021: Learning Robust Dynamics through Variational Sparse Gating
  Arnav Kumar Jain · Shivakanth Sujit · Shruti Joshi · Vincent Michalski · Danijar Hafner · Samira Ebrahimi Kahou
- 2021: Prequential MDL for Causal Structure Learning with Neural Networks
  Jorg Bornschein · Silvia Chiappa · Alan Malek · Nan Rosemary Ke
- 2022: BERT on a Data Diet: Finding Important Examples by Gradient-Based Pruning
  Mohsen Fayyaz · Ehsan Aghazadeh · Seyed MohammadAli Modarressi · Mohammad Taher Pilehvar · Yadollah Yaghoobzadeh · Samira Ebrahimi Kahou
- 2022: Learning from uncertain concepts via test time interventions
  Ivaxi Sheth · Aamer Abdul Rahman · Laya Rafiee Sevyeri · Mohammad Havaei · Samira Ebrahimi Kahou
- 2022: Locally Constrained Representations in Reinforcement Learning
  Somjit Nath · Samira Ebrahimi Kahou
- 2022: Prioritizing Samples in Reinforcement Learning with Reducible Loss
  Shivakanth Sujit · Somjit Nath · Pedro Braga · Samira Ebrahimi Kahou
- 2022: Pitfalls of conditional computation for multi-modal learning
  Ivaxi Sheth · Mohammad Havaei · Samira Ebrahimi Kahou
- 2022 Poster: Learning Robust Dynamics through Variational Sparse Gating
  Arnav Kumar Jain · Shivakanth Sujit · Shruti Joshi · Vincent Michalski · Danijar Hafner · Samira Ebrahimi Kahou
- 2021: From model compression to self-distillation: a review
  Samira Ebrahimi Kahou
- 2020: Spotlight Talk: Ebrahimi Kahou
  Samira Ebrahimi Kahou
- 2019: Lunch Break and Posters
  Xingyou Song · Elad Hoffer · Wei-Cheng Chang · Jeremy Cohen · Jyoti Islam · Yaniv Blumenfeld · Andreas Madsen · Jonathan Frankle · Sebastian Goldt · Satrajit Chatterjee · Abhishek Panigrahi · Alex Renda · Brian Bartoldson · Israel Birhane · Aristide Baratin · Niladri Chatterji · Roman Novak · Jessica Forde · YiDing Jiang · Yilun Du · Linara Adilova · Michael Kamp · Berry Weinstein · Itay Hubara · Tal Ben-Nun · Torsten Hoefler · Daniel Soudry · Hsiang-Fu Yu · Kai Zhong · Yiming Yang · Inderjit Dhillon · Jaime Carbonell · Yanqing Zhang · Dar Gilboa · Johannes Brandstetter · Alexander R Johansen · Gintare Karolina Dziugaite · Raghav Somani · Ari Morcos · Freddie Kalaitzis · Hanie Sedghi · Lechao Xiao · John Zech · Muqiao Yang · Simran Kaur · Qianli Ma · Yao-Hung Hubert Tsai · Ruslan Salakhutdinov · Sho Yaida · Zachary Lipton · Daniel Roy · Michael Carbin · Florent Krzakala · Lenka Zdeborová · Guy Gur-Ari · Ethan Dyer · Dilip Krishnan · Hossein Mobahi · Samy Bengio · Behnam Neyshabur · Praneeth Netrapalli · Kris Sankaran · Julien Cornebise · Yoshua Bengio · Vincent Michalski · Samira Ebrahimi Kahou · Md Rifat Arefin · Jiri Hron · Jaehoon Lee · Jascha Sohl-Dickstein · Samuel Schoenholz · David Schwab · Dongyu Li · Sang Keun Choe · Henning Petzka · Ashish Verma · Zhichao Lin · Cristian Sminchisescu
- 2018 Poster: Towards Deep Conversational Recommendations
  Raymond Li · Samira Ebrahimi Kahou · Hannes Schulz · Vincent Michalski · Laurent Charlin · Chris Pal
- 2017 Poster: Variational Memory Addressing in Generative Models
  Jörg Bornschein · Andriy Mnih · Daniel Zoran · Danilo Jimenez Rezende