We offer an experimental benchmark and empirical study for off-policy policy evaluation (OPE) in reinforcement learning, which is a key problem in many safety-critical applications. Given the increasing interest in deploying learning-based methods, there has been a flurry of recent proposals for OPE methods, leading to a need for standardized empirical analyses. Our work takes a strong focus on diversity of experimental design to enable stress testing of OPE methods. We provide a comprehensive benchmarking suite to study the interplay of different attributes on method performance. We distill the results into a summarized set of guidelines for OPE in practice. Our software package, the Caltech OPE Benchmarking Suite (COBS), is open-sourced and we invite interested researchers to further contribute to the benchmark.
Author Information
Cameron Voloshin (California Institute of Technology)
Hoang Le (Microsoft Research)
Nan Jiang (University of Illinois at Urbana-Champaign)
Yisong Yue (Caltech)
More from the Same Authors
-
2021 : The Multi-Agent Behavior Dataset: Mouse Dyadic Social Interactions »
Jennifer J Sun · Tomomi Karigo · Dipam Chakraborty · Sharada Mohanty · Benjamin Wild · Quan Sun · Chen Chen · David Anderson · Pietro Perona · Yisong Yue · Ann Kennedy -
2022 Poster: Tiered Reinforcement Learning: Pessimism in the Face of Uncertainty and Constant Regret »
Jiawei Huang · Li Zhao · Tao Qin · Wei Chen · Nan Jiang · Tie-Yan Liu -
2022 : Neurosymbolic Programming for Science »
Jennifer J Sun · Megan Tjandrasuwita · Atharva Sehgal · Armando Solar-Lezama · Swarat Chaudhuri · Yisong Yue · Omar Costilla Reyes -
2022 : SustainGym: A Benchmark Suite of Reinforcement Learning for Sustainability Applications »
Christopher Yeh · Victor Li · Rajeev Datta · Yisong Yue · Adam Wierman -
2022 : Trajectory-based Explainability Framework for Offline RL »
Shripad Deshmukh · Arpan Dasgupta · Chirag Agarwal · Nan Jiang · Balaji Krishnamurthy · Georgios Theocharous · Jayakumar Subramanian -
2022 : AMORE: A Model-based Framework for Improving Arbitrary Baseline Policies with Offline Data »
Tengyang Xie · Mohak Bhardwaj · Nan Jiang · Ching-An Cheng -
2022 Spotlight: Tiered Reinforcement Learning: Pessimism in the Face of Uncertainty and Constant Regret »
Jiawei Huang · Li Zhao · Tao Qin · Wei Chen · Nan Jiang · Tie-Yan Liu -
2022 Spotlight: Lightning Talks 4A-1 »
Jiawei Huang · Su Jia · Abdurakhmon Sadiev · Ruomin Huang · Yuanyu Wan · Denizalp Goktas · Jiechao Guan · Andrew Li · Wei-Wei Tu · Li Zhao · Amy Greenwald · Jiawei Huang · Dmitry Kovalev · Yong Liu · Wenjie Liu · Peter Richtarik · Lijun Zhang · Zhiwu Lu · R Ravi · Tao Qin · Wei Chen · Hu Ding · Nan Jiang · Tie-Yan Liu -
2022 : Panel »
Jeevana Priya Inala · Pushmeet Kohli · Ann Kennedy · Sriram Rajamani · Yisong Yue -
2022 : Deep Neural Imputation: A Framework for Recovering Incomplete Brain Recordings »
Sabera Talukder · Jennifer J Sun · Matthew Leonard · Bingni Brunton · Yisong Yue -
2022 Poster: Beyond the Return: Off-policy Function Estimation under User-specified Error-measuring Distributions »
Audrey Huang · Nan Jiang -
2022 Poster: Interaction-Grounded Learning with Action-Inclusive Feedback »
Tengyang Xie · Akanksha Saran · Dylan J Foster · Lekan Molu · Ida Momennejad · Nan Jiang · Paul Mineiro · John Langford -
2022 Poster: A Few Expert Queries Suffices for Sample-Efficient RL with Resets and Linear Value Approximation »
Philip Amortila · Nan Jiang · Dhruv Madeka · Dean Foster -
2022 Poster: On the Statistical Efficiency of Reward-Free Exploration in Non-Linear RL »
Jinglin Chen · Aditya Modi · Akshay Krishnamurthy · Nan Jiang · Alekh Agarwal -
2022 Poster: Policy Optimization with Linear Temporal Logic Constraints »
Cameron Voloshin · Hoang Le · Swarat Chaudhuri · Yisong Yue -
2021 : Retrospective Panel »
Sergey Levine · Nando de Freitas · Emma Brunskill · Finale Doshi-Velez · Nan Jiang · Rishabh Agarwal -
2021 : Panel B: Safe Learning and Decision Making in Uncertain and Unstructured Environments »
Yisong Yue · J. Zico Kolter · Ivan Dario D Jimenez Rodriguez · Dragos Margineantu · Animesh Garg · Melissa Greeff -
2021 : Learning for Agile Control in the Real World: Challenges and Opportunities »
Yisong Yue · Ivan Dario D Jimenez Rodriguez -
2021 Workshop: Offline Reinforcement Learning »
Rishabh Agarwal · Aviral Kumar · George Tucker · Justin Fu · Nan Jiang · Doina Precup -
2021 Poster: Towards Hyperparameter-free Policy Selection for Offline Reinforcement Learning »
Siyuan Zhang · Nan Jiang -
2021 Poster: Bellman-consistent Pessimism for Offline Reinforcement Learning »
Tengyang Xie · Ching-An Cheng · Nan Jiang · Paul Mineiro · Alekh Agarwal -
2021 Poster: Meta-Adaptive Nonlinear Control: Theory and Algorithms »
Guanya Shi · Kamyar Azizzadenesheli · Michael O'Connell · Soon-Jo Chung · Yisong Yue -
2021 Oral: Bellman-consistent Pessimism for Offline Reinforcement Learning »
Tengyang Xie · Ching-An Cheng · Nan Jiang · Paul Mineiro · Alekh Agarwal -
2021 Poster: DeepGEM: Generalized Expectation-Maximization for Blind Inversion »
Angela Gao · Jorge Castellanos · Yisong Yue · Zachary Ross · Katherine Bouman -
2021 Poster: Policy Finetuning: Bridging Sample-Efficient Offline and Online Reinforcement Learning »
Tengyang Xie · Nan Jiang · Huan Wang · Caiming Xiong · Yu Bai -
2021 Poster: Iterative Amortized Policy Optimization »
Joseph Marino · Alexandre Piche · Alessandro Davide Ialongo · Yisong Yue -
2020 : Towards Reliable Validation and Evaluation for Offline RL »
Nan Jiang -
2020 : Panel »
Emma Brunskill · Nan Jiang · Nando de Freitas · Finale Doshi-Velez · Sergey Levine · John Langford · Lihong Li · George Tucker · Rishabh Agarwal · Aviral Kumar -
2020 Workshop: Learning Meets Combinatorial Algorithms »
Marin Vlastelica · Jialin Song · Aaron Ferber · Brandon Amos · Georg Martius · Bistra Dilkina · Yisong Yue -
2020 Poster: Online Optimization with Memory and Competitive Control »
Guanya Shi · Yiheng Lin · Soon-Jo Chung · Yisong Yue · Adam Wierman -
2020 Poster: A General Large Neighborhood Search Framework for Solving Integer Linear Programs »
Jialin Song · Ravi Lanka · Yisong Yue · Bistra Dilkina -
2020 Poster: Learning compositional functions via multiplicative weight updates »
Jeremy Bernstein · Jiawei Zhao · Markus Meister · Ming-Yu Liu · Anima Anandkumar · Yisong Yue -
2020 Poster: Learning Differentiable Programs with Admissible Neural Heuristics »
Ameesh Shah · Eric Zhan · Jennifer J Sun · Abhinav Verma · Yisong Yue · Swarat Chaudhuri -
2020 Poster: On the distance between two neural networks and the stability of learning »
Jeremy Bernstein · Arash Vahdat · Yisong Yue · Ming-Yu Liu -
2020 Poster: The Power of Predictions in Online Control »
Chenkai Yu · Guanya Shi · Soon-Jo Chung · Yisong Yue · Adam Wierman -
2019 : Poster and Coffee Break 2 »
Karol Hausman · Kefan Dong · Ken Goldberg · Lihong Li · Lin Yang · Lingxiao Wang · Lior Shani · Liwei Wang · Loren Amdahl-Culleton · Lucas Cassano · Marc Dymetman · Marc Bellemare · Marcin Tomczak · Margarita Castro · Marius Kloft · Marius-Constantin Dinu · Markus Holzleitner · Martha White · Mengdi Wang · Michael Jordan · Mihailo Jovanovic · Ming Yu · Minshuo Chen · Moonkyung Ryu · Muhammad Zaheer · Naman Agarwal · Nan Jiang · Niao He · Nikolaus Yasui · Nikos Karampatziakis · Nino Vieillard · Ofir Nachum · Olivier Pietquin · Ozan Sener · Pan Xu · Parameswaran Kamalaruban · Paul Mineiro · Paul Rolland · Philip Amortila · Pierre-Luc Bacon · Prakash Panangaden · Qi Cai · Qiang Liu · Quanquan Gu · Raihan Seraj · Richard Sutton · Rick Valenzano · Robert Dadashi · Rodrigo Toro Icarte · Roshan Shariff · Roy Fox · Ruosong Wang · Saeed Ghadimi · Samuel Sokota · Sean Sinclair · Sepp Hochreiter · Sergey Levine · Sergio Valcarcel Macua · Sham Kakade · Shangtong Zhang · Sheila McIlraith · Shie Mannor · Shimon Whiteson · Shuai Li · Shuang Qiu · Wai Lok Li · Siddhartha Banerjee · Sitao Luan · Tamer Basar · Thinh Doan · Tianhe Yu · Tianyi Liu · Tom Zahavy · Toryn Klassen · Tuo Zhao · Vicenç Gómez · Vincent Liu · Volkan Cevher · Wesley Suttle · Xiao-Wen Chang · Xiaohan Wei · Xiaotong Liu · Xingguo Li · Xinyi Chen · Xingyou Song · Yao Liu · YiDing Jiang · Yihao Feng · Yilun Du · Yinlam Chow · Yinyu Ye · Yishay Mansour · Yonathan Efroni · Yongxin Chen · Yuanhao Wang · Bo Dai · Chen-Yu Wei · Harsh Shrivastava · Hongyang Zhang · Qinqing Zheng · Siddhartha Satpathi · Xueqing Liu · Andreu Vall -
2019 Workshop: Safety and Robustness in Decision-making »
Mohammad Ghavamzadeh · Shie Mannor · Yisong Yue · Marek Petrik · Yinlam Chow -
2019 Poster: Imitation-Projected Programmatic Reinforcement Learning »
Abhinav Verma · Hoang Le · Yisong Yue · Swarat Chaudhuri -
2019 Poster: NAOMI: Non-Autoregressive Multiresolution Sequence Imputation »
Yukai Liu · Rose Yu · Stephan Zheng · Eric Zhan · Yisong Yue -
2019 Poster: Teaching Multiple Concepts to a Forgetful Learner »
Anette Hunziker · Yuxin Chen · Oisin Mac Aodha · Manuel Gomez Rodriguez · Andreas Krause · Pietro Perona · Yisong Yue · Adish Singla -
2019 Poster: Provably Efficient Q-Learning with Low Switching Cost »
Yu Bai · Tengyang Xie · Nan Jiang · Yu-Xiang Wang -
2019 Poster: Landmark Ordinal Embedding »
Nikhil Ghosh · Yuxin Chen · Yisong Yue -
2018 : Yisong Yue »
Yisong Yue -
2018 Poster: Understanding the Role of Adaptivity in Machine Teaching: The Case of Version Space Learners »
Yuxin Chen · Adish Singla · Oisin Mac Aodha · Pietro Perona · Yisong Yue -
2018 Poster: A General Method for Amortizing Variational Filtering »
Joseph Marino · Milan Cvitkovic · Yisong Yue -
2017 : Coffee break and Poster Session II »
Mohamed Kane · Albert Haque · Vagelis Papalexakis · John Guibas · Peter Li · Carlos Arias · Eric Nalisnick · Padhraic Smyth · Frank Rudzicz · Xia Zhu · Theodore Willke · Noemie Elhadad · Hans Raffauf · Harini Suresh · Paroma Varma · Yisong Yue · Ognjen (Oggi) Rudovic · Luca Foschini · Syed Rameel Ahmad · Hasham ul Haq · Valerio Maggio · Giuseppe Jurman · Sonali Parbhoo · Pouya Bashivan · Jyoti Islam · Mirco Musolesi · Chris Wu · Alexander Ratner · Jared Dunnmon · Cristóbal Esteban · Aram Galstyan · Greg Ver Steeg · Hrant Khachatrian · Marc Górriz · Mihaela van der Schaar · Anton Nemchenko · Manasi Patwardhan · Tanay Tandon -
2016 Poster: Generating Long-term Trajectories Using Deep Hierarchical Networks »
Stephan Zheng · Yisong Yue · Patrick Lucey -
2015 Poster: Smooth Interactive Submodular Set Cover »
Bryan He · Yisong Yue -
2015 Demonstration: Data-Driven Speech Animation »
Yisong Yue · Iain Matthews