Tue Dec 14 09:00 AM -- 06:20 PM (PST)
Offline Reinforcement Learning
Rishabh Agarwal · Aviral Kumar · George Tucker · Justin Fu · Nan Jiang · Doina Precup

Offline reinforcement learning (RL) is a re-emerging area of study that aims to learn behaviors using only logged data, such as data from previous experiments or human demonstrations, without further environment interaction. It has the potential to enable tremendous progress in a number of real-world decision-making problems where active data collection is expensive (e.g., robotics, drug discovery, dialogue generation, recommendation systems) or unsafe (e.g., healthcare, autonomous driving, education). Such a paradigm promises to resolve a key challenge in bringing reinforcement learning algorithms out of constrained lab settings and into the real world. The first edition of the offline RL workshop, held at NeurIPS 2020, focused on and led to algorithmic development in offline RL. This year we propose to shift the focus from algorithm design to bridging the gap between offline RL research and real-world offline RL. Our aim is to create a space for discussion between researchers and practitioners on topics of importance for enabling offline RL methods in the real world. To that end, we have revised the topics and themes of the workshop and invited new speakers working on application-focused areas. Building on last year's lively panel discussion, we have also invited last year's panelists to participate in a retrospective panel on their changing perspectives.

For details on submission please visit: https://offline-rl-neurips.github.io/2021 (Submission deadline: October 6, Anywhere on Earth)

Aviv Tamar (Technion - Israel Inst. of Technology)
Angela Schoellig (University of Toronto)
Barbara Engelhardt (Princeton University)
Sham Kakade (University of Washington/Microsoft)
Minmin Chen (Google)
Philip S. Thomas (UMass Amherst)

Opening Remarks
Learning to Explore From Data (Talk)
Q&A for Aviv Tamar (Q&A)
Contributed Talk 1: What Matters in Learning from Offline Human Demonstrations for Robot Manipulation (Talk)
Contributed Talk 2: What Would the Expert $do(\cdot)$?: Causal Imitation Learning (Talk)
Contributed Talk 3: Offline Reinforcement Learning: Fundamental Barriers for Value Function Approximation (Talk)
Contributed Talk 4: PulseRL: Enabling Offline Reinforcement Learning for Digital Marketing Systems via Conservative Q-Learning (Talk)
Poster Session 1 (Poster Session)
Speaker Intro (Speaker Introduction)
Offline RL for Robotics (Talk)
Q&A for Angela Schoellig (Q&A)
Speaker Intro (Live short intro)
Generalization theory in Offline RL (Talk)
Q&A for Sham Kakade (Q&A)
Invited Speaker Panel (Discussion Panel)
Retrospective Panel (Discussion Panel)
Speaker Intro
Offline RL for recommendation systems (Talk)
Q&A for Minmin Chen (Q&A)
Speaker Intro
Offline Reinforcement Learning for Hospital Patients When Every Patient is Different (Talk)
Q&A for Barbara Engelhardt (Q&A)
Speaker Intro (Introduction)
Advances in (High-Confidence) Off-Policy Evaluation (Talk)
Q&A for Philip Thomas (Q&A)
Closing Remarks & Poster Session (Closing Remarks)
Poster Session 2 (Poster Session)
Why so pessimistic? Estimating uncertainties for offline RL through ensembles, and why their independence matters (Poster)
Pretraining for Language-Conditioned Imitation with Transformers (Poster)
DCUR: Data Curriculum for Teaching via Samples with Reinforcement Learning (Poster)
PulseRL: Enabling Offline Reinforcement Learning for Digital Marketing Systems via Conservative Q-Learning (Poster)
Doubly Pessimistic Algorithms for Strictly Safe Off-Policy Optimization (Poster)
What Would the Expert $do(\cdot)$?: Causal Imitation Learning (Poster)
Dynamic Mirror Descent based Model Predictive Control for Accelerating Robot Learning (Poster)
Showing Your Offline Reinforcement Learning Work: Online Evaluation Budget Matters (Poster)
Offline Reinforcement Learning: Fundamental Barriers for Value Function Approximation (Poster)
Model-Based Offline Planning with Trajectory Pruning (Poster)
Understanding the Effects of Dataset Characteristics on Offline Reinforcement Learning (Poster)
Instance-dependent Offline Reinforcement Learning: From tabular RL to linear MDPs (Poster)
Offline Reinforcement Learning with Implicit Q-Learning (Poster)
Improving Zero-shot Generalization in Offline Reinforcement Learning using Generalized Similarity Functions (Poster)
MBAIL: Multi-Batch Best Action Imitation Learning utilizing Sample Transfer and Policy Distillation (Poster)
Learning Value Functions from Undirected State-only Experience (Poster)
Offline Contextual Bandits for Wireless Network Optimization (Poster)
Latent Geodesics of Model Dynamics for Offline Reinforcement Learning (Poster)
Benchmarking Sample Selection Strategies for Batch Reinforcement Learning (Poster)
Offline Reinforcement Learning with Soft Behavior Regularization (Poster)
Offline Meta-Reinforcement Learning for Industrial Insertion (Poster)
d3rlpy: An Offline Deep Reinforcement Learning Library (Poster)
Domain Knowledge Guided Offline Q Learning (Poster)
TiKick: Toward Playing Multi-agent Football Full Games from Single-agent Demonstrations (Poster)
BATS: Best Action Trajectory Stitching (Poster)
Unsupervised Learning of Temporal Abstractions using Slot-based Transformers (Poster)
What Matters in Learning from Offline Human Demonstrations for Robot Manipulation (Poster)
Pessimistic Model-based Offline Reinforcement Learning under Partial Coverage (Poster)
Pessimistic Model Selection for Offline Deep Reinforcement Learning (Poster)
Robust On-Policy Data Collection for Data-Efficient Policy Evaluation (Poster)
Importance of Representation Learning for Off-Policy Fitted Q-Evaluation (Poster)
Importance of Empirical Sample Complexity Analysis for Offline Reinforcement Learning (Poster)
Offline Reinforcement Learning with Munchausen Regularization (Poster)
Single-Shot Pruning for Offline Reinforcement Learning (Poster)
Personalization for Web-based Services using Offline Reinforcement Learning (Poster)
Offline neural contextual bandits: Pessimism, Optimization and Generalization (Poster)
Adaptive Behavior Cloning Regularization for Stable Offline-to-Online Reinforcement Learning (Poster)
Counter-Strike Deathmatch with Large-Scale Behavioural Cloning (Poster)
Quantile Filtered Imitation Learning (Poster)
Modern Hopfield Networks for Return Decomposition for Delayed Rewards (Poster)
Stateful Offline Contextual Policy Evaluation and Learning (Poster)
Discrete Uncertainty Quantification Approach for Offline RL (Poster)
TRAIL: Near-Optimal Imitation Learning with Suboptimal Data (Poster)
The Reflective Explorer: Online Meta-Exploration from Offline Data in Realistic Robotic Tasks (Poster)
Discriminator-Weighted Offline Imitation Learning from Suboptimal Demonstrations (Poster)
Sim-to-Real Interactive Recommendation via Off-Dynamics Reinforcement Learning (Poster)
Example-Based Offline Reinforcement Learning without Rewards (Poster)