Deep Reinforcement Learning (Workshop)
Mon Dec 13 08:55 AM -- 06:00 PM (PST)
In recent years, the use of deep neural networks as function approximators has enabled researchers to extend reinforcement learning techniques to increasingly complex control tasks. The emerging field of deep reinforcement learning has led to remarkable empirical results in rich and varied domains such as robotics, strategy games, and multi-agent interactions. This workshop will bring together researchers working at the intersection of deep learning and reinforcement learning, and will help interested researchers outside the field gain perspective on the current state of the art and potential directions for future contributions.
Welcome and Introduction (Welcoming Notes)
Implicit Behavioral Cloning (Oral)
Implicit Behavioral Cloning Q&A (Q&A)
DR3: Value-Based Deep Reinforcement Learning Requires Explicit Regularization (Oral)
DR3: Value-Based Deep Reinforcement Learning Requires Explicit Regularization Q&A (Q&A)
HyAR: Addressing Discrete-Continuous Action Reinforcement Learning via Hybrid Action Representation (Oral)
HyAR: Addressing Discrete-Continuous Action Reinforcement Learning via Hybrid Action Representation Q&A (Q&A)
Benchmarking the Spectrum of Agent Capabilities (Oral)
Benchmarking the Spectrum of Agent Capabilities Q&A (Q&A)
Invited Talk: Laura Schulz - In praise of folly: Goals, play, and human cognition (Talk)
Laura Schulz Talk Q&A (Q&A)
Break
Opinion Contributed Talk: Wilka Carvalho (Talk)
Wilka Carvalho Talk Q&A (Q&A)
Adaptive Scheduling of Data Augmentation for Deep Reinforcement Learning (Oral)
Adaptive Scheduling of Data Augmentation for Deep Reinforcement Learning Q&A (Q&A)
Offline Meta-Reinforcement Learning with Online Self-Supervision (Oral)
Offline Meta-Reinforcement Learning with Online Self-Supervision Q&A (Q&A)
Invited Talk: George Konidaris - Signal to Symbol (via Skills) (Talk)
George Konidaris Talk Q&A (Q&A)
Poster Session (in Gather Town) (Poster Session)
Opinion Contributed Talk: Sergey Levine (Talk)
Sergey Levine Talk Q&A (Q&A)
Panel Discussion 1 (Panel Discussion)
Invited Talk: Dale Schuurmans - Understanding Deep Value Estimation (Talk)
Dale Schuurmans Talk Q&A (Q&A)
Break
Invited Talk: Karol Hausman - Reinforcement Learning as a Data Sponge (Talk)
Karol Hausman Talk Q&A (Q&A)
NeurIPS RL Competitions Results Presentations (Presentations)
Invited Talk: Kenji Doya - Natural and Artificial Reinforcement Learning (Talk)
Kenji Doya Talk Q&A (Q&A)
Panel Discussion 2 (Panel Discussion)
The Reflective Explorer: Online Meta-Exploration from Offline Data in Realistic Robotic Tasks (Poster)
BLAST: Latent Dynamics Models from Bootstrapping (Poster)
Value Function Spaces: Skill-Centric State Abstractions for Long-Horizon Reasoning (Poster)
Count-Based Temperature Scheduling for Maximum Entropy Reinforcement Learning (Poster)
Data Sharing without Rewards in Multi-Task Offline Reinforcement Learning (Poster)
StarCraft II Unplugged: Large Scale Offline Reinforcement Learning (Poster)
Learning Robust Dynamics through Variational Sparse Gating (Poster)
Should I Run Offline Reinforcement Learning or Behavioral Cloning? (Poster)
DR3: Value-Based Deep Reinforcement Learning Requires Explicit Regularization (Poster)
Deep RePReL--Combining Planning and Deep RL for acting in relational domains (Poster)
Fast Inference and Transfer of Compositional Task for Few-shot Task Generalization (Poster)
Benchmark for Out-of-Distribution Detection in Deep Reinforcement Learning (Poster)
Learning from Guided Play: A Scheduled Hierarchical Approach for Improving Exploration in Adversarial Imitation Learning (Poster)
Off-Policy Correction For Multi-Agent Reinforcement Learning (Poster)
Bayesian Exploration for Lifelong Reinforcement Learning (Poster)
Offline Policy Selection under Uncertainty (Poster)
Learning Transferable Motor Skills with Hierarchical Latent Mixture Policies (Poster)
Wish you were here: Hindsight Goal Selection for long-horizon dexterous manipulation (Poster)
Continuously Discovering Novel Strategies via Reward-Switching Policy Optimization (Poster)
Neighborhood Mixup Experience Replay: Local Convex Interpolation for Improved Sample Efficiency in Continuous Control Tasks (Poster)
Cross-Domain Imitation Learning via Optimal Transport (Poster)
Lifting the veil on hyper-parameters for value-based deep reinforcement learning (Poster)
Reward Uncertainty for Exploration in Preference-based Reinforcement Learning (Poster)
Learning Parameterized Task Structure for Generalization to Unseen Entities (Poster)
The Effects of Reward Misspecification: Mapping and Mitigating Misaligned Models (Poster)
Learning a Subspace of Policies for Online Adaptation in Reinforcement Learning (Poster)
Adaptively Calibrated Critic Estimates for Deep Reinforcement Learning (Poster)
Discriminator-Weighted Offline Imitation Learning from Suboptimal Demonstrations (Poster)
Task-driven Discovery of Perceptual Schemas for Generalization in Reinforcement Learning (Poster)
Meta Arcade: A Configurable Environment Suite for Deep Reinforcement Learning and Meta-Learning (Poster)
Hindsight Foresight Relabeling for Meta-Reinforcement Learning (Poster)
CIC: Contrastive Intrinsic Control for Unsupervised Skill Discovery (Poster)
Continuous Control With Ensemble Deep Deterministic Policy Gradients (Poster)
Grounding Aleatoric Uncertainty in Unsupervised Environment Design (Poster)
SURF: Semi-supervised Reward Learning with Data Augmentation for Feedback-efficient Preference-based Reinforcement Learning (Poster)
Task-Induced Representation Learning (Poster)
OVD-Explorer: A General Information-theoretic Exploration Approach for Reinforcement Learning (Poster)
GrASP: Gradient-Based Affordance Selection for Planning (Poster)
Beyond Target Networks: Improving Deep $Q$-learning with Functional Regularization (Poster)
No DICE: An Investigation of the Bias-Variance Tradeoff in Meta-Gradients (Poster)
Block Contextual MDPs for Continual Learning (Poster)
PFPN: Continuous Control of Physically Simulated Characters using Particle Filtering Policy Network (Poster)
Recurrent Off-policy Baselines for Memory-based Continuous Control (Poster)
Transfer RL across Observation Feature Spaces via Model-Based Regularization (Poster)
Embodiment perspective of reward definition for behavioural homeostasis (Poster)
Communication-Efficient Actor-Critic Methods for Homogeneous Markov Games (Poster)
URLB: Unsupervised Reinforcement Learning Benchmark (Poster)
Offline Reinforcement Learning with In-sample Q-Learning (Poster)
Wasserstein Distance Maximizing Intrinsic Control (Poster)
Augmenting Reinforcement Learning with Behavior Primitives for Diverse Manipulation Tasks (Poster)
Strength Through Diversity: Robust Behavior Learning via Mixture Policies (Poster)
Long-Term Credit Assignment via Model-based Temporal Shortcuts (Poster)
C-Planning: An Automatic Curriculum for Learning Goal-Reaching Tasks (Poster)
Targeted Environment Design from Offline Data (Poster)
GPU-Podracer: Scalable and Elastic Library for Cloud-Native Deep Reinforcement Learning (Poster)
Behavior Predictive Representations for Generalization in Reinforcement Learning (Poster)
Fast and Data-Efficient Training of Rainbow: an Experimental Study on Atari (Poster)
Implicit Behavioral Cloning (Poster)
TempoRL: Temporal Priors for Exploration in Off-Policy Reinforcement Learning (Poster)
Exploring through Random Curiosity with General Value Functions (Poster)
Maximum Entropy Model-based Reinforcement Learning (Poster)
Exponential Family Model-Based Reinforcement Learning via Score Matching (Poster)
Imitation Learning from Pixel Observations for Continuous Control (Poster)
Covariate Shift of Latent Confounders in Imitation and Reinforcement Learning (Poster)
Latent Geodesics of Model Dynamics for Offline Reinforcement Learning (Poster)
An Empirical Study of Non-Uniform Sampling in Off-Policy Reinforcement Learning for Continuous Control (Poster)
On Using Hamiltonian Monte Carlo Sampling for Reinforcement Learning Problems in High-dimension (Poster)
Skill Preferences: Learning to Extract and Execute Robotic Skills from Human Feedback (Poster)
That Escalated Quickly: Compounding Complexity by Editing Levels at the Frontier of Agent Capabilities (Poster)
The Information Geometry of Unsupervised Reinforcement Learning (Poster)
Mismatched No More: Joint Model-Policy Optimization for Model-Based RL (Poster)
Graph Backup: Data Efficient Backup Exploiting Markovian Data (Poster)
Offline Meta-Reinforcement Learning with Online Self-Supervision (Poster)
Unsupervised Learning of Temporal Abstractions using Slot-based Transformers (Poster)
Modern Hopfield Networks for Return Decomposition for Delayed Rewards (Poster)
Learning Two-Player Mixture Markov Games: Kernel Function Approximation and Correlated Equilibrium (Poster)
Stability Analysis in Mixed-Autonomous Traffic with Deep Reinforcement Learning (Poster)
Learning Efficient Multi-Agent Cooperative Visual Exploration (Poster)
Mean-Variance Efficient Reinforcement Learning by Expected Quadratic Utility Maximization (Poster)
Large Scale Coordination Transfer for Cooperative Multi-Agent Reinforcement Learning (Poster)
Return Dispersion as an Estimator of Learning Potential for Prioritized Level Replay (Poster)
Status-quo policy gradient in Multi-Agent Reinforcement Learning (Poster)
Deep Reinforcement Learning Explanation via Model Transforms (Poster)
A Meta-Gradient Approach to Learning Cooperative Multi-Agent Communication Topology (Poster)
A Family of Cognitively Realistic Parsing Environments for Deep Reinforcement Learning (Poster)
OstrichRL: A Musculoskeletal Ostrich Simulation to Study Bio-mechanical Locomotion (Poster)
Hybrid Imitative Planning with Geometric and Predictive Costs in Offroad Environments (Poster)
Accelerated Deep Reinforcement Learning of Terrain-Adaptive Locomotion Skills (Poster)
CoMPS: Continual Meta Policy Search (Poster)
Continuous Control with Action Quantization from Demonstrations (Poster)
Investigation of Independent Reinforcement Learning Algorithms in Multi-Agent Environments (Poster)
Expert Human-Level Driving in Gran Turismo Sport Using Deep Reinforcement Learning with Image-based Representation (Poster)
MHER: Model-based Hindsight Experience Replay (Poster)
On the Transferability of Deep-Q Networks (Poster)
Adaptive Scheduling of Data Augmentation for Deep Reinforcement Learning (Poster)
Skill-based Meta-Reinforcement Learning (Poster)
Introducing Symmetries to Black Box Meta Reinforcement Learning (Poster)
A Graph Policy Network Approach for Volt-Var Control in Power Distribution Systems (Poster)
Robust Robotic Control from Pixels using Contrastive Recurrent State-Space Models (Poster)
Component Transfer Learning for Deep RL Based on Abstract Representations (Poster)
Math Programming based Reinforcement Learning for Multi-Echelon Inventory Management (Poster)
HyAR: Addressing Discrete-Continuous Action Reinforcement Learning via Hybrid Action Representation (Poster)
Behavioral Priors and Dynamics Models: Improving Performance and Domain Transfer in Offline RL (Poster)
Conservative and Adaptive Penalty for Model-Based Safe Reinforcement Learning (Poster)
Implicitly Regularized RL with Implicit Q-values (Poster)
Towards Automatic Actor-Critic Solutions to Continuous Control (Poster)
Transferring Dexterous Manipulation from GPU Simulation to a Remote Real-World Trifinger (Poster)
Hierarchical Few-Shot Imitation with Skill Transition Models (Poster)
Accelerating Robotic Reinforcement Learning via Parameterized Action Primitives (Poster)
Who Is the Strongest Enemy? Towards Optimal and Efficient Evasion Attacks in Deep RL (Poster)
Automatic Curricula via Expert Demonstrations (Poster)
Mastering Visual Continuous Control: Improved Data-Augmented Reinforcement Learning (Poster)
Policy Optimization via Optimal Policy Evaluation (Poster)
A Consciousness-Inspired Planning Agent for Model-Based Reinforcement Learning (Poster)
Discriminator Augmented Model-Based Reinforcement Learning (Poster)
ShinRL: A Library for Evaluating RL Algorithms from Theoretical and Practical Perspectives (Poster)
Learning compositional tasks from language instructions (Poster)
General Characterization of Agents by States they Visit (Poster)
A Modern Self-Referential Weight Matrix That Learns to Modify Itself (Poster)
Understanding the Effects of Dataset Composition on Offline Reinforcement Learning (Poster)
Improving Actor-Critic Reinforcement Learning via Hamiltonian Monte Carlo Method (Poster)
Benchmarking the Spectrum of Agent Capabilities (Poster)
Policy Gradients Incorporating the Future (Poster)
Interactive Robust Policy Optimization for Multi-Agent Reinforcement Learning (Poster)
Dynamic Mirror Descent based Model Predictive Control for Accelerating Robot Learning (Poster)
What Would the Expert $do(\cdot)$?: Causal Imitation Learning (Poster)
TransDreamer: Reinforcement Learning with Transformer World Models (Poster)
Variance-Seeking Meta-Exploration to Handle Out-of-Distribution Tasks (Poster)
A Framework for Efficient Robotic Manipulation (Poster)
Learning Value Functions from Undirected State-only Experience (Poster)
Distributional Decision Transformer for Offline Hindsight Information Matching (Poster)
Self-Imitation Learning from Demonstrations (Poster)
Understanding and Preventing Capacity Loss in Reinforcement Learning (Poster)
A Closer Look at Gradient Estimators with Reinforcement Learning as Inference (Poster)
From One Hand to Multiple Hands: Imitation Learning for Dexterous Manipulation from Single-Camera Teleoperation (Poster)
Attention-based Partial Decoupling of Policy and Value for Generalization in Reinforcement Learning (Poster)
Imitation Learning from Observations under Transition Model Disparity (Poster)
Vision-Guided Quadrupedal Locomotion in the Wild with Multi-Modal Delay Randomization (Poster)
Learning from demonstrations with SACR2: Soft Actor-Critic with Reward Relabeling (Poster)
Plan Better Amid Conservatism: Offline Multi-Agent Reinforcement Learning with Actor Rectification (Poster)
Generalisation in Lifelong Reinforcement Learning through Logical Composition (Poster)
DreamerPro: Reconstruction-Free Model-Based Reinforcement Learning with Prototypical Representations (Poster)
Temporal-Difference Value Estimation via Uncertainty-Guided Soft Updates (Poster)
Learning Vision-Guided Quadrupedal Locomotion End-to-End with Cross-Modal Transformers (Poster)
Look Closer: Bridging Egocentric and Third-Person Views with Transformers for Robotic Manipulation (Poster)
Target Entropy Annealing for Discrete Soft Actor-Critic (Poster)
Learning Action Translator for Meta Reinforcement Learning on Sparse-Reward Tasks (Poster)
Follow the Object: Curriculum Learning for Manipulation Tasks with Imagined Goals (Poster)