Workshop: Deep Reinforcement Learning

Pieter Abbeel, Chelsea Finn, Joelle Pineau, David Silver, Satinder Singh, Coline Devin, Misha Laskin, Kimin Lee, Janarthanan Rajendran, Vivek Veeriah

2020-12-11T08:30:00-08:00 - 2020-12-11T19:00:00-08:00
Abstract: In recent years, the use of deep neural networks as function approximators has enabled researchers to extend reinforcement learning techniques to solve increasingly complex control tasks. The emerging field of deep reinforcement learning has led to remarkable empirical results in rich and varied domains like robotics, strategy games, and multiagent interactions. This workshop will bring together researchers working at the intersection of deep learning and reinforcement learning, and it will help interested researchers outside of the field gain a high-level view about the current state of the art and potential directions for future contributions.


Schedule

2020-12-11T08:29:00-08:00 - 2020-12-11T08:30:00-08:00
Welcome and Introduction
2020-12-11T08:30:00-08:00 - 2020-12-11T09:00:00-08:00
Invited talk: Pierre-Yves Oudeyer "Machines that invent their own problems: Towards open-ended learning of skills"
Pierre-Yves Oudeyer
2020-12-11T09:00:00-08:00 - 2020-12-11T09:15:00-08:00
Contributed Talk: Learning Functionally Decomposed Hierarchies for Continuous Control Tasks with Path Planning
Sammy Christen, Lukas Jendele, Emre Aksan, Otmar Hilliges
2020-12-11T09:15:00-08:00 - 2020-12-11T09:30:00-08:00
Contributed Talk: Maximum Reward Formulation In Reinforcement Learning
Sai Krishna Gottipati, Yashaswi Pathak, Rohan Nuttall, Sahir, Ravi Chunduru, Ahmed Touati, Sriram Ganapathi, Matthew Taylor, Sarath Chandar
2020-12-11T09:30:00-08:00 - 2020-12-11T09:45:00-08:00
Contributed Talk: Accelerating Reinforcement Learning with Learned Skill Priors
Karl Pertsch, Youngwoon Lee, Joseph Lim
2020-12-11T09:45:00-08:00 - 2020-12-11T10:00:00-08:00
Contributed Talk: Asymmetric self-play for automatic goal discovery in robotic manipulation
OpenAI Robotics, Matthias Plappert, Raul Sampedro, Tao Xu, Ilge Akkaya, Vineet Kosaraju, Peter Welinder, Ruben D'Sa, Arthur Petron, Henrique Ponde, Alex Paino, Hyeonwoo Noh, Lilian Weng, Qiming Yuan, Casey Chu, Wojciech Zaremba
2020-12-11T10:00:00-08:00 - 2020-12-11T10:30:00-08:00
Invited talk: Marc Bellemare "Autonomous navigation of stratospheric balloons using reinforcement learning"
Marc Bellemare
2020-12-11T10:30:00-08:00 - 2020-12-11T11:00:00-08:00
Break
2020-12-11T10:59:00-08:00 - 2020-12-11T11:00:00-08:00
Introduction
2020-12-11T11:00:00-08:00 - 2020-12-11T11:30:00-08:00
Invited talk: Peter Stone "Grounded Simulation Learning for Sim2Real with Connections to Off-Policy Reinforcement Learning"
Peter Stone
For autonomous robots to operate in the open, dynamically changing world, they will need to be able to learn a robust set of skills from relatively little experience. This talk introduces Grounded Simulation Learning as a way to bridge the so-called reality gap between simulators and the real world in order to enable transfer learning from simulation to a real robot. Grounded Simulation Learning has led to the fastest known stable walk on a widely used humanoid robot. Connections to theoretical advances in off-policy reinforcement learning will be highlighted.
2020-12-11T11:30:00-08:00 - 2020-12-11T11:45:00-08:00
Contributed Talk: Mirror Descent Policy Optimization
Manan Tomar, Lior Shani, Yonathan Efroni, Mohammad Ghavamzadeh
2020-12-11T11:45:00-08:00 - 2020-12-11T12:00:00-08:00
Contributed Talk: Planning from Pixels using Inverse Dynamics Models
Keiran Paster, Sheila McIlraith, Jimmy Ba
2020-12-11T12:00:00-08:00 - 2020-12-11T12:30:00-08:00
Invited talk: Matt Botvinick "Alchemy: A Benchmark Task Distribution for Meta-Reinforcement Learning Research"
Matt Botvinick
2020-12-11T12:30:00-08:00 - 2020-12-11T13:30:00-08:00
Poster session 1
2020-12-11T13:29:00-08:00 - 2020-12-11T13:30:00-08:00
Introduction
2020-12-11T13:30:00-08:00 - 2020-12-11T14:00:00-08:00
Invited talk: Susan Murphy "We used RL but…. Did it work?!"
Susan Murphy
Digital healthcare is a growing area of importance in modern healthcare due to its potential in helping individuals improve their behaviors so as to better manage chronic health challenges such as hypertension, mental health, and cancer. Digital apps and wearables observe the user's state via sensors or self-report, deliver treatment actions (reminders, motivational messages, suggestions, social outreach, ...), and observe rewards repeatedly on the user across time. This area is seeing increasing interest from RL researchers, with the goal of including in the digital app or wearable an RL algorithm that "personalizes" the treatments to the user. But after RL is run on a number of users, how do we know whether the RL algorithm actually personalized the sequential treatments to the user? In this talk we report on our first efforts to address this question after our RL algorithm was deployed on each of 111 individuals with hypertension.
2020-12-11T14:00:00-08:00 - 2020-12-11T14:15:00-08:00
Contributed Talk: MaxEnt RL and Robust Control
Benjamin Eysenbach, Sergey Levine
2020-12-11T14:15:00-08:00 - 2020-12-11T14:30:00-08:00
Contributed Talk: Reset-Free Lifelong Learning with Skill-Space Planning
Kevin Lu, Aditya Grover, Pieter Abbeel, Igor Mordatch
2020-12-11T14:30:00-08:00 - 2020-12-11T15:00:00-08:00
Invited talk: Anusha Nagabandi "Model-based Deep Reinforcement Learning for Robotic Systems"
Anusha Nagabandi
Deep learning has shown promising results in robotics, but we are still far from having intelligent systems that can operate in the unstructured settings of the real world, where disturbances, variations, and unobserved factors lead to a dynamic environment. In this talk, we'll see that model-based deep RL can indeed allow for efficient skill acquisition, as well as the ability to repurpose models to solve a variety of tasks. We'll scale up these approaches to enable locomotion with a 6-DoF legged robot on varying terrains in the real world, as well as dexterous manipulation with a 24-DoF anthropomorphic hand in the real world. We then focus on the inevitable mismatch between an agent's training conditions and the test conditions in which it may actually be deployed, thus illuminating the need for adaptive systems. Inspired by the ability of humans and animals to adapt quickly in the face of unexpected changes, we present a meta-learning algorithm within this model-based RL framework to enable online adaptation of large, high-capacity models using only small amounts of data from the new task. These fast adaptation capabilities are seen in both simulation and the real world, with experiments such as a 6-legged robot adapting online to an unexpected payload or suddenly losing a leg. We will then further extend the capabilities of our robotic systems by enabling the agents to reason directly from raw image observations. Bridging the benefits of representation learning techniques with the adaptation capabilities of meta-RL, we'll present a unified framework for effective meta-RL from images. With robotic arms in the real world that learn peg insertion and ethernet cable insertion to varying targets, we'll see the fast acquisition of new skills, directly from raw image observations in the real world.
Finally, this talk will conclude that model-based deep RL provides a framework for making sense of the world, thus allowing for reasoning and adaptation capabilities that are necessary for successful operation in the dynamic settings of the real world.
2020-12-11T15:00:00-08:00 - 2020-12-11T15:30:00-08:00
Break
2020-12-11T15:29:00-08:00 - 2020-12-11T15:30:00-08:00
Introduction
2020-12-11T15:30:00-08:00 - 2020-12-11T16:00:00-08:00
Invited talk: Ashley Edwards "Learning Offline from Observation"
Ashley Edwards
A common trope in sci-fi is to have a robot that can quickly solve some problem after watching a person, studying a video, or reading a book. While these settings are (currently) fictional, the benefits are real. Agents that can solve tasks by observing others have the potential to greatly reduce the burden of their human teachers, removing some of the need to hand-specify rewards or goals. In this talk, I consider the question of how an agent can not only learn by observing others, but also how it can learn quickly by training offline before taking any steps in the environment. First, I will describe an approach that trains a latent policy directly from state observations, which can then be quickly mapped to real actions in the agent’s environment. Then I will describe how we can train a novel value function, Q(s,s’), to learn off-policy from observations. Unlike previous imitation from observation approaches, this formulation goes beyond simply imitating and rather enables learning from potentially suboptimal observations.
2020-12-11T16:00:00-08:00 - 2020-12-11T16:07:00-08:00
NeurIPS RL Competitions: Flatland challenge
Sharada Mohanty
2020-12-11T16:07:00-08:00 - 2020-12-11T16:15:00-08:00
NeurIPS RL Competitions: Learning to run a power network
Antoine Marot
2020-12-11T16:15:00-08:00 - 2020-12-11T16:22:00-08:00
NeurIPS RL Competitions: Procgen challenge
Sharada Mohanty
2020-12-11T16:22:00-08:00 - 2020-12-11T16:30:00-08:00
NeurIPS RL Competitions: MineRL
William Guss, Stephanie Milani
2020-12-11T16:30:00-08:00 - 2020-12-11T17:00:00-08:00
Invited talk: Karen Liu "Deep Reinforcement Learning for Physical Human-Robot Interaction"
Karen Liu
Creating realistic virtual humans has traditionally been considered a research problem in computer animation, primarily for entertainment applications. With the recent breakthroughs in collaborative robots and deep reinforcement learning, accurately modeling human movements and behaviors has become a common challenge also faced by researchers in robotics and artificial intelligence. For example, mobile robots and autonomous vehicles can benefit from training in environments populated with ambulating humans and learning to avoid colliding with them. Healthcare robotics, on the other hand, needs to embrace physical contact and learn to utilize it for enabling a human's activities of daily living. An immediate concern in developing such an autonomous and powered robotic device is the safety of human users during the early development phase, when the control policies are still largely suboptimal. Learning from physically simulated humans and environments presents a promising alternative which enables robots to safely make and learn from mistakes without putting real people at risk. However, deploying such policies to interact with people in the real world adds additional complexity to the already challenging sim-to-real transfer problem. In this talk, I will present our current progress on solving the problem of sim-to-real transfer with humans in the environment, actively interacting with the robots through physical contact. We tackle the problem from two fronts: developing more relevant human models to facilitate robot learning, and developing human-aware robot perception and control policies. As an example of contextualizing our research effort, we develop a mobile manipulator to put clothes on people with physical impairments, enabling them to carry out day-to-day tasks and maintain independence.
2020-12-11T17:00:00-08:00 - 2020-12-11T18:00:00-08:00
Panel discussion
Pierre-Yves Oudeyer, Marc Bellemare, Peter Stone, Matt Botvinick, Susan Murphy, Anusha Nagabandi, Ashley Edwards, Karen Liu, Pieter Abbeel
2020-12-11T18:00:00-08:00 - 2020-12-11T19:00:00-08:00
Poster session 2
Poster: How to make Deep RL work in Practice
Poster: Safety Aware Reinforcement Learning
Poster: Greedy Multi-Step Off-Policy Reinforcement Learning
Poster: Randomized Ensembled Double Q-Learning: Learning Fast Without a Model
Poster: Combating False Negatives in Adversarial Imitation Learning
Poster: Exploring Zero-Shot Emergent Communication in Embodied Multi-Agent Populations
Poster: Interactive Visualization for Debugging RL
Poster: Discovering Diverse Multi-Agent Strategic Behavior via Reward Randomization
Poster: D2RL: Deep Dense Architectures in Reinforcement Learning
Poster: Domain Adversarial Reinforcement Learning
Poster: Reinforcement Learning with Bayesian Classifiers: Efficient Skill Learning from Outcome Examples
Poster: Unified View of Inference-based Off-policy RL: Decoupling Algorithmic and Implemental Source of Performance Gaps
Poster: PettingZoo: Gym for Multi-Agent Reinforcement Learning
Poster: Continual Model-Based Reinforcement Learning with Hypernetworks
Poster: Online Hyper-parameter Tuning in Off-policy Learning via Evolutionary Strategies
Poster: Deep Bayesian Quadrature Policy Gradient
Poster: Accelerating Reinforcement Learning with Learned Skill Priors
Poster: Semantic State Representation for Reinforcement Learning
Poster: Regularized Inverse Reinforcement Learning
Poster: Decoupling Exploration and Exploitation in Meta-Reinforcement Learning without Sacrifices
Poster: Model-Based Reinforcement Learning via Latent-Space Collocation
Poster: Conservative Safety Critics for Exploration
Poster: Implicit Under-Parameterization Inhibits Data-Efficient Deep Reinforcement Learning
Poster: Autoregressive Dynamics Models for Offline Policy Evaluation and Optimization
Poster: Online Safety Assurance for Deep Reinforcement Learning
Poster: FinRL: A Deep Reinforcement Learning Library for Automated Stock Trading in Quantitative Finance
Poster: Dream and Search to Control: Latent Space Planning for Continuous Control
Poster: DREAM: Deep Regret minimization with Advantage baselines and Model-free learning
Poster: Learning to Reach Goals via Iterated Supervised Learning
Poster: Learning Accurate Long-term Dynamics for Model-based Reinforcement Learning
Poster: Modular Training, Integrated Planning Deep Reinforcement Learning for Mobile Robot Navigation
Poster: Preventing Value Function Collapse in Ensemble Q-Learning by Maximizing Representation Diversity
Poster: Diverse Exploration via InfoMax Options
Poster: Tonic: A Deep Reinforcement Learning Library for Fast Prototyping and Benchmarking
Poster: Lyapunov Barrier Policy Optimization
Poster: Explanation Augmented Feedback in Human-in-the-Loop Reinforcement Learning
Poster: FactoredRL: Leveraging Factored Graphs for Deep Reinforcement Learning
Poster: SCC: an efficient deep reinforcement learning agent mastering the game of StarCraft II
Poster: Parrot: Data-driven Behavioral Priors for Reinforcement Learning
Poster: C-Learning: Learning to Achieve Goals via Recursive Classification
Poster: Maximum Mutation Reinforcement Learning for Scalable Control
Poster: A Policy Gradient Method for Task-Agnostic Exploration
Poster: Evolving Reinforcement Learning Algorithms
Poster: Learning Functionally Decomposed Hierarchies for Continuous Control Tasks with Path Planning
Poster: Sample Efficient Training in Multi-Agent Adversarial Games with Limited Teammate Communication
Poster: A Deep Value-based Policy Search Approach for Real-world Vehicle Repositioning on Mobility-on-Demand Platforms
Poster: Parameter-based Value Functions
Poster: Bringing order into Actor-Critic Algorithms using Stackelberg Games
Poster: Skill Transfer via Partially Amortized Hierarchical Planning
Poster: XLVIN: eXecuted Latent Value Iteration Nets
Poster: Latent State Models for Meta-Reinforcement Learning from Images
Poster: Reinforcement Learning for Sparse-Reward Object-Interaction Tasks in First-person Simulated 3D Environments
Poster: Curriculum Learning through Distilled Discriminators
Poster: Decoupling Representation Learning from Reinforcement Learning
Poster: Amortized Variational Deep Q Network
Poster: Abstract Value Iteration for Hierarchical Deep Reinforcement Learning
Poster: Average Reward Reinforcement Learning with Monotonic Policy Improvement
Poster: Harnessing Distribution Ratio Estimators for Learning Agents with Quality and Diversity
Poster: Contrastive Behavioral Similarity Embeddings for Generalization in Reinforcement Learning
Poster: Model-Based Visual Planning with Self-Supervised Functional Distances
Poster: Adversarial Environment Generation for Learning to Navigate the Web
Poster: On Effective Parallelization of Monte Carlo Tree Search
Poster: Emergent Road Rules In Multi-Agent Driving Environments
Poster: Motion Planner Augmented Reinforcement Learning for Robot Manipulation in Obstructed Environments
Poster: What Matters for On-Policy Deep Actor-Critic Methods? A Large-Scale Study
Poster: Utilizing Skipped Frames in Action Repeats via Pseudo-Actions
Poster: Off-Dynamics Reinforcement Learning: Training for Transfer with Domain Classifiers
Poster: Action and Perception as Divergence Minimization
Poster: PixL2R: Guiding Reinforcement Learning Using Natural Language by Mapping Pixels to Rewards
Poster: Targeted Query-based Action-Space Adversarial Policies on Deep Reinforcement Learning Agents
Poster: Disentangled Planning and Control in Vision Based Robotics via Reward Machines
Poster: Hyperparameter Auto-tuning in Self-Supervised Robotic Learning
Poster: Policy Guided Planning in Learned Latent Space
Poster: Model-Based Meta-Reinforcement Learning for Flight with Suspended Payloads
Poster: Learning Intrinsic Symbolic Rewards in Reinforcement Learning
Poster: GRAC: Self-Guided and Self-Regularized Actor-Critic
Poster: DisCo RL: Distribution-Conditioned Reinforcement Learning for General-Purpose Policies
Poster: Structure and randomness in planning and reinforcement learning
Poster: Trust, but verify: model-based exploration in sparse reward environments
Poster: Learning to Represent Action Values as a Hypergraph on the Action Vertices
Poster: Value Generalization among Policies: Improving Value Function with Policy Representation
Poster: Unsupervised Domain Adaptation for Visual Navigation
Poster: Data-Efficient Reinforcement Learning with Self-Predictive Representations
Poster: Inter-Level Cooperation in Hierarchical Reinforcement Learning
Poster: Safe Reinforcement Learning with Natural Language Constraints
Poster: Multi-Agent Option Critic Architecture
Poster: Chaining Behaviors from Data with Model-Free Reinforcement Learning
Poster: An Examination of Preference-based Reinforcement Learning for Treatment Recommendation
Poster: Addressing reward bias in Adversarial Imitation Learning with neutral reward functions
Poster: Unsupervised Task Clustering for Multi-Task Reinforcement Learning
Poster: Policy Learning Using Weak Supervision
Poster: Optimizing Memory Placement using Evolutionary Graph Reinforcement Learning
Poster: Efficient Competitive Self-Play Policy Optimization
Poster: Beyond Exponentially Discounted Sum: Automatic Learning of Return Function
Poster: Towards Effective Context for Meta-Reinforcement Learning: an Approach based on Contrastive Learning
Poster: Variational Empowerment as Representation Learning for Goal-Based Reinforcement Learning
Poster: Influence-aware Memory for Deep Reinforcement Learning in POMDPs
Poster: Measuring Visual Generalization in Continuous Control from Pixels
Poster: R-LAtte: Visual Control via Deep Reinforcement Learning with Attention Network
Poster: OPAL: Offline Primitive Discovery for Accelerating Offline Reinforcement Learning
Poster: Revisiting Rainbow: Promoting more insightful and inclusive deep reinforcement learning research
Poster: XT2: Training an X-to-Text Typing Interface with Online Learning from Implicit Feedback
Poster: Self-Supervised Policy Adaptation during Deployment
Poster: Model-based Navigation in Environments with Novel Layouts Using Abstract $n$-D Maps
Poster: Provably Efficient Policy Optimization via Thompson Sampling
Poster: Discovery of Options via Meta-Gradients
Poster: Weighted Bellman Backups for Improved Signal-to-Noise in Q-Updates
Poster: Causal Curiosity: RL Agents Discovering Self-supervised Experiments for Causal Representation Learning
Poster: Energy-based Surprise Minimization for Multi-Agent Value Factorization
Poster: Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms
Poster: An Algorithmic Causal Model of Credit Assignment in Reinforcement Learning
Poster: Pairwise Weights for Temporal Credit Assignment
Poster: Multi-task Reinforcement Learning with a Planning Quasi-Metric
Poster: Reinforcement Learning with Latent Flow
Poster: Which Mutual-Information Representation Learning Objectives are Sufficient for Control?
Poster: Successor Landmarks for Efficient Exploration and Long-Horizon Navigation
Poster: Backtesting Optimal Trade Execution Policies in Agent-Based Market Simulator
Poster: Deep Q-Learning with Low Switching Cost
Poster: Learning Markov State Abstractions for Deep Reinforcement Learning
Poster: Robust Domain Randomised Reinforcement Learning through Peer-to-Peer Distillation
Poster: A Variational Inference Perspective on Goal-Directed Behavior in Reinforcement Learning
Poster: BeBold: Exploration Beyond the Boundary of Explored Regions
Poster: Visual Imitation with Reinforcement Learning using Recurrent Siamese Networks
Poster: Maximum Reward Formulation In Reinforcement Learning
Poster: Planning from Pixels using Inverse Dynamics Models
Poster: Shortest-Path Constrained Reinforcement Learning for Sparse Reward Tasks
Poster: Super-Human Performance in Gran Turismo Sport Using Deep Reinforcement Learning
Poster: Asymmetric self-play for automatic goal discovery in robotic manipulation
Poster: DERAIL: Diagnostic Environments for Reward And Imitation Learning
Poster: MaxEnt RL and Robust Control
Poster: Solving Compositional Reinforcement Learning Problems via Task Reduction
Poster: Unlocking the Potential of Deep Counterfactual Value Networks
Poster: Addressing Distribution Shift in Online Reinforcement Learning with Offline Datasets
Poster: Predictive PER: Balancing Priority and Diversity towards Stable Deep Reinforcement Learning
Poster: Evaluating Agents Without Rewards
Poster: ReaPER: Improving Sample Efficiency in Model-Based Latent Imagination
Poster: Compute- and Memory-Efficient Reinforcement Learning with Latent Experience Replay
Poster: Optimizing Traffic Bottleneck Throughput using Cooperative, Decentralized Autonomous Vehicles
Poster: TACTO: A Simulator for Learning Control from Touch Sensing
Poster: Mastering Atari with Discrete World Models
Poster: Learning to Sample with Local and Global Contexts in Experience Replay Buffer
Poster: C-Learning: Horizon-Aware Cumulative Accessibility Estimation
Poster: AWAC: Accelerating Online Reinforcement Learning With Offline Datasets
Poster: Understanding Learned Reward Functions
Poster: Correcting Momentum in Temporal Difference Learning
Poster: Goal-Conditioned Reinforcement Learning in the Presence of an Adversary
Poster: Reset-Free Lifelong Learning with Skill-Space Planning
Poster: Model-Based Reinforcement Learning: A Compressed Survey
Poster: Mirror Descent Policy Optimization
Poster: Learning Latent Landmarks for Generalizable Planning
Poster: Reusability and Transferability of Macro Actions for Reinforcement Learning
Poster: Quantifying Differences in Reward Functions
Poster: Learning to Weight Imperfect Demonstrations