Workshop
Sat Dec 08 05:00 AM -- 03:30 PM (PST) @ Room 517 C
Reinforcement Learning under Partial Observability
Joni Pajarinen · Chris Amato · Pascal Poupart · David Hsu

Reinforcement learning (RL) has succeeded in many challenging tasks such as Atari, Go, and Chess, and even in high-dimensional continuous domains such as robotics. The most impressive successes are in tasks where the agent observes the task features fully. However, in real-world problems the agent usually has to rely on partial observations. In real-time games the agent makes only local observations; in robotics the agent has to cope with noisy sensors, occlusions, and unknown dynamics. Even more fundamentally, any agent without a full a priori world model, or without full access to the system state, has to make decisions based on partial knowledge about the environment and its dynamics.

Reinforcement learning under partial observability has been tackled in the operations research, control, planning, and machine learning communities. One of the goals of the workshop is to bring researchers from these different backgrounds together. Moreover, the workshop aims to highlight future applications. Beyond robotics, where partial observability is a well-known challenge, diverse applications such as wireless networking, human-robot interaction, and autonomous driving require taking partial observability into account.

Partial observability introduces unique challenges: the agent has to remember the past but also connect the present with potential futures, requiring memory, exploration, and value-propagation techniques that can handle partial observability. Current model-based methods can handle discrete values and take long-term information gathering into account. Model-free methods, in contrast, can handle high-dimensional continuous problems, but they often assume that the state space has been engineered for the problem at hand so that it contains sufficient information for optimal decision making, or they simply add memory to the policy without taking partial observability explicitly into account.
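As a rough illustration of the model-based view mentioned above, the minimal sketch below maintains a belief over discrete states and updates it with Bayes' rule after each action and observation. It is an assumption-laden example written for this page, not code from any workshop contribution; the tiger-style model, the T/Z arrays, and the belief_update helper are hypothetical names chosen for illustration.

    import numpy as np

    # Minimal discrete POMDP belief update (Bayes filter), assuming a known model.
    #   T[a][s, s'] : probability of reaching s' from s under action a
    #   Z[a][s', o] : probability of observing o in s' after taking action a
    #   belief[s]   : current probability that the true state is s
    # Hypothetical example code; not from the workshop itself.

    def belief_update(belief, action, observation, T, Z):
        """Return the posterior belief after taking `action` and seeing `observation`."""
        # Predict: push the current belief through the transition model.
        predicted = belief @ T[action]
        # Correct: weight each predicted state by the likelihood of the observation.
        posterior = predicted * Z[action][:, observation]
        norm = posterior.sum()
        if norm == 0.0:
            raise ValueError("Observation has zero probability under the model.")
        return posterior / norm

    # Tiny two-state example (e.g., "tiger left" / "tiger right").
    num_states = 2
    T = {0: np.eye(num_states)}                 # action 0: "listen", state unchanged
    Z = {0: np.array([[0.85, 0.15],             # the noisy sensor is right 85% of the time
                      [0.15, 0.85]])}

    belief = np.array([0.5, 0.5])               # start fully uncertain
    belief = belief_update(belief, action=0, observation=0, T=T, Z=Z)
    print(belief)                               # belief shifts toward state 0: [0.85 0.15]

The point of the sketch is that memory under partial observability need not be an opaque recurrent state: with a model, the belief itself is a sufficient statistic of the history, which is exactly what model-free methods give up when they only bolt memory onto the policy.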

In this workshop, we want to go further and ask, among others, the following questions:
* How can we extend deep RL methods to robustly solve partially observable problems?
* Can we learn concise abstractions of history that are sufficient for high-quality decision-making?
* There have been several successes in decision making under partial observability despite the inherent challenges. Can we characterize problems where computing good policies is feasible?
* Since decision making is hard under partial observability, do we want to use more complex models and solve them approximately, use (inaccurate) simple models and solve them exactly, or not use models at all?
* How can we use control theory together with reinforcement learning to advance decision making under partial observability?
* Can we combine the strengths of model-based and model-free methods under partial observability?
* Can recent methodological improvements in general RL already tackle partially observable applications that were previously out of reach?
* How do we scale up reinforcement learning in multi-agent systems with partial observability?
* Do hierarchical models / temporal abstraction improve RL efficiency under partial observability?

Opening Remarks
Joelle Pineau (Talk)
Leslie Kaelbling (Talk)
Contributed Talk 1: High-Level Strategy Selection under Partial Observability in StarCraft: Brood War (Talk)
David Silver (Talk)
Contributed Talk 2: Joint Belief Tracking and Reward Optimization through Approximate Inference (Talk)
Contributed Talk 3: Learning Dexterous In-Hand Manipulation (Talk)
Pieter Abbeel (Talk)
Spotlights & Poster Session (Spotlights)
Peter Stone (Talk)
Contributed Talk 4: Differentiable Algorithm Networks: Learning Wrong Models for Wrong Algorithms (Talk)
Jilles Dibangoye (Talk)
Anca Dragan (Talk)
Panel Discussion
Poster Session