NIPS Enabling Robots to Communicate Reward Functions, Sandy Huang, David Held, Pieter Abbeel and Anca Dragan

Paper Presentation in Workshop: The Future of Interactive Machine Learning

2016 Paper Presentation

Abstract:

Understanding a robot's reward function is key to anticipating how the robot will act in a new situation. Our goal is to generate a set of robot behaviors that best illustrates a robot's reward function. We build on prior work modeling inference of the reward function from example behavior via Inverse Reinforcement Learning (IRL). Prior work using IRL has focused on people teaching machines and assumes exact inference. Our insight is that people, when they are the ones being taught, will not perform exact inference. We show that while leveraging models of noisy inference can be beneficial, it is also important to achieve coverage in the space of possible strategies the robot can use. We introduce a hybrid algorithm that targets informative examples via both a noisy inference model and coverage.
