Safe Reinforcement Learning with Natural Language Constraints
Jimmy Yang · Michael Y Hu · Yinlam Chow · Peter J Ramadge · Karthik Narasimhan

While safe reinforcement learning (RL) holds great promise for many practical applications like robotics or autonomous cars, current approaches require specifying constraints in mathematical form. Such specifications demand domain expertise, limiting the adoption of safe RL. In this paper, we propose learning to interpret natural language constraints for safe RL. To this end, we first introduce HAZARDWORLD, a new multi-task benchmark that requires an agent to optimize reward while not violating constraints specified in free-form text. We then develop an agent with a modular architecture that can interpret and adhere to such textual constraints while learning new tasks. Our model consists of (1) a constraint interpreter that encodes textual constraints into spatial and temporal representations of forbidden states, and (2) a policy network that uses these representations to produce a policy achieving minimal constraint violations during training. Across different domains in HAZARDWORLD, we show that our method achieves higher rewards (up to 11x) and fewer constraint violations (by 1.8x) compared to existing approaches. However, in terms of absolute performance, HAZARDWORLD still poses significant challenges for agents to learn efficiently, motivating the need for future work.
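
To make the two-module idea concrete, here is a minimal toy sketch, not the authors' implementation. It assumes the "spatial" representation can be pictured as a boolean mask over grid cells and the "temporal" representation as a budget of tolerated visits. The keyword-matching `interpret_constraint`, the cell types, and the helper names are all hypothetical stand-ins for the learned constraint interpreter described above; a policy network would consume the mask and budget rather than the raw text.

```python
import re

# Hypothetical cell types for a toy grid; not taken from HAZARDWORLD.
CELL_TYPES = ("lava", "water", "grass")
NUMBER_WORDS = {"once": 1, "twice": 2, "one": 1, "two": 2, "three": 3}


def interpret_constraint(text):
    """Toy keyword-based stand-in for a learned constraint interpreter.

    Returns (forbidden_types, budget): the cell types the text forbids and
    how many visits to them are tolerated before a violation is counted.
    """
    lowered = text.lower()
    forbidden = {cell for cell in CELL_TYPES if cell in lowered}
    budget = 0  # default: entering a forbidden cell is never allowed
    match = re.search(r"more than (\w+)", lowered)
    if match:
        word = match.group(1)
        budget = int(word) if word.isdigit() else NUMBER_WORDS.get(word, 0)
    return forbidden, budget


def forbidden_mask(grid, forbidden):
    """Spatial representation: True wherever a cell is forbidden."""
    return [[cell in forbidden for cell in row] for row in grid]


def count_violations(mask, trajectory, budget):
    """Count visits to forbidden cells beyond the tolerated budget."""
    visits = sum(1 for row, col in trajectory if mask[row][col])
    return max(0, visits - budget)


grid = [["grass", "lava"],
        ["water", "grass"]]
forbidden, budget = interpret_constraint("Do not step on lava more than twice.")
mask = forbidden_mask(grid, forbidden)
trajectory = [(0, 0), (0, 1), (0, 1), (0, 1), (1, 1)]
violations = count_violations(mask, trajectory, budget)
```

In this toy run the agent enters the lava cell three times against a budget of two, so one violation is counted. A safe RL agent would aim to keep that count at zero while still maximizing reward.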

Author Information

Jimmy Yang (Princeton University)

I am a graduate student in the Department of Electrical Engineering at Princeton University, where I have been working with Prof. Peter Ramadge and Prof. Karthik Narasimhan since September 2017. My research interests lie at the intersection of machine learning, reinforcement learning, and natural language processing. Specifically, I work on safe reinforcement learning, focusing on building autonomous systems that acquire knowledge by interacting with the world and on providing provable safety guarantees during training and deployment.

Michael Y Hu (Princeton University)
Yinlam Chow (Google Research)
Peter J Ramadge (Princeton University)
Karthik Narasimhan (Princeton University)
