SCERL: A Benchmark for intersecting language and safe reinforcement learning
Lan Hoang · Shivam Ratnakar · Nicolas Galichet · Akifumi Wachi · Keerthiram Murugesan · Songtao Lu · Mattia Atzeni · Michael Katz · Subhajit Chaudhury
Event URL: https://openreview.net/forum?id=rNmrhsewsUX

Safety and robustness are a critical focus for AI research. Two lines of research have so far remained distinct: (i) safe reinforcement learning, where an agent needs to interact with the world under safety constraints, and (ii) textual reinforcement learning, where agents need to perform robust reasoning and modelling of the state of the environment. In this paper, we propose Safety-Constrained Environments for Reinforcement Learning (SCERL), a benchmark to bridge the gap between these two research directions. The benchmark contributes safety-relevant environments with (i) a sample set of 20 games built on new logical rules to represent physical safety issues; (ii) added monitoring of safety violations; and (iii) a mechanism to generate a more diverse set of games with safety constraints and their corresponding metrics of safety types and difficulties. This paper shows selected baseline results on the benchmark. Our aim is for the SCERL benchmark and its flexible framework to provide a set of tasks that demonstrate language-based safety challenges and inspire the research community to further explore safety applications in a text-based domain.

Author Information

Lan Hoang (IBM Research UK)

My research interests are Deep Reinforcement Learning, GIS, decision support systems, interdependencies of complex systems, agent-based modelling and uncertainty analysis. My focus is to create applied research outputs that can address industry's needs. I have a background in Physical Geography and Environmental Sciences, in particular decision making under climate change impacts, hydrology, water management and GIS applications for environmental management.

Shivam Ratnakar (International Business Machines)
Nicolas Galichet
Akifumi Wachi (IBM Research)
Keerthiram Murugesan (IBM Research)
Songtao Lu (IBM Thomas J. Watson Research Center)
Mattia Atzeni (Swiss Federal Institute of Technology Lausanne)
Michael Katz (IBM Research)
Subhajit Chaudhury (International Business Machines)

More from the Same Authors