As reinforcement learning agents become increasingly integrated into complex, real-world environments, designing for safety becomes a critical consideration. We introduce Safety Aware Reinforcement Learning (SARL), a framework in which a virtual safe agent modulates the actions of a main reward-based agent to minimize side effects. The safe agent learns a task-independent notion of safety for a given environment. The main agent is then trained with a regularization loss given by the distance between the native action probabilities of the two agents. Because the safe agent's notion of safety is task-independent, it can be ported to modulate multiple policies solving different tasks within the given environment without further training. We contrast this with solutions that rely on task-specific regularization metrics and test our framework on the SafeLife Suite, based on Conway's Game of Life, which comprises a number of complex tasks in dynamic environments.
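The regularization described in the abstract can be illustrated with a minimal sketch. The abstract does not say which distance is used, so the Jensen-Shannon divergence here is an assumption chosen for illustration. The function names `js_divergence` and `regularized_loss` and the weight `beta` are likewise hypothetical and not taken from the paper.

```python
import math
from typing import Sequence


def js_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """Jensen-Shannon divergence (natural log) between two discrete distributions.

    Assumption: used here only as one possible distance between action
    probabilities; the abstract does not specify the metric.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of actions")

    # Mixture distribution m = (p + q) / 2.
    m = [(pi + qi) / 2.0 for pi, qi in zip(p, q)]

    def kl(a: Sequence[float], b: Sequence[float]) -> float:
        # Terms with a_i == 0 contribute 0; b_i > 0 whenever a_i > 0 because b is the mixture.
        return sum(ai * math.log(ai / bi) for ai, bi in zip(a, b) if ai > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def regularized_loss(
    task_loss: float,
    main_probs: Sequence[float],
    safe_probs: Sequence[float],
    beta: float = 0.1,  # hypothetical regularization weight
) -> float:
    """Task loss of the main agent plus a penalty for deviating from the safe agent."""
    return task_loss + beta * js_divergence(main_probs, safe_probs)


# Example: the main agent prefers action 0, the safe agent prefers action 1.
loss = regularized_loss(1.0, [0.9, 0.1], [0.2, 0.8], beta=0.5)
```

In this sketch the penalty is zero when the two agents assign identical action probabilities and grows as they diverge, so `beta` trades off task reward against agreement with the safe agent.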
Author Information
Santiago Miret (Intel AI Lab)
More from the Same Authors
- 2022 : Multi-Objective GFlowNets
  Moksh Jain · Sharath Chandra Raparthy · Alex Hernandez-Garcia · Jarrid Rector-Brooks · Yoshua Bengio · Santiago Miret · Emmanuel Bengio
- 2022 : On Multi-information source Constraint Active Search
  Gustavo Malkomes · Bolong Cheng · Santiago Miret
- 2022 : PhAST: Physics-Aware, Scalable, and Task-specific GNNs for accelerated catalyst design
  ALEXANDRE DUVAL · Victor Schmidt · Alex Hernandez-Garcia · Santiago Miret · Yoshua Bengio · David Rolnick
- 2022 : Human-in-the-Loop Approaches For Task Guidance In Manufacturing Settings
  Ramesh Manuvinakurike · Santiago Miret · Richard Beckwith · Saurav Sahay · Giuseppe Raffa
- 2022 : Group SELFIES: A Robust Fragment-Based Molecular String Representation
  Austin Cheng · Andy Cai · Santiago Miret · Gustavo Malkomes · Mariano Phielipp · Alan Aspuru-Guzik
- 2022 : Hyperparameter Optimization of Graph Neural Networks for the OpenCatalyst Dataset: A Case Study
  Carmelo Gonzales · Eric Lee · Kin Long Kelvin Lee · Joyce Tang · Santiago Miret
- 2022 : Conformer Search Using SE3-Transformers and Imitation Learning
  Luca Thiede · Santiago Miret · Krzysztof Sadowski · Haoping Xu · Mariano Phielipp · Alan Aspuru-Guzik
- 2022 : Open MatSci ML Toolkit: A Flexible Framework for Machine Learning in Materials Science
  Santiago Miret · Kin Long Kelvin Lee · Carmelo Gonzales · Marcel Nassar · Krzysztof Sadowski
- 2022 Workshop: AI for Accelerated Materials Design (AI4Mat)
  Santiago Miret · Marta Skreta · Zamyla Morgan-Chan · Benjamin Sanchez-Lengeling · Shyue Ping Ong · Alan Aspuru-Guzik
- 2021 : Neuroevolution-Enhanced Multi-Objective Optimization for Mixed-Precision Quantization
  Santiago Miret · Vui Seng Chua · Mattias Marder · Mariano Phielipp · Nilesh Jain · Somdeb Majumdar