Safe reinforcement learning (RL) trains a policy to maximize the task reward while satisfying safety constraints. While prior works focus on performance optimality, we find that the optimal solutions of many safe RL problems are neither robust nor safe against observational perturbations. We formally analyze the unique properties of designing effective state adversarial attackers in the safe RL setting. We show that baseline adversarial attack techniques for standard RL tasks are not always effective for safe RL, and we propose two new approaches: one maximizes the cost and the other maximizes the reward. One interesting and counter-intuitive finding is that the maximum reward attack is strong, as it can both induce unsafe behaviors and make the attack stealthy by maintaining the reward. We further propose a more effective adversarial training framework for safe RL and evaluate it via comprehensive experiments (video demos are available at https://sites.google.com/view/robustsaferl/home). This paper provides pioneering work investigating the safety and robustness of RL under observational attacks, as a basis for future safe RL studies.
Author Information
ZUXIN LIU (Carnegie Mellon University)
Zijian Guo (Carnegie Mellon University)
Zhepeng Cen (Carnegie Mellon University)
Huan Zhang (CMU)
Jie Tan (Google)
Bo Li (UIUC)
DING ZHAO (Carnegie Mellon University)
More from the Same Authors
-
2021 : Adversarial GLUE: A Multi-Task Benchmark for Robustness Evaluation of Language Models »
Boxin Wang · Chejian Xu · Shuohang Wang · Zhe Gan · Yu Cheng · Jianfeng Gao · Ahmed Awadallah · Bo Li -
2021 : MESA: Offline Meta-RL for Safe Adaptation and Fault Tolerance »
Michael Luo · Ashwin Balakrishna · Brijen Thananjeyan · Suraj Nair · Julian Ibarz · Jie Tan · Chelsea Finn · Ion Stoica · Ken Goldberg -
2021 : Certified Robustness for Free in Differentially Private Federated Learning »
Chulin Xie · Yunhui Long · Pin-Yu Chen · Krishnaram Kenthapadi · Bo Li -
2021 : RVFR: Robust Vertical Federated Learning via Feature Subspace Recovery »
Jing Liu · Chulin Xie · Krishnaram Kenthapadi · Sanmi Koyejo · Bo Li -
2021 : What Would Jiminy Cricket Do? Towards Agents That Behave Morally »
Dan Hendrycks · Mantas Mazeika · Andy Zou · Sahil Patel · Christine Zhu · Jesus Navarro · Dawn Song · Bo Li · Jacob Steinhardt -
2022 Poster: VF-PS: How to Select Important Participants in Vertical Federated Learning, Efficiently and Securely? »
Jiawei Jiang · Lukas Burkhalter · Fangcheng Fu · Bolin Ding · Bo Du · Anwar Hithnawi · Bo Li · Ce Zhang -
2022 : Improving Vertical Federated Learning by Efficient Communication with ADMM »
Chulin Xie · Pin-Yu Chen · Ce Zhang · Bo Li -
2022 : Hyper-Decision Transformer for Efficient Online Policy Adaptation »
Mengdi Xu · Yuchen Lu · Yikang Shen · Shun Zhang · DING ZHAO · Chuang Gan -
2022 : Benchmarking Robustness under Distribution Shift of Multimodal Image-Text Models »
Jielin Qiu · Yi Zhu · Xingjian Shi · Zhiqiang Tang · DING ZHAO · Bo Li · Mu Li -
2022 : Denoised Smoothing with Sample Rejection for Robustifying Pretrained Classifiers »
Fatemeh Sheikholeslami · Wan-Yi Lin · Jan Hendrik Metzen · Huan Zhang · J. Zico Kolter -
2022 : DensePure: Understanding Diffusion Models towards Adversarial Robustness »
Zhongzhu Chen · Kun Jin · Jiongxiao Wang · Weili Nie · Mingyan Liu · Anima Anandkumar · Bo Li · Dawn Song -
2022 : Fifteen-minute Competition Overview Video »
Nathan Drenkow · Raman Arora · Gino Perrotta · Todd Neller · Ryan Gardner · Mykel J Kochenderfer · Jared Markowitz · Corey Lowman · Casey Richardson · Bo Li · Bart Paulhamus · Ashley J Llorens · Andrew Newman -
2022 : Learning Semantics-Aware Locomotion Skills from Human Demonstrations »
Yuxiang Yang · Xiangyun Meng · Wenhao Yu · Tingnan Zhang · Jie Tan · Byron Boots -
2022 : Evaluating Worst Case Adversarial Weather Perturbations Robustness »
Yihan Wang · Yunhao Ba · Howard Zhang · Huan Zhang · Achuta Kadambi · Stefano Soatto · Alex Wong · Cho-Jui Hsieh -
2022 : Closing Remarks »
Huan Zhang · Linyi Li -
2022 : Panel Discussion »
Kamalika Chaudhuri · Been Kim · Dorsa Sadigh · Huan Zhang · Linyi Li -
2022 : Contributed Talk: DensePure: Understanding Diffusion Models towards Adversarial Robustness »
Zhongzhu Chen · Kun Jin · Jiongxiao Wang · Weili Nie · Mingyan Liu · Anima Anandkumar · Bo Li · Dawn Song -
2022 Workshop: Trustworthy and Socially Responsible Machine Learning »
Huan Zhang · Linyi Li · Chaowei Xiao · J. Zico Kolter · Anima Anandkumar · Bo Li -
2022 : Introduction and Opening Remarks »
Huan Zhang · Linyi Li -
2022 Spotlight: Fairness in Federated Learning via Core-Stability »
Bhaskar Ray Chaudhury · Linyi Li · Mintong Kang · Bo Li · Ruta Mehta -
2022 Competition: The Trojan Detection Challenge »
Mantas Mazeika · Dan Hendrycks · Huichen Li · Xiaojun Xu · Andy Zou · Sidney Hough · Arezoo Rajabi · Dawn Song · Radha Poovendran · Bo Li · David Forsyth -
2022 Spotlight: LOT: Layer-wise Orthogonal Training on Improving l2 Certified Robustness »
Xiaojun Xu · Linyi Li · Bo Li -
2022 Spotlight: Lightning Talks 5B-1 »
Devansh Arpit · Xiaojun Xu · Zifan Shi · Ivan Skorokhodov · Shayan Shekarforoush · Zhan Tong · Yiqun Wang · Shichong Peng · Linyi Li · Ivan Skorokhodov · Huan Wang · Yibing Song · David Lindell · Yinghao Xu · Seyed Alireza Moazenipourasil · Sergey Tulyakov · Peter Wonka · Yiqun Wang · Ke Li · David Fleet · Yujun Shen · Yingbo Zhou · Bo Li · Jue Wang · Peter Wonka · Marcus Brubaker · Caiming Xiong · Limin Wang · Deli Zhao · Qifeng Chen · Dit-Yan Yeung -
2022 Competition: Reconnaissance Blind Chess: An Unsolved Challenge for Multi-Agent Decision Making Under Uncertainty »
Ryan Gardner · Gino Perrotta · Corey Lowman · Casey Richardson · Andrew Newman · Jared Markowitz · Nathan Drenkow · Bart Paulhamus · Ashley J Llorens · Todd Neller · Raman Arora · Bo Li · Mykel J Kochenderfer -
2022 Spotlight: Certifying Some Distributional Fairness with Subpopulation Decomposition »
Mintong Kang · Linyi Li · Maurice Weber · Yang Liu · Ce Zhang · Bo Li -
2022 Spotlight: Lightning Talks 1A-4 »
Siwei Wang · Jing Liu · Nianqiao Ju · Shiqian Li · Eloïse Berthier · Muhammad Faaiz Taufiq · Arsene Fansi Tchango · Chen Liang · Chulin Xie · Jordan Awan · Jean-Francois Ton · Ziad Kobeissi · Wenguan Wang · Xinwang Liu · Kewen Wu · Rishab Goel · Jiaxu Miao · Suyuan Liu · Julien Martel · Ruobin Gong · Francis Bach · Chi Zhang · Rob Cornish · Sanmi Koyejo · Zhi Wen · Yee Whye Teh · Yi Yang · Jiaqi Jin · Bo Li · Yixin Zhu · Vinayak Rao · Wenxuan Tu · Gaetan Marceau Caron · Arnaud Doucet · Xinzhong Zhu · Joumana Ghosn · En Zhu -
2022 Spotlight: Lightning Talks 1A-3 »
Kimia Noorbakhsh · Ronan Perry · Qi Lyu · Jiawei Jiang · Christian Toth · Olivier Jeunen · Xin Liu · Yuan Cheng · Lei Li · Manuel Rodriguez · Julius von Kügelgen · Lars Lorch · Nicolas Donati · Lukas Burkhalter · Xiao Fu · Zhongdao Wang · Songtao Feng · Ciarán Gilligan-Lee · Rishabh Mehrotra · Fangcheng Fu · Jing Yang · Bernhard Schölkopf · Ya-Li Li · Christian Knoll · Maks Ovsjanikov · Andreas Krause · Shengjin Wang · Hong Zhang · Mounia Lalmas · Bolin Ding · Bo Du · Yingbin Liang · Franz Pernkopf · Robert Peharz · Anwar Hithnawi · Julius von Kügelgen · Bo Li · Ce Zhang -
2022 Spotlight: VF-PS: How to Select Important Participants in Vertical Federated Learning, Efficiently and Securely? »
Jiawei Jiang · Lukas Burkhalter · Fangcheng Fu · Bolin Ding · Bo Du · Anwar Hithnawi · Bo Li · Ce Zhang -
2022 Spotlight: CoPur: Certifiably Robust Collaborative Inference via Feature Purification »
Jing Liu · Chulin Xie · Sanmi Koyejo · Bo Li -
2022 : Panel »
Pin-Yu Chen · Alex Gittens · Bo Li · Celia Cintas · Hilde Kuehne · Payel Das -
2022 : Trustworthy Machine Learning in Autonomous Driving »
Bo Li -
2022 Workshop: Decentralization and Trustworthy Machine Learning in Web3: Methodologies, Platforms, and Applications »
Jian Lou · Zhiguang Wang · Chejian Xu · Bo Li · Dawn Song -
2022 : Invited Talk #5, Privacy-Preserving Data Synthesis for General Purposes, Bo Li »
Bo Li -
2022 : Fairness Panel »
Freedom Gumedze · Rachel Cummings · Bo Li · Robert Tillman · Edward Choi -
2022 : Trustworthy Federated Learning »
Bo Li -
2022 Poster: Improving Certified Robustness via Statistical Learning with Logical Reasoning »
Zhuolin Yang · Zhikuan Zhao · Boxin Wang · Jiawei Zhang · Linyi Li · Hengzhi Pei · Bojan Karlaš · Ji Liu · Heng Guo · Ce Zhang · Bo Li -
2022 Poster: Untargeted Backdoor Watermark: Towards Harmless and Stealthy Dataset Copyright Protection »
Yiming Li · Yang Bai · Yong Jiang · Yong Yang · Shu-Tao Xia · Bo Li -
2022 Poster: Fairness in Federated Learning via Core-Stability »
Bhaskar Ray Chaudhury · Linyi Li · Mintong Kang · Bo Li · Ruta Mehta -
2022 Poster: Generalizing Goal-Conditioned Reinforcement Learning with Variational Causal Reasoning »
Wenhao Ding · Haohong Lin · Bo Li · DING ZHAO -
2022 Poster: Certifying Some Distributional Fairness with Subpopulation Decomposition »
Mintong Kang · Linyi Li · Maurice Weber · Yang Liu · Ce Zhang · Bo Li -
2022 Poster: LOT: Layer-wise Orthogonal Training on Improving l2 Certified Robustness »
Xiaojun Xu · Linyi Li · Bo Li -
2022 Poster: Curriculum Reinforcement Learning using Optimal Transport via Gradual Domain Adaptation »
Peide Huang · Mengdi Xu · Jiacheng Zhu · Laixi Shi · Fei Fang · DING ZHAO -
2022 Poster: Efficiently Computing Local Lipschitz Constants of Neural Networks via Bound Propagation »
Zhouxing Shi · Yihan Wang · Huan Zhang · J. Zico Kolter · Cho-Jui Hsieh -
2022 Poster: CoPur: Certifiably Robust Collaborative Inference via Feature Purification »
Jing Liu · Chulin Xie · Sanmi Koyejo · Bo Li -
2022 Poster: Are AlphaZero-like Agents Robust to Adversarial Perturbations? »
Li-Cheng Lan · Huan Zhang · Ti-Rong Wu · Meng-Yu Tsai · I-Chen Wu · Cho-Jui Hsieh -
2022 Poster: Exploring the Limits of Domain-Adaptive Training for Detoxifying Large-Scale Language Models »
Boxin Wang · Wei Ping · Chaowei Xiao · Peng Xu · Mostofa Patwary · Mohammad Shoeybi · Bo Li · Anima Anandkumar · Bryan Catanzaro -
2022 Poster: SafeBench: A Benchmarking Platform for Safety Evaluation of Autonomous Vehicles »
Chejian Xu · Wenhao Ding · Weijie Lyu · ZUXIN LIU · Shuai Wang · Yihan He · Hanjiang Hu · DING ZHAO · Bo Li -
2022 Poster: General Cutting Planes for Bound-Propagation-Based Neural Network Verification »
Huan Zhang · Shiqi Wang · Kaidi Xu · Linyi Li · Bo Li · Suman Jana · Cho-Jui Hsieh · J. Zico Kolter -
2021 : Career and Life: Panel Discussion - Bo Li, Adriana Romero-Soriano, Devi Parikh, and Emily Denton »
Emily Denton · Devi Parikh · Bo Li · Adriana Romero -
2021 : Live Q&A with Bo Li »
Bo Li -
2021 : Invited talk – Trustworthy Machine Learning via Logic Inference, Bo Li »
Bo Li -
2021 Poster: G-PATE: Scalable Differentially Private Data Generator via Private Aggregation of Teacher Discriminators »
Yunhui Long · Boxin Wang · Zhuolin Yang · Bhavya Kailkhura · Aston Zhang · Carl Gunter · Bo Li -
2021 Poster: Anti-Backdoor Learning: Training Clean Models on Poisoned Data »
Yige Li · Xixiang Lyu · Nodens Koren · Lingjuan Lyu · Bo Li · Xingjun Ma -
2021 Poster: Adversarial Attack Generation Empowered by Min-Max Optimization »
Jingkang Wang · Tianyun Zhang · Sijia Liu · Pin-Yu Chen · Jiacen Xu · Makan Fardad · Bo Li -
2021 : Reconnaissance Blind Chess + Q&A »
Ryan Gardner · Gino Perrotta · Corey Lowman · Casey Richardson · Andrew Newman · Jared Markowitz · Nathan Drenkow · Bart Paulhamus · Ashley J Llorens · Todd Neller · Raman Arora · Bo Li · Mykel J Kochenderfer -
2021 Poster: TRS: Transferability Reduced Ensemble via Promoting Gradient Diversity and Model Smoothness »
Zhuolin Yang · Linyi Li · Xiaojun Xu · Shiliang Zuo · Qian Chen · Pan Zhou · Benjamin Rubinstein · Ce Zhang · Bo Li -
2020 Workshop: Workshop on Dataset Curation and Security »
Nathalie Baracaldo · Yonatan Bisk · Avrim Blum · Michael Curry · John Dickerson · Micah Goldblum · Tom Goldstein · Bo Li · Avi Schwarzschild -
2020 Poster: Automatic Perturbation Analysis for Scalable Certified Robustness and Beyond »
Kaidi Xu · Zhouxing Shi · Huan Zhang · Yihan Wang · Kai-Wei Chang · Minlie Huang · Bhavya Kailkhura · Xue Lin · Cho-Jui Hsieh -
2020 Poster: Robust Deep Reinforcement Learning against Adversarial Perturbations on State Observations »
Huan Zhang · Hongge Chen · Chaowei Xiao · Bo Li · Mingyan Liu · Duane Boning · Cho-Jui Hsieh -
2020 Spotlight: Robust Deep Reinforcement Learning against Adversarial Perturbations on State Observations »
Huan Zhang · Hongge Chen · Chaowei Xiao · Bo Li · Mingyan Liu · Duane Boning · Cho-Jui Hsieh -
2020 Poster: An Efficient Adversarial Attack for Tree Ensembles »
Chong Zhang · Huan Zhang · Cho-Jui Hsieh -
2020 Poster: Task-Agnostic Online Reinforcement Learning with an Infinite Mixture of Gaussian Processes »
Mengdi Xu · Wenhao Ding · Jiacheng Zhu · ZUXIN LIU · Baiming Chen · Ding Zhao -
2020 Poster: On Convergence of Nearest Neighbor Classifiers over Feature Transformations »
Luka Rimanic · Cedric Renggli · Bo Li · Ce Zhang