We study black-box reward poisoning attacks against reinforcement learning (RL), in which an adversary manipulates the rewards to mislead a sequence of RL agents running unknown algorithms into learning a nefarious policy, in an environment that is unknown to the adversary a priori. That is, our attack makes minimal assumptions about the adversary's prior knowledge: the adversary has no initial knowledge of the environment or the learner, nor does it observe anything of the learner's internal mechanism beyond the actions it performs. We design a novel black-box attack, U2, that provably achieves performance nearly matching that of the state-of-the-art white-box attack, demonstrating the feasibility of reward poisoning even in the most challenging black-box setting.
Author Information
Amin Rakhsha (University of Toronto)
Xuezhou Zhang (Princeton)
Jerry Zhu (University of Wisconsin-Madison)
Adish Singla (MPI-SWS)
More from the Same Authors
-
2021 Spotlight: Neural Additive Models: Interpretable Machine Learning with Neural Nets »
Rishabh Agarwal · Levi Melnick · Nicholas Frosst · Xuezhou Zhang · Ben Lengerich · Rich Caruana · Geoffrey Hinton -
2021 : Game Redesign in No-regret Game Playing »
Yuzhe Ma · Young Wu · Jerry Zhu -
2021 : Poster: Fair Clustering Using Antidote Data »
Anshuman Chhabra · Adish Singla · Prasant Mohapatra -
2021 : Reinforcement Learning Under Algorithmic Triage »
Eleni Straitouri · Adish Singla · Vahid Balazadeh Meresht · Manuel Rodriguez -
2021 : Reward Poisoning in Reinforcement Learning: Attacks Against Unknown Learners in Unknown Environments »
Amin Rakhsha · Xuezhou Zhang · Jerry Zhu · Adish Singla -
2022 Poster: On Batch Teaching with Sample Complexity Bounded by VCD »
Farnam Mansouri · Hans Simon · Adish Singla · Sandra Zilles -
2023 Poster: Mechanism Design for Collaborative Normal Mean Estimation »
Yiding Chen · Jerry Zhu · Kirthevasan Kandasamy -
2023 Poster: Dream the Impossible: Outlier Imagination with Diffusion Models »
Xuefeng Du · Yiyou Sun · Jerry Zhu · Yixuan Li -
2023 Workshop: Generative AI for Education (GAIED): Advances, Opportunities, and Challenges »
Paul Denny · Sumit Gulwani · Neil Heffernan · Tanja Käser · Steven Moore · Anna Rafferty · Adish Singla -
2022 Spotlight: On Batch Teaching with Sample Complexity Bounded by VCD »
Farnam Mansouri · Hans Simon · Adish Singla · Sandra Zilles -
2022 Poster: Envy-free Policy Teaching to Multiple Agents »
Jiarui Gan · R Majumdar · Adish Singla · Goran Radanovic -
2022 Poster: Exploration-Guided Reward Shaping for Reinforcement Learning under Sparse Rewards »
Rati Devidze · Parameswaran Kamalaruban · Adish Singla -
2022 Poster: Provable Defense against Backdoor Policies in Reinforcement Learning »
Shubham Bharti · Xuezhou Zhang · Adish Singla · Jerry Zhu -
2022 Poster: Operator Splitting Value Iteration »
Amin Rakhsha · Andrew Wang · Mohammad Ghavamzadeh · Amir-massoud Farahmand -
2021 : Representation Learning for Online and Offline RL in Low-rank MDPs »
Masatoshi Uehara · Xuezhou Zhang · Wen Sun -
2021 : Fair Clustering Using Antidote Data »
Anshuman Chhabra · Adish Singla · Prasant Mohapatra -
2021 : Fairness Degrading Adversarial Attacks Against Clustering Algorithms »
Anshuman Chhabra · Adish Singla · Prasant Mohapatra -
2021 Poster: Curriculum Design for Teaching via Demonstrations: Theory and Applications »
Gaurav Yengera · Rati Devidze · Parameswaran Kamalaruban · Adish Singla -
2021 Poster: Explicable Reward Design for Reinforcement Learning Agents »
Rati Devidze · Goran Radanovic · Parameswaran Kamalaruban · Adish Singla -
2021 Poster: On Blame Attribution for Accountable Multi-Agent Sequential Decision Making »
Stelios Triantafyllou · Adish Singla · Goran Radanovic -
2021 Poster: Teaching an Active Learner with Contrastive Examples »
Chaoqi Wang · Adish Singla · Yuxin Chen -
2021 Poster: Neural Additive Models: Interpretable Machine Learning with Neural Nets »
Rishabh Agarwal · Levi Melnick · Nicholas Frosst · Xuezhou Zhang · Ben Lengerich · Rich Caruana · Geoffrey Hinton -
2021 Poster: Teaching via Best-Case Counterexamples in the Learning-with-Equivalence-Queries Paradigm »
Akash Kumar · Yuxin Chen · Adish Singla -
2020 Poster: Synthesizing Tasks for Block-based Programming »
Umair Ahmed · Maria Christakis · Aleksandr Efremov · Nigel Fernandez · Ahana Ghosh · Abhik Roychoudhury · Adish Singla -
2020 Poster: Task-agnostic Exploration in Reinforcement Learning »
Xuezhou Zhang · Yuzhe Ma · Adish Singla -
2019 Poster: Policy Poisoning in Batch Reinforcement Learning and Control »
Yuzhe Ma · Xuezhou Zhang · Wen Sun · Jerry Zhu -
2019 Poster: Teaching Multiple Concepts to a Forgetful Learner »
Anette Hunziker · Yuxin Chen · Oisin Mac Aodha · Manuel Gomez Rodriguez · Andreas Krause · Pietro Perona · Yisong Yue · Adish Singla -
2019 Poster: Preference-Based Batch and Sequential Teaching: Towards a Unified View of Models »
Farnam Mansouri · Yuxin Chen · Ara Vartanian · Jerry Zhu · Adish Singla -
2019 Poster: A Unified Framework for Data Poisoning Attack to Graph-based Semi-supervised Learning »
Xuanqing Liu · Si Si · Jerry Zhu · Yang Li · Cho-Jui Hsieh -
2019 Poster: Learner-aware Teaching: Inverse Reinforcement Learning with Preferences and Constraints »
Sebastian Tschiatschek · Ahana Ghosh · Luis Haug · Rati Devidze · Adish Singla -
2018 : Assisted Inverse Reinforcement Learning »
Adish Singla · Rati Devidze -
2018 Poster: Understanding the Role of Adaptivity in Machine Teaching: The Case of Version Space Learners »
Yuxin Chen · Adish Singla · Oisin Mac Aodha · Pietro Perona · Yisong Yue -
2018 Poster: Teaching Inverse Reinforcement Learners via Features and Demonstrations »
Luis Haug · Sebastian Tschiatschek · Adish Singla -
2018 Poster: Enhancing the Accuracy and Fairness of Human Decision Making »
Isabel Valera · Adish Singla · Manuel Gomez Rodriguez -
2018 Poster: Adversarial Attacks on Stochastic Bandits »
Kwang-Sung Jun · Lihong Li · Yuzhe Ma · Jerry Zhu -
2017 Workshop: Teaching Machines, Robots, and Humans »
Maya Cakmak · Anna Rafferty · Adish Singla · Jerry Zhu · Sandra Zilles -
2016 : Optimal Teaching for Online Perceptrons »
Xuezhou Zhang · Jerry Zhu -
2016 Workshop: The Future of Interactive Machine Learning »
Kory Mathewson · Kaushik Subramanian · Mark Ho · Robert Loftin · Joseph L Austerweil · Anna Harutyunyan · Doina Precup · Layla El Asri · Matthew Gombolay · Jerry Zhu · Sonia Chernova · Charles Isbell · Patrick M Pilarski · Weng-Keen Wong · Manuela Veloso · Julie A Shah · Matthew Taylor · Brenna Argall · Michael Littman -
2016 Poster: Active Learning with Oracle Epiphany »
Tzu-Kuo Huang · Lihong Li · Ara Vartanian · Saleema Amershi · Jerry Zhu -
2015 Poster: Human Memory Search as Initial-Visit Emitting Random Walk »
Kwang-Sung Jun · Jerry Zhu · Timothy T Rogers · Zhuoran Yang · Ming Yuan -
2014 Poster: Optimal Teaching for Limited-Capacity Human Learners »
Kaustubh R Patil · Jerry Zhu · Łukasz Kopeć · Bradley C Love -
2014 Spotlight: Optimal Teaching for Limited-Capacity Human Learners »
Kaustubh R Patil · Jerry Zhu · Łukasz Kopeć · Bradley C Love -
2013 Poster: Machine Teaching for Bayesian Learners in the Exponential Family »
Jerry Zhu -
2011 Poster: How Do Humans Teach: On Curriculum Learning and Teaching Dimension »
Faisal Khan · Jerry Zhu · Bilge Mutlu -
2011 Poster: Learning Higher-Order Graph Structure with Features by Structure Penalty »
Shilin Ding · Grace Wahba · Jerry Zhu -
2010 Oral: Humans Learn Using Manifolds, Reluctantly »
Bryan R Gibson · Jerry Zhu · Timothy T Rogers · Chuck Kalish · Joseph Harrison -
2010 Poster: Humans Learn Using Manifolds, Reluctantly »
Bryan R Gibson · Jerry Zhu · Timothy T Rogers · Chuck Kalish · Joseph Harrison -
2010 Poster: Transduction with Matrix Completion: Three Birds with One Stone »
Andrew B Goldberg · Jerry Zhu · Benjamin Recht · Junming Sui · Rob Nowak -
2010 Session: Spotlights Session 1 »
Jerry Zhu -
2009 Poster: Human Rademacher Complexity »
Jerry Zhu · Timothy T Rogers · Bryan R Gibson -
2008 Workshop: Machine learning meets human learning »
Nathaniel D Daw · Tom Griffiths · Josh Tenenbaum · Jerry Zhu -
2008 Poster: Human Active Learning »
Jerry Zhu · Rui M Castro · Timothy T Rogers · Rob Nowak · Ruichen Qian · Chuck Kalish -
2008 Poster: Unlabeled data: Now it helps, now it doesn't »
Aarti Singh · Rob Nowak · Jerry Zhu -
2008 Oral: Unlabeled data: Now it helps, now it doesn't »
Aarti Singh · Rob Nowak · Jerry Zhu