Timezone: »
Learning near-optimal behaviour from an expert's demonstrations typically relies on the assumption that the learner knows the features that the true reward function depends on. In this paper, we study the problem of learning from demonstrations in the setting where this is not the case, i.e., where there is a mismatch between the worldviews of the learner and the expert. We introduce a natural quantity, the teaching risk, which measures the potential suboptimality of policies that look optimal to the learner in this setting. We show that bounds on the teaching risk guarantee that the learner is able to find a near-optimal policy using standard algorithms based on inverse reinforcement learning. Based on these findings, we suggest a teaching scheme in which the expert can decrease the teaching risk by updating the learner's worldview, and thus ultimately enable her to find a near-optimal policy.
Author Information
Luis Haug (ETH Zurich)
Sebastian Tschiatschek (Microsoft Research)
Adish Singla (MPI-SWS)
More from the Same Authors
-
2021 : Reward Poisoning in Reinforcement Learning: Attacks Against Unknown Learners in Unknown Environments »
Amin Rakhsha · Xuezhou Zhang · Jerry Zhu · Adish Singla -
2021 : Poster: Fair Clustering Using Antidote Data »
Anshuman Chhabra · Adish Singla · Prasant Mohapatra -
2021 : Reinforcement Learning Under Algorithmic Triage »
Eleni Straitouri · Adish Singla · Vahid Balazadeh Meresht · Manuel Rodriguez -
2021 : Reward Poisoning in Reinforcement Learning: Attacks Against Unknown Learners in Unknown Environments »
Amin Rakhsha · Xuezhou Zhang · Jerry Zhu · Adish Singla -
2022 Poster: On Batch Teaching with Sample Complexity Bounded by VCD »
Farnam Mansouri · Hans Simon · Adish Singla · Sandra Zilles -
2022 Spotlight: On Batch Teaching with Sample Complexity Bounded by VCD »
Farnam Mansouri · Hans Simon · Adish Singla · Sandra Zilles -
2022 Poster: Envy-free Policy Teaching to Multiple Agents »
Jiarui Gan · R Majumdar · Adish Singla · Goran Radanovic -
2022 Poster: Exploration-Guided Reward Shaping for Reinforcement Learning under Sparse Rewards »
Rati Devidze · Parameswaran Kamalaruban · Adish Singla -
2022 Poster: Provable Defense against Backdoor Policies in Reinforcement Learning »
Shubham Bharti · Xuezhou Zhang · Adish Singla · Jerry Zhu -
2021 : Fair Clustering Using Antidote Data »
Anshuman Chhabra · Adish Singla · Prasant Mohapatra -
2021 : Fairness Degrading Adversarial Attacks Against Clustering Algorithms »
Anshuman Chhabra · Adish Singla · Prasant Mohapatra -
2021 Poster: Curriculum Design for Teaching via Demonstrations: Theory and Applications »
Gaurav Yengera · Rati Devidze · Parameswaran Kamalaruban · Adish Singla -
2021 Poster: Explicable Reward Design for Reinforcement Learning Agents »
Rati Devidze · Goran Radanovic · Parameswaran Kamalaruban · Adish Singla -
2021 Poster: On Blame Attribution for Accountable Multi-Agent Sequential Decision Making »
Stelios Triantafyllou · Adish Singla · Goran Radanovic -
2021 Poster: Information Directed Reward Learning for Reinforcement Learning »
David Lindner · Matteo Turchetta · Sebastian Tschiatschek · Kamil Ciosek · Andreas Krause -
2021 Poster: Teaching an Active Learner with Contrastive Examples »
Chaoqi Wang · Adish Singla · Yuxin Chen -
2021 Poster: Teaching via Best-Case Counterexamples in the Learning-with-Equivalence-Queries Paradigm »
Akash Kumar · Yuxin Chen · Adish Singla -
2020 Poster: Synthesizing Tasks for Block-based Programming »
Umair Ahmed · Maria Christakis · Aleksandr Efremov · Nigel Fernandez · Ahana Ghosh · Abhik Roychoudhury · Adish Singla -
2020 Poster: Task-agnostic Exploration in Reinforcement Learning »
Xuezhou Zhang · Yuzhe Ma · Adish Singla -
2019 Poster: Teaching Multiple Concepts to a Forgetful Learner »
Anette Hunziker · Yuxin Chen · Oisin Mac Aodha · Manuel Gomez Rodriguez · Andreas Krause · Pietro Perona · Yisong Yue · Adish Singla -
2019 Poster: Preference-Based Batch and Sequential Teaching: Towards a Unified View of Models »
Farnam Mansouri · Yuxin Chen · Ara Vartanian · Jerry Zhu · Adish Singla -
2019 Poster: Learner-aware Teaching: Inverse Reinforcement Learning with Preferences and Constraints »
Sebastian Tschiatschek · Ahana Ghosh · Luis Haug · Rati Devidze · Adish Singla -
2018 : Assisted Inverse Reinforcement Learning »
Adish Singla · Rati Devidze -
2018 Poster: Understanding the Role of Adaptivity in Machine Teaching: The Case of Version Space Learners »
Yuxin Chen · Adish Singla · Oisin Mac Aodha · Pietro Perona · Yisong Yue -
2018 Poster: Enhancing the Accuracy and Fairness of Human Decision Making »
Isabel Valera · Adish Singla · Manuel Gomez Rodriguez -
2016 Poster: Variational Inference in Mixed Probabilistic Submodular Models »
Josip Djolonga · Sebastian Tschiatschek · Andreas Krause -
2016 Poster: Cooperative Graphical Models »
Josip Djolonga · Stefanie Jegelka · Sebastian Tschiatschek · Andreas Krause -
2014 Poster: Learning Mixtures of Submodular Functions for Image Collection Summarization »
Sebastian Tschiatschek · Rishabh K Iyer · Haochen Wei · Jeffrey A Bilmes