We consider the problem of teaching via demonstrations in sequential decision-making settings. In particular, we study how to design a personalized curriculum over demonstrations to speed up the learner's convergence. We provide a unified curriculum strategy for two popular learner models: Maximum Causal Entropy Inverse Reinforcement Learning (MaxEnt-IRL) and Cross-Entropy Behavioral Cloning (CrossEnt-BC). Our unified strategy induces a ranking over demonstrations based on difficulty scores computed with respect to the teacher's optimal policy and the learner's current policy. Compared to the state of the art, our strategy does not require access to the learner's internal dynamics and still enjoys similar convergence guarantees under mild technical conditions. Furthermore, we adapt our curriculum strategy to the setting where no teacher agent is present by using task-specific difficulty scores. Experiments on a synthetic car driving environment and navigation-based environments demonstrate the effectiveness of our curriculum strategy.
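The ranking idea in the abstract can be illustrated with a minimal sketch. The abstract does not define the difficulty score, so the code below assumes one plausible choice: the negative log-likelihood of a demonstration under a policy, with demonstrations ranked by how much harder they are for the learner than for the teacher. The function names, the tabular policy representation, and the toy numbers are all hypothetical and are not taken from the paper.

```python
import math

# A policy is assumed to be a nested dict: policy[state][action] -> probability.
# A demonstration is assumed to be a list of (state, action) pairs.

def log_difficulty(policy, demo):
    """Assumed difficulty score (in log space): negative log-likelihood
    of the demonstration's actions under the given policy."""
    return -sum(math.log(policy[state][action]) for state, action in demo)

def rank_demonstrations(demos, teacher_policy, learner_policy):
    """Return demo indices ordered from most to least preferred.

    Assumed preference: log-difficulty for the learner minus log-difficulty
    for the teacher, i.e. demos the learner finds hard but the teacher
    finds easy come first.
    """
    def preference(index):
        demo = demos[index]
        return (log_difficulty(learner_policy, demo)
                - log_difficulty(teacher_policy, demo))
    return sorted(range(len(demos)), key=preference, reverse=True)

# Toy example with two states and two actions (made-up numbers).
teacher_policy = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.1, 1: 0.9}}
learner_policy = {0: {0: 0.5, 1: 0.5}, 1: {0: 0.8, 1: 0.2}}
demos = [
    [(0, 0)],          # learner is only mildly unsure here
    [(1, 1)],          # learner strongly disagrees with the teacher here
    [(0, 0), (1, 1)],  # combines both steps
]

ranking = rank_demonstrations(demos, teacher_policy, learner_policy)
print(ranking)
```

In a curriculum loop, the teacher would show the top-ranked demonstration, let the learner update, and then recompute the ranking against the learner's new policy. Note that this sketch only reads the learner's current policy, which matches the abstract's point that the learner's internal dynamics are not needed.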
Author Information
Gaurav Yengera (Amazon)
Computer Science master's student at Saarland University, Germany. Presently carrying out research in reinforcement learning at the Max Planck Institute for Software Systems, with prior research experience in deep learning for healthcare. Previously graduated with a bachelor's degree in electrical engineering from the Indian Institute of Technology, Varanasi.
Rati Devidze (MPI-SWS)
Parameswaran Kamalaruban (EPFL)
Adish Singla (MPI-SWS)
More from the Same Authors
-
2021 : Reward Poisoning in Reinforcement Learning: Attacks Against Unknown Learners in Unknown Environments »
Amin Rakhsha · Xuezhou Zhang · Jerry Zhu · Adish Singla -
2021 : Poster: Fair Clustering Using Antidote Data »
Anshuman Chhabra · Adish Singla · Prasant Mohapatra -
2021 : Reinforcement Learning Under Algorithmic Triage »
Eleni Straitouri · Adish Singla · Vahid Balazadeh Meresht · Manuel Rodriguez -
2022 Poster: On Batch Teaching with Sample Complexity Bounded by VCD »
Farnam Mansouri · Hans Simon · Adish Singla · Sandra Zilles -
2022 Spotlight: On Batch Teaching with Sample Complexity Bounded by VCD »
Farnam Mansouri · Hans Simon · Adish Singla · Sandra Zilles -
2022 Poster: Envy-free Policy Teaching to Multiple Agents »
Jiarui Gan · R Majumdar · Adish Singla · Goran Radanovic -
2022 Poster: Exploration-Guided Reward Shaping for Reinforcement Learning under Sparse Rewards »
Rati Devidze · Parameswaran Kamalaruban · Adish Singla -
2022 Poster: Provable Defense against Backdoor Policies in Reinforcement Learning »
Shubham Bharti · Xuezhou Zhang · Adish Singla · Jerry Zhu -
2021 : Fair Clustering Using Antidote Data »
Anshuman Chhabra · Adish Singla · Prasant Mohapatra -
2021 : Fairness Degrading Adversarial Attacks Against Clustering Algorithms »
Anshuman Chhabra · Adish Singla · Prasant Mohapatra -
2021 Poster: Explicable Reward Design for Reinforcement Learning Agents »
Rati Devidze · Goran Radanovic · Parameswaran Kamalaruban · Adish Singla -
2021 Poster: On Blame Attribution for Accountable Multi-Agent Sequential Decision Making »
Stelios Triantafyllou · Adish Singla · Goran Radanovic -
2021 Poster: Teaching an Active Learner with Contrastive Examples »
Chaoqi Wang · Adish Singla · Yuxin Chen -
2021 Poster: Robust Inverse Reinforcement Learning under Transition Dynamics Mismatch »
Luca Viano · Yu-Ting Huang · Parameswaran Kamalaruban · Adrian Weller · Volkan Cevher -
2021 Poster: Teaching via Best-Case Counterexamples in the Learning-with-Equivalence-Queries Paradigm »
Akash Kumar · Yuxin Chen · Adish Singla -
2020 Poster: Synthesizing Tasks for Block-based Programming »
Umair Ahmed · Maria Christakis · Aleksandr Efremov · Nigel Fernandez · Ahana Ghosh · Abhik Roychoudhury · Adish Singla -
2020 Poster: Task-agnostic Exploration in Reinforcement Learning »
Xuezhou Zhang · Yuzhe Ma · Adish Singla -
2019 : Poster and Coffee Break 2 »
Karol Hausman · Kefan Dong · Ken Goldberg · Lihong Li · Lin Yang · Lingxiao Wang · Lior Shani · Liwei Wang · Loren Amdahl-Culleton · Lucas Cassano · Marc Dymetman · Marc Bellemare · Marcin Tomczak · Margarita Castro · Marius Kloft · Marius-Constantin Dinu · Markus Holzleitner · Martha White · Mengdi Wang · Michael Jordan · Mihailo Jovanovic · Ming Yu · Minshuo Chen · Moonkyung Ryu · Muhammad Zaheer · Naman Agarwal · Nan Jiang · Niao He · Nikolaus Yasui · Nikos Karampatziakis · Nino Vieillard · Ofir Nachum · Olivier Pietquin · Ozan Sener · Pan Xu · Parameswaran Kamalaruban · Paul Mineiro · Paul Rolland · Philip Amortila · Pierre-Luc Bacon · Prakash Panangaden · Qi Cai · Qiang Liu · Quanquan Gu · Raihan Seraj · Richard Sutton · Rick Valenzano · Robert Dadashi · Rodrigo Toro Icarte · Roshan Shariff · Roy Fox · Ruosong Wang · Saeed Ghadimi · Samuel Sokota · Sean Sinclair · Sepp Hochreiter · Sergey Levine · Sergio Valcarcel Macua · Sham Kakade · Shangtong Zhang · Sheila McIlraith · Shie Mannor · Shimon Whiteson · Shuai Li · Shuang Qiu · Wai Lok Li · Siddhartha Banerjee · Sitao Luan · Tamer Basar · Thinh Doan · Tianhe Yu · Tianyi Liu · Tom Zahavy · Toryn Klassen · Tuo Zhao · Vicenç Gómez · Vincent Liu · Volkan Cevher · Wesley Suttle · Xiao-Wen Chang · Xiaohan Wei · Xiaotong Liu · Xingguo Li · Xinyi Chen · Xingyou Song · Yao Liu · YiDing Jiang · Yihao Feng · Yilun Du · Yinlam Chow · Yinyu Ye · Yishay Mansour · · Yonathan Efroni · Yongxin Chen · Yuanhao Wang · Bo Dai · Chen-Yu Wei · Harsh Shrivastava · Hongyang Zhang · Qinqing Zheng · SIDDHARTHA SATPATHI · Xueqing Liu · Andreu Vall -
2019 Poster: Teaching Multiple Concepts to a Forgetful Learner »
Anette Hunziker · Yuxin Chen · Oisin Mac Aodha · Manuel Gomez Rodriguez · Andreas Krause · Pietro Perona · Yisong Yue · Adish Singla -
2019 Poster: Preference-Based Batch and Sequential Teaching: Towards a Unified View of Models »
Farnam Mansouri · Yuxin Chen · Ara Vartanian · Jerry Zhu · Adish Singla -
2019 Poster: Learner-aware Teaching: Inverse Reinforcement Learning with Preferences and Constraints »
Sebastian Tschiatschek · Ahana Ghosh · Luis Haug · Rati Devidze · Adish Singla -
2018 : Assisted Inverse Reinforcement Learning »
Adish Singla · Rati Devidze -
2018 Poster: Understanding the Role of Adaptivity in Machine Teaching: The Case of Version Space Learners »
Yuxin Chen · Adish Singla · Oisin Mac Aodha · Pietro Perona · Yisong Yue -
2018 Poster: Teaching Inverse Reinforcement Learners via Features and Demonstrations »
Luis Haug · Sebastian Tschiatschek · Adish Singla -
2018 Poster: Enhancing the Accuracy and Fairness of Human Decision Making »
Isabel Valera · Adish Singla · Manuel Gomez Rodriguez