Human explanation (e.g., in terms of feature importance) has recently been used to extend the communication channel between human and agent in interactive machine learning. In this setting, human trainers provide not only the ground truth but also some form of explanation. However, this kind of human guidance has only been investigated in supervised learning tasks, and it remains unclear how best to incorporate this type of human knowledge into deep reinforcement learning. In this paper, we present the first study of using human visual explanations in human-in-the-loop reinforcement learning (HIRL). We focus on the task of learning from feedback, in which the human trainer not only gives binary evaluative "good" or "bad" feedback for queried state-action pairs, but also provides a visual explanation by annotating relevant features in images. We propose EXPAND (EXPlanation AugmeNted feeDback), which encourages the model to encode task-relevant features through a context-aware data augmentation that perturbs only the features deemed irrelevant by the human's saliency annotations. We evaluate the performance and sample efficiency of this approach on five tasks: Pixel-Taxi and four Atari games. We show that our method significantly outperforms methods leveraging human explanation that are adapted from supervised learning, as well as human-in-the-loop RL baselines that use only evaluative feedback.
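To make the augmentation step concrete, here is a minimal sketch; it is not the authors' implementation. It assumes image observations and a binary human saliency mask, and the helper name `context_aware_augment`, the Gaussian-blur-plus-noise perturbation, and the parameter values are illustrative assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def context_aware_augment(obs, saliency_mask, sigma=2.0, noise_std=0.05, rng=None):
    """Perturb only the image regions the human did NOT mark as task-relevant.

    obs:           image observation, shape (H, W) or (H, W, C).
    saliency_mask: binary array of shape (H, W); 1 where the human annotated
                   relevant features, 0 elsewhere.
    """
    rng = np.random.default_rng() if rng is None else rng
    obs = obs.astype(np.float32)

    # Example perturbation (an assumption for this sketch): Gaussian blur
    # plus low-amplitude noise, blurred over spatial axes only.
    sig = (sigma, sigma, 0) if obs.ndim == 3 else sigma
    perturbed = gaussian_filter(obs, sigma=sig)
    perturbed += rng.normal(0.0, noise_std, size=obs.shape).astype(np.float32)

    mask = saliency_mask.astype(bool)
    if obs.ndim == 3:
        mask = mask[..., None]  # broadcast the mask over channels

    # Keep human-relevant pixels intact; replace only the irrelevant ones.
    return np.where(mask, obs, perturbed)
```

One natural way to use such augmented observations is to treat the original and perturbed versions of a state as sharing the same human feedback label, which pushes the learned model to be invariant to changes in regions the human marked as irrelevant.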
Author Information
Lin Guan (Arizona State University)
Mudit Verma (Arizona State University)
Sihang Guo (University of Texas at Austin)
Ruohan Zhang (Stanford University)
Subbarao Kambhampati (Arizona State University)
Related Events (a corresponding poster, oral, or spotlight)
- 2021 Spotlight: Widening the Pipeline in Human-Guided Reinforcement Learning with Explanation and Context-Aware Data Augmentation
More from the Same Authors
- 2022: Revisiting Value Alignment Through the Lens of Human-Aware AI
  Sarath Sreedharan · Subbarao Kambhampati
- 2022: Large Language Models Still Can't Plan (A Benchmark for LLMs on Planning and Reasoning about Change)
  Karthik Valmeekam · Alberto Olmo · Sarath Sreedharan · Subbarao Kambhampati
- 2022: Towards customizable reinforcement learning agents: Enabling preference specification through online vocabulary expansion
  Utkarsh Soni · Sarath Sreedharan · Mudit Verma · Lin Guan · Matthew Marquez · Subbarao Kambhampati
- 2022: Advice Conformance Verification by Reinforcement Learning agents for Human-in-the-Loop
  Mudit Verma · Ayush Kharkwal · Subbarao Kambhampati
- 2022: Relative Behavioral Attributes: Filling the Gap between Symbolic Goal Specification and Reward Learning from Human Preferences
  Lin Guan · Karthik Valmeekam · Subbarao Kambhampati
- 2021 Poster: Machine versus Human Attention in Deep Reinforcement Learning Tasks
  Sihang Guo · Ruohan Zhang · Bo Liu · Yifeng Zhu · Dana Ballard · Mary Hayhoe · Peter Stone
- 2020: Panel #2
  Oren Etzioni · Heng Ji · Subbarao Kambhampati · Victoria Lin · Jiajun Wu
- 2017: Visual attention guided deep imitation learning
  Ruohan Zhang
- 2016 Poster: Dual Decomposed Learning with Factorwise Oracle for Structural SVM of Large Output Domain
  Ian En-Hsu Yen · Xiangru Huang · Kai Zhong · Ruohan Zhang · Pradeep Ravikumar · Inderjit Dhillon
- 2013 Poster: Synthesizing Robust Plans under Incomplete Domain Models
  Tuan A Nguyen · Subbarao Kambhampati · Minh Do
- 2012 Poster: Action-Model Based Multi-agent Plan Recognition
  Hankz Hankui Zhuo · Qiang Yang · Subbarao Kambhampati