Deep reinforcement learning (RL) algorithms are powerful tools for solving visuomotor decision tasks. However, the trained models are often difficult to interpret, because they are represented as end-to-end deep neural networks. In this paper, we shed light on the inner workings of such trained models by analyzing the pixels that they attend to during task execution, and comparing them with the pixels attended to by humans executing the same tasks. To this end, we investigate the following two questions that, to the best of our knowledge, have not been previously studied: (1) How similar are the visual representations learned by RL agents and humans when performing the same task? (2) How do similarities and differences in these learned representations explain RL agents' performance on these tasks? Specifically, we compare the saliency maps of RL agents against visual attention models of human experts when learning to play Atari games. Further, we analyze how hyperparameters of the deep RL algorithm affect the learned representations and saliency maps of the trained agents. The insights provided have the potential to inform novel algorithms for closing the performance gap between human experts and RL agents.
Author Information
Sihang Guo (University of Texas at Austin)
Ruohan Zhang (Stanford University)
Bo Liu (Stanford University)
Yifeng Zhu (The University of Texas at Austin)
Dana Ballard (University of Texas, Austin)
Dana H. Ballard obtained his undergraduate degree in Aeronautics and Astronautics from M.I.T. in 1967. Subsequently he obtained MS and PhD degrees in information engineering from the University of Michigan and the University of California at Irvine in 1969 and 1974 respectively. He is the author of two books, Computer Vision (with Christopher Brown) and An Introduction to Natural Computation. His main research interest is in computational theories of the brain with emphasis on human vision. His research places emphasis on Embodied Cognition. Starting in 1985, he and Chris Brown designed and built the first high-speed binocular camera control system capable of simulating human eye movements in real time. Currently he pursues this research at the University of Texas at Austin by using model humans in virtual reality environments. His focus is on the use of machine learning as a model for human behavior with an emphasis on reinforcement learning.
Mary Hayhoe (University of Texas, Austin)
Peter Stone (The University of Texas at Austin, Sony AI)
More from the Same Authors
-
2020 : Paper 19: Multiagent Driving Policy for Congestion Reduction in a Large Scale Scenario »
Jiaxun Cui · Peter Stone -
2021 Spotlight: Widening the Pipeline in Human-Guided Reinforcement Learning with Explanation and Context-Aware Data Augmentation »
Lin Guan · Mudit Verma · Sihang Guo · Ruohan Zhang · Subbarao Kambhampati -
2021 : Task-Independent Causal State Abstraction »
Zizhao Wang · Xuesu Xiao · Yuke Zhu · Peter Stone -
2021 : Leveraging Information about Background Music in Human-Robot Interaction »
Elad Liebman · Peter Stone -
2021 : Safe Evaluation For Offline Learning: Are We Ready To Deploy? »
Hager Radi · Josiah Hanna · Peter Stone · Matthew Taylor -
2022 : BOME! Bilevel Optimization Made Easy: A Simple First-Order Approach »
Mao Ye · Bo Liu · Stephen Wright · Peter Stone · Qiang Liu -
2022 : ABC: Adversarial Behavioral Cloning for Offline Mode-Seeking Imitation Learning »
Eddy Hudson · Ishan Durugkar · Garrett Warnell · Peter Stone -
2022 : Panel RL Theory-Practice Gap »
Peter Stone · Matej Balog · Jonas Buchli · Jason Gauci · Dhruv Madeka -
2022 : Panel RL Benchmarks »
Minmin Chen · Pablo Samuel Castro · Caglar Gulcehre · Tony Jebara · Peter Stone -
2022 : Invited talk: Outracing Champion Gran Turismo Drivers with Deep Reinforcement Learning »
Peter Stone -
2022 : Human in the Loop Learning for Robot Navigation and Task Learning from Implicit Human Feedback »
Peter Stone -
2022 Poster: BOME! Bilevel Optimization Made Easy: A Simple First-Order Approach »
Bo Liu · Mao Ye · Stephen Wright · Peter Stone · Qiang Liu -
2022 Poster: Value Function Decomposition for Iterative Design of Reinforcement Learning Agents »
James MacGlashan · Evan Archer · Alisa Devlic · Takuma Seno · Craig Sherstan · Peter Wurman · Peter Stone -
2021 Poster: Adversarial Intrinsic Motivation for Reinforcement Learning »
Ishan Durugkar · Mauricio Tec · Scott Niekum · Peter Stone -
2021 Poster: Conflict-Averse Gradient Descent for Multi-task learning »
Bo Liu · Xingchao Liu · Xiaojie Jin · Peter Stone · Qiang Liu -
2021 Poster: Widening the Pipeline in Human-Guided Reinforcement Learning with Explanation and Context-Aware Data Augmentation »
Lin Guan · Mudit Verma · Sihang Guo · Ruohan Zhang · Subbarao Kambhampati -
2020 : Q&A: Peter Stone (The University of Texas at Austin): Ad Hoc Autonomous Agent Teams: Collaboration without Pre-Coordination, with Natasha Jaques (Google) [moderator] »
Peter Stone · Natasha Jaques -
2020 : Invited Speaker: Peter Stone (The University of Texas at Austin) on Ad Hoc Autonomous Agent Teams: Collaboration without Pre-Coordination »
Peter Stone -
2020 : Panel discussion »
Pierre-Yves Oudeyer · Marc Bellemare · Peter Stone · Matt Botvinick · Susan Murphy · Anusha Nagabandi · Ashley Edwards · Karen Liu · Pieter Abbeel -
2020 : Discussion Panel »
Pete Florence · Dorsa Sadigh · Carolina Parada · Jeannette Bohg · Roberto Calandra · Peter Stone · Fabio Ramos -
2020 : Invited talk: Peter Stone "Grounded Simulation Learning for Sim2Real with Connections to Off-Policy Reinforcement Learning" »
Peter Stone -
2020 Poster: Firefly Neural Architecture Descent: a General Approach for Growing Neural Networks »
Lemeng Wu · Bo Liu · Peter Stone · Qiang Liu -
2020 Poster: An Imitation from Observation Approach to Transfer Learning with Dynamics Mismatch »
Siddharth Desai · Ishan Durugkar · Haresh Karnan · Garrett Warnell · Josiah Hanna · Peter Stone -
2018 : Peter Stone »
Peter Stone -
2018 : Control Algorithms for Imitation Learning from Observation »
Peter Stone -
2017 : Visual attention guided deep imitation learning »
Ruohan Zhang -
2016 : Peter Stone (University of Texas at Austin) »
Peter Stone -
2016 Poster: Dual Decomposed Learning with Factorwise Oracle for Structural SVM of Large Output Domain »
Ian En-Hsu Yen · Xiangru Huang · Kai Zhong · Ruohan Zhang · Pradeep Ravikumar · Inderjit Dhillon -
2015 Workshop: Learning, Inference and Control of Multi-Agent Systems »
Vicenç Gómez · Gerhard Neumann · Jonathan S Yedidia · Peter Stone -
2010 Tutorial: Reinforcement Learning for Embodied Cognition »
Dana Ballard