Humans have the remarkable ability to follow the gaze of other people to identify what they are looking at. Following eye gaze, or gaze-following, is an important ability that allows us to understand what other people are thinking, the actions they are performing, and even predict what they might do next. Despite its importance, this problem has been studied only in limited scenarios within the computer vision community. In this paper, we propose a deep neural network-based approach for gaze-following and a new benchmark dataset for thorough evaluation. Given an image and the location of a head, our approach follows the gaze of the person and identifies the object being looked at. After training, the network is able to discover how to extract head pose and gaze orientation, and to select objects in the scene that are in the predicted line of sight and likely to be looked at (such as televisions, balls and food). The quantitative evaluation shows that our approach produces reliable results, even when viewing only the back of the head. While our method outperforms several baseline approaches, we are still far from reaching human performance at this task. Overall, we believe that this is a challenging and important task that deserves more attention from the community.
Author Information
Adria Recasens (MIT)
Aditya Khosla (MIT)
Carl Vondrick (MIT)
Antonio Torralba (MIT)
More from the Same Authors
- 2022 Poster: Procedural Image Programs for Representation Learning
  Manel Baradad · Richard Chen · Jonas Wulff · Tongzhou Wang · Rogerio Feris · Antonio Torralba · Phillip Isola
- 2022 Poster: Learning Neural Acoustic Fields
  Andrew Luo · Yilun Du · Michael Tarr · Josh Tenenbaum · Antonio Torralba · Chuang Gan
- 2022 Poster: Pre-Trained Language Models for Interactive Decision-Making
  Shuang Li · Xavier Puig · Chris Paxton · Yilun Du · Clinton Wang · Linxi Fan · Tao Chen · De-An Huang · Ekin Akyürek · Anima Anandkumar · Jacob Andreas · Igor Mordatch · Antonio Torralba · Yuke Zhu
- 2022 Poster: ActionSense: A Multimodal Dataset and Recording Framework for Human Activities Using Wearable Sensors in a Kitchen Environment
  Joseph DelPreto · Chao Liu · Yiyue Luo · Michael Foshey · Yunzhu Li · Antonio Torralba · Wojciech Matusik · Daniela Rus
- 2020 Poster: Debiased Contrastive Learning
  Ching-Yao Chuang · Joshua Robinson · Yen-Chen Lin · Antonio Torralba · Stefanie Jegelka
- 2020 Spotlight: Debiased Contrastive Learning
  Ching-Yao Chuang · Joshua Robinson · Yen-Chen Lin · Antonio Torralba · Stefanie Jegelka
- 2018: Panel Discussion
  Antonio Torralba · Douwe Kiela · Barbara Landau · Angeliki Lazaridou · Joyce Chai · Christopher Manning · Stevan Harnad · Roozbeh Mottaghi
- 2018: Antonio Torralba - Learning to See and Hear
  Antonio Torralba
- 2018 Poster: Neural-Symbolic VQA: Disentangling Reasoning from Vision and Language Understanding
  Kexin Yi · Jiajun Wu · Chuang Gan · Antonio Torralba · Pushmeet Kohli · Josh Tenenbaum
- 2018 Poster: 3D-Aware Scene Manipulation via Inverse Graphics
  Shunyu Yao · Tzu Ming Hsu · Jun-Yan Zhu · Jiajun Wu · Antonio Torralba · Bill Freeman · Josh Tenenbaum
- 2018 Spotlight: Neural-Symbolic VQA: Disentangling Reasoning from Vision and Language Understanding
  Kexin Yi · Jiajun Wu · Chuang Gan · Antonio Torralba · Pushmeet Kohli · Josh Tenenbaum
- 2016 Poster: Generating Videos with Scene Dynamics
  Carl Vondrick · Hamed Pirsiavash · Antonio Torralba
- 2016 Poster: SoundNet: Learning Sound Representations from Unlabeled Video
  Yusuf Aytar · Carl Vondrick · Antonio Torralba
- 2015 Poster: Skip-Thought Vectors
  Jamie Kiros · Yukun Zhu · Russ Salakhutdinov · Richard Zemel · Raquel Urtasun · Antonio Torralba · Sanja Fidler
- 2015 Spotlight: Where are they looking?
  Adria Recasens · Aditya Khosla · Carl Vondrick · Antonio Torralba
- 2015 Poster: Learning visual biases from human imagination
  Carl Vondrick · Hamed Pirsiavash · Aude Oliva · Antonio Torralba
- 2012 Poster: Modeling the Forgetting Process using Image Regions
  Aditya Khosla · Jianxiong Xiao · Antonio Torralba · Aude Oliva
- 2011 Poster: Video Annotation and Tracking with Active Learning
  Carl Vondrick · Deva Ramanan