Despite major efforts in recent years to improve the explainability of deep neural networks, the tools we use for communicating explanations have largely remained the same: visualizations of representative inputs, salient input regions, and local model approximations. But when humans describe complex decision rules, we often use a different explanatory tool: natural language. I'll describe recent work on explaining models for computer vision tasks by automatically constructing natural language descriptions of individual neurons. These descriptions ground predictions in meaningful perceptual and linguistic abstractions, and can be used to surface unexpected model behaviors and to identify and mitigate adversarial vulnerabilities. These results show that fine-grained, automatic annotation of deep network models is both possible and practical: rich, language-based explanations produced by automated annotation procedures can surface meaningful and actionable information about deep networks.
Author Information
Jacob Andreas (MIT)
More from the Same Authors
- 2022 Workshop: LaReL: Language and Reinforcement Learning
  Laetitia Teodorescu · Laura Ruis · Tristan Karch · Cédric Colas · Paul Barde · Jelena Luketina · Athul Jacob · Pratyusha Sharma · Edward Grefenstette · Jacob Andreas · Marc-Alexandre Côté
- 2022 Poster: Pre-Trained Language Models for Interactive Decision-Making
  Shuang Li · Xavier Puig · Chris Paxton · Yilun Du · Clinton Wang · Linxi Fan · Tao Chen · De-An Huang · Ekin Akyürek · Anima Anandkumar · Jacob Andreas · Igor Mordatch · Antonio Torralba · Yuke Zhu
- 2021 : Q/A Session
  Alice Xiang · Jacob Andreas
- 2021 Poster: Teachable Reinforcement Learning via Advice Distillation
  Olivia Watkins · Abhishek Gupta · Trevor Darrell · Pieter Abbeel · Jacob Andreas
- 2020 Poster: A Benchmark for Systematic Generalization in Grounded Language Understanding
  Laura Ruis · Jacob Andreas · Marco Baroni · Diane Bouchacourt · Brenden Lake
- 2020 Poster: Compositional Explanations of Neurons
  Jesse Mu · Jacob Andreas
- 2020 Oral: Compositional Explanations of Neurons
  Jesse Mu · Jacob Andreas
- 2019 : Panel Discussion
  Jacob Andreas · Edward Gibson · Stefan Lee · Noga Zaslavsky · Jason Eisner · Jürgen Schmidhuber
- 2019 : Invited Talk - 4
  Jacob Andreas
- 2018 Poster: Speaker-Follower Models for Vision-and-Language Navigation
  Daniel Fried · Ronghang Hu · Volkan Cirik · Anna Rohrbach · Jacob Andreas · Louis-Philippe Morency · Taylor Berg-Kirkpatrick · Kate Saenko · Dan Klein · Trevor Darrell
- 2017 : Afternoon Panel discussion
  Brian Skyrms · Satinder Singh · Jacob Andreas
- 2017 : Poster session (and Coffee Break)
  Jacob Andreas · Kun Li · Conner Vercellino · Thomas Miconi · Wenpeng Zhang · Luca Franceschi · Zheng Xiong · Karim Ahmed · Laurent Itti · Tim Klinger · Mostafa Rohaninejad
- 2015 Poster: On the Accuracy of Self-Normalized Log-Linear Models
  Jacob Andreas · Maxim Rabinovich · Michael Jordan · Dan Klein
- 2014 Poster: Unsupervised Transcription of Piano Music
  Taylor Berg-Kirkpatrick · Jacob Andreas · Dan Klein
- 2014 Spotlight: Unsupervised Transcription of Piano Music
  Taylor Berg-Kirkpatrick · Jacob Andreas · Dan Klein