Feature attribution methods are exceedingly popular in interpretable machine learning. They aim to compute the attribution of each input feature to represent its importance, but there is no consensus on the definition of "attribution", leading to many competing methods with little systematic evaluation. The lack of ground truth for feature attribution particularly complicates evaluation; to address this, we propose a dataset modification procedure where we construct attribution ground truth. Using this procedure, we evaluate three common interpretability methods: saliency maps, rationales, and attention. We identify several deficiencies and add new perspectives to the growing body of evidence questioning the correctness and reliability of these methods in the wild. Our evaluation approach is model-agnostic and can be used to assess future feature attribution method proposals as well.
Author Information
Yilun Zhou (MIT)
Serena Booth (Massachusetts Institute of Technology)
Marco Tulio Ribeiro (Microsoft Research)
Julie A Shah (MIT)
More from the Same Authors
- 2022: Trading off Utility, Informativeness, and Complexity in Emergent Communication
  Mycal Tucker · Julie A Shah · Roger Levy · Noga Zaslavsky
- 2022: Fast Adaptation via Human Diagnosis of Task Distribution Shift
  Andi Peng · Mark Ho · Aviv Netanyahu · Julie A Shah · Pulkit Agrawal
- 2022: Temporal Logic Imitation: Learning Plan-Satisficing Motion Policies from Demonstrations
  Felix Yanwei Wang · Nadia Figueroa · Shen Li · Ankit Shah · Julie A Shah
- 2022: Aligning Robot Representations with Humans
  Andreea Bobu · Andi Peng · Pulkit Agrawal · Julie A Shah · Anca Dragan
- 2022: Generalization and Translatability in Emergent Communication via Informational Constraints
  Mycal Tucker · Roger Levy · Julie A Shah · Noga Zaslavsky
- 2023 Poster: Human-Guided Complexity-Controlled Abstractions
  Andi Peng · Mycal Tucker · Eoin Kenny · Noga Zaslavsky · Pulkit Agrawal · Julie A Shah
- 2023 Poster: Collaborative Development of NLP Models
  Fereshte Khani · Marco Tulio Ribeiro
- 2021 Poster: Emergent Discrete Communication in Semantic Spaces
  Mycal Tucker · Huao Li · Siddharth Agrawal · Dana Hughes · Katia Sycara · Michael Lewis · Julie A Shah
- 2018 Poster: Bayesian Inference of Temporal Task Specifications from Demonstrations
  Ankit Shah · Pritish Kamath · Julie A Shah · Shen Li
- 2016 Workshop: The Future of Interactive Machine Learning
  Kory Mathewson @korymath · Kaushik Subramanian · Mark Ho · Robert Loftin · Joseph L Austerweil · Anna Harutyunyan · Doina Precup · Layla El Asri · Matthew Gombolay · Jerry Zhu · Sonia Chernova · Charles Isbell · Patrick M Pilarski · Weng-Keen Wong · Manuela Veloso · Julie A Shah · Matthew Taylor · Brenna Argall · Michael Littman
- 2015 Poster: Mind the Gap: A Generative Approach to Interpretable Feature Selection and Extraction
  Been Kim · Julie A Shah · Finale Doshi-Velez
- 2014 Poster: Fairness in Multi-Agent Sequential Decision-Making
  Chongjie Zhang · Julie A Shah
- 2014 Poster: The Bayesian Case Model: A Generative Approach for Case-Based Reasoning and Prototype Classification
  Been Kim · Cynthia Rudin · Julie A Shah