We propose an empirical measure of the approximate accuracy of feature importance estimates in deep neural networks. Our results across several large-scale image classification datasets show that many popular interpretability methods produce estimates of feature importance that are no better than a random designation of feature importance. Only certain ensemble-based approaches, VarGrad and SmoothGrad-Squared, outperform such a random assignment of importance. The manner of ensembling remains critical: we show that some approaches do no better than the underlying method yet carry a far higher computational burden.
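For reference, here is a minimal sketch (not taken from the paper) of the two ensembling variants named above. It assumes a hypothetical saliency_fn that maps an input image to a per-pixel attribution map; SmoothGrad-Squared averages the squared saliency maps over noise-perturbed copies of the input, while VarGrad takes their variance.

    import numpy as np

    def smoothgrad_squared(saliency_fn, x, n_samples=15, noise_scale=0.15):
        # Mean of squared saliency maps over Gaussian-perturbed inputs.
        maps = []
        for _ in range(n_samples):
            noisy_x = x + np.random.normal(0.0, noise_scale, size=x.shape)
            maps.append(saliency_fn(noisy_x) ** 2)
        return np.mean(maps, axis=0)

    def vargrad(saliency_fn, x, n_samples=15, noise_scale=0.15):
        # Variance of saliency maps over Gaussian-perturbed inputs.
        maps = [saliency_fn(x + np.random.normal(0.0, noise_scale, size=x.shape))
                for _ in range(n_samples)]
        return np.var(maps, axis=0)

Both variants call the underlying estimator n_samples times, which is where the higher computational cost relative to a single saliency map comes from.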
Author Information
Sara Hooker (Google Brain)
I am a Research Scholar at Google doing machine learning research. My research interests include algorithm transparency, security and privacy.
Dumitru Erhan (Google Brain)
Pieter-Jan Kindermans (Google Brain)
Been Kim (Google)
More from the Same Authors
- 2021 : Interpretability of Machine Learning in Computer Systems: Analyzing a Caching Model »
  Leon Sixt · Evan Liu · Marie Pellat · James Wexler · Milad Hashemi · Been Kim · Martin Maas
- 2020 : The Hardware Lottery »
  Sara Hooker
- 2020 Poster: Debugging Tests for Model Explanations »
  Julius Adebayo · Michael Muelly · Ilaria Liccardi · Been Kim
- 2020 Poster: On Completeness-aware Concept-Based Explanations in Deep Neural Networks »
  Chih-Kuan Yeh · Been Kim · Sercan Arik · Chun-Liang Li · Tomas Pfister · Pradeep Ravikumar
- 2019 Poster: Towards Automatic Concept-based Explanations »
  Amirata Ghorbani · James Wexler · James Zou · Been Kim
- 2019 Poster: High Fidelity Video Prediction with Large Stochastic Recurrent Neural Networks »
  Ruben Villegas · Arkanath Pathak · Harini Kannan · Dumitru Erhan · Quoc V Le · Honglak Lee
- 2019 Poster: Visualizing and Measuring the Geometry of BERT »
  Emily Reif · Ann Yuan · Martin Wattenberg · Fernanda Viegas · Andy Coenen · Adam Pearce · Been Kim
- 2018 : Interpretability for when NOT to use machine learning by Been Kim »
  Been Kim
- 2018 Poster: Human-in-the-Loop Interpretability Prior »
  Isaac Lage · Andrew Ross · Samuel J Gershman · Been Kim · Finale Doshi-Velez
- 2018 Spotlight: Human-in-the-Loop Interpretability Prior »
  Isaac Lage · Andrew Ross · Samuel J Gershman · Been Kim · Finale Doshi-Velez
- 2018 Poster: Sanity Checks for Saliency Maps »
  Julius Adebayo · Justin Gilmer · Michael Muelly · Ian Goodfellow · Moritz Hardt · Been Kim
- 2018 Spotlight: Sanity Checks for Saliency Maps »
  Julius Adebayo · Justin Gilmer · Michael Muelly · Ian Goodfellow · Moritz Hardt · Been Kim
- 2018 Poster: To Trust Or Not To Trust A Classifier »
  Heinrich Jiang · Been Kim · Melody Guan · Maya Gupta
- 2017 : Methods 2 »
  Pieter-Jan Kindermans
- 2017 Poster: SchNet: A continuous-filter convolutional neural network for modeling quantum interactions »
  Kristof Schütt · Pieter-Jan Kindermans · Huziel Enoc Sauceda Felix · Stefan Chmiela · Alexandre Tkatchenko · Klaus-Robert Müller
- 2017 Poster: An Empirical Study on The Properties of Random Bases for Kernel Methods »
  Maximilian Alber · Pieter-Jan Kindermans · Kristof Schütt · Klaus-Robert Müller · Fei Sha
- 2016 Poster: Domain Separation Networks »
  Konstantinos Bousmalis · George Trigeorgis · Nathan Silberman · Dilip Krishnan · Dumitru Erhan
- 2013 Poster: Deep Neural Networks for Object Detection »
  Christian Szegedy · Alexander Toshev · Dumitru Erhan
- 2012 Demonstration: A Fast Accurate Training-less P300 Speller: Unsupervised Learning Uncovers new Possibilities »
  Pieter-Jan Kindermans · Hannes Verschore · David Verstraeten · Benjamin Schrauwen
- 2012 Poster: A P300 BCI for the Masses: Prior Information Enables Instant Unsupervised Spelling »
  Pieter-Jan Kindermans · Hannes Verschore · David Verstraeten · Benjamin Schrauwen