We propose an empirical measure of the approximate accuracy of feature importance estimates in deep neural networks. Our results across several large-scale image classification datasets show that many popular interpretability methods produce estimates of feature importance that are no better than a random designation of feature importance. Only certain ensemble-based approaches---VarGrad and SmoothGrad-Squared---outperform such a random assignment of importance. The manner of ensembling remains critical: we show that some approaches do no better than the underlying method but carry a far higher computational burden.
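The two ensemble estimators named above aggregate plain input gradients over noisy copies of an input: SmoothGrad-Squared averages the squared gradients, while VarGrad takes their variance. The sketch below is a rough, framework-agnostic illustration of that ensembling step, not the paper's implementation; `grad_fn` (a callable returning per-feature gradients of the class score), the noise scale, and the sample count are all assumptions for the example.

```python
import random
from statistics import mean, variance


def _noisy_grads(grad_fn, x, n_samples, sigma):
    # Gradients evaluated at n_samples Gaussian-perturbed copies of x.
    # grad_fn is a hypothetical helper any autodiff framework could supply.
    return [
        grad_fn([xi + random.gauss(0.0, sigma) for xi in x])
        for _ in range(n_samples)
    ]


def smoothgrad_squared(grad_fn, x, n_samples=25, sigma=0.15):
    # SmoothGrad-Squared: mean of element-wise *squared* gradients.
    grads = _noisy_grads(grad_fn, x, n_samples, sigma)
    return [mean(g[i] ** 2 for g in grads) for i in range(len(x))]


def vargrad(grad_fn, x, n_samples=25, sigma=0.15):
    # VarGrad: per-feature variance of the gradients across samples.
    grads = _noisy_grads(grad_fn, x, n_samples, sigma)
    return [variance([g[i] for g in grads]) for i in range(len(x))]
```

Both estimators cost `n_samples` backward passes per input, which is why the abstract notes that ensembling carries a far higher computational burden than a single gradient evaluation.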
Author Information
Sara Hooker (Google Brain)
I lead Cohere For AI, a non-profit research lab that seeks to solve complex machine learning problems. We support fundamental research that explores the unknown, and are focused on creating more points of entry into machine learning research. Prior to Cohere, I was a research scientist at Google Brain working on training models that go beyond test-set accuracy to fulfill multiple desired criteria -- interpretability, compactness, fairness and robustness. I enjoy working on research problems where progress translates to reliable and accessible machine learning in the real world. My research interests include algorithm transparency, security and privacy.
Dumitru Erhan (Google Brain)
Pieter-Jan Kindermans (Google Brain)
Been Kim (Google)
More from the Same Authors
- 2022 : Panel
  Kristian Lum · Rachel Cummings · Jake Goldenfein · Sara Hooker · Joshua Loftus
- 2022 Poster: TabNAS: Rejection Sampling for Neural Architecture Search on Tabular Datasets
  Chengrun Yang · Gabriel Bender · Hanxiao Liu · Pieter-Jan Kindermans · Madeleine Udell · Yifeng Lu · Quoc V Le · Da Huang
- 2021 : Interpretability of Machine Learning in Computer Systems: Analyzing a Caching Model
  Leon Sixt · Evan Liu · Marie Pellat · James Wexler · Milad Hashemi · Been Kim · Martin Maas
- 2020 : The Hardware Lottery
  Sara Hooker
- 2020 Poster: Debugging Tests for Model Explanations
  Julius Adebayo · Michael Muelly · Ilaria Liccardi · Been Kim
- 2020 Poster: On Completeness-aware Concept-Based Explanations in Deep Neural Networks
  Chih-Kuan Yeh · Been Kim · Sercan Arik · Chun-Liang Li · Tomas Pfister · Pradeep Ravikumar
- 2019 Poster: Towards Automatic Concept-based Explanations
  Amirata Ghorbani · James Wexler · James Zou · Been Kim
- 2019 Poster: High Fidelity Video Prediction with Large Stochastic Recurrent Neural Networks
  Ruben Villegas · Arkanath Pathak · Harini Kannan · Dumitru Erhan · Quoc V Le · Honglak Lee
- 2019 Poster: Visualizing and Measuring the Geometry of BERT
  Emily Reif · Ann Yuan · Martin Wattenberg · Fernanda Viegas · Andy Coenen · Adam Pearce · Been Kim
- 2018 : Interpretability for when NOT to use machine learning by Been Kim
  Been Kim
- 2018 Poster: Human-in-the-Loop Interpretability Prior
  Isaac Lage · Andrew Ross · Samuel J Gershman · Been Kim · Finale Doshi-Velez
- 2018 Spotlight: Human-in-the-Loop Interpretability Prior
  Isaac Lage · Andrew Ross · Samuel J Gershman · Been Kim · Finale Doshi-Velez
- 2018 Poster: Sanity Checks for Saliency Maps
  Julius Adebayo · Justin Gilmer · Michael Muelly · Ian Goodfellow · Moritz Hardt · Been Kim
- 2018 Spotlight: Sanity Checks for Saliency Maps
  Julius Adebayo · Justin Gilmer · Michael Muelly · Ian Goodfellow · Moritz Hardt · Been Kim
- 2018 Poster: To Trust Or Not To Trust A Classifier
  Heinrich Jiang · Been Kim · Melody Guan · Maya Gupta
- 2017 : Methods 2
  Pieter-Jan Kindermans
- 2017 Poster: SchNet: A continuous-filter convolutional neural network for modeling quantum interactions
  Kristof Schütt · Pieter-Jan Kindermans · Huziel Enoc Sauceda Felix · Stefan Chmiela · Alexandre Tkatchenko · Klaus-Robert Müller
- 2017 Poster: An Empirical Study on The Properties of Random Bases for Kernel Methods
  Maximilian Alber · Pieter-Jan Kindermans · Kristof Schütt · Klaus-Robert Müller · Fei Sha
- 2016 Poster: Domain Separation Networks
  Konstantinos Bousmalis · George Trigeorgis · Nathan Silberman · Dilip Krishnan · Dumitru Erhan
- 2013 Poster: Deep Neural Networks for Object Detection
  Christian Szegedy · Alexander Toshev · Dumitru Erhan
- 2012 Demonstration: A Fast Accurate Training-less P300 Speller: Unsupervised Learning Uncovers new Possibilities
  Pieter-Jan Kindermans · Hannes Verschore · David Verstraeten · Benjamin Schrauwen
- 2012 Poster: A P300 BCI for the Masses: Prior Information Enables Instant Unsupervised Spelling
  Pieter-Jan Kindermans · Hannes Verschore · David Verstraeten · Benjamin Schrauwen