To explain predictions made by complex machine learning models, many feature attribution methods have been developed that assign importance scores to input features. Some recent work challenges the robustness of these methods by showing that they are sensitive to input and model perturbations, while other work addresses this issue by proposing robust attribution methods. However, previous work on attribution robustness has focused primarily on gradient-based feature attributions, whereas the robustness of removal-based attribution methods is not currently well understood. To bridge this gap, we theoretically characterize the robustness properties of removal-based feature attributions. Specifically, we provide a unified analysis of such methods and derive upper bounds for the difference between intact and perturbed attributions, under settings of both input and model perturbations. Our empirical results on synthetic and real-world data validate our theoretical results and demonstrate their practical implications, including the ability to increase attribution robustness by improving the model’s Lipschitz regularity.
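As a rough illustration of the kind of statement the abstract describes, the sketch below computes leave-one-out (occlusion) attributions, one simple removal-based method, for a toy Lipschitz model and compares the attributions at an input and at a slightly perturbed input. Everything here is an assumption made for illustration: the model `tanh(w · x)`, the zero baseline, the helper name `occlusion_attributions`, and the elementary bound `2 · L · sqrt(d) · ||delta||`, which follows from the triangle inequality and is not the bound derived in the paper.

```python
import numpy as np

def occlusion_attributions(f, x, baseline):
    """Leave-one-out attributions: a_i = f(x) - f(x with feature i set to baseline_i)."""
    full = f(x)
    attrs = np.empty(x.shape[0])
    for i in range(x.shape[0]):
        x_removed = x.copy()
        x_removed[i] = baseline[i]  # "remove" feature i by replacing it with the baseline
        attrs[i] = full - f(x_removed)
    return attrs

rng = np.random.default_rng(0)
d = 5
w = rng.normal(size=d)

# Hypothetical toy model: tanh is 1-Lipschitz, so f is Lipschitz with constant ||w||_2.
f = lambda z: float(np.tanh(w @ z))
lipschitz = np.linalg.norm(w)

x = rng.normal(size=d)
delta = 0.01 * rng.normal(size=d)  # small input perturbation
baseline = np.zeros(d)             # assumed removal baseline

a = occlusion_attributions(f, x, baseline)
a_pert = occlusion_attributions(f, x + delta, baseline)

# Each |a_i(x) - a_i(x + delta)| <= 2 * L * ||delta||, so the l2 gap over d
# features is at most 2 * L * sqrt(d) * ||delta|| (an illustrative bound only).
gap = np.linalg.norm(a - a_pert)
bound = 2 * lipschitz * np.sqrt(d) * np.linalg.norm(delta)
print(f"attribution gap = {gap:.6f}, illustrative bound = {bound:.6f}")
```

In this toy setting a smaller Lipschitz constant (a smaller `||w||`) tightens the bound, which is consistent in spirit with the abstract's remark that improving the model's Lipschitz regularity can increase attribution robustness.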
Author Information
Chris Lin (University of Washington)
Ian Covert (Stanford University)
Su-In Lee (University of Washington)
More from the Same Authors
- 2023: A deep generative model of single-cell methylomic data
  Ethan Weinberger · Su-In Lee
- 2023 Poster: Feature Selection in the Contrastive Analysis Setting
  Ethan Weinberger · Ian Covert · Su-In Lee
- 2020 Poster: Learning Deep Attribution Priors Based On Prior Knowledge
  Ethan Weinberger · Joseph Janizek · Su-In Lee
- 2020 Poster: Understanding Global Feature Contributions With Additive Importance Measures
  Ian Covert · Scott Lundberg · Su-In Lee
- 2017 Poster: A Unified Approach to Interpreting Model Predictions
  Scott M Lundberg · Su-In Lee
- 2017 Oral: A unified approach to interpreting model predictions
  Scott M Lundberg · Su-In Lee
- 2016 Poster: Learning Sparse Gaussian Graphical Models with Overlapping Blocks
  Mohammad Javad Hosseini · Su-In Lee
- 2014 Workshop: Machine Learning in Computational Biology
  Oliver Stegle · Sara Mostafavi · Anna Goldenberg · Su-In Lee · Michael Leung · Anshul Kundaje · Mark B Gerstein · Martin Renqiang Min · Hannes Bretschneider · Francesco Paolo Casale · Loïc Schwaller · Amit G Deshwar · Benjamin A Logsdon · Yuanyang Zhang · Ali Punjani · Derek C Aguiar · Samuel Kaski