NeurIPS Machine Learning Explainability from an Information-theoretic Perspective

Poster in Workshop: Information-Theoretic Principles in Cognitive Systems

Machine Learning Explainability from an Information-theoretic Perspective

Debargha Ganguly · Debayan Gupta

Abstract:

The primary challenge for practitioners faced with multiple post-hoc gradient-based interpretability methods is to benchmark them and select the best. Using information theory, we represent finding the optimal explainer as a rate-distortion optimization problem. Therefore:

- We propose an information-theoretic test, InfoExplain, to resolve the benchmarking ambiguity in a model-agnostic manner without additional user data (apart from the input features, model, and explanations).
- We show that InfoExplain is extendable to utilise human-interpretable concepts, deliver performance guarantees, and filter out erroneous explanations.

The accompanying experiments, code and data will be released soon.
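
For orientation, the textbook rate-distortion function from information theory is sketched below. This is only the generic form: the abstract does not state how explainers, inputs, or distortion measures are mapped onto it, so the symbols here are the standard ones and not necessarily those used by the authors.

```latex
% Classical rate-distortion function (generic textbook form,
% not necessarily the objective used for InfoExplain).
% X         : source random variable
% \hat{X}   : compressed representation of X
% d(.,.)    : distortion measure
% D         : allowed distortion budget
% I(X;\hat{X}) : mutual information (the "rate")
R(D) \;=\; \min_{\,p(\hat{x}\mid x)\;:\;\mathbb{E}\left[d(X,\hat{X})\right]\,\le\, D}\; I(X;\hat{X})
```

One plausible reading, stated here as an assumption, is that an explanation plays the role of the compressed representation and the distortion measures how much of the model's behaviour the explanation fails to preserve.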
