The widespread deployment of machine learning models in high-stakes settings has underscored the need to ensure that individuals adversely impacted by model predictions are provided with a means for recourse, and several algorithms have recently been proposed to generate such recourses. Recent research has also shown that the recourses generated by these algorithms often correspond to adversarial examples, a finding that calls for a deeper understanding of how adversarially robust models (which are designed to guard against adversarial examples) affect algorithmic recourse. In this work, we make one of the first attempts at studying this question. We theoretically and empirically analyze the cost (ease of implementation) and validity (probability of obtaining a positive model prediction) of the recourses output by state-of-the-art algorithms when the underlying models are adversarially robust. More specifically, we derive theoretical bounds on the differences in cost and validity between recourses generated for adversarially robust models and those generated for non-robust models. We also carry out extensive empirical analysis on multiple real-world datasets, both to validate our theoretical results and to assess how varying degrees of model robustness affect the cost and validity of the resulting recourses. Our theoretical and empirical analyses demonstrate that adversarially robust models significantly increase the cost and reduce the validity of the resulting recourses, shedding light on the inherent trade-off between achieving adversarial robustness in predictive models and providing easy-to-implement and reliable algorithmic recourse.
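To make the cost and validity metrics above concrete, the following is a minimal, self-contained sketch (not the authors' code or experimental setup): it trains a plain and an FGSM-style adversarially trained logistic model on toy synthetic data, generates simple gradient-based (Wachter-style) recourses for negatively classified points, and reports mean L1 cost and validity under a fixed effort budget. The data, model, recourse procedure, and all hyperparameters here are illustrative assumptions.

```python
# Sketch of the cost/validity comparison described in the abstract.
# Everything below (toy data, FGSM-style adversarial training, a simple
# gradient recourse, the effort budget) is an assumption for illustration.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_logreg(X, y, adversarial=False, eps=0.3, lr=0.1, epochs=200):
    """Gradient-descent logistic regression; optionally FGSM-style adversarial training."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        Xt = X
        if adversarial:
            # Perturb inputs in the loss-increasing direction: sign((p - y) * w) for a linear model.
            margin = sigmoid(X @ w + b) - y
            Xt = X + eps * np.sign(np.outer(margin, w))
        p = sigmoid(Xt @ w + b)
        w -= lr * Xt.T @ (p - y) / len(y)
        b -= lr * np.mean(p - y)
    return w, b

def recourse(x, w, b, lr=0.05, steps=500, budget=5.0):
    """Wachter-style recourse: move x toward the positive class, up to an effort budget."""
    x_cf = x.copy()
    step = lr * w / (np.linalg.norm(w) + 1e-12)
    for _ in range(steps):
        if sigmoid(x_cf @ w + b) > 0.5:
            break  # recourse already yields a positive prediction
        if np.abs((x_cf + step) - x).sum() > budget:
            break  # effort budget exhausted; recourse stays invalid
        x_cf = x_cf + step
    return x_cf

# Toy two-Gaussian data; label 1 denotes the favourable outcome.
n = 500
X = np.vstack([rng.normal(-1.0, 1.0, (n, 2)), rng.normal(1.0, 1.0, (n, 2))])
y = np.concatenate([np.zeros(n), np.ones(n)])

for name, adv in [("standard", False), ("adversarially robust", True)]:
    w, b = train_logreg(X, y, adversarial=adv)
    negatives = X[sigmoid(X @ w + b) <= 0.5]
    cfs = np.array([recourse(x, w, b) for x in negatives])
    cost = np.abs(cfs - negatives).sum(axis=1).mean()       # mean L1 distance moved
    validity = np.mean(sigmoid(cfs @ w + b) > 0.5)           # fraction that flip the prediction
    print(f"{name:>22}: mean L1 cost = {cost:.2f}, validity = {validity:.2f}")
```

Under this toy setup, the robust model typically pushes its decision boundary away from negatively classified points, so recourses must travel farther (higher cost) and more of them exhaust the budget (lower validity), mirroring the trade-off analyzed in the paper.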
Author Information
Satyapriya Krishna (Harvard University)
Chirag Agarwal (Harvard University/Adobe)
Himabindu Lakkaraju (Harvard University)
More from the Same Authors
- 2022 : Trajectory-based Explainability Framework for Offline RL
  Shripad Deshmukh · Arpan Dasgupta · Chirag Agarwal · Nan Jiang · Balaji Krishnamurthy · Georgios Theocharous · Jayakumar Subramanian
- 2022 : TalkToModel: Explaining Machine Learning Models with Interactive Natural Language Conversations
  Dylan Slack · Satyapriya Krishna · Himabindu Lakkaraju · Sameer Singh
- 2022 : Contributed Talk: TalkToModel: Explaining Machine Learning Models with Interactive Natural Language Conversations
  Dylan Slack · Satyapriya Krishna · Himabindu Lakkaraju · Sameer Singh
- 2022 Poster: OpenXAI: Towards a Transparent Evaluation of Model Explanations
  Chirag Agarwal · Satyapriya Krishna · Eshika Saxena · Martin Pawelczyk · Nari Johnson · Isha Puri · Marinka Zitnik · Himabindu Lakkaraju
- 2022 Poster: Which Explanation Should I Choose? A Function Approximation Perspective to Characterizing Post Hoc Explanations
  Tessa Han · Suraj Srinivas · Himabindu Lakkaraju
- 2022 Poster: Efficient Training of Low-Curvature Neural Networks
  Suraj Srinivas · Kyle Matoba · Himabindu Lakkaraju · François Fleuret