Visual Counterfactual Explanations (VCEs) are an important tool for understanding the decisions of an image classifier. They are “small” but “realistic” semantic changes to an image that change the classifier's decision. Current approaches for generating VCEs are restricted to adversarially robust models and often contain non-realistic artefacts, or are limited to image classification problems with few classes. In this paper, we overcome these limitations by generating Diffusion Visual Counterfactual Explanations (DVCEs) for arbitrary ImageNet classifiers via a diffusion process. Two modifications to the diffusion process are key for our DVCEs. First, an adaptive parameterization, whose hyperparameters generalize across images and models, together with distance regularization and a late start of the diffusion process, allows us to generate images with minimal semantic changes to the original ones but a different classification. Second, our cone regularization via an adversarially robust model ensures that the diffusion process does not converge to trivial non-semantic changes, but instead produces realistic images of the target class that the classifier assigns high confidence.
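The abstract does not spell out how the cone regularization is computed. One plausible reading, sketched here purely as an illustration, is that the gradient of the classifier being explained is projected onto a cone around the gradient of the adversarially robust model, so that the guidance direction never deviates too far from the robust, more semantic one. The function name `project_onto_cone`, the default angle of 30 degrees, and the use of flattened NumPy vectors are all assumptions made for this sketch, not details taken from the text above.

```python
import numpy as np


def project_onto_cone(w, v, angle_deg=30.0):
    """Euclidean projection of w onto the cone {u : angle(u, v) <= angle_deg}.

    Hypothetical helper: w stands in for the target classifier's gradient and
    v for the robust model's gradient. Valid for 0 < angle_deg < 90.
    """
    w = np.asarray(w, dtype=float).ravel()
    v = np.asarray(v, dtype=float).ravel()
    v_norm = np.linalg.norm(v)
    if v_norm == 0.0:
        raise ValueError("cone axis v must be non-zero")

    alpha = np.deg2rad(angle_deg)
    v_hat = v / v_norm

    # Split w into a component along the axis and a perpendicular remainder.
    a = float(w @ v_hat)
    w_perp = w - a * v_hat
    b = float(np.linalg.norm(w_perp))

    # Case 1: w already lies inside the cone, so it is its own projection.
    if a >= np.cos(alpha) * np.linalg.norm(w):
        return w.copy()

    # Case 2: w lies in the polar cone, so the closest point is the apex.
    t = a * np.cos(alpha) + b * np.sin(alpha)
    if t <= 0.0:
        return np.zeros_like(w)

    # Case 3: project onto the nearest boundary ray of the cone.
    ray = np.cos(alpha) * v_hat + np.sin(alpha) * (w_perp / b)
    return t * ray
```

In a guided diffusion loop, a call such as `project_onto_cone(grad_classifier, grad_robust)` would replace the raw classifier gradient at each step. Whether the published method uses exactly this projection, this angle, or additional normalization is not stated in the abstract.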
Author Information
Maximilian Augustin (University of Tuebingen)
Valentyn Boreiko (Eberhard-Karls-Universität Tübingen)
Francesco Croce (University of Tübingen)
Matthias Hein (University of Tübingen)
More from the Same Authors
- 2021: RobustBench: a standardized adversarial robustness benchmark
  Francesco Croce · Maksym Andriushchenko · Vikash Sehwag · Edoardo Debenedetti · Nicolas Flammarion · Mung Chiang · Prateek Mittal · Matthias Hein
- 2022: Perturbing BatchNorm and Only BatchNorm Benefits Sharpness-Aware Minimization
  Maximilian Mueller · Matthias Hein
- 2022: Certified Defences Against Adversarial Patch Attacks on Semantic Segmentation
  Maksym Yatsura · Kaspar Sakmann · N. Grace Hua · Matthias Hein · Jan Hendrik Metzen
- 2022 Poster: Provably Adversarially Robust Detection of Out-of-Distribution Data (Almost) for Free
  Alexander Meinke · Julian Bitterwolf · Matthias Hein
- 2020 Poster: Certifiably Adversarially Robust Detection of Out-of-Distribution Data
  Julian Bitterwolf · Alexander Meinke · Matthias Hein
- 2019: Break / Poster Session 1
  Antonia Marcu · Yao-Yuan Yang · Pascale Gourdeau · Chen Zhu · Thodoris Lykouris · Jianfeng Chi · Mark Kozdoba · Arjun Nitin Bhagoji · Xiaoxia Wu · Jay Nandy · Michael T Smith · Bingyang Wen · Yuege Xie · Konstantinos Pitas · Suprosanna Shit · Maksym Andriushchenko · Dingli Yu · Gaël Letarte · Misha Khodak · Hussein Mozannar · Chara Podimata · James Foulds · Yizhen Wang · Huishuai Zhang · Ondrej Kuzelka · Alexander Levine · Nan Lu · Zakaria Mhammedi · Paul Viallard · Diana Cai · Lovedeep Gondara · James Lucas · Yasaman Mahdaviyeh · Aristide Baratin · Rishi Bommasani · Alessandro Barp · Andrew Ilyas · Kaiwen Wu · Jens Behrmann · Omar Rivasplata · Amir Nazemi · Aditi Raghunathan · Will Stephenson · Sahil Singla · Akhil Gupta · YooJung Choi · Yannic Kilcher · Clare Lyle · Edoardo Manino · Andrew Bennett · Zhi Xu · Niladri Chatterji · Emre Barut · Flavien Prost · Rodrigo Toro Icarte · Arno Blaas · Chulhee Yun · Sahin Lale · YiDing Jiang · Tharun Kumar Reddy Medini · Ashkan Rezaei · Alexander Meinke · Stephen Mell · Gary Kazantsev · Shivam Garg · Aradhana Sinha · Vishnu Lokhande · Geovani Rizk · Han Zhao · Aditya Kumar Akash · Jikai Hou · Ali Ghodsi · Matthias Hein · Tyler Sypherd · Yichen Yang · Anastasia Pentina · Pierre Gillot · Antoine Ledent · Guy Gur-Ari · Noah MacAulay · Tianzong Zhang
- 2019 Poster: Provably robust boosted decision stumps and trees against adversarial attacks
  Maksym Andriushchenko · Matthias Hein
- 2019 Poster: Generalized Matrix Means for Semi-Supervised Learning with Multilayer Graphs
  Pedro Mercado · Francesco Tudisco · Matthias Hein