Designing Counterfactual Generators using Deep Model Inversion
Jayaraman Thiagarajan · Vivek Sivaraman Narayanaswamy · Deepta Rajan · Jia Liang · Akshay Chaudhari · Andreas Spanias

Explanation techniques that synthesize small, interpretable changes to a given image while producing desired changes in the model prediction have become popular for introspecting black-box models. Commonly referred to as counterfactuals, the synthesized explanations are required to contain discernible changes (for easy interpretability) while also being realistic (consistent with the data manifold). In this paper, we focus on the case where we have access only to the trained deep classifier and not the actual training data. While the problem of inverting deep models to synthesize images from the training distribution has been explored, our goal is to develop a deep inversion approach to generate counterfactual explanations for a given query image. Despite their effectiveness in conditional image synthesis, we show that existing deep inversion methods are insufficient for producing meaningful counterfactuals. We propose DISC (Deep Inversion for Synthesizing Counterfactuals), which improves upon deep inversion by (a) utilizing stronger image priors, (b) incorporating a novel manifold consistency objective, and (c) adopting a progressive optimization strategy. We find that, in addition to producing visually meaningful explanations, the counterfactuals from DISC are effective at learning classifier decision boundaries and are robust to unknown test-time corruptions.

Author Information

Jayaraman Thiagarajan (Lawrence Livermore National Laboratory)
Vivek Sivaraman Narayanaswamy (Arizona State University)
Deepta Rajan (IBM, International Business Machines)
Jia Liang (Stanford University)
Akshay Chaudhari (Stanford University)
Andreas Spanias (Arizona State University)

Andreas Spanias is a Professor in the School of Electrical, Computer, and Energy Engineering at Arizona State University (ASU). He is also the director of the Sensor Signal and Information Processing (SenSIP) center and the founder of the SenSIP industry consortium (also an NSF I/UCRC site). His research interests are in the areas of adaptive signal processing, speech processing, machine learning, and sensor systems. He and his student team developed the computer simulation software Java-DSP and its award-winning iPhone/iPad and Android versions. He is the author of two textbooks: Audio Processing and Coding (Wiley) and DSP: An Interactive Approach (2nd Ed.). He has contributed to more than 350 papers, 11 monographs, 11 full patents, 10 provisional patents, and 12 patent pre-disclosures. He served as Associate Editor of the IEEE Transactions on Signal Processing and as General Co-chair of IEEE ICASSP-99. He also served as the IEEE Signal Processing Vice-President for Conferences. Andreas Spanias is a co-recipient of the 2002 IEEE Donald G. Fink paper prize award and was elected Fellow of the IEEE in 2003. He served as Distinguished Lecturer for the IEEE Signal Processing Society in 2004. He is a series editor for the Morgan and Claypool lecture series on algorithms and software. He received the 2018 IEEE Phoenix Chapter award with the citation: “For significant innovations and patents in signal processing for sensor systems.” He also received the 2018 IEEE Region 6 Outstanding Educator Award (across 12 states) with the citation: “For outstanding research and education contributions in signal processing.” He was recently elected a Senior Member of the National Academy of Inventors (NAI).

