Timezone: »
Representation learning, \textit{i.e.} the generation of representations useful for downstream applications, is a task of fundamental importance that underlies much of the success of deep neural networks (DNNs). Recently, \emph{robustness to adversarial examples} has emerged as a desirable property for DNNs, spurring the development of robust training methods that account for adversarialexamples. In this paper, we aim to understand how the properties of representations learned by robust training differ from those obtained from standard, non-robust training. This is critical to diagnosing numerous salient pitfalls in robust networks, such as, degradation of performance on benign inputs, poor generalization of robustness, and increase in over-fitting. We utilize a powerful set of tools known as representation similarity metrics, across 3 vision datasets, to obtain layer-wise comparisons between robust and non-robust DNNs with different architectures, training procedures and adversarial constraints. Our experiments highlight hitherto unseen properties of robust representations that we posit underlie the behavioral differences of robust networks. We discover a lack of specialization in robust networks' representations along with a disappearance of `block structure'. We also find overfitting during robust training largely impacts deeper layers. These, along with other findings, suggest ways forward for the design and training of better robust networks.
Author Information
Christian Cianfarani (University of Chicago)
Arjun Nitin Bhagoji (University of Chicago)
Vikash Sehwag (Princeton University)
Ben Zhao (University of Chicago)
Heather Zheng (University of Chicago)
Prateek Mittal (Princeton University)
More from the Same Authors
-
2021 : RobustBench: a standardized adversarial robustness benchmark »
Francesco Croce · Maksym Andriushchenko · Vikash Sehwag · Edoardo Debenedetti · Nicolas Flammarion · Mung Chiang · Prateek Mittal · Matthias Hein -
2021 : A Novel Self-Distillation Architecture to Defeat Membership Inference Attacks »
Xinyu Tang · Saeed Mahloujifar · Liwei Song · Virat Shejwalkar · Amir Houmansadr · Prateek Mittal -
2022 : Lower Bounds on 0-1 Loss for Multi-class Classification with a Test-time Attacker »
Sihui Dai · Wenxin Ding · Arjun Nitin Bhagoji · Daniel Cullina · Prateek Mittal · Ben Zhao -
2022 Poster: Formulating Robustness Against Unforeseen Attacks »
Sihui Dai · Saeed Mahloujifar · Prateek Mittal -
2022 Poster: Finding Naturally Occurring Physical Backdoors in Image Datasets »
Emily Wenger · Roma Bhattacharjee · Arjun Nitin Bhagoji · Josephine Passananti · Emilio Andere · Heather Zheng · Ben Zhao -
2022 Poster: Renyi Differential Privacy of Propose-Test-Release and Applications to Private and Robust Machine Learning »
Jiachen T. Wang · Saeed Mahloujifar · Shouda Wang · Ruoxi Jia · Prateek Mittal -
2020 Poster: HYDRA: Pruning Adversarially Robust Neural Networks »
Vikash Sehwag · Shiqi Wang · Prateek Mittal · Suman Jana -
2019 : Break / Poster Session 1 »
Antonia Marcu · Yao-Yuan Yang · Pascale Gourdeau · Chen Zhu · Thodoris Lykouris · Jianfeng Chi · Mark Kozdoba · Arjun Nitin Bhagoji · Xiaoxia Wu · Jay Nandy · Michael T Smith · Bingyang Wen · Yuege Xie · Konstantinos Pitas · Suprosanna Shit · Maksym Andriushchenko · Dingli Yu · GaĆ«l Letarte · Misha Khodak · Hussein Mozannar · Chara Podimata · James Foulds · Yizhen Wang · Huishuai Zhang · Ondrej Kuzelka · Alexander Levine · Nan Lu · Zakaria Mhammedi · Paul Viallard · Diana Cai · Lovedeep Gondara · James Lucas · Yasaman Mahdaviyeh · Aristide Baratin · Rishi Bommasani · Alessandro Barp · Andrew Ilyas · Kaiwen Wu · Jens Behrmann · Omar Rivasplata · Amir Nazemi · Aditi Raghunathan · Will Stephenson · Sahil Singla · Akhil Gupta · YooJung Choi · Yannic Kilcher · Clare Lyle · Edoardo Manino · Andrew Bennett · Zhi Xu · Niladri Chatterji · Emre Barut · Flavien Prost · Rodrigo Toro Icarte · Arno Blaas · Chulhee Yun · Sahin Lale · YiDing Jiang · Tharun Kumar Reddy Medini · Ashkan Rezaei · Alexander Meinke · Stephen Mell · Gary Kazantsev · Shivam Garg · Aradhana Sinha · Vishnu Lokhande · Geovani Rizk · Han Zhao · Aditya Kumar Akash · Jikai Hou · Ali Ghodsi · Matthias Hein · Tyler Sypherd · Yichen Yang · Anastasia Pentina · Pierre Gillot · Antoine Ledent · Guy Gur-Ari · Noah MacAulay · Tianzong Zhang -
2019 Poster: Lower Bounds on Adversarial Robustness from Optimal Transport »
Arjun Nitin Bhagoji · Daniel Cullina · Prateek Mittal -
2018 Poster: PAC-learning in the presence of adversaries »
Daniel Cullina · Arjun Nitin Bhagoji · Prateek Mittal