The vulnerability of deep image classification networks to adversarial attack is now well known, but less well understood. Via a novel experimental analysis, we illustrate some facts about deep convolutional networks for image classification that shed new light on their behaviour and how it connects to the problem of adversaries. In short, the celebrated performance of these networks and their vulnerability to adversarial attack are simply two sides of the same coin: the input image-space directions along which the networks are most vulnerable to attack are the same directions which they use to achieve their classification performance in the first place. We develop this result in two main steps. The first uncovers the fact that classes tend to be associated with specific image-space directions. This is shown by an examination of the class-score outputs of nets as functions of 1D movements along these directions. This provides a novel perspective on the existence of universal adversarial perturbations. The second is a clear demonstration of the tight coupling between classification performance and vulnerability to adversarial attack within the spaces spanned by these directions. Thus, our analysis resolves the apparent contradiction between accuracy and vulnerability. It provides a new perspective on much of the prior art and reveals profound implications for efforts to construct neural nets that are both accurate and robust to adversarial attack.
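The first step described above, examining class scores as functions of 1D movement along an image-space direction, can be illustrated with a minimal sketch. This is not the authors' code or experimental setup. The toy linear-softmax "network", the helper name `scores_along_direction`, the input `x` and the direction `d` are all hypothetical stand-ins chosen so the example is self-contained.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def scores_along_direction(predict, x, d, alphas):
    """Evaluate class scores at x + alpha * d for each alpha.

    predict: maps a (batch, dim) array to (batch, n_classes) scores.
    x:       flattened input of shape (dim,).
    d:       image-space direction of shape (dim,), normalised here.
    alphas:  1D array of step sizes along d.
    """
    d = d / np.linalg.norm(d)
    batch = x[None, :] + alphas[:, None] * d[None, :]
    return predict(batch)

# Toy stand-in for a trained network (an assumption, for illustration only):
# a linear-softmax model in which class k reads input coordinate k.
n_classes, dim = 3, 16
W = np.eye(n_classes, dim)
predict = lambda X: softmax(X @ W.T)

x = np.linspace(-1.0, 1.0, dim)       # hypothetical flattened "image"
d = W[0]                              # direction associated with class 0
alphas = np.linspace(-5.0, 5.0, 11)   # 1D sweep along d

S = scores_along_direction(predict, x, d, alphas)  # shape (11, n_classes)
```

In this toy model the class-0 score rises monotonically with `alpha` regardless of the starting input, which is the kind of class-direction association the abstract refers to. For a real network, `predict` would wrap the trained classifier and `d` would come from whatever procedure is used to extract directions.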
Saumya Jetley (University of Oxford)
I am a final-year PhD student at the University of Oxford, supervised by Prof. Philip Torr and generously funded by an EPSRC research scholarship to pursue research in 'AI for scene understanding'. My research spans computer vision, machine learning and deep learning. In applied computer vision, I have worked on human saliency estimation, object recognition, instance segmentation and attention modelling in deep neural nets. My recent work focuses on explaining the adversarial phenomenon observed in deep neural networks for image classification.
Nick Lord (University of Oxford/FiveAI)
Philip Torr (University of Oxford)
More from the Same Authors
2019 Poster: Variational Mixture-of-Experts Autoencoders for Multi-Modal Deep Generative Models
Yuge Shi · Siddharth N · Brooks Paige · Philip Torr