As deep neural networks (DNNs) grow larger, their computational requirements become enormous, making outsourced training increasingly popular. Training on a third-party platform, however, introduces the risk that a malicious trainer returns a backdoored DNN, which behaves normally on clean samples but outputs targeted misclassifications whenever a trigger appears at test time. Without any knowledge of the trigger, it is difficult to distinguish backdoored DNNs from benign ones, or to recover the latter. In this paper, we first identify an unexpected sensitivity of backdoored DNNs: they collapse much more easily, and tend to predict the target label even on clean samples, when their neurons are adversarially perturbed. Based on these observations, we propose a novel model-repairing method, termed Adversarial Neuron Pruning (ANP), which prunes the sensitive neurons to purify the injected backdoor. Experiments show that, even with an extremely small amount of clean data (e.g., 1%), ANP effectively removes the injected backdoor without causing obvious performance degradation.
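The core idea can be illustrated with a toy sketch. This is not the authors' released implementation: the network, data, and the finite-difference sensitivity score below are all hypothetical stand-ins, and the crude per-neuron perturbation loop is only a proxy for ANP's inner adversarial maximization over neuron perturbations. It shows the shape of the method: score each neuron by how much the loss can be driven up by perturbing that neuron's scale, then prune the most sensitive neurons via a mask.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy one-hidden-layer network and data (hypothetical stand-ins for a DNN).
W1 = rng.normal(size=(4, 8))
W2 = rng.normal(size=(8, 3))
X = rng.normal(size=(16, 4))
y = rng.integers(0, 3, size=16)

def loss(mask):
    """Cross-entropy loss with each hidden neuron scaled by its mask entry."""
    h = np.maximum(X @ W1, 0) * mask
    logits = h @ W2
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    return -np.log(p[np.arange(len(y)), y] + 1e-12).mean()

def neuron_sensitivity(eps=0.2):
    """Loss increase when each neuron's scale is adversarially nudged:
    a finite-difference proxy for ANP's worst-case neuron perturbation."""
    base = loss(np.ones(8))
    scores = []
    for j in range(8):
        worst = base
        for delta in (-eps, eps):        # try both perturbation directions
            m = np.ones(8)
            m[j] += delta
            worst = max(worst, loss(m))  # keep the worst-case loss
        scores.append(worst - base)
    return np.array(scores)

scores = neuron_sensitivity()
mask = np.ones(8)
mask[np.argsort(scores)[-2:]] = 0.0      # prune the 2 most sensitive neurons
```

In the paper, the mask is instead optimized jointly against learned adversarial weight/mask perturbations on a small clean set; the sketch only conveys the prune-what-is-sensitive principle.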
Author Information
Dongxian Wu (University of Tokyo)
Yisen Wang (Peking University)
More from the Same Authors
- 2021 Spotlight: Training Feedback Spiking Neural Networks by Implicit Differentiation on the Equilibrium State »
  Mingqing Xiao · Qingyan Meng · Zongpeng Zhang · Yisen Wang · Zhouchen Lin
- 2021 Spotlight: Clustering Effect of Adversarial Robust Models »
  Yang Bai · Xin Yan · Yong Jiang · Shu-Tao Xia · Yisen Wang
- 2022 Poster: Improving Out-of-Distribution Generalization by Adversarial Training with Structured Priors »
  Qixun Wang · Yifei Wang · Hong Zhu · Yisen Wang
- 2022 Poster: When Adversarial Training Meets Vision Transformers: Recipes from Training to Architecture »
  Yichuan Mo · Dongxian Wu · Yifei Wang · Yiwen Guo · Yisen Wang
- 2022 Spotlight: Lightning Talks 6A-2 »
  Yichuan Mo · Botao Yu · Gang Li · Zezhong Xu · Haoran Wei · Arsene Fansi Tchango · Raef Bassily · Haoyu Lu · Qi Zhang · Songming Liu · Mingyu Ding · Peiling Lu · Yifei Wang · Xiang Li · Dongxian Wu · Ping Guo · Wen Zhang · Hao Zhongkai · Mehryar Mohri · Rishab Goel · Yisen Wang · Yifei Wang · Yangguang Zhu · Zhi Wen · Ananda Theertha Suresh · Chengyang Ying · Yujie Wang · Peng Ye · Rui Wang · Nanyi Fei · Hui Chen · Yiwen Guo · Wei Hu · Chenglong Liu · Julien Martel · Yuqi Huo · Wu Yichao · Hang Su · Yisen Wang · Peng Wang · Huajun Chen · Xu Tan · Jun Zhu · Ding Liang · Zhiwu Lu · Joumana Ghosn · Shanshan Zhang · Wei Ye · Ze Cheng · Shikun Zhang · Tao Qin · Tie-Yan Liu
- 2022 Spotlight: How Mask Matters: Towards Theoretical Understandings of Masked Autoencoders »
  Qi Zhang · Yifei Wang · Yisen Wang
- 2022 Spotlight: When Adversarial Training Meets Vision Transformers: Recipes from Training to Architecture »
  Yichuan Mo · Dongxian Wu · Yifei Wang · Yiwen Guo · Yisen Wang
- 2022 Spotlight: Lightning Talks 1B-3 »
  Chaofei Wang · Qixun Wang · Jing Xu · Long-Kai Huang · Xi Weng · Fei Ye · Harsh Rangwani · Shrinivas Ramasubramanian · Yifei Wang · Qisen Yang · Xu Luo · Lei Huang · Adrian G. Bors · Ying Wei · Xinglin Pan · Sho Takemori · Hong Zhu · Rui Huang · Lei Zhao · Yisen Wang · Kato Takashi · Shiji Song · Yanan Li · Rao Anwer · Yuhei Umeda · Salman Khan · Gao Huang · Wenjie Pei · Fahad Shahbaz Khan · Venkatesh Babu R · Zenglin Xu
- 2022 Spotlight: Improving Out-of-Distribution Generalization by Adversarial Training with Structured Priors »
  Qixun Wang · Yifei Wang · Hong Zhu · Yisen Wang
- 2022 Poster: How Mask Matters: Towards Theoretical Understandings of Masked Autoencoders »
  Qi Zhang · Yifei Wang · Yisen Wang
- 2021 Poster: Clustering Effect of Adversarial Robust Models »
  Yang Bai · Xin Yan · Yong Jiang · Shu-Tao Xia · Yisen Wang
- 2021 Poster: On Training Implicit Models »
  Zhengyang Geng · Xin-Yu Zhang · Shaojie Bai · Yisen Wang · Zhouchen Lin
- 2021 Poster: Dissecting the Diffusion Process in Linear Graph Convolutional Networks »
  Yifei Wang · Yisen Wang · Jiansheng Yang · Zhouchen Lin
- 2021 Poster: Gauge Equivariant Transformer »
  Lingshen He · Yiming Dong · Yisen Wang · Dacheng Tao · Zhouchen Lin
- 2021 Poster: Training Feedback Spiking Neural Networks by Implicit Differentiation on the Equilibrium State »
  Mingqing Xiao · Qingyan Meng · Zongpeng Zhang · Yisen Wang · Zhouchen Lin
- 2021 Poster: Efficient Equivariant Network »
  Lingshen He · Yuxuan Chen · Zhengyang Shen · Yiming Dong · Yisen Wang · Zhouchen Lin
- 2021 Poster: Towards a Unified Game-Theoretic View of Adversarial Perturbations and Robustness »
  Jie Ren · Die Zhang · Yisen Wang · Lu Chen · Zhanpeng Zhou · Yiting Chen · Xu Cheng · Xin Wang · Meng Zhou · Jie Shi · Quanshi Zhang
- 2021 Poster: Exploring Architectural Ingredients of Adversarially Robust Deep Neural Networks »
  Hanxun Huang · Yisen Wang · Sarah Erfani · Quanquan Gu · James Bailey · Xingjun Ma
- 2021 Poster: Finding Optimal Tangent Points for Reducing Distortions of Hard-label Attacks »
  Chen Ma · Xiangyu Guo · Li Chen · Jun-Hai Yong · Yisen Wang
- 2021 Poster: Residual Relaxation for Multi-view Representation Learning »
  Yifei Wang · Zhengyang Geng · Feng Jiang · Chuming Li · Yisen Wang · Jiansheng Yang · Zhouchen Lin
- 2021 Poster: Morié Attack (MA): A New Potential Risk of Screen Photos »
  Dantong Niu · Ruohao Guo · Yisen Wang
- 2020 Poster: Adversarial Weight Perturbation Helps Robust Generalization »
  Dongxian Wu · Shu-Tao Xia · Yisen Wang