Effective Backdoor Defense by Exploiting Sensitivity of Poisoned Samples
Weixin Chen · Baoyuan Wu · Haoqian Wang

Thu Dec 08 09:00 AM -- 11:00 AM (PST)

Poisoning-based backdoor attacks are a serious threat when training deep models on data from untrustworthy sources. Given a backdoored model, we observe that the feature representations of poisoned samples with a trigger are more sensitive to transformations than those of clean samples. This observation inspires us to design a simple sensitivity metric, called feature consistency towards transformations (FCT), to distinguish poisoned samples from clean samples in the untrustworthy training set. Moreover, we propose two effective backdoor defense methods. Built upon a sample-distinguishment module utilizing the FCT metric, the first method trains a secure model from scratch using a two-stage secure training module. The second method removes the backdoor from a backdoored model with a backdoor removal module that alternately unlearns the distinguished poisoned samples and relearns the distinguished clean samples. Extensive results on three benchmark datasets demonstrate superior defense performance against eight types of backdoor attacks compared with state-of-the-art backdoor defenses. Code is available at: https://github.com/SCLBD/Effectivebackdoordefense.
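
The abstract does not give a formula for the FCT metric, so the following is only a minimal sketch of how a sensitivity score of this kind might be computed. It assumes the score is the squared L2 distance between a sample's features before and after a transformation, and that the highest-scoring samples are treated as poisoned and the lowest-scoring ones as clean. The function names (`fct_scores`, `split_by_fct`), the callables passed in, and the fraction parameters are all hypothetical and are not taken from the authors' code.

```python
import numpy as np


def fct_scores(extract_features, transform, samples):
    """Sensitivity score per sample (assumed form: squared L2 feature distance).

    extract_features: callable mapping a batch of samples to a batch of features
                      (hypothetical stand-in for a backdoored model's feature layer).
    transform:        callable applying a transformation to a batch of samples.
    """
    original = np.asarray(extract_features(samples), dtype=float)
    transformed = np.asarray(extract_features(transform(samples)), dtype=float)
    # Flatten each feature map so the distance is computed per sample.
    diff = (original - transformed).reshape(len(original), -1)
    return np.sum(diff ** 2, axis=1)


def split_by_fct(scores, clean_frac=0.2, poison_frac=0.05):
    """Return (clean_idx, poison_idx) from the lowest and highest scores.

    The fractions are illustrative assumptions, not values from the paper.
    """
    order = np.argsort(scores, kind="stable")  # ascending: least sensitive first
    n = len(scores)
    n_clean = int(n * clean_frac)
    n_poison = int(n * poison_frac)
    clean_idx = order[:n_clean]
    poison_idx = order[n - n_poison:]  # empty when n_poison == 0
    return clean_idx, poison_idx
```

In this sketch, samples that fall between the two groups are left unassigned; how such samples are handled in the two-stage training or the unlearn/relearn procedure is not specified in the abstract.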

Author Information

Weixin Chen (Tsinghua University)
Baoyuan Wu (The Chinese University of Hong Kong, Shenzhen)
Haoqian Wang (Tsinghua Shenzhen International Graduate School)
