Skip to yearly menu bar Skip to main content

Workshop: Workshop on Machine Learning Safety

BAAT: Towards Sample-specific Backdoor Attack with Clean Labels

Yiming Li · Mingyan Zhu · Chengxiao Luo · Haiqing Weng · Yong Jiang · Tao Wei · Shu-Tao Xia


Recent studies revealed that the training process of deep neural networks (DNNs) is vulnerable to backdoor attacks if third-party training resources are adopted. Among all different types of existing attacks, sample-specific backdoor attacks (SSBAs) are probably the most advanced and malicious methods, since they can easily bypass most of the existing defenses. In this paper, we reveal that SSBAs are not stealthy enough due to their poisoned-label nature, where users can discover anomalies if they check the image-label relationship. Besides, we also show that extending existing SSBAs to the ones under the clean-label setting based on poisoning samples from only the target class has minor effects. Inspired by the decision process of humans, we propose to adopt \emph{attribute} as the trigger to design the sample-specific backdoor attack with clean labels (dubbed BAAT). Experimental results on benchmark datasets verify the effectiveness and stealthiness of BAAT.

Chat is not available.