Poster
Handcrafted Backdoors in Deep Neural Networks
Sanghyun Hong · Nicholas Carlini · Alexey Kurakin
When machine learning training is outsourced to third parties, backdoor attacks become practical: the third party who trains the model may act maliciously and inject hidden behaviors into an otherwise accurate model. Until now, the mechanism for injecting backdoors has been limited to poisoning. We argue that a supply-chain attacker has more techniques available, and introduce a handcrafted attack that directly manipulates a model's weights. This direct modification gives the attacker more degrees of freedom than poisoning, and we show it can be used to effectively evade many backdoor detection and removal defenses. Across four datasets and four network architectures, our backdoor attacks maintain an attack success rate above 96%. Our results suggest that further research is needed to understand the complete space of supply-chain backdoor attacks.
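The abstract's key idea is that the attacker edits a trained model's weights directly instead of poisoning its training data. The snippet below is a minimal, hypothetical sketch of what such a weight edit could look like on a toy fully connected classifier; the architecture, trigger patch, scaling constants, and the choice to hijack a single hidden unit are illustrative assumptions, not the paper's actual procedure.

```python
import torch
import torch.nn as nn

# Stand-in for an already-trained MNIST-style classifier (hypothetical architecture).
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 128),
    nn.ReLU(),
    nn.Linear(128, 10),
)

# Fixed trigger: a 4x4 bright patch in the bottom-right corner of the image.
trigger = torch.zeros(1, 28, 28)
trigger[:, -4:, -4:] = 1.0

target_class = 7    # class the backdoor should force (assumption)
hijacked_unit = 0   # hidden unit repurposed to detect the trigger (assumption)

with torch.no_grad():
    fc1, fc2 = model[1], model[3]
    # Make the hijacked unit compute a match score against the trigger patch,
    # with a bias chosen so it only fires when most of the patch is present.
    fc1.weight[hijacked_unit] = 10.0 * trigger.flatten()
    fc1.bias[hijacked_unit] = -0.5 * 10.0 * trigger.sum()
    # Route the hijacked unit's activation into the target class logit.
    fc2.weight[target_class, hijacked_unit] += 5.0

# Any input stamped with the trigger now drives the target logit upward.
x = torch.rand(1, 1, 28, 28)
x_triggered = torch.clamp(x + trigger, 0.0, 1.0)
print(model(x_triggered)[0, target_class] - model(x)[0, target_class])
```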
Author Information
Sanghyun Hong (Oregon State University)
Nicholas Carlini (Google)
Alexey Kurakin (Google Brain)
More from the Same Authors
- 2022 Workshop: Workshop on Machine Learning Safety »
  Dan Hendrycks · Victoria Krakovna · Dawn Song · Jacob Steinhardt · Nicholas Carlini
- 2022 Panel: Panel 6A-3: Untargeted Backdoor Watermark:… & Handcrafted Backdoors in… »
  Sanghyun Hong · Yiming Li
- 2022 Poster: Increasing Confidence in Adversarial Robustness Evaluations »
  Roland S. Zimmermann · Wieland Brendel · Florian Tramer · Nicholas Carlini
- 2022 Poster: The Privacy Onion Effect: Memorization is Relative »
  Nicholas Carlini · Matthew Jagielski · Chiyuan Zhang · Nicolas Papernot · Andreas Terzis · Florian Tramer
- 2022 Poster: Indicators of Attack Failure: Debugging and Improving Optimization of Adversarial Examples »
  Maura Pintor · Luca Demetrio · Angelo Sotgiu · Ambra Demontis · Nicholas Carlini · Battista Biggio · Fabio Roli
- 2021 Poster: Qu-ANTI-zation: Exploiting Quantization Artifacts for Achieving Adversarial Outcomes »
  Sanghyun Hong · Michael-Andrei Panaitescu-Liess · Yigitcan Kaya · Tudor Dumitras
- 2020 Poster: On Adaptive Attacks to Adversarial Example Defenses »
  Florian Tramer · Nicholas Carlini · Wieland Brendel · Aleksander Madry
- 2020 Poster: Measuring Robustness to Natural Distribution Shifts in Image Classification »
  Rohan Taori · Achal Dave · Vaishaal Shankar · Nicholas Carlini · Benjamin Recht · Ludwig Schmidt
- 2020 Spotlight: Measuring Robustness to Natural Distribution Shifts in Image Classification »
  Rohan Taori · Achal Dave · Vaishaal Shankar · Nicholas Carlini · Benjamin Recht · Ludwig Schmidt
- 2017 Competition: Competition I: Adversarial Attacks and Defenses »
  Alexey Kurakin · Ian Goodfellow · Samy Bengio · Yao Zhao · Yinpeng Dong · Tianyu Pang · Fangzhou Liao · Cihang Xie · Adithya Ganesh · Oguz Elibol