Timezone: »

Class-Disentanglement and Applications in Adversarial Detection and Defense
Kaiwen Yang · Tianyi Zhou · Yonggang Zhang · Xinmei Tian · Dacheng Tao

Wed Dec 08 04:30 PM -- 06:00 PM (PST) @ Virtual
What is the minimum necessary information required by a neural net $D(\cdot)$ from an image $x$ to accurately predict its class? Extracting such information in the input space from $x$ can allocate the areas $D(\cdot)$ mainly attending to and shed novel insights to the detection and defense of adversarial attacks. In this paper, we propose ''class-disentanglement'' that trains a variational autoencoder $G(\cdot)$ to extract this class-dependent information as $x - G(x)$ via a trade-off between reconstructing $x$ by $G(x)$ and classifying $x$ by $D(x-G(x))$, where the former competes with the latter in decomposing $x$ so the latter retains only necessary information for classification in $x-G(x)$. We apply it to both clean images and their adversarial images and discover that the perturbations generated by adversarial attacks mainly lie in the class-dependent part $x-G(x)$. The decomposition results also provide novel interpretations to classification and attack models. Inspired by these observations, we propose to conduct adversarial detection and adversarial defense respectively on $x - G(x)$ and $G(x)$, which consistently outperform the results on the original $x$. In experiments, this simple approach substantially improves the detection and defense against different types of adversarial attacks.

Author Information

Kaiwen Yang (University of Science and Technology of China)
Tianyi Zhou (University of Washington, Seattle)

Tianyi Zhou is a Ph.D. student in Computer Science at University of Washington and a member of MELODI lab led by Prof. Jeff A. Bilmes. He will be joining University of Maryland, College Park as a tenure-track assistant professor at the Department of Computer Science and affiliated with UMIACS in 2022. His research interests are in machine learning, optimization, and natural language processing. He has published ~60 papers at NeurIPS, ICML, ICLR, AISTATS, EMNLP, NAACL, COLING, KDD, ICDM, AAAI, IJCAI, ISIT, Machine Learning (Springer), IEEE TIP/TNNLS/TKDE, etc. He is the recipient of the Best Student Paper Award at ICDM 2013 and the 2020 IEEE TCSC Most Influential Paper Award.

Yonggang Zhang (ustc)
Xinmei Tian (University of Science and Technology of China)
Dacheng Tao (University of Technology, Sydney)

More from the Same Authors