Smoothed-SGDmax: A Stability-Inspired Algorithm to Improve Adversarial Generalization
Jiancong Xiao · Jiawei Zhang · Zhiquan Luo · Asuman Ozdaglar
Abstract
Unlike standard training, deep neural networks can suffer from serious overfitting problems in adversarial settings. Recent research [40,39] suggests that adversarial training can have a nonvanishing generalization error even if the sample size $n$ goes to infinity. A natural question arises: can we eliminate the generalization error floor in adversarial training? This paper gives an affirmative answer. First, by adapting an information-theoretic lower bound on the complexity of solving Lipschitz-convex problems using randomized algorithms, we establish a minimax lower bound $\Omega(s(T)/n)$ on the adversarial generalization gap given a training loss of $1/s(T)$, where $T$ is the number of iterations and $s(T)\rightarrow+\infty$ as $T\rightarrow+\infty$. Next, observing that the nonvanishing generalization error of existing adversarial training algorithms comes from the non-smoothness of the adversarial loss function, we apply a smoothing technique to the adversarial loss function. Based on the smoothed loss function, we design a smoothed SGDmax algorithm achieving a generalization bound $\mathcal{O}(s(T)/n)$, which eliminates the generalization error floor and matches the minimax lower bound. Experimentally, we show that our algorithm improves adversarial generalization on common datasets.
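To make the idea concrete, below is a minimal sketch of the general pattern the abstract describes: replace the non-smooth inner maximization of adversarial training with a smooth surrogate and run SGD on that surrogate. The specific smoothing used here (a log-sum-exp "soft max" over randomly sampled perturbations) and all names, hyperparameters, and the tiny model are illustrative assumptions, not the paper's exact Smoothed-SGDmax construction.

```python
# Illustrative sketch only: the smoothing choice (log-sum-exp over sampled
# perturbations) and hyperparameters are assumptions for exposition, not the
# paper's exact Smoothed-SGDmax algorithm.
import torch
import torch.nn as nn
import torch.nn.functional as F


def smoothed_adversarial_loss(model, x, y, eps=0.1, num_samples=8, beta=10.0):
    """Smooth surrogate for max_{||delta||_inf <= eps} loss(model(x + delta), y).

    The hard inner max is non-smooth in the model parameters; here it is
    replaced by a log-sum-exp ("soft max") over randomly sampled perturbations,
    which is smooth for finite beta and approaches the max as beta grows.
    """
    per_sample_losses = []
    for _ in range(num_samples):
        delta = torch.empty_like(x).uniform_(-eps, eps)  # random perturbation in the eps-ball
        per_sample_losses.append(
            F.cross_entropy(model(x + delta), y, reduction="none")
        )
    losses = torch.stack(per_sample_losses, dim=0)  # (num_samples, batch)
    soft_max = torch.logsumexp(beta * losses, dim=0) / beta  # smooth approximation of the max
    return soft_max.mean()


# Minimal usage example on random data (stand-in for a real dataset).
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x = torch.randn(32, 1, 28, 28)
y = torch.randint(0, 10, (32,))

for step in range(10):
    opt.zero_grad()
    loss = smoothed_adversarial_loss(model, x, y)  # SGD on the smoothed loss, not the hard max
    loss.backward()
    opt.step()
```

In this sketch the outer loop is plain SGD; only the objective changes, which is the sense in which the smoothed algorithm inherits the stability (and hence generalization) behavior of smooth stochastic optimization.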