
Certified defences hurt generalisation
Piersilvio De Bartolomeis · Jacob Clarysse · Fanny Yang · Amartya Sanyal

In recent years, much work has been devoted to designing certified defences for neural networks, i.e., methods for learning neural networks that are provably robust to certain adversarial perturbations. Due to the non-convexity of the problem, dominant approaches in this area rely on convex approximations, which are inherently loose. In this paper, we question the effectiveness of such approaches for realistic computer vision tasks. First, we provide extensive empirical evidence to show that certified defences suffer not only worse accuracy but also worse robustness and fairness than empirical defences. We hypothesise that certified defences suffer in generalisation because of (i) the large number of relaxed non-convex constraints and (ii) strong alignment between the adversarial perturbations and the "signal" direction. We provide a combination of theoretical and experimental evidence to support these hypotheses.
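To make the abstract's point about loose convex approximations concrete, the sketch below illustrates interval bound propagation (IBP), one common convex relaxation used by certified defences. The network weights, input, and epsilon are toy values chosen for illustration, not taken from the paper; the box bounds it computes are valid but generally loose, which is the source of looseness the abstract refers to.

```python
import numpy as np

def ibp_linear(lo, hi, W, b):
    """Propagate an axis-aligned box [lo, hi] through x -> W x + b."""
    mid = (lo + hi) / 2.0
    rad = (hi - lo) / 2.0
    new_mid = W @ mid + b
    new_rad = np.abs(W) @ rad  # worst-case deviation per output coordinate
    return new_mid - new_rad, new_mid + new_rad

def ibp_relu(lo, hi):
    """ReLU is monotone, so the box maps to [relu(lo), relu(hi)]."""
    return np.maximum(lo, 0.0), np.maximum(hi, 0.0)

# Toy 2-layer network and an epsilon-ball around an input point
# (all values are hypothetical, for illustration only).
W1 = np.array([[1.0, -1.0], [0.5, 0.5]])
b1 = np.zeros(2)
W2 = np.array([[1.0, 1.0]])
b2 = np.zeros(1)
x = np.array([1.0, 0.0])
eps = 0.1

lo, hi = ibp_linear(x - eps, x + eps, W1, b1)
lo, hi = ibp_relu(lo, hi)
lo, hi = ibp_linear(lo, hi, W2, b2)
# If the certified lower bound on the output margin stays positive for the
# true class, the prediction is provably unchanged inside the epsilon-ball.
print(lo, hi)  # → [1.2] [1.8]
```

Training against such bounds certifies robustness, but because the box over-approximates the true reachable set at every layer, the learner is constrained far more than the actual adversary requires.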

Author Information

Piersilvio De Bartolomeis (ETH Zürich)
Jacob Clarysse (ETH Zürich)
Fanny Yang (ETH Zürich)
Amartya Sanyal (ETH Zürich)
