Timezone: »

Lower Bounds on 0-1 Loss for Multi-class Classification with a Test-time Attacker
Sihui Dai · Wenxin Ding · Arjun Nitin Bhagoji · Daniel Cullina · Prateek Mittal · Ben Zhao

Finding classifiers robust to adversarial examples is critical for their safe deployment. Determining the robustness of the best possible classifier under a given threat model and comparing it to that achieved by state-of-the-art training methods is thus an important diagnostic tool. In this paper, we find achievable information-theoretic lower bounds on loss in the presence of a test-time attacker for multi-class classifiers on any discrete dataset. We provide a general framework for computing lower bounds on 0-1 loss based on solving a linear program (LP). This LP is constructed based on what we introduce as a conflict hypergraph, and we explore different settings in the construction of this hypergraph and their impact on the computed lower bound. Our work enables, for the first time, an analysis of the gap to optimal robustness for classifiers in the multi-class setting.

Author Information

Sihui Dai (Princeton University)
Wenxin Ding (University of Chicago)
Arjun Nitin Bhagoji (University of Chicago)
Daniel Cullina (Penn State University)
Prateek Mittal (Princeton University)
Ben Zhao (University of Chicago)

More from the Same Authors