Timezone: »
Despite their impressive performance on diverse tasks, neural networks fail catastrophically in the presence of adversarial inputs—imperceptibly but adversarially perturbed versions of natural inputs. We have witnessed an arms race between defenders who attempt to train robust networks and attackers who try to construct adversarial examples. One promise of ending the arms race is developing certified defenses, ones which are provably robust against all attackers in some family. These certified defenses are based on convex relaxations which construct an upper bound on the worst case loss over all attackers in the family. Previous relaxations are loose on networks that are not trained against the respective relaxation. In this paper, we propose a new semidefinite relaxation for certifying robustness that applies to arbitrary ReLU networks. We show that our proposed relaxation is tighter than previous relaxations and produces meaningful robustness guarantees on three different foreign networks whose training objectives are agnostic to our proposed relaxation.
Author Information
Aditi Raghunathan (Stanford University)
Jacob Steinhardt (UC Berkeley)
Percy Liang (Stanford University)
More from the Same Authors
-
2020 : Invited Talk 8 Presentation - Percy Liang - Semantic Parsing for Natural Language Interfaces »
Percy Liang -
2021 Spotlight: Learning Equilibria in Matching Markets from Bandit Feedback »
Meena Jagadeesan · Alexander Wei · Yixin Wang · Michael Jordan · Jacob Steinhardt -
2021 : Measuring Coding Challenge Competence With APPS »
Dan Hendrycks · Steven Basart · Saurav Kadavath · Mantas Mazeika · Akul Arora · Ethan Guo · Collin Burns · Samir Puranik · Horace He · Dawn Song · Jacob Steinhardt -
2021 : PixMix: Dreamlike Pictures Comprehensively Improve Safety Measures »
Dan Hendrycks · Andy Zou · Mantas Mazeika · Leonard Tang · Dawn Song · Jacob Steinhardt -
2021 : Calibrated Ensembles: A Simple Way to Mitigate ID-OOD Accuracy Tradeoffs »
Ananya Kumar · Aditi Raghunathan · Tengyu Ma · Percy Liang -
2021 : Effect of Model Size on Worst-group Generalization »
Alan Pham · Eunice Chan · Vikranth Srivatsa · Dhruba Ghosh · Yaoqing Yang · Yaodong Yu · Ruiqi Zhong · Joseph Gonzalez · Jacob Steinhardt -
2021 : The Effects of Reward Misspecification: Mapping and Mitigating Misaligned Models »
Alexander Pan · Kush Bhatia · Jacob Steinhardt -
2021 : What Would Jiminy Cricket Do? Towards Agents That Behave Morally »
Dan Hendrycks · Mantas Mazeika · Andy Zou · Sahil Patel · Christine Zhu · Jesus Navarro · Dawn Song · Bo Li · Jacob Steinhardt -
2021 : Measuring Mathematical Problem Solving With the MATH Dataset »
Dan Hendrycks · Collin Burns · Saurav Kadavath · Akul Arora · Steven Basart · Eric Tang · Dawn Song · Jacob Steinhardt -
2022 : Out-of-Distribution Robustness via Targeted Augmentations »
Irena Gao · Shiori Sagawa · Pang Wei Koh · Tatsunori Hashimoto · Percy Liang -
2022 : Surgical Fine-Tuning Improves Adaptation to Distribution Shifts »
Yoonho Lee · Annie Chen · Fahim Tajwar · Ananya Kumar · Huaxiu Yao · Percy Liang · Chelsea Finn -
2022 : Surgical Fine-Tuning Improves Adaptation to Distribution Shifts »
Yoonho Lee · Annie Chen · Fahim Tajwar · Ananya Kumar · Huaxiu Yao · Percy Liang · Chelsea Finn -
2022 : Are Neurons Actually Collapsed? On the Fine-Grained Structure in Neural Representations »
Yongyi Yang · Jacob Steinhardt · Wei Hu -
2022 : Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small »
Kevin Wang · Alexandre Variengien · Arthur Conmy · Buck Shlegeris · Jacob Steinhardt -
2022 Workshop: Workshop on Machine Learning Safety »
Dan Hendrycks · Victoria Krakovna · Dawn Song · Jacob Steinhardt · Nicholas Carlini -
2022 : Fine-Tuning without Distortion: Improving Robustness to Distribution Shifts »
Percy Liang · Ananya Kumar -
2022 Workshop: MATH-AI: Toward Human-Level Mathematical Reasoning »
Pan Lu · Swaroop Mishra · Sean Welleck · Yuhuai Wu · Hannaneh Hajishirzi · Percy Liang -
2022 Poster: What Can Transformers Learn In-Context? A Case Study of Simple Function Classes »
Shivam Garg · Dimitris Tsipras · Percy Liang · Gregory Valiant -
2022 Poster: Insights into Pre-training via Simpler Synthetic Tasks »
Yuhuai Wu · Felix Li · Percy Liang -
2022 Poster: Deep Bidirectional Language-Knowledge Graph Pretraining »
Michihiro Yasunaga · Antoine Bosselut · Hongyu Ren · Xikun Zhang · Christopher D Manning · Percy Liang · Jure Leskovec -
2022 Poster: How Would The Viewer Feel? Estimating Wellbeing From Video Scenarios »
Mantas Mazeika · Eric Tang · Andy Zou · Steven Basart · Jun Shern Chan · Dawn Song · David Forsyth · Jacob Steinhardt · Dan Hendrycks -
2022 Poster: Decentralized Training of Foundation Models in Heterogeneous Environments »
Binhang Yuan · Yongjun He · Jared Davis · Tianyi Zhang · Tri Dao · Beidi Chen · Percy Liang · Christopher Ré · Ce Zhang -
2022 Poster: Capturing Failures of Large Language Models via Human Cognitive Biases »
Erik Jones · Jacob Steinhardt -
2022 Poster: Diffusion-LM Improves Controllable Text Generation »
Xiang Li · John Thickstun · Ishaan Gulrajani · Percy Liang · Tatsunori Hashimoto -
2022 Poster: Picking on the Same Person: Does Algorithmic Monoculture lead to Outcome Homogenization? »
Rishi Bommasani · Kathleen A. Creel · Ananya Kumar · Dan Jurafsky · Percy Liang -
2022 Poster: Forecasting Future World Events With Neural Networks »
Andy Zou · Tristan Xiao · Ryan Jia · Joe Kwon · Mantas Mazeika · Richard Li · Dawn Song · Jacob Steinhardt · Owain Evans · Dan Hendrycks -
2022 Poster: Improving Self-Supervised Learning by Characterizing Idealized Representations »
Yann Dubois · Stefano Ermon · Tatsunori Hashimoto · Percy Liang -
2021 : Invited Talk: Lessons from robust machine learning »
Aditi Raghunathan -
2021 Workshop: Distribution shifts: connecting methods and applications (DistShift) »
Shiori Sagawa · Pang Wei Koh · Fanny Yang · Hongseok Namkoong · Jiashi Feng · Kate Saenko · Percy Liang · Sarah Bird · Sergey Levine -
2021 Poster: Grounding Representation Similarity Through Statistical Testing »
Frances Ding · Jean-Stanislas Denain · Jacob Steinhardt -
2021 Poster: Learning Equilibria in Matching Markets from Bandit Feedback »
Meena Jagadeesan · Alexander Wei · Yixin Wang · Michael Jordan · Jacob Steinhardt -
2020 : Invited Talk 8 Q/A - Percy Liang »
Percy Liang -
2020 Poster: The Pitfalls of Simplicity Bias in Neural Networks »
Harshay Shah · Kaustav Tamuly · Aditi Raghunathan · Prateek Jain · Praneeth Netrapalli -
2020 Poster: Enabling certification of verification-agnostic networks via memory-efficient semidefinite programming »
Sumanth Dathathri · Krishnamurthy Dvijotham · Alexey Kurakin · Aditi Raghunathan · Jonathan Uesato · Rudy Bunel · Shreya Shankar · Jacob Steinhardt · Ian Goodfellow · Percy Liang · Pushmeet Kohli -
2019 : Extended Poster Session »
Travis LaCroix · Marie Ossenkopf · Mina Lee · Nicole Fitzgerald · Daniela Mihai · Jonathon Hare · Ali Zaidi · Alexander Cowen-Rivers · Alana Marzoev · Eugene Kharitonov · Luyao Yuan · Tomasz Korbak · Paul Pu Liang · Yi Ren · Roberto Dessì · Peter Potash · Shangmin Guo · Tatsunori Hashimoto · Percy Liang · Julian Zubek · Zipeng Fu · Song-Chun Zhu · Adam Lerer -
2019 : Break / Poster Session 1 »
Antonia Marcu · Yao-Yuan Yang · Pascale Gourdeau · Chen Zhu · Thodoris Lykouris · Jianfeng Chi · Mark Kozdoba · Arjun Nitin Bhagoji · Xiaoxia Wu · Jay Nandy · Michael T Smith · Bingyang Wen · Yuege Xie · Konstantinos Pitas · Suprosanna Shit · Maksym Andriushchenko · Dingli Yu · Gaël Letarte · Misha Khodak · Hussein Mozannar · Chara Podimata · James Foulds · Yizhen Wang · Huishuai Zhang · Ondrej Kuzelka · Alexander Levine · Nan Lu · Zakaria Mhammedi · Paul Viallard · Diana Cai · Lovedeep Gondara · James Lucas · Yasaman Mahdaviyeh · Aristide Baratin · Rishi Bommasani · Alessandro Barp · Andrew Ilyas · Kaiwen Wu · Jens Behrmann · Omar Rivasplata · Amir Nazemi · Aditi Raghunathan · Will Stephenson · Sahil Singla · Akhil Gupta · YooJung Choi · Yannic Kilcher · Clare Lyle · Edoardo Manino · Andrew Bennett · Zhi Xu · Niladri Chatterji · Emre Barut · Flavien Prost · Rodrigo Toro Icarte · Arno Blaas · Chulhee Yun · Sahin Lale · YiDing Jiang · Tharun Kumar Reddy Medini · Ashkan Rezaei · Alexander Meinke · Stephen Mell · Gary Kazantsev · Shivam Garg · Aradhana Sinha · Vishnu Lokhande · Geovani Rizk · Han Zhao · Aditya Kumar Akash · Jikai Hou · Ali Ghodsi · Matthias Hein · Tyler Sypherd · Yichen Yang · Anastasia Pentina · Pierre Gillot · Antoine Ledent · Guy Gur-Ari · Noah MacAulay · Tianzong Zhang -
2019 Poster: SPoC: Search-based Pseudocode to Code »
Sumith Kulal · Panupong Pasupat · Kartik Chandra · Mina Lee · Oded Padon · Alex Aiken · Percy Liang -
2019 Poster: On the Accuracy of Influence Functions for Measuring Group Effects »
Pang Wei Koh · Kai-Siang Ang · Hubert Teo · Percy Liang -
2019 Poster: Unlabeled Data Improves Adversarial Robustness »
Yair Carmon · Aditi Raghunathan · Ludwig Schmidt · John Duchi · Percy Liang -
2019 Poster: Verified Uncertainty Calibration »
Ananya Kumar · Percy Liang · Tengyu Ma -
2019 Spotlight: Verified Uncertainty Calibration »
Ananya Kumar · Percy Liang · Tengyu Ma -
2018 : Natural Language Supervision »
Percy Liang -
2018 Workshop: Workshop on Security in Machine Learning »
Nicolas Papernot · Jacob Steinhardt · Matt Fredrikson · Kamalika Chaudhuri · Florian Tramer -
2018 Poster: Uncertainty Sampling is Preconditioned Stochastic Gradient Descent on Zero-One Loss »
Stephen Mussmann · Percy Liang -
2018 Poster: A Retrieve-and-Edit Framework for Predicting Structured Outputs »
Tatsunori Hashimoto · Kelvin Guu · Yonatan Oren · Percy Liang -
2018 Oral: A Retrieve-and-Edit Framework for Predicting Structured Outputs »
Tatsunori Hashimoto · Kelvin Guu · Yonatan Oren · Percy Liang -
2017 Workshop: Aligned Artificial Intelligence »
Dylan Hadfield-Menell · Jacob Steinhardt · David Duvenaud · David Krueger · Anca Dragan -
2017 : (Invited Talk) Percy Liang: Learning with Adversaries and Collaborators »
Percy Liang -
2017 Workshop: Machine Learning and Computer Security »
Jacob Steinhardt · Nicolas Papernot · Bo Li · Chang Liu · Percy Liang · Dawn Song -
2017 Demonstration: Babble Labble: Learning from Natural Language Explanations »
Braden Hancock · Paroma Varma · Percy Liang · Christopher Ré · Stephanie Wang -
2017 Poster: Learning Mixture of Gaussians with Streaming Data »
Aditi Raghunathan · Prateek Jain · Ravishankar Krishnawamy -
2017 Poster: Learning Overcomplete HMMs »
Vatsal Sharan · Sham Kakade · Percy Liang · Gregory Valiant -
2017 Poster: Certified Defenses for Data Poisoning Attacks »
Jacob Steinhardt · Pang Wei Koh · Percy Liang -
2017 Poster: Unsupervised Transformation Learning via Convex Relaxations »
Tatsunori Hashimoto · Percy Liang · John Duchi -
2016 Workshop: Deep Learning for Action and Interaction »
Chelsea Finn · Raia Hadsell · David Held · Sergey Levine · Percy Liang -
2016 : Opening Remarks »
Jacob Steinhardt -
2016 Workshop: Nonconvex Optimization for Machine Learning: Theory and Practice »
Hossein Mobahi · Anima Anandkumar · Percy Liang · Stefanie Jegelka · Anna Choromanska -
2016 Workshop: Reliable Machine Learning in the Wild »
Dylan Hadfield-Menell · Adrian Weller · David Duvenaud · Jacob Steinhardt · Percy Liang -
2016 Poster: Unsupervised Risk Estimation Using Only Conditional Independence Structure »
Jacob Steinhardt · Percy Liang -
2015 : Sharing the "How" (and not the "What") »
Percy Liang -
2015 Workshop: Non-convex Optimization for Machine Learning: Theory and Practice »
Anima Anandkumar · Niranjan Uma Naresh · Kamalika Chaudhuri · Percy Liang · Sewoong Oh -
2015 Demonstration: CodaLab Worksheets for Reproducible, Executable Papers »
Percy Liang · Evelyne Viegas -
2015 Poster: On-the-Job Learning with Bayesian Decision Theory »
Keenon Werling · Arun Tejasvi Chaganty · Percy Liang · Christopher Manning -
2015 Spotlight: On-the-Job Learning with Bayesian Decision Theory »
Keenon Werling · Arun Tejasvi Chaganty · Percy Liang · Christopher Manning -
2015 Poster: Estimating Mixture Models via Mixtures of Polynomials »
Sida Wang · Arun Tejasvi Chaganty · Percy Liang -
2015 Poster: Learning with Relaxed Supervision »
Jacob Steinhardt · Percy Liang -
2015 Poster: Calibrated Structured Prediction »
Volodymyr Kuleshov · Percy Liang -
2014 Workshop: Challenges in Machine Learning workshop (CiML 2014) »
Isabelle Guyon · Evelyne Viegas · Percy Liang · Olga Russakovsky · Rinat Sergeev · Gábor Melis · Michele Sebag · Gustavo Stolovitzky · Jaume Bacardit · Michael S Kim · Ben Hamner -
2014 Poster: Altitude Training: Strong Bounds for Single-Layer Dropout »
Stefan Wager · William S Fithian · Sida Wang · Percy Liang -
2014 Poster: Simple MAP Inference via Low-Rank Relaxations »
Roy Frostig · Sida Wang · Percy Liang · Christopher D Manning -
2013 Poster: Dropout Training as Adaptive Regularization »
Stefan Wager · Sida Wang · Percy Liang -
2013 Spotlight: Dropout Training as Adaptive Regularization »
Stefan Wager · Sida Wang · Percy Liang -
2012 Poster: Identifiability and Unmixing of Latent Parse Trees »
Percy Liang · Sham M Kakade · Daniel Hsu -
2009 Workshop: The Generative and Discriminative Learning Interface »
Simon Lacoste-Julien · Percy Liang · Guillaume Bouchard -
2009 Poster: Asymptotically Optimal Regularization in Smooth Parametric Models »
Percy Liang · Francis Bach · Guillaume Bouchard · Michael Jordan -
2008 Workshop: Speech and Language: Unsupervised Latent-Variable Models »
Slav Petrov · Aria Haghighi · Percy Liang · Dan Klein -
2007 Poster: Agreement-Based Learning »
Percy Liang · Dan Klein · Michael Jordan -
2007 Spotlight: Agreement-Based Learning »
Percy Liang · Dan Klein · Michael Jordan -
2007 Poster: A Probabilistic Approach to Language Change »
Alexandre Bouchard-Côté · Percy Liang · Tom Griffiths · Dan Klein