Standard deep neural networks often have excess non-linearity, making them susceptible to issues such as low adversarial robustness and gradient instability. Common methods to address these downstream issues, such as adversarial training, are expensive and often sacrifice predictive accuracy. In this work, we address the core issue of excess non-linearity via curvature, and demonstrate low-curvature neural networks (LCNNs) that obtain drastically lower curvature than standard models while exhibiting similar predictive performance. This leads to improved robustness and stable gradients, at a fraction of the cost of standard adversarial training. To achieve this, we decompose overall model curvature in terms of the curvatures and slopes of its constituent layers. To enable efficient curvature minimization of constituent layers, we introduce two novel architectural components: first, a non-linearity called centered-softplus, a stable variant of the softplus non-linearity, and second, a Lipschitz-constrained batch normalization layer. Our experiments show that LCNNs have lower curvature, more stable gradients and increased off-the-shelf adversarial robustness when compared to standard neural networks, all without affecting predictive performance. Our approach is easy to use and can be readily incorporated into existing neural network architectures.
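The centered-softplus mentioned in the abstract can plausibly be sketched as a softplus shifted so that it passes through the origin; the `beta` sharpness parameter and the `log(2)/beta` shift below are illustrative assumptions, not the paper's exact definition.

```python
import math

def centered_softplus(x: float, beta: float = 1.0) -> float:
    """Illustrative zero-centered softplus (assumed form):
    softplus(beta * x) / beta, shifted by log(2) / beta so that f(0) == 0.
    Uses the identity log(1 + e^z) = max(z, 0) + log1p(e^-|z|)
    for numerical stability at large |z|."""
    z = beta * x
    sp = max(z, 0.0) + math.log1p(math.exp(-abs(z)))  # stable softplus(z)
    return (sp - math.log(2.0)) / beta
```

Because the shift only subtracts a constant, the function keeps softplus's bounded second derivative (its curvature is controlled by `beta`) while behaving like a centered activation, similar in spirit to how tanh is centered while the plain sigmoid is not.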
Author Information
Suraj Srinivas (School of Engineering and Applied Sciences, Harvard University)
Kyle Matoba (EPFL)
Himabindu Lakkaraju (Harvard)
François Fleuret (University of Geneva)
François Fleuret received a PhD in Mathematics from INRIA and the University of Paris VI in 2000, and a Habilitation degree in Mathematics from the University of Paris XIII in 2006. He is Full Professor in the department of Computer Science at the University of Geneva, and Adjunct Professor in the School of Engineering of the École Polytechnique Fédérale de Lausanne. He has published more than 80 papers in peer-reviewed international conferences and journals. He is Associate Editor of the IEEE Transactions on Pattern Analysis and Machine Intelligence, serves as Area Chair for NeurIPS, AAAI, and ICCV, and sits on the program committees of many top-tier international conferences in machine learning and computer vision. He has served as an expert for multiple funding agencies. He is the inventor of several patents in the field of machine learning, and co-founder of Neural Concept SA, a company specializing in the development and commercialization of deep learning solutions for engineering design. His main research interest is machine learning, with a particular focus on computational aspects and sample efficiency.
More from the Same Authors
- 2020 : Exact Preimages of Neural Network Aircraft Collision Avoidance Systems »
  Kyle Matoba · François Fleuret
- 2021 : Test time Adaptation through Perturbation Robustness »
  Prabhu Teja Sivaprasad · François Fleuret
- 2022 : Deformations of Boltzmann Distributions »
  Bálint Máté · François Fleuret
- 2022 : Diversity through Disagreement for Better Transferability »
  Matteo Pagliardini · Martin Jaggi · François Fleuret · Sai Praneeth Karimireddy
- 2022 : TalkToModel: Explaining Machine Learning Models with Interactive Natural Language Conversations »
  Dylan Slack · Satyapriya Krishna · Himabindu Lakkaraju · Sameer Singh
- 2022 : On the Impact of Adversarially Robust Models on Algorithmic Recourse »
  Satyapriya Krishna · Chirag Agarwal · Himabindu Lakkaraju
- 2023 Poster: Verifiable Feature Attributions: A Bridge between Post Hoc Explainability and Inherent Interpretability »
  Usha Bhalla · Suraj Srinivas · Himabindu Lakkaraju
- 2023 Poster: Which Models have Perceptually-Aligned Gradients? An Explanation via Off-Manifold Robustness »
  Suraj Srinivas · Sebastian Bordt · Himabindu Lakkaraju
- 2023 Poster: Faster Causal Attention Over Large Sequences Through Sparse Flash Attention »
  Matteo Pagliardini · Daniele Paliotta · Martin Jaggi · François Fleuret
- 2023 Poster: Post Hoc Explanations of Language Models Can Improve Language Models »
  Satyapriya Krishna · Jiaqi Ma · Dylan Slack · Asma Ghandeharioun · Sameer Singh · Himabindu Lakkaraju
- 2023 Poster: SUPA: A Lightweight Diagnostic Simulator for Machine Learning in Particle Physics »
  Atul Kumar Sinha · Daniele Paliotta · Bálint Máté · John Raine · Tobias Golling · François Fleuret
- 2023 Poster: $\mathcal{M}^4$: A Unified XAI Benchmark for Faithfulness Evaluation of Feature Attribution Methods across Metrics, Modalities and Models »
  Xuhong Li · Mengnan Du · Jiamin Chen · Yekun Chai · Himabindu Lakkaraju · Haoyi Xiong
- 2023 Workshop: Regulatable ML: Towards Bridging the Gaps between Machine Learning Research and Regulations »
  Jiaqi Ma · Danielle Belgrave · P-R Stark · Daniele Magazzeni · Himabindu Lakkaraju
- 2023 Workshop: XAI in Action: Past, Present, and Future Applications »
  Chhavi Yadav · Michal Moshkovitz · Nave Frost · Suraj Srinivas · Bingqing Chen · Valentyn Boreiko · Himabindu Lakkaraju · J. Zico Kolter · Dotan Di Castro · Kamalika Chaudhuri
- 2022 : Contributed Talk: TalkToModel: Explaining Machine Learning Models with Interactive Natural Language Conversations »
  Dylan Slack · Satyapriya Krishna · Himabindu Lakkaraju · Sameer Singh
- 2022 : Transformers are Sample-Efficient World Models »
  Vincent Micheli · Eloi Alonso · François Fleuret
- 2022 Poster: Data-Efficient Structured Pruning via Submodular Optimization »
  Marwa El Halabi · Suraj Srinivas · Simon Lacoste-Julien
- 2022 Poster: OpenXAI: Towards a Transparent Evaluation of Model Explanations »
  Chirag Agarwal · Satyapriya Krishna · Eshika Saxena · Martin Pawelczyk · Nari Johnson · Isha Puri · Marinka Zitnik · Himabindu Lakkaraju
- 2022 Poster: Flowification: Everything is a normalizing flow »
  Bálint Máté · Samuel Klein · Tobias Golling · François Fleuret
- 2022 Poster: Which Explanation Should I Choose? A Function Approximation Perspective to Characterizing Post Hoc Explanations »
  Tessa Han · Suraj Srinivas · Himabindu Lakkaraju
- 2020 Poster: Fast Transformers with Clustered Attention »
  Apoorv Vyas · Angelos Katharopoulos · François Fleuret
- 2019 Poster: Reducing Noise in GAN Training with Variance Reduced Extragradient »
  Tatjana Chavdarova · Gauthier Gidel · François Fleuret · Simon Lacoste-Julien
- 2019 Demonstration: Real Time CFD simulations with 3D Mesh Convolutional Networks »
  Pierre Baque · Pascal Fua · François Fleuret
- 2019 Poster: Full-Gradient Representation for Neural Network Visualization »
  Suraj Srinivas · François Fleuret
- 2018 Poster: Practical Deep Stereo (PDS): Toward applications-friendly deep stereo matching »
  Stepan Tulyakov · Anton Ivanov · François Fleuret
- 2017 Poster: K-Medoids For K-Means Seeding »
  James Newling · François Fleuret
- 2017 Spotlight: K-Medoids For K-Means Seeding »
  James Newling · François Fleuret
- 2016 Poster: Nested Mini-Batch K-Means »
  James Newling · François Fleuret
- 2015 Poster: Kullback-Leibler Proximal Variational Inference »
  Mohammad Emtiyaz Khan · Pierre Baque · François Fleuret · Pascal Fua
- 2014 Demonstration: A 3D Simulator for Evaluating Reinforcement and Imitation Learning Algorithms on Complex Tasks »
  Leonidas Lefakis · François Fleuret · Cijo Jose
- 2013 Poster: Reservoir Boosting : Between Online and Offline Ensemble Learning »
  Leonidas Lefakis · François Fleuret
- 2011 Poster: Boosting with Maximum Adaptive Sampling »
  Charles Dubout · François Fleuret
- 2010 Demonstration: Platform to Share Feature Extraction Methods »
  François Fleuret
- 2010 Poster: Joint Cascade Optimization Using A Product Of Boosted Classifiers »
  Leonidas Lefakis · François Fleuret