Timezone: »
Poster
Explaining Landscape Connectivity of Low-cost Solutions for Multilayer Nets
Rohith Kuditipudi · Xiang Wang · Holden Lee · Yi Zhang · Zhiyuan Li · Wei Hu · Rong Ge · Sanjeev Arora
Thu Dec 12 10:45 AM -- 12:45 PM (PST) @ East Exhibition Hall B + C #165
Mode connectivity is a surprising phenomenon in the loss landscape of deep nets. Optima---at least those discovered by gradient-based optimization---turn out to be connected by simple paths on which the loss function is almost constant. Often, these paths can be chosen to be piece-wise linear, with as few as two segments.
We give mathematical explanations for this phenomenon, assuming generic properties (such as dropout stability and noise stability) of well-trained deep nets, which have previously been identified as part of understanding the generalization properties of deep nets. Our explanation holds for realistic multilayer nets, and experiments are presented to verify the theory.
Author Information
Rohith Kuditipudi (Duke University)
Xiang Wang (Duke University)
Holden Lee (Princeton)
Yi Zhang (Princeton)
Zhiyuan Li (Princeton University)
Wei Hu (Princeton University)
Rong Ge (Duke University)
Sanjeev Arora (Princeton University)
More from the Same Authors
-
2021 : Private Confidence Sets »
Karan Chadha · John Duchi · Rohith Kuditipudi -
2022 : How Sharpness-Aware Minimization Minimizes Sharpness? »
Kaiyue Wen · Tengyu Ma · Zhiyuan Li -
2022 : How Sharpness-Aware Minimization Minimizes Sharpness? »
Kaiyue Wen · Tengyu Ma · Zhiyuan Li -
2022 : Why (and When) does Local SGD Generalize Better than SGD? »
Xinran Gu · Kaifeng Lyu · Longbo Huang · Sanjeev Arora -
2022 : Are Neurons Actually Collapsed? On the Fine-Grained Structure in Neural Representations »
Yongyi Yang · Jacob Steinhardt · Wei Hu -
2022 : Contributed Talks 3 »
Cristóbal Guzmán · Fangshuo Liao · Vishwak Srinivasan · Zhiyuan Li -
2022 Poster: New Definitions and Evaluations for Saliency Methods: Staying Intrinsic, Complete and Sound »
Arushi Gupta · Nikunj Saunshi · Dingli Yu · Kaifeng Lyu · Sanjeev Arora -
2022 Poster: Implicit Bias of Gradient Descent on Reparametrized Models: On Equivalence to Mirror Descent »
Zhiyuan Li · Tianhao Wang · Jason Lee · Sanjeev Arora -
2022 Poster: Understanding the Generalization Benefit of Normalization Layers: Sharpness Reduction »
Kaifeng Lyu · Zhiyuan Li · Sanjeev Arora -
2022 Poster: Fast Mixing of Stochastic Gradient Descent with Normalization and Weight Decay »
Zhiyuan Li · Tianhao Wang · Dingli Yu -
2022 Poster: Outlier-Robust Sparse Estimation via Non-Convex Optimization »
Yu Cheng · Ilias Diakonikolas · Rong Ge · Shivam Gupta · Daniel Kane · Mahdi Soltanolkotabi -
2022 Poster: On the SDEs and Scaling Rules for Adaptive Gradient Algorithms »
Sadhika Malladi · Kaifeng Lyu · Abhishek Panigrahi · Sanjeev Arora -
2021 : Invited talk 2 »
Sanjeev Arora -
2021 Oral: Evaluating Gradient Inversion Attacks and Defenses in Federated Learning »
Yangsibo Huang · Samyak Gupta · Zhao Song · Kai Li · Sanjeev Arora -
2021 Poster: On the Validity of Modeling SGD with Stochastic Differential Equations (SDEs) »
Zhiyuan Li · Sadhika Malladi · Sanjeev Arora -
2021 Poster: Understanding Deflation Process in Over-parametrized Tensor Decomposition »
Rong Ge · Yunwei Ren · Xiang Wang · Mo Zhou -
2021 Poster: A Regression Approach to Learning-Augmented Online Algorithms »
Keerti Anand · Rong Ge · Amit Kumar · Debmalya Panigrahi -
2021 Poster: Evaluating Gradient Inversion Attacks and Defenses in Federated Learning »
Yangsibo Huang · Samyak Gupta · Zhao Song · Kai Li · Sanjeev Arora -
2021 Poster: Gradient Descent on Two-layer Nets: Margin Maximization and Simplicity Bias »
Kaifeng Lyu · Zhiyuan Li · Runzhe Wang · Sanjeev Arora -
2021 Poster: Universal Approximation Using Well-Conditioned Normalizing Flows »
Holden Lee · Chirag Pabbaraju · Anish Prasad Sevekari · Andrej Risteski -
2020 : Keynote speech: Sanjeev Arora (PGDL) »
Sanjeev Arora · Yiding Jiang -
2020 Poster: Implicit Regularization and Convergence for Weight Normalization »
Xiaoxia Wu · Edgar Dobriban · Tongzheng Ren · Shanshan Wu · Zhiyuan Li · Suriya Gunasekar · Rachel Ward · Qiang Liu -
2020 Poster: Reconciling Modern Deep Learning with Traditional Optimization Analyses: The Intrinsic Learning Rate »
Zhiyuan Li · Kaifeng Lyu · Sanjeev Arora -
2020 Poster: Over-parameterized Adversarial Training: An Analysis Overcoming the Curse of Dimensionality »
Yi Zhang · Orestis Plevrakis · Simon Du · Xingguo Li · Zhao Song · Sanjeev Arora -
2020 Poster: The Surprising Simplicity of the Early-Time Learning Dynamics of Neural Networks »
Wei Hu · Lechao Xiao · Ben Adlam · Jeffrey Pennington -
2020 Spotlight: The Surprising Simplicity of the Early-Time Learning Dynamics of Neural Networks »
Wei Hu · Lechao Xiao · Ben Adlam · Jeffrey Pennington -
2020 Poster: Beyond Lazy Training for Over-parameterized Tensor Decomposition »
Xiang Wang · Chenwei Wu · Jason Lee · Tengyu Ma · Rong Ge -
2019 : Poster session »
Sebastian Farquhar · Erik Daxberger · Andreas Look · Matt Benatan · Ruiyi Zhang · Marton Havasi · Fredrik Gustafsson · James A Brofos · Nabeel Seedat · Micha Livne · Ivan Ustyuzhaninov · Adam Cobb · Felix D McGregor · Patrick McClure · Tim R. Davidson · Gaurush Hiranandani · Sanjeev Arora · Masha Itkina · Didrik Nielsen · William Harvey · Matias Valdenegro-Toro · Stefano Peluchetti · Riccardo Moriconi · Tianyu Cui · Vaclav Smidl · Taylan Cemgil · Jack Fitzsimons · He Zhao · · mariana vargas vieyra · Apratim Bhattacharyya · Rahul Sharma · Geoffroy Dubourg-Felonneau · Jonathan Warrell · Slava Voloshynovskiy · Mihaela Rosca · Jiaming Song · Andrew Ross · Homa Fashandi · Ruiqi Gao · Hooshmand Shokri Razaghi · Joshua Chang · Zhenzhong Xiao · Vanessa Boehm · Giorgio Giannone · Ranganath Krishnan · Joe Davison · Arsenii Ashukha · Jeremiah Liu · Sicong (Sheldon) Huang · Evgenii Nikishin · Sunho Park · Nilesh Ahuja · Mahesh Subedar · · Artyom Gadetsky · Jhosimar Arias Figueroa · Tim G. J. Rudner · Waseem Aslam · Adrián Csiszárik · John Moberg · Ali Hebbal · Kathrin Grosse · Pekka Marttinen · Bang An · Hlynur Jónsson · Samuel Kessler · Abhishek Kumar · Mikhail Figurnov · Omesh Tickoo · Steindor Saemundsson · Ari Heljakka · Dániel Varga · Niklas Heim · Simone Rossi · Max Laves · Waseem Gharbieh · Nicholas Roberts · Luis Armando Pérez Rey · Matthew Willetts · Prithvijit Chakrabarty · Sumedh Ghaisas · Carl Shneider · Wray Buntine · Kamil Adamczewski · Xavier Gitiaux · Suwen Lin · Hao Fu · Gunnar Rätsch · Aidan Gomez · Erik Bodin · Dinh Phung · Lennart Svensson · Juliano Tusi Amaral Laganá Pinto · Milad Alizadeh · Jianzhun Du · Kevin Murphy · Beatrix Benkő · Shashaank Vattikuti · Jonathan Gordon · Christopher Kanan · Sontje Ihler · Darin Graham · Michael Teng · Louis Kirsch · Tomas Pevny · Taras Holotyak -
2019 Poster: The Step Decay Schedule: A Near Optimal, Geometrically Decaying Learning Rate Procedure For Least Squares »
Rong Ge · Sham Kakade · Rahul Kidambi · Praneeth Netrapalli -
2019 Poster: Implicit Regularization in Deep Matrix Factorization »
Sanjeev Arora · Nadav Cohen · Wei Hu · Yuping Luo -
2019 Spotlight: Implicit Regularization in Deep Matrix Factorization »
Sanjeev Arora · Nadav Cohen · Wei Hu · Yuping Luo -
2019 Poster: On Exact Computation with an Infinitely Wide Neural Net »
Sanjeev Arora · Simon Du · Wei Hu · Zhiyuan Li · Russ Salakhutdinov · Ruosong Wang -
2019 Spotlight: On Exact Computation with an Infinitely Wide Neural Net »
Sanjeev Arora · Simon Du · Wei Hu · Zhiyuan Li · Russ Salakhutdinov · Ruosong Wang -
2018 : Plenary Talk 1 »
Sanjeev Arora -
2018 Poster: Online Improper Learning with an Approximation Oracle »
Elad Hazan · Wei Hu · Yuanzhi Li · Zhiyuan Li -
2018 Poster: Algorithmic Regularization in Learning Deep Homogeneous Models: Layers are Automatically Balanced »
Simon Du · Wei Hu · Jason Lee -
2018 Poster: On the Local Minima of the Empirical Risk »
Chi Jin · Lydia T. Liu · Rong Ge · Michael Jordan -
2018 Spotlight: On the Local Minima of the Empirical Risk »
Chi Jin · Lydia T. Liu · Rong Ge · Michael Jordan -
2018 Poster: Beyond Log-concavity: Provable Guarantees for Sampling Multi-modal Distributions using Simulated Tempering Langevin Monte Carlo »
Holden Lee · Andrej Risteski · Rong Ge -
2018 Poster: Spectral Filtering for General Linear Dynamical Systems »
Elad Hazan · Holden Lee · Karan Singh · Cyril Zhang · Yi Zhang -
2018 Oral: Spectral Filtering for General Linear Dynamical Systems »
Elad Hazan · Holden Lee · Karan Singh · Cyril Zhang · Yi Zhang -
2017 Workshop: Deep Learning: Bridging Theory and Practice »
Sanjeev Arora · Maithra Raghu · Russ Salakhutdinov · Ludwig Schmidt · Oriol Vinyals -
2017 Poster: On the Optimization Landscape of Tensor Decompositions »
Rong Ge · Tengyu Ma -
2017 Poster: Linear Convergence of a Frank-Wolfe Type Algorithm over Trace-Norm Balls »
Zeyuan Allen-Zhu · Elad Hazan · Wei Hu · Yuanzhi Li -
2017 Spotlight: Linear Convergence of a Frank-Wolfe Type Algorithm over Trace-Norm Balls »
Zeyuan Allen-Zhu · Elad Hazan · Wei Hu · Yuanzhi Li -
2017 Oral: On the Optimization Landscape of Tensor Decompositions »
Rong Ge · Tengyu Ma -
2016 Poster: Solving Marginal MAP Problems with NP Oracles and Parity Constraints »
Yexiang Xue · zhiyuan li · Stefano Ermon · Carla Gomes · Bart Selman -
2016 Poster: Learning in Games: Robustness of Fast Convergence »
Dylan Foster · zhiyuan li · Thodoris Lykouris · Karthik Sridharan · Eva Tardos -
2016 Poster: Combinatorial Multi-Armed Bandit with General Reward Functions »
Wei Chen · Wei Hu · Fu Li · Jian Li · Yu Liu · Pinyan Lu -
2012 Poster: Provable ICA with Unknown Gaussian Noise, with Implications for Gaussian Mixtures and Autoencoders »
Sanjeev Arora · Rong Ge · Ankur Moitra · Sushant Sachdeva