Timezone: »
We study norm-based uniform convergence bounds for neural networks, aiming at a tight understanding of how these are affected by the architecture and type of norm constraint, for the simple class of scalar-valued one-hidden-layer networks, and inputs bounded in Euclidean norm. We begin by proving that in general, controlling the spectral norm of the hidden layer weight matrix is insufficient to get uniform convergence guarantees (independent of the network width), while a stronger Frobenius norm control is sufficient, extending and improving on previous work. Motivated by the proof constructions, we identify and analyze two important settings where (perhaps surprisingly) a mere spectral norm control turns out to be sufficient: First, when the network's activation functions are sufficiently smooth (with the result extending to deeper networks); and second, for certain types of convolutional networks. In the latter setting, we study how the sample complexity is additionally affected by parameters such as the amount of overlap between patches and the overall number of patches.
Author Information
Gal Vardi (TTI-Chicago)
Ohad Shamir (Weizmann Institute of Science)
Nati Srebro (TTI-Chicago)
More from the Same Authors
-
2021 Spotlight: Random Shuffling Beats SGD Only After Many Epochs on Ill-Conditioned Problems »
Itay Safran · Ohad Shamir -
2021 Spotlight: On the Power of Differentiable Learning versus PAC and SQ Learning »
Emmanuel Abbe · Pritish Kamath · Eran Malach · Colin Sandon · Nathan Srebro -
2021 : Exponential Family Model-Based Reinforcement Learning via Score Matching »
Gene Li · Junbo Li · Nathan Srebro · Zhaoran Wang · Zhuoran Yang -
2022 : Distributed Online and Bandit Convex Optimization »
Kumar Kshitij Patel · Aadirupa Saha · Nati Srebro · Lingxiao Wang -
2022 : On Convexity and Linear Mode Connectivity in Neural Networks »
David Yunis · Kumar Kshitij Patel · Pedro Savarese · Gal Vardi · Jonathan Frankle · Matthew Walter · Karen Livescu · Michael Maire -
2022 : On the Complexity of Finding Small Subgradients in Nonsmooth Optimization »
Guy Kornowski · Ohad Shamir -
2022 : On the Complexity of Finding Small Subgradients in Nonsmooth Optimization »
Guy Kornowski · Ohad Shamir -
2022 Panel: Panel 1C-2: Reconstructing Training Data… & On Optimal Learning… »
Gal Vardi · Idan Mehalel -
2022 Poster: A Non-Asymptotic Moreau Envelope Theory for High-Dimensional Generalized Linear Models »
Lijia Zhou · Frederic Koehler · Pragya Sur · Danica J. Sutherland · Nati Srebro -
2022 Poster: On Margin Maximization in Linear and ReLU Networks »
Gal Vardi · Ohad Shamir · Nati Srebro -
2022 Poster: Towards Optimal Communication Complexity in Distributed Non-Convex Optimization »
Kumar Kshitij Patel · Lingxiao Wang · Blake Woodworth · Brian Bullins · Nati Srebro -
2022 Poster: Thinking Outside the Ball: Optimal Learning with Gradient Descent for Generalized Linear Stochastic Convex Optimization »
Idan Amir · Roi Livni · Nati Srebro -
2022 Poster: Pessimism for Offline Linear Contextual Bandits using $\ell_p$ Confidence Sets »
Gene Li · Cong Ma · Nati Srebro -
2022 Poster: Adversarially Robust Learning: A Generic Minimax Optimal Learner and Characterization »
Omar Montasser · Steve Hanneke · Nati Srebro -
2022 Poster: On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias »
Itay Safran · Gal Vardi · Jason Lee -
2022 Poster: Reconstructing Training Data From Trained Neural Networks »
Niv Haim · Gal Vardi · Gilad Yehudai · Ohad Shamir · Michal Irani -
2022 Poster: Understanding the Eluder Dimension »
Gene Li · Pritish Kamath · Dylan J Foster · Nati Srebro -
2022 Poster: Gradient Methods Provably Converge to Non-Robust Networks »
Gal Vardi · Gilad Yehudai · Ohad Shamir -
2022 Poster: Exponential Family Model-Based Reinforcement Learning via Score Matching »
Gene Li · Junbo Li · Anmol Kabra · Nati Srebro · Zhaoran Wang · Zhuoran Yang -
2021 Poster: On the Power of Differentiable Learning versus PAC and SQ Learning »
Emmanuel Abbe · Pritish Kamath · Eran Malach · Colin Sandon · Nathan Srebro -
2021 Oral: Uniform Convergence of Interpolators: Gaussian Width, Norm Bounds and Benign Overfitting »
Frederic Koehler · Lijia Zhou · Danica J. Sutherland · Nathan Srebro -
2021 Poster: Uniform Convergence of Interpolators: Gaussian Width, Norm Bounds and Benign Overfitting »
Frederic Koehler · Lijia Zhou · Danica J. Sutherland · Nathan Srebro -
2021 Poster: Representation Costs of Linear Neural Networks: Analysis and Design »
Zhen Dai · Mina Karzand · Nathan Srebro -
2021 Poster: Learning a Single Neuron with Bias Using Gradient Descent »
Gal Vardi · Gilad Yehudai · Ohad Shamir -
2021 Poster: Oracle Complexity in Nonsmooth Nonconvex Optimization »
Guy Kornowski · Ohad Shamir -
2021 Poster: An Even More Optimal Stochastic Optimization Algorithm: Minibatching and Interpolation Learning »
Blake Woodworth · Nathan Srebro -
2021 Poster: A Stochastic Newton Algorithm for Distributed Convex Optimization »
Brian Bullins · Kshitij Patel · Ohad Shamir · Nathan Srebro · Blake Woodworth -
2021 Oral: Oracle Complexity in Nonsmooth Nonconvex Optimization »
Guy Kornowski · Ohad Shamir -
2021 Poster: Random Shuffling Beats SGD Only After Many Epochs on Ill-Conditioned Problems »
Itay Safran · Ohad Shamir -
2020 : Poster Session 1 (gather.town) »
Laurent Condat · Tiffany Vlaar · Ohad Shamir · Mohammadi Zaki · Zhize Li · Guan-Horng Liu · Samuel Horváth · Mher Safaryan · Yoni Choukroun · Kumar Shridhar · Nabil Kahale · Jikai Jin · Pratik Kumar Jawanpuria · Gaurav Kumar Yadav · Kazuki Koyama · Junyoung Kim · Xiao Li · Saugata Purkayastha · Adil Salim · Dighanchal Banerjee · Peter Richtarik · Lakshman Mahto · Tian Ye · Bamdev Mishra · Huikang Liu · Jiajie Zhu -
2020 : Contributed talks in Session 1 (Zoom) »
Sebastian Stich · Laurent Condat · Zhize Li · Ohad Shamir · Tiffany Vlaar · Mohammadi Zaki -
2020 : Contributed Video: Can We Find Near-Approximately-Stationary Points of Nonsmooth Nonconvex Functions?, Ohad Shamir »
Ohad Shamir -
2020 Poster: On Uniform Convergence and Low-Norm Interpolation Learning »
Lijia Zhou · Danica J. Sutherland · Nati Srebro -
2020 Poster: Neural Networks with Small Weights and Depth-Separation Barriers »
Gal Vardi · Ohad Shamir -
2020 Poster: Reducing Adversarially Robust Learning to Non-Robust PAC Learning »
Omar Montasser · Steve Hanneke · Nati Srebro -
2020 Spotlight: On Uniform Convergence and Low-Norm Interpolation Learning »
Lijia Zhou · Danica J. Sutherland · Nati Srebro -
2020 Poster: Implicit Bias in Deep Linear Classification: Initialization Scale vs Training Accuracy »
Edward Moroshko · Blake Woodworth · Suriya Gunasekar · Jason Lee · Nati Srebro · Daniel Soudry -
2020 Poster: Minibatch vs Local SGD for Heterogeneous Distributed Learning »
Blake Woodworth · Kumar Kshitij Patel · Nati Srebro -
2020 Spotlight: Implicit Bias in Deep Linear Classification: Initialization Scale vs Training Accuracy »
Edward Moroshko · Blake Woodworth · Suriya Gunasekar · Jason Lee · Nati Srebro · Daniel Soudry -
2020 Poster: Hardness of Learning Neural Networks with Natural Weights »
Amit Daniely · Gal Vardi -
2019 : Poster Session »
Gergely Flamich · Shashanka Ubaru · Charles Zheng · Josip Djolonga · Kristoffer Wickstrøm · Diego Granziol · Konstantinos Pitas · Jun Li · Robert Williamson · Sangwoong Yoon · Kwot Sin Lee · Julian Zilly · Linda Petrini · Ian Fischer · Zhe Dong · Alexander Alemi · Bao-Ngoc Nguyen · Rob Brekelmans · Tailin Wu · Aditya Mahajan · Alexander Li · Kirankumar Shiragur · Yair Carmon · Linara Adilova · SHIYU LIU · Bang An · Sanjeeb Dash · Oktay Gunluk · Arya Mazumdar · Mehul Motani · Julia Rosenzweig · Michael Kamp · Marton Havasi · Leighton P Barnes · Zhengqing Zhou · Yi Hao · Dylan Foster · Yuval Benjamini · Nati Srebro · Michael Tschannen · Paul Rubenstein · Sylvain Gelly · John Duchi · Aaron Sidford · Robin Ru · Stefan Zohren · Murtaza Dalal · Michael A Osborne · Stephen J Roberts · Moses Charikar · Jayakumar Subramanian · Xiaodi Fan · Max Schwarzer · Nicholas Roberts · Simon Lacoste-Julien · Vinay Prabhu · Aram Galstyan · Greg Ver Steeg · Lalitha Sankar · Yung-Kyun Noh · Gautam Dasarathy · Frank Park · Ngai-Man (Man) Cheung · Ngoc-Trung Tran · Linxiao Yang · Ben Poole · Andrea Censi · Tristan Sylvain · R Devon Hjelm · Bangjie Liu · Jose Gallego-Posada · Tyler Sypherd · Kai Yang · Jan Nikolas Morshuis -
2019 Poster: On the Power and Limitations of Random Features for Understanding Neural Networks »
Gilad Yehudai · Ohad Shamir -
2018 Poster: Graph Oracle Models, Lower Bounds, and Gaps for Parallel Stochastic Optimization »
Blake Woodworth · Jialei Wang · Adam Smith · Brendan McMahan · Nati Srebro -
2018 Spotlight: Graph Oracle Models, Lower Bounds, and Gaps for Parallel Stochastic Optimization »
Blake Woodworth · Jialei Wang · Adam Smith · Brendan McMahan · Nati Srebro -
2018 Poster: Implicit Bias of Gradient Descent on Linear Convolutional Networks »
Suriya Gunasekar · Jason Lee · Daniel Soudry · Nati Srebro -
2018 Poster: Are ResNets Provably Better than Linear Predictors? »
Ohad Shamir -
2018 Poster: The Everlasting Database: Statistical Validity at a Fair Price »
Blake Woodworth · Vitaly Feldman · Saharon Rosset · Nati Srebro -
2018 Poster: On preserving non-discrimination when combining expert advice »
Avrim Blum · Suriya Gunasekar · Thodoris Lykouris · Nati Srebro -
2018 Poster: Global Non-convex Optimization with Discretized Diffusions »
Murat Erdogdu · Lester Mackey · Ohad Shamir -
2017 Poster: The Marginal Value of Adaptive Gradient Methods in Machine Learning »
Ashia C Wilson · Becca Roelofs · Mitchell Stern · Nati Srebro · Benjamin Recht -
2017 Oral: The Marginal Value of Adaptive Gradient Methods in Machine Learning »
Ashia C Wilson · Becca Roelofs · Mitchell Stern · Nati Srebro · Benjamin Recht -
2017 Poster: Stochastic Approximation for Canonical Correlation Analysis »
Raman Arora · Teodor Vanislavov Marinov · Poorya Mianjy · Nati Srebro -
2017 Poster: Exploring Generalization in Deep Learning »
Behnam Neyshabur · Srinadh Bhojanapalli · David Mcallester · Nati Srebro -
2017 Poster: Implicit Regularization in Matrix Factorization »
Suriya Gunasekar · Blake Woodworth · Srinadh Bhojanapalli · Behnam Neyshabur · Nati Srebro -
2017 Spotlight: Implicit Regularization in Matrix Factorization »
Suriya Gunasekar · Blake Woodworth · Srinadh Bhojanapalli · Behnam Neyshabur · Nati Srebro -
2016 Poster: Tight Complexity Bounds for Optimizing Composite Objectives »
Blake Woodworth · Nati Srebro -
2016 Poster: Dimension-Free Iteration Complexity of Finite Sum Optimization Problems »
Yossi Arjevani · Ohad Shamir -
2016 Poster: Efficient Globally Convergent Stochastic Optimization for Canonical Correlation Analysis »
Weiran Wang · Jialei Wang · Dan Garber · Dan Garber · Nati Srebro -
2016 Poster: Without-Replacement Sampling for Stochastic Gradient Methods »
Ohad Shamir -
2016 Oral: Without-Replacement Sampling for Stochastic Gradient Methods »
Ohad Shamir -
2016 Poster: Global Optimality of Local Search for Low Rank Matrix Recovery »
Srinadh Bhojanapalli · Behnam Neyshabur · Nati Srebro -
2016 Poster: Path-Normalized Optimization of Recurrent Neural Networks with ReLU Activations »
Behnam Neyshabur · Yuhuai Wu · Russ Salakhutdinov · Nati Srebro -
2016 Poster: Equality of Opportunity in Supervised Learning »
Moritz Hardt · Eric Price · Eric Price · Nati Srebro -
2016 Poster: Normalized Spectral Map Synchronization »
Yanyao Shen · Qixing Huang · Nati Srebro · Sujay Sanghavi -
2015 Poster: Path-SGD: Path-Normalized Optimization in Deep Neural Networks »
Behnam Neyshabur · Russ Salakhutdinov · Nati Srebro -
2015 Poster: Communication Complexity of Distributed Convex Learning and Optimization »
Yossi Arjevani · Ohad Shamir -
2014 Poster: Fundamental Limits of Online and Distributed Algorithms for Statistical Learning and Estimation »
Ohad Shamir -
2014 Poster: Stochastic Gradient Descent, Weighted Sampling, and the Randomized Kaczmarz algorithm »
Deanna Needell · Rachel Ward · Nati Srebro -
2014 Poster: On the Computational Efficiency of Training Neural Networks »
Roi Livni · Shai Shalev-Shwartz · Ohad Shamir -
2013 Workshop: Learning Faster From Easy Data »
Peter Grünwald · Wouter M Koolen · Sasha Rakhlin · Nati Srebro · Alekh Agarwal · Karthik Sridharan · Tim van Erven · Sebastien Bubeck -
2013 Workshop: Large Scale Matrix Analysis and Inference »
Reza Zadeh · Gunnar Carlsson · Michael Mahoney · Manfred K. Warmuth · Wouter M Koolen · Nati Srebro · Satyen Kale · Malik Magdon-Ismail · Ashish Goel · Matei A Zaharia · David Woodruff · Ioannis Koutis · Benjamin Recht -
2013 Poster: Online Learning with Switching Costs and Other Adaptive Adversaries »
Nicolò Cesa-Bianchi · Ofer Dekel · Ohad Shamir -
2013 Poster: Stochastic Optimization of PCA with Capped MSG »
Raman Arora · Andrew Cotter · Nati Srebro -
2013 Poster: Auditing: Active Learning with Outcome-Dependent Query Costs »
Sivan Sabato · Anand D Sarwate · Nati Srebro -
2013 Poster: The Power of Asymmetry in Binary Hashing »
Behnam Neyshabur · Nati Srebro · Russ Salakhutdinov · Yury Makarychev · Payman Yadollahpour -
2012 Poster: Sparse Prediction with the $k$-Support Norm »
Andreas Argyriou · Rina Foygel · Nati Srebro -
2012 Spotlight: Sparse Prediction with the $k$-Support Norm »
Andreas Argyriou · Rina Foygel · Nati Srebro -
2012 Poster: Matrix reconstruction with the local max norm »
Rina Foygel · Nati Srebro · Russ Salakhutdinov -
2011 Poster: Beating SGD: Learning SVMs in Sublinear Time »
Elad Hazan · Tomer Koren · Nati Srebro -
2011 Poster: Better Mini-Batch Algorithms via Accelerated Gradient Methods »
Andrew Cotter · Ohad Shamir · Nati Srebro · Karthik Sridharan -
2011 Poster: On the Universality of Online Mirror Descent »
Nati Srebro · Karthik Sridharan · Ambuj Tewari -
2011 Poster: Learning with the weighted trace-norm under arbitrary sampling distributions »
Rina Foygel · Russ Salakhutdinov · Ohad Shamir · Nati Srebro -
2010 Session: Spotlights Session 11 »
Nati Srebro -
2010 Session: Oral Session 13 »
Nati Srebro -
2010 Poster: Tight Sample Complexity of Large-Margin Learning »
Sivan Sabato · Nati Srebro · Naftali Tishby -
2010 Poster: Collaborative Filtering in a Non-Uniform World: Learning with the Weighted Trace Norm »
Russ Salakhutdinov · Nati Srebro -
2010 Poster: Practical Large-Scale Optimization for Max-norm Regularization »
Jason D Lee · Benjamin Recht · Russ Salakhutdinov · Nati Srebro · Joel A Tropp -
2010 Poster: Smoothness, Low Noise and Fast Rates »
Nati Srebro · Karthik Sridharan · Ambuj Tewari -
2009 Workshop: Understanding Multiple Kernel Learning Methods »
Brian McFee · Gert Lanckriet · Francis Bach · Nati Srebro -
2009 Poster: Statistical Analysis of Semi-Supervised Learning: The Limit of Infinite Unlabelled Data »
Boaz Nadler · Nati Srebro · Xueyuan Zhou -
2009 Spotlight: Statistical Analysis of Semi-Supervised Learning: The Limit of Infinite Unlabelled Data »
Boaz Nadler · Nati Srebro · Xueyuan Zhou -
2008 Poster: Fast Rates for Regularized Objectives »
Karthik Sridharan · Shai Shalev-Shwartz · Nati Srebro