We show that the margin distribution, normalized by a spectral complexity parameter, is strongly predictive of neural network generalization performance. Specifically, we use the margin distribution to correctly predict whether deep neural networks generalize under changes to the label distribution, such as randomization; that is, the margin distribution accurately predicts the difficulty of deep learning tasks. We further show that normalizing the margin by the network's spectral complexity is critical to obtaining this predictive power. Finally, we use the margin distribution to compare the generalization performance of multiple networks across different datasets on even terms. Our corresponding generalization bound places these results on rigorous theoretical footing.
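As a rough illustration of what a normalized margin distribution looks like in practice, the sketch below computes per-example margins for a small bias-free ReLU network and divides them by the product of the layers' spectral norms. This normalizer is a simplifying assumption standing in for the spectral complexity parameter, which the abstract does not define; the function names and the network layout are also hypothetical.

```python
import numpy as np


def spectral_norm_product(weights):
    """Product of the largest singular value of each layer's weight matrix."""
    return float(np.prod([np.linalg.norm(W, ord=2) for W in weights]))


def forward(weights, X):
    """Scores of a bias-free ReLU network; rows of X are examples."""
    H = X
    for W in weights[:-1]:
        H = np.maximum(H @ W.T, 0.0)  # hidden layers use ReLU
    return H @ weights[-1].T          # final layer is linear


def normalized_margins(weights, X, y):
    """Margin of each example divided by the product of spectral norms.

    The margin is the true-class score minus the largest other score.
    The normalizer here is a simplified stand-in (an assumption), not
    the spectral complexity used in the paper.
    """
    scores = forward(weights, X)
    idx = np.arange(X.shape[0])
    true_scores = scores[idx, y]
    others = scores.copy()
    others[idx, y] = -np.inf          # exclude the true class from the max
    margins = true_scores - others.max(axis=1)
    return margins / spectral_norm_product(weights)


if __name__ == "__main__":
    W1 = np.array([[2.0, 0.0], [0.0, 1.0]])
    W2 = np.eye(2)
    X = np.array([[1.0, 0.0], [0.0, 3.0]])
    y = np.array([0, 1])
    print(normalized_margins([W1, W2], X, y))
```

Because a bias-free ReLU network is positively homogeneous in each layer's weights, rescaling a layer changes the raw margins but leaves these normalized margins unchanged, which is the intuition for why some normalization is needed before comparing networks.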
Author Information
Peter Bartlett (UC Berkeley)

Peter Bartlett is a professor of Computer Science and Statistics at the University of California at Berkeley, Associate Director of the Simons Institute for the Theory of Computing, and Director of the Foundations of Data Science Institute. He has previously held positions at the Queensland University of Technology, the Australian National University, and the University of Queensland. His research interests include machine learning and statistical learning theory, and he is the co-author of the book Neural Network Learning: Theoretical Foundations. He has been an Institute of Mathematical Statistics Medallion Lecturer, a winner of the Malcolm McIntosh Prize for Physical Scientist of the Year, and an Australian Laureate Fellow, and he is a Fellow of the IMS, Fellow of the ACM, and Fellow of the Australian Academy of Science.
Dylan J Foster (Cornell University)
Matus Telgarsky (UIUC)
Related Events (a corresponding poster, oral, or spotlight)
- 2017 Poster: Spectrally-normalized margin bounds for neural networks
  Thu. Dec 7th 02:30 -- 06:30 AM Room Pacific Ballroom #206
More from the Same Authors
- 2021 Spotlight: Early-stopped neural networks are consistent
  Ziwei Ji · Justin Li · Matus Telgarsky
- 2023 Poster: Efficient Model-Free Exploration in Low-Rank MDPs
  Zak Mhammedi · Adam Block · Dylan J Foster · Alexander Rakhlin
- 2023 Poster: Model-Free Reinforcement Learning with the Decision-Estimation Coefficient
  Dylan J Foster · Noah Golowich · Jian Qian · Alexander Rakhlin · Ayush Sekhari
- 2023 Poster: Representational Strengths and Limitations of Transformers
  Clayton Sanford · Daniel Hsu · Matus Telgarsky
- 2022 Poster: Interaction-Grounded Learning with Action-Inclusive Feedback
  Tengyang Xie · Akanksha Saran · Dylan J Foster · Lekan Molu · Ida Momennejad · Nan Jiang · Paul Mineiro · John Langford
- 2022 Poster: Understanding the Eluder Dimension
  Gene Li · Pritish Kamath · Dylan J Foster · Nati Srebro
- 2022 Poster: On the Complexity of Adversarial Decision Making
  Dylan J Foster · Alexander Rakhlin · Ayush Sekhari · Karthik Sridharan
- 2021 Poster: Early-stopped neural networks are consistent
  Ziwei Ji · Justin Li · Matus Telgarsky
- 2021 Invited Talk: Benign Overfitting
  Peter Bartlett
- 2020 Poster: Directional convergence and alignment in deep learning
  Ziwei Ji · Matus Telgarsky
- 2020 Spotlight: Directional convergence and alignment in deep learning
  Ziwei Ji · Matus Telgarsky
- 2018 Poster: Size-Noise Tradeoffs in Generative Networks
  Bolton Bailey · Matus Telgarsky
- 2018 Spotlight: Size-Noise Tradeoffs in Generative Networks
  Bolton Bailey · Matus Telgarsky
- 2017 Poster: Near Minimax Optimal Players for the Finite-Time 3-Expert Prediction Problem
  Yasin Abbasi Yadkori · Peter Bartlett · Victor Gabillon
- 2017 Poster: Alternating minimization for dictionary learning with random initialization
  Niladri Chatterji · Peter Bartlett
- 2017 Poster: Parameter-Free Online Learning via Model Selection
  Dylan J Foster · Satyen Kale · Mehryar Mohri · Karthik Sridharan
- 2017 Poster: Acceleration and Averaging in Stochastic Descent Dynamics
  Walid Krichene · Peter Bartlett
- 2017 Spotlight: Parameter-Free Online Learning via Model Selection
  Dylan J Foster · Satyen Kale · Mehryar Mohri · Karthik Sridharan
- 2017 Spotlight: Acceleration and Averaging in Stochastic Descent Dynamics
  Walid Krichene · Peter Bartlett
- 2014 Poster: Scalable Non-linear Learning with Adaptive Polynomial Expansions
  Alekh Agarwal · Alina Beygelzimer · Daniel Hsu · John Langford · Matus J Telgarsky
- 2013 Poster: Moment-based Uniform Deviation Bounds for $k$-means and Friends
  Matus J Telgarsky · Sanjoy Dasgupta
- 2011 Poster: The Fast Convergence of Boosting
  Matus J Telgarsky