Timezone: »
Modern machine learning systems such as deep neural networks are often highly over-parameterized so that they can fit the noisy training data exactly, yet they can still achieve small test errors in practice. In this paper, we study this "benign overfitting" phenomenon of the maximum margin classifier for linear classification problems. Specifically, we consider data generated from sub-Gaussian mixtures, and provide a tight risk bound for the maximum margin linear classifier in the over-parameterized setting. Our results precisely characterize the condition under which benign overfitting can occur in linear classification problems, and improve on previous work. They also have direct implications for over-parameterized logistic regression.
Author Information
Yuan Cao (UCLA)
Quanquan Gu (UCLA)
Mikhail Belkin (Ohio State University)
More from the Same Authors
-
2021 : Understanding the Generalization of Adam in Learning Neural Networks with Proper Regularization »
Difan Zou · Yuan Cao · Yuanzhi Li · Quanquan Gu -
2021 : Understanding the Generalization of Adam in Learning Neural Networks with Proper Regularization »
Difan Zou · Yuan Cao · Yuanzhi Li · Quanquan Gu -
2021 : Learning Two-Player Mixture Markov Games: Kernel Function Approximation and Correlated Equilibrium »
Chris Junchi Li · Dongruo Zhou · Quanquan Gu · Michael Jordan -
2022 Panel: Panel 6B-2: Depth is More… & Benign Overfitting in… »
Yuan Cao · Shen-Huan Lyu -
2022 Poster: Benign Overfitting in Two-layer Convolutional Neural Networks »
Yuan Cao · Zixiang Chen · Misha Belkin · Quanquan Gu -
2021 Poster: The Benefits of Implicit Regularization from SGD in Least Squares Problems »
Difan Zou · Jingfeng Wu · Vladimir Braverman · Quanquan Gu · Dean Foster · Sham Kakade -
2021 Poster: Uniform-PAC Bounds for Reinforcement Learning with Linear Function Approximation »
Jiafan He · Dongruo Zhou · Quanquan Gu -
2021 Poster: Proxy Convexity: A Unified Framework for the Analysis of Neural Networks Trained by Gradient Descent »
Spencer Frei · Quanquan Gu -
2021 Poster: Nearly Minimax Optimal Reinforcement Learning for Discounted MDPs »
Jiafan He · Dongruo Zhou · Quanquan Gu -
2021 Poster: Reward-Free Model-Based Reinforcement Learning with Linear Function Approximation »
Weitong ZHANG · Dongruo Zhou · Quanquan Gu -
2021 Poster: Variance-Aware Off-Policy Evaluation with Linear Function Approximation »
Yifei Min · Tianhao Wang · Dongruo Zhou · Quanquan Gu -
2021 Poster: Multiple Descent: Design Your Own Generalization Curve »
Lin Chen · Yifei Min · Mikhail Belkin · Amin Karbasi -
2021 Poster: Iterative Teacher-Aware Learning »
Luyao Yuan · Dongruo Zhou · Junhong Shen · Jingdong Gao · Jeffrey L Chen · Quanquan Gu · Ying Nian Wu · Song-Chun Zhu -
2021 Poster: Provably Efficient Reinforcement Learning with Linear Function Approximation under Adaptivity Constraints »
Tianhao Wang · Dongruo Zhou · Quanquan Gu -
2021 Poster: Exploring Architectural Ingredients of Adversarially Robust Deep Neural Networks »
Hanxun Huang · Yisen Wang · Sarah Erfani · Quanquan Gu · James Bailey · Xingjun Ma -
2021 Poster: Do Wider Neural Networks Really Help Adversarial Robustness? »
Boxi Wu · Jinghui Chen · Deng Cai · Xiaofei He · Quanquan Gu -
2021 Poster: Pure Exploration in Kernel and Neural Bandits »
Yinglun Zhu · Dongruo Zhou · Ruoxi Jiang · Quanquan Gu · Rebecca Willett · Robert Nowak -
2020 Poster: A Generalized Neural Tangent Kernel Analysis for Two-layer Neural Networks »
Zixiang Chen · Yuan Cao · Quanquan Gu · Tong Zhang -
2020 Poster: Agnostic Learning of a Single Neuron with Gradient Descent »
Spencer Frei · Yuan Cao · Quanquan Gu -
2019 Poster: Algorithm-Dependent Generalization Bounds for Overparameterized Deep Residual Networks »
Spencer Frei · Yuan Cao · Quanquan Gu -
2019 Poster: Tight Sample Complexity of Learning One-hidden-layer Convolutional Neural Networks »
Yuan Cao · Quanquan Gu -
2019 Poster: Generalization Bounds of Stochastic Gradient Descent for Wide and Deep Neural Networks »
Yuan Cao · Quanquan Gu -
2019 Spotlight: Generalization Bounds of Stochastic Gradient Descent for Wide and Deep Neural Networks »
Yuan Cao · Quanquan Gu -
2018 Poster: Third-order Smoothness Helps: Faster Stochastic Optimization Algorithms for Finding Local Minima »
Yaodong Yu · Pan Xu · Quanquan Gu -
2018 Poster: Overfitting or perfect fitting? Risk bounds for classification and regression rules that interpolate »
Mikhail Belkin · Daniel Hsu · Partha P Mitra -
2018 Poster: Global Convergence of Langevin Dynamics Based Algorithms for Nonconvex Optimization »
Pan Xu · Jinghui Chen · Difan Zou · Quanquan Gu -
2018 Spotlight: Global Convergence of Langevin Dynamics Based Algorithms for Nonconvex Optimization »
Pan Xu · Jinghui Chen · Difan Zou · Quanquan Gu -
2018 Poster: Stochastic Nested Variance Reduced Gradient Descent for Nonconvex Optimization »
Dongruo Zhou · Pan Xu · Quanquan Gu -
2018 Spotlight: Stochastic Nested Variance Reduced Gradient Descent for Nonconvex Optimization »
Dongruo Zhou · Pan Xu · Quanquan Gu -
2018 Poster: Distributed Learning without Distress: Privacy-Preserving Empirical Risk Minimization »
Bargav Jayaraman · Lingxiao Wang · David Evans · Quanquan Gu -
2017 Poster: Speeding Up Latent Variable Gaussian Graphical Model Estimation via Nonconvex Optimization »
Pan Xu · Jian Ma · Quanquan Gu -
2017 Poster: Diving into the shallows: a computational perspective on large-scale shallow learning »
SIYUAN MA · Mikhail Belkin -
2017 Spotlight: Diving into the shallows: a computational perspective on large-scale shallow learning »
SIYUAN MA · Mikhail Belkin -
2016 Poster: Semiparametric Differential Graph Models »
Pan Xu · Quanquan Gu -
2016 Poster: Graphons, mergeons, and so on! »
Justin Eldridge · Mikhail Belkin · Yusu Wang -
2016 Oral: Graphons, mergeons, and so on! »
Justin Eldridge · Mikhail Belkin · Yusu Wang -
2016 Poster: Clustering with Bregman Divergences: an Asymptotic Analysis »
Chaoyue Liu · Mikhail Belkin -
2015 Poster: A Pseudo-Euclidean Iteration for Optimal Recovery in Noisy ICA »
James R Voss · Mikhail Belkin · Luis Rademacher -
2015 Poster: High Dimensional EM Algorithm: Statistical Optimization and Asymptotic Normality »
Zhaoran Wang · Quanquan Gu · Yang Ning · Han Liu -
2014 Poster: Sparse PCA with Oracle Property »
Quanquan Gu · Zhaoran Wang · Han Liu -
2014 Poster: Robust Tensor Decomposition with Gross Corruption »
Quanquan Gu · Huan Gui · Jiawei Han -
2014 Poster: Learning with Fredholm Kernels »
Qichao Que · Mikhail Belkin · Yusu Wang -
2013 Workshop: Modern Nonparametric Methods in Machine Learning »
Arthur Gretton · Mladen Kolar · Samory Kpotufe · John Lafferty · Han Liu · Bernhard Schölkopf · Alexander Smola · Rob Nowak · Mikhail Belkin · Lorenzo Rosasco · peter bickel · Yue Zhao -
2013 Poster: Inverse Density as an Inverse Problem: the Fredholm Equation Approach »
Qichao Que · Mikhail Belkin -
2013 Poster: Fast Algorithms for Gaussian Noise Invariant Independent Component Analysis »
James R Voss · Luis Rademacher · Mikhail Belkin -
2013 Spotlight: Inverse Density as an Inverse Problem: the Fredholm Equation Approach »
Qichao Que · Mikhail Belkin -
2012 Poster: Selective Labeling via Error Bound Minimization »
Quanquan Gu · Tong Zhang · Chris Ding · Jiawei Han -
2011 Poster: Data Skeletonization via Reeb Graphs »
Xiaoyin Ge · Issam I Safa · Mikhail Belkin · Yusu Wang -
2009 Poster: Semi-supervised Learning using Sparse Eigenfunction Bases »
Kaushik Sinha · Mikhail Belkin -
2007 Spotlight: The Value of Labeled and Unlabeled Examples when the Model is Imperfect »
Kaushik Sinha · Mikhail Belkin -
2007 Poster: The Value of Labeled and Unlabeled Examples when the Model is Imperfect »
Kaushik Sinha · Mikhail Belkin -
2006 Poster: On the Relation Between Low Density Separation, Spectral Clustering and Graph Cuts »
Hariharan Narayanan · Mikhail Belkin · Partha Niyogi -
2006 Poster: Convergence of Laplacian Eigenmaps »
Mikhail Belkin · Partha Niyogi