Despite significant theoretical progress, the generalization behavior of overparameterized neural networks remains largely unexplained. In this paper, we study the generalization behavior of shallow neural networks (SNNs) by leveraging the concept of algorithmic stability. We consider gradient descent (GD) and stochastic gradient descent (SGD) for training SNNs, and for both we develop consistent excess risk bounds by balancing optimization and generalization via early stopping. Compared to existing analysis of GD, our new analysis requires a relaxed overparameterization assumption and also applies to SGD. The key to the improvement is a better estimate of the smallest eigenvalues of the Hessian matrices of the empirical risks and the loss function along the trajectories of GD and SGD, obtained through a refined estimate of their iterates.
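As a rough illustration of the setting described in the abstract, the following sketch trains a shallow network with full-batch GD and stops early. It is not the paper's algorithm or analysis. The assumptions here are all illustrative: a tanh activation, fixed outer weights `a`, squared loss, the width `m`, the step size `eta`, and a held-out validation set used to pick the stopping iteration. The paper balances optimization and generalization through the number of iterations; validation-based stopping is swapped in as a common practical proxy.

```python
import numpy as np

def init_snn(d, m, rng):
    # Hidden weights W are trained; outer weights a stay fixed (assumption).
    W = rng.normal(size=(m, d)) / np.sqrt(d)
    a = rng.choice([-1.0, 1.0], size=m) / np.sqrt(m)
    return W, a

def predict(W, a, X):
    # One hidden layer with a smooth activation (tanh is an assumption).
    return np.tanh(X @ W.T) @ a

def risk(W, a, X, y):
    # Empirical risk under the squared loss.
    return 0.5 * np.mean((predict(W, a, X) - y) ** 2)

def grad(W, a, X, y):
    # Gradient of the empirical risk with respect to the hidden weights W.
    H = np.tanh(X @ W.T)                    # (n, m) hidden activations
    r = H @ a - y                           # (n,) residuals
    G = r[:, None] * (1.0 - H ** 2) * a[None, :]
    return G.T @ X / len(y)                 # (m, d)

def train_gd_early_stopping(X_tr, y_tr, X_val, y_val,
                            m=64, eta=0.5, T=500, seed=0):
    # Full-batch GD; keep the iterate with the lowest validation risk.
    rng = np.random.default_rng(seed)
    W, a = init_snn(X_tr.shape[1], m, rng)
    best_W, best_val, best_t = W.copy(), risk(W, a, X_val, y_val), 0
    for t in range(1, T + 1):
        W = W - eta * grad(W, a, X_tr, y_tr)
        val = risk(W, a, X_val, y_val)
        if val < best_val:
            best_W, best_val, best_t = W.copy(), val, t
    return best_W, a, best_t, best_val
```

Replacing the full-batch gradient with the gradient on a single randomly drawn example per step would give the corresponding SGD variant.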
Author Information
Yunwen Lei (University of Birmingham)
Rong Jin
Yiming Ying (State University of New York at Albany)
More from the Same Authors
- 2022: An Empirical Study on Distribution Shift Robustness From the Perspective of Pre-Training and Data Augmentation
  Ziquan Liu · Yi Xu · Yuanhong Xu · Qi Qian · Hao Li · Rong Jin · Xiangyang Ji · Antoni Chan
- 2022: GLINKX: A Unified Framework for Large-scale Homophilous and Heterophilous Graphs
  Marios Papachristou · Rishab Goel · Frank Portman · Matthew Miller · Rong Jin
- 2023 Poster: Three-Way Trade-Off in Multi-Objective Learning: Optimization, Generalization and Conflict-Avoidance
  Lisha Chen · Heshan Fernando · Yiming Ying · Tianyi Chen
- 2023 Poster: OneNet: Enhancing Time Series Forecasting Models under Concept Drift by Online Ensembling
  yifan zhang · Qingsong Wen · xue wang · Weiqi Chen · Liang Sun · Zhang Zhang · Liang Wang · Rong Jin · Tieniu Tan
- 2023 Poster: Toward Better PAC-Bayes Bounds for Uniformly Stable Algorithms
  Sijia Zhou · Yunwen Lei · Ata Kaban
- 2022 Spotlight: A Communication-Efficient Distributed Gradient Clipping Algorithm for Training Deep Neural Networks
  Mingrui Liu · Zhenxun Zhuang · Yunwen Lei · Chunyang Liao
- 2022 Poster: A Communication-Efficient Distributed Gradient Clipping Algorithm for Training Deep Neural Networks
  Mingrui Liu · Zhenxun Zhuang · Yunwen Lei · Chunyang Liao
- 2022 Poster: Stability and Generalization for Markov Chain Stochastic Gradient Methods
  Puyu Wang · Yunwen Lei · Yiming Ying · Ding-Xuan Zhou
- 2021 Poster: Simple Stochastic and Online Gradient Descent Algorithms for Pairwise Learning
  ZHENHUAN YANG · Yunwen Lei · Puyu Wang · Tianbao Yang · Yiming Ying
- 2021 Poster: Generalization Guarantee of SGD for Pairwise Learning
  Yunwen Lei · Mingrui Liu · Yiming Ying
- 2009 Poster: Sparse Metric Learning via Smooth Optimization
  Yiming Ying · Kaizhu Huang · Colin I Campbell
- 2009 Poster: Analysis of SVM with Indefinite Kernels
  Yiming Ying · Colin I Campbell · Mark A Girolami
- 2009 Spotlight: Analysis of SVM with Indefinite Kernels
  Yiming Ying · Colin I Campbell · Mark A Girolami
- 2007 Spotlight: A Spectral Regularization Framework for Multi-Task Structure Learning
  Andreas Argyriou · Charles A. Micchelli · Massimiliano Pontil · Yiming Ying
- 2007 Poster: A Spectral Regularization Framework for Multi-Task Structure Learning
  Andreas Argyriou · Charles A. Micchelli · Massimiliano Pontil · Yiming Ying