Learning Trajectories are Generalization Indicators
Jingwen Fu · Zhizheng Zhang · Dacheng Yin · Yan Lu · Nanning Zheng

Wed Dec 13 03:00 PM -- 05:00 PM (PST) @ Great Hall & Hall B1+B2 #1808

This paper explores the connection between the learning trajectories of Deep Neural Networks (DNNs) and their generalization capabilities when optimized using (stochastic) gradient descent algorithms. Instead of concentrating solely on the generalization error of the DNN after training, we present a novel perspective for analyzing generalization error by investigating the contribution of each update step to the change in generalization error. This perspective enables a more direct understanding of how the learning trajectory influences generalization error. Building upon this analysis, we propose a new generalization bound that incorporates more extensive trajectory information. Our proposed generalization bound depends on the complexity of the learning trajectory and the ratio between the bias and diversity of the training set. Experimental observations reveal that our method effectively captures the generalization error throughout the training process. Furthermore, our approach can also track changes in generalization error when adjustments are made to learning rates and label noise levels. These results demonstrate that learning trajectory information is a valuable indicator of a model's generalization capabilities.
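
The per-step view described in the abstract can be illustrated with a toy sketch. This is not the paper's bound or its experimental setup: it assumes a small synthetic linear regression problem, mini-batch SGD, and a large held-out sample as a stand-in for the population risk. The function name `track_gap_contributions` and all hyperparameters are hypothetical. The sketch only shows that the change in the generalization gap over training is the sum of the changes contributed by the individual update steps.

```python
import numpy as np


def mse(w, X, y):
    """Mean squared error of the linear model X @ w against targets y."""
    return float(np.mean((X @ w - y) ** 2))


def track_gap_contributions(steps=200, lr=0.02, batch=8, seed=0):
    """Run mini-batch SGD on a toy regression task and record, for every
    update step, how much the generalization gap (test loss minus train
    loss) changed. All sizes and hyperparameters are illustrative."""
    rng = np.random.default_rng(seed)
    d, n_train, n_test = 5, 40, 2000

    # Synthetic data: noisy linear targets (an assumption of this sketch).
    w_true = rng.normal(size=d)
    X_train = rng.normal(size=(n_train, d))
    y_train = X_train @ w_true + 0.5 * rng.normal(size=n_train)
    X_test = rng.normal(size=(n_test, d))
    y_test = X_test @ w_true + 0.5 * rng.normal(size=n_test)

    def gap(w):
        # Held-out loss is used as a proxy for the population risk.
        return mse(w, X_test, y_test) - mse(w, X_train, y_train)

    w = np.zeros(d)
    gaps = [gap(w)]
    for _ in range(steps):
        idx = rng.choice(n_train, size=batch, replace=False)
        Xb, yb = X_train[idx], y_train[idx]
        grad = 2.0 * Xb.T @ (Xb @ w - yb) / batch  # gradient of batch MSE
        w = w - lr * grad
        gaps.append(gap(w))

    gaps = np.array(gaps)
    contributions = np.diff(gaps)  # gap change attributed to each step
    return gaps, contributions


gaps, contributions = track_gap_contributions()
print("initial gap:", gaps[0])
print("final gap:", gaps[-1])
print("sum of per-step contributions:", contributions.sum())
```

The per-step contributions telescope, so the initial gap plus their sum equals the final gap up to floating-point error. Rerunning with a different `lr` or with noisier targets changes which steps contribute most, which is the kind of trajectory dependence the abstract refers to.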

Author Information

Jingwen Fu (Xi'an Jiaotong University)
Zhizheng Zhang (Microsoft Research)
Dacheng Yin (University of Science and Technology of China)
Yan Lu (Microsoft Research Asia)
Nanning Zheng (Xi'an Jiaotong University)
