
Workshop: OPT 2023: Optimization for Machine Learning

Optimization dependent generalization bound for ReLU networks based on sensitivity in the tangent bundle

Dániel Rácz · Mihaly Petreczky · Balint Daroczy


Recent advances in deep learning have given us some very promising results on the generalization ability of deep neural networks; however, the literature still lacks a comprehensive theory explaining why heavily over-parametrized models are able to generalize well while fitting the training data. In this paper we propose a PAC-type bound on the generalization error of feedforward ReLU networks via estimating the Rademacher complexity of the set of networks available from an initial parameter vector via gradient descent. The key idea is to bound the sensitivity of the network's gradient to perturbation of the input data along the optimization trajectory. The obtained bound does not explicitly depend on the depth of the network. Our results are experimentally verified on the MNIST and CIFAR-10 datasets.
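One way to read the key idea above is that the quantity of interest is how much the network's input gradient (its tangent map) moves when the input is perturbed. The following is a minimal pure-Python sketch of that measurement on a toy one-hidden-layer scalar ReLU network; the network, the finite-difference probe, and all function names here are illustrative assumptions for intuition only, not the paper's actual construction or bound.

```python
import random

def relu(x):
    return x if x > 0 else 0.0

# Toy one-hidden-layer ReLU network with scalar input and output:
#   f(x) = sum_j v[j] * relu(w[j] * x + b[j])
def forward(params, x):
    w, b, v = params
    return sum(v[j] * relu(w[j] * x + b[j]) for j in range(len(w)))

def input_grad(params, x):
    # Analytic input gradient: d f / d x = sum over *active* units
    # (those with w[j]*x + b[j] > 0) of v[j] * w[j].
    w, b, v = params
    return sum((v[j] * w[j] for j in range(len(w)) if w[j] * x + b[j] > 0), 0.0)

def gradient_sensitivity(params, x, eps=1e-3):
    # |grad f(x + eps) - grad f(x)|: how far the input gradient moves
    # under a small input perturbation. A small value means the
    # network's tangent map is stable around x (the kind of quantity
    # the abstract's sensitivity argument tracks along training).
    return abs(input_grad(params, x + eps) - input_grad(params, x))

random.seed(0)
n = 8  # hidden width (arbitrary for the sketch)
params = ([random.gauss(0, 1) for _ in range(n)],   # w
          [random.gauss(0, 1) for _ in range(n)],   # b
          [random.gauss(0, 1) for _ in range(n)])   # v
s = gradient_sensitivity(params, 0.5)
```

Because a ReLU network is piecewise linear, the input gradient only changes when the perturbation flips a unit's activation pattern, so `s` is either zero or a jump contributed by the flipped units; in the paper this kind of stability is tracked along the whole gradient-descent trajectory rather than at a single parameter vector.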
