

Oral in Workshop: Information-Theoretic Principles in Cognitive Systems (InfoCog)

Information-Theoretic Generalization Error Bound of Deep Neural Networks

Haiyun He · Christina Yu · Ziv Goldfeld

Fri 15 Dec 1:50 p.m. PST — 2 p.m. PST
 
presentation: Information-Theoretic Principles in Cognitive Systems (InfoCog)
Fri 15 Dec 6:15 a.m. PST — 3:30 p.m. PST

Abstract:

Deep neural networks (DNNs) exhibit an exceptional capacity for generalization in practical applications. This work aims to capture the effects and benefits of depth for learning within the paradigm of information-theoretic generalization bounds. We derive two novel hierarchical bounds on the generalization error that capture the effect of the internal representations within each layer. The first bound shows that the generalization error bound shrinks as the layer index of the internal representation increases. The second bound aims to quantify the contraction of the relevant information measures when moving deeper into the network. To achieve this, we leverage the strong data processing inequality (SDPI) and employ a stochastic approximation of the DNN model for which the SDPI coefficient can be explicitly controlled. These results provide a new perspective for understanding generalization in deep models.
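To give a sense of the quantities involved (a sketch of the standard setup, not the paper's exact statements): information-theoretic generalization bounds in the style of Xu and Raginsky control the expected generalization error of an algorithm with output hypothesis $W$ trained on a sample $S$ of $n$ points via the mutual information $I(W;S)$; for a $\sigma$-sub-Gaussian loss,

$$\big|\mathbb{E}[\mathrm{gen}(W,S)]\big| \;\le\; \sqrt{\tfrac{2\sigma^2}{n}\, I(W;S)}.$$

A strong data processing inequality quantifies how such information measures contract through a stochastic layer map from representation $T_k$ to $T_{k+1}$: there exists a coefficient $\eta_k \le 1$ such that

$$I(X; T_{k+1}) \;\le\; \eta_k\, I(X; T_k),$$

so bounds expressed through deeper representations can only tighten. The symbols $T_k$, $\eta_k$, and the sub-Gaussian loss assumption here are illustrative placeholders rather than definitions taken from this paper.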
