

Oral in Workshop: Information-Theoretic Principles in Cognitive Systems (InfoCog)

Information-Theoretic Generalization Error Bound of Deep Neural Networks

Haiyun He · Christina Yu · Ziv Goldfeld

Fri 15 Dec 1:50 p.m. PST — 2 p.m. PST
 
presentation: Information-Theoretic Principles in Cognitive Systems (InfoCog)
Fri 15 Dec 6:15 a.m. PST — 3:30 p.m. PST

Abstract:

Deep neural networks (DNNs) exhibit an exceptional capacity for generalization in practical applications. This work aims to capture the effects and benefits of depth for learning within the paradigm of information-theoretic generalization bounds. We derive two novel hierarchical bounds on the generalization error that capture the effect of the internal representations within each layer. The first bound shows that the generalization error bound shrinks as the layer index of the internal representation increases. The second bound aims to quantify the contraction of the relevant information measures when moving deeper into the network. To achieve this, we leverage the strong data processing inequality (SDPI) and employ a stochastic approximation of the DNN model for which the SDPI coefficient can be explicitly controlled. These results provide a new perspective for understanding generalization in deep models.
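To give a sense of the quantities involved (a sketch of the standard setup, not the paper's exact statements): information-theoretic generalization bounds in the style of Xu and Raginsky control the expected generalization error of an algorithm with output hypothesis $W$ trained on a sample $S$ of $n$ points via the mutual information $I(W;S)$; for a $\sigma$-sub-Gaussian loss,

$$\big|\mathbb{E}[\mathrm{gen}(W,S)]\big| \;\le\; \sqrt{\tfrac{2\sigma^2}{n}\, I(W;S)}.$$

A strong data processing inequality quantifies how such information measures contract through a stochastic layer map from representation $T_k$ to $T_{k+1}$: there exists a coefficient $\eta_k \le 1$ such that

$$I(X; T_{k+1}) \;\le\; \eta_k\, I(X; T_k),$$

so bounds expressed through deeper representations can only tighten. The symbols $T_k$, $\eta_k$, and the sub-Gaussian loss assumption here are illustrative placeholders rather than definitions taken from this paper.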
