Measuring Generalization with Optimal Transport
Ching-Yao Chuang · Youssef Mroueh · Kristjan Greenewald · Antonio Torralba · Stefanie Jegelka

Tue Dec 07 08:30 AM -- 10:00 AM (PST)

Understanding the generalization of deep neural networks is one of the most important tasks in deep learning. Although much progress has been made, theoretical error bounds still often behave disparately from empirical observations. In this work, we develop margin-based generalization bounds, where the margins are normalized with optimal transport costs between independent random subsets sampled from the training distribution. In particular, the optimal transport cost can be interpreted as a generalization of variance which captures the structural properties of the learned feature space. Our bounds robustly predict the generalization error, given training data and network parameters, on large-scale datasets. Theoretically, we demonstrate that the concentration and separation of features play crucial roles in generalization, supporting empirical results in the literature.
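
To make the "generalization of variance" reading concrete, the following is a minimal sketch and not the authors' implementation. It assumes uniform weights, a Euclidean ground cost, and two disjoint random subsets drawn from a finite pool as a stand-in for independent samples. The names `ot_cost`, `subset_ot_cost`, and `normalized_margin` are hypothetical, and the plain division used for normalization is only an illustration, not the exact quantity that appears in the bounds.

```python
import itertools
import math
import random


def ot_cost(xs, ys):
    """Exact optimal transport cost between two equal-size point sets
    with uniform weights and Euclidean ground cost.

    With uniform weights on n points each, some optimal plan is a
    permutation, so brute force over permutations is exact. This is only
    feasible for tiny n; a practical version would use an assignment or
    OT solver instead.
    """
    n = len(xs)
    assert n == len(ys)
    best = min(
        sum(math.dist(xs[i], ys[p[i]]) for i in range(n))
        for p in itertools.permutations(range(n))
    )
    return best / n


def subset_ot_cost(features, rng, subset_size=5, trials=20):
    """Average OT cost between two disjoint random subsets of `features`."""
    total = 0.0
    for _ in range(trials):
        sample = rng.sample(features, 2 * subset_size)
        total += ot_cost(sample[:subset_size], sample[subset_size:])
    return total / trials


def normalized_margin(margin, cost):
    """Illustrative normalization (an assumption): margin per unit OT cost."""
    return margin / cost


# Toy 2-D "features": the same point cloud at two different scales.
rng = random.Random(0)
base = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(40)]
tight = [(0.1 * x, 0.1 * y) for x, y in base]    # concentrated features
spread = [(2.0 * x, 2.0 * y) for x, y in base]   # dispersed features

cost_tight = subset_ot_cost(tight, random.Random(1))
cost_spread = subset_ot_cost(spread, random.Random(1))

print("OT cost, tight features: ", cost_tight)
print("OT cost, spread features:", cost_spread)
print("normalized margin (tight): ", normalized_margin(1.0, cost_tight))
print("normalized margin (spread):", normalized_margin(1.0, cost_spread))
```

In this toy setting the subset-to-subset transport cost grows with the spread of the features, much as a standard deviation would, so the same raw margin counts for more when the features are concentrated.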

Author Information

Ching-Yao Chuang (MIT)
Youssef Mroueh (IBM T.J. Watson Research Center)
Kristjan Greenewald (MIT-IBM Watson AI Lab; IBM Research)
Antonio Torralba (Massachusetts Institute of Technology)
Stefanie Jegelka (MIT)

Stefanie Jegelka is an X-Consortium Career Development Assistant Professor in the Department of EECS at MIT. She is a member of the Computer Science and AI Lab (CSAIL) and the Center for Statistics, and an affiliate of the Institute for Data, Systems and Society and the Operations Research Center. Before joining MIT, she was a postdoctoral researcher at UC Berkeley, and obtained her PhD from ETH Zurich and the Max Planck Institute for Intelligent Systems. Stefanie has received a Sloan Research Fellowship, an NSF CAREER Award, a DARPA Young Faculty Award, the German Pattern Recognition Award and a Best Paper Award at the International Conference on Machine Learning (ICML). Her research interests span the theory and practice of algorithmic machine learning.
