Comparing the representations learned by different neural networks has recently emerged as a key tool to understand various architectures and ultimately optimize them. In this work, we introduce GULP, a family of distance measures between representations that is explicitly motivated by downstream predictive tasks. By construction, GULP provides uniform control over the difference in prediction performance between two representations, with respect to regularized linear prediction tasks. Moreover, it satisfies several desirable structural properties, such as the triangle inequality and invariance under orthogonal transformations, and thus lends itself to data embedding and visualization. We extensively evaluate GULP relative to other methods, and demonstrate that it correctly differentiates between architecture families, converges over the course of training, and captures generalization performance on downstream linear tasks.
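The abstract describes GULP only at a high level. As a rough illustration of the kind of computation involved, the NumPy sketch below estimates a GULP-style distance between two representation matrices from ridge-regularized covariances. This is a minimal sketch based on the description above, not the paper's exact estimator: the function name gulp_distance, the default regularization parameter lam, and the specific plug-in formula are assumptions made here for illustration; the authoritative definition is in the paper.

```python
import numpy as np

def gulp_distance(A, B, lam=1e-2):
    """Plug-in estimate of a GULP-style distance between two representations.

    A: (n, p) array of n inputs embedded by the first network.
    B: (n, q) array of the same n inputs embedded by the second network.
    lam: ridge-regularization strength of the downstream linear predictors
         (hypothetical default; the paper evaluates a range of values).
    """
    n = A.shape[0]
    # Center each representation so the matrices below are covariances.
    A = A - A.mean(axis=0, keepdims=True)
    B = B - B.mean(axis=0, keepdims=True)

    # Empirical covariance and cross-covariance matrices.
    cov_A = A.T @ A / n              # (p, p)
    cov_B = B.T @ B / n              # (q, q)
    cov_AB = A.T @ B / n             # (p, q)

    # Ridge-regularized inverses, as used by regularized linear predictors.
    inv_A = np.linalg.inv(cov_A + lam * np.eye(cov_A.shape[0]))
    inv_B = np.linalg.inv(cov_B + lam * np.eye(cov_B.shape[0]))

    # Squared distance: two self-similarity terms minus a cross term that
    # compares the ridge predictors induced by the two representations.
    sq = (
        np.trace(inv_A @ cov_A @ inv_A @ cov_A)
        + np.trace(inv_B @ cov_B @ inv_B @ cov_B)
        - 2.0 * np.trace(inv_A @ cov_AB @ inv_B @ cov_AB.T)
    )
    return float(np.sqrt(max(sq, 0.0)))

# Example: with this formula the distance vanishes when one representation is
# an orthogonal transformation of the other, consistent with the invariance
# property stated in the abstract.
rng = np.random.default_rng(0)
A = rng.normal(size=(500, 32))
Q, _ = np.linalg.qr(rng.normal(size=(32, 32)))
print(gulp_distance(A, A @ Q))                        # approximately 0
print(gulp_distance(A, rng.normal(size=(500, 32))))   # noticeably larger
```

Under these assumptions, calling gulp_distance on two matrices of embeddings for the same n inputs returns a single scalar, so pairwise distances between many networks can be assembled into a matrix and used for embedding or visualization, as the abstract suggests.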
Author Information
Enric Boix-Adsera (MIT)
Hannah Lawrence (MIT)
George Stepaniants (Massachusetts Institute of Technology)

George Stepaniants is a fourth-year PhD student in the Department of Mathematics at the Massachusetts Institute of Technology (MIT), advised by Prof. Philippe Rigollet and Prof. Jörn Dunkel. He is a member of the Interdisciplinary Doctoral Program in Statistics through the Institute for Data, Systems, and Society (IDSS). He obtained his Bachelor of Science in Mathematics and Computer Science at the University of Washington (UW) in 2019, where he performed research in the Department of Applied Mathematics under Prof. Nathan Kutz. George’s research lies at the intersection of statistics and physical applied mathematics, where he studies how statistical and machine learning algorithms can be used to infer and predict systems governed by ordinary and partial differential equations, such as physical, biological, and chemical processes. He is also interested in applications of optimal transport methods to matching problems in genomics, metabolomics, and other fields.
Philippe Rigollet (MIT)
More from the Same Authors
- 2022: Barron's Theorem for Equivariant Networks
  Hannah Lawrence
- 2022 Poster: Variational inference via Wasserstein gradient flows
  Marc Lambert · Sinho Chewi · Francis Bach · Silvère Bonnabel · Philippe Rigollet
- 2022 Poster: On the non-universality of deep learning: quantifying the cost of symmetry
  Emmanuel Abbe · Enric Boix-Adsera
- 2021 Poster: The staircase property: How hierarchical structure can guide deep learning
  Emmanuel Abbe · Enric Boix-Adsera · Matthew S Brennan · Guy Bresler · Dheeraj Nagaraj
- 2020 Poster: Exponential ergodicity of mirror-Langevin diffusions
  Sinho Chewi · Thibaut Le Gouic · Chen Lu · Tyler Maunu · Philippe Rigollet · Austin Stromme
- 2020 Poster: SVGD as a kernelized Wasserstein gradient flow of the chi-squared divergence
  Sinho Chewi · Thibaut Le Gouic · Chen Lu · Tyler Maunu · Philippe Rigollet
- 2020 Poster: Minimax Regret of Switching-Constrained Online Convex Optimization: No Phase Transition
  Lin Chen · Qian Yu · Hannah Lawrence · Amin Karbasi
- 2019 Poster: Power analysis of knockoff filters for correlated designs
  Jingbo Liu · Philippe Rigollet
- 2019 Poster: Sample Efficient Active Learning of Causal Trees
  Kristjan Greenewald · Dmitriy Katz · Karthikeyan Shanmugam · Sara Magliacane · Murat Kocaoglu · Enric Boix-Adsera · Guy Bresler
- 2017 Poster: Near-linear time approximation algorithms for optimal transport via Sinkhorn iteration
  Jason Altschuler · Jonathan Niles-Weed · Philippe Rigollet
- 2017 Spotlight: Near-linear time approximation algorithms for optimal transport via Sinkhorn iteration
  Jason Altschuler · Jonathan Niles-Weed · Philippe Rigollet