Optimal transport (OT) is gradually establishing itself as a powerful and essential tool for comparing probability measures, which in machine learning take the form of point clouds, histograms, bags-of-features, or, more generally, datasets to be compared with probability densities and generative models. OT can be traced back to early work by Monge, and later to Kantorovich and Dantzig during the birth of linear programming. The mathematical theory of OT has produced several important developments since the 1990s, crowned by Cédric Villani's Fields Medal in 2010. OT is now transitioning into more applied spheres, including recent applications to machine learning, because it can tackle challenging learning scenarios such as dimensionality reduction, structured prediction problems that involve histograms, and the estimation of generative models in highly degenerate, high-dimensional problems. This workshop follows the one organized three years ago (NIPS 2014) and seeks to amplify that trend. We will provide the audience with an update on the very recent successes brought forward by efficient solvers and innovative applications, through a long list of invited talks. We will add a few contributed presentations (oral and, if needed, posters) and, finally, a panel in which all invited speakers will take questions from the audience and formulate more nuanced opinions on this nascent field.
Sat 8:00 a.m. - 8:20 a.m. | Structured Optimal Transport (with T. Jaakkola, S. Jegelka) (Contributed 1) | David Alvarez-Melis
Sat 8:20 a.m. - 9:00 a.m. | Approximate Bayesian computation with the Wasserstein distance (Invited 1) | Pierre E Jacob

A growing range of generative statistical models prohibits the numerical evaluation of their likelihood functions. Approximate Bayesian computation has become a popular approach to overcome this issue: it simulates synthetic data given parameters and compares summaries of these simulations with the corresponding observed values. We propose to avoid these summaries, and the ensuing loss of information, through the use of Wasserstein distances between empirical distributions of observed and synthetic data. We describe how the approach can be used in the setting of dependent data such as time series, and how approximations of the Wasserstein distance allow the method, in practice, to scale to large datasets. In particular, we propose a new approximation to the optimal assignment problem using the Hilbert space-filling curve. The approach is illustrated on various examples, including i.i.d. data and time series.
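
The Hilbert-curve idea lends itself to a compact illustration. Below is a minimal sketch (not the authors' code) of ABC rejection sampling in which the optimal assignment between observed and synthetic point clouds is approximated by sorting both along a 2-D Hilbert curve; the toy Gaussian model, grid resolution, and acceptance tolerance are all illustrative assumptions.

```python
# Sketch: ABC rejection with a Wasserstein-style distance, where the optimal
# assignment between equal-size point clouds is approximated by matching points
# in Hilbert space-filling-curve order. Model, grid, and tolerance are assumed.
import numpy as np

def xy2d(n, x, y):
    """Hilbert index of integer cell (x, y) on an n x n grid (n a power of 2)."""
    d, s = 0, n // 2
    while s > 0:
        rx = 1 if (x & s) > 0 else 0
        ry = 1 if (y & s) > 0 else 0
        d += s * s * ((3 * rx) ^ ry)
        if ry == 0:                      # rotate quadrant to keep the curve continuous
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        s //= 2
    return d

def hilbert_sort(points, grid=1024):
    """Order 2-D points by their index along a Hilbert curve."""
    lo, hi = points.min(axis=0), points.max(axis=0)
    cells = ((points - lo) / (hi - lo + 1e-12) * (grid - 1)).astype(int)
    keys = [xy2d(grid, cx, cy) for cx, cy in cells]
    return points[np.argsort(keys)]

def hilbert_wasserstein(xs, ys, p=2):
    """Approximate W_p between equal-size clouds by matching in Hilbert order."""
    xs, ys = hilbert_sort(xs), hilbert_sort(ys)
    return (np.mean(np.linalg.norm(xs - ys, axis=1) ** p)) ** (1 / p)

# Toy ABC rejection sampler: infer the mean of a 2-D Gaussian.
rng = np.random.default_rng(0)
observed = rng.normal([1.0, -0.5], 1.0, size=(256, 2))
accepted = []
for _ in range(2000):
    theta = rng.uniform(-3, 3, size=2)                   # draw from the prior
    synthetic = rng.normal(theta, 1.0, size=(256, 2))
    if hilbert_wasserstein(observed, synthetic) < 0.5:   # tolerance (assumed)
        accepted.append(theta)
if accepted:
    print("posterior mean estimate:", np.mean(accepted, axis=0))
```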
Sat 9:00 a.m. - 9:40 a.m. | Gradient flow in the Wasserstein metric (Invited 2) | Katy Craig

Optimal transport provides powerful techniques not only for comparing probability measures, but also for analyzing their evolution over time. For a range of partial differential equations arising in physics, biology, and engineering, solutions are gradient flows in the Wasserstein metric: each equation has a notion of energy for which solutions dissipate energy as quickly as possible with respect to the Wasserstein structure. Steady states of the equation correspond to minimizers of the energy, and stability properties of the equation translate into convexity properties of the energy. In this talk, I will compare Wasserstein gradient flow with more classical gradient flows arising in optimization and machine learning. I'll then introduce a class of particle blob methods for simulating Wasserstein gradient flows numerically.
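
To make the particle picture concrete, here is a minimal sketch (an illustration, not the speaker's blob method) of a particle discretization of the Wasserstein gradient flow of a confinement-plus-interaction energy, for which the flow reduces to a coupled ODE system on the particle locations; the specific energies and the mollification below are assumptions chosen for simplicity.

```python
# Particles approximating the Wasserstein gradient flow of the (assumed) energy
#   E(rho) = \int V d rho + 1/2 \int\int W(x - y) d rho(x) d rho(y),
# whose flow reduces to the ODE system
#   dx_i/dt = -grad V(x_i) - (1/N) sum_j grad W(x_i - x_j).
import numpy as np

def grad_V(x):
    return x                              # confinement V(x) = |x|^2 / 2 (assumed)

def grad_W(z, eps=0.1):
    # Mollified attractive interaction W(z) ~ |z|; eps smooths the origin,
    # in the spirit of blob-style regularizations.
    return z / np.sqrt(np.sum(z**2, axis=-1, keepdims=True) + eps**2)

def step(x, dt=0.01):
    diffs = x[:, None, :] - x[None, :, :]          # pairwise x_i - x_j
    interaction = grad_W(diffs).mean(axis=1)       # (1/N) sum_j grad W(x_i - x_j)
    return x - dt * (grad_V(x) + interaction)      # explicit Euler step

rng = np.random.default_rng(1)
x = rng.normal(size=(200, 2)) * 3.0                # initial particle cloud
for _ in range(500):
    x = step(x)
print(x.mean(axis=0), x.std())                     # cloud contracts toward the origin
```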
Sat 9:40 a.m. - 10:00 a.m. | Approximate inference with Wasserstein gradient flows (with T. Poggio) (Contributed 2) | Charlie Frogner
Sat 10:00 a.m. - 10:20 a.m. | 6 x 3 minutes spotlights (Poster Spotlights) | Rémi Flamary · Yongxin Chen · Napat Rujeerapaiboon · Jonas Adler · John Lee · Lucas R Roberts
Sat 11:00 a.m. - 11:40 a.m. | Optimal planar transport in near-linear time (Invited 3) | Alexandr Andoni

We show how to compute the Earth Mover Distance between two planar point sets of size N in N^{1+o(1)} time. The algorithm is based on a generic framework that decomposes the natural linear programming formulation of the transport problem into a tree of smaller LPs and recomposes it in a divide-and-conquer fashion. The main enabling idea is to use sketching -- a generalization of the dimension reduction method -- to reduce the size of the "partial computation" so that the conquer step is more efficient. We will conclude with some open questions in the area. This is joint work with Aleksandar Nikolov, Krzysztof Onak, and Grigory Yaroslavtsev.
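
For a sense of scale, the sketch below is the classical exact baseline that the talk's algorithm accelerates: the Earth Mover Distance between equal-size planar point sets as an optimal assignment problem, solved by the Hungarian method in O(N^3) time. The random point sets are illustrative.

```python
# Exact EMD between equal-size planar point sets via optimal assignment,
# the O(N^3) baseline the talk improves to N^{1+o(1)} in the planar case.
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

def emd_exact(xs, ys):
    """Exact Earth Mover Distance between equal-size 2-D point sets."""
    cost = cdist(xs, ys)                       # pairwise Euclidean costs
    rows, cols = linear_sum_assignment(cost)   # Hungarian method
    return cost[rows, cols].mean()

rng = np.random.default_rng(0)
a, b = rng.random((500, 2)), rng.random((500, 2))
print(emd_exact(a, b))
```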
Sat 11:40 a.m. - 12:20 p.m. | Laplacian operator and Brownian motions on the Wasserstein space (Invited 4) | Wilfrid Gangbo

We endow the space of probability measures on $\mathbb{R}^d$ with $\Delta_w$, a Laplacian operator. A Brownian motion is shown to be consistent with the Laplacian operator. The smoothing effect of the heat equation is established for a class of functions. Special perturbations of the Laplacian operator, denoted $\Delta_{w,\epsilon}$, appearing in Mean Field Games theory, are considered. (Joint work with Y. T. Chow.)
Sat 1:40 p.m. - 2:20 p.m. | Geometrical Insights for Unsupervised Learning (Invited 5) | Leon Bottou

After arguing that choosing the right probability distance is critical for achieving the elusive goals of unsupervised learning, we compare the geometric properties of the two currently most promising distances: (1) the earth-mover distance, and (2) the energy distance, also known as maximum mean discrepancy. These insights allow us to give a fresh viewpoint on reported experimental results and to risk a couple of predictions. Joint work with Leon Bottou, Martin Arjovsky, David Lopez-Paz, and Maxime Oquab.
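
As a concrete reference for distance (2), here is a minimal sketch estimating the squared energy distance from samples via the standard formula ED^2(P, Q) = 2 E||X - Y|| - E||X - X'|| - E||Y - Y'||; the Gaussian samples are illustrative, and the plain plug-in (V-statistic) estimator is a simplification.

```python
# Squared energy distance between two samples, estimated with the plug-in
# formula ED^2 = 2 E||X - Y|| - E||X - X'|| - E||Y - Y'||.
import numpy as np
from scipy.spatial.distance import cdist

def energy_distance_sq(xs, ys):
    return (2 * cdist(xs, ys).mean()
            - cdist(xs, xs).mean()     # includes the zero diagonal: V-statistic
            - cdist(ys, ys).mean())

rng = np.random.default_rng(0)
p = rng.normal(0.0, 1.0, size=(1000, 2))
q = rng.normal(0.5, 1.0, size=(1000, 2))
print(energy_distance_sq(p, q))        # > 0 when the distributions differ
print(energy_distance_sq(p, p))        # exactly 0 by construction
```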
Sat 2:20 p.m. - 2:40 p.m. | Improving GANs Using Optimal Transport (with H. Zhang, A. Radford, D. Metaxas) (Contributed 3) | Tim Salimans
Sat 2:40 p.m. - 3:00 p.m. | Overrelaxed Sinkhorn-Knopp Algorithm for Regularized Optimal Transport (with L. Chizat, C. Dossal, N. Papadakis) (Contributed 4) | Alexis Thibault
Sat 3:30 p.m. - 4:10 p.m. | Domain adaptation with optimal transport: from mapping to learning with joint distributions (Invited 6) | Rémi Flamary

This presentation deals with the unsupervised domain adaptation problem, where one wants to estimate a prediction function f in a given target domain without any labeled sample, by exploiting the knowledge available from a source domain where labels are known. After a short introduction to recent developments in domain adaptation and their relation to optimal transport, we will present a method that estimates a barycentric mapping between the feature distributions in order to adapt the training dataset prior to learning. Next, we propose a novel method that models, with optimal transport, the transformation between the joint feature/label distributions of the two domains. We aim at recovering an estimated target distribution $p_t^f = (X, f(X))$ by optimizing simultaneously the optimal coupling and f. We discuss the generalization properties of the proposed method and provide an efficient algorithmic solution. The versatility of the approach, in terms of both hypothesis classes and loss functions, is demonstrated on real-world classification and regression problems, including large datasets where stochastic approaches become necessary. Joint work with Nicolas Courty, Devis Tuia, Amaury Habrard, and Alain Rakotomamonjy.
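
A hedged sketch of the first, barycentric-mapping step described above, using the POT (Python Optimal Transport) library: estimate an optimal coupling between the source and target feature clouds, map each source sample to the barycenter of its matched targets, then train on the mapped samples. The toy dataset and the k-NN classifier are illustrative stand-ins, not the talk's experimental setup.

```python
# Barycentric-mapping domain adaptation sketch with POT (pip install POT).
import numpy as np
import ot                                   # Python Optimal Transport

rng = np.random.default_rng(0)
Xs = rng.normal(0.0, 1.0, size=(100, 2))    # labeled source features (toy)
ys = (Xs[:, 0] > 0).astype(int)             # source labels (toy)
Xt = rng.normal(0.0, 1.0, size=(120, 2)) @ np.array([[1, .5], [0, 1]]) + 2.0

a = np.full(len(Xs), 1 / len(Xs))           # uniform source weights
b = np.full(len(Xt), 1 / len(Xt))           # uniform target weights
M = ot.dist(Xs, Xt)                         # squared Euclidean cost matrix
G = ot.emd(a, b, M)                         # optimal coupling

# Map each source point to the barycenter of the targets it is coupled with.
Xs_mapped = (G @ Xt) / G.sum(axis=1, keepdims=True)

# Train any classifier on the mapped source, predict on the target domain.
from sklearn.neighbors import KNeighborsClassifier
clf = KNeighborsClassifier(3).fit(Xs_mapped, ys)
pred_t = clf.predict(Xt)
```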
Sat 4:10 p.m. - 4:50 p.m. | Sharp asymptotic and finite-sample rates of convergence of empirical measures in Wasserstein distance (Invited 7) | Francis Bach

The Wasserstein distance between two probability measures on a metric space is a measure of closeness with applications in statistics, probability, and machine learning. In this work, we consider the fundamental question of how quickly the empirical measure obtained from n independent samples from μ approaches μ in the Wasserstein distance of any order. We prove sharp asymptotic and finite-sample results for this rate of convergence for general measures on general compact metric spaces. Our finite-sample results show the existence of multi-scale behavior, where measures can exhibit radically different rates of convergence as n grows. For details, see: J. Weed and F. Bach, "Sharp asymptotic and finite-sample rates of convergence of empirical measures in Wasserstein distance," technical report, arXiv:1707.00087, 2017.
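
A small simulation in the spirit of the talk: measure how fast the empirical measure of n i.i.d. samples approaches the underlying distribution in 1-D Wasserstein distance, using a large reference sample as a stand-in for μ itself. The uniform distribution, sample sizes, and the n^{-1/2} comparison column are illustrative choices.

```python
# Empirical convergence of mu_n to mu in 1-D Wasserstein distance.
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(0)
reference = rng.uniform(0, 1, size=200_000)          # proxy for mu itself
for n in [100, 400, 1600, 6400]:
    d = np.mean([wasserstein_distance(rng.uniform(0, 1, size=n), reference)
                 for _ in range(20)])                # average over 20 draws
    print(f"n={n:5d}  W1 ~ {d:.4f}  n^(-1/2) = {n**-0.5:.4f}")
```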
Sat 4:50 p.m. - 5:10 p.m. | 7 x 3 minutes spotlights (Poster Spotlights) | Elsa Cazelles · Aude Genevay · Gonzalo Mena · Christoph Brauer · Asja Fischer · Henning Petzka · Vivien Seguy · Antoine Rolet · Sho Sonoda
Sat 5:10 p.m. - 5:30 p.m. | Short Q&A session with plenary speakers (Roundtable)
Sat 5:30 p.m. - 6:30 p.m. | Closing session (Poster Session)
Author Information
Olivier Bousquet (Google Brain (Zurich))
Marco Cuturi (Google Brain & CREST - ENSAE)
Marco Cuturi is a research scientist at Apple, in Paris. He received his Ph.D. in applied mathematics from the Ecole des Mines de Paris in 11/2005. Before that, he graduated from the National School of Statistics (ENSAE) with a master's degree (MVA) from ENS Cachan. He worked as a post-doctoral researcher at the Institute of Statistical Mathematics, Tokyo, between 11/2005 and 3/2007, and then in the financial industry between 4/2007 and 9/2008. After working at the ORFE department of Princeton University as a lecturer between 2/2009 and 8/2010, he was at the Graduate School of Informatics of Kyoto University between 9/2010 and 9/2016 as a tenured associate professor. He joined ENSAE in 9/2016 as a professor, where he now works part-time. He was at Google between 10/2018 and 1/2022. Since 1/2022 his main employment has been with Apple, as a research scientist working on fundamental aspects of machine learning.
Gabriel Peyré (Université Paris Dauphine)
Fei Sha (University of Southern California (USC))
Justin Solomon (Stanford University)
More from the Same Authors
-
2021 : Faster Unbalanced Optimal Transport: Translation invariant Sinkhorn and 1-D Frank-Wolfe »
Thibault Sejourne · Francois-Xavier Vialard · Gabriel Peyré -
2021 : Linear-Time Gromov Wasserstein Distances using Low Rank Couplings and Costs »
Meyer Scetbon · Gabriel Peyré · Marco Cuturi -
2023 Poster: Unbalanced Low-rank Optimal Transport Solvers »
Meyer Scetbon · Michal Klein · Giovanni Palla · Marco Cuturi -
2023 Workshop: Optimal Transport and Machine Learning »
Anna Korba · Aram-Alexandre Pooladian · Charlotte Bunne · David Alvarez-Melis · Marco Cuturi · Ziv Goldfeld -
2022 Poster: Supervised Training of Conditional Monge Maps »
Charlotte Bunne · Andreas Krause · Marco Cuturi -
2022 Poster: Efficient and Modular Implicit Differentiation »
Mathieu Blondel · Quentin Berthet · Marco Cuturi · Roy Frostig · Stephan Hoyer · Felipe Llinares-Lopez · Fabian Pedregosa · Jean-Philippe Vert -
2022 Poster: Low-rank Optimal Transport: Approximation, Statistics and Debiasing »
Meyer Scetbon · Marco Cuturi -
2021 Workshop: Optimal Transport and Machine Learning »
Jason Altschuler · Charlotte Bunne · Laetitia Chapel · Marco Cuturi · Rémi Flamary · Gabriel Peyré · Alexandra Suvorikova -
2021 Poster: Smooth Bilevel Programming for Sparse Regularization »
Clarice Poon · Gabriel Peyré -
2021 Poster: The Unbalanced Gromov Wasserstein Distance: Conic Formulation and Relaxation »
Thibault Sejourne · Francois-Xavier Vialard · Gabriel Peyré -
2020 Poster: Projection Robust Wasserstein Distance and Riemannian Optimization »
Tianyi Lin · Chenyou Fan · Nhat Ho · Marco Cuturi · Michael Jordan -
2020 Poster: Fixed-Support Wasserstein Barycenters: Computational Hardness and Fast Algorithm »
Tianyi Lin · Nhat Ho · Xi Chen · Marco Cuturi · Michael Jordan -
2020 Spotlight: Projection Robust Wasserstein Distance and Riemannian Optimization »
Tianyi Lin · Chenyou Fan · Nhat Ho · Marco Cuturi · Michael Jordan -
2020 Poster: Learning with Differentiable Perturbed Optimizers »
Quentin Berthet · Mathieu Blondel · Olivier Teboul · Marco Cuturi · Jean-Philippe Vert · Francis Bach -
2020 Memorial: In Memory of Olivier Chapelle »
Bernhard Schölkopf · Andre Elisseeff · Olivier Bousquet · Vladimir Vapnik · Jason E Weston -
2020 Poster: Entropic Optimal Transport between Unbalanced Gaussian Measures has a Closed Form »
Hicham Janati · Boris Muzellec · Gabriel Peyré · Marco Cuturi -
2020 Poster: Linear Time Sinkhorn Divergences using Positive Features »
Meyer Scetbon · Marco Cuturi -
2020 Oral: Entropic Optimal Transport between Unbalanced Gaussian Measures has a Closed Form »
Hicham Janati · Boris Muzellec · Gabriel Peyré · Marco Cuturi -
2020 Session: Orals & Spotlights Track 21: Optimization »
Peter Richtarik · Marco Cuturi -
2020 Poster: Synthetic Data Generators -- Sequential and Private »
Olivier Bousquet · Roi Livni · Shay Moran -
2020 Poster: What Do Neural Networks Learn When Trained With Random Labels? »
Hartmut Maennel · Ibrahim Alabdulmohsin · Ilya Tolstikhin · Robert Baldock · Olivier Bousquet · Sylvain Gelly · Daniel Keysers -
2020 Spotlight: What Do Neural Networks Learn When Trained With Random Labels? »
Hartmut Maennel · Ibrahim Alabdulmohsin · Ilya Tolstikhin · Robert Baldock · Olivier Bousquet · Sylvain Gelly · Daniel Keysers -
2020 Session: Orals & Spotlights Track 01: Representation/Relational »
Laurens van der Maaten · Fei Sha -
2019 Workshop: Optimal Transport for Machine Learning »
Marco Cuturi · Gabriel Peyré · Rémi Flamary · Alexandra Suvorikova -
2019 Poster: Subspace Detours: Building Transport Plans that are Optimal on Subspace Projections »
Boris Muzellec · Marco Cuturi -
2019 Poster: Differentiable Ranking and Sorting using Optimal Transport »
Marco Cuturi · Olivier Teboul · Jean-Philippe Vert -
2019 Spotlight: Differentiable Ranking and Sorting using Optimal Transport »
Marco Cuturi · Olivier Teboul · Jean-Philippe Vert -
2019 Poster: Practical and Consistent Estimation of f-Divergences »
Paul Rubenstein · Olivier Bousquet · Josip Djolonga · Carlos Riquelme · Ilya Tolstikhin -
2019 Poster: Tree-Sliced Variants of Wasserstein Distances »
Tam Le · Makoto Yamada · Kenji Fukumizu · Marco Cuturi -
2018 Poster: Large Scale computation of Means and Clusters for Persistence Diagrams using Optimal Transport »
Theo Lacombe · Marco Cuturi · Steve Oudot -
2018 Poster: Assessing Generative Models via Precision and Recall »
Mehdi S. M. Sajjadi · Olivier Bachem · Mario Lucic · Olivier Bousquet · Sylvain Gelly -
2018 Poster: Synthesize Policies for Transfer and Adaptation across Tasks and Environments »
Hexiang Hu · Liyu Chen · Boqing Gong · Fei Sha -
2018 Spotlight: Synthesize Policies for Transfer and Adaptation across Tasks and Environments »
Hexiang Hu · Liyu Chen · Boqing Gong · Fei Sha -
2018 Poster: Are GANs Created Equal? A Large-Scale Study »
Mario Lucic · Karol Kurach · Marcin Michalski · Sylvain Gelly · Olivier Bousquet -
2018 Poster: Generalizing Point Embeddings using the Wasserstein Space of Elliptical Distributions »
Boris Muzellec · Marco Cuturi -
2017 Poster: Approximation and Convergence Properties of Generative Adversarial Learning »
Shuang Liu · Olivier Bousquet · Kamalika Chaudhuri -
2017 Spotlight: Approximation and Convergence Properties of Generative Adversarial Learning »
Shuang Liu · Olivier Bousquet · Kamalika Chaudhuri -
2017 Poster: An Empirical Study on The Properties of Random Bases for Kernel Methods »
Maximilian Alber · Pieter-Jan Kindermans · Kristof Schütt · Klaus-Robert Müller · Fei Sha -
2017 Poster: AdaGAN: Boosting Generative Models »
Ilya Tolstikhin · Sylvain Gelly · Olivier Bousquet · Carl-Johann SIMON-GABRIEL · Bernhard Schölkopf -
2017 Tutorial: A Primer on Optimal Transport »
Marco Cuturi · Justin Solomon -
2016 Workshop: Time Series Workshop »
Oren Anava · Marco Cuturi · Azadeh Khaleghi · Vitaly Kuznetsov · Sasha Rakhlin -
2016 Poster: A Multi-step Inertial Forward-Backward Splitting Method for Non-convex Optimization »
Jingwei Liang · Jalal Fadili · Gabriel Peyré -
2016 Poster: Wasserstein Training of Restricted Boltzmann Machines »
Grégoire Montavon · Klaus-Robert Müller · Marco Cuturi -
2016 Poster: Sparse Support Recovery with Non-smooth Loss Functions »
Kévin Degraux · Gabriel Peyré · Jalal Fadili · Laurent Jacques -
2016 Poster: Stochastic Optimization for Large-scale Optimal Transport »
Aude Genevay · Marco Cuturi · Gabriel Peyré · Francis Bach -
2015 : Do Shallow Kernel Methods Match Deep Neural Networks? »
Fei Sha -
2015 Poster: Biologically Inspired Dynamic Textures for Probing Motion Perception »
Jonathan Vacher · Andrew Isaac Meso · Laurent U Perrinet · Gabriel Peyré -
2015 Spotlight: Biologically Inspired Dynamic Textures for Probing Motion Perception »
Jonathan Vacher · Andrew Isaac Meso · Laurent U Perrinet · Gabriel Peyré -
2015 Poster: Principal Geodesic Analysis for Probability Measures under the Optimal Transport Metric »
Vivien Seguy · Marco Cuturi -
2014 Workshop: Optimal Transport and Machine Learning »
Marco Cuturi · Gabriel Peyré · Justin Solomon · Alexander Barvinok · Piotr Indyk · Robert McCann · Adam Oberman -
2014 Poster: Diverse Sequential Subset Selection for Supervised Video Summarization »
Boqing Gong · Wei-Lun Chao · Kristen Grauman · Fei Sha -
2014 Poster: Local Linear Convergence of Forward--Backward under Partial Smoothness »
Jingwei Liang · Jalal Fadili · Gabriel Peyré -
2013 Workshop: New Directions in Transfer and Multi-Task: Learning Across Domains and Tasks »
Urun Dogan · Marius Kloft · Tatiana Tommasi · Francesco Orabona · Massimiliano Pontil · Sinno Jialin Pan · Shai Ben-David · Arthur Gretton · Fei Sha · Marco Signoretto · Rajhans Samdani · Yun-Qian Miao · Mohammad Gheshlaghi azar · Ruth Urner · Christoph Lampert · Jonathan How -
2013 Poster: Reshaping Visual Datasets for Domain Adaptation »
Boqing Gong · Kristen Grauman · Fei Sha -
2013 Poster: Sinkhorn Distances: Lightspeed Computation of Optimal Transport »
Marco Cuturi -
2013 Spotlight: Sinkhorn Distances: Lightspeed Computation of Optimal Transport »
Marco Cuturi -
2013 Poster: Similarity Component Analysis »
Soravit Changpinyo · Kuan Liu · Fei Sha -
2012 Poster: Non-linear Metric Learning »
Dor Kedem · Stephen Tyree · Kilian Q Weinberger · Fei Sha · Gert Lanckriet -
2012 Session: Oral Session 5 »
Fei Sha -
2012 Poster: Semantic Kernel Forests from Multiple Taxonomies »
Sung Ju Hwang · Kristen Grauman · Fei Sha -
2011 Poster: Learning a Tree of Metrics with Disjoint Visual Features »
Sung Ju Hwang · Kristen Grauman · Fei Sha -
2010 Workshop: Challenges of Data Visualization »
Barbara Hammer · Laurens van der Maaten · Fei Sha · Alexander Smola -
2010 Poster: Unsupervised Kernel Dimension Reduction »
Meihong Wang · Fei Sha · Michael Jordan -
2009 Workshop: Statistical Machine Learning for Visual Analytics »
Guy Lebanon · Fei Sha -
2009 Poster: White Functionals for Anomaly Detection in Dynamical Systems »
Marco Cuturi · Jean-Philippe Vert · Alexandre d'Aspremont -
2008 Poster: DiscLDA: Discriminative Learning for Dimensionality Reduction and Classification »
Simon Lacoste-Julien · Fei Sha · Michael Jordan -
2008 Session: Oral session 1: Clustering »
Fei Sha -
2007 Workshop: Machine Learning for Systems Problems (Part 2) »
Archana Ganapathi · Sumit Basu · Fei Sha · Emre Kiciman -
2007 Workshop: Machine Learning for Systems Problems (Part 1) »
Archana Ganapathi · Sumit Basu · Fei Sha · Emre Kiciman -
2007 Session: Session 7: Systems and Applications »
Fei Sha -
2007 Poster: The Tradeoffs of Large Scale Learning »
Leon Bottou · Olivier Bousquet -
2006 Poster: Large Margin Gaussian Mixture Models for Automatic Speech Recognition »
Fei Sha · Lawrence Saul -
2006 Poster: Kernels on Structured Objects Through Nested Histograms »
Marco Cuturi · Kenji Fukumizu -
2006 Talk: Large Margin Gaussian Mixture Models for Automatic Speech Recognition »
Fei Sha · Lawrence Saul -
2006 Poster: Graph Regularization for Maximum Variance Unfolding with an Application to Sensor Localization »
Kilian Q Weinberger · Fei Sha · Qihui Zhu · Lawrence Saul