Poster
Learning Elastic Costs to Shape Monge Displacements
Michal Klein · Aram-Alexandre Pooladian · Pierre Ablin · Eugene Ndiaye · Jonathan Niles-Weed · Marco Cuturi
West Ballroom A-D #7109
[
Abstract
]
Fri 13 Dec 11 a.m. PST
— 2 p.m. PST
Abstract:
Given a source and a target probability measure supported on $\mathbb{R}^d$, the Monge problem asks to find the most efficient way to map one distribution to the other.This efficiency is quantified by defining a \textit{cost} function between source and target data. Such a cost is often set by default in the machine learning literature to the squared-Euclidean distance, $\ell^2_2(\mathbf{x},\mathbf{y})=\tfrac12\|\mathbf{x}-\mathbf{y}\|_2^2$.Recently, Cuturi et. al '23 highlighted the benefits of using elastic costs, defined through a regularizer $\tau$ as $c(\mathbf{x},\mathbf{y})=\ell^2_2(\mathbf{x},\mathbf{y})+\tau(\mathbf{x}-\mathbf{y})$. Such costs shape the \textit{displacements} of Monge maps $T$, i.e., the difference between a source point and its image $T(\mathbf{x})-\mathbf{x})$, by giving them a structure that matches that of the proximal operator of $\tau$.In this work, we make two important contributions to the study of elastic costs: *(i)* For any elastic cost, we propose a numerical method to compute Monge maps that are provably optimal. This provides a much-needed routine to create synthetic problems where the ground truth OT map is known, by analogy to the Brenier theorem, which states that the gradient of any convex potential is always a valid Monge map for the $\ell_2^2$ cost; *(ii)* We propose a loss to \textit{learn} the parameter $\theta$ of a parameterized regularizer $\tau_\theta$, and apply it in the case where $\tau_{A}(\mathbf{z})=\|A^\perp \mathbf{z}\|^2_2$. This regularizer promotes displacements that lie on a low dimensional subspace of $\mathbb{R}^d$, spanned by the $p$ rows of $A\in\mathbb{R}^{p\times d}$. We illustrate the success of our procedure on synthetic data, generated using our first contribution, in which we show near-perfect recovery of $A$'s subspace using only samples. We demonstrate the applicability of this method by showing predictive improvements on single-cell data tasks.
Live content is unavailable. Log in and register to view live content