Poster
On UMAP's True Loss Function
Sebastian Damrich · Fred Hamprecht
UMAP has supplanted $t$-SNE as state-of-the-art for visualizing high-dimensional datasets in many disciplines, but the reason for its success is not well understood. In this work, we investigate UMAP's sampling-based optimization scheme in detail. We derive UMAP's true loss function in closed form and find that it differs from the published one in a way that depends on the dataset size. As a consequence, we show that UMAP does not aim to reproduce its theoretically motivated high-dimensional UMAP similarities. Instead, it tries to reproduce similarities that only encode the $k$ nearest neighbor graph, thereby challenging the previous understanding of UMAP's effectiveness. Alternatively, we consider the implicit balancing of attraction and repulsion due to the negative sampling to be key to UMAP's success. We corroborate our theoretical findings on toy data and single-cell RNA sequencing data.
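The claim that a reweighted loss no longer reproduces the high-dimensional similarities can be illustrated with a minimal sketch. It assumes a per-pair loss of cross-entropy form, with attraction weighted by the high-dimensional similarity `mu` and repulsion weighted by a free parameter `repulsion_weight`. The function names and the value 0.01 are hypothetical choices for illustration, not the closed form derived in the paper.

```python
import math


def pair_loss(nu, mu, repulsion_weight):
    """Loss for one pair: attraction weighted by mu, repulsion by repulsion_weight.

    nu is the low-dimensional similarity in (0, 1).
    """
    return -(mu * math.log(nu) + repulsion_weight * math.log(1.0 - nu))


def argmin_nu(mu, repulsion_weight):
    """Minimizer of pair_loss over nu, from setting the derivative to zero:
    -mu / nu + repulsion_weight / (1 - nu) = 0.
    """
    return mu / (mu + repulsion_weight)


mu = 0.3

# Cross-entropy weighting (repulsion weight 1 - mu): the optimum equals mu.
cross_entropy_optimum = argmin_nu(mu, 1.0 - mu)

# Hypothetical much smaller repulsion weight (assumption, chosen for illustration):
# the optimum moves away from mu and towards 1.
reweighted_optimum = argmin_nu(mu, 0.01)

print(cross_entropy_optimum, reweighted_optimum)
```

Under this assumption, any positive `mu` paired with a small repulsion weight is pushed towards a low-dimensional similarity near 1, so the optimum mostly records whether a pair is connected at all rather than how strongly.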
Author Information
Sebastian Damrich (Heidelberg University)
Fred Hamprecht (Heidelberg University)
More from the Same Authors
- 2022 Poster: Theory and Approximate Solvers for Branched Optimal Transport with Multiple Sources
  Peter Lippmann · Enrique Fita Sanmartín · Fred Hamprecht
- 2021 Poster: Directed Probabilistic Watershed
  Enrique Fita Sanmartin · Sebastian Damrich · Fred Hamprecht
- 2019 Poster: Probabilistic Watershed: Sampling all spanning forests for seeded segmentation and semi-supervised learning
  Enrique Fita Sanmartin · Sebastian Damrich · Fred Hamprecht
- 2019 Spotlight: Probabilistic Watershed: Sampling all spanning forests for seeded segmentation and semi-supervised learning
  Enrique Fita Sanmartin · Sebastian Damrich · Fred Hamprecht
- 2017 Poster: Sparse convolutional coding for neuronal assembly detection
  Sven Peter · Elke Kirschbaum · Martin Both · Lee Campbell · Brandon Harvey · Conor Heins · Daniel Durstewitz · Ferran Diego · Fred Hamprecht
- 2017 Poster: Cost efficient gradient boosting
  Sven Peter · Ferran Diego · Fred Hamprecht · Boaz Nadler
- 2016 : Fred Hamprecht : Motif Discovery in Functional Brain Data
  Fred Hamprecht
- 2014 Poster: Sparse Space-Time Deconvolution for Calcium Image Analysis
  Ferran Diego Andilla · Fred Hamprecht
- 2014 Spotlight: Sparse Space-Time Deconvolution for Calcium Image Analysis
  Ferran Diego Andilla · Fred Hamprecht
- 2013 Poster: Learning Multi-level Sparse Representations
  Ferran Diego Andilla · Fred Hamprecht
- 2011 Poster: Structured Learning for Cell Tracking
  Xinghua Lou · Fred Hamprecht