Timezone: »
Spotlight
Improved Coresets and Sublinear Algorithms for Power Means in Euclidean Spaces
Vincent Cohen-Addad · David Saulpic · Chris Schwiegelshohn
@
In this paper, we consider the problem of finding high dimensional power means: given a set $A$ of $n$ points in $\R^d$, find the point $m$ that minimizes the sum of Euclidean distance, raised to the power $z$, over all input points. Special cases of problem include the well-known Fermat-Weber problem -- or geometric median problem -- where $z = 1$, the mean or centroid where $z=2$, and the Minimum Enclosing Ball problem, where $z = \infty$.We consider these problem in the big data regime.Here, we are interested in sampling as few points as possible such that we can accurately estimate $m$.More specifically, we consider sublinear algorithms as well as coresets for these problems.Sublinear algorithms have a random query access to the $A$ and the goal is to minimize the number of queries.Here, we show that $\tilde{O}(\varepsilon^{-z-3})$ samples are sufficient to achieve a $(1+\varepsilon)$ approximation, generalizing the results from Cohen, Lee, Miller, Pachocki, and Sidford [STOC '16] and Inaba, Katoh, and Imai [SoCG '94] to arbitrary $z$. Moreover, we show that this bound is nearly optimal, as any algorithm requires at least $\Omega(\varepsilon^{-z+1})$ queries to achieve said approximation.The second contribution are coresets for these problems, where we aim to find find a small, weighted subset of the points which approximate cost of every candidate point $c\in \mathbb{R}^d$ up to a $(1\pm\varepsilon)$ factor. Here, we show that $\tilde{O}(\varepsilon^{-2})$ points are sufficient, improving on the $\tilde{O}(d\varepsilon^{-2})$ bound by Feldman and Langberg [STOC '11] and the $\tilde{O}(\varepsilon^{-4})$ bound by Braverman, Jiang, Krauthgamer, and Wu [SODA 21].
Author Information
Vincent Cohen-Addad (Google research)
David Saulpic (Sorbonne Université-LIP6)
Chris Schwiegelshohn (Aarhus University)
Related Events (a corresponding poster, oral, or spotlight)
-
2021 Poster: Improved Coresets and Sublinear Algorithms for Power Means in Euclidean Spaces »
Fri. Dec 10th 04:30 -- 06:00 PM Room
More from the Same Authors
-
2022 : Scalable and Improved Algorithms for Individually Fair Clustering »
Mohammadhossein Bateni · Vincent Cohen-Addad · Alessandro Epasto · Silvio Lattanzi -
2023 Poster: Multi-Swap k-Means++ »
Lorenzo Beretta · Vincent Cohen-Addad · Silvio Lattanzi · Nikos Parotsidis -
2023 Poster: Private estimation algorithms for stochastic block models and mixture models »
Hongjie Chen · Vincent Cohen-Addad · Tommaso d'Orsi · Alessandro Epasto · Jacob Imola · David Steurer · Stefan Tiegel -
2022 Poster: Improved Coresets for Euclidean $k$-Means »
Vincent Cohen-Addad · Kasper Green Larsen · David Saulpic · Chris Schwiegelshohn · Omar Ali Sheikh-Omar -
2022 Poster: Near-Optimal Correlation Clustering with Privacy »
Vincent Cohen-Addad · Chenglin Fan · Silvio Lattanzi · Slobodan Mitrovic · Ashkan Norouzi-Fard · Nikos Parotsidis · Jakub Tarnawski -
2022 Poster: Near-Optimal Private and Scalable $k$-Clustering »
Vincent Cohen-Addad · Alessandro Epasto · Vahab Mirrokni · Shyam Narayanan · Peilin Zhong -
2021 Poster: Parallel and Efficient Hierarchical k-Median Clustering »
Vincent Cohen-Addad · Silvio Lattanzi · Ashkan Norouzi-Fard · Christian Sohler · Ola Svensson -
2020 Poster: On the Power of Louvain in the Stochastic Block Model »
Vincent Cohen-Addad · Adrian Kosowski · Frederik Mallmann-Trenn · David Saulpic -
2019 Poster: Fully Dynamic Consistent Facility Location »
Vincent Cohen-Addad · Niklas Oskar D Hjuler · Nikos Parotsidis · David Saulpic · Chris Schwiegelshohn -
2018 Poster: On Coresets for Logistic Regression »
Alexander Munteanu · Chris Schwiegelshohn · Christian Sohler · David Woodruff -
2018 Spotlight: On Coresets for Logistic Regression »
Alexander Munteanu · Chris Schwiegelshohn · Christian Sohler · David Woodruff -
2017 Poster: Hierarchical Clustering Beyond the Worst-Case »
Vincent Cohen-Addad · Varun Kanade · Frederik Mallmann-Trenn