Computing approximate nearest neighbors in high-dimensional spaces is a central problem in large-scale data mining, with a wide range of applications in machine learning and data science. A popular and effective technique for computing nearest neighbors approximately is the locality-sensitive hashing (LSH) scheme. In this paper, we aim to develop LSH schemes for distance functions that measure the distance between two probability distributions, particularly for f-divergences as well as a generalization to capture mutual information loss. First, we provide a general framework to design LSH schemes for f-divergence distance functions and develop LSH schemes for the generalized Jensen-Shannon divergence and triangular discrimination in this framework. We show a two-sided bound for approximating the generalized Jensen-Shannon divergence by the Hellinger distance, which may be of independent interest. Next, we show a general method of reducing the problem of designing an LSH scheme for a Krein kernel (which can be expressed as the difference of two positive definite kernels) to the problem of maximum inner product search. We exemplify this method by applying it to the mutual information loss, which has several important applications such as model compression.
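To illustrate why a Hellinger approximation is useful for hashing, the following is a minimal sketch and not the authors' scheme. It relies on the standard fact that the Hellinger distance between two discrete distributions equals the Euclidean distance between their element-wise square roots, scaled by 1/sqrt(2), so a generic random-projection LSH for Euclidean distance can be applied to the square-root embedding. The class name `HellingerLSH` and the parameters `num_hashes`, `bucket_width`, and `seed` are hypothetical choices made for this example.

```python
import numpy as np


def hellinger(p, q):
    """Hellinger distance between two discrete probability distributions."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    # H(p, q) = ||sqrt(p) - sqrt(q)||_2 / sqrt(2)
    return np.linalg.norm(np.sqrt(p) - np.sqrt(q)) / np.sqrt(2.0)


class HellingerLSH:
    """Illustrative random-projection LSH applied to sqrt-embedded distributions.

    Hypothetical helper for this sketch; the parameter values are arbitrary.
    """

    def __init__(self, dim, num_hashes=8, bucket_width=0.5, seed=0):
        rng = np.random.default_rng(seed)
        # One Gaussian projection direction and one random offset per hash.
        self.a = rng.standard_normal((num_hashes, dim))
        self.b = rng.uniform(0.0, bucket_width, size=num_hashes)
        self.r = bucket_width

    def hash(self, p):
        # Embed the distribution so that Euclidean distance tracks Hellinger distance.
        x = np.sqrt(np.asarray(p, dtype=float))
        # Quantize each shifted projection into a bucket of width r.
        return tuple(np.floor((self.a @ x + self.b) / self.r).astype(int))


if __name__ == "__main__":
    p = [0.5, 0.5, 0.0]
    q = [0.0, 0.0, 1.0]
    lsh = HellingerLSH(dim=3)
    print(hellinger(p, q))
    print(lsh.hash(p), lsh.hash(q))
```

Distributions that are close in Hellinger distance land in the same bucket with higher probability than distant ones; in practice several such hash tables would be combined to trade recall against query time.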
Author Information
Lin Chen (Yale University)
Hossein Esfandiari (Google Research)
Gang Fu (Google Research)
Vahab Mirrokni (Google Research NYC)
More from the Same Authors
- 2023 Poster: Replicable Clustering
  Hossein Esfandiari · Amin Karbasi · Vahab Mirrokni · Grigoris Velegkas · Felix Zhou
- 2022 Poster: Subquadratic Kronecker Regression with Applications to Tensor Decomposition
  Matthew Fahrbach · Gang Fu · Mehrdad Ghadiri
- 2022 Poster: Anonymous Bandits for Multi-User Systems
  Hossein Esfandiari · Vahab Mirrokni · Jon Schneider
- 2020 Poster: Optimal Approximation - Smoothness Tradeoffs for Soft-Max Functions
  Alessandro Epasto · Mohammad Mahdian · Vahab Mirrokni · Emmanouil Zampetakis
- 2020 Spotlight: Optimal Approximation - Smoothness Tradeoffs for Soft-Max Functions
  Alessandro Epasto · Mohammad Mahdian · Vahab Mirrokni · Emmanouil Zampetakis
- 2020 Poster: Smoothly Bounding User Contributions in Differential Privacy
  Alessandro Epasto · Mohammad Mahdian · Jieming Mao · Vahab Mirrokni · Lijie Ren
- 2020 Poster: Contextual Reserve Price Optimization in Auctions via Mixed Integer Programming
  Joey Huchette · Haihao Lu · Hossein Esfandiari · Vahab Mirrokni
- 2020 : Clustering At Scale
  Vahab Mirrokni
- 2020 Expo Workshop: Mining and Learning with Graphs at Scale
  Vahab Mirrokni · Bryan Perozzi · Jakub Lacki · Jonathan Halcrow · Jaqui C Herman
- 2020 : Introduction
  Vahab Mirrokni
- 2019 Poster: Contextual Bandits with Cross-Learning
  Santiago Balseiro · Negin Golrezaei · Mohammad Mahdian · Vahab Mirrokni · Jon Schneider
- 2019 Poster: Dynamic Incentive-Aware Learning: Robust Pricing in Contextual Auctions
  Negin Golrezaei · Adel Javanmard · Vahab Mirrokni
- 2019 Poster: A Robust Non-Clairvoyant Dynamic Mechanism for Contextual Auctions
  Yuan Deng · Sébastien Lahaie · Vahab Mirrokni
- 2019 Poster: Online Continuous Submodular Maximization: From Full-Information to Bandit Feedback
  Mingrui Zhang · Lin Chen · Hamed Hassani · Amin Karbasi
- 2019 Poster: Variance Reduction in Bipartite Experiments through Correlation Clustering
  Jean Pouget-Abadie · Kevin Aydin · Warren Schudy · Kay Brodersen · Vahab Mirrokni
- 2017 Poster: Dynamic Revenue Sharing
  Santiago Balseiro · Max Lin · Vahab Mirrokni · Renato Leme · IIIS Song Zuo
- 2017 Poster: Affinity Clustering: Hierarchical Clustering at Scale
  Mohammadhossein Bateni · Soheil Behnezhad · Mahsa Derakhshan · MohammadTaghi Hajiaghayi · Raimondas Kiveris · Silvio Lattanzi · Vahab Mirrokni
- 2016 Poster: Bi-Objective Online Matching and Submodular Allocations
  Hossein Esfandiari · Nitish Korula · Vahab Mirrokni
- 2016 Poster: Linear Relaxations for Finding Diverse Elements in Metric Spaces
  Aditya Bhaskara · Mehrdad Ghadiri · Vahab Mirrokni · Ola Svensson
- 2014 Poster: Distributed Balanced Clustering via Mapping Coresets
  Mohammadhossein Bateni · Aditya Bhaskara · Silvio Lattanzi · Vahab Mirrokni