Poster
Beyond Pairwise: Provably Fast Algorithms for Approximate $k$-Way Similarity Search
Anshumali Shrivastava · Ping Li
Sun Dec 08 02:00 PM -- 06:00 PM (PST) @ Harrah's Special Events Center, 2nd Floor
We go beyond the notion of pairwise similarity and look into search problems with $k$-way similarity functions. In this paper, we focus on problems related to \emph{3-way Jaccard} similarity: $\mathcal{R}^{3way}= \frac{|S_1 \cap S_2 \cap S_3|}{|S_1 \cup S_2 \cup S_3|}$, $S_1, S_2, S_3 \in \mathcal{C}$, where $\mathcal{C}$ is a collection of $n$ sets (or binary vectors). We show that approximate $\mathcal{R}^{3way}$ similarity search problems admit fast algorithms with provable guarantees, analogous to the pairwise case. Our analysis and speedup guarantees naturally extend to $k$-way resemblance. In the process, we extend the traditional framework of \emph{locality sensitive hashing (LSH)} to handle higher-order similarities, which could be of independent theoretical interest. The applicability of $\mathcal{R}^{3way}$ search is shown on the "Google sets" application. In addition, we demonstrate the advantage of $\mathcal{R}^{3way}$ resemblance over the pairwise case in improving retrieval quality.
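As a rough illustration of the $\mathcal{R}^{3way}$ definition above, the following Python sketch computes the exact 3-way resemblance of three small sets and compares it with a minwise-hashing estimate. It relies on the standard property that, under a uniformly random permutation of the universe, the minimum elements of all three sets coincide with probability $\mathcal{R}^{3way}$. The function names, the toy sets, and the use of explicit random permutations are assumptions made for this example; it is not the search algorithm from the paper.

```python
import random

def resemblance(*sets):
    """Exact k-way Jaccard resemblance: |intersection| / |union|."""
    union = set().union(*sets)
    if not union:
        return 0.0
    return len(set.intersection(*sets)) / len(union)

def estimate_resemblance(sets, num_perms=2000, seed=0):
    """Estimate k-way resemblance by counting minwise collisions.

    Each trial draws a random permutation of the universe (illustrative
    choice; practical systems use hash functions instead) and checks
    whether every set has the same minimum element under it.
    Assumes every input set is non-empty.
    """
    rng = random.Random(seed)
    universe = sorted(set().union(*sets))
    matches = 0
    for _ in range(num_perms):
        # Assign each element a distinct random rank (a random permutation).
        ranks = dict(zip(universe, rng.sample(range(len(universe)), len(universe))))
        # Element of each set with the smallest rank.
        mins = {min(s, key=ranks.__getitem__) for s in sets}
        matches += (len(mins) == 1)
    return matches / num_perms

# Toy sets: the intersection is {40..59} (20 items), the union is {0..99} (100 items).
s1 = set(range(0, 60))
s2 = set(range(20, 80))
s3 = set(range(40, 100))

print("exact:    ", resemblance(s1, s2, s3))  # 20 / 100 = 0.2
print("estimated:", estimate_resemblance([s1, s2, s3]))
```

The estimate fluctuates around the exact value, with variance shrinking as `num_perms` grows. The same two functions accept any number of sets, which mirrors the extension to $k$-way resemblance mentioned in the abstract.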
Author Information
Anshumali Shrivastava (Rice University / ThirdAI Corp.)
Ping Li (Baidu Research USA)
More from the Same Authors
- 2017 Poster: Partial Hard Thresholding: Towards A Principled Analysis of Support Recovery
  Jie Shen · Ping Li
- 2017 Poster: Simple strategies for recovering inner products from coarsely quantized random projections
  Ping Li · Martin Slawski
- 2016 Poster: Exact Recovery of Hard Thresholding Pursuit
  Xiaotong Yuan · Ping Li · Tong Zhang
- 2016 Poster: Learning Additive Exponential Family Graphical Models via $\ell_{2,1}$-norm Regularized M-Estimation
  Xiaotong Yuan · Ping Li · Tong Zhang · Qingshan Liu · Guangcan Liu
- 2016 Poster: Quantized Random Projections and Non-Linear Estimation of Cosine Similarity
  Ping Li · Michael Mitzenmacher · Martin Slawski
- 2015 Poster: b-bit Marginal Regression
  Martin Slawski · Ping Li
- 2015 Spotlight: b-bit Marginal Regression
  Martin Slawski · Ping Li
- 2015 Poster: Regularization-Free Estimation in Trace Regression with Symmetric Positive Semidefinite Matrices
  Martin Slawski · Ping Li · Matthias Hein
- 2014 Poster: Asymmetric LSH (ALSH) for Sublinear Time Maximum Inner Product Search (MIPS)
  Anshumali Shrivastava · Ping Li
- 2014 Poster: Recovery of Coherent Data via Low-Rank Dictionary Pursuit
  Guangcan Liu · Ping Li
- 2014 Poster: Online Optimization for Max-Norm Regularization
  Jie Shen · Huan Xu · Ping Li
- 2014 Spotlight: Recovery of Coherent Data via Low-Rank Dictionary Pursuit
  Guangcan Liu · Ping Li
- 2014 Oral: Asymmetric LSH (ALSH) for Sublinear Time Maximum Inner Product Search (MIPS)
  Anshumali Shrivastava · Ping Li
- 2013 Poster: Sign Cauchy Projections and Chi-Square Kernel
  Ping Li · Gennady Samorodnitsky · John Hopcroft
- 2012 Poster: Entropy Estimations Using Correlated Symmetric Stable Random Projections
  Ping Li · Cun-Hui Zhang
- 2012 Poster: One Permutation Hashing
  Ping Li · Art B Owen · Cun-Hui Zhang
- 2011 Poster: Hashing Algorithms for Large-Scale Learning
  Ping Li · Anshumali Shrivastava · Joshua L Moore · Arnd C König
- 2010 Spotlight: b-Bit Minwise Hashing for Estimating Three-Way Similarities
  Ping Li · Arnd C König · Wenhao Gui
- 2010 Poster: b-Bit Minwise Hashing for Estimating Three-Way Similarities
  Ping Li · Arnd C König · Wenhao Gui
- 2008 Poster: One sketch for all: Theory and Application of Conditional Random Sampling
  Ping Li · Kenneth W Church · Trevor Hastie
- 2008 Spotlight: One sketch for all: Theory and Application of Conditional Random Sampling
  Ping Li · Kenneth W Church · Trevor Hastie
- 2007 Spotlight: McRank: Learning to Rank Using Multiple Classification and Gradient Boosting
  Ping Li · Chris J Burges · Qiang Wu
- 2007 Poster: McRank: Learning to Rank Using Multiple Classification and Gradient Boosting
  Ping Li · Chris J Burges · Qiang Wu
- 2007 Poster: A Unified Near-Optimal Estimator For Dimension Reduction in $l_\alpha$ ($0<\alpha\leq 2$) Using Sta
  Ping Li · Trevor Hastie
- 2006 Poster: Conditional Random Sampling: A Sketch-based Sampling Technique for Sparse Data
  Ping Li · Kenneth W Church · Trevor Hastie