Random projections have been increasingly adopted for a diverse set of tasks in machine learning involving dimensionality reduction. One specific line of research on this topic has investigated the use of quantization subsequent to projection with the aim of additional data compression. Motivated by applications in nearest neighbor search and linear learning, we revisit the problem of recovering inner products (respectively cosine similarities) in such a setting. We show that even under coarse scalar quantization with 3 to 5 bits per projection, the loss in accuracy tends to range from "negligible" to "moderate". One implication is that in most scenarios of practical interest, there is no need for a sophisticated recovery approach like maximum likelihood estimation as considered in previous work on the subject. The approach we propose also yields considerable improvements in accuracy over the Hamming distance-based approach of Li et al. (ICML 2014), which is comparable in terms of simplicity.
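The pipeline described in the abstract can be sketched numerically. The snippet below is a minimal illustration, not the estimators analyzed in the paper: the uniform midpoint quantizer, the clipping range of ±3, and the plain cosine estimate on the quantized codes are all illustrative assumptions. It shows that a naive cosine estimate computed from coarsely quantized Gaussian projections stays close to the true similarity.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 1000, 10000  # original dimension, number of random projections
rho = 0.7           # target cosine similarity between the two vectors

# Build two unit-norm vectors whose cosine similarity is exactly rho.
x = rng.standard_normal(d)
x /= np.linalg.norm(x)
z = rng.standard_normal(d)
z -= (z @ x) * x                 # make z orthogonal to x
z /= np.linalg.norm(z)
y = rho * x + np.sqrt(1.0 - rho**2) * z

# Gaussian random projection: since x and y have unit norm, each
# coordinate of A @ x (and A @ y) is marginally N(0, 1).
A = rng.standard_normal((k, d))
px, py = A @ x, A @ y

def quantize(v, bits=4, clip=3.0):
    """Coarse uniform scalar quantizer with midpoint reconstruction,
    clipped to [-clip, clip] (an illustrative choice, not the paper's)."""
    levels = 2 ** bits
    step = 2.0 * clip / levels
    idx = np.clip(np.floor((v + clip) / step), 0, levels - 1)
    return (idx + 0.5) * step - clip

qx, qy = quantize(px), quantize(py)

# Plain cosine estimate computed directly on the quantized codes.
est = (qx @ qy) / (np.linalg.norm(qx) * np.linalg.norm(qy))
print(f"true cosine {rho:.3f}  vs  4-bit estimate {est:.3f}")
```

With 4 bits per projection the estimate typically lands within a few hundredths of the true cosine similarity, consistent with the abstract's claim that the accuracy loss under 3 to 5 bits is small.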
Author Information
Ping Li (Rutgers University)
Martin Slawski
More from the Same Authors
- 2017 Poster: Partial Hard Thresholding: Towards A Principled Analysis of Support Recovery »
  Jie Shen · Ping Li
- 2016 Poster: Exact Recovery of Hard Thresholding Pursuit »
  Xiaotong Yuan · Ping Li · Tong Zhang
- 2016 Poster: Learning Additive Exponential Family Graphical Models via $\ell_{2,1}$-norm Regularized M-Estimation »
  Xiaotong Yuan · Ping Li · Tong Zhang · Qingshan Liu · Guangcan Liu
- 2016 Poster: Quantized Random Projections and Non-Linear Estimation of Cosine Similarity »
  Ping Li · Michael Mitzenmacher · Martin Slawski
- 2015 Poster: b-bit Marginal Regression »
  Martin Slawski · Ping Li
- 2015 Spotlight: b-bit Marginal Regression »
  Martin Slawski · Ping Li
- 2015 Poster: Regularization-Free Estimation in Trace Regression with Symmetric Positive Semidefinite Matrices »
  Martin Slawski · Ping Li · Matthias Hein
- 2014 Poster: Asymmetric LSH (ALSH) for Sublinear Time Maximum Inner Product Search (MIPS) »
  Anshumali Shrivastava · Ping Li
- 2014 Poster: Recovery of Coherent Data via Low-Rank Dictionary Pursuit »
  Guangcan Liu · Ping Li
- 2014 Poster: Online Optimization for Max-Norm Regularization »
  Jie Shen · Huan Xu · Ping Li
- 2014 Spotlight: Recovery of Coherent Data via Low-Rank Dictionary Pursuit »
  Guangcan Liu · Ping Li
- 2014 Oral: Asymmetric LSH (ALSH) for Sublinear Time Maximum Inner Product Search (MIPS) »
  Anshumali Shrivastava · Ping Li
- 2013 Poster: Beyond Pairwise: Provably Fast Algorithms for Approximate $k$-Way Similarity Search »
  Anshumali Shrivastava · Ping Li
- 2013 Poster: Sign Cauchy Projections and Chi-Square Kernel »
  Ping Li · Gennady Samorodnitsky · John Hopcroft
- 2012 Poster: Entropy Estimations Using Correlated Symmetric Stable Random Projections »
  Ping Li · Cun-Hui Zhang
- 2012 Poster: One Permutation Hashing »
  Ping Li · Art B Owen · Cun-Hui Zhang
- 2011 Poster: Hashing Algorithms for Large-Scale Learning »
  Ping Li · Anshumali Shrivastava · Joshua L Moore · Arnd C König
- 2010 Spotlight: b-Bit Minwise Hashing for Estimating Three-Way Similarities »
  Ping Li · Arnd C König · Wenhao Gui
- 2010 Poster: b-Bit Minwise Hashing for Estimating Three-Way Similarities »
  Ping Li · Arnd C König · Wenhao Gui
- 2008 Poster: One sketch for all: Theory and Application of Conditional Random Sampling »
  Ping Li · Kenneth W Church · Trevor Hastie
- 2008 Spotlight: One sketch for all: Theory and Application of Conditional Random Sampling »
  Ping Li · Kenneth W Church · Trevor Hastie
- 2007 Spotlight: McRank: Learning to Rank Using Multiple Classification and Gradient Boosting »
  Ping Li · Chris J Burges · Qiang Wu
- 2007 Poster: McRank: Learning to Rank Using Multiple Classification and Gradient Boosting »
  Ping Li · Chris J Burges · Qiang Wu
- 2007 Poster: A Unified Near-Optimal Estimator For Dimension Reduction in $l_\alpha$ ($0<\alpha\leq 2$) Using Sta »
  Ping Li · Trevor Hastie
- 2006 Poster: Conditional Random Sampling: A Sketch-based Sampling Technique for Sparse Data »
  Ping Li · Kenneth W Church · Trevor Hastie