Poster
Linear-Sample Learning of Low-Rank Distributions
Ayush Jain · Alon Orlitsky
Many latent-variable applications, including community detection, collaborative filtering, genomic analysis, and NLP, model data as generated by low-rank matrices. Yet despite considerable research, except for very special cases, the number of samples required to efficiently recover the underlying matrices has not been known.
We determine the onset of learning in several common latent-variable settings. For all of them, we show that learning $k\times k$, rank-$r$ matrices to normalized $L_1$ distance $\epsilon$ requires $\Omega(\frac{kr}{\epsilon^2})$ samples, and we propose an algorithm that uses ${\cal O}(\frac{kr}{\epsilon^2}\log^2\frac r\epsilon)$ samples, a number linear in the high dimension $k$ and nearly linear in the typically low rank $r$.
The algorithm improves on existing spectral techniques and runs in polynomial time. The proofs establish new results on the rapid convergence of the spectral distance between the model and observation matrices, and may be of independent interest.
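To give a feel for the scale of the bounds stated in the abstract, the following minimal sketch evaluates $\frac{kr}{\epsilon^2}$ and $\frac{kr}{\epsilon^2}\log^2\frac r\epsilon$ for concrete values. The hidden constants are not given in the abstract, so the constant `c = 1` and the use of the natural logarithm are assumptions made purely for illustration, as are the function names. The output shows orders of magnitude, not actual sample counts.

```python
import math


def lower_bound_samples(k: int, r: int, eps: float, c: float = 1.0) -> float:
    """Evaluate c * k * r / eps^2, the form of the stated lower bound.

    The constant c is an assumption; the abstract only gives the Omega form.
    """
    return c * k * r / eps**2


def upper_bound_samples(k: int, r: int, eps: float, c: float = 1.0) -> float:
    """Evaluate c * (k * r / eps^2) * log^2(r / eps), the form of the stated upper bound.

    The constant c and the natural-log base are assumptions for illustration.
    """
    return c * k * r / eps**2 * math.log(r / eps) ** 2


if __name__ == "__main__":
    k, r, eps = 1000, 5, 0.1  # hypothetical dimension, rank, and accuracy
    lo = lower_bound_samples(k, r, eps)
    hi = upper_bound_samples(k, r, eps)
    print(f"lower-bound form: {lo:.3g}")
    print(f"upper-bound form: {hi:.3g}")
    # The gap between the two forms is exactly the log^2(r / eps) factor.
    print(f"ratio: {hi / lo:.3g}")
```

With constants set to 1, the two expressions differ only by the factor $\log^2\frac r\epsilon$. That factor depends on the rank and accuracy but not on the dimension $k$.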
Author Information
Ayush Jain (UC San Diego)
Alon Orlitsky (University of California, San Diego)
More from the Same Authors
- 2020 Poster: A General Method for Robust Learning from Batches
  Ayush Jain · Alon Orlitsky
- 2020 Poster: SURF: A Simple, Universal, Robust, Fast Distribution Learning Algorithm
  Yi Hao · Ayush Jain · Alon Orlitsky · Vaishakh Ravindrakumar
- 2020 Poster: Profile Entropy: A Fundamental Measure for the Learnability and Compressibility of Distributions
  Yi Hao · Alon Orlitsky
- 2019 Poster: Unified Sample-Optimal Property Estimation in Near-Linear Time
  Yi Hao · Alon Orlitsky
- 2019 Poster: The Broad Optimality of Profile Maximum Likelihood
  Yi Hao · Alon Orlitsky
- 2019 Spotlight: The Broad Optimality of Profile Maximum Likelihood
  Yi Hao · Alon Orlitsky
- 2018 Poster: On Learning Markov Chains
  Yi Hao · Alon Orlitsky · Venkatadheeraj Pichapati
- 2018 Poster: Data Amplification: A Unified and Competitive Approach to Property Estimation
  Yi Hao · Alon Orlitsky · Ananda Theertha Suresh · Yihong Wu
- 2017 Poster: The power of absolute discounting: all-dimensional distribution estimation
  Moein Falahatgar · Mesrob Ohannessian · Alon Orlitsky · Venkatadheeraj Pichapati
- 2017 Poster: Maxing and Ranking with Few Assumptions
  Moein Falahatgar · Yi Hao · Alon Orlitsky · Venkatadheeraj Pichapati · Vaishakh Ravindrakumar
- 2016 Poster: Near-Optimal Smoothing of Structured Conditional Probability Matrices
  Moein Falahatgar · Mesrob Ohannessian · Alon Orlitsky
- 2015 Poster: Competitive Distribution Estimation: Why is Good-Turing Good
  Alon Orlitsky · Ananda Theertha Suresh
- 2015 Oral: Competitive Distribution Estimation: Why is Good-Turing Good
  Alon Orlitsky · Ananda Theertha Suresh
- 2014 Poster: Near-Optimal-Sample Estimators for Spherical Gaussian Mixtures
  Ananda Theertha Suresh · Alon Orlitsky · Jayadev Acharya · Ashkan Jafarpour
- 2012 Poster: Tight Bounds on Redundancy and Distinguishability of Label-Invariant Distributions
  Jayadev Acharya · Hirakendu Das · Alon Orlitsky