Timezone: »
Utilizing the structure of a probabilistic model can significantly increase its learning speed. Motivated by several recent applications, in particular bigram models in language processing, we consider learning low-rank conditional probability matrices under expected KL-risk. This choice makes smoothing, that is the careful handling of low-probability elements, paramount. We derive an iterative algorithm that extends classical non-negative matrix factorization to naturally incorporate additive smoothing and prove that it converges to the stationary points of a penalized empirical risk. We then derive sample-complexity bounds for the global minimizer of the penalized risk and show that it is within a small factor of the optimal sample complexity. This framework generalizes to more sophisticated smoothing techniques, including absolute-discounting.
Author Information
Moein Falahatgar (UCSD)
Mesrob Ohannessian (University of Illinois at Chicago)
Alon Orlitsky (University of California, San Diego)
More from the Same Authors
-
2020 Poster: Linear-Sample Learning of Low-Rank Distributions »
Ayush Jain · Alon Orlitsky -
2020 Poster: A General Method for Robust Learning from Batches »
Ayush Jain · Alon Orlitsky -
2020 Poster: SURF: A Simple, Universal, Robust, Fast Distribution Learning Algorithm »
Yi Hao · Ayush Jain · Alon Orlitsky · Vaishakh Ravindrakumar -
2020 Poster: Profile Entropy: A Fundamental Measure for the Learnability and Compressibility of Distributions »
Yi Hao · Alon Orlitsky -
2019 Poster: Unified Sample-Optimal Property Estimation in Near-Linear Time »
Yi Hao · Alon Orlitsky -
2019 Poster: The Broad Optimality of Profile Maximum Likelihood »
Yi Hao · Alon Orlitsky -
2019 Spotlight: The Broad Optimality of Profile Maximum Likelihood »
Yi Hao · Alon Orlitsky -
2018 Poster: On Learning Markov Chains »
Yi Hao · Alon Orlitsky · Venkatadheeraj Pichapati -
2018 Poster: Data Amplification: A Unified and Competitive Approach to Property Estimation »
Yi Hao · Alon Orlitsky · Ananda Theertha Suresh · Yihong Wu -
2017 Poster: The power of absolute discounting: all-dimensional distribution estimation »
Moein Falahatgar · Mesrob Ohannessian · Alon Orlitsky · Venkatadheeraj Pichapati -
2017 Poster: Maxing and Ranking with Few Assumptions »
Moein Falahatgar · Yi Hao · Alon Orlitsky · Venkatadheeraj Pichapati · Vaishakh Ravindrakumar -
2015 Poster: Competitive Distribution Estimation: Why is Good-Turing Good »
Alon Orlitsky · Ananda Theertha Suresh -
2015 Oral: Competitive Distribution Estimation: Why is Good-Turing Good »
Alon Orlitsky · Ananda Theertha Suresh -
2014 Poster: Near-Optimal-Sample Estimators for Spherical Gaussian Mixtures »
Ananda Theertha Suresh · Alon Orlitsky · Jayadev Acharya · Ashkan Jafarpour -
2012 Poster: Tight Bounds on Redundancy and Distinguishability of Label-Invariant Distributions »
Jayadev Acharya · Hirakendu Das · Alon Orlitsky