Poster
Competitive Distribution Estimation: Why is GoodTuring Good
Alon Orlitsky · Ananda Theertha Suresh
Estimating distributions over large alphabets is a fundamental machinelearning tenet. Yet no method is known to estimate all distributions well. For example, addconstant estimators are nearly minmax optimal but often perform poorly in practice, and practical estimators such as absolute discounting, JelinekMercer, and GoodTuring are not known to be near optimal for essentially any distribution.We describe the first universally nearoptimal probability estimators. For every discrete distribution, they are provably nearly the best in the following two competitive ways. First they estimate every distribution nearly as well as the best estimator designed with prior knowledge of the distribution up to a permutation. Second, they estimate every distribution nearly as well as the best estimator designed with prior knowledge of the exact distribution, but as all natural estimators, restricted to assign the same probability to all symbols appearing the same number of times.Specifically, for distributions over $k$ symbols and $n$ samples, we show that for both comparisons, a simple variant of GoodTuring estimator is always within KL divergence of $(3+o(1))/n^{1/3}$ from the best estimator, and that a more involved estimator is within $\tilde{\mathcal{O}}(\min(k/n,1/\sqrt n))$. Conversely, we show that any estimator must have a KL divergence $\ge\tilde\Omega(\min(k/n,1/ n^{2/3}))$ over the best estimator for the first comparison, and $\ge\tilde\Omega(\min(k/n,1/\sqrt{n}))$ for the second.
Author Information
Alon Orlitsky (University of California, San Diego)
Ananda Theertha Suresh (UCSD)
More from the Same Authors

2019 Poster: Unified SampleOptimal Property Estimation in NearLinear Time »
Yi Hao · Alon Orlitsky 
2019 Poster: The Broad Optimality of Profile Maximum Likelihood »
Yi Hao · Alon Orlitsky 
2019 Spotlight: The Broad Optimality of Profile Maximum Likelihood »
Yi Hao · Alon Orlitsky 
2018 Poster: On Learning Markov Chains »
Yi Hao · Alon Orlitsky · Venkatadheeraj Pichapati 
2018 Poster: Data Amplification: A Unified and Competitive Approach to Property Estimation »
Yi Hao · Alon Orlitsky · Ananda Theertha Suresh · Yihong Wu 
2017 Poster: The power of absolute discounting: alldimensional distribution estimation »
Moein Falahatgar · Mesrob Ohannessian · Alon Orlitsky · Venkatadheeraj Pichapati 
2017 Poster: Maxing and Ranking with Few Assumptions »
Moein Falahatgar · Yi Hao · Alon Orlitsky · Venkatadheeraj Pichapati · Vaishakh Ravindrakumar 
2016 Poster: NearOptimal Smoothing of Structured Conditional Probability Matrices »
Moein Falahatgar · Mesrob Ohannessian · Alon Orlitsky 
2015 Oral: Competitive Distribution Estimation: Why is GoodTuring Good »
Alon Orlitsky · Ananda Theertha Suresh 
2014 Poster: NearOptimalSample Estimators for Spherical Gaussian Mixtures »
Ananda Theertha Suresh · Alon Orlitsky · Jayadev Acharya · Ashkan Jafarpour 
2012 Poster: Tight Bounds on Redundancy and Distinguishability of LabelInvariant Distributions »
Jayadev Acharya · Hirakendu Das · Alon Orlitsky