
Tight Bounds on Redundancy and Distinguishability of Label-Invariant Distributions
Jayadev Acharya · Hirakendu Das · Alon Orlitsky

Mon Dec 03 07:00 PM -- 12:00 AM (PST) @ Harrah’s Special Events Center 2nd Floor
The minimax KL-divergence of any distribution from all distributions in a given collection has several practical implications. In compression, it is the least additional number of bits over the entropy needed in the worst case to encode the output of a distribution in the collection. In online estimation and learning, it is the lowest expected log-loss regret when guessing a sequence of random values. In hypothesis testing, it upper bounds the largest number of distinguishable distributions in the collection. Motivated by problems ranging from population estimation to text classification and speech recognition, several machine-learning and information-theory researchers have recently considered label-invariant distributions and properties of i.i.d.-drawn samples. Using techniques that reveal and exploit the structure of these distributions, we improve on a sequence of previous works and show that the minimax KL-divergence of the collection of label-invariant distributions over length-$n$ i.i.d. sequences is between $0.3\cdot n^{1/3}$ and $n^{1/3}\log^2n$.
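For readers less familiar with the quantity the abstract bounds, the minimax KL-divergence (worst-case redundancy) of a collection of distributions is standardly defined as below; the notation here is illustrative and not taken from the paper itself.

```latex
% Minimax (worst-case) redundancy of a collection \mathcal{P}:
% the best single coding distribution Q, measured against the worst
% distribution P in the collection.
\bar{R}(\mathcal{P}) \;=\; \inf_{Q}\, \sup_{P \in \mathcal{P}} D(P \,\|\, Q),
\qquad\text{where}\qquad
D(P \,\|\, Q) \;=\; \sum_{x} P(x)\,\log \frac{P(x)}{Q(x)}.
```

The paper's result sandwiches this quantity, for the collection of label-invariant distributions over length-$n$ i.i.d. sequences, between $0.3\cdot n^{1/3}$ and $n^{1/3}\log^2 n$.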

Author Information

Jayadev Acharya (Cornell University)
Hirakendu Das (University of California, San Diego)
Alon Orlitsky (University of California, San Diego)