Learning discrete distributions with infinite support
Doron Cohen · Aryeh Kontorovich · Geoffrey Wolfer

Tue Dec 08 09:00 AM -- 11:00 AM (PST) @ Poster Session 1 #538

We present a novel approach to estimating discrete distributions with (potentially) infinite support in the total variation metric. In a departure from the established paradigm, we make no structural assumptions whatsoever on the sampling distribution. In such a setting, distribution-free risk bounds are impossible, and the best one could hope for is a fully empirical data-dependent bound. We derive precisely such bounds, and demonstrate that these are, in a well-defined sense, the best possible. Our main discovery is that the half-norm of the empirical distribution provides tight upper and lower estimates on the empirical risk. Furthermore, this quantity decays at a nearly optimal rate as a function of the true distribution. The optimality follows from a minimax result, of possible independent interest. Additional structural results are provided, including an exact Rademacher complexity calculation and apparently a first connection between the total variation risk and the missing mass.
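As an illustration of the central quantity, the sketch below computes the half-norm of an empirical distribution, taking the standard definition ‖p‖_{1/2} = (Σ_i √p_i)², together with the total variation distance TV(p, q) = ½ Σ_i |p_i − q_i|. The function names are hypothetical and chosen for this example; the exact constants and confidence terms of the bounds are not stated in the abstract and are not reproduced here.

```python
from collections import Counter
import math


def empirical_distribution(samples):
    """Map each observed symbol to its relative frequency in the sample."""
    n = len(samples)
    if n == 0:
        raise ValueError("need at least one sample")
    counts = Counter(samples)
    return {symbol: count / n for symbol, count in counts.items()}


def half_norm(dist):
    """Half-norm (sum_i sqrt(p_i))^2 of a distribution given as a dict."""
    return sum(math.sqrt(p) for p in dist.values()) ** 2


def total_variation(p, q):
    """Total variation distance: half the l1 distance over the joint support."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support)


if __name__ == "__main__":
    # Hypothetical sample; only the observed symbols matter, so the
    # underlying support may be arbitrarily large or infinite.
    samples = ["a", "b", "a", "c", "a", "b", "d", "a"]
    p_hat = empirical_distribution(samples)
    print("empirical distribution:", p_hat)
    print("half-norm:", half_norm(p_hat))
```

The half-norm of a uniform distribution on k atoms equals k, so it can be read as a soft measure of how many atoms the sample effectively occupies. Because it depends only on the observed counts, it can be computed without any knowledge of the true support.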

Author Information

Doron Cohen (Ben-Gurion University of the Negev)
Aryeh Kontorovich (Ben-Gurion University)
Geoffrey Wolfer (Ben-Gurion University of the Negev)

More from the Same Authors