Timezone: »

 
Poster
SURF: A Simple, Universal, Robust, Fast Distribution Learning Algorithm
Yi Hao · Ayush Jain · Alon Orlitsky · Vaishakh Ravindrakumar

Wed Dec 09 09:00 PM -- 11:00 PM (PST) @ Poster Session 4 #1257
Sample- and computationally-efficient distribution estimation is a fundamental tenet in statistics and machine learning. We present $\SURF$, an algorithm for approximating distributions by piecewise polynomials. $\SURF$ is: simple, replacing prior complex optimization techniques by straight-forward empirical probability approximation of each potential polynomial piece through simple empirical-probability interpolation, and using plain divide-and-conquer to merge the pieces; universal, as well-known polynomial-approximation results imply that it accurately approximates a large class of common distributions; robust to distribution mis-specification as for any degree $d \le 8$, it estimates any distribution to an $\ell_1$ distance $< 3$ times that of the nearest degree-$d$ piecewise polynomial, improving known factor upper bounds of 3 for single polynomials and 15 for polynomials with arbitrarily many pieces; fast, using optimal sample complexity, running in near sample-linear time, and if given sorted samples it may be parallelized to run in sub-linear time. In experiments, $\SURF$ outperforms state-of-the art algorithms.

Author Information

Yi Hao (University of California, San Diego)

Fifth-year Ph.D. student supervised by Prof. Alon Orlitsky at UC San Diego. Broadly interested in Machine Learning, Learning Theory, Algorithm Design, Symbolic and Numerical Optimization. Seeking a summer 2020 internship in Data Science and Machine Learning.

Ayush Jain (UC San Diego)
Alon Orlitsky (University of California, San Diego)
Vaishakh Ravindrakumar (UC San Diego)

More from the Same Authors