Conic Scan-and-Cover algorithms for nonparametric topic modeling
Mikhail Yurochkin · Aritra Guha · XuanLong Nguyen

Wed Dec 06 06:30 PM -- 10:30 PM (PST) @ Pacific Ballroom #189

We propose new algorithms for topic modeling when the number of topics is unknown. Our approach relies on an analysis of the concentration of mass and angular geometry of the topic simplex, a convex polytope constructed by taking the convex hull of vertices representing the latent topics. In practice, our algorithms achieve topic-estimation accuracy comparable to that of a Gibbs sampler, which requires the number of topics to be given. Moreover, they are among the fastest of several state-of-the-art parametric techniques. Statistical consistency of our estimator is established under some conditions.
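To make the geometric picture concrete, here is a minimal sketch of the topic simplex itself. It is not the authors' algorithm. It only illustrates the definition used in the abstract: each topic is a probability vector over the vocabulary (a vertex), and a document's word distribution is a convex combination of those vertices, so it lies inside their convex hull. The toy vocabulary size, the three topic vectors, and the helper names `sample_weights` and `document_distribution` are all hypothetical, chosen for illustration.

```python
import random

# Hypothetical toy setup: 3 topics over a 4-word vocabulary.
# Each topic is a probability vector, i.e. a vertex of the topic simplex.
TOPICS = [
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.7, 0.1, 0.1],
    [0.1, 0.1, 0.7, 0.1],
]


def sample_weights(k, rng):
    """Draw mixing weights uniformly from the simplex (Dirichlet(1, ..., 1)).

    Normalizing k independent Exponential(1) draws gives this distribution.
    """
    draws = [rng.expovariate(1.0) for _ in range(k)]
    total = sum(draws)
    return [d / total for d in draws]


def document_distribution(weights, topics):
    """Convex combination of topic vertices: a point inside the topic simplex."""
    vocab_size = len(topics[0])
    return [
        sum(w * topic[v] for w, topic in zip(weights, topics))
        for v in range(vocab_size)
    ]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
weights = sample_weights(len(TOPICS), rng)
doc = document_distribution(weights, TOPICS)
```

Because the weights are non-negative and sum to one, `doc` is itself a probability vector, and each of its coordinates lies between the smallest and largest value of that coordinate across the topics.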

Author Information

Mikhail Yurochkin (IBM Research AI)

I am a Research Staff Member at IBM Research and MIT-IBM Watson AI Lab in Cambridge, Massachusetts. My research interests are algorithmic fairness, out-of-distribution generalization, applications of optimal transport in machine learning, and model fusion and federated learning. Before joining IBM, I completed my PhD in Statistics at the University of Michigan, where I worked with Long Nguyen. I received my Bachelor's degree in applied mathematics and physics from Moscow Institute of Physics and Technology.

Aritra Guha (University of Michigan)
XuanLong Nguyen (University of Michigan)

More from the Same Authors