Online Bayesian Moment Matching for Topic Modeling with Unknown Number of Topics
Wei-Shou Hsu · Pascal Poupart

Wed Dec 07 09:00 AM -- 12:30 PM (PST) @ Area 5+6+7+8 #90

Latent Dirichlet Allocation (LDA) is a very popular model for topic modeling as well as many other problems with latent groups. It is both simple and effective. When the number of topics (or latent groups) is unknown, the Hierarchical Dirichlet Process (HDP) provides an elegant non-parametric extension; however, it is a complex model and it is difficult to incorporate prior knowledge since the distribution over topics is implicit. We propose two new models that extend LDA in a simple and intuitive fashion by directly expressing a distribution over the number of topics. We also propose a new online Bayesian moment matching technique to learn the parameters and the number of topics of those models based on streaming data. The approach achieves higher log-likelihood than batch and online HDP with fixed hyperparameters on several corpora.
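
For readers unfamiliar with moment matching, the sketch below illustrates the generic idea only. It is not the authors' algorithm. It assumes the exact posterior after an observation is a mixture of Dirichlets, and it projects that mixture back onto a single Dirichlet with the same mean and the same second moment in the first coordinate. The function name `match_dirichlet` and the choice of the first coordinate to set the precision are assumptions made for illustration.

```python
def match_dirichlet(weights, alphas):
    """Project a mixture of Dirichlets onto one Dirichlet by matching moments.

    weights: mixture weights that sum to 1.
    alphas:  one parameter vector per mixture component.
    Illustrative sketch; not the procedure from the paper.
    """
    dim = len(alphas[0])
    mean = [0.0] * dim      # E[theta_i] under the mixture
    second = [0.0] * dim    # E[theta_i^2] under the mixture
    for w, alpha in zip(weights, alphas):
        s = sum(alpha)
        for i, a in enumerate(alpha):
            mean[i] += w * a / s
            second[i] += w * a * (a + 1.0) / (s * (s + 1.0))
    # For Dir(alpha) with precision s = sum(alpha) and mean m:
    #   E[theta_i^2] = m_i * (m_i * s + 1) / (s + 1)
    # Solving for s with the first coordinate gives the line below.
    precision = (mean[0] - second[0]) / (second[0] - mean[0] ** 2)
    return [m * precision for m in mean]


# A single-component "mixture" is recovered unchanged (up to rounding).
single = match_dirichlet([1.0], [[2.0, 3.0, 5.0]])

# A two-component mixture collapses to one Dirichlet with the mixture mean.
merged = match_dirichlet([0.5, 0.5], [[2.0, 3.0, 5.0], [4.0, 3.0, 3.0]])
```

In an online setting, a projection of this kind would be applied after each observation so that the posterior stays in a fixed parametric family and the cost per update stays constant.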

Author Information

Wei-Shou Hsu (University of Waterloo)
Pascal Poupart (University of Waterloo & Vector Institute)
