Its remarkably easy implementation and guaranteed convergence have made the EM algorithm one of the most widely used algorithms for mixture modeling. On the downside, its E-step is linear in both the sample size and the number of mixture components, making it impractical for large-scale data. Based on the variational EM framework, we propose a fast alternative that uses component-specific data partitions to obtain an E-step that is sub-linear in the sample size, while still maintaining provable convergence. Our approach builds on previous work but is significantly faster and scales much better in the number of mixture components. We demonstrate this speedup in experiments on large-scale synthetic and real data.
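To make the cost the abstract refers to concrete, the following is a minimal sketch (not the authors' implementation) of the standard E-step for a spherical Gaussian mixture. Each of the N points must be scored against each of the K components, giving the O(NK) cost per iteration that the paper's partition-based variational E-step is designed to reduce; the function name and spherical-covariance assumption are illustrative choices, not from the paper.

```python
import numpy as np

def e_step(X, means, variances, weights):
    """Standard E-step for a spherical Gaussian mixture.

    Computes the responsibility of each of K components for each of
    N data points -- an O(N * K) operation, which is the bottleneck
    a sub-linear variational E-step aims to avoid.
    """
    N, D = X.shape
    K = means.shape[0]
    log_resp = np.empty((N, K))
    for k in range(K):                            # K passes over all N points
        diff = X - means[k]                       # (N, D)
        sq_dist = np.sum(diff * diff, axis=1)     # (N,)
        log_resp[:, k] = (np.log(weights[k])
                          - 0.5 * D * np.log(2 * np.pi * variances[k])
                          - 0.5 * sq_dist / variances[k])
    # Normalize in log space for numerical stability.
    log_resp -= log_resp.max(axis=1, keepdims=True)
    resp = np.exp(log_resp)
    resp /= resp.sum(axis=1, keepdims=True)
    return resp
```

The component-specific data partitions proposed in the paper would, roughly speaking, let each component score blocks of similar points at once instead of every point individually, so the inner loop no longer touches all N points.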
Author Information
Bo Thiesson (Microsoft Research)
Chong Wang (Apple)
More from the Same Authors
- 2012 Poster: Truncation-free Online Variational Inference for Bayesian Nonparametric Models (Chong Wang · David Blei)
- 2009 Poster: Reading Tea Leaves: How Humans Interpret Topic Models (Jonathan Chang · Jordan Boyd-Graber · Sean Gerrish · Chong Wang · David Blei)
- 2009 Oral: Reading Tea Leaves: How Humans Interpret Topic Models (Jonathan Chang · Jordan Boyd-Graber · Sean Gerrish · Chong Wang · David Blei)
- 2009 Poster: Decoupling Sparsity and Smoothness in the Discrete Hierarchical Dirichlet Process (Chong Wang · David Blei)
- 2009 Spotlight: Decoupling Sparsity and Smoothness in the Discrete Hierarchical Dirichlet Process (Chong Wang · David Blei)
- 2009 Poster: Variational Inference for the Nested Chinese Restaurant Process (Chong Wang · David Blei)