We propose new algorithms for topic modeling when the number of topics is unknown. Our approach relies on an analysis of the concentration of mass and angular geometry of the topic simplex, a convex polytope constructed by taking the convex hull of vertices representing the latent topics. In practice, our algorithms achieve topic-estimation accuracy comparable to that of a Gibbs sampler, which requires the number of topics to be given. Moreover, they are among the fastest of several state-of-the-art parametric techniques. Statistical consistency of our estimator is established under some conditions.
Author Information
Mikhail Yurochkin (IBM Research AI)
I am a Research Staff Member at IBM Research and the MIT-IBM Watson AI Lab in Cambridge, Massachusetts. My research interests are algorithmic fairness, out-of-distribution generalization, applications of optimal transport in machine learning, and model fusion and federated learning. Before joining IBM, I completed my PhD in Statistics at the University of Michigan, where I worked with Long Nguyen. I received my Bachelor's degree in applied mathematics and physics from the Moscow Institute of Physics and Technology.
Aritra Guha (University of Michigan)
XuanLong Nguyen (University of Michigan)
More from the Same Authors
- 2021: Measuring the sensitivity of Gaussian processes to kernel choice
  Will Stephenson · Soumya Ghosh · Tin Nguyen · Mikhail Yurochkin · Sameer Deshpande · Tamara Broderick
- 2022: Towards Algorithmic Fairness in Space-Time: Filling in Black Holes
  Cheryl Flynn · Aritra Guha · Subhabrata Majumdar · Divesh Srivastava · Zhengyi Zhou
- 2022 Poster: Domain Adaptation meets Individual Fairness. And they get along.
  Debarghya Mukherjee · Felix Petersen · Mikhail Yurochkin · Yuekai Sun
- 2022 Poster: Beyond black box densities: Parameter learning for the deviated components
  Dat Do · Nhat Ho · XuanLong Nguyen
- 2022 Poster: Calibrated Data-Dependent Constraints with Exact Satisfaction Guarantees
  Songkai Xue · Yuekai Sun · Mikhail Yurochkin
- 2021 Poster: Does enforcing fairness mitigate biases caused by subpopulation shift?
  Subha Maity · Debarghya Mukherjee · Mikhail Yurochkin · Yuekai Sun
- 2021 Poster: Post-processing for Individual Fairness
  Felix Petersen · Debarghya Mukherjee · Yuekai Sun · Mikhail Yurochkin
- 2021 Poster: On sensitivity of meta-learning to support data
  Mayank Agarwal · Mikhail Yurochkin · Yuekai Sun
- 2020 Poster: Continuous Regularized Wasserstein Barycenters
  Lingxiao Li · Aude Genevay · Mikhail Yurochkin · Justin Solomon
- 2020 Demonstration: IBM Federated Learning Community Edition: An Interactive Demonstration
  Laura Wynter · Chaitanya Kumar · Pengqian Yu · Mikhail Yurochkin · Amogh Tarcar
- 2019 Poster: Alleviating Label Switching with Optimal Transport
  Pierre Monteiller · Sebastian Claici · Edward Chien · Farzaneh Mirzazadeh · Justin Solomon · Mikhail Yurochkin
- 2019 Poster: Hierarchical Optimal Transport for Document Representation
  Mikhail Yurochkin · Sebastian Claici · Edward Chien · Farzaneh Mirzazadeh · Justin Solomon
- 2019 Poster: Scalable inference of topic evolution via models for latent geometric structures
  Mikhail Yurochkin · Zhiwei Fan · Aritra Guha · Paraschos Koutris · XuanLong Nguyen
- 2019 Poster: Statistical Model Aggregation via Parameter Matching
  Mikhail Yurochkin · Mayank Agarwal · Soumya Ghosh · Kristjan Greenewald · Nghia Hoang
- 2017 Poster: Multi-way Interacting Regression via Factorization Machines
  Mikhail Yurochkin · XuanLong Nguyen · Nikolaos Vasiloglou
- 2016 Poster: Geometric Dirichlet Means Algorithm for topic inference
  Mikhail Yurochkin · XuanLong Nguyen
- 2014 Poster: Parallel Feature Selection Inspired by Group Testing
  Yingbo Zhou · Utkarsh Porwal · Ce Zhang · Hung Q Ngo · XuanLong Nguyen · Christopher Ré · Venu Govindaraju
- 2013 Poster: Bayesian inference as iterated random functions with applications to sequential inference in graphical models
  Arash Amini · XuanLong Nguyen
- 2013 Spotlight: Bayesian inference as iterated random functions with applications to sequential inference in graphical models
  Arash Amini · XuanLong Nguyen
- 2007 Spotlight: Estimating divergence functionals and the likelihood ratio by penalized convex risk minimization
  XuanLong Nguyen · Martin J Wainwright · Michael Jordan
- 2007 Poster: Estimating divergence functionals and the likelihood ratio by penalized convex risk minimization
  XuanLong Nguyen · Martin J Wainwright · Michael Jordan
- 2006 Poster: Distributed PCA and Network Anomaly Detection
  Ling Huang · XuanLong Nguyen · Minos Garofalakis · Michael Jordan · Anthony D Joseph · Nina Taft