Latent variable mixture models are a powerful tool for exploring the structure in large datasets. A common challenge in interpreting such models is the desire to impose sparsity, the natural assumption that each data point contains only a few latent features. Since mixture distributions are constrained in their L1 norm, typical sparsity techniques based on L1 regularization become toothless, and concave regularization becomes necessary. Unfortunately, concave regularization typically results in EM algorithms that must perform problematic non-concave M-step maximizations. In this work, we introduce a technique for circumventing this difficulty, using the so-called Mountain Pass Theorem to provide easily verifiable conditions under which the M-step is well-behaved despite the lack of concavity. We also develop a correspondence between logarithmic regularization and what we term the pseudo-Dirichlet distribution, a generalization of the ordinary Dirichlet distribution that is well suited to inducing sparsity. We demonstrate our approach on a text corpus, inferring a sparse topic mixture model for 2,406 weblogs.
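The claim that L1 regularization becomes toothless on mixture weights can be illustrated with a small sketch. Every vector of mixture weights lies on the probability simplex, so its L1 norm is always 1 and an L1 penalty cannot tell a dense mixture from a sparse one. The logarithmic penalty used here, a sum of log(theta_k + eps), is an assumed illustrative form and not necessarily the exact regularizer of the paper. The function names, the value of `eps`, and the two example vectors are likewise hypothetical.

```python
import math

def l1_penalty(theta):
    # L1 norm of the mixture weights; equals 1 for any point on the simplex.
    return sum(abs(t) for t in theta)

def log_penalty(theta, eps=1e-2):
    # Assumed illustrative concave penalty: sum_k log(theta_k + eps).
    # Lower values indicate weight concentrated on fewer components.
    return sum(math.log(t + eps) for t in theta)

# Two hypothetical mixture-weight vectors over four latent features.
dense = [0.25, 0.25, 0.25, 0.25]
sparse = [0.97, 0.01, 0.01, 0.01]

l1_dense, l1_sparse = l1_penalty(dense), l1_penalty(sparse)
log_dense, log_sparse = log_penalty(dense), log_penalty(sparse)

print("L1 penalty  dense / sparse:", l1_dense, l1_sparse)
print("log penalty dense / sparse:", log_dense, log_sparse)
```

Under these assumptions the L1 penalty is the same for both vectors up to floating-point rounding, while the logarithmic penalty is lower for the sparse vector. Minimizing a concave penalty of this kind therefore rewards sparsity, at the cost of the non-concave M-step objective the abstract describes.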
Author Information
Martin O Larsson (Operations Research and Information Engineering / Cornell University)
Johan Ugander (Stanford University)
More from the Same Authors
-
2021 : Designing Defaults for School Choice »
Amel Awadelkarim · Johan Ugander · Itai Ashlagi · Irene Lo -
2023 Poster: Counterfactual Evaluation of Peer-Review Assignment Strategies »
Martin Saveski · Steven Jecmen · Nihar Shah · Johan Ugander -
2021 : Choices and Rankings with Irrelevant Alternatives »
Johan Ugander -
2021 : Keynote speakers Q&A »
Sarit Kraus · Drew Fudenberg · Duncan J Watts · Colin Camerer · Johan Ugander · Emma Pierson -
2020 Poster: Learning Rich Rankings »
Arjun Seshadri · Stephen Ragain · Johan Ugander -
2016 Poster: Pairwise Choice Markov Chains »
Stephen Ragain · Johan Ugander -
2015 Workshop: Networks in the Social and Information Sciences »
Edo M Airoldi · David S Choi · Aaron Clauset · Johan Ugander · Panagiotis Toulis -
2014 Workshop: Networks: From Graphs to Rich Data »
Edo M Airoldi · Aaron Clauset · Johan Ugander · David S Choi · Leto Peel