Topic model evaluation, like evaluation of other unsupervised methods, can be contentious. However, the field has coalesced around automated estimates of topic coherence, which rely on the frequency of word co-occurrences in a reference corpus. Contemporary neural topic models surpass classical ones according to these metrics. At the same time, topic model evaluation suffers from a validation gap: automated coherence, developed for classical models, has not been validated using human experimentation for neural models. In addition, a meta-analysis of topic modeling literature reveals a substantial standardization gap in automated topic modeling benchmarks. To address the validation gap, we compare automated coherence with the two most widely accepted human judgment tasks: topic rating and word intrusion. To address the standardization gap, we systematically evaluate a dominant classical model and two state-of-the-art neural models on two commonly used datasets. Automated evaluations declare a winning model when corresponding human evaluations do not, calling into question the validity of fully automatic evaluations independent of human judgments.
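The abstract refers to automated coherence metrics that score a topic by how often its top words co-occur in a reference corpus. For context, below is a minimal sketch of one widely used variant: average pairwise normalized pointwise mutual information (NPMI) over a topic's top words, estimated from document-level co-occurrence counts. It is illustrative only, not the paper's evaluation code, and the names used (npmi_coherence, corpus) are hypothetical.

```python
# Illustrative sketch: NPMI coherence for one topic, estimated from
# document-level co-occurrence counts in a reference corpus.
from itertools import combinations
from math import log

def npmi_coherence(top_words, documents, eps=1e-12):
    """Average NPMI over all pairs of a topic's top words.

    top_words: the topic's highest-probability words (e.g., top 10)
    documents: list of token lists from the reference corpus
    """
    doc_sets = [set(doc) for doc in documents]
    n_docs = len(doc_sets)

    def p(*words):
        # Fraction of documents containing all of the given words.
        return sum(all(w in d for w in words) for d in doc_sets) / n_docs

    scores = []
    for w1, w2 in combinations(top_words, 2):
        p1, p2, p12 = p(w1), p(w2), p(w1, w2)
        if p12 == 0:
            scores.append(-1.0)  # words never co-occur: minimum NPMI
            continue
        pmi = log(p12 / (p1 * p2))
        scores.append(pmi / -log(p12 + eps))  # normalize to [-1, 1]
    return sum(scores) / len(scores)

# Toy example: a plausible "sports" topic scored against three short documents.
corpus = [
    "the game team season player coach".split(),
    "player scored goal in the game".split(),
    "market stocks fell as investors sold".split(),
]
print(npmi_coherence(["game", "player", "team"], corpus))
```

The paper's central finding is that higher scores from automated metrics of this kind do not necessarily agree with human judgments (topic rating and word intrusion), particularly for neural topic models.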
Author Information
Alexander Hoyle (University of Maryland, College Park)
Pranav Goel (University of Maryland, College Park)
Andrew Hian-Cheong (University of Maryland, College Park)
Denis Peskov (University of Maryland, College Park)
Jordan Boyd-Graber (University of Maryland)
Philip Resnik (University of Maryland)
Related Events (a corresponding poster, oral, or spotlight)
- 2021 Spotlight: Is Automated Topic Model Evaluation Broken? The Incoherence of Coherence
More from the Same Authors
- 2020: Showdown against trivia experts
  Jordan Boyd-Graber
- 2018 Poster: Multilingual Anchoring: Interactive Topic Modeling and Alignment Across Languages
  Michelle Yuan · Benjamin Van Durme · Jordan Boyd-Graber
- 2017: Competition V: Human-Computer Question Answering
  Jordan Boyd-Graber · Hal Daumé III · He He · Mohit Iyyer · Pedro Rodriguez
- 2015 Demonstration: Interactive Incremental Question Answering
  Jordan Boyd-Graber · Mohit Iyyer
- 2014 Poster: Learning a Concept Hierarchy from Multi-labeled Documents
  Viet-An Nguyen · Jordan Boyd-Graber · Philip Resnik · Jonathan Chang
- 2013 Workshop: Topic Models: Computation, Application, and Evaluation
  David Mimno · Amr Ahmed · Jordan Boyd-Graber · Ankur Moitra · Hanna Wallach · Alexander Smola · David Blei · Anima Anandkumar
- 2013 Poster: Binary to Bushy: Bayesian Hierarchical Clustering with the Beta Coalescent
  Yuening Hu · Jordan Boyd-Graber · Hal Daumé III · Z. Irene Ying
- 2013 Poster: Lexical and Hierarchical Topic Regression
  Viet-An Nguyen · Jordan Boyd-Graber · Philip Resnik
- 2009 Workshop: Applications for Topic Models: Text and Beyond
  David Blei · Jordan Boyd-Graber · Jonathan Chang · Katherine Heller · Hanna Wallach
- 2009 Poster: Reading Tea Leaves: How Humans Interpret Topic Models
  Jonathan Chang · Jordan Boyd-Graber · Sean Gerrish · Chong Wang · David Blei
- 2009 Oral: Reading Tea Leaves: How Humans Interpret Topic Models
  Jonathan Chang · Jordan Boyd-Graber · Sean Gerrish · Chong Wang · David Blei
- 2008 Poster: Syntactic Topic Models
  Jordan Boyd-Graber · David Blei
- 2008 Spotlight: Syntactic Topic Models
  Jordan Boyd-Graber · David Blei