Dirichlet-Bernoulli Alignment: A Generative Model for Multi-Class Multi-Label Multi-Instance Corpora
Shuang Yang · Hongyuan Zha · Bao-Gang Hu

Mon Dec 07 07:00 PM -- 11:59 PM (PST)

We propose Dirichlet-Bernoulli Alignment (DBA), a generative model for corpora in which each pattern (e.g., a document) contains a set of instances (e.g., paragraphs in the document) and belongs to multiple classes. By casting predefined classes as latent Dirichlet variables (i.e., instance-level labels), and modeling the multi-label of each pattern as Bernoulli variables conditioned on the weighted empirical average of topic assignments, DBA automatically aligns the latent topics discovered from data to human-defined classes. DBA is useful for both pattern classification and instance disambiguation. We test the former on text classification and the latter on named entity disambiguation for web search queries.
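
To make the generative story in the abstract concrete, the following is a minimal sketch of how one pattern might be sampled. It is an illustration under stated assumptions, not the authors' implementation: the function name `sample_pattern`, the parameters `alpha`, `weights`, and `bias`, and the logistic link between the averaged assignments and the label probabilities are all hypothetical choices, since the abstract does not specify them.

```python
import math
import random


def sample_pattern(alpha, weights, bias, n_instances, rng):
    """Sample instance-level class assignments and a multi-label vector
    for one pattern. All parameter names are hypothetical."""
    k = len(alpha)

    # Dirichlet draw (via normalized Gamma variates): per-pattern class proportions.
    gammas = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(gammas)
    theta = [g / total for g in gammas]

    # Each instance receives one latent class assignment drawn from theta.
    z = rng.choices(range(k), weights=theta, k=n_instances)

    # Empirical average of the assignments: fraction of instances per class.
    z_bar = [z.count(c) / n_instances for c in range(k)]

    # One Bernoulli label per class, conditioned on the weighted average.
    # ASSUMPTION: a logistic link with per-class weight and a shared bias.
    probs = [1.0 / (1.0 + math.exp(-(weights[c] * z_bar[c] + bias)))
             for c in range(k)]
    y = [1 if rng.random() < p else 0 for p in probs]
    return z, z_bar, y


if __name__ == "__main__":
    rng = random.Random(0)
    z, z_bar, y = sample_pattern([0.5, 0.5, 0.5], [8.0, 8.0, 8.0], -4.0, 10, rng)
    print("instance labels:", z)
    print("class fractions:", z_bar)
    print("pattern labels: ", y)
```

Because the latent assignments are indexed by the same classes as the observed labels, a class that dominates the instances of a pattern is more likely to be switched on in its multi-label vector, which is the alignment the abstract describes.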

Author Information

Shuang Yang (Facebook)
Hongyuan Zha (Georgia Tech)
Bao-Gang Hu (NLPR, CASIA, China)