4:00 – 6:00pm Monday, December 12, 2011
Andalucia II & III
In this tutorial we give an overview of applications of, and scalable inference in, graphical models for the internet. Structured data analysis has become a key enabling technique for processing large amounts of data, with tasks ranging from entity extraction on webpages to sentiment and topic analysis of news articles and comments. Our tutorial covers large-scale sampling and optimization methods for nonparametric Bayesian models such as Latent Dirichlet Allocation, from both a statistics and a systems perspective. Subsequently we give an overview of a range of generative models that elicit sentiment, ideology, time dependence, hierarchical structure, and multilingual similarity from data. We conclude with an overview of recent advances in (semi-)supervised information extraction methods based on conditional random fields and related undirected graphical models.