

Demonstration

Project Emporia: News Recommendation using Graphical Models

Jurgen Van Gael

Georgia A

Abstract:

Project Emporia is a recommendation engine for news. Based on the Matchbox technology ( http://research.microsoft.com/apps/pubs/default.aspx?id=79460 ), it uses a Bayesian probabilistic model to learn users' preferences for recent news stories. A visitor to Project Emporia can vote each link up or down according to their taste. The Matchbox model is then updated in real time, so its link recommendations improve instantly. The news stories themselves are mined by crawling various RSS feeds and Twitter. In this way, Project Emporia performs Bayesian inference on more than 100,000,000 data points every day. Another feature of Project Emporia is the automatic classification of links into categories. The classification is based on a recently published classifier ( http://research.microsoft.com/apps/pubs/default.aspx?id=122779 ). More interestingly, we have developed a pipeline that uses active learning to automatically discover links that cannot be classified reliably. These links are then automatically sent to Amazon Mechanical Turk for labelling, after which we filter the results for spam and update the classification model.
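The active learning step can be illustrated with a minimal sketch of uncertainty sampling. This is an assumption about how such a selection could work, not a description of the actual Project Emporia pipeline: the entropy criterion, the threshold value, the link identifiers, and the function names `predictive_entropy` and `select_uncertain` are all hypothetical.

```python
import math

def predictive_entropy(probs):
    """Shannon entropy (in nats) of a categorical distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def select_uncertain(predictions, threshold):
    """Return the ids of links whose predicted category distribution
    has entropy above the threshold, i.e. links the classifier is unsure about."""
    return [link for link, probs in predictions.items()
            if predictive_entropy(probs) > threshold]

# Hypothetical classifier output: link id -> probabilities over three categories.
predictions = {
    "link-a": [0.96, 0.02, 0.02],  # confident prediction
    "link-b": [0.40, 0.35, 0.25],  # near-uniform, hard to classify
    "link-c": [0.70, 0.20, 0.10],  # moderately confident
}

# Links above the (assumed) threshold would be queued for human labelling.
to_label = select_uncertain(predictions, threshold=0.9)
```

In this sketch only the near-uniform prediction exceeds the threshold, so only that link would be sent out for labelling; the threshold trades labelling cost against classifier coverage.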
