Jennifer Wortman Vaughan, Greg Stoddard, Chien-Ju Ho, Adish Singla, Michael Bernstein, Devavrat Shah, Arpita Ghosh, Evgeniy Gabrilovich, Dengyong Zhou, Nikhil Devanur, Xi Chen, Alex T Ihler, Qiang Liu, Genevieve Patterson, Ashwinkumar Badanidiyuru, Hossein Azari Soufiani, Jacob Whitehill
Microsoft Research; Northwestern University; UCLA; ETH, Zurich; Stanford University; Massachusetts Institute of Technology; Cornell University; Google; Microsoft Research; Microsoft Research; NYU; UC Irvine; UC Irvine; Brown University; Cornell University; Harvard University; University of California, San Diego
Workshop: Crowdsourcing: Theory, Algorithms and Applications
7:30am – 6:30pm Monday, December 09, 2013
Harrah's Tahoe A+B
07:30-08:00 Michael Bernstein; Crowd-Powered Systems
08:05-08:17 Greg Stoddard; Social Status and the Design of Optimal Badges
08:20-08:32 Chien-Ju Ho; Adaptive Contract Design for Crowdsourcing
08:35-08:45 Poster lightning talks
08:45-09:30 Poster session + Coffee break
09:30-10:00 Evgeniy Gabrilovich; Crowdsourcing knowledge, one billion facts at a time
03:30-04:00 Devavrat Shah; Collaborative Decision Making
04:05-04:35 Arpita Ghosh; Incentive Design for Crowdsourcing: A Game-Theoretic Approach
04:40-04:52 Afshin Nikzad; Matching Workers Expertise with Tasks: Incentives in Heterogeneous
04:55-05:30 Coffee Break
05:30-06:00 Dengyong Zhou; Algorithmic Crowdsourcing
06:05-06:30 Wrap-up discussion
Every machine learning system integrates data, which store human or physical knowledge, with algorithms that discover patterns in that knowledge and make predictions on new instances. Although most research attention has focused on developing more efficient learning algorithms, it is the quality and amount of training data that predominantly govern the performance of real-world systems. This is only amplified by the recent popularity of large-scale, complicated learning systems such as deep networks, which require millions to billions of training examples to perform well. Unfortunately, the traditional methods of collecting data from specialized workers are usually expensive and slow. In recent years, however, the situation has changed dramatically with the emergence of crowdsourcing, where huge amounts of labeled data are collected from large groups of (usually online) workers at low or no cost. Many machine learning areas, such as computer vision and natural language processing, increasingly benefit from data collected on crowdsourcing platforms such as Amazon Mechanical Turk and CrowdFlower. Conversely, tools from machine learning, game theory and mechanism design can help address many challenging problems in crowdsourcing systems, such as making them more reliable, more efficient and less expensive.
In this workshop, we call attention back to sources of data, discussing cheap and fast data collection methods based on crowdsourcing and how they could impact subsequent machine learning stages.
Furthermore, we will emphasize how the data sourcing paradigm interacts with the most recent trends in machine learning in the NIPS community.
Examples of topics of potential interest in the workshop include (but are not limited to):
Application of crowdsourcing to machine learning.
Reliable crowdsourcing, e.g., label aggregation, quality control.
Optimal budget allocation or active learning in crowdsourcing.
Workflow design and answer aggregation for complex tasks (e.g., machine translation, proofreading).
Pricing and incentives in crowdsourcing markets.
Prediction markets / information markets and their connection to learning.
Theoretical analysis for crowdsourcing algorithms, e.g., error rates and sample complexities for label aggregation and budget allocation algorithms.