In this paper, we study the problem of aggregating noisy labels from crowd workers to infer the underlying true labels of binary tasks. Unlike most prior work, which has examined this problem under the random worker paradigm, we consider a much broader class of adversarial workers with no specific assumptions on their labeling strategy. Our key contribution is the design of a computationally efficient reputation algorithm to identify and filter out these adversarial workers in crowdsourcing systems. Our algorithm uses the concept of optimal semi-matchings in conjunction with worker penalties based on label disagreements to assign a reputation score to every worker. We provide strong theoretical guarantees for deterministic adversarial strategies as well as for the extreme case of sophisticated adversaries, where we analyze the worst-case behavior of our algorithm. Finally, we show that our reputation algorithm can significantly improve the accuracy of existing label aggregation algorithms on real-world crowdsourcing datasets.
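To make the idea of worker penalties based on label disagreements concrete, the following is a minimal illustrative sketch and not the algorithm from the paper. It assumes a simple rule chosen here for illustration: on every task where workers disagree, each worker is charged the reciprocal of the number of workers who gave the same label, and a worker's penalty is the average charge over the disputed tasks they labeled. The function name `disagreement_penalties` and the input layout are hypothetical, and the semi-matching step mentioned in the abstract is not implemented.

```python
from collections import defaultdict


def disagreement_penalties(labels):
    """Illustrative disagreement-based penalty (an assumption, not the paper's method).

    labels: dict mapping (worker, task) -> +1 or -1.
    Returns a dict mapping each worker to their average penalty over the
    tasks they labeled on which at least two workers disagreed.
    """
    # Group the votes by task.
    by_task = defaultdict(dict)
    for (worker, task), label in labels.items():
        by_task[task][worker] = label

    totals = defaultdict(float)
    counts = defaultdict(int)
    for votes in by_task.values():
        n_pos = sum(1 for v in votes.values() if v == 1)
        n_neg = len(votes) - n_pos  # assumes every label is +1 or -1
        if n_pos == 0 or n_neg == 0:
            continue  # unanimous task: no disagreement, no penalty
        for worker, v in votes.items():
            same_side = n_pos if v == 1 else n_neg
            # Workers in a small camp are charged more than workers in a large one.
            totals[worker] += 1.0 / same_side
            counts[worker] += 1

    workers = {w for (w, _) in labels}
    return {w: (totals[w] / counts[w] if counts[w] else 0.0) for w in workers}


# Hypothetical toy data: workers "a" and "b" agree, "c" dissents on task "t1".
example = {
    ("a", "t1"): 1, ("b", "t1"): 1, ("c", "t1"): -1,
    ("a", "t2"): 1, ("b", "t2"): 1,
}
penalties = disagreement_penalties(example)
```

Under this assumed rule, a higher penalty marks a worker who is more often isolated on disputed tasks; a filtering step could then drop the highest-penalty workers before running any existing label aggregation algorithm.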
Author Information
Srikanth Jagabathula (NYU)
Lakshminarayanan Subramanian (NYU)
Ashwin Venkataraman (New York University)
More from the Same Authors
- 2009 Poster: A Data-Driven Approach to Modeling Choice
  Vivek Farias · Srikanth Jagabathula · Devavrat Shah
- 2009 Spotlight: A Data-Driven Approach to Modeling Choice
  Vivek Farias · Srikanth Jagabathula · Devavrat Shah
- 2008 Poster: Inferring rankings under constrained sensing
  Srikanth Jagabathula · Devavrat Shah
- 2008 Oral: Inferring rankings under constrained sensing
  Srikanth Jagabathula · Devavrat Shah