Spotlight
A general agnostic active learning algorithm
Sanjoy Dasgupta · Daniel Hsu · Claire Monteleoni

Tue Dec 04 05:20 PM -- 05:30 PM (PST)

We present an agnostic active learning algorithm for any hypothesis class of bounded VC dimension under arbitrary data distributions. Most previous work on active learning either makes strong distributional assumptions, or else is computationally prohibitive. Our algorithm extends the simple scheme of Cohn, Atlas, and Ladner to the agnostic setting, using reductions to supervised learning that harness generalization bounds in a simple but subtle manner. We provide a fall-back guarantee that bounds the algorithm's label complexity by the agnostic PAC sample complexity. Our analysis yields asymptotic label complexity improvements for certain hypothesis classes and distributions. We also demonstrate improvements experimentally.
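For readers unfamiliar with the Cohn, Atlas, and Ladner (CAL) scheme that the abstract builds on, the following is a minimal sketch of that base scheme only, not of the agnostic algorithm presented in this work. It assumes the simplest possible setting: one-dimensional threshold classifiers and noiseless labels. The names `cal_threshold`, `oracle`, and `demo` are hypothetical and chosen for illustration. The learner requests a label only when the hypotheses still consistent with past labels disagree on the incoming point; otherwise the label can be inferred for free.

```python
import random


def cal_threshold(stream, oracle):
    """CAL-style selective sampling for thresholds h_t(x) = 1 if x >= t else 0.

    Illustrative assumption: labels are noiseless (realizable setting).
    Returns (lo, hi, queries), where every threshold t in (lo, hi] is
    consistent with all labels requested so far.
    """
    lo, hi = float("-inf"), float("inf")
    queries = 0
    for x in stream:
        # x lies in the region of disagreement iff consistent thresholds
        # exist on both sides of it, i.e. lo < x < hi.
        if lo < x < hi:
            y = oracle(x)  # pay for one label
            queries += 1
            if y == 1:
                hi = x  # a positive point forces t <= x
            else:
                lo = x  # a negative point forces t > x
        # Otherwise all consistent thresholds agree on x, so no query is made.
    return lo, hi, queries


def demo(n=1000, true_t=0.37, seed=0):
    """Run the sketch on n uniform points labeled by a hidden threshold."""
    rng = random.Random(seed)
    stream = [rng.random() for _ in range(n)]
    return cal_threshold(stream, lambda x: 1 if x >= true_t else 0)


if __name__ == "__main__":
    lo, hi, queries = demo()
    print(f"consistent thresholds: ({lo:.4f}, {hi:.4f}], labels requested: {queries}")
```

In this sketch the region of disagreement shrinks with each requested label, so most later points need no query. With noisy labels a single bad label would wrongly discard the best threshold, which is the failure mode an agnostic extension has to avoid.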

Author Information

Sanjoy Dasgupta (UC San Diego)
Daniel Hsu (Columbia University)

See <https://www.cs.columbia.edu/~djhsu/>

Claire Monteleoni (University of Colorado Boulder)

Claire Monteleoni is an associate professor of Computer Science at the University of Colorado Boulder. Previously, she was an associate professor at George Washington University and research faculty at the Center for Computational Learning Systems at Columbia University. She did a postdoc in Computer Science and Engineering at the University of California, San Diego, and completed her PhD and Master's in Computer Science at MIT. She holds a Bachelor's in Earth and Planetary Sciences from Harvard. Her research focuses on machine learning algorithms and theory for problems including learning from data streams, learning from raw (unlabeled) data, learning from private data, and climate informatics: accelerating discovery in climate science with machine learning. Her work on climate informatics received the Best Application Paper Award at NASA CIDU 2010. In 2011, she co-founded the International Workshop on Climate Informatics, which is now in its fourth year, attracting climate scientists and data scientists from over 14 countries and 26 states.
