We present an agnostic active learning algorithm for any hypothesis class of bounded VC dimension under arbitrary data distributions. Most previous work on active learning either makes strong distributional assumptions, or else is computationally prohibitive. Our algorithm extends the simple scheme of Cohn, Atlas, and Ladner to the agnostic setting, using reductions to supervised learning that harness generalization bounds in a simple but subtle manner. We provide a fall-back guarantee that bounds the algorithm's label complexity by the agnostic PAC sample complexity. Our analysis yields asymptotic label complexity improvements for certain hypothesis classes and distributions. We also demonstrate improvements experimentally.
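To make the selective-sampling idea concrete, below is a minimal Python sketch of a CAL-style query loop in the agnostic setting, in the spirit of the reduction described in the abstract. The hypothesis class, the particular generalization-bound formula, and the helper names (`generalization_gap`, `constrained_erm_error`, `agnostic_cal_sketch`) are illustrative assumptions, not the paper's exact algorithm or bounds: a label is requested only when neither label for the current point can be ruled out, up to the bound, by the best hypotheses consistent with the labels queried so far.

```python
import numpy as np

def generalization_gap(num_labels, delta=0.05, vc_dim=2):
    # Hypothetical uniform-convergence slack of the usual
    # O(sqrt((d log n + log(1/delta)) / n)) form; the paper's actual
    # bounds are more refined than this stand-in.
    n = max(num_labels, 1)
    return np.sqrt((vc_dim * np.log(n + 1) + np.log(1.0 / delta)) / n)

def constrained_erm_error(xs, ys, x_new, y_new, hypotheses):
    """Smallest empirical error (on the queried labels) over hypotheses
    that assign label y_new to x_new; a brute-force stand-in for the
    supervised-learning oracle the abstract alludes to."""
    best = np.inf
    for h in hypotheses:
        if h(x_new) != y_new:
            continue
        err = float(np.mean([h(x) != y for x, y in zip(xs, ys)])) if xs else 0.0
        best = min(best, err)
    return best

def agnostic_cal_sketch(stream, hypotheses, delta=0.05):
    """Process an unlabeled stream; query a label only when neither
    label for the current point can be ruled out by the bound."""
    queried_x, queried_y = [], []   # labels actually requested
    inferred = []                   # (x, label) pairs inferred without a query
    for x, hidden_label in stream:  # hidden_label is revealed only on a query
        err_pos = constrained_erm_error(queried_x, queried_y, x, +1, hypotheses)
        err_neg = constrained_erm_error(queried_x, queried_y, x, -1, hypotheses)
        gap = generalization_gap(len(queried_y), delta)
        if err_pos - err_neg > gap:       # +1 is ruled out: infer -1
            inferred.append((x, -1))
        elif err_neg - err_pos > gap:     # -1 is ruled out: infer +1
            inferred.append((x, +1))
        else:                             # still ambiguous: spend a label
            queried_x.append(x)
            queried_y.append(hidden_label)
    return queried_x, queried_y, inferred

# Toy run: threshold classifiers on [0, 1] with a noiseless labeler.
hypotheses = [lambda x, t=t: 1 if x >= t else -1 for t in np.linspace(0, 1, 21)]
rng = np.random.default_rng(0)
stream = [(x, 1 if x >= 0.5 else -1) for x in rng.uniform(0, 1, 200)]
qx, qy, inferred = agnostic_cal_sketch(stream, hypotheses)
print(f"queried {len(qy)} labels, inferred {len(inferred)} without querying")
```

In the toy run, most points far from the decision boundary end up with inferred rather than queried labels, which illustrates where the label-complexity savings referred to in the abstract can come from.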
Author Information
Sanjoy Dasgupta (UC San Diego)
Daniel Hsu (Columbia University)
See <https://www.cs.columbia.edu/~djhsu/>
Claire Monteleoni (University of Colorado Boulder)
Claire Monteleoni is an associate professor of Computer Science at the University of Colorado Boulder. Previously, she was an associate professor at George Washington University and research faculty at the Center for Computational Learning Systems at Columbia University. She did a postdoc in Computer Science and Engineering at the University of California, San Diego, and completed her PhD and Master's in Computer Science at MIT. She holds a Bachelor's in Earth and Planetary Sciences from Harvard. Her research focuses on machine learning algorithms and theory for problems including learning from data streams, learning from raw (unlabeled) data, learning from private data, and climate informatics: accelerating discovery in climate science with machine learning. Her work on climate informatics received the Best Application Paper Award at NASA CIDU 2010. In 2011, she co-founded the International Workshop on Climate Informatics, now in its fourth year, which attracts climate scientists and data scientists from over 14 countries and 26 states.
Related Events (a corresponding poster, oral, or spotlight)
- 2007 Poster: A general agnostic active learning algorithm »
  Tue. Dec 4th, 06:30 -- 06:40 PM
More from the Same Authors
- 2020: Biased Programmers? Or Biased Data? A Field Experiment in Operationalizing AI Ethics »
  Bo Cowgill · Fabrizio Dell'Acqua · Augustin Chaintreau · Nakul Verma · Samuel Deng · Daniel Hsu
- 2021 Spotlight: Bayesian decision-making under misspecified priors with applications to meta-learning »
  Max Simchowitz · Christopher Tosh · Akshay Krishnamurthy · Daniel Hsu · Thodoris Lykouris · Miro Dudik · Robert Schapire
- 2022 Poster: Masked Prediction: A Parameter Identifiability View »
  Bingbin Liu · Daniel Hsu · Pradeep Ravikumar · Andrej Risteski
- 2021 Poster: Support vector machines and linear regression coincide with very high-dimensional features »
  Navid Ardeshir · Clayton Sanford · Daniel Hsu
- 2021 Poster: Bayesian decision-making under misspecified priors with applications to meta-learning »
  Max Simchowitz · Christopher Tosh · Akshay Krishnamurthy · Daniel Hsu · Thodoris Lykouris · Miro Dudik · Robert Schapire
- 2020 Poster: Ensuring Fairness Beyond the Training Data »
  Debmalya Mandal · Samuel Deng · Suman Jana · Jeannette Wing · Daniel Hsu
- 2019 Poster: On the number of variables to use in principal component regression »
  Ji Xu · Daniel Hsu
- 2018 Poster: Interactive Structure Learning with Structural Query-by-Committee »
  Christopher Tosh · Sanjoy Dasgupta
- 2018 Spotlight: Interactive Structure Learning with Structural Query-by-Committee »
  Christopher Tosh · Sanjoy Dasgupta
- 2018 Poster: Overfitting or perfect fitting? Risk bounds for classification and regression rules that interpolate »
  Mikhail Belkin · Daniel Hsu · Partha P Mitra
- 2018 Poster: Benefits of over-parameterization with EM »
  Ji Xu · Daniel Hsu · Arian Maleki
- 2018 Poster: Leveraged volume sampling for linear regression »
  Michal Derezinski · Manfred K. Warmuth · Daniel Hsu
- 2018 Spotlight: Leveraged volume sampling for linear regression »
  Michal Derezinski · Manfred K. Warmuth · Daniel Hsu
- 2017 Poster: Linear regression without correspondence »
  Daniel Hsu · Kevin Shi · Xiaorui Sun
- 2016 Poster: An algorithm for L1 nearest neighbor search via monotonic embedding »
  Xinan Wang · Sanjoy Dasgupta
- 2016 Poster: Global Analysis of Expectation Maximization for Mixtures of Two Gaussians »
  Ji Xu · Daniel Hsu · Arian Maleki
- 2016 Oral: Global Analysis of Expectation Maximization for Mixtures of Two Gaussians »
  Ji Xu · Daniel Hsu · Arian Maleki
- 2016 Poster: Search Improves Label for Active Learning »
  Alina Beygelzimer · Daniel Hsu · John Langford · Chicheng Zhang
- 2015 Poster: Mixing Time Estimation in Reversible Markov Chains from a Single Sample Path »
  Daniel Hsu · Aryeh Kontorovich · Csaba Szepesvari
- 2015 Poster: Efficient and Parsimonious Agnostic Active Learning »
  Tzu-Kuo Huang · Alekh Agarwal · Daniel Hsu · John Langford · Robert Schapire
- 2015 Spotlight: Efficient and Parsimonious Agnostic Active Learning »
  Tzu-Kuo Huang · Alekh Agarwal · Daniel Hsu · John Langford · Robert Schapire
- 2014 Poster: Rates of Convergence for Nearest Neighbor Classification »
  Kamalika Chaudhuri · Sanjoy Dasgupta
- 2014 Spotlight: Rates of Convergence for Nearest Neighbor Classification »
  Kamalika Chaudhuri · Sanjoy Dasgupta
- 2014 Poster: Scalable Non-linear Learning with Adaptive Polynomial Expansions »
  Alekh Agarwal · Alina Beygelzimer · Daniel Hsu · John Langford · Matus J Telgarsky
- 2014 Poster: The Large Margin Mechanism for Differentially Private Maximization »
  Kamalika Chaudhuri · Daniel Hsu · Shuang Song
- 2014 Tutorial: Climate Change: Challenges for Machine Learning »
  Arindam Banerjee · Claire Monteleoni
- 2013 Workshop: Workshop on Spectral Learning »
  Byron Boots · Daniel Hsu · Borja Balle
- 2013 Poster: The Fast Convergence of Incremental PCA »
  Akshay Balsubramani · Sanjoy Dasgupta · Yoav Freund
- 2013 Poster: When are Overcomplete Topic Models Identifiable? Uniqueness of Tensor Tucker Decompositions with Structured Sparsity »
  Anima Anandkumar · Daniel Hsu · Majid Janzamin · Sham M Kakade
- 2013 Poster: Contrastive Learning Using Spectral Methods »
  James Y Zou · Daniel Hsu · David Parkes · Ryan Adams
- 2012 Poster: Learning Mixtures of Tree Graphical Models »
  Anima Anandkumar · Daniel Hsu · Furong Huang · Sham M Kakade
- 2012 Poster: A Spectral Algorithm for Latent Dirichlet Allocation »
  Anima Anandkumar · Dean P Foster · Daniel Hsu · Sham M Kakade · Yi-Kai Liu
- 2012 Poster: Identifiability and Unmixing of Latent Parse Trees »
  Percy Liang · Sham M Kakade · Daniel Hsu
- 2012 Spotlight: A Spectral Algorithm for Latent Dirichlet Allocation »
  Anima Anandkumar · Dean P Foster · Daniel Hsu · Sham M Kakade · Yi-Kai Liu
- 2011 Poster: Stochastic convex optimization with bandit feedback »
  Alekh Agarwal · Dean P Foster · Daniel Hsu · Sham M Kakade · Sasha Rakhlin
- 2011 Poster: Spectral Methods for Learning Multivariate Latent Tree Structure »
  Anima Anandkumar · Kamalika Chaudhuri · Daniel Hsu · Sham M Kakade · Le Song · Tong Zhang
- 2010 Poster: Rates of convergence for the cluster tree »
  Kamalika Chaudhuri · Sanjoy Dasgupta
- 2010 Poster: Agnostic Active Learning Without Constraints »
  Alina Beygelzimer · Daniel Hsu · John Langford · Tong Zhang
- 2009 Poster: A Parameter-free Hedging Algorithm »
  Kamalika Chaudhuri · Yoav Freund · Daniel Hsu
- 2009 Poster: Multi-Label Prediction via Compressed Sensing »
  Daniel Hsu · Sham M Kakade · John Langford · Tong Zhang
- 2009 Oral: Multi-Label Prediction via Compressed Sensing »
  Daniel Hsu · Sham M Kakade · John Langford · Tong Zhang
- 2009 Poster: Streaming k-means approximation »
  Nir Ailon · Ragesh Jaiswal · Claire Monteleoni
- 2008 Poster: Privacy-preserving logistic regression »
  Kamalika Chaudhuri · Claire Monteleoni
- 2007 Session: Session 3: Theory and Sequential Decision Making »
  Sanjoy Dasgupta
- 2007 Poster: Learning the structure of manifolds using random projections »
  Yoav S Freund · Sanjoy Dasgupta · Mayank Kabra · Nakul Verma
- 2007 Poster: A learning framework for nearest neighbor search »
  Lawrence Cayton · Sanjoy Dasgupta