Not all data in a typical training set help with generalization; some samples can be overly ambiguous or outright mislabeled. This paper introduces a new method to identify such samples and mitigate their impact when training neural networks. At the heart of our algorithm is the Area Under the Margin (AUM) statistic, which exploits differences in the training dynamics of clean and mislabeled samples. A simple procedure, adding an extra class populated with purposefully mislabeled threshold samples, learns an AUM upper bound that isolates mislabeled data. This approach consistently improves upon prior work on synthetic and real-world datasets. On the WebVision50 classification task, our method removes 17% of training data, yielding a 1.6% (absolute) improvement in test error. On CIFAR100, removing 13% of the data leads to a 1.2% drop in error.
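The abstract does not spell out how the AUM statistic is computed, so the following is only a minimal NumPy sketch under stated assumptions: the margin is taken to be the assigned-class logit minus the largest other logit, AUM is that margin averaged over training epochs, and the upper bound is a high percentile (99th here) of the AUM values of the threshold samples. The function names `margins`, `area_under_margin`, and `flag_mislabeled`, and the `percentile` parameter, are hypothetical and chosen for illustration.

```python
import numpy as np


def margins(logits, labels):
    """Per-sample margin: assigned-class logit minus the largest other logit.

    Assumption: this is the margin definition; the abstract does not state it.
    """
    logits = np.asarray(logits, dtype=float)
    labels = np.asarray(labels)
    rows = np.arange(len(labels))
    assigned = logits[rows, labels]
    others = logits.copy()
    others[rows, labels] = -np.inf  # mask the assigned class
    return assigned - others.max(axis=1)


def area_under_margin(logits_per_epoch, labels):
    """Average the per-epoch margins over all recorded epochs."""
    return np.mean([margins(l, labels) for l in logits_per_epoch], axis=0)


def flag_mislabeled(aum, is_threshold, percentile=99.0):
    """Flag real samples whose AUM is at or below the threshold-sample cutoff.

    `is_threshold` marks the purposefully mislabeled threshold samples.
    Assumption: the cutoff is a percentile of their AUM values.
    """
    aum = np.asarray(aum, dtype=float)
    is_threshold = np.asarray(is_threshold, dtype=bool)
    cutoff = np.percentile(aum[is_threshold], percentile)
    return (aum <= cutoff) & ~is_threshold
```

In use, one would record the logits of every training sample at each epoch, pass them with the assigned labels to `area_under_margin`, and then call `flag_mislabeled` with a mask marking which samples were reassigned to the extra class.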
Author Information
Geoff Pleiss (Columbia University)
Tianyi Zhang (Stanford University)
Ethan Elenberg (ASAPP)
Kilian Weinberger (Cornell University / ASAPP Research)
More from the Same Authors
- 2021: Fixed Neural Network Steganography: Train the images, not the network
  Varsha Kishore · Xiangyu Chen · Yan Wang · Boyi Li · Kilian Weinberger
- 2022 Panel: Panel 4B-2: Decentralized Training of… & Sharper Convergence Guarantees…
  Anastasiia Koloskova · Tianyi Zhang
- 2022 Poster: Unsupervised Adaptation from Repeated Traversals for Autonomous Driving
  Yurong You · Cheng Perng Phoo · Katie Luo · Travis Zhang · Wei-Lun Chao · Bharath Hariharan · Mark Campbell · Kilian Weinberger
- 2022 Poster: Decentralized Training of Foundation Models in Heterogeneous Environments
  Binhang Yuan · Yongjun He · Jared Davis · Tianyi Zhang · Tri Dao · Beidi Chen · Percy Liang · Christopher Ré · Ce Zhang
- 2021 Poster: The Limitations of Large Width in Neural Networks: A Deep Gaussian Process Perspective
  Geoff Pleiss · John Cunningham
- 2021 Poster: Rectangular Flows for Manifold Learning
  Anthony Caterini · Gabriel Loaiza-Ganem · Geoff Pleiss · John Cunningham
- 2020: Panel
  Kilian Weinberger · Maria De-Arteaga · Shibani Santurkar · Jonathan Frankle · Deborah Raji
- 2020: Q&A with Kilian
  Kilian Weinberger
- 2020: Invited: Kilian Weinberger
  Kilian Weinberger
- 2020 Poster: Fast Matrix Square Roots with Applications to Gaussian Processes and Bayesian Optimization
  Geoff Pleiss · Martin Jankowiak · David Eriksson · Anil Damle · Jacob Gardner
- 2020 Poster: Wasserstein Distances for Stereo Disparity Estimation
  Divyansh Garg · Yan Wang · Bharath Hariharan · Mark Campbell · Kilian Weinberger · Wei-Lun Chao
- 2020 Spotlight: Wasserstein Distances for Stereo Disparity Estimation
  Divyansh Garg · Yan Wang · Bharath Hariharan · Mark Campbell · Kilian Weinberger · Wei-Lun Chao
- 2019: Poster Session 1
  Simeon Spasov · Prateeth Nayak · Ferran Diego Andilla · Tianyi Zhang · Amit Trivedi
- 2019 Poster: Positional Normalization
  Boyi Li · Felix Wu · Kilian Weinberger · Serge Belongie
- 2019 Spotlight: Positional Normalization
  Boyi Li · Felix Wu · Kilian Weinberger · Serge Belongie
- 2019 Poster: Exact Gaussian Processes on a Million Data Points
  Ke Alexander Wang · Geoff Pleiss · Jacob Gardner · Stephen Tyree · Kilian Weinberger · Andrew Gordon Wilson
- 2019 Poster: A New Defense Against Adversarial Images: Turning a Weakness into a Strength
  Shengyuan Hu · Tao Yu · Chuan Guo · Wei-Lun Chao · Kilian Weinberger
- 2018 Poster: GPyTorch: Blackbox Matrix-Matrix Gaussian Process Inference with GPU Acceleration
  Jacob Gardner · Geoff Pleiss · Kilian Weinberger · David Bindel · Andrew Wilson
- 2018 Spotlight: GPyTorch: Blackbox Matrix-Matrix Gaussian Process Inference with GPU Acceleration
  Jacob Gardner · Geoff Pleiss · Kilian Weinberger · David Bindel · Andrew Wilson
- 2017 Poster: On Fairness and Calibration
  Geoff Pleiss · Manish Raghavan · Felix Wu · Jon Kleinberg · Kilian Weinberger
- 2017 Poster: Streaming Weak Submodularity: Interpreting Neural Networks on the Fly
  Ethan Elenberg · Alex Dimakis · Moran Feldman · Amin Karbasi
- 2017 Oral: Streaming Weak Submodularity: Interpreting Neural Networks on the Fly
  Ethan Elenberg · Alex Dimakis · Moran Feldman · Amin Karbasi
- 2016 Poster: Supervised Word Mover's Distance
  Gao Huang · Chuan Guo · Matt J Kusner · Yu Sun · Fei Sha · Kilian Weinberger
- 2016 Oral: Supervised Word Mover's Distance
  Gao Huang · Chuan Guo · Matt J Kusner · Yu Sun · Fei Sha · Kilian Weinberger
- 2015: Deep Manifold Traversal
  Kilian Weinberger
- 2015 Poster: Fast Distributed k-Center Clustering with Outliers on Massive Data
  Gustavo Malkomes · Matt J Kusner · Wenlin Chen · Kilian Q Weinberger · Benjamin Moseley
- 2015 Poster: Bayesian Active Model Selection with an Application to Automated Audiometry
  Jacob Gardner · Gustavo Malkomes · Roman Garnett · Kilian Weinberger · Dennis Barbour · John Cunningham
- 2014 Workshop: Representation and Learning Methods for Complex Outputs
  Richard Zemel · Dale Schuurmans · Kilian Q Weinberger · Yuhong Guo · Jia Deng · Francesco Dinuzzo · Hal Daumé III · Honglak Lee · Noah A Smith · Richard Sutton · Jiaqian YU · Vitaly Kuznetsov · Luke Vilnis · Hanchen Xiong · Calvin Murdock · Thomas Unterthiner · Jean-Francis Roy · Martin Renqiang Min · Hichem SAHBI · Fabio Massimo Zanzotto
- 2013 Workshop: Output Representation Learning
  Yuhong Guo · Dale Schuurmans · Richard Zemel · Samy Bengio · Yoshua Bengio · Li Deng · Dan Roth · Kilian Q Weinberger · Jason Weston · Kihyuk Sohn · Florent Perronnin · Gabriel Synnaeve · Pablo R Strasser · julien audiffren · Carlo Ciliberto · Dan Goldwasser
- 2012 Poster: Non-linear Metric Learning
  Dor Kedem · Stephen Tyree · Kilian Q Weinberger · Fei Sha · Gert Lanckriet
- 2011 Workshop: Beyond Mahalanobis: Supervised Large-Scale Learning of Similarity
  Greg Shakhnarovich · Dhruv Batra · Brian Kulis · Kilian Q Weinberger
- 2011 Poster: Co-Training for Domain Adaptation
  Minmin Chen · Kilian Q Weinberger · John Blitzer
- 2010 Session: Oral Session 16
  Kilian Q Weinberger
- 2010 Poster: Large Margin Multi-Task Metric Learning
  Shibin Parameswaran · Kilian Q Weinberger
- 2010 Poster: Decoding Ipsilateral Finger Movements from ECoG Signals in Humans
  Yuzong Liu · Mohit Sharma · Charles M Gaona · Jonathan D Breshears · jarod Roland · zachary V Freudenburg · Kilian Q Weinberger · Eric C Leuthardt
- 2008 Poster: Large Margin Taxonomy Embedding for Document Categorization
  Kilian Q Weinberger · Olivier Chapelle
- 2008 Spotlight: Large Margin Taxonomy Embedding for Document Categorization
  Kilian Q Weinberger · Olivier Chapelle
- 2006 Workshop: Novel Applications of Dimensionality Reduction
  John Blitzer · Rajarshi Das · Irina Rish · Kilian Q Weinberger
- 2006 Poster: Graph Regularization for Maximum Variance Unfolding with an Application to Sensor Localization
  Kilian Q Weinberger · Fei Sha · Qihui Zhu · Lawrence Saul