Timezone: »
Excessive reuse of test data has become commonplace in today's machine learning workflows. Popular benchmarks, competitions, industrial scale tuning, among other applications, all involve test data reuse beyond guidance by statistical confidence bounds. Nonetheless, recent replication studies give evidence that popular benchmarks continue to support progress despite years of extensive reuse. We proffer a new explanation for the apparent longevity of test data: Many proposed models are similar in their predictions and we prove that this similarity mitigates overfitting. Specifically, we show empirically that models proposed for the ImageNet ILSVRC benchmark agree in their predictions well beyond what we can conclude from their accuracy levels alone. Likewise, models created by large scale hyperparameter search enjoy high levels of similarity. Motivated by these empirical observations, we give a non-asymptotic generalization bound that takes similarity into account, leading to meaningful confidence bounds in practical settings.
Author Information
Horia Mania (UC Berkeley)
John Miller (University of California, Berkeley)
Ludwig Schmidt (UC Berkeley)
Moritz Hardt (University of California, Berkeley)
Benjamin Recht (UC Berkeley)
More from the Same Authors
-
2021 : Are We Learning Yet? A Meta Review of Evaluation Failures Across Machine Learning »
Thomas Liao · Rohan Taori · Deborah Raji · Ludwig Schmidt -
2021 : Do ImageNet Classifiers Generalize to ImageNet? »
Benjamin Recht · Becca Roelofs · Ludwig Schmidt · Vaishaal Shankar -
2021 : Evaluating Machine Accuracy on ImageNet »
Vaishaal Shankar · Becca Roelofs · Horia Mania · Benjamin Recht · Ludwig Schmidt -
2021 : Measuring Robustness to Natural Distribution Shifts in Image Classification »
Rohan Taori · Achal Dave · Vaishaal Shankar · Nicholas Carlini · Benjamin Recht · Ludwig Schmidt -
2021 : Robust fine-tuning of zero-shot models »
Mitchell Wortsman · Gabriel Ilharco · Jong Wook Kim · Mike Li · Hanna Hajishirzi · Ali Farhadi · Hongseok Namkoong · Ludwig Schmidt -
2021 : Alternative Microfoundations for Strategic Classification »
Meena Jagadeesan · Celestine Mendler-Dünner · Moritz Hardt -
2021 : Alternative Microfoundations for Strategic Classification »
Meena Jagadeesan · Celestine Mendler-Dünner · Moritz Hardt -
2022 : Causal Inference out of Control: Identifying the Steerability of Consumption »
Gary Cheng · Moritz Hardt · Celestine Mendler-Dünner -
2022 : Causal Inference out of Control: Identifying the Steerability of Consumption »
Gary Cheng · Moritz Hardt · Celestine Mendler-Dünner -
2021 : Microfoundations of Algorithmic decisions »
Moritz Hardt -
2021 Oral: Retiring Adult: New Datasets for Fair Machine Learning »
Frances Ding · Moritz Hardt · John Miller · Ludwig Schmidt -
2021 Poster: Retiring Adult: New Datasets for Fair Machine Learning »
Frances Ding · Moritz Hardt · John Miller · Ludwig Schmidt -
2021 Poster: Characterizing Generalization under Out-Of-Distribution Shifts in Deep Metric Learning »
Timo Milbich · Karsten Roth · Samarth Sinha · Ludwig Schmidt · Marzyeh Ghassemi · Bjorn Ommer -
2020 : Contributed Talk 6: Do Offline Metrics Predict Online Performance in Recommender Systems? »
Karl Krauth · Sarah Dean · Wenshuo Guo · Benjamin Recht · Michael Jordan -
2020 : Invited Talk 7: Prediction Dynamics »
Moritz Hardt -
2020 Workshop: Consequential Decisions in Dynamic Environments »
Niki Kilbertus · Angela Zhou · Ashia Wilson · John Miller · Lily Hu · Lydia T. Liu · Nathan Kallus · Shira Mitchell -
2020 : Tutorial: A brief tutorial on causality and fair decision making »
Moritz Hardt -
2020 Poster: Stochastic Optimization for Performative Prediction »
Celestine Mendler-Dünner · Juan Perdomo · Tijana Zrnic · Moritz Hardt -
2020 Oral: Hogwild!: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent »
Benjamin Recht · Christopher Ré · Stephen Wright · Feng Niu -
2020 Poster: Measuring Robustness to Natural Distribution Shifts in Image Classification »
Rohan Taori · Achal Dave · Vaishaal Shankar · Nicholas Carlini · Benjamin Recht · Ludwig Schmidt -
2020 Spotlight: Measuring Robustness to Natural Distribution Shifts in Image Classification »
Rohan Taori · Achal Dave · Vaishaal Shankar · Nicholas Carlini · Benjamin Recht · Ludwig Schmidt -
2019 Poster: Unlabeled Data Improves Adversarial Robustness »
Yair Carmon · Aditi Raghunathan · Ludwig Schmidt · John Duchi · Percy Liang -
2019 Poster: A Meta-Analysis of Overfitting in Machine Learning »
Becca Roelofs · Vaishaal Shankar · Benjamin Recht · Sara Fridovich-Keil · Moritz Hardt · John Miller · Ludwig Schmidt -
2019 Poster: Finite-time Analysis of Approximate Policy Iteration for the Linear Quadratic Regulator »
Karl Krauth · Stephen Tu · Benjamin Recht -
2019 Poster: Certainty Equivalence is Efficient for Linear Quadratic Control »
Horia Mania · Stephen Tu · Benjamin Recht -
2018 Poster: Simple random search of static linear policies is competitive for reinforcement learning »
Horia Mania · Aurelia Guy · Benjamin Recht -
2018 Poster: Regret Bounds for Robust Adaptive Control of the Linear Quadratic Regulator »
Sarah Dean · Horia Mania · Nikolai Matni · Benjamin Recht · Stephen Tu -
2017 : Safety beyond Security: Societal Challenges for Machine Learning »
Moritz Hardt -
2017 Workshop: OPT 2017: Optimization for Machine Learning »
Suvrit Sra · Sashank J. Reddi · Alekh Agarwal · Benjamin Recht -
2017 Poster: Avoiding Discrimination through Causal Reasoning »
Niki Kilbertus · Mateo Rojas Carulla · Giambattista Parascandolo · Moritz Hardt · Dominik Janzing · Bernhard Schölkopf -
2017 Poster: The Marginal Value of Adaptive Gradient Methods in Machine Learning »
Ashia C Wilson · Becca Roelofs · Mitchell Stern · Nati Srebro · Benjamin Recht -
2017 Oral: The Marginal Value of Adaptive Gradient Methods in Machine Learning »
Ashia C Wilson · Becca Roelofs · Mitchell Stern · Nati Srebro · Benjamin Recht -
2017 Oral: Test of Time Award »
ali rahimi · Benjamin Recht -
2017 Tutorial: Fairness in Machine Learning »
Solon Barocas · Moritz Hardt -
2016 Poster: The Power of Adaptivity in Identifying Statistical Alternatives »
Kevin Jamieson · Daniel Haas · Benjamin Recht -
2016 Poster: Cyclades: Conflict-free Asynchronous Machine Learning »
Xinghao Pan · Maximilian Lam · Stephen Tu · Dimitris Papailiopoulos · Ce Zhang · Michael Jordan · Kannan Ramchandran · Christopher Ré · Benjamin Recht -
2016 Poster: Equality of Opportunity in Supervised Learning »
Moritz Hardt · Eric Price · Eric Price · Nati Srebro -
2015 Workshop: Adaptive Data Analysis »
Adam Smith · Aaron Roth · Vitaly Feldman · Moritz Hardt -
2015 Poster: Generalization in Adaptive Data Analysis and Holdout Reuse »
Cynthia Dwork · Vitaly Feldman · Moritz Hardt · Toni Pitassi · Omer Reingold · Aaron Roth -
2015 Poster: Parallel Correlation Clustering on Big Graphs »
Xinghao Pan · Dimitris Papailiopoulos · Samet Oymak · Benjamin Recht · Kannan Ramchandran · Michael Jordan -
2015 Poster: Differentially Private Learning of Structured Discrete Distributions »
Ilias Diakonikolas · Moritz Hardt · Ludwig Schmidt -
2014 Workshop: Fairness, Accountability, and Transparency in Machine Learning »
Moritz Hardt · Solon Barocas -
2014 Poster: The Noisy Power Method: A Meta Algorithm with Applications »
Moritz Hardt · Eric Price -
2014 Spotlight: The Noisy Power Method: A Meta Algorithm with Applications »
Moritz Hardt · Eric Price