Image classification accuracy on the ImageNet dataset has been a barometer for progress in computer vision over the last decade. Several recent papers have questioned the degree to which the benchmark remains useful to the community, yet innovations continue to contribute gains to performance, with today's largest models achieving 90%+ top-1 accuracy. To help contextualize progress on ImageNet and provide a more meaningful evaluation for today's state-of-the-art models, we manually review and categorize every remaining mistake that a few top models make in order to provide insight into the long tail of errors on one of the most benchmarked datasets in computer vision. We focus on the multi-label subset evaluation of ImageNet, where today's best models achieve upwards of 97% top-1 accuracy. Our analysis reveals that nearly half of the supposed mistakes are not mistakes at all, and we uncover new valid multi-labels, demonstrating that, without careful review, we are significantly underestimating the performance of these models. On the other hand, we also find that today's best models still make a significant number of mistakes (40%) that are obviously wrong to human reviewers. To calibrate future progress on ImageNet, we provide an updated multi-label evaluation set, and we curate ImageNet-Major: a 68-example "major error" slice of the obvious mistakes made by today's top models, a slice where models should achieve near perfection, but today are far from doing so.
Author Information
Vijay Vasudevan (Google Brain)
Benjamin Caine (Google)
Raphael Gontijo Lopes (Google Brain)
Sara Fridovich-Keil (UC Berkeley)
Rebecca Roelofs (Google)
More from the Same Authors
- 2022: Imitation Is Not Enough: Robustifying Imitation with Reinforcement Learning for Challenging Driving Scenarios
  Yiren Lu · Justin Fu · George Tucker · Xinlei Pan · Eli Bronstein · Rebecca Roelofs · Benjamin Sapp · Brandyn White · Aleksandra Faust · Shimon Whiteson · Dragomir Anguelov · Sergey Levine
- 2023 Poster: DriveMax: An Accelerated, Data-Driven Simulator for Large-Scale Autonomous Driving Research
  Cole Gulino · Justin Fu · Wenjie Luo · George Tucker · Eli Bronstein · Yiren Lu · Jean Harb · Xinlei Pan · Yan Wang · Xiangyu Chen · John Co-Reyes · Rishabh Agarwal · Rebecca Roelofs · Yao Lu · Nico Montali · Paul Mougin · Zoey Yang · Brandyn White · Aleksandra Faust · Rowan McAllister · Dragomir Anguelov · Benjamin Sapp
- 2023 Workshop: Workshop on Distribution Shifts: New Frontiers with Foundation Models
  Rebecca Roelofs · Fanny Yang · Hongseok Namkoong · Masashi Sugiyama · Jacob Eisenstein · Pang Wei Koh · Shiori Sagawa · Tatsunori Hashimoto · Yoonho Lee
- 2022 Workshop: Workshop on Distribution Shifts: Connecting Methods and Applications
  Chelsea Finn · Fanny Yang · Hongseok Namkoong · Masashi Sugiyama · Jacob Eisenstein · Jonas Peters · Rebecca Roelofs · Shiori Sagawa · Pang Wei Koh · Yoonho Lee
- 2022 Poster: Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding
  Chitwan Saharia · William Chan · Saurabh Saxena · Lala Li · Jay Whang · Remi Denton · Kamyar Ghasemipour · Raphael Gontijo Lopes · Burcu Karagol Ayan · Tim Salimans · Jonathan Ho · David Fleet · Mohammad Norouzi
- 2022 Poster: Models Out of Line: A Fourier Lens on Distribution Shift Robustness
  Sara Fridovich-Keil · Brian Bartoldson · James Diffenderfer · Bhavya Kailkhura · Timo Bremer
- 2022 Poster: Spectral Bias in Practice: The Role of Function Frequency in Generalization
  Sara Fridovich-Keil · Raphael Gontijo Lopes · Rebecca Roelofs
- 2020 Workshop: Resistance AI Workshop
  Suzanne Kite · Mattie Tesfaldet · J Khadijah Abdurahman · William Agnew · Elliot Creager · Agata Foryciarz · Raphael Gontijo Lopes · Pratyusha Kalluri · Marie-Therese Png · Manuel Sabin · Maria Skoularidou · Ramon Vilarino · Rose Wang · Sayash Kapoor · Micah Carroll
- 2020 Poster: Fourier Features Let Networks Learn High Frequency Functions in Low Dimensional Domains
  Matthew Tancik · Pratul Srinivasan · Ben Mildenhall · Sara Fridovich-Keil · Nithin Raghavan · Utkarsh Singhal · Ravi Ramamoorthi · Jonathan Barron · Ren Ng
- 2020 Spotlight: Fourier Features Let Networks Learn High Frequency Functions in Low Dimensional Domains
  Matthew Tancik · Pratul Srinivasan · Ben Mildenhall · Sara Fridovich-Keil · Nithin Raghavan · Utkarsh Singhal · Ravi Ramamoorthi · Jonathan Barron · Ren Ng
- 2020 Affinity Workshop: Queer in AI Workshop @ NeurIPS 2020
  Raphael Gontijo Lopes · Luke Stark · Melvin Selim Atay · ST John
- 2019: Coffee + Posters
  Benjamin Caine · Renhao Wang · Nazmus Sakib · Nana Otawara · Meha Kaushik · elmira amirloo · Nemanja Djuric · Johanna Rock · Tanmay Agarwal · Angelos Filos · Panagiotis Tigkas · Donsuk Lee · Wootae Jeon · Nikita Jaipuria · Pin Wang · Jinxin Zhao · Liangjun Zhang · Ashutosh Singh · Ershad Banijamali · Mohsen Rohani · Aman Sinha · Ameya Joshi · Ching-Yao Chan · Mohammed Abdou · Changhao Chen · Jong-Chan Kim · eslam mohamed · Matt OKelly · Nirvan Singhania · Hiroshi Tsukahara · Atsushi Keyaki · Praveen Palanisamy · Justin Norden · Micol Marchetti-Bowick · Yiming Gu · Hitesh Arora · Shubhankar Deshpande · Jeff Schneider · Shangling Jui · Vaneet Aggarwal · Tryambak Gangopadhyay · Qiaojing Yan
- 2019 Poster: A Fourier Perspective on Model Robustness in Computer Vision
  Dong Yin · Raphael Gontijo Lopes · Jonathon Shlens · Ekin Dogus Cubuk · Justin Gilmer
- 2019 Poster: A Meta-Analysis of Overfitting in Machine Learning
  Becca Roelofs · Vaishaal Shankar · Benjamin Recht · Sara Fridovich-Keil · Moritz Hardt · John Miller · Ludwig Schmidt