Poster
Saccader: Improving Accuracy of Hard Attention Models for Vision
Gamaleldin Elsayed · Simon Kornblith · Quoc V Le
Wed Dec 11 10:45 AM -- 12:45 PM (PST) @ East Exhibition Hall B + C #70
Although deep convolutional neural networks achieve state-of-the-art performance across nearly all image classification tasks, their decisions are difficult to interpret. One approach that offers some level of interpretability by design is hard attention, which uses only relevant portions of the image. However, training hard attention models with only class label supervision is challenging, and hard attention has proved difficult to scale to complex datasets. Here, we propose a novel hard attention model, which we term Saccader.
Key to Saccader is a pretraining step that requires only class labels and provides initial attention locations for policy gradient optimization. Our best models narrow the gap to common ImageNet baselines, achieving 75% top-1 and 91% top-5 while attending to less than one-third of the image.
Author Information
Gamaleldin Elsayed (Google Research, Brain Team)
Simon Kornblith (Google Brain)
Quoc V Le (Google)
More from the Same Authors
- 2022: Neural Network Online Training with Sensitivity to Multiscale Temporal Structure
  Matt Jones · Tyler Scott · Gamaleldin Elsayed · Mengye Ren · Katherine Hermann · David Mayo · Michael Mozer
- 2022: Spatial Symmetry in Slot Attention
  Ondrej Biza · Sjoerd van Steenkiste · Mehdi S. M. Sajjadi · Gamaleldin Elsayed · Aravindh Mahendran · Thomas Kipf
- 2022: Teacher-generated pseudo human spatial-attention labels boost contrastive learning models
  Yushi Yao · Chang Ye · Junfeng He · Gamaleldin Elsayed
- 2022: Human alignment of neural network representations
  Lukas Muttenthaler · Lorenz Linhardt · Jonas Dippel · Robert Vandermeulen · Simon Kornblith
- 2022 Poster: Patching open-vocabulary models by interpolating weights
  Gabriel Ilharco · Mitchell Wortsman · Samir Yitzhak Gadre · Shuran Song · Hannaneh Hajishirzi · Simon Kornblith · Ali Farhadi · Ludwig Schmidt
- 2022 Poster: SAVi++: Towards End-to-End Object-Centric Learning from Real-World Videos
  Gamaleldin Elsayed · Aravindh Mahendran · Sjoerd van Steenkiste · Klaus Greff · Michael Mozer · Thomas Kipf
- 2022 Poster: Mixture-of-Experts with Expert Choice Routing
  Yanqi Zhou · Tao Lei · Hanxiao Liu · Nan Du · Yanping Huang · Vincent Zhao · Andrew Dai · zhifeng Chen · Quoc V Le · James Laudon
- 2022 Poster: Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
  Jason Wei · Xuezhi Wang · Dale Schuurmans · Maarten Bosma · brian ichter · Fei Xia · Ed Chi · Quoc V Le · Denny Zhou
- 2022 Poster: TabNAS: Rejection Sampling for Neural Architecture Search on Tabular Datasets
  Chengrun Yang · Gabriel Bender · Hanxiao Liu · Pieter-Jan Kindermans · Madeleine Udell · Yifeng Lu · Quoc V Le · Da Huang
- 2021 Poster: Why Do Better Loss Functions Lead to Less Transferable Features?
  Simon Kornblith · Ting Chen · Honglak Lee · Mohammad Norouzi
- 2021 Poster: Generalized Shape Metrics on Neural Representations
  Alex H Williams · Erin Kunz · Simon Kornblith · Scott Linderman
- 2021 Poster: Meta-learning to Improve Pre-training
  Aniruddh Raghu · Jonathan Lorraine · Simon Kornblith · Matthew McDermott · David Duvenaud
- 2021 Poster: CoAtNet: Marrying Convolution and Attention for All Data Sizes
  Zihang Dai · Hanxiao Liu · Quoc V Le · Mingxing Tan
- 2021 Poster: Searching for Efficient Transformers for Language Modeling
  David So · Wojciech Mańke · Hanxiao Liu · Zihang Dai · Noam Shazeer · Quoc V Le
- 2021 Poster: Pay Attention to MLPs
  Hanxiao Liu · Zihang Dai · David So · Quoc V Le
- 2021 Poster: Do Vision Transformers See Like Convolutional Neural Networks?
  Maithra Raghu · Thomas Unterthiner · Simon Kornblith · Chiyuan Zhang · Alexey Dosovitskiy
- 2020: Panel Discussion & Closing
  Yejin Choi · Alexei Efros · Chelsea Finn · Kristen Grauman · Quoc V Le · Yann LeCun · Ruslan Salakhutdinov · Eric Xing
- 2020 Poster: Evolving Normalization-Activation Layers
  Hanxiao Liu · Andy Brock · Karen Simonyan · Quoc V Le
- 2020 Spotlight: Evolving Normalization-Activation Layers
  Hanxiao Liu · Andy Brock · Karen Simonyan · Quoc V Le
- 2020 Poster: The Origins and Prevalence of Texture Bias in Convolutional Neural Networks
  Katherine L. Hermann · Ting Chen · Simon Kornblith
- 2020 Poster: PyGlove: Symbolic Programming for Automated Machine Learning
  Daiyi Peng · Xuanyi Dong · Esteban Real · Mingxing Tan · Yifeng Lu · Gabriel Bender · Hanxiao Liu · Adam Kraft · Chen Liang · Quoc V Le
- 2020 Poster: RandAugment: Practical Automated Data Augmentation with a Reduced Search Space
  Ekin Dogus Cubuk · Barret Zoph · Jonathon Shlens · Quoc V Le
- 2020 Oral: The Origins and Prevalence of Texture Bias in Convolutional Neural Networks
  Katherine L. Hermann · Ting Chen · Simon Kornblith
- 2020 Oral: PyGlove: Symbolic Programming for Automated Machine Learning
  Daiyi Peng · Xuanyi Dong · Esteban Real · Mingxing Tan · Yifeng Lu · Gabriel Bender · Hanxiao Liu · Adam Kraft · Chen Liang · Quoc V Le
- 2020 Poster: Big Self-Supervised Models are Strong Semi-Supervised Learners
  Ting Chen · Simon Kornblith · Kevin Swersky · Mohammad Norouzi · Geoffrey E Hinton
- 2020 Poster: Rethinking Pre-training and Self-training
  Barret Zoph · Golnaz Ghiasi · Tsung-Yi Lin · Yin Cui · Hanxiao Liu · Ekin Dogus Cubuk · Quoc V Le
- 2020 Oral: Rethinking Pre-training and Self-training
  Barret Zoph · Golnaz Ghiasi · Tsung-Yi Lin · Yin Cui · Hanxiao Liu · Ekin Dogus Cubuk · Quoc V Le
- 2020 Poster: Unsupervised Data Augmentation for Consistency Training
  Qizhe Xie · Zihang Dai · Eduard Hovy · Thang Luong · Quoc V Le
- 2020 Poster: Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing
  Zihang Dai · Guokun Lai · Yiming Yang · Quoc V Le
- 2019: Poster Session
  Ethan Harris · Tom White · Oh Hyeon Choung · Takashi Shinozaki · Dipan Pal · Katherine L. Hermann · Judy Borowski · Camilo Fosco · Chaz Firestone · Vijay Veerabadran · Benjamin Lahner · Chaitanya Ryali · Fenil Doshi · Pulkit Singh · Sharon Zhou · Michel Besserve · Michael Chang · Anelise Newman · Mahesan Niranjan · Jonathon Hare · Daniela Mihai · Marios Savvides · Simon Kornblith · Christina M Funke · Aude Oliva · Virginia de Sa · Dmitry Krotov · Colin Conwell · George Alvarez · Alex Kolchinski · Shengjia Zhao · Mitchell Gordon · Michael Bernstein · Stefano Ermon · Arash Mehrjou · Bernhard Schölkopf · John Co-Reyes · Michael Janner · Jiajun Wu · Josh Tenenbaum · Sergey Levine · Yalda Mohsenzadeh · Zhenglong Zhou
- 2019 Poster: XLNet: Generalized Autoregressive Pretraining for Language Understanding
  Zhilin Yang · Zihang Dai · Yiming Yang · Jaime Carbonell · Russ Salakhutdinov · Quoc V Le
- 2019 Oral: XLNet: Generalized Autoregressive Pretraining for Language Understanding
  Zhilin Yang · Zihang Dai · Yiming Yang · Jaime Carbonell · Russ Salakhutdinov · Quoc V Le
- 2019 Poster: CondConv: Conditionally Parameterized Convolutions for Efficient Inference
  Brandon Yang · Gabriel Bender · Quoc V Le · Jiquan Ngiam
- 2019 Poster: When does label smoothing help?
  Rafael Müller · Simon Kornblith · Geoffrey E Hinton
- 2019 Spotlight: When does label smoothing help?
  Rafael Müller · Simon Kornblith · Geoffrey E Hinton
- 2019 Poster: Mixtape: Breaking the Softmax Bottleneck Efficiently
  Zhilin Yang · Thang Luong · Russ Salakhutdinov · Quoc V Le
- 2019 Poster: GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism
  Yanping Huang · Youlong Cheng · Ankur Bapna · Orhan Firat · Dehao Chen · Mia Chen · HyoukJoong Lee · Jiquan Ngiam · Quoc V Le · Yonghui Wu · zhifeng Chen
- 2019 Poster: High Fidelity Video Prediction with Large Stochastic Recurrent Neural Networks
  Ruben Villegas · Arkanath Pathak · Harini Kannan · Dumitru Erhan · Quoc V Le · Honglak Lee
- 2018 Poster: Large Margin Deep Networks for Classification
  Gamaleldin Elsayed · Dilip Krishnan · Hossein Mobahi · Kevin Regan · Samy Bengio
- 2018 Poster: Memory Augmented Policy Optimization for Program Synthesis and Semantic Parsing
  Chen Liang · Mohammad Norouzi · Jonathan Berant · Quoc V Le · Ni Lao
- 2018 Spotlight: Memory Augmented Policy Optimization for Program Synthesis and Semantic Parsing
  Chen Liang · Mohammad Norouzi · Jonathan Berant · Quoc V Le · Ni Lao
- 2018 Poster: DropBlock: A regularization method for convolutional networks
  Golnaz Ghiasi · Tsung-Yi Lin · Quoc V Le
- 2018 Poster: Adversarial Examples that Fool both Computer Vision and Time-Limited Humans
  Gamaleldin Elsayed · Shreya Shankar · Brian Cheung · Nicolas Papernot · Alexey Kurakin · Ian Goodfellow · Jascha Sohl-Dickstein
- 2017 Symposium: Metalearning
  Risto Miikkulainen · Quoc V Le · Kenneth Stanley · Chrisantha Fernando
- 2016 Poster: An Online Sequence-to-Sequence Model Using Partial Conditioning
  Navdeep Jaitly · Quoc V Le · Oriol Vinyals · Ilya Sutskever · David Sussillo · Samy Bengio
- 2015 Poster: Semi-supervised Sequence Learning
  Andrew Dai · Quoc V Le
- 2014 Poster: Sequence to Sequence Learning with Neural Networks
  Ilya Sutskever · Oriol Vinyals · Quoc V Le
- 2014 Oral: Sequence to Sequence Learning with Neural Networks
  Ilya Sutskever · Oriol Vinyals · Quoc V Le
- 2013 Workshop: Randomized Methods for Machine Learning
  David Lopez-Paz · Quoc V Le · Alexander Smola