Timezone: »
Learning to capture long-range relations is fundamental to image/video recognition. Existing CNN models generally rely on increasing depth to model such relations which is highly inefficient. In this work, we propose the “double attention block”, a novel component that aggregates and propagates informative global features from the entire spatio-temporal space of input images/videos, enabling subsequent convolution layers to access features from the entire space efficiently. The component is designed with a double attention mechanism in two steps, where the first step gathers features from the entire space into a compact set through second-order attention pooling and the second step adaptively selects and distributes features to each location via another attention. The proposed double attention block is easy to adopt and can be plugged into existing deep neural networks conveniently. We conduct extensive ablation studies and experiments on both image and video recognition tasks for evaluating its performance. On the image recognition task, a ResNet-50 equipped with our double attention blocks outperforms a much larger ResNet-152 architecture on ImageNet-1k dataset with over 40% less the number of parameters and less FLOPs. On the action recognition task, our proposed model achieves the state-of-the-art results on the Kinetics and UCF-101 datasets with significantly higher efficiency than recent works.
Author Information
Yunpeng Chen (National University of Singapore)
Yannis Kalantidis (Facebook)
Jianshu Li (National University of Singapore)
Shuicheng Yan (National University of Singapore)
Jiashi Feng (National University of Singapore)
More from the Same Authors
-
2021 Workshop: Distribution shifts: connecting methods and applications (DistShift) »
Shiori Sagawa · Pang Wei Koh · Fanny Yang · Hongseok Namkoong · Jiashi Feng · Kate Saenko · Percy Liang · Sarah Bird · Sergey Levine -
2021 Poster: Towards Understanding Why Lookahead Generalizes Better Than SGD and Beyond »
Pan Zhou · Hanshu Yan · Xiaotong Yuan · Jiashi Feng · Shuicheng Yan -
2021 Poster: How Should Pre-Trained Language Models Be Fine-Tuned Towards Adversarial Robustness? »
Xinshuai Dong · Anh Tuan Luu · Min Lin · Shuicheng Yan · Hanwang Zhang -
2021 Poster: Direct Multi-view Multi-person 3D Pose Estimation »
tao wang · Jianfeng Zhang · Yujun Cai · Shuicheng Yan · Jiashi Feng -
2020 Poster: Towards Theoretically Understanding Why Sgd Generalizes Better Than Adam in Deep Learning »
Pan Zhou · Jiashi Feng · Chao Ma · Caiming Xiong · Steven Chu Hong Hoi · Weinan E -
2020 Poster: Residual Distillation: Towards Portable Deep Neural Networks without Shortcuts »
Guilin Li · Junlei Zhang · Yunhe Wang · Chuanjian Liu · Matthias Tan · Yunfeng Lin · Wei Zhang · Jiashi Feng · Tong Zhang -
2020 Poster: Improving Generalization in Reinforcement Learning with Mixture Regularization »
KAIXIN WANG · Bingyi Kang · Jie Shao · Jiashi Feng -
2020 Poster: Inference Stage Optimization for Cross-scenario 3D Human Pose Estimation »
Jianfeng Zhang · Xuecheng Nie · Jiashi Feng -
2020 Poster: ConvBERT: Improving BERT with Span-based Dynamic Convolution »
Zi-Hang Jiang · Weihao Yu · Daquan Zhou · Yunpeng Chen · Jiashi Feng · Shuicheng Yan -
2020 Poster: Hard Negative Mixing for Contrastive Learning »
Yannis Kalantidis · Mert Bulent Sariyildiz · Noe Pion · Philippe Weinzaepfel · Diane Larlus -
2020 Spotlight: ConvBERT: Improving BERT with Span-based Dynamic Convolution »
Zi-Hang Jiang · Weihao Yu · Daquan Zhou · Yunpeng Chen · Jiashi Feng · Shuicheng Yan -
2019 Poster: Efficient Meta Learning via Minibatch Proximal Update »
Pan Zhou · Xiaotong Yuan · Huan Xu · Shuicheng Yan · Jiashi Feng -
2019 Spotlight: Efficient Meta Learning via Minibatch Proximal Update »
Pan Zhou · Xiaotong Yuan · Huan Xu · Shuicheng Yan · Jiashi Feng -
2018 Poster: New Insight into Hybrid Stochastic Gradient Descent: Beyond With-Replacement Sampling and Convexity »
Pan Zhou · Xiaotong Yuan · Jiashi Feng -
2018 Poster: Efficient Stochastic Gradient Hard Thresholding »
Pan Zhou · Xiaotong Yuan · Jiashi Feng -
2017 Poster: Dual Path Networks »
Yunpeng Chen · Jianan Li · Huaxin Xiao · Xiaojie Jin · Shuicheng Yan · Jiashi Feng -
2017 Spotlight: Dual Path Networks »
Yunpeng Chen · Jianan Li · Huaxin Xiao · Xiaojie Jin · Shuicheng Yan · Jiashi Feng -
2017 Poster: Multimodal Learning and Reasoning for Visual Question Answering »
Ilija Ilievski · Jiashi Feng -
2017 Poster: Predicting Scene Parsing and Motion Dynamics in the Future »
Xiaojie Jin · Huaxin Xiao · Xiaohui Shen · Jimei Yang · Zhe Lin · Yunpeng Chen · Zequn Jie · Jiashi Feng · Shuicheng Yan -
2017 Poster: Dual-Agent GANs for Photorealistic and Identity Preserving Profile Face Synthesis »
Jian Zhao · Lin Xiong · Panasonic Karlekar Jayashree · Jianshu Li · Fang Zhao · Zhecan Wang · Panasonic Sugiri Pranata · Panasonic Shengmei Shen · Shuicheng Yan · Jiashi Feng -
2016 Poster: Tree-Structured Reinforcement Learning for Sequential Object Localization »
Zequn Jie · Xiaodan Liang · Jiashi Feng · Xiaojie Jin · Wen Lu · Shuicheng Yan -
2014 Poster: Robust Logistic Regression and Classification »
Jiashi Feng · Huan Xu · Shie Mannor · Shuicheng Yan -
2014 Poster: Convex Optimization Procedure for Clustering: Theoretical Revisit »
Changbo Zhu · Huan Xu · Chenlei Leng · Shuicheng Yan -
2014 Poster: On a Theory of Nonparametric Pairwise Similarity for Clustering: Connecting Clustering to Classification »
Yingzhen Yang · Feng Liang · Shuicheng Yan · Zhangyang Wang · Thomas S Huang -
2013 Poster: Online Robust PCA via Stochastic Optimization »
Jiashi Feng · Huan Xu · Shuicheng Yan -
2013 Poster: Online PCA for Contaminated Data »
Jiashi Feng · Huan Xu · Shie Mannor · Shuicheng Yan