Transformers, the de-facto standard for language modeling, have recently been applied to vision tasks. This paper introduces sparse queries for vision transformers to exploit the intrinsic spatial redundancy of natural images and reduce computational cost. Specifically, we propose a Dynamic Grained Encoder for vision transformers, which adaptively assigns a suitable number of queries to each spatial region. It thus achieves a fine-grained representation in discriminative regions while remaining efficient. The Dynamic Grained Encoder is also compatible with most vision transformer frameworks. Without bells and whistles, our encoder allows state-of-the-art vision transformers to reduce computational complexity by 40%-60% while maintaining comparable performance on image classification. Extensive experiments on object detection and segmentation further demonstrate the generalizability of our approach. Code is available at https://github.com/StevenGrove/vtpack.
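The idea of assigning a different number of queries to each spatial region can be illustrated with a small NumPy sketch. It is not the implementation in the linked repository: the abstract does not say how the number of queries is chosen, so the per-region variance score and fixed thresholds below are stand-in assumptions, and the names `assign_sparse_queries`, `region`, `queries_per_side`, and `thresholds` are hypothetical.

```python
import numpy as np


def assign_sparse_queries(feat, region=4, queries_per_side=(1, 2, 4),
                          thresholds=(0.5, 1.5)):
    """Split an (H, W, C) feature map into region x region windows and
    return, for each window, a (q*q, C) array of average-pooled queries.

    ASSUMPTION: q is picked from a variance score and fixed thresholds.
    This heuristic is only a placeholder for an adaptive assignment rule.
    """
    H, W, C = feat.shape
    if H % region or W % region:
        raise ValueError("H and W must be multiples of region")
    if len(thresholds) != len(queries_per_side) - 1:
        raise ValueError("need one threshold fewer than candidate sizes")
    out = []
    for y in range(0, H, region):
        for x in range(0, W, region):
            win = feat[y:y + region, x:x + region]
            # Hypothetical complexity score: mean per-channel variance.
            score = float(win.var(axis=(0, 1)).mean())
            # Count how many thresholds the score exceeds, then look up q.
            q = queries_per_side[int(np.searchsorted(thresholds, score))]
            cell = region // q  # region must be divisible by q
            # Average-pool the window into a q x q grid of queries.
            pooled = win.reshape(q, cell, q, cell, C).mean(axis=(1, 3))
            out.append(pooled.reshape(q * q, C))
    return out


def checkerboard(amp, size=4, channels=3):
    """Toy textured window: alternating 0 / amp values in every channel."""
    board = np.indices((size, size)).sum(axis=0) % 2
    return (amp * board)[..., None] * np.ones(channels)


# Toy 8x8 map: two flat windows, one mildly and one strongly textured.
feat = np.zeros((8, 8, 3))
feat[0:4, 4:8] = checkerboard(2.0)
feat[4:8, 0:4] = checkerboard(10.0)
queries = assign_sparse_queries(feat)
print([q.shape[0] for q in queries], "queries per window")
```

On this toy input the flat windows collapse to a single query each, while the textured windows keep 4 and 16 queries, so 22 queries stand in for the 64 of a dense grid. The pooled queries would then be passed to the attention layers in place of the full token grid.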
Author Information
Lin Song (Xi'an Jiaotong University)
Songyang Zhang
Songtao Liu (Beihang University, Beijing, China)
Zeming Li (Megvii (Face++) Inc)
Xuming He (ShanghaiTech University)
Hongbin Sun (Xi'an Jiaotong University)
Jian Sun (Megvii, Face++)
Nanning Zheng (Xi'an Jiaotong University)
More from the Same Authors
- 2021 Spotlight: Spherical Motion Dynamics: Learning Dynamics of Normalized Neural Network using SGD and Weight Decay
  Ruosi Wan · Zhanxing Zhu · Xiangyu Zhang · Jian Sun
- 2022 Poster: Could Giant Pre-trained Image Models Extract Universal Representations?
  Yutong Lin · Ze Liu · Zheng Zhang · Han Hu · Nanning Zheng · Stephen Lin · Yue Cao
- 2022 Poster: Unifying Voxel-based Representation with Transformer for 3D Object Detection
  Yanwei Li · Yilun Chen · Xiaojuan Qi · Zeming Li · Jian Sun · Jiaya Jia
- 2022 Spotlight: Lightning Talks 5B-4
  Yuezhi Yang · Zeyu Yang · Yong Lin · Yishi Xu · Linan Yue · Tao Yang · Weixin Chen · Qi Liu · Jiaqi Chen · Dongsheng Wang · Baoyuan Wu · Yuwang Wang · Hao Pan · Shengyu Zhu · Zhenwei Miao · Yan Lu · Lu Tan · Bo Chen · Yichao Du · Haoqian Wang · Wei Li · Yanqing An · Ruiying Lu · Peng Cui · Nanning Zheng · Li Wang · Zhibin Duan · Xiatian Zhu · Mingyuan Zhou · Enhong Chen · Li Zhang
- 2022 Spotlight: Visual Concepts Tokenization
  Tao Yang · Yuwang Wang · Yan Lu · Nanning Zheng
- 2022 Spotlight: Lightning Talks 2A-3
  David Buterez · Chengan He · Xuan Kan · Yutong Lin · Konstantin Schürholt · Yu Yang · Louis Annabi · Wei Dai · Xiaotian Cheng · Alexandre Pitti · Ze Liu · Jon Paul Janet · Jun Saito · Boris Knyazev · Mathias Quoy · Zheng Zhang · James Zachary · Steven J Kiddle · Xavier Giro-i-Nieto · Chang Liu · Hejie Cui · Zilong Zhang · Hakan Bilen · Damian Borth · Dino Oglic · Holly Rushmeier · Han Hu · Xiangyang Ji · Yi Zhou · Nanning Zheng · Ying Guo · Pietro Liò · Stephen Lin · Carl Yang · Yue Cao
- 2022 Spotlight: Could Giant Pre-trained Image Models Extract Universal Representations?
  Yutong Lin · Ze Liu · Zheng Zhang · Han Hu · Nanning Zheng · Stephen Lin · Yue Cao
- 2022 Poster: Visual Concepts Tokenization
  Tao Yang · Yuwang Wang · Yan Lu · Nanning Zheng
- 2021 Poster: Spherical Motion Dynamics: Learning Dynamics of Normalized Neural Network using SGD and Weight Decay
  Ruosi Wan · Zhanxing Zhu · Xiangyu Zhang · Jian Sun
- 2021 Poster: Co-evolution Transformer for Protein Contact Prediction
  He Zhang · Fusong Ju · Jianwei Zhu · Liang He · Bin Shao · Nanning Zheng · Tie-Yan Liu
- 2021 Poster: Instance-Conditional Knowledge Distillation for Object Detection
  Zijian Kang · Peizhen Zhang · Xiangyu Zhang · Jian Sun · Nanning Zheng
- 2020 Poster: Compositional Generalization by Learning Analytical Expressions
  Qian Liu · Shengnan An · Jian-Guang Lou · Bei Chen · Zeqi Lin · Yan Gao · Bin Zhou · Nanning Zheng · Dongmei Zhang
- 2020 Spotlight: Compositional Generalization by Learning Analytical Expressions
  Qian Liu · Shengnan An · Jian-Guang Lou · Bei Chen · Zeqi Lin · Yan Gao · Bin Zhou · Nanning Zheng · Dongmei Zhang
- 2020 Poster: Rethinking Learnable Tree Filter for Generic Feature Transform
  Lin Song · Yanwei Li · Zhengkai Jiang · Zeming Li · Xiangyu Zhang · Hongbin Sun · Jian Sun · Nanning Zheng
- 2020 Poster: Fine-Grained Dynamic Head for Object Detection
  Lin Song · Yanwei Li · Zhengkai Jiang · Zeming Li · Hongbin Sun · Jian Sun · Nanning Zheng
- 2019 Poster: Learnable Tree Filter for Structure-preserving Feature Transform
  Lin Song · Yanwei Li · Zeming Li · Gang Yu · Hongbin Sun · Jian Sun · Nanning Zheng
- 2019 Poster: DetNAS: Backbone Search for Object Detection
  Yukang Chen · Tong Yang · Xiangyu Zhang · Gaofeng Meng · Xinyu Xiao · Jian Sun
- 2018 Poster: MetaAnchor: Learning to Detect Objects with Customized Anchors
  Tong Yang · Xiangyu Zhang · Zeming Li · Wenqiang Zhang · Jian Sun