Vision Transformers (ViTs) have triggered the most recent and significant breakthroughs in computer vision. Their efficient designs are mostly guided by an indirect metric of computational complexity, i.e., FLOPs, which, however, has a clear gap with direct metrics such as throughput. Thus, we propose using direct speed evaluation on the target platform as the design principle for efficient ViTs. In particular, we introduce LITv2, a simple and effective ViT that performs favourably against existing state-of-the-art methods across a spectrum of model sizes while running faster. At the core of LITv2 is a novel self-attention mechanism, which we dub HiLo. HiLo is inspired by the insight that high frequencies in an image capture local fine details and low frequencies focus on global structures, whereas a multi-head self-attention layer neglects the characteristics of different frequencies. Therefore, we propose to disentangle the high- and low-frequency patterns in an attention layer by separating the heads into two groups: one group encodes high frequencies via self-attention within each local window, and the other group models global relationships by attending from each query position in the input feature map to the average-pooled low-frequency keys from each window. Benefiting from the efficient design of both groups, we show that HiLo is superior to existing attention mechanisms by comprehensively benchmarking FLOPs, speed and memory consumption on GPUs and CPUs. For example, HiLo is 1.4× faster than spatial reduction attention and 1.6× faster than local window attention on CPUs. Powered by HiLo, LITv2 serves as a strong backbone for mainstream vision tasks including image classification, dense detection and segmentation. Code is available at https://github.com/ziplab/LITv2.
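The two-group head split described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the linked repository for that). The function names (`hilo_attention`, `init_params`), the split ratio `alpha`, the window size `window`, and the random projection matrices are all assumptions made for illustration. The sketch assumes that the feature map height and width are divisible by the window size and that the channel count is divisible by the number of heads.

```python
import numpy as np

def softmax(a):
    a = a - a.max(axis=-1, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=-1, keepdims=True)

def attend(q, k, v):
    # Scaled dot-product attention over the last two axes.
    scale = q.shape[-1] ** -0.5
    return softmax(q @ np.swapaxes(k, -1, -2) * scale) @ v

def split_heads(t, heads):
    # (..., N, heads*d) -> (..., heads, N, d)
    *lead, n, c = t.shape
    return np.moveaxis(t.reshape(*lead, n, heads, c // heads), -2, -3)

def merge_heads(t):
    # (..., heads, N, d) -> (..., N, heads*d)
    t = np.moveaxis(t, -3, -2)
    *lead, n, h, d = t.shape
    return t.reshape(*lead, n, h * d)

def to_windows(x, s):
    # (H, W, C) -> (num_windows, s*s, C), non-overlapping s x s windows
    H, W, C = x.shape
    x = x.reshape(H // s, s, W // s, s, C).transpose(0, 2, 1, 3, 4)
    return x.reshape(-1, s * s, C)

def from_windows(w, H, W, s):
    # Inverse of to_windows.
    C = w.shape[-1]
    w = w.reshape(H // s, W // s, s, s, C).transpose(0, 2, 1, 3, 4)
    return w.reshape(H, W, C)

def init_params(C, num_heads, alpha, rng):
    # Hypothetical random projections; a trained model would learn these.
    head_dim = C // num_heads
    lo_dim = int(num_heads * alpha) * head_dim
    hi_dim = C - lo_dim
    dims = {"hi_q": hi_dim, "hi_k": hi_dim, "hi_v": hi_dim,
            "lo_q": lo_dim, "lo_k": lo_dim, "lo_v": lo_dim}
    return {n: rng.standard_normal((C, d)) / np.sqrt(C) for n, d in dims.items()}

def hilo_attention(x, params, num_heads=4, alpha=0.5, window=2):
    H, W, C = x.shape
    lo_heads = int(num_heads * alpha)   # assumed split rule
    hi_heads = num_heads - lo_heads
    outs = []
    if hi_heads > 0:
        # High-frequency group: self-attention inside each local window.
        xw = to_windows(x, window)
        q, k, v = (split_heads(xw @ params[n], hi_heads)
                   for n in ("hi_q", "hi_k", "hi_v"))
        outs.append(from_windows(merge_heads(attend(q, k, v)), H, W, window))
    if lo_heads > 0:
        # Low-frequency group: every position queries average-pooled windows.
        pooled = to_windows(x, window).mean(axis=1)   # (num_windows, C)
        q = split_heads(x.reshape(H * W, C) @ params["lo_q"], lo_heads)
        k = split_heads(pooled @ params["lo_k"], lo_heads)
        v = split_heads(pooled @ params["lo_v"], lo_heads)
        outs.append(merge_heads(attend(q, k, v)).reshape(H, W, -1))
    # Concatenate the two groups along the channel axis.
    return np.concatenate(outs, axis=-1)

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 4, 8))
params = init_params(8, num_heads=4, alpha=0.5, rng=rng)
out = hilo_attention(x, params, num_heads=4, alpha=0.5, window=2)
print(out.shape)  # (4, 4, 8)
```

In this sketch the high-frequency group computes one small attention matrix per window, while the low-frequency group compares every query against only one pooled key per window. That is why both groups avoid the full quadratic cost of global attention over all positions.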
Author Information
Zizheng Pan (Monash University)
Jianfei Cai (Monash University)
Bohan Zhuang (Monash University)

Dr. Bohan Zhuang is a tenure-track assistant professor at the Faculty of Information Technology, Monash University, Australia. Previously, he was a Senior Research Fellow at the University of Adelaide, where he also completed his PhD in Computer Science, advised by Prof. Ian Reid and Prof. Chunhua Shen. During his undergraduate studies, he was fortunate to be supervised by Prof. Huchuan Lu. His main research topic is efficient deep learning computing. He has also contributed to a wide range of applications in machine learning and computer vision. Outside academia, he is a music enthusiast and has been playing the piano since he was four years old.
Related Events (a corresponding poster, oral, or spotlight)
- 2022 Poster: Fast Vision Transformers with HiLo Attention
  Thu. Dec 1st 05:00 -- 07:00 PM Room Hall J #423
More from the Same Authors
- 2022 Poster: MoVQ: Modulating Quantized Vectors for High-Fidelity Image Generation
  Chuanxia Zheng · Tung-Long Vuong · Jianfei Cai · Dinh Phung
- 2022 Spotlight: Lightning Talks 6B-4
  Junjie Chen · Chuanxia Zheng · JINLONG LI · Yu Shi · Shichao Kan · Yu Wang · Fermín Travi · Ninh Pham · Lei Chai · Guobing Gan · Tung-Long Vuong · Gonzalo Ruarte · Tao Liu · Li Niu · Jingjing Zou · Zequn Jie · Peng Zhang · Ming LI · Yixiong Liang · Guolin Ke · Jianfei Cai · Gaston Bujia · Sunzhu Li · Siyuan Zhou · Jingyang Lin · Xu Wang · Min Li · Zhuoming Chen · Qing Ling · Xiaolin Wei · Xiuqing Lu · Shuxin Zheng · Dinh Phung · Yigang Cen · Jianlou Si · Juan Esteban Kamienkowski · Jianxin Wang · Chen Qian · Lin Ma · Benyou Wang · Yingwei Pan · Tie-Yan Liu · Liqing Zhang · Zhihai He · Ting Yao · Tao Mei
- 2022 Spotlight: Lightning Talks 6B-3
  Lingfeng Yang · Yao Lai · Zizheng Pan · Zhenyu Wang · Weicong Liang · Chuanyang Zheng · Jian-Wei Zhang · Peng Jin · Jing Liu · Xiuying Wei · Yao Mu · Xiang Li · YUHUI YUAN · Zizheng Pan · Yifan Sun · Yunchen Zhang · Jianfei Cai · Hao Luo · zheyang li · Jinfa Huang · Haoyu He · Yi Yang · Ping Luo · Fenglin Liu · Henghui Ding · Borui Zhao · Xiangguo Zhang · Kai Zhang · Pichao WANG · Bohan Zhuang · Wei Chen · Ruihao Gong · Zhi Yang · Xian Wu · Feng Ding · Jianfei Cai · Xiao Luo · Renjie Song · Weihong Lin · Jian Yang · Wenming Tan · Bohan Zhuang · Shanghang Zhang · Shen Ge · Fan Wang · Qi Zhang · Guoli Song · Jun Xiao · Hao Li · Ding Jia · David Clifton · Ye Ren · Fengwei Yu · Zheng Zhang · Jie Chen · Shiliang Pu · Xianglong Liu · Chao Zhang · Han Hu
- 2022 Spotlight: EcoFormer: Energy-Saving Attention with Linear Complexity
  Jing Liu · Zizheng Pan · Haoyu He · Jianfei Cai · Bohan Zhuang
- 2022 Spotlight: MoVQ: Modulating Quantized Vectors for High-Fidelity Image Generation
  Chuanxia Zheng · Tung-Long Vuong · Jianfei Cai · Dinh Phung
- 2022 Poster: EcoFormer: Energy-Saving Attention with Linear Complexity
  Jing Liu · Zizheng Pan · Haoyu He · Jianfei Cai · Bohan Zhuang