Timezone: »
Recently, transformer has achieved remarkable performance on a variety of computer vision applications. Compared with mainstream convolutional neural networks, vision transformers are often of sophisticated architectures for extracting powerful feature representations, which are more difficult to be developed on mobile devices. In this paper, we present an effective post-training quantization algorithm for reducing the memory storage and computational costs of vision transformers. Basically, the quantization task can be regarded as finding the optimal low-bit quantization intervals for weights and inputs, respectively. To preserve the functionality of the attention mechanism, we introduce a ranking loss into the conventional quantization objective that aims to keep the relative order of the self-attention results after quantization. Moreover, we thoroughly analyze the relationship between quantization loss of different layers and the feature diversity, and explore a mixed-precision quantization scheme by exploiting the nuclear norm of each attention map and output feature. The effectiveness of the proposed method is verified on several benchmark models and datasets, which outperforms the state-of-the-art post-training quantization algorithms. For instance, we can obtain an 81.29% top-1 accuracy using DeiT-B model on ImageNet dataset with about 8-bit quantization. Code will be available at https://gitee.com/mindspore/models/tree/master/research/cv/VT-PTQ.
Author Information
Zhenhua Liu (Peking University)
Yunhe Wang (Huawei Noah's Ark Lab)
Kai Han (Huawei Noah's Ark Lab)
Wei Zhang (Noah's Ark Lab, Huawei Inc.)
Siwei Ma (Peking University)
Wen Gao (peking university)
More from the Same Authors
-
2021 : One Million Scenes for Autonomous Driving: ONCE Dataset »
Jiageng Mao · Niu Minzhe · ChenHan Jiang · hanxue liang · Jingheng Chen · Xiaodan Liang · Yamin Li · Chaoqiang Ye · Wei Zhang · Zhenguo Li · Jie Yu · Hang Xu · Chunjing XU -
2021 : SODA10M: A Large-Scale 2D Self/Semi-Supervised Object Detection Dataset for Autonomous Driving »
Jianhua Han · Xiwen Liang · Hang Xu · Kai Chen · Lanqing Hong · Jiageng Mao · Chaoqiang Ye · Wei Zhang · Zhenguo Li · Xiaodan Liang · Chunjing XU -
2022 Poster: Learning Efficient Vision Transformers via Fine-Grained Manifold Distillation »
Zhiwei Hao · Jianyuan Guo · Ding Jia · Kai Han · Yehui Tang · Chao Zhang · Han Hu · Yunhe Wang -
2022 Poster: Vision GNN: An Image is Worth Graph of Nodes »
Kai Han · Yunhe Wang · Jianyuan Guo · Yehui Tang · Enhua Wu -
2022 Spotlight: BiMLP: Compact Binary Architectures for Vision Multi-Layer Perceptrons »
Yixing Xu · Xinghao Chen · Yunhe Wang -
2022 Spotlight: GhostNetV2: Enhance Cheap Operation with Long-Range Attention »
Yehui Tang · Kai Han · Jianyuan Guo · Chang Xu · Chao Xu · Yunhe Wang -
2022 Spotlight: Lightning Talks 2B-1 »
Yehui Tang · Jian Wang · Zheng Chen · man zhou · Peng Gao · Chenyang Si · SHANGKUN SUN · Yixing Xu · Weihao Yu · Xinghao Chen · Kai Han · Hu Yu · Yulun Zhang · Chenhui Gou · Teli Ma · Yuanqi Chen · Yunhe Wang · Hongsheng Li · Jinjin Gu · Jianyuan Guo · Qiman Wu · Pan Zhou · Yu Zhu · Jie Huang · Chang Xu · Yichen Zhou · Haocheng Feng · Guodong Guo · yongbing zhang · Ziyi Lin · Feng Zhao · Ge Li · Junyu Han · Jinwei Gu · Jifeng Dai · Chao Xu · Xinchao Wang · Linghe Kong · Shuicheng Yan · Yu Qiao · Chen Change Loy · Xin Yuan · Errui Ding · Yunhe Wang · Deyu Meng · Jingdong Wang · Chongyi Li -
2022 Poster: Bridge the Gap Between Architecture Spaces via A Cross-Domain Predictor »
Yuqiao Liu · Yehui Tang · Zeqiong Lv · Yunhe Wang · Yanan Sun -
2022 Poster: Redistribution of Weights and Activations for AdderNet Quantization »
Ying Nie · Kai Han · Haikang Diao · Chuanjian Liu · Enhua Wu · Yunhe Wang -
2022 Poster: GhostNetV2: Enhance Cheap Operation with Long-Range Attention »
Yehui Tang · Kai Han · Jianyuan Guo · Chang Xu · Chao Xu · Yunhe Wang -
2022 Poster: Wukong: A 100 Million Large-scale Chinese Cross-modal Pre-training Benchmark »
Jiaxi Gu · Xiaojun Meng · Guansong Lu · Lu Hou · Niu Minzhe · Xiaodan Liang · Lewei Yao · Runhui Huang · Wei Zhang · Xin Jiang · Chunjing XU · Hang Xu -
2022 Poster: Accelerating Sparse Convolution with Column Vector-Wise Sparsity »
Yijun Tan · Kai Han · Kang Zhao · Xianzhi Yu · Zidong Du · Yunji Chen · Yunhe Wang · Jun Yao -
2022 Poster: DetCLIP: Dictionary-Enriched Visual-Concept Paralleled Pre-training for Open-world Detection »
Lewei Yao · Jianhua Han · Youpeng Wen · Xiaodan Liang · Dan Xu · Wei Zhang · Zhenguo Li · Chunjing XU · Hang Xu -
2022 Poster: A Transformer-Based Object Detector with Coarse-Fine Crossing Representations »
Zhishan Li · Ying Nie · Kai Han · Jianyuan Guo · Lei Xie · Yunhe Wang -
2022 Poster: BiMLP: Compact Binary Architectures for Vision Multi-Layer Perceptrons »
Yixing Xu · Xinghao Chen · Yunhe Wang -
2022 Poster: Random Normalization Aggregation for Adversarial Defense »
Minjing Dong · Xinghao Chen · Yunhe Wang · Chang Xu -
2021 Poster: Adder Attention for Vision Transformer »
Han Shu · Jiahao Wang · Hanting Chen · Lin Li · Yujiu Yang · Yunhe Wang -
2021 Poster: Dynamic Resolution Network »
Mingjian Zhu · Kai Han · Enhua Wu · Qiulin Zhang · Ying Nie · Zhenzhong Lan · Yunhe Wang -
2021 Poster: Handling Long-tailed Feature Distribution in AdderNets »
Minjing Dong · Yunhe Wang · Xinghao Chen · Chang Xu -
2021 Poster: Towards Stable and Robust AdderNets »
Minjing Dong · Yunhe Wang · Xinghao Chen · Chang Xu -
2021 Poster: Transformer in Transformer »
Kai Han · An Xiao · Enhua Wu · Jianyuan Guo · Chunjing XU · Yunhe Wang -
2021 Poster: An Empirical Study of Adder Neural Networks for Object Detection »
Xinghao Chen · Chang Xu · Minjing Dong · Chunjing XU · Yunhe Wang -
2021 Poster: Neural Architecture Dilation for Adversarial Robustness »
Yanxi Li · Zhaohui Yang · Yunhe Wang · Chang Xu -
2021 Poster: Learning Frequency Domain Approximation for Binary Neural Networks »
Yixing Xu · Kai Han · Chang Xu · Yehui Tang · Chunjing XU · Yunhe Wang -
2021 Poster: Augmented Shortcuts for Vision Transformers »
Yehui Tang · Kai Han · Chang Xu · An Xiao · Yiping Deng · Chao Xu · Yunhe Wang -
2021 Poster: MAU: A Motion-Aware Unit for Video Prediction and Beyond »
Zheng Chang · Xinfeng Zhang · Shanshe Wang · Siwei Ma · Yan Ye · Xiang Xinguang · Wen Gao -
2021 Oral: Learning Frequency Domain Approximation for Binary Neural Networks »
Yixing Xu · Kai Han · Chang Xu · Yehui Tang · Chunjing XU · Yunhe Wang -
2020 Poster: SCOP: Scientific Control for Reliable Neural Network Pruning »
Yehui Tang · Yunhe Wang · Yixing Xu · Dacheng Tao · Chunjing XU · Chao Xu · Chang Xu -
2020 Poster: Kernel Based Progressive Distillation for Adder Neural Networks »
Yixing Xu · Chang Xu · Xinghao Chen · Wei Zhang · Chunjing XU · Yunhe Wang -
2020 Poster: Model Rubik’s Cube: Twisting Resolution, Depth and Width for TinyNets »
Kai Han · Yunhe Wang · Qiulin Zhang · Wei Zhang · Chunjing XU · Tong Zhang -
2020 Spotlight: Kernel Based Progressive Distillation for Adder Neural Networks »
Yixing Xu · Chang Xu · Xinghao Chen · Wei Zhang · Chunjing XU · Yunhe Wang -
2020 Poster: Residual Distillation: Towards Portable Deep Neural Networks without Shortcuts »
Guilin Li · Junlei Zhang · Yunhe Wang · Chuanjian Liu · Matthias Tan · Yunfeng Lin · Wei Zhang · Jiashi Feng · Tong Zhang -
2020 Poster: Searching for Low-Bit Weights in Quantized Neural Networks »
Zhaohui Yang · Yunhe Wang · Kai Han · Chunjing XU · Chao Xu · Dacheng Tao · Chang Xu -
2019 Poster: Positive-Unlabeled Compression on the Cloud »
Yixing Xu · Yunhe Wang · Hanting Chen · Kai Han · Chunjing XU · Dacheng Tao · Chang Xu -
2018 Poster: Greedy Hash: Towards Fast Optimization for Accurate Hash Coding in CNN »
Shupeng Su · Chao Zhang · Kai Han · Yonghong Tian -
2018 Poster: Frequency-Domain Dynamic Pruning for Convolutional Neural Networks »
Zhenhua Liu · Jizheng Xu · Xiulian Peng · Ruiqin Xiong -
2018 Poster: Learning Versatile Filters for Efficient Convolutional Neural Networks »
Yunhe Wang · Chang Xu · Chunjing XU · Chao Xu · Dacheng Tao -
2016 Poster: Maximal Sparsity with Deep Networks? »
Bo Xin · Yizhou Wang · Wen Gao · David Wipf · Baoyuan Wang -
2016 Poster: Deep Alternative Neural Network: Exploring Contexts as Early as Possible for Action Recognition »
Jinzhuo Wang · Wenmin Wang · xiongtao Chen · Ronggang Wang · Wen Gao -
2016 Poster: CNNpack: Packing Convolutional Neural Networks in the Frequency Domain »
Yunhe Wang · Chang Xu · Shan You · Dacheng Tao · Chao Xu