
Adder Attention for Vision Transformer
Han Shu · Jiahao Wang · Hanting Chen · Lin Li · Yujiu Yang · Yunhe Wang

Thu Dec 09 12:30 AM -- 02:00 AM (PST)

The transformer is a new computation paradigm for deep learning that has shown strong performance on a wide variety of computer vision tasks. However, compared with conventional deep models (e.g., convolutional neural networks), vision transformers require more computational resources and cannot be easily deployed on mobile devices. To this end, we propose to reduce energy consumption using adder neural networks (AdderNet). We first theoretically analyze the mechanism of self-attention and the difficulty of applying the adder operation to this module. Specifically, the feature diversity, i.e., the rank of the attention map, cannot be well preserved using only additions. We therefore develop an adder attention layer that includes an additional identity mapping. With the new operation, vision transformers constructed using additions can also provide powerful feature representations. Experimental results on several benchmarks demonstrate that the proposed approach achieves performance highly competitive with that of the baselines while reducing energy consumption by approximately 2--3×.
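A minimal sketch of the idea described above, with assumptions not spelled out in the abstract: following AdderNet, the query-key dot product is replaced here by a negative L1 distance (an addition-only similarity), and an identity mapping on the values is added to preserve the rank, i.e., the feature diversity, of the output. The exact formulation in the paper may differ.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def adder_attention(q, k, v):
    """Sketch of an adder-style attention layer (assumed formulation).

    sim[i, j] = -sum_d |q[i, d] - k[j, d]|   (addition-only similarity)
    output    = softmax(sim / sqrt(d)) @ v + v   (identity mapping on v)
    """
    # Negative L1 distance between every query/key pair via broadcasting.
    sim = -np.abs(q[:, None, :] - k[None, :, :]).sum(axis=-1)
    attn = softmax(sim / np.sqrt(q.shape[-1]), axis=-1)
    # The added identity mapping keeps the output from collapsing in rank.
    return attn @ v + v
```

The identity term mirrors the abstract's point that a purely addition-based attention map loses feature diversity; the residual path restores it.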

Author Information

Han Shu (Huawei Noah's Ark Lab)
Jiahao Wang
Hanting Chen (Peking University)
Lin Li (Huawei Technologies Co., Ltd.)
Yujiu Yang (Tsinghua University)
Yunhe Wang (Huawei Noah's Ark Lab)
