Timezone: »
Poster
Transformer in Transformer
Kai Han · An Xiao · Enhua Wu · Jianyuan Guo · Chunjing XU · Yunhe Wang
Transformer is a new kind of neural architecture which encodes the input data as powerful features via the attention mechanism. Basically, the visual transformers first divide the input images into several local patches and then calculate both representations and their relationship. Since natural images are of high complexity with abundant detail and color information, the granularity of the patch dividing is not fine enough for excavating features of objects in different scales and locations. In this paper, we point out that the attention inside these local patches are also essential for building visual transformers with high performance and we explore a new architecture, namely, Transformer iN Transformer (TNT). Specifically, we regard the local patches (\eg, 16$\times$16) as “visual sentences” and present to further divide them into smaller patches (\eg, 4$\times$4) as “visual words”. The attention of each word will be calculated with other words in the given visual sentence with negligible computational costs. Features of both words and sentences will be aggregated to enhance the representation ability. Experiments on several benchmarks demonstrate the effectiveness of the proposed TNT architecture, \eg, we achieve an 81.5\% top-1 accuracy on the ImageNet, which is about 1.7\% higher than that of the state-of-the-art visual transformer with similar computational cost. The PyTorch code is available at \url{https://github.com/huawei-noah/CV-Backbones}, and the MindSpore code is available at \url{https://gitee.com/mindspore/models/tree/master/research/cv/TNT}.
Author Information
Kai Han (Huawei Noah's Ark Lab)
An Xiao (Tsinghua University, Tsinghua University)
Enhua Wu (University of Macau)
Jianyuan Guo (Huawei Technologies Ltd.)
Chunjing XU (Huawei Technologies)
Yunhe Wang (Huawei Noah's Ark Lab)
More from the Same Authors
-
2021 : One Million Scenes for Autonomous Driving: ONCE Dataset »
Jiageng Mao · Niu Minzhe · ChenHan Jiang · hanxue liang · Jingheng Chen · Xiaodan Liang · Yamin Li · Chaoqiang Ye · Wei Zhang · Zhenguo Li · Jie Yu · Hang Xu · Chunjing XU -
2021 Spotlight: SOFT: Softmax-free Transformer with Linear Complexity »
Jiachen Lu · Jinghan Yao · Junge Zhang · Xiatian Zhu · Hang Xu · Weiguo Gao · Chunjing XU · Tao Xiang · Li Zhang -
2021 : SODA10M: A Large-Scale 2D Self/Semi-Supervised Object Detection Dataset for Autonomous Driving »
Jianhua Han · Xiwen Liang · Hang Xu · Kai Chen · Lanqing Hong · Jiageng Mao · Chaoqiang Ye · Wei Zhang · Zhenguo Li · Xiaodan Liang · Chunjing XU -
2022 Poster: Learning Efficient Vision Transformers via Fine-Grained Manifold Distillation »
Zhiwei Hao · Jianyuan Guo · Ding Jia · Kai Han · Yehui Tang · Chao Zhang · Han Hu · Yunhe Wang -
2022 Poster: Vision GNN: An Image is Worth Graph of Nodes »
Kai Han · Yunhe Wang · Jianyuan Guo · Yehui Tang · Enhua Wu -
2022 Spotlight: BiMLP: Compact Binary Architectures for Vision Multi-Layer Perceptrons »
Yixing Xu · Xinghao Chen · Yunhe Wang -
2022 Spotlight: GhostNetV2: Enhance Cheap Operation with Long-Range Attention »
Yehui Tang · Kai Han · Jianyuan Guo · Chang Xu · Chao Xu · Yunhe Wang -
2022 Spotlight: Lightning Talks 2B-1 »
Yehui Tang · Jian Wang · Zheng Chen · man zhou · Peng Gao · Chenyang Si · SHANGKUN SUN · Yixing Xu · Weihao Yu · Xinghao Chen · Kai Han · Hu Yu · Yulun Zhang · Chenhui Gou · Teli Ma · Yuanqi Chen · Yunhe Wang · Hongsheng Li · Jinjin Gu · Jianyuan Guo · Qiman Wu · Pan Zhou · Yu Zhu · Jie Huang · Chang Xu · Yichen Zhou · Haocheng Feng · Guodong Guo · yongbing zhang · Ziyi Lin · Feng Zhao · Ge Li · Junyu Han · Jinwei Gu · Jifeng Dai · Chao Xu · Xinchao Wang · Linghe Kong · Shuicheng Yan · Yu Qiao · Chen Change Loy · Xin Yuan · Errui Ding · Yunhe Wang · Deyu Meng · Jingdong Wang · Chongyi Li -
2022 Poster: Bridge the Gap Between Architecture Spaces via A Cross-Domain Predictor »
Yuqiao Liu · Yehui Tang · Zeqiong Lv · Yunhe Wang · Yanan Sun -
2022 Poster: Redistribution of Weights and Activations for AdderNet Quantization »
Ying Nie · Kai Han · Haikang Diao · Chuanjian Liu · Enhua Wu · Yunhe Wang -
2022 Poster: GhostNetV2: Enhance Cheap Operation with Long-Range Attention »
Yehui Tang · Kai Han · Jianyuan Guo · Chang Xu · Chao Xu · Yunhe Wang -
2022 Poster: Wukong: A 100 Million Large-scale Chinese Cross-modal Pre-training Benchmark »
Jiaxi Gu · Xiaojun Meng · Guansong Lu · Lu Hou · Niu Minzhe · Xiaodan Liang · Lewei Yao · Runhui Huang · Wei Zhang · Xin Jiang · Chunjing XU · Hang Xu -
2022 Poster: Accelerating Sparse Convolution with Column Vector-Wise Sparsity »
Yijun Tan · Kai Han · Kang Zhao · Xianzhi Yu · Zidong Du · Yunji Chen · Yunhe Wang · Jun Yao -
2022 Poster: DetCLIP: Dictionary-Enriched Visual-Concept Paralleled Pre-training for Open-world Detection »
Lewei Yao · Jianhua Han · Youpeng Wen · Xiaodan Liang · Dan Xu · Wei Zhang · Zhenguo Li · Chunjing XU · Hang Xu -
2022 Poster: A Transformer-Based Object Detector with Coarse-Fine Crossing Representations »
Zhishan Li · Ying Nie · Kai Han · Jianyuan Guo · Lei Xie · Yunhe Wang -
2022 Poster: BiMLP: Compact Binary Architectures for Vision Multi-Layer Perceptrons »
Yixing Xu · Xinghao Chen · Yunhe Wang -
2022 Poster: Effective Adaptation in Multi-Task Co-Training for Unified Autonomous Driving »
Xiwen Liang · Yangxin Wu · Jianhua Han · Hang Xu · Chunjing XU · Xiaodan Liang -
2022 Poster: Random Normalization Aggregation for Adversarial Defense »
Minjing Dong · Xinghao Chen · Yunhe Wang · Chang Xu -
2021 Poster: SOFT: Softmax-free Transformer with Linear Complexity »
Jiachen Lu · Jinghan Yao · Junge Zhang · Xiatian Zhu · Hang Xu · Weiguo Gao · Chunjing XU · Tao Xiang · Li Zhang -
2021 Poster: Adder Attention for Vision Transformer »
Han Shu · Jiahao Wang · Hanting Chen · Lin Li · Yujiu Yang · Yunhe Wang -
2021 Poster: Dynamic Resolution Network »
Mingjian Zhu · Kai Han · Enhua Wu · Qiulin Zhang · Ying Nie · Zhenzhong Lan · Yunhe Wang -
2021 Poster: Post-Training Quantization for Vision Transformer »
Zhenhua Liu · Yunhe Wang · Kai Han · Wei Zhang · Siwei Ma · Wen Gao -
2021 Poster: Handling Long-tailed Feature Distribution in AdderNets »
Minjing Dong · Yunhe Wang · Xinghao Chen · Chang Xu -
2021 Poster: Towards Stable and Robust AdderNets »
Minjing Dong · Yunhe Wang · Xinghao Chen · Chang Xu -
2021 Poster: An Empirical Study of Adder Neural Networks for Object Detection »
Xinghao Chen · Chang Xu · Minjing Dong · Chunjing XU · Yunhe Wang -
2021 Poster: Neural Architecture Dilation for Adversarial Robustness »
Yanxi Li · Zhaohui Yang · Yunhe Wang · Chang Xu -
2021 Poster: Learning Frequency Domain Approximation for Binary Neural Networks »
Yixing Xu · Kai Han · Chang Xu · Yehui Tang · Chunjing XU · Yunhe Wang -
2021 Poster: Augmented Shortcuts for Vision Transformers »
Yehui Tang · Kai Han · Chang Xu · An Xiao · Yiping Deng · Chao Xu · Yunhe Wang -
2021 Poster: S$^3$: Sign-Sparse-Shift Reparametrization for Effective Training of Low-bit Shift Networks »
Xinlin Li · Bang Liu · Yaoliang Yu · Wulong Liu · Chunjing XU · Vahid Partovi Nia -
2021 Oral: Learning Frequency Domain Approximation for Binary Neural Networks »
Yixing Xu · Kai Han · Chang Xu · Yehui Tang · Chunjing XU · Yunhe Wang -
2020 Poster: SCOP: Scientific Control for Reliable Neural Network Pruning »
Yehui Tang · Yunhe Wang · Yixing Xu · Dacheng Tao · Chunjing XU · Chao Xu · Chang Xu -
2020 Poster: Kernel Based Progressive Distillation for Adder Neural Networks »
Yixing Xu · Chang Xu · Xinghao Chen · Wei Zhang · Chunjing XU · Yunhe Wang -
2020 Poster: Model Rubik’s Cube: Twisting Resolution, Depth and Width for TinyNets »
Kai Han · Yunhe Wang · Qiulin Zhang · Wei Zhang · Chunjing XU · Tong Zhang -
2020 Spotlight: Kernel Based Progressive Distillation for Adder Neural Networks »
Yixing Xu · Chang Xu · Xinghao Chen · Wei Zhang · Chunjing XU · Yunhe Wang -
2020 Poster: Residual Distillation: Towards Portable Deep Neural Networks without Shortcuts »
Guilin Li · Junlei Zhang · Yunhe Wang · Chuanjian Liu · Matthias Tan · Yunfeng Lin · Wei Zhang · Jiashi Feng · Tong Zhang -
2020 Poster: Searching for Low-Bit Weights in Quantized Neural Networks »
Zhaohui Yang · Yunhe Wang · Kai Han · Chunjing XU · Chao Xu · Dacheng Tao · Chang Xu -
2019 Poster: Positive-Unlabeled Compression on the Cloud »
Yixing Xu · Yunhe Wang · Hanting Chen · Kai Han · Chunjing XU · Dacheng Tao · Chang Xu -
2018 Poster: Greedy Hash: Towards Fast Optimization for Accurate Hash Coding in CNN »
Shupeng Su · Chao Zhang · Kai Han · Yonghong Tian -
2018 Poster: Learning Versatile Filters for Efficient Convolutional Neural Networks »
Yunhe Wang · Chang Xu · Chunjing XU · Chao Xu · Dacheng Tao -
2016 Poster: CNNpack: Packing Convolutional Neural Networks in the Frequency Domain »
Yunhe Wang · Chang Xu · Shan You · Dacheng Tao · Chao Xu