Although no specific domain knowledge is considered in their design, plain vision transformers have shown excellent performance in visual recognition tasks. However, little effort has been made to reveal the potential of such simple structures for pose estimation. In this paper, we show the surprisingly good capabilities of plain vision transformers for pose estimation from four aspects: simplicity of model structure, scalability of model size, flexibility of training paradigm, and transferability of knowledge between models, through a simple baseline model called ViTPose. Specifically, ViTPose employs plain, non-hierarchical vision transformers as backbones to extract features for a given person instance, and a lightweight decoder for pose estimation. It can be scaled up from 100M to 1B parameters by taking advantage of the scalable model capacity and high parallelism of transformers, setting a new Pareto front between throughput and performance. In addition, ViTPose is very flexible with respect to attention type, input resolution, pre-training and fine-tuning strategy, and the handling of multiple pose tasks. We also empirically demonstrate that the knowledge of large ViTPose models can be easily transferred to small ones via a simple knowledge token. Experimental results show that our basic ViTPose model outperforms representative methods on the challenging MS COCO Keypoint Detection benchmark, while the largest model sets a new state of the art. The code and models are available at https://github.com/ViTAE-Transformer/ViTPose.
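To make the "non-hierarchical backbone plus lightweight decoder" idea more concrete, the following is a minimal pure-Python sketch, not the authors' implementation. It assumes two things the abstract does not state: that the backbone splits the input into fixed-size square patches, and that the decoder outputs one heatmap per keypoint whose peak marks the predicted location. The function names `patch_grid` and `decode_keypoints`, and the sizes used in the usage lines, are hypothetical and chosen only for illustration.

```python
def patch_grid(height, width, patch_size):
    """Token grid (rows, cols) of a non-hierarchical backbone.

    Assumption: the image is cut into non-overlapping square patches and
    the grid keeps the same resolution through every layer.
    """
    if height % patch_size or width % patch_size:
        raise ValueError("input size must be divisible by the patch size")
    return height // patch_size, width // patch_size


def decode_keypoints(heatmaps):
    """Return (x, y, score) of the peak of each per-keypoint heatmap.

    Assumption: `heatmaps` is a list of 2D lists, one per keypoint.
    """
    keypoints = []
    for heatmap in heatmaps:
        best = (0, 0, float("-inf"))
        for y, row in enumerate(heatmap):
            for x, value in enumerate(row):
                if value > best[2]:
                    best = (x, y, value)
        keypoints.append(best)
    return keypoints


# Illustrative sizes only: a 256x192 crop with 16x16 patches.
grid = patch_grid(256, 192, 16)

# A toy 2x2 heatmap for a single keypoint; the peak is at x=1, y=0.
peaks = decode_keypoints([[[0.1, 0.9], [0.2, 0.3]]])
```

Under these assumptions the token grid has the same shape at every layer, so a decoder only needs to upsample that single grid into heatmaps and read off each peak.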
Author Information
Yufei Xu (The University of Sydney)
Jing Zhang (The University of Sydney)
Qiming ZHANG (University of Sydney)
Dacheng Tao (University of Technology, Sydney)
More from the Same Authors
- 2021: AP-10K: A Benchmark for Animal Pose Estimation in the Wild
  Hang Yu · Yufei Xu · Jing Zhang · Wei Zhao · Ziyu Guan · Dacheng Tao
- 2022 Poster: Make Sharpness-Aware Minimization Stronger: A Sparsified Perturbation Approach
  Peng Mi · Li Shen · Tianhe Ren · Yiyi Zhou · Xiaoshuai Sun · Rongrong Ji · Dacheng Tao
- 2022 Poster: Exploring Figure-Ground Assignment Mechanism in Perceptual Organization
  Wei Zhai · Yang Cao · Jing Zhang · Zheng-Jun Zha
- 2022 Poster: APT-36K: A Large-scale Benchmark for Animal Pose Estimation and Tracking
  Yuxiang Yang · Junjie Yang · Yufei Xu · Jing Zhang · Long Lan · Dacheng Tao
- 2022 Spotlight: Lightning Talks 5B-3
  Yanze Wu · Jie Xiao · Nianzu Yang · Jieyi Bi · Jian Yao · Yiting Chen · Qizhou Wang · Yangru Huang · Yongqiang Chen · Peixi Peng · Yuxin Hong · Xintao Wang · Feng Liu · Yining Ma · Qibing Ren · Xueyang Fu · Yonggang Zhang · Kaipeng Zeng · Jiahai Wang · GEN LI · Yonggang Zhang · Qitian Wu · Yifan Zhao · Chiyu Wang · Junchi Yan · Feng Wu · Yatao Bian · Xiaosong Jia · Ying Shan · Zhiguang Cao · Zheng-Jun Zha · Guangyao Chen · Tianjun Xiao · Han Yang · Jing Zhang · Jinbiao Chen · MA Kaili · Yonghong Tian · Junchi Yan · Chen Gong · Tong He · Binghui Xie · Yuan Sun · Francesco Locatello · Tongliang Liu · Yeow Meng Chee · David P Wipf · Tongliang Liu · Bo Han · Bo Han · Yanwei Fu · James Cheng · Zheng Zhang
- 2022 Spotlight: Escaping from the Barren Plateau via Gaussian Initializations in Deep Variational Quantum Circuits
  Kaining Zhang · Liu Liu · Min-Hsiu Hsieh · Dacheng Tao
- 2022 Spotlight: Watermarking for Out-of-distribution Detection
  Qizhou Wang · Feng Liu · Yonggang Zhang · Jing Zhang · Chen Gong · Tongliang Liu · Bo Han
- 2022 Spotlight: Lightning Talks 4B-4
  Ziyue Jiang · Zeeshan Khan · Yuxiang Yang · Chenze Shao · Yichong Leng · Zehao Yu · Wenguan Wang · Xian Liu · Zehua Chen · Yang Feng · Qianyi Wu · James Liang · C.V. Jawahar · Junjie Yang · Zhe Su · Songyou Peng · Yufei Xu · Junliang Guo · Michael Niemeyer · Hang Zhou · Zhou Zhao · Makarand Tapaswi · Dongfang Liu · Qian Yang · Torsten Sattler · Yuanqi Du · Haohe Liu · Jing Zhang · Andreas Geiger · Yi Ren · Long Lan · Jiawei Chen · Wayne Wu · Dahua Lin · Dacheng Tao · Xu Tan · Jinglin Liu · Ziwei Liu · 振辉 叶 · Danilo Mandic · Lei He · Xiangyang Li · Tao Qin · sheng zhao · Tie-Yan Liu
- 2022 Spotlight: APT-36K: A Large-scale Benchmark for Animal Pose Estimation and Tracking
  Yuxiang Yang · Junjie Yang · Yufei Xu · Jing Zhang · Long Lan · Dacheng Tao
- 2022 Spotlight: Adversarial Auto-Augment with Label Preservation: A Representation Learning Principle Guided Approach
  Kaiwen Yang · Yanchao Sun · Jiahao Su · Fengxiang He · Xinmei Tian · Furong Huang · Tianyi Zhou · Dacheng Tao
- 2022 Poster: Inducing Neural Collapse in Imbalanced Learning: Do We Really Need a Learnable Classifier at the End of Deep Neural Network?
  Yibo Yang · Shixiang Chen · Xiangtai Li · Liang Xie · Zhouchen Lin · Dacheng Tao
- 2022 Poster: CGLB: Benchmark Tasks for Continual Graph Learning
  Xikun Zhang · Dongjin Song · Dacheng Tao
- 2022 Poster: Watermarking for Out-of-distribution Detection
  Qizhou Wang · Feng Liu · Yonggang Zhang · Jing Zhang · Chen Gong · Tongliang Liu · Bo Han
- 2022 Poster: Escaping from the Barren Plateau via Gaussian Initializations in Deep Variational Quantum Circuits
  Kaining Zhang · Liu Liu · Min-Hsiu Hsieh · Dacheng Tao
- 2022 Poster: Benefits of Permutation-Equivariance in Auction Mechanisms
  Tian Qin · Fengxiang He · Dingfeng Shi · Wenbing Huang · Dacheng Tao
- 2022 Poster: Adversarial Auto-Augment with Label Preservation: A Representation Learning Principle Guided Approach
  Kaiwen Yang · Yanchao Sun · Jiahao Su · Fengxiang He · Xinmei Tian · Furong Huang · Tianyi Zhou · Dacheng Tao
- 2021 Poster: Class-Disentanglement and Applications in Adversarial Detection and Defense
  Kaiwen Yang · Tianyi Zhou · Yonggang Zhang · Xinmei Tian · Dacheng Tao
- 2021 Poster: Gauge Equivariant Transformer
  Lingshen He · Yiming Dong · Yisen Wang · Dacheng Tao · Zhouchen Lin
- 2021 Poster: ViTAE: Vision Transformer Advanced by Exploring Intrinsic Inductive Bias
  Yufei Xu · Qiming ZHANG · Jing Zhang · Dacheng Tao
- 2019 Poster: Category Anchor-Guided Unsupervised Domain Adaptation for Semantic Segmentation
  Qiming ZHANG · Jing Zhang · Wei Liu · Dacheng Tao
- 2018 Poster: Learning Versatile Filters for Efficient Convolutional Neural Networks
  Yunhe Wang · Chang Xu · Chunjing XU · Chao Xu · Dacheng Tao