Test-Time Personalization with a Transformer for Human Pose Estimation

Yizhuo Li · Miao Hao · Zonglin Di · Nitesh Bharadwaj Gundavarapu · Xiaolong Wang

Keywords: [ Transformers ] [ Self-Supervised Learning ] [ Vision ]

[ Abstract ]
[ OpenReview
Tue 7 Dec 8:30 a.m. PST — 10 a.m. PST


We propose to personalize a 2D human pose estimator given a set of test images of a person without using any manual annotations. While there is a significant advancement in human pose estimation, it is still very challenging for a model to generalize to different unknown environments and unseen persons. Instead of using a fixed model for every test case, we adapt our pose estimator during test time to exploit person-specific information. We first train our model on diverse data with both a supervised and a self-supervised pose estimation objectives jointly. We use a Transformer model to build a transformation between the self-supervised keypoints and the supervised keypoints. During test time, we personalize and adapt our model by fine-tuning with the self-supervised objective. The pose is then improved by transforming the updated self-supervised keypoints. We experiment with multiple datasets and show significant improvements on pose estimations with our self-supervised personalization. Project page with code is available at

Chat is not available.