Vision transformer networks have shown superiority in many computer vision tasks. In this paper, we take a step further by proposing a novel generative vision transformer with latent variables following an informative energy-based prior for salient object detection. Both the vision transformer network and the energy-based prior model are jointly trained via Markov chain Monte Carlo-based maximum likelihood estimation, in which sampling from the intractable posterior and prior distributions of the latent variables is performed by Langevin dynamics. Further, with the generative vision transformer, we can easily obtain a pixel-wise uncertainty map from an image, which indicates the model's confidence in predicting saliency from the image. Unlike existing generative models, which define the prior distribution of the latent variables as a simple isotropic Gaussian distribution, our model uses an informative energy-based prior that can be more expressive in capturing the latent space of the data. We apply the proposed framework to both RGB and RGB-D salient object detection tasks. Extensive experimental results show that our framework can achieve not only accurate saliency predictions but also meaningful uncertainty maps that are consistent with human perception.
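As a rough illustration of the Langevin dynamics sampling mentioned in the abstract, the sketch below draws samples from a distribution proportional to exp(-E(z)) by repeatedly following the negative energy gradient and adding Gaussian noise. This is not the authors' implementation: the function names, the step size, the number of steps, and the toy quadratic energy (which stands in for a learned energy-based prior) are all assumptions made for illustration.

```python
import numpy as np


def langevin_sample(grad_energy, z0, step_size=0.3, n_steps=200, rng=None):
    """Approximately sample from p(z) proportional to exp(-E(z)).

    grad_energy: function returning dE/dz for an array of latent vectors.
    z0: initial latent vectors, shape (n_chains, dim).
    step_size, n_steps: hypothetical settings chosen for this toy example.
    """
    rng = np.random.default_rng() if rng is None else rng
    z = np.array(z0, dtype=float)
    for _ in range(n_steps):
        noise = rng.standard_normal(z.shape)
        # Langevin update: gradient step on the energy plus injected noise.
        z = z - 0.5 * step_size**2 * grad_energy(z) + step_size * noise
    return z


# Toy energy (an assumption, not the paper's prior):
# E(z) = 0.5 * ||z - mu||^2, whose target distribution is N(mu, I).
mu = np.array([2.0, -1.0])


def toy_grad_energy(z):
    return z - mu


# Run 2000 independent chains in parallel, all started at the origin.
samples = langevin_sample(
    toy_grad_energy,
    z0=np.zeros((2000, 2)),
    rng=np.random.default_rng(0),
)
print("sample mean:", samples.mean(axis=0))
print("sample std: ", samples.std(axis=0))
```

With a learned energy network, `toy_grad_energy` would be replaced by a gradient obtained through automatic differentiation. The finite step size introduces a small bias, so the samples only approximate the target distribution.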
Author Information
Jing Zhang (Australian National University)
Jianwen Xie (Baidu Research)
Nick Barnes (Australian National University)
Ping Li (Baidu Research USA)
More from the Same Authors
- 2020: Paper 2: Energy-Based Continuous Inverse Optimal Control
  Yifei Xu · Jianwen Xie · Chris Baker · Yibiao Zhao · Ying Nian Wu
- 2021 Poster: A Comprehensively Tight Analysis of Gradient Descent for PCA
  Zhiqiang Xu · Ping Li
- 2021 Poster: Backdoor Attack with Imperceptible Input and Latent Modification
  Khoa Doan · Yingjie Lao · Ping Li
- 2021 Poster: On Path Integration of Grid Cells: Group Representation and Isotropic Scaling
  Ruiqi Gao · Jianwen Xie · Xue-Xin Wei · Song-Chun Zhu · Ying Nian Wu
- 2021 Poster: A Note on Sparse Generalized Eigenvalue Problem
  Yunfeng Cai · Guanhua Fang · Ping Li
- 2021 Poster: Mitigating Forgetting in Online Continual Learning with Neuron Calibration
  Haiyan Yin · peng yang · Ping Li
- 2021 Poster: Rate-Optimal Subspace Estimation on Random Graphs
  Zhixin Zhou · Fan Zhou · Ping Li · Cun-Hui Zhang
- 2021 Poster: Rethinking conditional GAN training: An approach using geometrically structured latent manifolds
  Sameera Ramasinghe · Moshiur Farazi · Salman H Khan · Nick Barnes · Stephen Gould
- 2019 Poster: Semantic-Guided Multi-Attention Localization for Zero-Shot Learning
  Yizhe Zhu · Jianwen Xie · Zhiqiang Tang · Xi Peng · Ahmed Elgammal