To improve the sample efficiency of policy-gradient-based reinforcement learning algorithms, we propose implicit distributional actor-critic (IDAC), which consists of a distributional critic, built on two deep generator networks (DGNs), and a semi-implicit actor (SIA), powered by a flexible policy distribution. We adopt a distributional perspective on the discounted cumulative return and model it with a state-action-dependent implicit distribution, which is approximated by the DGNs that take state-action pairs and random noise as their input. Moreover, we use the SIA to provide a semi-implicit policy distribution, which mixes the policy parameters with a reparameterizable distribution that is not constrained by an analytic density function. In this way, the policy's marginal distribution is implicit, providing the potential to model complex properties such as covariance structure and skewness, but its parameters and entropy can still be estimated. We incorporate these features into an off-policy algorithm framework to solve problems with continuous action spaces and compare IDAC with state-of-the-art algorithms on representative OpenAI Gym environments. We observe that IDAC outperforms these baselines in most tasks. Python code is provided.
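The two components described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' released code: the network sizes, the random untrained weights, the Gaussian conditional layer, and all function names (`actor_layer`, `sample_action`, `log_marginal_estimate`, `sample_returns`) are assumptions made for illustration. The marginal log-density is approximated here by a plain Monte Carlo mixture average, which may differ from the entropy estimator used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
STATE_DIM, ACTION_DIM, NOISE_DIM, HIDDEN = 3, 2, 4, 16


def init_mlp(in_dim, out_dim, hidden=HIDDEN):
    """Random weights for a one-hidden-layer MLP (stand-in for a trained network)."""
    return {
        "W1": rng.normal(0.0, 0.5, size=(in_dim, hidden)),
        "b1": np.zeros(hidden),
        "W2": rng.normal(0.0, 0.5, size=(hidden, out_dim)),
        "b2": np.zeros(out_dim),
    }


def mlp(params, x):
    """Forward pass: tanh hidden layer followed by a linear output layer."""
    h = np.tanh(x @ params["W1"] + params["b1"])
    return h @ params["W2"] + params["b2"]


actor = init_mlp(STATE_DIM + NOISE_DIM, 2 * ACTION_DIM)       # outputs mean and log-std
critic = init_mlp(STATE_DIM + ACTION_DIM + NOISE_DIM, 1)      # outputs one return sample


def actor_layer(state, n):
    """Map (state, noise) to Gaussian parameters for n independent noise draws."""
    eps = rng.normal(size=(n, NOISE_DIM))
    x = np.concatenate([np.tile(state, (n, 1)), eps], axis=1)
    out = mlp(actor, x)
    mean = out[:, :ACTION_DIM]
    log_std = np.clip(out[:, ACTION_DIM:], -5.0, 2.0)
    return mean, np.exp(log_std)


def sample_action(state):
    """Semi-implicit sampling: draw noise, then reparameterize the Gaussian layer."""
    mean, std = actor_layer(state, 1)
    return mean[0] + std[0] * rng.normal(size=ACTION_DIM)


def log_marginal_estimate(state, action, n_mix=64):
    """Monte Carlo estimate of the log marginal policy density at `action`."""
    mean, std = actor_layer(state, n_mix)
    log_comp = -0.5 * np.sum(
        ((action - mean) / std) ** 2 + 2.0 * np.log(std) + np.log(2.0 * np.pi),
        axis=1,
    )
    m = log_comp.max()  # log-mean-exp for numerical stability
    return m + np.log(np.mean(np.exp(log_comp - m)))


def sample_returns(state, action, n=32):
    """Generator-style critic: n return samples for one state-action pair."""
    eps = rng.normal(size=(n, NOISE_DIM))
    sa = np.concatenate([state, action])
    x = np.concatenate([np.tile(sa, (n, 1)), eps], axis=1)
    return mlp(critic, x)[:, 0]


state = np.ones(STATE_DIM)
action = sample_action(state)
print("action:", action)
print("log-density estimate:", log_marginal_estimate(state, action))
print("mean of return samples:", sample_returns(state, action).mean())
```

In this sketch the policy's marginal over actions has no closed-form density because the noise passes through a network, yet each conditional layer is Gaussian, so samples remain reparameterizable and the density can be approximated by averaging over noise draws. The critic likewise returns samples rather than a single expected value, which is what makes it distributional.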
Author Information
Yuguang Yue (University of Texas at Austin)
Zhendong Wang (University of Texas at Austin)
Mingyuan Zhou (University of Texas at Austin)
More from the Same Authors
-
2021 Poster: Exploiting Chain Rule and Bayes' Theorem to Compare Probability Distributions »
Huangjie Zheng · Mingyuan Zhou -
2021 Poster: Alignment Attention by Matching Key and Query Distributions »
Shujian Zhang · Xinjie Fan · Huangjie Zheng · Korawat Tanwisuth · Mingyuan Zhou -
2021 Poster: Probabilistic Margins for Instance Reweighting in Adversarial Training »
Qizhou Wang · Feng Liu · Bo Han · Tongliang Liu · Chen Gong · Gang Niu · Mingyuan Zhou · Masashi Sugiyama -
2021 Poster: Convex Polytope Trees »
Mohammadreza Armandpour · Ali Sadeghian · Mingyuan Zhou -
2021 Poster: TopicNet: Semantic Graph-Guided Topic Discovery »
Zhibin Duan · Yi.shi Xu · Bo Chen · Dongsheng Wang · Chaojie Wang · Mingyuan Zhou -
2021 Poster: A Prototype-Oriented Framework for Unsupervised Domain Adaptation »
Korawat Tanwisuth · Xinjie Fan · Huangjie Zheng · Shujian Zhang · Hao Zhang · Bo Chen · Mingyuan Zhou -
2021 Poster: CARMS: Categorical-Antithetic-REINFORCE Multi-Sample Gradient Estimator »
Alek Dimitriev · Mingyuan Zhou -
2020 Poster: Bidirectional Convolutional Poisson Gamma Dynamical Systems »
Wenchao Chen · Chaojie Wang · Bo Chen · Yicheng Liu · Hao Zhang · Mingyuan Zhou -
2020 Poster: Deep Relational Topic Modeling via Graph Poisson Gamma Belief Network »
Chaojie Wang · Hao Zhang · Bo Chen · Dongsheng Wang · Zhengjue Wang · Mingyuan Zhou -
2020 Poster: Bayesian Attention Modules »
Xinjie Fan · Shujian Zhang · Bo Chen · Mingyuan Zhou -
2019 Poster: Variational Graph Recurrent Neural Networks »
Ehsan Hajiramezanali · Arman Hasanzadeh · Krishna Narayanan · Nick Duffield · Mingyuan Zhou · Xiaoning Qian -
2019 Poster: Semi-Implicit Graph Variational Auto-Encoders »
Arman Hasanzadeh · Ehsan Hajiramezanali · Krishna Narayanan · Nick Duffield · Mingyuan Zhou · Xiaoning Qian -
2019 Poster: Poisson-Randomized Gamma Dynamical Systems »
Aaron Schein · Scott Linderman · Mingyuan Zhou · David Blei · Hanna Wallach -
2018 Poster: Nonparametric Bayesian Lomax delegate racing for survival analysis with competing risks »
Quan Zhang · Mingyuan Zhou -
2018 Poster: Deep Poisson gamma dynamical systems »
Dandan Guo · Bo Chen · Hao Zhang · Mingyuan Zhou -
2018 Poster: Dirichlet belief networks for topic structure learning »
He Zhao · Lan Du · Wray Buntine · Mingyuan Zhou -
2018 Poster: Parsimonious Bayesian deep networks »
Mingyuan Zhou -
2018 Poster: Masking: A New Perspective of Noisy Supervision »
Bo Han · Jiangchao Yao · Gang Niu · Mingyuan Zhou · Ivor Tsang · Ya Zhang · Masashi Sugiyama -
2018 Poster: Bayesian multi-domain learning for cancer subtype discovery from next-generation sequencing count data »
Ehsan Hajiramezanali · Siamak Zamani Dadaneh · Alireza Karbalayghareh · Mingyuan Zhou · Xiaoning Qian -
2016 Poster: Poisson-Gamma dynamical systems »
Aaron Schein · Hanna Wallach · Mingyuan Zhou -
2016 Oral: Poisson-Gamma dynamical systems »
Aaron Schein · Hanna Wallach · Mingyuan Zhou -
2015 Poster: The Poisson Gamma Belief Network »
Mingyuan Zhou · Yulai Cong · Bo Chen -
2014 Poster: Beta-Negative Binomial Process and Exchangeable Random Partitions for Mixed-Membership Modeling »
Mingyuan Zhou -
2012 Poster: Augment-and-Conquer Negative Binomial Processes »
Mingyuan Zhou · Lawrence Carin -
2012 Spotlight: Augment-and-Conquer Negative Binomial Processes »
Mingyuan Zhou · Lawrence Carin -
2009 Poster: Non-Parametric Bayesian Dictionary Learning for Sparse Image Representations »
Mingyuan Zhou · Haojun Chen · John Paisley · Lu Ren · Guillermo Sapiro · Lawrence Carin -
2009 Oral: Non-Parametric Bayesian Dictionary Learning for Sparse Image Representations »
Mingyuan Zhou · Haojun Chen · John Paisley · Lu Ren · Guillermo Sapiro · Larry Carin