Distributional Reinforcement Learning (RL) differs from traditional RL in that it estimates the distribution of total returns rather than only their expectation, and it has achieved state-of-the-art performance on Atari games. The key challenge in practical distributional RL algorithms lies in how to parameterize the estimated distributions so as to better approximate the true continuous distribution. Existing distributional RL algorithms parameterize either the probability side or the return value side of the distribution function, leaving the other side uniformly fixed, as in C51 and QR-DQN, or randomly sampled, as in IQN. In this paper, we propose a fully parameterized quantile function that parameterizes both the quantile fraction axis (i.e., the x-axis) and the value axis (i.e., the y-axis) for distributional RL. Our algorithm contains a fraction proposal network that generates a discrete set of quantile fractions and a quantile value network that gives the corresponding quantile values. The two networks are jointly trained to find the best approximation of the true distribution. Experiments on 55 Atari games show that our algorithm significantly outperforms existing distributional RL algorithms and sets a new record on the Arcade Learning Environment for non-distributed agents.
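The two components described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the softmax-plus-cumulative-sum construction of the fractions, the evaluation of quantile values at bin midpoints, and the names `propose_fractions`, `step_function_mean`, and `toy_quantile_value` are all assumptions made for illustration, and the toy quantile function stands in for a trained quantile value network.

```python
import math


def propose_fractions(logits):
    """Map unconstrained scores to increasing fractions 0 = t_0 < ... < t_N = 1.

    Assumption: a softmax followed by a cumulative sum, which guarantees
    that the fractions are sorted and span [0, 1].
    """
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]  # shift by the max for stability
    total = sum(exps)
    fractions = [0.0]
    for e in exps:
        fractions.append(fractions[-1] + e / total)
    fractions[-1] = 1.0  # guard against floating-point drift
    return fractions


def step_function_mean(fractions, quantile_value):
    """Mean of a step-function approximation of the return distribution.

    Each bin [lo, hi] contributes its width times the quantile value
    evaluated at the bin midpoint (an assumed choice of evaluation point).
    """
    mean = 0.0
    for lo, hi in zip(fractions[:-1], fractions[1:]):
        mid = 0.5 * (lo + hi)
        mean += (hi - lo) * quantile_value(mid)
    return mean


def toy_quantile_value(tau):
    """Hypothetical stand-in for the quantile value network: Uniform(0, 10)."""
    return 10.0 * tau


fractions = propose_fractions([0.0, 0.0, 0.0, 0.0])
mean = step_function_mean(fractions, toy_quantile_value)
print(fractions)
print(mean)
```

In this sketch, equal scores yield evenly spaced fractions, which corresponds to the uniformly fixed case mentioned in the abstract; unequal scores would move the fractions, which is the extra degree of freedom a learned fraction proposal provides.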
Author Information
Derek Yang (UC San Diego)
Li Zhao (Microsoft Research)
Zichuan Lin (Tsinghua University)
Tao Qin (Microsoft Research)
Jiang Bian (Microsoft)
Tie-Yan Liu (Microsoft Research Asia)
Tie-Yan Liu is an assistant managing director of Microsoft Research Asia, leading the machine learning research area. He is well known for his pioneering work on learning to rank and computational advertising, and his recent research interests include deep learning, reinforcement learning, and distributed machine learning. Many of his technologies have been transferred to Microsoft’s products and online services (such as Bing, Microsoft Advertising, Windows, Xbox, and Azure), and open-sourced through Microsoft Cognitive Toolkit (CNTK), Microsoft Distributed Machine Learning Toolkit (DMTK), and Microsoft Graph Engine. He has also been actively contributing to academic communities. He is an adjunct/honorary professor at Carnegie Mellon University (CMU), University of Nottingham, and several other universities in China. He has published 200+ papers in refereed conferences and journals, with over 17,000 citations. He has won a number of awards, including the best student paper award at SIGIR (2008), the most cited paper award at the Journal of Visual Communication and Image Representation (2004-2006), the research breakthrough award (2012) and research-team-of-the-year award (2017) at Microsoft Research, Top-10 Springer Computer Science books by Chinese authors (2015), and the most cited Chinese researcher by Elsevier (2017). He has been invited to serve as general chair, program committee chair, local chair, or area chair for a dozen top conferences including SIGIR, WWW, KDD, ICML, NIPS, IJCAI, AAAI, ACL, and ICTIR, as well as associate editor of ACM Transactions on Information Systems, ACM Transactions on the Web, and Neurocomputing. Tie-Yan Liu is a fellow of the IEEE and a distinguished member of the ACM.
More from the Same Authors
-
2021 : ManiSkill: Generalizable Manipulation Skill Benchmark with Large-Scale Demonstrations »
Tongzhou Mu · Zhan Ling · Fanbo Xiang · Derek Yang · Xuanlin Li · Stone Tao · Zhiao Huang · Zhiwei Jia · Hao Su -
2022 Poster: Quantized Training of Gradient Boosting Decision Trees »
Yu Shi · Guolin Ke · Zhuoming Chen · Shuxin Zheng · Tie-Yan Liu -
2022 Poster: An Adaptive Deep RL Method for Non-Stationary Environments with Piecewise Stable Context »
Xiaoyu Chen · Xiangming Zhu · Yufeng Zheng · Pushi Zhang · Li Zhao · Wenxue Cheng · Peng CHENG · Yongqiang Xiong · Tao Qin · Jianyu Chen · Tie-Yan Liu -
2022 Poster: Tiered Reinforcement Learning: Pessimism in the Face of Uncertainty and Constant Regret »
Jiawei Huang · Li Zhao · Tao Qin · Wei Chen · Nan Jiang · Tie-Yan Liu -
2022 : Multi-Agent Reinforcement Learning with Shared Resources for Inventory Management »
Yuandong Ding · Mingxiao Feng · Guozi Liu · Wei Jiang · Chuheng Zhang · Li Zhao · Lei Song · Houqiang Li · Yan Jin · Jiang Bian -
2023 Poster: Learning Pareto-Optimal Policies for Multi-Objective Joint Distribution »
Xin-Qiang Cai · Pushi Zhang · Li Zhao · Jiang Bian · Masashi Sugiyama · Ashley Llorens -
2023 Poster: On the Generalization Properties of Diffusion Models »
Puheng Li · Zhong Li · Huishuai Zhang · Jiang Bian -
2023 Poster: FABind: Fast and Accurate Protein-Ligand Binding »
Qizhi Pei · Kaiyuan Gao · Lijun Wu · Jinhua Zhu · Yingce Xia · Shufang Xie · Tao Qin · Kun He · Tie-Yan Liu · Rui Yan -
2023 Poster: AUDIT: Audio Editing by Following Instructions with Latent Diffusion Models »
Yuancheng Wang · Zeqian Ju · Xu Tan · Lei He · Zhizheng Wu · Jiang Bian · Sheng Zhao -
2023 Poster: Geometric Transformer with Interatomic Positional Encoding »
Yusong Wang · Shaoning Li · Tong Wang · Bin Shao · Nanning Zheng · Tie-Yan Liu -
2022 Spotlight: Lightning Talks 6B-4 »
Junjie Chen · Chuanxia Zheng · JINLONG LI · Yu Shi · Shichao Kan · Yu Wang · Fermín Travi · Ninh Pham · Lei Chai · Guobing Gan · Tung-Long Vuong · Gonzalo Ruarte · Tao Liu · Li Niu · Jingjing Zou · Zequn Jie · Peng Zhang · Ming LI · Yixiong Liang · Guolin Ke · Jianfei Cai · Gaston Bujia · Sunzhu Li · Siyuan Zhou · Jingyang Lin · Xu Wang · Min Li · Zhuoming Chen · Qing Ling · Xiaolin Wei · Xiuqing Lu · Shuxin Zheng · Dinh Phung · Yigang Cen · Jianlou Si · Juan Esteban Kamienkowski · Jianxin Wang · Chen Qian · Lin Ma · Benyou Wang · Yingwei Pan · Tie-Yan Liu · Liqing Zhang · Zhihai He · Ting Yao · Tao Mei -
2022 Spotlight: Lightning Talks 6A-2 »
Yichuan Mo · Botao Yu · Gang Li · Zezhong Xu · Haoran Wei · Arsene Fansi Tchango · Raef Bassily · Haoyu Lu · Qi Zhang · Songming Liu · Mingyu Ding · Peiling Lu · Yifei Wang · Xiang Li · Dongxian Wu · Ping Guo · Wen Zhang · Hao Zhongkai · Mehryar Mohri · Rishab Goel · Yisen Wang · Yifei Wang · Yangguang Zhu · Zhi Wen · Ananda Theertha Suresh · Chengyang Ying · Yujie Wang · Peng Ye · Rui Wang · Nanyi Fei · Hui Chen · Yiwen Guo · Wei Hu · Chenglong Liu · Julien Martel · Yuqi Huo · Wu Yichao · Hang Su · Yisen Wang · Peng Wang · Huajun Chen · Xu Tan · Jun Zhu · Ding Liang · Zhiwu Lu · Joumana Ghosn · Shanshan Zhang · Wei Ye · Ze Cheng · Shikun Zhang · Tao Qin · Tie-Yan Liu -
2022 Spotlight: Museformer: Transformer with Fine- and Coarse-Grained Attention for Music Generation »
Botao Yu · Peiling Lu · Rui Wang · Wei Hu · Xu Tan · Wei Ye · Shikun Zhang · Tao Qin · Tie-Yan Liu -
2022 Spotlight: Quantized Training of Gradient Boosting Decision Trees »
Yu Shi · Guolin Ke · Zhuoming Chen · Shuxin Zheng · Tie-Yan Liu -
2022 Spotlight: Lightning Talks 4B-4 »
Ziyue Jiang · Zeeshan Khan · Yuxiang Yang · Chenze Shao · Yichong Leng · Zehao Yu · Wenguan Wang · Xian Liu · Zehua Chen · Yang Feng · Qianyi Wu · James Liang · C.V. Jawahar · Junjie Yang · Zhe Su · Songyou Peng · Yufei Xu · Junliang Guo · Michael Niemeyer · Hang Zhou · Zhou Zhao · Makarand Tapaswi · Dongfang Liu · Qian Yang · Torsten Sattler · Yuanqi Du · Haohe Liu · Jing Zhang · Andreas Geiger · Yi Ren · Long Lan · Jiawei Chen · Wayne Wu · Dahua Lin · Dacheng Tao · Xu Tan · Jinglin Liu · Ziwei Liu · 振辉 叶 · Danilo Mandic · Lei He · Xiangyang Li · Tao Qin · sheng zhao · Tie-Yan Liu -
2022 Spotlight: Lightning Talks 4A-3 »
Zhihan Gao · Yabin Wang · Xingyu Qu · Luziwei Leng · Mingqing Xiao · Bohan Wang · Yu Shen · Zhiwu Huang · Xingjian Shi · Qi Meng · Yupeng Lu · Diyang Li · Qingyan Meng · Kaiwei Che · Yang Li · Hao Wang · Huishuai Zhang · Zongpeng Zhang · Kaixuan Zhang · Xiaopeng Hong · Xiaohan Zhao · Di He · Jianguo Zhang · Yaofeng Tu · Bin Gu · Yi Zhu · Ruoyu Sun · Yuyang (Bernie) Wang · Zhouchen Lin · Qinghu Meng · Wei Chen · Wentao Zhang · Bin CUI · Jie Cheng · Zhi-Ming Ma · Mu Li · Qinghai Guo · Dit-Yan Yeung · Tie-Yan Liu · Jianxing Liao -
2022 Spotlight: Tiered Reinforcement Learning: Pessimism in the Face of Uncertainty and Constant Regret »
Jiawei Huang · Li Zhao · Tao Qin · Wei Chen · Nan Jiang · Tie-Yan Liu -
2022 Spotlight: Does Momentum Change the Implicit Regularization on Separable Data? »
Bohan Wang · Qi Meng · Huishuai Zhang · Ruoyu Sun · Wei Chen · Zhi-Ming Ma · Tie-Yan Liu -
2022 Spotlight: BinauralGrad: A Two-Stage Conditional Diffusion Probabilistic Model for Binaural Audio Synthesis »
Yichong Leng · Zehua Chen · Junliang Guo · Haohe Liu · Jiawei Chen · Xu Tan · Danilo Mandic · Lei He · Xiangyang Li · Tao Qin · Sheng Zhao · Tie-Yan Liu -
2022 Spotlight: Lightning Talks 4A-1 »
Jiawei Huang · Su Jia · Abdurakhmon Sadiev · Ruomin Huang · Yuanyu Wan · Denizalp Goktas · Jiechao Guan · Andrew Li · Wei-Wei Tu · Li Zhao · Amy Greenwald · Jiawei Huang · Dmitry Kovalev · Yong Liu · Wenjie Liu · Peter Richtarik · Lijun Zhang · Zhiwu Lu · R Ravi · Tao Qin · Wei Chen · Hu Ding · Nan Jiang · Tie-Yan Liu -
2022 Poster: Does Momentum Change the Implicit Regularization on Separable Data? »
Bohan Wang · Qi Meng · Huishuai Zhang · Ruoyu Sun · Wei Chen · Zhi-Ming Ma · Tie-Yan Liu -
2022 Poster: Transcormer: Transformer for Sentence Scoring with Sliding Language Modeling »
Kaitao Song · Yichong Leng · Xu Tan · Yicheng Zou · Tao Qin · Dongsheng Li -
2022 Poster: Your Transformer May Not be as Powerful as You Expect »
Shengjie Luo · Shanda Li · Shuxin Zheng · Tie-Yan Liu · Liwei Wang · Di He -
2022 Poster: BinauralGrad: A Two-Stage Conditional Diffusion Probabilistic Model for Binaural Audio Synthesis »
Yichong Leng · Zehua Chen · Junliang Guo · Haohe Liu · Jiawei Chen · Xu Tan · Danilo Mandic · Lei He · Xiangyang Li · Tao Qin · Sheng Zhao · Tie-Yan Liu -
2022 Poster: Efficient and Effective Multi-task Grouping via Meta Learning on Task Combinations »
Xiaozhuang Song · Shun Zheng · Wei Cao · James Yu · Jiang Bian -
2022 Poster: Museformer: Transformer with Fine- and Coarse-Grained Attention for Music Generation »
Botao Yu · Peiling Lu · Rui Wang · Wei Hu · Xu Tan · Wei Ye · Shikun Zhang · Tao Qin · Tie-Yan Liu -
2021 : AI X Science »
Tie-Yan Liu -
2021 Poster: On the Generative Utility of Cyclic Conditionals »
Chang Liu · Haoyue Tang · Tao Qin · Jintao Wang · Tie-Yan Liu -
2021 Poster: Curriculum Offline Imitating Learning »
Minghuan Liu · Hanye Zhao · Zhengyu Yang · Jian Shen · Weinan Zhang · Li Zhao · Tie-Yan Liu -
2021 Poster: Speech-T: Transducer for Text to Speech and Beyond »
Jiawei Chen · Xu Tan · Yichong Leng · Jin Xu · Guihua Wen · Tao Qin · Tie-Yan Liu -
2021 Poster: Stylized Dialogue Generation with Multi-Pass Dual Learning »
Jinpeng Li · Yingce Xia · Rui Yan · Hongda Sun · Dongyan Zhao · Tie-Yan Liu -
2021 Poster: Distributional Reinforcement Learning for Multi-Dimensional Reward Functions »
Pushi Zhang · Xiaoyu Chen · Li Zhao · Wei Xiong · Tao Qin · Tie-Yan Liu -
2021 Poster: Optimizing Information-theoretical Generalization Bound via Anisotropic Noise of SGLD »
Bohan Wang · Huishuai Zhang · Jieyu Zhang · Qi Meng · Wei Chen · Tie-Yan Liu -
2021 Poster: Co-evolution Transformer for Protein Contact Prediction »
He Zhang · Fusong Ju · Jianwei Zhu · Liang He · Bin Shao · Nanning Zheng · Tie-Yan Liu -
2021 Poster: Stable, Fast and Accurate: Kernelized Attention with Relative Positional Encoding »
Shengjie Luo · Shanda Li · Tianle Cai · Di He · Dinglan Peng · Shuxin Zheng · Guolin Ke · Liwei Wang · Tie-Yan Liu -
2021 Poster: Learning Causal Semantic Representation for Out-of-Distribution Prediction »
Chang Liu · Xinwei Sun · Jindong Wang · Haoyue Tang · Tao Li · Tao Qin · Wei Chen · Tie-Yan Liu -
2021 Poster: Object-Aware Regularization for Addressing Causal Confusion in Imitation Learning »
Jongjin Park · Younggyo Seo · Chang Liu · Li Zhao · Tao Qin · Jinwoo Shin · Tie-Yan Liu -
2021 Poster: FastCorrect: Fast Error Correction with Edit Alignment for Automatic Speech Recognition »
Yichong Leng · Xu Tan · Linchen Zhu · Jin Xu · Renqian Luo · Linquan Liu · Tao Qin · Xiangyang Li · Edward Lin · Tie-Yan Liu -
2021 Poster: Do Transformers Really Perform Badly for Graph Representation? »
Chengxuan Ying · Tianle Cai · Shengjie Luo · Shuxin Zheng · Guolin Ke · Di He · Yanming Shen · Tie-Yan Liu -
2021 Poster: R-Drop: Regularized Dropout for Neural Networks »
Xiaobo Liang · Lijun Wu · Juntao Li · Yue Wang · Qi Meng · Tao Qin · Wei Chen · Min Zhang · Tie-Yan Liu -
2021 Poster: Recovering Latent Causal Factor for Generalization to Distributional Shifts »
Xinwei Sun · Botong Wu · Xiangyu Zheng · Chang Liu · Wei Chen · Tao Qin · Tie-Yan Liu -
2020 Poster: Semi-Supervised Neural Architecture Search »
Renqian Luo · Xu Tan · Rui Wang · Tao Qin · Enhong Chen · Tie-Yan Liu -
2020 Poster: MESA: Boost Ensemble Imbalanced Learning with MEta-SAmpler »
Zhining Liu · Pengfei Wei · Jing Jiang · Wei Cao · Jiang Bian · Yi Chang -
2020 Poster: Model-based Adversarial Meta-Reinforcement Learning »
Zichuan Lin · Garrett Thomas · Guangwen Yang · Tengyu Ma -
2020 Poster: RD$^2$: Reward Decomposition with Representation Decomposition »
Zichuan Lin · Derek Yang · Li Zhao · Tao Qin · Guangwen Yang · Tie-Yan Liu -
2020 Poster: MPNet: Masked and Permuted Pre-training for Language Understanding »
Kaitao Song · Xu Tan · Tao Qin · Jianfeng Lu · Tie-Yan Liu -
2019 Poster: Neural Machine Translation with Soft Prototype »
Yiren Wang · Yingce Xia · Fei Tian · Fei Gao · Tao Qin · Cheng Xiang Zhai · Tie-Yan Liu -
2019 Poster: FastSpeech: Fast, Robust and Controllable Text to Speech »
Yi Ren · Yangjun Ruan · Xu Tan · Tao Qin · Sheng Zhao · Zhou Zhao · Tie-Yan Liu -
2019 Poster: Distributional Reward Decomposition for Reinforcement Learning »
Zichuan Lin · Li Zhao · Derek Yang · Tao Qin · Tie-Yan Liu · Guangwen Yang -
2019 Poster: Normalization Helps Training of Quantized LSTM »
Lu Hou · Jinhua Zhu · James Kwok · Fei Gao · Tao Qin · Tie-Yan Liu -
2018 Poster: Neural Architecture Optimization »
Renqian Luo · Fei Tian · Tao Qin · Enhong Chen · Tie-Yan Liu -
2018 Poster: Learning to Teach with Dynamic Loss Functions »
Lijun Wu · Fei Tian · Yingce Xia · Yang Fan · Tao Qin · Lai Jian-Huang · Tie-Yan Liu -
2018 Poster: On the Local Hessian in Back-propagation »
Huishuai Zhang · Wei Chen · Tie-Yan Liu -
2018 Poster: Layer-Wise Coordination between Encoder and Decoder for Neural Machine Translation »
Tianyu He · Xu Tan · Yingce Xia · Di He · Tao Qin · Zhibo Chen · Tie-Yan Liu -
2018 Poster: FRAGE: Frequency-Agnostic Word Representation »
Chengyue Gong · Di He · Xu Tan · Tao Qin · Liwei Wang · Tie-Yan Liu -
2017 Poster: Decoding with Value Networks for Neural Machine Translation »
Di He · Hanqing Lu · Yingce Xia · Tao Qin · Liwei Wang · Tie-Yan Liu -
2017 Poster: Finite sample analysis of the GTD Policy Evaluation Algorithms in Markov Setting »
Yue Wang · Wei Chen · Yuting Liu · Zhi-Ming Ma · Tie-Yan Liu -
2017 Poster: Deliberation Networks: Sequence Generation Beyond One-Pass Decoding »
Yingce Xia · Fei Tian · Lijun Wu · Jianxin Lin · Tao Qin · Nenghai Yu · Tie-Yan Liu -
2017 Poster: LightGBM: A Highly Efficient Gradient Boosting Decision Tree »
Guolin Ke · Qi Meng · Thomas Finley · Taifeng Wang · Wei Chen · Weidong Ma · Qiwei Ye · Tie-Yan Liu -
2016 Poster: A Communication-Efficient Parallel Algorithm for Decision Tree »
Qi Meng · Guolin Ke · Taifeng Wang · Wei Chen · Qiwei Ye · Zhi-Ming Ma · Tie-Yan Liu -
2016 Poster: Dual Learning for Machine Translation »
Di He · Yingce Xia · Tao Qin · Liwei Wang · Nenghai Yu · Tie-Yan Liu · Wei-Ying Ma -
2016 Poster: LightRNN: Memory and Computation-Efficient Recurrent Neural Networks »
Xiang Li · Tao Qin · Jian Yang · Xiaolin Hu · Tie-Yan Liu -
2013 Poster: Estimation Bias in Multi-Armed Bandit Algorithms for Search Advertising »
Min Xu · Tao Qin · Tie-Yan Liu -
2012 Poster: Statistical Consistency of Ranking Methods in A Rank-Differentiable Probability Space »
Yanyan Lan · Jiafeng Guo · Xueqi Cheng · Tie-Yan Liu -
2012 Spotlight: Statistical Consistency of Ranking Methods in A Rank-Differentiable Probability Space »
Yanyan Lan · Jiafeng Guo · Xueqi Cheng · Tie-Yan Liu -
2010 Workshop: Machine Learning in Online Advertising »
James G Shanahan · Deepak Agarwal · Tao Qin · Tie-Yan Liu -
2010 Poster: Two-Layer Generalization Analysis for Ranking Using Rademacher Average »
Wei Chen · Tie-Yan Liu · Zhi-Ming Ma -
2010 Poster: A New Probabilistic Model for Rank Aggregation »
Tao Qin · Xiubo Geng · Tie-Yan Liu -
2009 Poster: Statistical Consistency of Top-k Ranking »
Fen Xia · Tie-Yan Liu · Hang Li -
2009 Poster: Ranking Measures and Loss Functions in Learning to Rank »
Wei Chen · Tie-Yan Liu · Yanyan Lan · Zhi-Ming Ma · Hang Li -
2008 Poster: Global Ranking Using Continuous Conditional Random Fields »
Tao Qin · Tie-Yan Liu · Xu-Dong Zhang · De-Sheng Wang · Hang Li -
2008 Oral: Global Ranking Using Continuous Conditional Random Fields »
Tao Qin · Tie-Yan Liu · Xu-Dong Zhang · De-Sheng Wang · Hang Li