120 Results
Poster | Wed 14:00 | Temporal Regularization for Markov Decision Process | Pierre Thodoroff · Audrey Durand · Joelle Pineau · Doina Precup
Poster | Wed 14:00 | Learning to Play With Intrinsically-Motivated, Self-Aware Agents | Nick Haber · Damian Mrowca · Stephanie Wang · Li Fei-Fei · Daniel Yamins
Poster | Wed 7:45 | A Block Coordinate Ascent Algorithm for Mean-Variance Optimization | Tengyang Xie · Bo Liu · Yangyang Xu · Mohammad Ghavamzadeh · Yinlam Chow · Daoming Lyu · Daesub Yoon
Poster | Wed 7:45 | Near-Optimal Time and Sample Complexities for Solving Markov Decision Processes with a Generative Model | Aaron Sidford · Mengdi Wang · Xian Wu · Lin Yang · Yinyu Ye
Poster | Wed 14:00 | Multi-Agent Reinforcement Learning via Double Averaging Primal-Dual Optimization | Hoi-To Wai · Zhuoran Yang · Zhaoran Wang · Mingyi Hong
Poster | Wed 14:00 | Actor-Critic Policy Optimization in Partially Observable Multiagent Environments | Sriram Srinivasan · Marc Lanctot · Vinicius Zambaldi · Julien Perolat · Karl Tuyls · Remi Munos · Michael Bowling
Poster | Wed 14:00 | Memory Augmented Policy Optimization for Program Synthesis and Semantic Parsing | Chen Liang · Mohammad Norouzi · Jonathan Berant · Quoc V Le · Ni Lao
Poster | Wed 14:00 | rho-POMDPs have Lipschitz-Continuous epsilon-Optimal Value Functions | Mathieu Fehr · Olivier Buffet · Vincent Thomas · Jilles Dibangoye
Poster | Wed 14:00 | Policy Optimization via Importance Sampling | Alberto Maria Metelli · Matteo Papini · Francesco Faccio · Marcello Restelli
Poster | Wed 14:00 | Total stochastic gradient algorithms and applications in reinforcement learning | Paavo Parmas
Poster | Wed 7:45 | M-Walk: Learning to Walk over Graphs using Monte Carlo Tree Search | Yelong Shen · Jianshu Chen · Po-Sen Huang · Yuqing Guo · Jianfeng Gao
Poster | Wed 14:00 | Representation Balancing MDPs for Off-policy Policy Evaluation | Yao Liu · Omer Gottesman · Aniruddh Raghu · Matthieu Komorowski · Aldo Faisal · Finale Doshi-Velez · Emma Brunskill