Workshop
AsymQ: Asymmetric Q-loss to mitigate overestimation bias in off-policy reinforcement learning
Qinsheng Zhang · Arjun Krishna · Sehoon Ha · Yongxin Chen

Workshop
Off-policy Reinforcement Learning with Optimistic Exploration and Distribution Correction
Jiachen Li · Shuo Cheng · Zhenyu Liao · Huayan Wang · William Yang Wang · Qinxun Bai

Poster, Tue 9:00
The Pitfalls of Regularization in Off-Policy TD Learning
Gaurav Manek · J. Zico Kolter

Poster
On the role of overparameterization in off-policy Temporal Difference learning with linear function approximation
Valentin Thomas

Poster, Wed 9:00
Policy Gradient With Serial Markov Chain Reasoning
Edoardo Cetin · Oya Celiktutan

Poster, Thu 9:00
Markovian Interference in Experiments
Vivek Farias · Andrew Li · Tianyi Peng · Andrew Zheng

Workshop
Efficient Multi-Horizon Learning for Off-Policy Reinforcement Learning
Raja Farrukh Ali · Nasik Muhammad Nafi · Kevin Duong · William Hsu

Workshop
Variance Reduction in Off-Policy Deep Reinforcement Learning using Spectral Normalization
Payal Bawa · Rafael Oliveira · Fabio Ramos

Poster, Wed 14:00
Local Metric Learning for Off-Policy Evaluation in Contextual Bandits with Continuous Actions
Haanvid Lee · Jongmin Lee · Yunseon Choi · Wonseok Jeon · Byung-Jun Lee · Yung-Kyun Noh · Kee-Eung Kim

Poster, Thu 14:00
Action-modulated midbrain dopamine activity arises from distributed control policies
Jack Lindsey · Ashok Litwin-Kumar

Workshop
On Convergence of Average-Reward Off-Policy Control Algorithms in Weakly-Communicating MDPs
Yi Wan · Richard Sutton

Poster, Thu 14:00
Beyond the Return: Off-policy Function Estimation under User-specified Error-measuring Distributions
Audrey Huang · Nan Jiang