Offline policy learning (OPL) leverages existing data collected a priori for policy optimization without any active exploration. Despite the prevalence of, and recent interest in, this problem, its theoretical and algorithmic foundations in function approximation settings remain under-developed. In this paper, we consider this problem on the axes of distributional shift, optimization, and generalization in offline contextual bandits with neural networks. In particular, we propose a provably efficient offline contextual bandit algorithm with neural network function approximation that does not require any functional assumption on the reward. We show that our method provably generalizes over unseen contexts under a milder condition for distributional shift than existing OPL works. Notably, unlike any other OPL method, our method learns from the offline data in an online manner using stochastic gradient descent, allowing us to bring the benefits of online learning to an offline setting. Moreover, we show that our method is more computationally efficient and has a better dependence on the effective dimension of the neural network than an online counterpart. Finally, we demonstrate the empirical effectiveness of our method in a range of synthetic and real-world OPL problems.
Author Information
Thanh Nguyen-Tang (Deakin University)
Sunil Gupta (Deakin University)
A. Tuan Nguyen (University of Oxford)
Svetha Venkatesh (Deakin University)
More from the Same Authors
- 2022 Poster: Learning to Constrain Policy Optimization with Virtual Trust Region
  Thai Hung Le · Thommen Karimpanal George · Majid Abdolshah · Dung Nguyen · Kien Do · Sunil Gupta · Svetha Venkatesh
- 2022: Improving Domain Generalization with Interpolation Robustness
  Ragja Palakkadavath · Thanh Nguyen-Tang · Sunil Gupta · Svetha Venkatesh
- 2022 Spotlight: Lightning Talks 5A-2
  Qiang LI · Zhiwei Xu · Jiaqi Yang · Thai Hung Le · Haoxuan Qu · Yang Li · Artyom Sorokin · Peirong Zhang · Mira Finkelstein · Nitsan levy · Chung-Yiu Yau · dapeng li · Thommen Karimpanal George · De-Chuan Zhan · Nazar Buzun · Jiajia Jiang · Li Xu · Yichuan Mo · Yujun Cai · Yuliang Liu · Leonid Pugachev · Bin Zhang · Lucy Liu · Hoi-To Wai · Liangliang Shi · Majid Abdolshah · Yoav Kolumbus · Lin Geng Foo · Junchi Yan · Mikhail Burtsev · Lianwen Jin · Yuan Zhan · Dung Nguyen · David Parkes · Yunpeng Baiia · Jun Liu · Kien Do · Guoliang Fan · Jeffrey S Rosenschein · Sunil Gupta · Sarah Keren · Svetha Venkatesh
- 2022 Spotlight: Learning to Constrain Policy Optimization with Virtual Trust Region
  Thai Hung Le · Thommen Karimpanal George · Majid Abdolshah · Dung Nguyen · Kien Do · Sunil Gupta · Svetha Venkatesh
- 2022 Poster: Human-AI Collaborative Bayesian Optimisation
  Arun Kumar A V · Santu Rana · Alistair Shilton · Svetha Venkatesh
- 2022 Poster: Momentum Adversarial Distillation: Handling Large Distribution Shifts in Data-Free Knowledge Distillation
  Kien Do · Thai Hung Le · Dung Nguyen · Dang Nguyen · HARIPRIYA HARIKUMAR · Truyen Tran · Santu Rana · Svetha Venkatesh
- 2022 Poster: Expected Improvement for Contextual Bandits
  Hung Tran-The · Sunil Gupta · Santu Rana · Tuan Truong · Long Tran-Thanh · Svetha Venkatesh
- 2021 Poster: Model-Based Episodic Memory Induces Dynamic Hybrid Controls
  Hung Le · Thommen Karimpanal George · Majid Abdolshah · Truyen Tran · Svetha Venkatesh
- 2021 Poster: Kernel Functional Optimisation
  Arun Kumar Anjanapura Venkatesh · Alistair Shilton · Santu Rana · Sunil Gupta · Svetha Venkatesh
- 2021 Poster: Domain Invariant Representation Learning with Domain Density Transformations
  A. Tuan Nguyen · Toan Tran · Yarin Gal · Atilim Gunes Baydin
- 2020 Poster: Sub-linear Regret Bounds for Bayesian Optimisation in Unknown Search Spaces
  Hung Tran-The · Sunil Gupta · Santu Rana · Huong Ha · Svetha Venkatesh
- 2019 Poster: Bayesian Optimization with Unknown Search Space
  Huong Ha · Santu Rana · Sunil Gupta · Thanh Nguyen-Tang · Hung Tran-The · Svetha Venkatesh
- 2019 Poster: Multi-objective Bayesian optimisation with preferences over objectives
  Majid Abdolshah · Alistair Shilton · Santu Rana · Sunil Gupta · Svetha Venkatesh
- 2018 Poster: Algorithmic Assurance: An Active Approach to Algorithmic Testing using Bayesian Optimisation
  Shivapratap Gopakumar · Sunil Gupta · Santu Rana · Vu Nguyen · Svetha Venkatesh
- 2018 Poster: Variational Memory Encoder-Decoder
  Hung Le · Truyen Tran · Thin Nguyen · Svetha Venkatesh
- 2017 Poster: Process-constrained batch Bayesian optimisation
  Pratibha Vellanki · Santu Rana · Sunil Gupta · David Rubin · Alessandra Sutti · Thomas Dorin · Murray Height · Paul Sanders · Svetha Venkatesh
- 2017 Spotlight: Process-constrained batch Bayesian optimisation
  Pratibha Vellanki · Santu Rana · Sunil Gupta · David Rubin · Alessandra Sutti · Thomas Dorin · Murray Height · Paul Sanders · Svetha Venkatesh