Poster
Expected Improvement for Contextual Bandits
Hung The Tran · Sunil Gupta · Santu Rana · Tuan Truong · Long Tran-Thanh · Svetha Venkatesh
The expected improvement (EI) is a popular technique for handling the tradeoff between exploration and exploitation under uncertainty. It has been widely used in Bayesian optimization, but it is not applicable to the contextual bandit problem, which generalizes both the standard bandit and Bayesian optimization. In this paper, we initiate the study of the EI technique for contextual bandits from both theoretical and practical perspectives. We propose two novel EI-based algorithms, one for the case where the reward function is assumed to be linear and the other for more general reward functions. With linear reward functions, we demonstrate that our algorithm achieves a near-optimal regret. Notably, our regret improves on that of LinTS \cite{agrawal13} by a factor of $\sqrt{d}$ while avoiding the NP-hard problem that LinUCB \cite{Abbasi11} must solve at each iteration. For more general reward functions, which are modeled by deep neural networks, we prove that our algorithm achieves a $\tilde{\mathcal O} (\tilde{d}\sqrt{T})$ regret, where $\tilde{d}$ is the effective dimension of a neural tangent kernel (NTK) matrix and $T$ is the number of iterations. Our experiments on various benchmark datasets show that both proposed algorithms work well and consistently outperform existing approaches, especially in high dimensions.
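For readers unfamiliar with EI, the sketch below shows the textbook closed-form expected improvement under a Gaussian posterior, as commonly used in Bayesian optimization. It is not the algorithm proposed in the paper: the function names `expected_improvement` and `pick_arm`, and the idea of scoring each arm from a posterior mean and standard deviation, are illustrative assumptions.

```python
import math


def expected_improvement(mu, sigma, best):
    """Closed-form EI for a Gaussian posterior N(mu, sigma^2) (maximization).

    `best` is the incumbent value to improve upon. This is the standard
    Bayesian-optimization formula, not the paper's contextual-bandit variant.
    """
    if sigma <= 0.0:
        # No uncertainty left: improvement is deterministic.
        return max(mu - best, 0.0)
    z = (mu - best) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal density
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))         # standard normal CDF
    return (mu - best) * cdf + sigma * pdf


def pick_arm(means, stds, best):
    """Hypothetical helper: return the index of the arm with the largest EI."""
    scores = [expected_improvement(m, s, best) for m, s in zip(means, stds)]
    return max(range(len(scores)), key=scores.__getitem__)
```

In this sketch an arm with a slightly lower posterior mean but much larger uncertainty can still receive the higher EI score, which is how EI trades exploitation against exploration.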
Author Information
Hung The Tran (Deakin University)
Sunil Gupta (Deakin University)
Santu Rana (Deakin University)
Tuan Truong (University of British Columbia)
Long Tran-Thanh (University of Warwick)
Svetha Venkatesh (Deakin University)
More from the Same Authors
- 2021: Offline neural contextual bandits: Pessimism, Optimization and Generalization
  Thanh Nguyen-Tang · Sunil Gupta · A. Tuan Nguyen · Svetha Venkatesh
- 2022 Poster: Learning to Constrain Policy Optimization with Virtual Trust Region
  Thai Hung Le · Thommen Karimpanal George · Majid Abdolshah · Dung Nguyen · Kien Do · Sunil Gupta · Svetha Venkatesh
- 2022: Multiresolution Mesh Networks For Learning Dynamical Fluid Simulations
  Bach Nguyen · Truong Son Hy · Long Tran-Thanh · Risi Kondor
- 2022: Improving Domain Generalization with Interpolation Robustness
  Ragja Palakkadavath · Thanh Nguyen-Tang · Sunil Gupta · Svetha Venkatesh
- 2022: Fuzzy c-Means Clustering in Persistence Diagram Space for Deep Learning Model Selection
  Thomas Davies · Jack Aspinall · Bryan Wilder · Long Tran-Thanh
- 2022 Spotlight: Lightning Talks 5A-2
  Qiang LI · Zhiwei Xu · Jia-Qi Yang · Thai Hung Le · Haoxuan Qu · Yang Li · Artyom Sorokin · Peirong Zhang · Mira Finkelstein · Nitsan levy · Chung-Yiu Yau · dapeng li · Thommen Karimpanal George · De-Chuan Zhan · Nazar Buzun · Jiajia Jiang · Li Xu · Yichuan Mo · Yujun Cai · Yuliang Liu · Leonid Pugachev · Bin Zhang · Lucy Liu · Hoi-To Wai · Liangliang Shi · Majid Abdolshah · Yoav Kolumbus · Lin Geng Foo · Junchi Yan · Mikhail Burtsev · Lianwen Jin · Yuan Zhan · Dung Nguyen · David Parkes · Yunpeng Baiia · Jun Liu · Kien Do · Guoliang Fan · Jeffrey S Rosenschein · Sunil Gupta · Sarah Keren · Svetha Venkatesh
- 2022 Spotlight: Learning to Constrain Policy Optimization with Virtual Trust Region
  Thai Hung Le · Thommen Karimpanal George · Majid Abdolshah · Dung Nguyen · Kien Do · Sunil Gupta · Svetha Venkatesh
- 2022 Poster: Human-AI Collaborative Bayesian Optimisation
  Arun Kumar A V · Santu Rana · Alistair Shilton · Svetha Venkatesh
- 2022 Poster: Momentum Adversarial Distillation: Handling Large Distribution Shifts in Data-Free Knowledge Distillation
  Kien Do · Thai Hung Le · Dung Nguyen · Dang Nguyen · HARIPRIYA HARIKUMAR · Truyen Tran · Santu Rana · Svetha Venkatesh
- 2021 Poster: Model-Based Episodic Memory Induces Dynamic Hybrid Controls
  Hung Le · Thommen Karimpanal George · Majid Abdolshah · Truyen Tran · Svetha Venkatesh
- 2021 Poster: Kernel Functional Optimisation
  Arun Kumar Anjanapura Venkatesh · Alistair Shilton · Santu Rana · Sunil Gupta · Svetha Venkatesh
- 2020: Spotlight: Hypothesis Classes with a Unique Persistence Diagram are Nonuniformly Learnable
  Nicholas Bishop · Long Tran-Thanh · Thomas Davies
- 2020 Poster: Optimal Learning from Verified Training Data
  Nicholas Bishop · Long Tran-Thanh · Enrico Gerding
- 2020 Poster: Sub-linear Regret Bounds for Bayesian Optimisation in Unknown Search Spaces
  Hung The Tran · Sunil Gupta · Santu Rana · Huong Ha · Svetha Venkatesh
- 2020 Poster: Adversarial Blocking Bandits
  Nicholas Bishop · Hau Chan · Debmalya Mandal · Long Tran-Thanh
- 2019 Poster: Bayesian Optimization with Unknown Search Space
  Huong Ha · Santu Rana · Sunil Gupta · Thanh Nguyen-Tang · Hung The Tran · Svetha Venkatesh
- 2019 Poster: Multi-objective Bayesian optimisation with preferences over objectives
  Majid Abdolshah · Alistair Shilton · Santu Rana · Sunil Gupta · Svetha Venkatesh
- 2018 Poster: Algorithmic Assurance: An Active Approach to Algorithmic Testing using Bayesian Optimisation
  Shivapratap Gopakumar · Sunil Gupta · Santu Rana · Vu Nguyen · Svetha Venkatesh
- 2018 Poster: Variational Memory Encoder-Decoder
  Hung Le · Truyen Tran · Thin Nguyen · Svetha Venkatesh
- 2017 Poster: Process-constrained batch Bayesian optimisation
  Pratibha Vellanki · Santu Rana · Sunil Gupta · David Rubin · Alessandra Sutti · Thomas Dorin · Murray Height · Paul Sanders · Svetha Venkatesh
- 2017 Spotlight: Process-constrained batch Bayesian optimisation
  Pratibha Vellanki · Santu Rana · Sunil Gupta · David Rubin · Alessandra Sutti · Thomas Dorin · Murray Height · Paul Sanders · Svetha Venkatesh