In this paper, we study both multi-armed and contextual bandit problems in censored environments. Our goal is to estimate the performance loss due to censorship in the context of classical algorithms designed for uncensored environments. Our main contributions include the introduction of a broad class of censorship models and their analysis in terms of the effective dimension of the problem, a natural measure of its underlying statistical complexity and the main driver of the regret bound. In particular, the effective dimension allows us to maintain the structure of the original problem at first order, while embedding it in a bigger space, and thus naturally leads to results analogous to those for uncensored settings. Our analysis involves a continuous generalization of the Elliptical Potential Inequality, which we believe is of independent interest. We also discover an interesting property of decision-making under censorship: a transient phase, during which initial misspecification of censorship is self-corrected at an extra cost, followed by a stationary phase that reflects the inherent slowdown of learning governed by the effective dimension. Our results are useful for applications of sequential decision-making models where the feedback received depends on strategic uncertainty (e.g., agents’ willingness to follow a recommendation) and/or random uncertainty (e.g., loss or delay in arrival of information).
Author Information
Gauthier Guinet (MIT)
Saurabh Amin (MIT)
Patrick Jaillet (MIT)
More from the Same Authors
- 2020 : Pareto-efficient Acquisition Functions for Cost-Aware Bayesian Optimization »
  Gauthier Guinet
- 2022 Poster: Trade-off between Payoff and Model Rewards in Shapley-Fair Collaborative Machine Learning »
  Quoc Phong Nguyen · Bryan Kian Hsiang Low · Patrick Jaillet
- 2022 Poster: Sample-Then-Optimize Batch Neural Thompson Sampling »
  Zhongxiang Dai · Yao Shu · Bryan Kian Hsiang Low · Patrick Jaillet
- 2022 Poster: Scalable design of Error-Correcting Output Codes using Discrete Optimization with Graph Coloring »
  Samarth Gupta · Saurabh Amin
- 2021 Poster: Differentially Private Federated Bayesian Optimization with Distributed Exploration »
  Zhongxiang Dai · Bryan Kian Hsiang Low · Patrick Jaillet
- 2021 Poster: Optimizing Conditional Value-At-Risk of Black-Box Functions »
  Quoc Phong Nguyen · Zhongxiang Dai · Bryan Kian Hsiang Low · Patrick Jaillet
- 2020 Poster: Variational Bayesian Unlearning »
  Quoc Phong Nguyen · Bryan Kian Hsiang Low · Patrick Jaillet
- 2020 Poster: Federated Bayesian Optimization via Thompson Sampling »
  Zhongxiang Dai · Bryan Kian Hsiang Low · Patrick Jaillet
- 2020 Poster: No-regret Learning in Price Competitions under Consumer Reference Effects »
  Negin Golrezaei · Patrick Jaillet · Jason Cheuk Nam Liang
- 2019 Poster: Implicit Posterior Variational Inference for Deep Gaussian Processes »
  Haibin Yu · Yizhou Chen · Bryan Kian Hsiang Low · Patrick Jaillet · Zhongxiang Dai
- 2019 Spotlight: Implicit Posterior Variational Inference for Deep Gaussian Processes »
  Haibin Yu · Yizhou Chen · Bryan Kian Hsiang Low · Patrick Jaillet · Zhongxiang Dai
- 2017 : Aligned AI Poster Session »
  Amanda Askell · Rafal Muszynski · William Wang · Yaodong Yang · Quoc Nguyen · Bryan Kian Hsiang Low · Patrick Jaillet · Candice Schumann · Anqi Liu · Peter Eckersley · Angelina Wang · William Saunders
- 2017 Poster: Real-Time Bidding with Side Information »
  Arthur Flajolet · Patrick Jaillet
- 2017 Poster: Online Learning with a Hint »
  Ofer Dekel · Arthur Flajolet · Nika Haghtalab · Patrick Jaillet
- 2015 Poster: Inverse Reinforcement Learning with Locally Consistent Reward Functions »
  Quoc Phong Nguyen · Bryan Kian Hsiang Low · Patrick Jaillet
- 2013 Poster: Regret based Robust Solutions for Uncertain Markov Decision Processes »
  Asrar Ahmed · Pradeep Varakantham · Yossiri Adulyasak · Patrick Jaillet