The Multi-Armed Bandit problem constitutes an archetypal setting for sequential decision-making, permeating multiple domains including engineering, business, and medicine. One of the hallmarks of a bandit setting is the agent's capacity to explore its environment through active intervention, which contrasts with the ability to collect passive data by estimating associational relationships between actions and payouts. The existence of unobserved confounders, namely unmeasured variables affecting both the action and the outcome variables, implies that these two data-collection modes will in general not coincide. In this paper, we show that formalizing this distinction has conceptual and algorithmic implications for the bandit setting. The current generation of bandit algorithms implicitly tries to maximize rewards based on estimation of the experimental distribution, which we show is not always the best strategy to pursue. Indeed, to achieve low regret in certain realistic classes of bandit problems (namely, in the face of unobserved confounders), both experimental and observational quantities are required by the rational agent. After this realization, we propose an optimization metric (employing both experimental and observational distributions) that bandit agents should pursue, and illustrate its benefits over traditional algorithms.
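The gap between the two data-collection modes can be made concrete with a small exact calculation. The sketch below is illustrative only: the two-arm setup, the payout table `PAYOUT`, the prior `P_U`, and the rule `natural_choice` are hypothetical values chosen for this example and do not come from the paper. It computes the observational quantity P(Y=1 | X=x), the experimental quantity P(Y=1 | do(X=x)), and the payout of a forced arm conditioned on the arm the agent would have picked on its own. That last quantity is one plausible way to combine both kinds of information; it is not claimed to be the exact metric the paper proposes.

```python
# Hypothetical two-arm bandit with a binary unobserved confounder U.
# All numbers below are assumptions made for illustration.
P_U = {0: 0.5, 1: 0.5}

# PAYOUT[x][u] = P(Y=1 | X=x, U=u)
PAYOUT = {
    0: {0: 0.2, 1: 0.6},
    1: {0: 0.7, 1: 0.1},
}


def natural_choice(u):
    """Arm the agent picks when left alone; here it is driven entirely by U."""
    return u


def _payout_given_intent(x, intent):
    """Expected payout of arm x among rounds whose natural choice is `intent`."""
    states = [u for u in P_U if natural_choice(u) == intent]
    mass = sum(P_U[u] for u in states)
    return sum(P_U[u] * PAYOUT[x][u] for u in states) / mass


def observational(x):
    """P(Y=1 | X=x): payout seen when the agent itself chose arm x."""
    return _payout_given_intent(x, intent=x)


def experimental(x):
    """P(Y=1 | do(X=x)): payout when arm x is forced regardless of U."""
    return sum(P_U[u] * PAYOUT[x][u] for u in P_U)


def intent_conditioned(x, intent):
    """P(Y=1 | do(X=x), natural choice = intent)."""
    return _payout_given_intent(x, intent)


def best_intent_aware_value():
    """Expected payout when the best arm is picked separately for each intent."""
    value = 0.0
    for intent in (0, 1):
        mass = sum(P_U[u] for u in P_U if natural_choice(u) == intent)
        value += mass * max(intent_conditioned(x, intent) for x in (0, 1))
    return value


if __name__ == "__main__":
    for arm in (0, 1):
        print(arm, observational(arm), experimental(arm))
    print(best_intent_aware_value())
```

Under these assumed numbers, the observational payouts are 0.2 and 0.1, while both arms have an experimental payout of 0.4, so neither distribution alone identifies a good policy. An agent that also conditions on its own natural choice can reach an expected payout of 0.65 by playing the arm opposite to its inclination.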
Author Information
Elias Bareinboim (Purdue University)
Elias Bareinboim is a PhD candidate in Computer Science at UCLA advised by Judea Pearl. He works on the problem of generalizability in causal inference and, more specifically, has proposed solutions for the problems of selection bias, fusion of experimental and non-experimental knowledge, and external validity (transfer of causal knowledge) in non-parametric settings. Recently, Elias received the "Yahoo Key Scientific Challenges Award 2012" (area of Statistics) and the Dissertation Year Fellowship (2013-2014) from UCLA. He holds B.Sc. and M.Sc. degrees in Computer Science from Federal University of Rio de Janeiro, Brazil, where he worked in the areas of Complex Networks, Artificial Intelligence, and Bioinformatics.
Andrew Forney (UCLA)
Judea Pearl (UCLA)
Judea Pearl is a professor of computer science and statistics at UCLA. He is a graduate of the Technion, Israel, and joined the faculty of UCLA in 1970, where he conducts research in artificial intelligence, causal inference and philosophy of science. Pearl has authored three books: Heuristics (1984), Probabilistic Reasoning (1988), and Causality (2000; 2009), the last of which won the Lakatos Prize from the London School of Economics. He is a member of the National Academy of Engineering, the American Academy of Arts and Sciences, and a Fellow of the IEEE, AAAI and the Cognitive Science Society. Pearl received the 2008 Benjamin Franklin Medal from the Franklin Institute and the 2011 Rumelhart Prize from the Cognitive Science Society. In 2012, he received the Technion's Harvey Prize and the ACM Alan M. Turing Award.
More from the Same Authors
- 2022: Probabilities of Causation: Adequate Size of Experimental and Observational Samples
  Ang Li · Ruirui Mao · Judea Pearl
- 2022: Unit Selection: Learning Benefit Function from Finite Population Data
  Ang Li · Song Jiang · Yizhou Sun · Judea Pearl
- 2022: Opening Keynote for nCSI
  Judea Pearl
- 2022 Poster: Causal Inference with Non-IID Data using Linear Graphical Models
  Chi Zhang · Karthika Mohan · Judea Pearl
- 2021 Workshop: Causal Inference & Machine Learning: Why now?
  Elias Bareinboim · Bernhard Schölkopf · Terrence Sejnowski · Yoshua Bengio · Judea Pearl
- 2017: Contributed Talk 4
  Judea Pearl
- 2017: Poster session
  Abbas Zaidi · Christoph Kurz · David Heckerman · YiJyun Lin · Stefan Riezler · Ilya Shpitser · Songbai Yan · Olivier Goudet · Yash Deshpande · Judea Pearl · Jovana Mitrovic · Brian Vegetabile · Tae Hwy Lee · Karen Sachs · Karthika Mohan · Reagan Rose · Julius Ramakers · Negar Hassanpour · Pierre Baldi · Razieh Nabi · Noah Hammarlund · Eli Sherman · Carolin Lawrence · Fattaneh Jabbari · Vira Semenova · Maria Dimakopoulou · Pratik Gajane · Russell Greiner · Ilias Zadik · Alexander Blocker · Hao Xu · Tal EL HAY · Tony Jebara · Benoit Rostykus
- 2016: The Data-Fusion Problem: Causal Inference and Reinforcement Learning
  Elias Bareinboim
- 2014 Poster: Transportability from Multiple Environments with Limited Experiments: Completeness Results
  Elias Bareinboim · Judea Pearl
- 2014 Poster: Graphical Models for Recovering Probabilistic and Causal Queries from Missing Data
  Karthika Mohan · Judea Pearl
- 2014 Spotlight: Transportability from Multiple Environments with Limited Experiments: Completeness Results
  Elias Bareinboim · Judea Pearl
- 2013 Poster: Transportability from Multiple Environments with Limited Experiments
  Elias Bareinboim · Sanghack Lee · Vasant Honavar · Judea Pearl
- 2013 Poster: Graphical Models for Inference with Missing Data
  Karthika Mohan · Judea Pearl · Jin Tian
- 2013 Spotlight: Graphical Models for Inference with Missing Data
  Karthika Mohan · Judea Pearl · Jin Tian
- 2013 Tutorial: Causes and Counterfactuals: Concepts, Principles and Tools.
  Judea Pearl · Elias Bareinboim