Motivated by the consideration of fairly sharing the cost of exploration between multiple groups in learning problems, we develop the Nash bargaining solution in the context of multi-armed bandits. Specifically, the 'grouped' bandit derived from any multi-armed bandit problem assigns to each time step a single group from some finite set of groups. The utility gained by a given group under some learning policy is naturally viewed as the reduction in that group's regret relative to the regret that group would have incurred 'on its own'. We derive policies that yield the Nash bargaining solution relative to the set of incremental utilities possible under any policy. We show that the 'price of fairness' under such policies is limited, whereas regret-optimal policies are arbitrarily unfair under generic conditions. Our theoretical development is complemented by a case study on contextual bandits for warfarin dosing, where we are concerned with the cost of exploration across multiple races and age groups.
Author Information
Jackie Baek (Massachusetts Institute of Technology)
Vivek Farias (Massachusetts Institute of Technology)
Related Events (a corresponding poster, oral, or spotlight)
-
2021 Poster: Fair Exploration via Axiomatic Bargaining »
Tue. Dec 7th 04:30 -- 06:00 PM
More from the Same Authors
-
2021 : Learning Treatment Effects in Panels with General Intervention Patterns »
Vivek Farias · Andrew Li · Tianyi Peng -
2021 : The Limits to Learning a Diffusion Model »
Jackie Baek · Vivek Farias · Andreea Georgescu · Retsef Levi · Tianyi Peng · Joshua Wilde · Andrew Zheng -
2022 Poster: Markovian Interference in Experiments »
Vivek Farias · Andrew Li · Tianyi Peng · Andrew Zheng -
2021 Oral: Learning Treatment Effects in Panels with General Intervention Patterns »
Vivek Farias · Andrew Li · Tianyi Peng -
2021 Poster: Learning Treatment Effects in Panels with General Intervention Patterns »
Vivek Farias · Andrew Li · Tianyi Peng -
2016 Poster: Optimistic Gittins Indices »
Eli Gutin · Vivek Farias -
2012 Poster: Non-parametric Approximate Dynamic Programming via the Kernel Method »
Nikhil Bhat · Ciamac C Moallemi · Vivek Farias -
2009 Poster: A Data-Driven Approach to Modeling Choice »
Vivek Farias · Srikanth Jagabathula · Devavrat Shah -
2009 Spotlight: A Data-Driven Approach to Modeling Choice »
Vivek Farias · Srikanth Jagabathula · Devavrat Shah -
2009 Poster: A Smoothed Approximate Linear Program »
Vijay Desai · Vivek Farias · Ciamac C Moallemi -
2009 Spotlight: A Smoothed Approximate Linear Program »
Vijay Desai · Vivek Farias · Ciamac C Moallemi