Timezone: »
Poster
The price of unfairness in linear bandits with biased feedback
Solenne Gaucher · Alexandra Carpentier · Christophe Giraud
In this paper, we study the problem of fair sequential decision making with biased linear bandit feedback. At each round, a player selects an action described by a covariate and by a sensitive attribute. The perceived reward is a linear combination of the covariates of the chosen action, but the player only observes a biased evaluation of this reward, depending on the sensitive attribute. To characterize the difficulty of this problem, we design a phased elimination algorithm that corrects the unfair evaluations, and establish upper bounds on its regret. We show that the worst-case regret is smaller than $\mathcal{O}(\kappa_* ^{1/3}\log(T)^{1/3}T^{2/3})$, where $\kappa_*$ is an explicit geometrical constant characterizing the difficulty of bias estimation. We prove lower bounds on the worst-case regret for some sets of actions showing that this rate is tight up to a possible sub-logarithmic factor. We also derive gap-dependent upper bounds on the regret, and matching lower bounds for some problem instance. Interestingly, these results reveal a transition between a regime where the problem is as difficult as its unbiased counterpart, and a regime where it can be much harder.
Author Information
Solenne Gaucher (Université Paris-Saclay)
Alexandra Carpentier
Christophe Giraud (Université Paris Saclay)
More from the Same Authors
-
2021 Spotlight: A Unified Approach to Fair Online Learning via Blackwell Approachability »
Evgenii Chzhen · Christophe Giraud · Gilles Stoltz -
2021 Poster: A Unified Approach to Fair Online Learning via Blackwell Approachability »
Evgenii Chzhen · Christophe Giraud · Gilles Stoltz -
2021 Poster: Optimality of variational inference for stochasticblock model with missing links »
Solenne Gaucher · Olga Klopp -
2020 Poster: Finite Continuum-Armed Bandits »
Solenne Gaucher