Timezone: »
Poster
Bandit Learning with Delayed Impact of Actions
Wei Tang · Chien-Ju Ho · Yang Liu
We consider a stochastic multi-armed bandit (MAB) problem with delayed impact of actions. In our setting, actions taken in the pastimpact the arm rewards in the subsequent future. This delayed impact of actions is prevalent in the real world. For example, the capability to pay back a loan for people in a certain social group might depend on historically how frequently that group has been approved loan applications. If banks keep rejecting loan applications to people in a disadvantaged group, it could create a feedback loop and further damage the chance of getting loans for people in that group. In this paper, we formulate this delayed and long-term impact of actions within the context of multi-armed bandits. We generalize the bandit setting to encode the dependency of this ``bias" due to the action history during learning. The goal is to maximize the collected utilities over time while taking into account the dynamics created by the delayed impacts of historical actions. We propose an algorithm that achieves a regret of $\tilde{O}(KT^{2/3})$ and show a matching regret lower bound of $\Omega(KT^{2/3})$, where $K$ is the number of arms and $T$ is the learning horizon. Our results complement the bandit literature by adding techniques to deal with actions with long-term impacts and have implications in designing fair algorithms.
Author Information
Wei Tang (Washington University in St.Louis)
Chien-Ju Ho (Washington University in St. Louis)
Yang Liu (UC Santa Cruz)
More from the Same Authors
-
2021 Spotlight: Unintended Selection: Persistent Qualification Rate Disparities and Interventions »
Reilly Raab · Yang Liu -
2021 : Unfairness Despite Awareness: Group-Fair Classification with Strategic Agents »
Andrew Estornell · Sanmay Das · Yang Liu · Yevgeniy Vorobeychik -
2021 : Unfairness Despite Awareness: Group-Fair Classification with Strategic Agents »
Andrew Estornell · Sanmay Das · Yang Liu · Yevgeniy Vorobeychik -
2022 : Tier Balancing: Towards Dynamic Fairness over Underlying Causal Factors »
Zeyu Tang · Yatong Chen · Yang Liu · Kun Zhang -
2022 : Fast Implicit Constrained Optimization of Non-decomposable Objectives for Deep Networks »
Yatong Chen · Abhishek Kumar · Yang Liu · Ehsan Amid -
2022 : Improving the Strength of Human-Like Models in Chess »
Saumik Narayanan · Kassa Korley · Chien-Ju Ho · Siddhartha Sen -
2022 Spotlight: Certifying Some Distributional Fairness with Subpopulation Decomposition »
Mintong Kang · Linyi Li · Maurice Weber · Yang Liu · Ce Zhang · Bo Li -
2022 Poster: Fairness Transferability Subject to Bounded Distribution Shift »
Yatong Chen · Reilly Raab · Jialu Wang · Yang Liu -
2022 Poster: Certifying Some Distributional Fairness with Subpopulation Decomposition »
Mintong Kang · Linyi Li · Maurice Weber · Yang Liu · Ce Zhang · Bo Li -
2022 Poster: Adaptive Data Debiasing through Bounded Exploration »
Yifan Yang · Yang Liu · Parinaz Naghizadeh -
2021 : Revisiting Dynamics in Strategic ML »
Yang Liu -
2021 : Bounded Fairness Transferability subject to Distribution Shift »
Reilly Raab · Yatong Chen · Yang Liu -
2021 Poster: Unintended Selection: Persistent Qualification Rate Disparities and Interventions »
Reilly Raab · Yang Liu -
2021 Poster: Can Less be More? When Increasing-to-Balancing Label Noise Rates Considered Beneficial »
Yang Liu · Jialu Wang -
2021 Poster: Policy Learning Using Weak Supervision »
Jingkang Wang · Hongyi Guo · Zhaowei Zhu · Yang Liu -
2020 : Contributed Talk 4: Strategic Recourse in Linear Classification »
Yatong Chen · Yang Liu -
2020 Poster: Learning Strategy-Aware Linear Classifiers »
Yiling Chen · Yang Liu · Chara Podimata -
2020 Poster: How do fair decisions fare in long-term qualification? »
Xueru Zhang · Ruibo Tu · Yang Liu · Mingyan Liu · Hedvig Kjellstrom · Kun Zhang · Cheng Zhang -
2020 Poster: Optimal Query Complexity of Secure Stochastic Convex Optimization »
Wei Tang · Chien-Ju Ho · Yang Liu