Poster in Workshop: Attributing Model Behavior at Scale (ATTRIB)

Risk Aversion of Online Learning Algorithms

Andreas Haupt · Aroon Narayanan


Abstract

We study a novel bias in online decision-making: emergent risk aversion. When presented with actions of the same expected reward, $\varepsilon$-Greedy chooses the lower-variance action with probability approaching one. Upper Confidence Bound (UCB) avoids this by debiasing its estimates of arm rewards. Risk aversion shapes arm choices in finite time, as we show in experiments.
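
The effect described above can be explored with a small simulation. The following is a minimal sketch, not the authors' experimental setup: it assumes two arms with mean reward zero (a safe arm that always pays 0 and a risky arm that pays +1 or -1 with equal probability), a constant exploration rate, and uniform tie-breaking. The function name `epsilon_greedy_safe_share` and all parameter values are hypothetical choices made for illustration.

```python
import random


def epsilon_greedy_safe_share(horizon=5000, epsilon=0.1, seed=0):
    """Return the fraction of rounds in which epsilon-greedy pulls the safe arm.

    Assumed setup (illustrative only): arm 0 always pays 0, arm 1 pays
    +1 or -1 with equal probability, so both arms have expectation 0.
    """
    rng = random.Random(seed)
    sums = [0.0, 0.0]    # cumulative reward observed per arm
    counts = [0, 0]      # number of pulls per arm
    safe_pulls = 0

    for _ in range(horizon):
        # Empirical mean reward of each arm (0.0 before the first pull).
        means = [sums[a] / counts[a] if counts[a] else 0.0 for a in range(2)]

        if rng.random() < epsilon or means[0] == means[1]:
            # Explore, or break a tie, by picking an arm uniformly at random.
            arm = rng.randrange(2)
        else:
            # Exploit the arm with the higher empirical mean.
            arm = 0 if means[0] > means[1] else 1

        # Safe arm pays 0; risky arm pays +1 or -1 with equal probability.
        reward = 0.0 if arm == 0 else rng.choice([-1.0, 1.0])
        sums[arm] += reward
        counts[arm] += 1
        safe_pulls += arm == 0

    return safe_pulls / horizon


if __name__ == "__main__":
    for seed in range(5):
        share = epsilon_greedy_safe_share(seed=seed)
        print(f"seed={seed}: share of safe-arm pulls = {share:.3f}")
```

One intuition this sketch lets a reader probe: when the risky arm's empirical mean falls below zero, the greedy step stops sampling it, so the pessimistic estimate is corrected only through exploration, whereas an optimistic estimate keeps being sampled and revised. Comparing the printed shares against 0.5 across seeds and horizons gives a rough finite-time picture under these assumptions.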
