Skip to yearly menu bar Skip to main content


Poster

Local Anti-Concentration Class: Logarithmic Regret for Greedy Linear Contextual Bandit

Seok-Jin Kim · Min-hwan Oh

[ ]
Thu 12 Dec 11 a.m. PST — 2 p.m. PST

Abstract: We study the performance guarantees of exploration-free greedy algorithms for the linear contextual bandit problem. We introduce a novel condition, named \textit{Local Anti-Concentration} (LAC) condition, which permits a greedy bandit algorithm to achieve provable efficiency. We show that the newly proposed LAC condition is satisfied by a wide range of distributions including Gaussian, Exponential, Uniform, Cauchy, Student's~$t$ distributions, and other exponential family distributions as well as their truncated variants. This significantly expands the class of admissible distributions that enable greedy algorithms to perform efficiently. Under our proposed LAC condition, we prove that the cumulative expected regret of the greedy algorithm for linear contextual bandit is bounded by $\mathcal{O}(\operatorname{poly} \log T)$. Our result is the first to achieve a poly-logarithmic rates of regret for a broad range of distributions without additionally assuming a margin condition.

Live content is unavailable. Log in and register to view live content