NeurIPS Poster Piecewise-Stationary Bandits with Knapsacks

Xilin Zhang · Wang Chi Cheung

West Ballroom A-D #5911

Wed 11 Dec 4:30 p.m. PST — 7:30 p.m. PST

Abstract: We study Bandits with Knapsacks (BwK) in a piecewise-stationary environment. We propose a novel inventory reserving algorithm that offers new insights into the problem. Suppose parameters $\eta_{\min}, \eta_{\max} \in (0,1]$ are, respectively, lower and upper bounds on the reward earned and the resources consumed in a time round. Our algorithm achieves a provably near-optimal competitive ratio of $O(\log(\eta_{\max}/\eta_{\min}))$, with a matching lower bound provided. Our performance guarantee is based on a dynamic benchmark, which distinguishes our work from existing works on adversarial BwK that compare against a static benchmark. Furthermore, unlike existing non-stationary BwK work, we do not require a bounded global variation.
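To give a feel for how the stated guarantee scales, the short sketch below evaluates the quantity inside the bound, $\log(\eta_{\max}/\eta_{\min})$, for a few illustrative parameter values. The function name `log_ratio`, the sample values, and the use of the natural logarithm are assumptions made for illustration; the constant hidden by the $O(\cdot)$ notation is not given in the abstract, so these numbers show only the growth rate and not the actual competitive ratio.

```python
import math


def log_ratio(eta_min: float, eta_max: float) -> float:
    """Return log(eta_max / eta_min), the term inside the O(.) bound.

    Hypothetical helper: the natural logarithm is an assumption, and the
    constant factor hidden by the O(.) notation is not modeled here.
    """
    if not (0 < eta_min <= eta_max <= 1):
        raise ValueError("expected 0 < eta_min <= eta_max <= 1")
    return math.log(eta_max / eta_min)


# Illustrative values only: fix eta_max = 1 and shrink eta_min.
for eta_min in (0.5, 0.1, 0.01, 0.001):
    print(f"eta_min={eta_min}: log(eta_max/eta_min) = {log_ratio(eta_min, 1.0):.3f}")
```

Under these assumptions, shrinking $\eta_{\min}$ by a factor of 10 adds only about 2.3 to the term, which is why a logarithmic dependence on the ratio stays mild even when the bounds are far apart.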
