Skip to yearly menu bar Skip to main content


Poster
in
Workshop: Bayesian Decision-making and Uncertainty: from probabilistic and spatiotemporal modeling to sequential experiment design

Gaussian Randomized Exploration for Semi-bandits with Sleeping Arms

Zhiming Huang · Bingshan Hu · jianping pan

Keywords: [ Sleeping Arms ] [ Combinatorial Bandits ] [ Gaussian Priors ] [ Multi-armed Bandits ] [ Online Learning ] [ Semi-bandit Feedback ] [ thompson sampling ]


Abstract: This paper provides theoretical analyses of problem-independent upper and lower regret bounds for Gaussian randomized algorithms in semi-bandits with sleeping arms, where arms may be unavailable in certain rounds, and available arms satisfying combinatorial constraints can be played simultaneously. We first introduce the CTS-G algorithm, an adaptation of Thompson sampling with Gaussian priors, achieving an upper bound of O~(mNT) over T rounds with N arms and up to m arms played per round, where O~ hides the logarithmic factors. Next, we present CL-SG, which improves upon CTS-G by using a single Gaussian sample per round, achieving a near-optimal upper regret bound of O~(mNT). We also establish that both algorithms have lower regret bounds of Ω(mNTlnNm) and Ω(mNT), respectively.

Chat is not available.