Poster
Nearly Minimax Optimal Submodular Maximization with Bandit Feedback
Artin Tajdini · Lalit Jain · Kevin Jamieson
West Ballroom A-D #5802
Abstract:
We consider maximizing an unknown monotonic, submodular set function with cardinality constraint under stochastic bandit feedback. At each time the learner chooses a set with and receives reward where is mean-zero sub-Gaussian noise. The objective is to minimize the learner's regret with respect to an approximation of the maximum with , obtained through robust greedy maximization of . To date, the best regret bound in the literature scales as . And by trivially treating every set as a unique arm one deduces that is also achievable using standard multi-armed bandit algorithms. In this work, we establish the first minimax lower bound for this setting that scales like . For a slightly restricted algorithm class, we prove a stronger regret lower bound of . Moreover, we propose an algorithm Sub-UCB that achieves regret capable of matching the lower bound on regret for the restricted class up to logarithmic factors.
Chat is not available.