Workshop: OPT 2022: Optimization for Machine Learning

Distributed Online and Bandit Convex Optimization

Kumar Kshitij Patel · Aadirupa Saha · Nati Srebro · Lingxiao Wang


We study the problems of distributed online and bandit convex optimization against an adaptive adversary. We want to minimize the average regret on M machines that communicate R times intermittently. We show that collaboration is not beneficial if the machines can access gradients of the cost functions at each time step, i.e., they have first-order feedback. In this setting, simple non-collaborative algorithms are min-max optimal. This contrasts with the provable benefit of collaboration when optimizing against a stochastic adversary, which samples the cost functions from fixed distributions. To identify the benefit of collaboration, we consider the harder setting where the machines can only access values of their cost function, i.e., they have bandit feedback. Here, we identify the high-dimensional regime where collaboration is beneficial and may even lead to a linear speedup in the number of machines. Our results bridge the gap between distributed online optimization against stochastic and adaptive adversaries.

Chat is not available.