Poster
Logarithmic Regret from Sublinear Hints
Aditya Bhaskara · Ashok Cutkosky · Ravi Kumar · Manish Purohit
Keywords: [ Online Learning ] [ Optimization ]
Abstract:
We consider the online linear optimization problem, where at every step the algorithm plays a point xt in the unit ball, and suffers loss ⟨ct,xt⟩ for some cost vector ct that is then revealed to the algorithm. Recent work showed that if an algorithm receives a _hint_ ht that has non-trivial correlation with ct before it plays xt, then it can achieve a regret guarantee of O(logT), improving on the bound of Θ(√T) in the standard setting. In this work, we study the question of whether an algorithm really requires a hint at _every_ time step. Somewhat surprisingly, we show that an algorithm can obtain O(logT) regret with just O(√T) hints under a natural query model; in contrast, we also show that o(√T) hints cannot guarantee better than Ω(√T) regret. We give two applications of our result, to the well-studied setting of {\em optimistic} regret bounds, and to the problem of online learning with abstention.
Chat is not available.