Processing math: 100%
Skip to yearly menu bar Skip to main content


Poster

Logarithmic Regret from Sublinear Hints

Aditya Bhaskara · Ashok Cutkosky · Ravi Kumar · Manish Purohit

Keywords: [ Online Learning ] [ Optimization ]


Abstract: We consider the online linear optimization problem, where at every step the algorithm plays a point xt in the unit ball, and suffers loss ct,xt for some cost vector ct that is then revealed to the algorithm. Recent work showed that if an algorithm receives a _hint_ ht that has non-trivial correlation with ct before it plays xt, then it can achieve a regret guarantee of O(logT), improving on the bound of Θ(T) in the standard setting. In this work, we study the question of whether an algorithm really requires a hint at _every_ time step. Somewhat surprisingly, we show that an algorithm can obtain O(logT) regret with just O(T) hints under a natural query model; in contrast, we also show that o(T) hints cannot guarantee better than Ω(T) regret. We give two applications of our result, to the well-studied setting of {\em optimistic} regret bounds, and to the problem of online learning with abstention.

Chat is not available.