Timezone: »

Logarithmic Regret from Sublinear Hints
Aditya Bhaskara · Ashok Cutkosky · Ravi Kumar · Manish Purohit

Thu Dec 09 08:30 AM -- 10:00 AM (PST) @
We consider the online linear optimization problem, where at every step the algorithm plays a point $x_t$ in the unit ball, and suffers loss $\langle c_t, x_t \rangle$ for some cost vector $c_t$ that is then revealed to the algorithm. Recent work showed that if an algorithm receives a _hint_ $h_t$ that has non-trivial correlation with $c_t$ before it plays $x_t$, then it can achieve a regret guarantee of $O(\log T)$, improving on the bound of $\Theta(\sqrt{T})$ in the standard setting. In this work, we study the question of whether an algorithm really requires a hint at _every_ time step. Somewhat surprisingly, we show that an algorithm can obtain $O(\log T)$ regret with just $O(\sqrt{T})$ hints under a natural query model; in contrast, we also show that $o(\sqrt{T})$ hints cannot guarantee better than $\Omega(\sqrt{T})$ regret. We give two applications of our result, to the well-studied setting of {\em optimistic} regret bounds, and to the problem of online learning with abstention.

Author Information

Aditya Bhaskara (University of Utah)
Ashok Cutkosky (Boston University)
Ravi Kumar (Google)
Manish Purohit (Google)

More from the Same Authors