Timezone: »

Contextual bandits with surrogate losses: Margin bounds and efficient algorithms
Dylan Foster · Akshay Krishnamurthy

Thu Dec 06 07:45 AM -- 09:45 AM (PST) @ Room 517 AB #165
We use surrogate losses to obtain several new regret bounds and new algorithms for contextual bandit learning. Using the ramp loss, we derive a new margin-based regret bound in terms of standard sequential complexity measures of a benchmark class of real-valued regression functions. Using the hinge loss, we derive an efficient algorithm with a $\sqrt{dT}$-type mistake bound against benchmark policies induced by $d$-dimensional regressors. Under realizability assumptions, our results also yield classical regret bounds.

Author Information

Dylan Foster (Cornell University)
Akshay Krishnamurthy (Microsoft)

More from the Same Authors