Poster
Contextual Pricing for Lipschitz Buyers
Jieming Mao · Renato Leme · Jon Schneider
Room 210 #74
Keywords: [ Bandit Algorithms ] [ Online Learning ] [ Game Theory and Computational Economics ]
Abstract:
We investigate the problem of learning a Lipschitz function from binary
feedback. In this problem, a learner is trying to learn a Lipschitz function
$f:[0,1]^d \rightarrow [0,1]$ over the course of $T$ rounds. On round $t$, an
adversary provides the learner with an input $x_t$, the learner submits a
guess $y_t$ for $f(x_t)$, and learns whether $y_t > f(x_t)$ or $y_t \leq
f(x_t)$. The learner's goal is to minimize their total loss $\sum_t\ell(f(x_t),
y_t)$ (for some loss function $\ell$). The problem is motivated by \textit{contextual dynamic pricing},
where a firm must sell a stream of differentiated products to a collection of
buyers with non-linear valuations for the items and observes only whether the
item was sold or not at the posted price.
For the symmetric loss $\ell(f(x_t), y_t) = \vert f(x_t) - y_t \vert$, we
provide an algorithm for this problem achieving total loss $O(\log T)$
when $d=1$ and $O(T^{(d-1)/d})$ when $d>1$, and show that both bounds are
tight (up to a factor of $\sqrt{\log T}$). For the pricing loss function
$\ell(f(x_t), y_t) = f(x_t) - y_t {\bf 1}\{y_t \leq f(x_t)\}$ we show a regret
bound of $O(T^{d/(d+1)})$ and show that this bound is tight. We present
improved bounds in the special case of a population of linear buyers.
Chat is not available.