Poster
Contextual Pricing for Lipschitz Buyers
Jieming Mao · Renato Leme · Jon Schneider
Room 210 #74
Keywords: [ Bandit Algorithms ] [ Online Learning ] [ Game Theory and Computational Economics ]
Abstract:
We investigate the problem of learning a Lipschitz function from binary
feedback. In this problem, a learner is trying to learn a Lipschitz function
f:[0,1]d→[0,1]f:[0,1]d→[0,1] over the course of TT rounds. On round tt, an
adversary provides the learner with an input xtxt, the learner submits a
guess ytyt for f(xt)f(xt), and learns whether yt>f(xt)yt>f(xt) or yt≤f(xt)yt≤f(xt). The learner's goal is to minimize their total loss ∑tℓ(f(xt),yt)∑tℓ(f(xt),yt) (for some loss function ℓℓ). The problem is motivated by \textit{contextual dynamic pricing},
where a firm must sell a stream of differentiated products to a collection of
buyers with non-linear valuations for the items and observes only whether the
item was sold or not at the posted price.
For the symmetric loss ℓ(f(xt),yt)=|f(xt)−yt|ℓ(f(xt),yt)=|f(xt)−yt|, we
provide an algorithm for this problem achieving total loss O(logT)O(logT)
when d=1d=1 and O(T(d−1)/d)O(T(d−1)/d) when d>1d>1, and show that both bounds are
tight (up to a factor of √logT√logT). For the pricing loss function
ℓ(f(xt),yt)=f(xt)−yt1{yt≤f(xt)}ℓ(f(xt),yt)=f(xt)−yt1{yt≤f(xt)} we show a regret
bound of O(Td/(d+1))O(Td/(d+1)) and show that this bound is tight. We present
improved bounds in the special case of a population of linear buyers.
Chat is not available.