Timezone: »
We study online learning for optimal allocation when the resource to be allocated is time. An agent receives task proposals sequentially according to a Poisson process and can either accept or reject a proposed task. If she accepts the proposal, she is busy for the duration of the task and obtains a reward that depends on the task duration. If she rejects it, she remains on hold until a new task proposal arrives. We study the regret incurred by the agent first when she knows her reward function but does not know the distribution of the task duration, and then when she does not know her reward function, either. Faster rates are finally obtained by adding structural assumptions on the distribution of rides or on the reward function. This natural setting bears similarities with contextual (one-armed) bandits, but with the crucial difference that the normalized reward associated to a context depends on the whole distribution of contexts.
Author Information
Etienne Boursier (EPFL)
Tristan Garrec (Bifröst)
Vianney Perchet (ENSAE & Criteo AI Lab)
Marco Scarsini (Luiss)
More from the Same Authors
-
2021 Spotlight: Online Sign Identification: Minimization of the Number of Errors in Thresholding Bandits »
Reda Ouhamma · Odalric-Ambrym Maillard · Vianney Perchet -
2021 Spotlight: Decentralized Learning in Online Queuing Systems »
Flore Sentenac · Etienne Boursier · Vianney Perchet -
2023 Poster: Penalising the biases in norm regularisation enforces sparsity »
Etienne Boursier · Nicolas Flammarion -
2022 Poster: Gradient flow dynamics of shallow ReLU networks for square loss and orthogonal inputs »
Etienne Boursier · Loucas PILLAUD-VIVIEN · Nicolas Flammarion -
2021 Poster: Local Differential Privacy for Regret Minimization in Reinforcement Learning »
Evrard Garcelon · Vianney Perchet · Ciara Pike-Burke · Matteo Pirotta -
2021 Poster: ROI Maximization in Stochastic Online Decision-Making »
Nicolò Cesa-Bianchi · Tom Cesari · Yishay Mansour · Vianney Perchet -
2021 Poster: Stochastic Online Linear Regression: the Forward Algorithm to Replace Ridge »
Reda Ouhamma · Odalric-Ambrym Maillard · Vianney Perchet -
2021 Poster: Online Sign Identification: Minimization of the Number of Errors in Thresholding Bandits »
Reda Ouhamma · Odalric-Ambrym Maillard · Vianney Perchet -
2021 Poster: Online Matching in Sparse Random Graphs: Non-Asymptotic Performances of Greedy Algorithm »
Nathan Noiry · Vianney Perchet · Flore Sentenac -
2021 Poster: Decentralized Learning in Online Queuing Systems »
Flore Sentenac · Etienne Boursier · Vianney Perchet -
2020 Poster: Statistical Efficiency of Thompson Sampling for Combinatorial Semi-Bandits »
Pierre Perrault · Etienne Boursier · Michal Valko · Vianney Perchet -
2019 Poster: SIC-MMAB: Synchronisation Involves Communication in Multiplayer Multi-Armed Bandits »
Etienne Boursier · Vianney Perchet -
2019 Spotlight: SIC-MMAB: Synchronisation Involves Communication in Multiplayer Multi-Armed Bandits »
Etienne Boursier · Vianney Perchet -
2017 Poster: Fast Rates for Bandit Optimization with Upper-Confidence Frank-Wolfe »
Quentin Berthet · Vianney Perchet -
2017 Spotlight: Fast Rates for Bandit Optimization with Upper-Confidence Frank-Wolfe »
Quentin Berthet · Vianney Perchet