Learning from prior tasks and transferring that experience to improve future performance is critical for building lifelong learning agents. Although results in supervised and reinforcement learning show that transfer may significantly improve the learning performance, most of the literature on transfer is focused on batch learning tasks. In this paper we study the problem of sequential transfer in online learning, notably in the multi-armed bandit framework, where the objective is to minimize the cumulative regret over a sequence of tasks by incrementally transferring knowledge from prior tasks. We introduce a novel bandit algorithm based on a method-of-moments approach for the estimation of the possible tasks and derive regret bounds for it.
Author Information
Mohammad Gheshlaghi Azar (CMU)
Alessandro Lazaric (INRIA)
Emma Brunskill (CMU)
More from the Same Authors
- 2021 Spotlight: Stochastic Shortest Path: Minimax, Parameter-Free and Towards Horizon-Free Regret
  Jean Tarbouriech · Runlong Zhou · Simon Du · Matteo Pirotta · Michal Valko · Alessandro Lazaric
- 2021 Spotlight: A Provably Efficient Sample Collection Strategy for Reinforcement Learning
  Jean Tarbouriech · Matteo Pirotta · Michal Valko · Alessandro Lazaric
- 2021 : Mastering Visual Continuous Control: Improved Data-Augmented Reinforcement Learning
  Denis Yarats · Rob Fergus · Alessandro Lazaric · Lerrel Pinto
- 2021 Poster: Stochastic Shortest Path: Minimax, Parameter-Free and Towards Horizon-Free Regret
  Jean Tarbouriech · Runlong Zhou · Simon Du · Matteo Pirotta · Michal Valko · Alessandro Lazaric
- 2021 Poster: A Provably Efficient Sample Collection Strategy for Reinforcement Learning
  Jean Tarbouriech · Matteo Pirotta · Michal Valko · Alessandro Lazaric
- 2021 Poster: Reinforcement Learning in Linear MDPs: Constant Regret and Representation Selection
  Matteo Papini · Andrea Tirinzoni · Aldo Pacchiano · Marcello Restelli · Alessandro Lazaric · Matteo Pirotta
- 2018 Poster: Fighting Boredom in Recommender Systems with Linear Reinforcement Learning
  Romain WARLOP · Alessandro Lazaric · Jérémie Mary
- 2017 Poster: Regret Minimization in MDPs with Options without Prior Knowledge
  Ronan Fruit · Matteo Pirotta · Alessandro Lazaric · Emma Brunskill
- 2017 Spotlight: Regret Minimization in MDPs with Options without Prior Knowledge
  Ronan Fruit · Matteo Pirotta · Alessandro Lazaric · Emma Brunskill
- 2016 : Learning to improve learning: ML in the classroom
  Emma Brunskill
- 2016 Poster: (Withdrawn) Only H is left: Near-tight Episodic PAC RL
  Christoph Dann · Emma Brunskill
- 2015 Poster: Sample Complexity of Episodic Fixed-Horizon Reinforcement Learning
  Christoph Dann · Emma Brunskill
- 2014 Workshop: Novel Trends and Applications in Reinforcement Learning
  Csaba Szepesvari · Marc Deisenroth · Sergey Levine · Pedro Ortega · Brian Ziebart · Emma Brunskill · Naftali Tishby · Gerhard Neumann · Daniel Lee · Sridhar Mahadevan · Pieter Abbeel · David Silver · Vicenç Gómez
- 2013 Workshop: New Directions in Transfer and Multi-Task: Learning Across Domains and Tasks
  Urun Dogan · Marius Kloft · Tatiana Tommasi · Francesco Orabona · Massimiliano Pontil · Sinno Jialin Pan · Shai Ben-David · Arthur Gretton · Fei Sha · Marco Signoretto · Rajhans Samdani · Yun-Qian Miao · Mohammad Gheshlaghi Azar · Ruth Urner · Christoph Lampert · Jonathan How