Timezone: »
We consider the problem of robust and adaptive model predictive control (MPC) of a linear system, with unknown parameters that are learned along the way (adaptive), in a critical setting where failures must be prevented (robust). This problem has been studied from different perspectives by different communities. However, the existing theory deals only with the case of quadratic costs (the LQ problem), which limits applications to stabilisation and tracking tasks only. In order to handle more general (non-convex) costs that naturally arise in many practical problems, we carefully select and bring together several tools from different communities, namely non-asymptotic linear regression, recent results in interval prediction, and tree-based planning. Combining and adapting the theoretical guarantees at each layer is non trivial, and we provide the first end-to-end suboptimality analysis for this setting. Interestingly, our analysis naturally adapts to handle many models and combines with a data-driven robust model selection strategy, which enables to relax the modelling assumptions. Last, we strive to preserve tractability at any stage of the method, that we illustrate on two challenging simulated environments.
Author Information
Edouard Leurent (INRIA)
PhD student in Reinforcement Learning, at: - INRIA SequeL project for sequential learning - INRIA Non-A project for finite-time control - Renault Group
Odalric-Ambrym Maillard (INRIA)
Denis Efimov (Inria)
Related Events (a corresponding poster, oral, or spotlight)
-
2020 Oral: Robust-Adaptive Control of Linear Systems: beyond Quadratic Costs »
Thu Dec 10th 02:00 -- 02:15 PM Room Orals & Spotlights: Reinforcement Learning
More from the Same Authors
-
2020 Poster: Sub-sampling for Efficient Non-Parametric Bandit Exploration »
Dorian Baudry · Emilie Kaufmann · Odalric-Ambrym Maillard -
2020 Spotlight: Sub-sampling for Efficient Non-Parametric Bandit Exploration »
Dorian Baudry · Emilie Kaufmann · Odalric-Ambrym Maillard -
2020 Poster: Planning in Markov Decision Processes with Gap-Dependent Sample Complexity »
Anders Jonsson · Emilie Kaufmann · Pierre Menard · Omar Darwiche Domingues · Edouard Leurent · Michal Valko -
2019 Poster: Budgeted Reinforcement Learning in Continuous State Space »
Nicolas Carrara · Edouard Leurent · Romain Laroche · Tanguy Urvoy · Odalric-Ambrym Maillard · Olivier Pietquin -
2019 Poster: Learning Multiple Markov Chains via Adaptive Allocation »
Mohammad Sadegh Talebi · Odalric-Ambrym Maillard -
2019 Poster: Regret Bounds for Learning State Representations in Reinforcement Learning »
Ronald Ortner · Matteo Pirotta · Alessandro Lazaric · Ronan Fruit · Odalric-Ambrym Maillard -
2014 Workshop: From Bad Models to Good Policies (Sequential Decision Making under Uncertainty) »
Odalric-Ambrym Maillard · Timothy A Mann · Shie Mannor · Jeremie Mary · Laurent Orseau · Thomas Dietterich · Ronald Ortner · Peter Grünwald · Joelle Pineau · Raphael Fonteneau · Georgios Theocharous · Esteban D Arcaute · Christos Dimitrakakis · Nan Jiang · Doina Precup · Pierre-Luc Bacon · Marek Petrik · Aviv Tamar -
2014 Poster: "How hard is my MDP?" The distribution-norm to the rescue »
Odalric-Ambrym Maillard · Timothy A Mann · Shie Mannor -
2014 Oral: "How hard is my MDP?" The distribution-norm to the rescue »
Odalric-Ambrym Maillard · Timothy A Mann · Shie Mannor -
2012 Poster: Online allocation and homogeneous partitioning for piecewise constant mean-approximation »
Alexandra Carpentier · Odalric-Ambrym Maillard -
2012 Poster: Hierarchical Optimistic Region Selection driven by Curiosity »
Odalric-Ambrym Maillard -
2011 Poster: Selecting the State-Representation in Reinforcement Learning »
Odalric-Ambrym Maillard · Remi Munos · Daniil Ryabko -
2011 Poster: Sparse Recovery with Brownian Sensing »
Alexandra Carpentier · Odalric-Ambrym Maillard · Remi Munos -
2010 Spotlight: LSTD with Random Projections »
Mohammad Ghavamzadeh · Alessandro Lazaric · Odalric-Ambrym Maillard · Remi Munos -
2010 Poster: LSTD with Random Projections »
Mohammad Ghavamzadeh · Alessandro Lazaric · Odalric-Ambrym Maillard · Remi Munos -
2010 Poster: Scrambled Objects for Least-Squares Regression »
Odalric-Ambrym Maillard · Remi Munos -
2009 Poster: Compressed Least-Squares Regression »
Odalric-Ambrym Maillard · Remi Munos