Timezone: »
We study online control of time-varying linear systems with unknown dynamics in the nonstochastic control model. At a high level, we demonstrate that this setting is \emph{qualitatively harder} than that of either unknown time-invariant or known time-varying dynamics, and complement our negative results with algorithmic upper bounds in regimes where sublinear regret is possible. More specifically, we study regret bounds with respect to common classes of policies: Disturbance Action (SLS), Disturbance Response (Youla), and linear feedback policies. While these three classes are essentially equivalent for LTI systems, we demonstrate that these equivalences break down for time-varying systems. We prove a lower bound that no algorithm can obtain sublinear regret with respect to the first two classes unless a certain measure of system variability also scales sublinearly in the horizon. Furthermore, we show that offline planning over the state linear feedback policies is NP-hard, suggesting hardness of the online learning problem. On the positive side, we give an efficient algorithm that attains a sublinear regret bound against the class of Disturbance Response policies up to the aforementioned system variability term. In fact, our algorithm enjoys sublinear \emph{adaptive} regret bounds, which is a strictly stronger metric than standard regret and is more appropriate for time-varying systems. We sketch extensions to Disturbance Action policies and partial observation, and propose an inefficient algorithm for regret against linear state feedback policies.
Author Information
Edgar Minasyan (Princeton University)
Paula Gradu (UC Berkeley)
Max Simchowitz (MIT)
Elad Hazan (Princeton University)
More from the Same Authors
-
2021 Spotlight: Bayesian decision-making under misspecified priors with applications to meta-learning »
Max Simchowitz · Christopher Tosh · Akshay Krishnamurthy · Daniel Hsu · Thodoris Lykouris · Miro Dudik · Robert Schapire -
2021 : Exploration and Incentives in Reinforcement Learning »
Max Simchowitz · Aleksandrs Slivkins -
2021 : Exploration and Incentives in Reinforcement Learning »
Max Simchowitz · Aleksandrs Slivkins -
2022 : Learning to Extrapolate: A Transductive Approach »
Aviv Netanyahu · Abhishek Gupta · Max Simchowitz · Kaiqing Zhang · Pulkit Agrawal -
2022 : Valid Inference after Causal Discovery »
Paula Gradu · Tijana Zrnic · Yixin Wang · Michael Jordan -
2022 Poster: Efficient and Near-Optimal Smoothed Online Learning for Generalized Linear Functions »
Adam Block · Max Simchowitz -
2022 Poster: Globally Convergent Policy Search for Output Estimation »
Jack Umenberger · Max Simchowitz · Juan Perdomo · Kaiqing Zhang · Russ Tedrake -
2021 : Spotlight 1: Exploration and Incentives in Reinforcement Learning »
Max Simchowitz · Aleksandrs Slivkins -
2021 Poster: Multiclass Boosting and the Cost of Weak Learning »
Nataly Brukhim · Elad Hazan · Shay Moran · Indraneel Mukherjee · Robert Schapire -
2021 Poster: Stabilizing Dynamical Systems via Policy Gradient Methods »
Juan Perdomo · Jack Umenberger · Max Simchowitz -
2021 Poster: Bayesian decision-making under misspecified priors with applications to meta-learning »
Max Simchowitz · Christopher Tosh · Akshay Krishnamurthy · Daniel Hsu · Thodoris Lykouris · Miro Dudik · Robert Schapire -
2020 : Oral 03: DELUCA - Differentiable control library - environments, methods, and benchmarking »
Paula Gradu -
2020 Poster: Geometric Exploration for Online Control »
Orestis Plevrakis · Elad Hazan -
2020 Poster: Non-Stochastic Control with Bandit Feedback »
Paula Gradu · John Hallman · Elad Hazan -
2020 Poster: Making Non-Stochastic Control (Almost) as Easy as Stochastic »
Max Simchowitz -
2020 Poster: Online Agnostic Boosting via Regret Minimization »
Nataly Brukhim · Xinyi Chen · Elad Hazan · Shay Moran -
2020 Poster: Learning the Linear Quadratic Regulator from Nonlinear Observations »
Zakaria Mhammedi · Dylan Foster · Max Simchowitz · Dipendra Misra · Wen Sun · Akshay Krishnamurthy · Alexander Rakhlin · John Langford -
2020 Poster: Constrained episodic reinforcement learning in concave-convex and knapsack settings »
Kianté Brantley · Miro Dudik · Thodoris Lykouris · Sobhan Miryoosefi · Max Simchowitz · Aleksandrs Slivkins · Wen Sun -
2019 : Logarithmic Regret for Online Control »
Naman Agarwal · Elad Hazan · Karan Singh -
2019 : Poster and Coffee Break 1 »
Aaron Sidford · Aditya Mahajan · Alejandro Ribeiro · Alex Lewandowski · Ali H Sayed · Ambuj Tewari · Angelika Steger · Anima Anandkumar · Asier Mujika · Hilbert J Kappen · Bolei Zhou · Byron Boots · Chelsea Finn · Chen-Yu Wei · Chi Jin · Ching-An Cheng · Christina Yu · Clement Gehring · Craig Boutilier · Dahua Lin · Daniel McNamee · Daniel Russo · David Brandfonbrener · Denny Zhou · Devesh Jha · Diego Romeres · Doina Precup · Dominik Thalmeier · Eduard Gorbunov · Elad Hazan · Elena Smirnova · Elvis Dohmatob · Emma Brunskill · Enrique Munoz de Cote · Ethan Waldie · Florian Meier · Florian Schaefer · Ge Liu · Gergely Neu · Haim Kaplan · Hao Sun · Hengshuai Yao · Jalaj Bhandari · James A Preiss · Jayakumar Subramanian · Jiajin Li · Jieping Ye · Jimmy Smith · Joan Bas Serrano · Joan Bruna · John Langford · Jonathan Lee · Jose A. Arjona-Medina · Kaiqing Zhang · Karan Singh · Yuping Luo · Zafarali Ahmed · Zaiwei Chen · Zhaoran Wang · Zhizhong Li · Zhuoran Yang · Ziping Xu · Ziyang Tang · Yi Mao · David Brandfonbrener · Shirli Di-Castro · Riashat Islam · Zuyue Fu · Abhishek Naik · Saurabh Kumar · Benjamin Petit · Angeliki Kamoutsi · Simone Totaro · Arvind Raghunathan · Rui Wu · Donghwan Lee · Dongsheng Ding · Alec Koppel · Hao Sun · Christian Tjandraatmadja · Mahdi Karami · Jincheng Mei · Chenjun Xiao · Junfeng Wen · Zichen Zhang · Ross Goroshin · Mohammad Pezeshki · Jiaqi Zhai · Philip Amortila · Shuo Huang · Mariya Vasileva · El houcine Bergou · Adel Ahmadyan · Haoran Sun · Sheng Zhang · Lukas Gruber · Yuanhao Wang · Tetiana Parshakova -
2019 Poster: Private Learning Implies Online Learning: An Efficient Reduction »
Alon Gonen · Elad Hazan · Shay Moran -
2019 Spotlight: Private Learning Implies Online Learning: An Efficient Reduction »
Alon Gonen · Elad Hazan · Shay Moran -
2019 Poster: Logarithmic Regret for Online Control »
Naman Agarwal · Elad Hazan · Karan Singh -
2019 Poster: Non-Asymptotic Gap-Dependent Regret Bounds for Tabular MDPs »
Max Simchowitz · Kevin Jamieson -
2019 Oral: Logarithmic Regret for Online Control »
Naman Agarwal · Elad Hazan · Karan Singh -
2018 Poster: Online Improper Learning with an Approximation Oracle »
Elad Hazan · Wei Hu · Yuanzhi Li · Zhiyuan Li -
2018 Poster: Online Learning of Quantum States »
Scott Aaronson · Xinyi Chen · Elad Hazan · Satyen Kale · Ashwin Nayak -
2018 Poster: Spectral Filtering for General Linear Dynamical Systems »
Elad Hazan · Holden Lee · Karan Singh · Cyril Zhang · Yi Zhang -
2018 Oral: Spectral Filtering for General Linear Dynamical Systems »
Elad Hazan · Holden Lee · Karan Singh · Cyril Zhang · Yi Zhang -
2017 Poster: Linear Convergence of a Frank-Wolfe Type Algorithm over Trace-Norm Balls »
Zeyuan Allen-Zhu · Elad Hazan · Wei Hu · Yuanzhi Li -
2017 Poster: Learning Linear Dynamical Systems via Spectral Filtering »
Elad Hazan · Karan Singh · Cyril Zhang -
2017 Spotlight: Online Learning of Linear Dynamical Systems »
Elad Hazan · Karan Singh · Cyril Zhang -
2017 Spotlight: Linear Convergence of a Frank-Wolfe Type Algorithm over Trace-Norm Balls »
Zeyuan Allen-Zhu · Elad Hazan · Wei Hu · Yuanzhi Li -
2016 Poster: Optimal Black-Box Reductions Between Optimization Objectives »
Zeyuan Allen-Zhu · Elad Hazan -
2016 Poster: A Non-generative Framework and Convex Relaxations for Unsupervised Learning »
Elad Hazan · Tengyu Ma -
2016 Poster: The Limits of Learning with Missing Data »
Brian Bullins · Elad Hazan · Tomer Koren -
2015 Poster: Online Learning for Adversaries with Memory: Price of Past Mistakes »
Oren Anava · Elad Hazan · Shie Mannor -
2015 Poster: Beyond Convexity: Stochastic Quasi-Convex Optimization »
Elad Hazan · Kfir Y. Levy · Shai Shalev-Shwartz -
2015 Poster: Online Gradient Boosting »
Alina Beygelzimer · Elad Hazan · Satyen Kale · Haipeng Luo -
2009 Poster: On Stochastic and Worst-case Models for Investing »
Elad Hazan · Satyen Kale -
2009 Oral: On Stochastic and Worst-case Models for Investing »
Elad Hazan · Satyen Kale -
2009 Poster: An Efficient Interior-Point Method for Minimum-Regret Learning in Online Convex Optimization »
Elad Hazan · Nimrod Megiddo -
2009 Spotlight: An Efficient Interior-Point Method for Minimum-Regret Learning in Online Convex Optimization »
Elad Hazan · Nimrod Megiddo -
2009 Poster: Beyond Convexity: Online Submodular Minimization »
Elad Hazan · Satyen Kale -
2007 Oral: Adaptive Online Gradient Descent »
Peter Bartlett · Elad Hazan · Sasha Rakhlin -
2007 Poster: Adaptive Online Gradient Descent »
Peter Bartlett · Elad Hazan · Sasha Rakhlin -
2007 Poster: Computational Equivalence of Fixed Points and No Regret Algorithms, and Convergence to Equilibria »
Elad Hazan · Satyen Kale