Reinforcement learning (RL) for continuous control typically employs distributions whose support covers the entire action space. In this work, we investigate the colloquially known phenomenon that trained agents often prefer actions at the boundaries of that space. We draw theoretical connections to the emergence of bang-bang behavior in optimal control, and provide extensive empirical evaluation across a variety of recent RL algorithms. We replace the Gaussian with a Bernoulli distribution that solely considers the extremes along each action dimension - a bang-bang controller. Surprisingly, this achieves state-of-the-art performance on several continuous control benchmarks - in contrast to robotic hardware, where energy and maintenance cost affect controller choices. Since exploration, learning, and the final solution are entangled in RL, we provide additional imitation learning experiments to reduce the impact of exploration on our analysis. Finally, we show that our observations generalize to environments that aim to model real-world challenges and evaluate factors to mitigate the emergence of bang-bang solutions. Our findings emphasise challenges for benchmarking continuous control algorithms, particularly in light of potential real-world applications.
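As a rough illustration of the bang-bang policy head described in the abstract, the sketch below samples each action dimension from an independent Bernoulli distribution and maps the outcome to the lower or upper action bound. It is not the authors' implementation: the function names, the logit parameterization, and the NumPy setup are assumptions made for illustration only.

```python
import numpy as np


def sample_bang_bang(logits, low, high, rng):
    """Sample an action whose every dimension is either `low` or `high`.

    `logits` holds one Bernoulli logit per action dimension (hypothetical
    policy output); `low` and `high` are the action bounds.
    """
    logits = np.asarray(logits, dtype=float)
    low = np.broadcast_to(np.asarray(low, dtype=float), logits.shape)
    high = np.broadcast_to(np.asarray(high, dtype=float), logits.shape)
    # Numerically stable sigmoid: probability of choosing the upper bound.
    p_high = 0.5 * (1.0 + np.tanh(0.5 * logits))
    pick_high = rng.random(logits.shape) < p_high
    return np.where(pick_high, high, low)


def log_prob_bang_bang(action, logits, high):
    """Log-probability of a bang-bang action under independent Bernoullis."""
    logits = np.asarray(logits, dtype=float)
    pick_high = np.asarray(action, dtype=float) == np.asarray(high, dtype=float)
    # log(sigmoid(x)) = -log(1 + exp(-x)), computed via logaddexp for stability.
    log_p_high = -np.logaddexp(0.0, -logits)
    log_p_low = -np.logaddexp(0.0, logits)
    return float(np.sum(np.where(pick_high, log_p_high, log_p_low)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    action = sample_bang_bang([0.0, 2.0, -2.0], low=-1.0, high=1.0, rng=rng)
    print(action, log_prob_bang_bang(action, [0.0, 2.0, -2.0], high=1.0))
```

Unlike a Gaussian head, every sampled action lies on a boundary of the action space, so any intermediate control effort would have to emerge from switching between the two extremes over time.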
Author Information
Tim Seyde (MIT CSAIL)
Igor Gilitschenski (University of Toronto)
Wilko Schwarting (Massachusetts Institute of Technology)
Bartolomeo Stellato (Massachusetts Institute of Technology)
Martin Riedmiller (DeepMind)
Markus Wulfmeier (DeepMind)
Daniela Rus (Massachusetts Institute of Technology)
More from the Same Authors
-
2021 : Is Curiosity All You Need? On the Utility of Emergent Behaviours from Curious Exploration »
Oliver Groth · Markus Wulfmeier · Giulia Vezzani · Vibhavari Dasagi · Tim Hertweck · Roland Hafner · Nicolas Heess · Martin Riedmiller -
2021 : Learning Transferable Motor Skills with Hierarchical Latent Mixture Policies »
Dushyant Rao · Fereshteh Sadeghi · Leonard Hasenclever · Markus Wulfmeier · Martina Zambelli · Giulia Vezzani · Dhruva Tirumala · Yusuf Aytar · Josh Merel · Nicolas Heess · Raia Hadsell -
2021 : Wish you were here: Hindsight Goal Selection for long-horizon dexterous manipulation »
Todor Davchev · Oleg Sushkov · Jean-Baptiste Regli · Stefan Schaal · Yusuf Aytar · Markus Wulfmeier · Jonathan Scholz -
2021 : Neighborhood Mixup Experience Replay: Local Convex Interpolation for Improved Sample Efficiency in Continuous Control Tasks »
Ryan Sander · Wilko Schwarting · Tim Seyde · Igor Gilitschenski · Sertac Karaman · Daniela Rus -
2021 : Strength Through Diversity: Robust Behavior Learning via Mixture Policies »
Tim Seyde · Wilko Schwarting · Igor Gilitschenski · Markus Wulfmeier · Daniela Rus -
2022 : PyHopper - A Plug-and-Play Hyperparameter Optimization Engine »
Mathias Lechner · Ramin Hasani · Sophie Neubauer · Philipp Neubauer · Daniela Rus -
2022 : Are All Vision Models Created Equal? A Study of the Open-Loop to Closed-Loop Causality Gap »
Mathias Lechner · Ramin Hasani · Alexander Amini · Tsun-Hsuan Johnson Wang · Thomas Henzinger · Daniela Rus -
2022 : Infrastructure-based End-to-End Learning and Prevention of Driver Failure »
Noam Buckman · Shiva Sreeram · Mathias Lechner · Yutong Ban · Ramin Hasani · Sertac Karaman · Daniela Rus -
2022 : Capsa: A Unified Framework for Quantifying Risk in Deep Neural Networks »
Sadhana Lolla · Iaroslav Elistratov · Alejandro Perez · Elaheh Ahmadi · Daniela Rus · Alexander Amini -
2022 : Fifteen-minute Competition Overview Video »
Nico Gürtler · Georg Martius · Pavel Kolev · Sebastian Blaes · Manuel Wuethrich · Markus Wulfmeier · Cansu Sancaktar · Martin Riedmiller · Arthur Allshire · Bernhard Schölkopf · Annika Buchholz · Stefan Bauer -
2022 Workshop: 5th Robot Learning Workshop: Trustworthy Robotics »
Alex Bewley · Roberto Calandra · Anca Dragan · Igor Gilitschenski · Emily Hannigan · Masha Itkina · Hamidreza Kasaei · Jens Kober · Danica Kragic · Nathan Lambert · Julien PEREZ · Fabio Ramos · Ransalu Senanayake · Jonathan Tompson · Vincent Vanhoucke · Markus Wulfmeier -
2022 Competition: Real Robot Challenge III - Learning Dexterous Manipulation from Offline Data in the Real World »
Nico Gürtler · Georg Martius · Sebastian Blaes · Pavel Kolev · Cansu Sancaktar · Stefan Bauer · Manuel Wuethrich · Markus Wulfmeier · Martin Riedmiller · Arthur Allshire · Annika Buchholz · Bernhard Schölkopf -
2022 Poster: Efficient Dataset Distillation using Random Feature Approximation »
Noel Loo · Ramin Hasani · Alexander Amini · Daniela Rus -
2022 Poster: Evolution of Neural Tangent Kernels under Benign and Adversarial Training »
Noel Loo · Ramin Hasani · Alexander Amini · Daniela Rus -
2022 Poster: ActionSense: A Multimodal Dataset and Recording Framework for Human Activities Using Wearable Sensors in a Kitchen Environment »
Joseph DelPreto · Chao Liu · Yiyue Luo · Michael Foshey · Yunzhu Li · Antonio Torralba · Wojciech Matusik · Daniela Rus -
2021 : Panel A: Deployable Learning Algorithms for Embodied Systems »
Shuran Song · Martin Riedmiller · Nick Roy · Aude G Billard · Angela Schoellig · SiQi Zhou -
2021 : Reinforcement Learning in Real-World Control Systems »
Martin Riedmiller -
2021 Workshop: 4th Robot Learning Workshop: Self-Supervised and Lifelong Learning »
Alex Bewley · Masha Itkina · Hamidreza Kasaei · Jens Kober · Nathan Lambert · Julien PEREZ · Ransalu Senanayake · Vincent Vanhoucke · Markus Wulfmeier · Igor Gilitschenski -
2021 Poster: Accelerating Quadratic Optimization with Reinforcement Learning »
Jeffrey Ichnowski · Paras Jain · Bartolomeo Stellato · Goran Banjac · Michael Luo · Francesco Borrelli · Joseph Gonzalez · Ion Stoica · Ken Goldberg -
2021 Poster: Sparse Flows: Pruning Continuous-depth Models »
Lucas Liebenwein · Ramin Hasani · Alexander Amini · Daniela Rus -
2021 Poster: Compressing Neural Networks: Towards Determining the Optimal Layer-wise Decomposition »
Lucas Liebenwein · Alaa Maalouf · Dan Feldman · Daniela Rus -
2021 Poster: Causal Navigation by Continuous-time Neural Networks »
Charles Vorbach · Ramin Hasani · Alexander Amini · Mathias Lechner · Daniela Rus -
2020 Workshop: 3rd Robot Learning Workshop »
Masha Itkina · Alex Bewley · Roberto Calandra · Igor Gilitschenski · Julien PEREZ · Ransalu Senanayake · Markus Wulfmeier · Vincent Vanhoucke -
2020 Poster: Deep Evidential Regression »
Alexander Amini · Wilko Schwarting · Ava P Soleimany · Daniela Rus -
2019 : Towards Robust Interactive Autonomy »
Igor Gilitschenski -
2019 Workshop: Robot Learning: Control and Interaction in the Real World »
Roberto Calandra · Markus Wulfmeier · Kate Rakelly · Sanket Kamthe · Danica Kragic · Stefan Schaal -
2019 Poster: Learning-In-The-Loop Optimization: End-To-End Control And Co-Design Of Soft Robots Through Learned Deep Latent Representations »
Andrew Spielberg · Allan Zhao · Yuanming Hu · Tao Du · Wojciech Matusik · Daniela Rus -
2018 Workshop: Infer to Control: Probabilistic Reinforcement Learning and Structured Control »
Leslie Kaelbling · Martin Riedmiller · Marc Toussaint · Igor Mordatch · Roy Fox · Tuomas Haarnoja -
2017 Workshop: Acting and Interacting in the Real World: Challenges in Robot Learning »
Ingmar Posner · Raia Hadsell · Martin Riedmiller · Markus Wulfmeier · Rohan Paul -
2016 Poster: Dimensionality Reduction of Massive Sparse Datasets Using Coresets »
Dan Feldman · Mikhail Volkov · Daniela Rus -
2015 Poster: Embed to Control: A Locally Linear Latent Dynamics Model for Control from Raw Images »
Manuel Watter · Jost Springenberg · Joschka Boedecker · Martin Riedmiller -
2014 Poster: Coresets for k-Segmentation of Streaming Data »
Guy Rosman · Mikhail Volkov · Dan Feldman · John Fisher III · Daniela Rus