
Imitation Is Not Enough: Robustifying Imitation with Reinforcement Learning for Challenging Driving Scenarios
Yiren Lu · Justin Fu · George Tucker · Xinlei Pan · Eli Bronstein · Rebecca Roelofs · Benjamin Sapp · Brandyn White · Aleksandra Faust · Shimon Whiteson · Dragomir Anguelov · Sergey Levine

Imitation learning (IL) is a simple and powerful way to use high-quality human driving data, which can be collected at scale, to identify driving preferences and produce human-like behavior. However, policies based on imitation learning alone often fail to sufficiently account for safety and reliability concerns. In this paper, we show how imitation learning combined with reinforcement learning using simple rewards can substantially improve the safety and reliability of driving policies over those learned from imitation alone. In particular, we use imitation and reinforcement learning to train a policy on over 100k miles of urban driving data, and measure its effectiveness in test scenarios grouped by different levels of collision risk. To our knowledge, this is the first application of a combined imitation and reinforcement learning approach in autonomous driving that utilizes large amounts of real-world human driving data.
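As a rough illustration of what combining imitation with reinforcement learning under a simple reward can look like, the sketch below scores a toy discrete policy by its expected reward plus a weighted log-likelihood of the expert's action. This is not taken from the paper: the function names, the `il_weight` parameter, the two-action setup, and the reward values are all assumptions made for the example.

```python
import math


def softmax(logits):
    """Convert raw action scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def combined_objective(logits, expert_action, rewards, il_weight=1.0):
    """Hypothetical combined objective (higher is better).

    RL term: expected reward of the policy under per-action rewards.
    IL term: log-probability the policy assigns to the expert's action.
    The weighting scheme is an assumption, not the paper's formulation.
    """
    probs = softmax(logits)
    expected_reward = sum(p * r for p, r in zip(probs, rewards))
    log_likelihood = math.log(probs[expert_action])
    return expected_reward + il_weight * log_likelihood


# Toy case: action 0 is what the expert did, action 1 incurs a penalty
# (for example, a collision) of -1.
uniform_score = combined_objective([0.0, 0.0], 0, [0.0, -1.0])
confident_score = combined_objective([3.0, 0.0], 0, [0.0, -1.0])
```

In this toy setting the policy that concentrates probability on the expert's penalty-free action scores higher on both terms. Setting `il_weight=0.0` leaves only the reward term.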

Author Information

Yiren Lu (Google)
Yiren Lu (Waymo Research)
Justin Fu (Waymo Research)
George Tucker (Google Brain)
Xinlei Pan (Waymo Research)
Eli Bronstein (Waymo Research)
Rebecca Roelofs (Google)
Benjamin Sapp (Waymo)
Brandyn White (Waymo)
Aleksandra Faust (Google Brain)

Aleksandra Faust is a Senior Research Engineer at Google Brain, specializing in robot intelligence. Previously, Aleksandra led machine learning efforts for self-driving car planning and controls at Waymo and Google X, and was a researcher at Sandia National Laboratories, where she worked on satellites and other remote sensing applications. She earned a Ph.D. in Computer Science at the University of New Mexico (with distinction), a Master’s in Computer Science from the University of Illinois at Urbana-Champaign, and a Bachelor’s in Mathematics from the University of Belgrade, Serbia. Her research interests include reinforcement learning, adaptive motion planning, and machine learning for decision-making. Aleksandra won the Tom L. Popejoy Award for the best doctoral dissertation at the University of New Mexico in Engineering, Mathematics, and Sciences for the period 2011-2014. She also received the Best Paper in Service Robotics award at ICRA 2018, Sandia National Laboratories’ Doctoral Studies Program and New Mexico Space Grant fellowships, and the Outstanding Graduate Student in Computer Science award. Her work has been featured in the New York Times.

Shimon Whiteson (Waymo Research)
Dragomir Anguelov (Waymo)
Sergey Levine (Google)
