Continual Learning of Control Primitives: Skill Discovery via Reset-Games
Kelvin Xu · Siddharth Verma · Chelsea Finn · Sergey Levine

Thu Dec 10 09:00 AM -- 11:00 AM (PST) @ Poster Session 5 #607

Reinforcement learning has the potential to automate the acquisition of behavior in complex settings, but in order for it to be successfully deployed, a number of practical challenges must be addressed. First, in real-world settings, when an agent attempts a task and fails, the environment must somehow "reset" so that the agent can attempt the task again. While easy in simulation, this could require considerable human effort in the real world, especially if the number of trials is very large. Second, real-world learning is often limited by challenges in exploration, as complex, temporally extended behavior is often difficult to acquire with random exploration. In this work, we show how a single method can allow an agent to acquire skills with minimal supervision while removing the need for resets. We do this by exploiting the insight that the need to "reset" an agent to a broad set of initial states for a learning task provides a natural setting to learn a diverse set of "reset-skills." We propose a general-sum game formulation that naturally balances the objective of resetting and learning skills, and demonstrate that this approach improves performance on reset-free tasks, and additionally show that the skills we obtain can be used to significantly accelerate downstream learning.

Author Information

Kelvin Xu (UC Berkeley)
Siddharth Verma (UC Berkeley)
Chelsea Finn (Stanford)
Sergey Levine (UC Berkeley)
Sergey Levine

Sergey Levine received a BS and MS in Computer Science from Stanford University in 2009, and a Ph.D. in Computer Science from Stanford University in 2014. He joined the faculty of the Department of Electrical Engineering and Computer Sciences at UC Berkeley in fall 2016. His work focuses on machine learning for decision making and control, with an emphasis on deep learning and reinforcement learning algorithms. Applications of his work include autonomous robots and vehicles, as well as applications in other decision-making domains. His research includes developing algorithms for end-to-end training of deep neural network policies that combine perception and control, scalable algorithms for inverse reinforcement learning, deep reinforcement learning algorithms, and more.

More from the Same Authors