Most online reinforcement learning (RL) algorithms require a large number of interactions with the environment to learn a reliable control policy. Unfortunately, the assumption that repeated interactions with the environment are available does not hold for many real-world applications. Batch RL aims to learn a good control policy from a previously collected dataset without requiring additional interactions with the environment, which makes it very promising for solving real-world problems. However, in the real world, we may have only a limited number of data points for the tasks we are interested in. Moreover, most current batch RL methods learn a policy from a single fixed dataset, which makes it hard to learn a policy that performs well across multiple tasks. In this work, we propose to tackle these challenges with sample transfer and policy distillation. The proposed methods are evaluated on multiple control tasks to showcase their effectiveness.
Author Information
Di Wu (Samsung Electronics)
David Meger (McGill University)
Michael Jenkin (York University)
Steve Liu (Samsung Electronics Canada)
Gregory Dudek (Samsung Electronics Canada)
More from the Same Authors
- 2022: A Study of Human-Robot Handover through Human-Human Object Transfer
  Charlotte Morissette · Bobak Baghi · Francois Hogan · Gregory Dudek
- 2021 Poster: Active 3D Shape Reconstruction from Vision and Touch
  Edward Smith · David Meger · Luis Pineda · Roberto Calandra · Jitendra Malik · Adriana Romero Soriano · Michal Drozdzal