Transfer RL across Observation Feature Spaces via Model-Based Regularization
Yanchao Sun · Ruijie Zheng · Xiyao Wang · Andrew Cohen · Furong Huang
Event URL: https://openreview.net/forum?id=Ut-xGkKUlm

In many reinforcement learning (RL) applications, the observation space is specified by human developers and restricted by physical realizations, and may thus change dramatically over time (e.g., the number of observable features may increase). When the observation space changes, the previous policy usually fails because its input features no longer match, so a new policy has to be trained from scratch, which is inefficient in both computation and samples. In this paper, we propose a novel algorithm that extracts the latent-space dynamics in the source task and transfers the dynamics model to the target task with a model-based regularizer. Theoretical analysis shows that the transferred dynamics model helps with representation learning in the target task. Our algorithm works for drastic changes of observation space (e.g., from vector-based observation to image-based observation), without any inter-task mapping or any prior knowledge of the target task. Empirical results show that our algorithm significantly improves the efficiency and stability of learning in the target task.
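
To make the idea of a model-based regularizer concrete, here is a minimal sketch in Python. It is not the authors' implementation. It assumes a linear latent dynamics model learned in the source task and kept frozen (`dyn_z`, `dyn_a`), and a linear encoder for the new target observations (`encoder`); all names, shapes, and the squared-error form of the penalty are hypothetical choices made for illustration. The penalty is small when the target encoder maps observations into a latent space where the transferred dynamics still predict the next latent state.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply matrix m by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def encode(encoder: Matrix, obs: Vector) -> Vector:
    """Hypothetical linear target encoder: latent = encoder @ obs."""
    return matvec(encoder, obs)


def predict_next_latent(dyn_z: Matrix, dyn_a: Matrix,
                        latent: Vector, action: Vector) -> Vector:
    """Frozen latent dynamics from the source task (assumed linear):
    z_next = dyn_z @ z + dyn_a @ a."""
    from_latent = matvec(dyn_z, latent)
    from_action = matvec(dyn_a, action)
    return [p + q for p, q in zip(from_latent, from_action)]


def dynamics_regularizer(encoder: Matrix, dyn_z: Matrix, dyn_a: Matrix,
                         obs: Vector, action: Vector,
                         next_obs: Vector) -> float:
    """Squared error between the predicted next latent state and the
    encoding of the observed next target observation."""
    z = encode(encoder, obs)
    z_next = encode(encoder, next_obs)
    z_pred = predict_next_latent(dyn_z, dyn_a, z, action)
    return sum((p - t) ** 2 for p, t in zip(z_pred, z_next))


# Toy setup: 2-D latent state, 2-D action, 4-D target observation.
identity = [[1.0, 0.0], [0.0, 1.0]]
dyn_z, dyn_a = identity, identity  # z_next = z + a (assumed source model)

# The target observation repeats the latent state twice.
obs = [1.0, 2.0, 1.0, 2.0]
action = [0.5, -1.0]
next_obs = [1.5, 1.0, 1.5, 1.0]

# An encoder that recovers the latent state agrees with the dynamics.
good_encoder = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
# An encoder that discards the first latent coordinate does not.
bad_encoder = [[0.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]

good_loss = dynamics_regularizer(good_encoder, dyn_z, dyn_a,
                                 obs, action, next_obs)
bad_loss = dynamics_regularizer(bad_encoder, dyn_z, dyn_a,
                                obs, action, next_obs)
print(good_loss, bad_loss)
```

In a training loop, a term like this would typically be added to the usual RL loss with a weighting coefficient, so that gradients shape the target encoder while the transferred dynamics stay fixed. The weighting and the choice to freeze the dynamics are assumptions of this sketch.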

Author Information

Yanchao Sun (University of Maryland, College Park)
Ruijie Zheng (University of Maryland, College Park)
Xiyao Wang (Center for Research on Intelligent System and Engineering, Institute of Automation, CAS, University of Chinese Academy of Sciences)
Andrew Cohen (Unity Technologies)
Furong Huang (University of Maryland)

Furong Huang is an assistant professor of computer science. Huang’s research focuses on machine learning, high-dimensional statistics and distributed algorithms—both the theoretical analysis and practical implementation of parallel spectral methods for latent variable graphical models. Some applications of her research include developing fast detection algorithms to discover hidden and overlapping user communities in social networks, learning convolutional sparse coding models for understanding semantic meanings of sentences and object recognition in images, healthcare analytics by learning a hierarchy on human diseases for guiding doctors to identify potential diseases afflicting patients, and more. Huang recently completed a postdoctoral position at Microsoft Research in New York.
