NeurIPS Learning a Domain-Agnostic Policy through Adversarial Representation Matching for Cross-Domain Policy Transfer

Poster in Workshop: Deep Reinforcement Learning Workshop

Hayato Watahiki · Ryo Iwase · Ryosuke Unno · Yoshimasa Tsuruoka

Abstract:

The low transferability of learned policies is one of the most critical problems limiting the applicability of learning-based solutions to decision-making tasks. In this paper, we present a way to align latent representations of states and actions between different domains by optimizing an adversarial objective. We train two models, a policy and a domain discriminator, with unpaired trajectories of proxy tasks through behavioral cloning as well as adversarial training. After the latent representations are aligned between domains, a domain-agnostic part of the policy trained with any method in the source domain can be immediately transferred to the target domain in a zero-shot manner. We empirically show that our simple approach achieves comparable performance to the latest methods in zero-shot cross-domain transfer. We also observe that our method performs better than other approaches in transfer between domains with different complexities, whereas other methods fail catastrophically.
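
The abstract describes training a policy and a domain discriminator jointly, combining behavioral cloning with an adversarial term. The following is only a minimal sketch of how such a combined objective might be scored, not the authors' implementation. All function names, the mean-squared-error imitation loss, the binary cross-entropy discriminator loss, and the `adv_weight` coefficient are assumptions made for illustration; the paper may use different losses and weighting.

```python
import math


def bce(prob, label):
    """Binary cross-entropy for one predicted probability and a 0/1 label."""
    eps = 1e-12
    prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


def discriminator_loss(probs, domain_labels):
    """Mean BCE of the discriminator's domain guesses on latent codes.

    probs[i] is the (hypothetical) predicted probability that latent i
    comes from the target domain; domain_labels[i] is 0 (source) or 1 (target).
    """
    return sum(bce(p, y) for p, y in zip(probs, domain_labels)) / len(probs)


def behavioral_cloning_loss(predicted_actions, demo_actions):
    """Mean over samples of the squared error between predicted and demonstrated actions."""
    total = 0.0
    for pred, demo in zip(predicted_actions, demo_actions):
        total += sum((a - b) ** 2 for a, b in zip(pred, demo))
    return total / len(predicted_actions)


def policy_loss(bc, disc, adv_weight=1.0):
    """Assumed policy objective: imitate the demonstrations while making
    the discriminator's loss large, i.e. making the domains hard to tell apart."""
    return bc - adv_weight * disc
```

In this sketch the discriminator would minimize `discriminator_loss`, while the policy (including its domain-specific encoders) would minimize `policy_loss`. A discriminator that outputs 0.5 for every latent is at chance level, which is the state an aligned representation is meant to approach.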
