Connecting Frameworks for Unsupervised Learning of Temporal Distance-Based Representations
John L Zhou · Shreyas Kaasyap · Jonathan Kao
Keywords:
unsupervised reinforcement learning
representation learning
eigenoptions
temporal distance
Abstract
Unsupervised reinforcement learning (URL) seeks to learn generally useful representations and behaviors in environments without extrinsic reward signals. In this work, we compare two approaches to this problem that learn representations preserving temporal distances. We draw a theoretical connection between eigenoptions, an unsupervised option-learning objective derived from spectral graph theory, and more recent frameworks for jointly learning options and temporal distance-based metric spaces. We discuss potential implications of the differences between the two approaches and show empirical support for our hypotheses in a simple grid-based environment.
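For readers unfamiliar with eigenoptions, the sketch below illustrates the standard construction from spectral graph theory that the abstract refers to: eigenvectors of the graph Laplacian of the environment's state-transition graph define intrinsic rewards, each of which induces an option. This is a minimal illustration on an assumed small grid world, not the paper's experimental setup; the grid size, variable names, and helper functions are placeholders.

```python
import numpy as np

SIZE = 5  # assumed 5x5 grid world, states indexed 0..24


def build_adjacency(size):
    """Adjacency matrix of the grid's state-transition graph (4-connected)."""
    n = size * size
    A = np.zeros((n, n))
    for r in range(size):
        for c in range(size):
            s = r * size + c
            for dr, dc in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
                nr, nc = r + dr, c + dc
                if 0 <= nr < size and 0 <= nc < size:
                    A[s, nr * size + nc] = 1.0
    return A


A = build_adjacency(SIZE)
# Symmetric normalized graph Laplacian: L = I - D^{-1/2} A D^{-1/2}
D_inv_sqrt = np.diag(1.0 / np.sqrt(A.sum(axis=1)))
L = np.eye(len(A)) - D_inv_sqrt @ A @ D_inv_sqrt

# Eigenvectors with the smallest eigenvalues capture the slowest-varying
# structure of the environment; each non-trivial one induces an eigenoption.
eigvals, eigvecs = np.linalg.eigh(L)
e = eigvecs[:, 1]  # second eigenvector (the first is constant)


def intrinsic_reward(s, s_next, direction=1.0):
    """Eigenoption intrinsic reward: the change in the eigenvector value
    along a transition (taken in either sign direction)."""
    return direction * (e[s_next] - e[s])


print(intrinsic_reward(0, 1))  # reward for stepping right from the corner state
```

A policy maximizing this intrinsic reward, terminating where no action yields positive reward, is one eigenoption; repeating the construction across eigenvectors yields a set of temporally extended behaviors without any extrinsic reward.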