Poster
in
Workshop: Second Workshop on Quantum Tensor Networks in Machine Learning

Deep variational reinforcement learning by optimizing Hamiltonian equation

Zeliang Zhang ⋅ Xiao-Yang Liu

Abstract

Deep variational reinforcement learning by optimizing Hamiltonian equation is a novel training method in reinforcement learning. Liu \cite{liu2020vrl} proposed to maximize the Hamiltonian equation to obtain the policy network. In this poster, we apply the massively parallel simulation to sample trajectories (collecting information of the reward tensor) and train the deep policy network by maximizing a partial Hamiltonian equation. On the FrozenLake $8\times8$ and GridWorld $10\times10$ examples, we verify the theory in \cite{liu2020vrl} by showing that deep Hamiltonian network (DHN) for variational reinforcement learning is more stable and efficient than DQN \cite{mnih2013playing}. Our codes are available at:\href{https://github.com/AI4Finance-Foundation/Quantum-Tensor-Networks-for-Variational-Reinforcement-Learning-NeurIPS-2020}.

Chat is not available.