Workshop: Deployable Decision Making in Embodied Systems (DDM)
Reward-Based Environment States for Robot Manipulation Policy Learning
Isabelle Ferrane · Heriberto Cuayahuitl
Training robot manipulation policies is a challenging open problem in robotics and artificial intelligence. In this paper we propose a novel, compact state representation based on the rewards predicted by an image-based task success classifier. Our experiments, which use the Pepper robot in simulation with two deep reinforcement learning algorithms on a grab-and-lift task, show that the proposed state representation achieves up to 97% task success with our best policies.
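To make the idea of a reward-based state concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the state is a short window of the most recent rewards predicted by the classifier; the abstract does not specify the window length or the classifier architecture. The names `RewardState`, `brightness_classifier`, and `history` are hypothetical, and the classifier here is a trivial stand-in for a trained image model.

```python
from collections import deque
from typing import Callable, List, Sequence

# Hypothetical image type: a grayscale frame as nested rows of floats in [0, 1].
Image = Sequence[Sequence[float]]


def brightness_classifier(image: Image) -> float:
    """Stand-in for a trained task success classifier (assumption).

    Returns the mean pixel intensity as a fake success score in [0, 1].
    A real system would run a learned image model here instead.
    """
    pixels = [p for row in image for p in row]
    return sum(pixels) / len(pixels) if pixels else 0.0


class RewardState:
    """Compact state made of the last `history` predicted rewards (assumption)."""

    def __init__(self, classifier: Callable[[Image], float], history: int = 4):
        self.classifier = classifier
        # Start with zeros so the state has a fixed length from the first step.
        self.rewards = deque([0.0] * history, maxlen=history)

    def update(self, image: Image) -> List[float]:
        """Score the new frame and return the updated fixed-length state."""
        self.rewards.append(self.classifier(image))
        return list(self.rewards)


if __name__ == "__main__":
    state = RewardState(brightness_classifier, history=3)
    print(state.update([[0.0, 0.0]]))  # dark frame: low predicted reward
    print(state.update([[1.0, 1.0]]))  # bright frame: high predicted reward
```

In this sketch the policy would receive the short list returned by `update` instead of raw pixels, which is one way a representation of this kind can stay compact regardless of image resolution.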