Reinforcement Learning in Continuous Action Spaces through Sequential Monte Carlo Methods
Alessandro Lazaric · Marcello Restelli · Andrea Bonarini

Tue Dec 04 05:20 PM -- 05:30 PM (PST)

Learning in real-world domains often requires dealing with continuous state and action spaces. Although many solutions have been proposed to apply Reinforcement Learning algorithms to continuous state problems, the same techniques can hardly be extended to continuous action spaces, where, besides computing a good approximation of the value function, a fast method for identifying the highest-valued action is needed. In this paper, we propose a novel actor-critic approach in which the policy of the actor is estimated through sequential Monte Carlo methods. The importance sampling step is performed on the basis of the values learned by the critic, while the resampling step modifies the actor's policy. The proposed approach has been empirically compared to other learning algorithms in several domains; in this paper, we report results obtained in a control problem consisting of steering a boat across a river.
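
The two steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract gives no formulas, so the Boltzmann weighting, the effective-sample-size trigger, the Gaussian jitter, and all names and parameter values (`reweight`, `resample`, `smc_step`, `temperature`, `noise`, `ess_fraction`) are assumptions made for illustration. The critic here is a fixed toy function for a single state, standing in for learned action values.

```python
import math
import random


def reweight(weights, q_values, temperature=0.1):
    """Importance sampling step: scale each weight by a Boltzmann factor
    of the critic's value (assumed weighting, not taken from the paper)."""
    q_max = max(q_values)  # subtract the maximum for numerical stability
    raw = [w * math.exp((q - q_max) / temperature)
           for w, q in zip(weights, q_values)]
    total = sum(raw)
    return [r / total for r in raw]


def effective_sample_size(weights):
    """Standard ESS estimate for normalized weights: 1 / sum(w_i^2)."""
    return 1.0 / sum(w * w for w in weights)


def resample(actions, weights, rng, noise=0.05, low=-1.0, high=1.0):
    """Resampling step: draw actions in proportion to their weights, jitter
    them with Gaussian noise, clip to the bounds, and reset the weights."""
    n = len(actions)
    drawn = rng.choices(actions, weights=weights, k=n)
    moved = [min(high, max(low, a + rng.gauss(0.0, noise))) for a in drawn]
    return moved, [1.0 / n] * n


def smc_step(actions, weights, critic, rng, ess_fraction=0.5):
    """One update: reweight by critic values, resample if weights degenerate."""
    q_values = [critic(a) for a in actions]
    weights = reweight(weights, q_values)
    if effective_sample_size(weights) < ess_fraction * len(actions):
        actions, weights = resample(actions, weights, rng)
    return actions, weights


if __name__ == "__main__":
    rng = random.Random(0)
    toy_critic = lambda a: -(a - 0.7) ** 2  # hypothetical critic, best action 0.7
    actions = [rng.uniform(-1.0, 1.0) for _ in range(50)]
    weights = [1.0 / 50] * 50
    for _ in range(30):
        actions, weights = smc_step(actions, weights, toy_critic, rng)
    print(sum(a * w for a, w in zip(actions, weights)))
```

In this sketch the weighted set of action samples plays the role of the actor's policy: the samples drift toward actions the critic values highly, so no explicit maximization over the continuous action space is carried out.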

Author Information

Alessandro Lazaric (Facebook Artificial Intelligence Research)
Marcello Restelli (Politecnico di Milano)
Andrea Bonarini (AI&Robotics Lab - Politecnico di Milano)
