Aggregating Optimistic Planning Trees for Solving Markov Decision Processes
Gunnar Kedenburg · Raphael Fonteneau · Remi Munos

Thu Dec 05 07:00 PM -- 11:59 PM (PST) @ Harrah's Special Events Center, 2nd Floor

This paper addresses the problem of online planning in Markov Decision Processes using only a generative model. We propose a new algorithm based on the construction of a forest of single successor state planning trees. For every explored state-action pair, such a tree contains exactly one successor state, drawn from the generative model. The trees are built using a planning algorithm that follows the principle of optimism in the face of uncertainty, assuming the most favorable outcome in the absence of further information. In the decision-making step of the algorithm, the individual trees are combined. We discuss the approach, prove that the proposed algorithm is consistent, and show empirically that it performs better than a related algorithm which additionally assumes knowledge of all transition distributions.
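The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy MDP (`toy_generative_model`), the discount factor, the assumption that rewards lie in [0, 1], the expansion budget, and the aggregation rule (averaging per-action lower bounds across trees) are all assumptions made for illustration. Each tree samples exactly one successor per explored state-action pair and always expands the leaf with the highest optimistic upper bound.

```python
import heapq
import itertools
import random

GAMMA = 0.9  # discount factor (assumed for this sketch)


def toy_generative_model(state, action, rng):
    """Hypothetical 2-action MDP on states 0..4; returns (next_state, reward in [0, 1])."""
    if action == 1 and rng.random() < 0.8:
        nxt = min(state + 1, 4)
    else:
        nxt = max(state - 1, 0)
    return nxt, (1.0 if nxt == 4 else 0.0)


def plan_single_tree(root, actions, model, budget, rng):
    """Optimistically expand one single-successor tree.

    Returns, for each root action, the best discounted return found below it
    (a lower bound on the value, since rewards are assumed non-negative).
    """
    tie = itertools.count()  # tie-breaker so the heap never compares states
    # Heap entries: (-upper_bound, tie, state, depth, discounted_return, root_action)
    heap = [(-1.0 / (1.0 - GAMMA), next(tie), root, 0, 0.0, None)]
    best = {a: 0.0 for a in actions}
    for _ in range(budget):
        # Optimism: expand the leaf whose upper bound is largest.
        _, _, state, depth, ret, first = heapq.heappop(heap)
        for a in actions:
            # Exactly one successor is sampled per explored state-action pair.
            nxt, reward = model(state, a, rng)
            new_ret = ret + (GAMMA ** depth) * reward
            root_action = a if first is None else first
            best[root_action] = max(best[root_action], new_ret)
            # Upper bound: assume reward 1 at every step beyond this leaf.
            upper = new_ret + GAMMA ** (depth + 1) / (1.0 - GAMMA)
            heapq.heappush(
                heap, (-upper, next(tie), nxt, depth + 1, new_ret, root_action)
            )
    return best


def choose_action(root, actions, model, n_trees=20, budget=50, seed=0):
    """Build a forest of trees and average the per-action estimates (assumed rule)."""
    rng = random.Random(seed)
    totals = {a: 0.0 for a in actions}
    for _ in range(n_trees):
        tree_values = plan_single_tree(root, actions, model, budget, rng)
        for a in actions:
            totals[a] += tree_values[a] / n_trees
    return max(actions, key=totals.get), totals
```

Calling `choose_action(2, [0, 1], toy_generative_model)` returns the selected root action together with the averaged estimate for each action. Because every tree fixes a single sampled outcome per state-action pair, planning inside a tree is deterministic; the randomness of the transitions is only reflected across the forest.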

Author Information

Gunnar Kedenburg
Raphael Fonteneau (Université de Liège)
Remi Munos (Google DeepMind)
