Stable Dual Dynamic Programming
Tao Wang · Daniel Lizotte · Michael Bowling · Dale Schuurmans

Tue Dec 04 05:20 PM -- 05:30 PM (PST)

Recently, a novel approach to dynamic programming and reinforcement learning has been proposed, based on maintaining explicit representations of stationary distributions instead of value functions. However, the convergence properties and practical effectiveness of these algorithms have not previously been studied. In this paper, we investigate the convergence properties of these dual algorithms both theoretically and empirically, and show how they can be scaled up by incorporating function approximation.
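
To make the idea of representing a distribution instead of a value function concrete, here is a minimal sketch for the simplest case: evaluating a fixed policy on a small discounted Markov chain. It is an illustration under stated assumptions, not the algorithm from the paper. The abstract gives no update rules, so the recursion M = (1 - gamma) I + gamma P M (the standard identity for the discounted state-visit distribution), the function names, and the two-state chain are all assumptions introduced here.

```python
def dual_policy_evaluation(P, r, gamma, iters=500):
    """Iterate M <- (1 - gamma) * I + gamma * P M (assumed dual update).

    Each row of M is a discounted state-visit distribution; the value
    function is recovered afterwards as v = M r / (1 - gamma).
    """
    n = len(P)
    # Start from the identity, whose rows are valid probability distributions.
    M = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(iters):
        M = [[(1 - gamma) * (1.0 if i == j else 0.0)
              + gamma * sum(P[i][k] * M[k][j] for k in range(n))
              for j in range(n)]
             for i in range(n)]
    v = [sum(M[i][j] * r[j] for j in range(n)) / (1 - gamma) for i in range(n)]
    return M, v


def primal_policy_evaluation(P, r, gamma, iters=500):
    """Standard value iteration for a fixed policy: v <- r + gamma * P v."""
    n = len(P)
    v = [0.0] * n
    for _ in range(iters):
        v = [r[i] + gamma * sum(P[i][k] * v[k] for k in range(n))
             for i in range(n)]
    return v


# Hypothetical two-state chain, used only for illustration.
P = [[0.9, 0.1], [0.5, 0.5]]   # row-stochastic transition matrix
r = [1.0, 0.0]                 # per-state reward
gamma = 0.9                    # discount factor

M, v_dual = dual_policy_evaluation(P, r, gamma)
v_primal = primal_policy_evaluation(P, r, gamma)
```

In this sketch every iterate of M keeps rows that sum to one, so the dual quantity stays bounded regardless of the reward scale, and both routes arrive at the same values for the toy chain.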

Author Information

Tao Wang (Australian National University / University of Alberta)
Daniel Lizotte (The University of Western Ontario)
Michael Bowling (DeepMind / University of Alberta)
Dale Schuurmans (Google Brain & University of Alberta)
