Timezone: »
Existing value function approximation methods have been successfully used in many applications, but they often lack useful a priori error bounds. We propose approximate bilinear programming, a new formulation of value function approximation that provides strong a priori guarantees. In particular, it provably finds an approximate value function that minimizes the Bellman residual. Solving a bilinear program optimally is NP hard, but this is unavoidable because the Bellman-residual minimization itself is NP hard. We, therefore, employ and analyze a common approximate algorithm for bilinear programs. The analysis shows that this algorithm offers a convergent generalization of approximate policy iteration. Finally, we demonstrate that the proposed approach can consistently minimize the Bellman residual on a simple benchmark problem.
Author Information
Marek Petrik (IBM)
Shlomo Zilberstein (Univ of Mass)
Related Events (a corresponding poster, oral, or spotlight)
-
2009 Spotlight: Robust Value Function Approximation Using Bilinear Programming »
Tue. Dec 8th 11:25 -- 11:26 PM Room
More from the Same Authors
-
2014 Poster: Stochastic Network Design in Bidirected Trees »
Xiaojian Wu · Daniel Sheldon · Shlomo Zilberstein -
2010 Poster: MAP Estimation for Graphical Models by Likelihood Maximization »
Akshat Kumar · Shlomo Zilberstein -
2009 Poster: Complexity of Decentralized Control: Special Cases »
Martin Allen · Shlomo Zilberstein -
2008 Poster: Biasing Approximate Dynamic Programming with a Lower Discount Factor »
Marek Petrik · Bruno Scherrer