Sample Complexity Bounds for Iterative Stochastic Policy Optimization
Marin Kobilarov

Mon Dec 07 04:00 PM -- 08:59 PM (PST) @ 210 C #61

This paper is concerned with robustness analysis of decision making under uncertainty. We consider a class of iterative stochastic policy optimization problems and analyze the resulting expected performance for each newly updated policy at each iteration. In particular, we employ concentration-of-measure inequalities to compute future expected cost and probability of constraint violation using empirical runs. A novel inequality bound is derived that accounts for the possibly unbounded change-of-measure likelihood ratio resulting from iterative policy adaptation. The bound serves as a high-confidence certificate for providing future performance or safety guarantees. The approach is illustrated with a simple robot control scenario, and initial steps towards applications to challenging aerial vehicle navigation problems are presented.
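To make the idea of a high-confidence certificate from empirical runs concrete, the sketch below uses the standard one-sided Hoeffding inequality. This is not the paper's novel bound: it assumes i.i.d. runs of a single fixed policy with costs bounded in [0, cost_max], and so ignores the unbounded likelihood ratio that the paper's inequality is designed to handle. The function name and parameters are hypothetical and chosen only for illustration.

```python
import math


def hoeffding_upper_bound(costs, cost_max, delta):
    """Upper bound on the expected cost that holds with probability >= 1 - delta.

    Assumption (not from the paper): costs are i.i.d. samples from one fixed
    policy and each lies in [0, cost_max]. Under that assumption, Hoeffding's
    inequality gives E[cost] <= mean + cost_max * sqrt(ln(1/delta) / (2 n)).
    """
    n = len(costs)
    if n == 0:
        raise ValueError("at least one empirical run is required")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    empirical_mean = sum(costs) / n
    margin = cost_max * math.sqrt(math.log(1.0 / delta) / (2.0 * n))
    return empirical_mean + margin


# Illustrative data only: 100 runs, each with cost 0.5, certified at 95% confidence.
example_costs = [0.5] * 100
example_bound = hoeffding_upper_bound(example_costs, cost_max=1.0, delta=0.05)
```

The same function bounds a probability of constraint violation if each run is recorded as 1.0 when the constraint is violated and 0.0 otherwise, with cost_max set to 1.0. The margin shrinks at a rate of one over the square root of the number of runs, which is why such certificates tighten as more empirical runs are collected.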

Author Information

Marin Kobilarov (Johns Hopkins University)