Planning with General Objective Functions: Going Beyond Total Rewards
Ruosong Wang · Peilin Zhong · Simon Du · Russ Salakhutdinov · Lin Yang

Tue Dec 08 09:00 PM -- 11:00 PM (PST) @ Poster Session 2 #600
Standard sequential decision-making paradigms aim to maximize the cumulative reward obtained while interacting with an unknown environment, i.e., maximize $\sum_{h = 1}^H r_h$, where $H$ is the planning horizon. However, this paradigm fails to model important practical applications, e.g., safe control, which aims to maximize the lowest reward, i.e., maximize $\min_{h = 1}^H r_h$. In this paper, based on techniques from sketching algorithms, we propose a novel planning algorithm for deterministic systems that handles a large class of objective functions of the form $f(r_1, r_2, \ldots, r_H)$ that are of interest in practical applications. We show that efficient planning is possible if $f$ is symmetric under permutation of coordinates and satisfies certain technical conditions. Complementing our algorithm, we further prove that removing any one of these conditions makes the problem intractable in the worst case, which demonstrates the necessity of our conditions.
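
To make the difference between the two objectives concrete, the following is a minimal sketch on a hypothetical three-state deterministic system. It is not the sketching-based algorithm from the paper: it simply enumerates every action sequence, which takes time exponential in $H$. The transition table `TRANSITIONS` and the helpers `rollout` and `brute_force_plan` are illustrative assumptions, not names from the paper.

```python
from itertools import product

# Hypothetical toy deterministic system (an assumption for illustration).
# TRANSITIONS[state][action] -> (next_state, reward)
TRANSITIONS = {
    0: {0: (1, 10.0), 1: (2, 2.0)},  # action 0: big reward now; action 1: modest reward
    1: {0: (1, 0.0), 1: (1, 0.0)},   # absorbing state with zero reward
    2: {0: (2, 2.0), 1: (2, 2.0)},   # absorbing state with steady reward
}


def rollout(start, actions):
    """Return the reward sequence (r_1, ..., r_H) of a fixed action sequence."""
    state, rewards = start, []
    for action in actions:
        state, reward = TRANSITIONS[state][action]
        rewards.append(reward)
    return rewards


def brute_force_plan(f, start, horizon):
    """Exhaustively search all 2^H action sequences and maximize f(r_1, ..., r_H).

    This is only a baseline for tiny instances, not an efficient planner.
    """
    best = max(product((0, 1), repeat=horizon),
               key=lambda actions: f(rollout(start, actions)))
    return best, f(rollout(start, best))


H = 3
plan_sum, value_sum = brute_force_plan(sum, start=0, horizon=H)  # total reward
plan_min, value_min = brute_force_plan(min, start=0, horizon=H)  # lowest reward

print("sum objective:", plan_sum, value_sum)
print("min objective:", plan_min, value_min)
```

In this toy instance the two objectives disagree on the first action: maximizing the total reward takes the large immediate reward and then collects nothing, while maximizing the minimum reward prefers the steady path. Both `sum` and `min` are symmetric under permutation of their arguments, which is the kind of objective the abstract refers to.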

Author Information

Ruosong Wang (Carnegie Mellon University)
Peilin Zhong (Columbia University)
Simon Du (Institute for Advanced Study)
Russ Salakhutdinov (Carnegie Mellon University)
Lin Yang (UCLA)
