A General Framework for Sample-Efficient Function Approximation in Reinforcement Learning
Zixiang Chen · Chris Junchi Li · Angela Yuan · Quanquan Gu · Michael Jordan
Event URL: https://openreview.net/forum?id=8WN1GSIJf6U
With the increasing need for handling large state and action spaces, general function approximation has become a key technique in reinforcement learning (RL) problems. In this paper, we propose a unified framework that integrates both model-based and model-free reinforcement learning and subsumes nearly all Markov decision process (MDP) models in the existing literature for tractable RL. We propose a novel estimation function with decomposable structural properties for optimization-based exploration and use the functional Eluder dimension with respect to an admissible Bellman characterization function as a complexity measure of the model class. Under our framework, we propose a new sample-efficient algorithm, OPtimization-based ExploRation with Approximation (OPERA), which achieves regret bounds that match or improve over the best-known results for a variety of MDP models. In particular, for MDPs with low Witness rank, under a slightly stronger assumption, OPERA improves the state-of-the-art sample complexity results by a factor of $dH$. Our framework provides a generic interface to study and design new RL models and algorithms.

Author Information

Zixiang Chen (UCLA)
Chris Junchi Li (UC Berkeley)
Angela Yuan (UCLA)
Quanquan Gu (UCLA)
Michael Jordan (UC Berkeley)
