Contributed talks
in
Workshop: Adaptive and Scalable Nonparametric Methods in Machine Learning
Yunpeng Pan, Xinyan Yan, Evangelos Theodorou, Byron Boots. Solving the Linear Bellman Equation via Kernel Embeddings and Stochastic Gradient Descent.
We introduce a data-efficient approach for solving the linear Bellman equation, which corresponds to a class of Markov decision processes (MDPs) and stochastic optimal control (SOC) problems. We show that this class of control problem can be reformulated as a stochastic composition optimization problem, which can be further reformulated as a saddle point problem and solved via dual kernel embeddings. Our method is model-free and using only one sample per state transition from stochastic dynamical systems. Different from related work such as Z-learning based on temporal-difference learning, our method is an on-line algorithm exploiting stochastic optimization. Numerical results are provided, showing that our method outperforms the Z-learning algorithm.
Live content is unavailable. Log in and register to view live content