Timezone: »

 
Poster
Towards Minimax Optimal Reinforcement Learning in Factored Markov Decision Processes
Yi Tian · Jian Qian · Suvrit Sra

Tue Dec 08 09:00 AM -- 11:00 AM (PST) @ Poster Session 1 #501

We study minimax optimal reinforcement learning in episodic factored Markov decision processes (FMDPs), which are MDPs with conditionally independent transition components. Assuming the factorization is known, we propose two model-based algorithms. The first one achieves minimax optimal regret guarantees for a rich class of factored structures, while the second one enjoys better computational complexity with a slightly worse regret. A key new ingredient of our algorithms is the design of a bonus term to guide exploration. We complement our algorithms by presenting several structure dependent lower bounds on regret for FMDPs that reveal the difficulty hiding in the intricacy of the structures.

Author Information

Yi Tian (MIT)
Jian Qian (MIT)
Suvrit Sra (MIT)

Suvrit Sra is a faculty member within the EECS department at MIT, where he is also a core faculty member of IDSS, LIDS, MIT-ML Group, as well as the statistics and data science center. His research spans topics in optimization, matrix theory, differential geometry, and probability theory, which he connects with machine learning --- a key focus of his research is on the theme "Optimization for Machine Learning” (http://opt-ml.org)

Related Events (a corresponding poster, oral, or spotlight)

More from the Same Authors