Poster in Workshop: Optimization for ML Workshop

Fast Convergence of Softmax Policy Mirror Ascent for Bandits & Tabular MDPs

Reza Asad ⋅ Reza Babanezhad Harikandeh ⋅ Issam Hadj Laradji ⋅ Nicolas Le Roux ⋅ Sharan Vaswani

Abstract
