Poster Thu, Dec 4, 2025 • 11:00 AM – 2:00 PM PST

$Q\sharp$: Provably Optimal Distributional RL for LLM Post-Training

Jin Zhou ⋅ Kaiwen Wang ⋅ Jonathan Chang ⋅ Zhaolin Gao ⋅ Nathan Kallus ⋅ Kilian Weinberger ⋅ Kianté Brantley ⋅ Wen Sun

Abstract
