

Poster

$Q\sharp$: Provably Optimal Distributional RL for LLM Post-Training

Jin Zhou ⋅ Kaiwen Wang ⋅ Jonathan Chang ⋅ Zhaolin Gao ⋅ Nathan Kallus ⋅ Kilian Weinberger ⋅ Kianté Brantley ⋅ Wen Sun
2025 Poster

Abstract
