Poster in Workshop: Workshop on Multi-Turn Interactions in Large Language Models
Sat, Dec 6, 2025 • 10:30 AM – 11:30 AM PST

Language Models Rate Their Own Actions As Safer

Dipika Khullar · Jack Hopkins · Rowan Wang · Fabien Roger

Abstract
