Poster in Affinity Event: Women in Machine Learning

Evaluating AI Agent Persuasion of Safety Monitors

Jennifer Za ⋅ Julija Bainiaksina ⋅ Nikita Ostrovsky ⋅ Tanush Chopra ⋅ Victoria Krakovna

Abstract
