Poster in Workshop: Reliable ML from Unreliable Data
Sat, Dec 6, 2025 • 4:00 PM – 5:00 PM PST

Breaking the Mirror: Activation-Based Mitigation of Self-Preference in LLM Evaluators

Dani Roytburg · Matthew Nguyen · Matthew Bozoukov · Hongyu Fu · Jou Barzdukas · Narmeen Oozeer

Abstract
