Poster

Position: Benchmarking is Broken - Don't Let AI be Its Own Judge

Zerui Cheng ⋅ Stella Wohnig ⋅ Ruchika Gupta ⋅ Samiul Alam ⋅ Tassallah Abdullahi ⋅ João Alves Ribeiro ⋅ Christian Nielsen-Garcia ⋅ Saif Mir ⋅ Siran Li ⋅ Jason Orender ⋅ Seyed Ali Bahrainian ⋅ Daniel Kirste ⋅ Aaron Gokaslan ⋅ Carsten Eickhoff ⋅ Ruben Wolff
2025 Poster

Abstract
