113 Results
Poster
|
Thu 11:00 |
Apathetic or Empathetic? Evaluating LLMs' Emotional Alignments with Humans Jen-Tse Huang · Man Ho LAM · Eric John Li · Shujie Ren · Wenxuan Wang · Wenxiang Jiao · Zhaopeng Tu · Michael R Lyu |
|
Workshop
|
Auto-Enhance: Towards a Meta-Benchmark to Evaluate AI Agents' Ability to Improve Other Agents Samuel Brown · Basil Labib · Codruta Lugoj · Sai Sasank Y |
||
Workshop
|
Rethinking Artistic Copyright Infringements in the Era of Text-to-Image Generative Models Mazda Moayeri · Samyadeep Basu · Sriram Balasubramanian · Priyatham Kattakinda · Atoosa Chegini · Robert Brauneis · Soheil Feizi |
||
Workshop
|
Dimensions of Generative AI Evaluation Design Alex Dow · Jennifer Wortman Vaughan · Solon Barocas · Chad Atalla · Alexandra Chouldechova · Hanna Wallach |
||
Workshop
|
Approximations may be all you need: Towards Pre-training LLMs with Low-Rank Decomposition and Optimizers Namrata Shivagunde · Mayank Kulkarni · Giannis Karamanolakis · Jack FitzGerald · Yannick Versley · Saleh Soltan · Volkan Cevher · Jianhua Lu · Anna Rumshisky |
||
Workshop
|
Had enough of experts? Elicitation and evaluation of Bayesian priors from large language models David Antony Selby · Kai Spriestersbach · Yuichiro Iwashita · Dennis Bappert · Archana Warrier · Sumantrak Mukherjee · Muhammad Asim · Koichi Kise · Sebastian Vollmer |
||
Poster
|
Wed 11:00 |
Navigating the Maze of Explainable AI: A Systematic Approach to Evaluating Methods and Metrics Lukas Klein · Carsten Lüth · Udo Schlegel · Till Bungert · Mennatallah El-Assady · Paul Jaeger |
|
Workshop
|
Sat 15:45 |
A shared standard for valid measurement of generative AI systems' capabilities, risks, and impacts Alexandra Chouldechova · Chad Atalla · Solon Barocas · A. Feder Cooper · Emily Corvi · Alex Dow · Jean Garcia-Gathright · Nicholas Pangakis · Stefanie Reed · Emily Sheng · Dan Vann · Matthew Vogel · Hannah Washington · Hanna Wallach |
|
Workshop
|
THaMES: An End-to-End Tool for Hallucination Mitigation and Evaluation in Large Language Models Mengfei Liang · Archish Arun · Zekun Wu · CRISTIAN VILLALOBOS · Jonathan Lutch · Emre Kazim · Adriano Koshiyama · Philip Treleaven |
||
Workshop
|
Sat 15:45 |
Evaluating Generative AI Systems is a Social Science Measurement Challenge Hanna Wallach · Meera Desai · Nicholas Pangakis · A. Feder Cooper · Angelina Wang · Solon Barocas · Alexandra Chouldechova · Chad Atalla · Su Lin Blodgett · Emily Corvi · Alex Dow · Jean Garcia-Gathright · Alexandra Olteanu · Stefanie Reed · Emily Sheng · Dan Vann · Jennifer Wortman Vaughan · Matthew Vogel · Hannah Washington · Abigail Jacobs |
|
Poster
|
Fri 11:00 |
ConMe: Rethinking Evaluation of Compositional Reasoning for Modern VLMs Irene Huang · Wei Lin · Muhammad Jehanzeb Mirza · Jacob Hansen · Sivan Doveh · Victor Butoi · Roei Herzig · Assaf Arbelle · Hilde Kuehne · Trevor Darrell · Chuang Gan · Aude Oliva · Rogerio Feris · Leonid Karlinsky |
|
Workshop
|
Evaluating Generative AI Systems is a Social Science Measurement Challenge Hanna Wallach · Meera Desai · Nicholas Pangakis · A. Feder Cooper · Angelina Wang · Solon Barocas · Alexandra Chouldechova · Chad Atalla · Su Lin Blodgett · Emily Corvi · Alex Dow · Jean Garcia-Gathright · Alexandra Olteanu · Stefanie Reed · Emily Sheng · Dan Vann · Jennifer Wortman Vaughan · Matthew Vogel · Hannah Washington · Abigail Jacobs |