Search All 2024 Events
 

113 Results

Poster
Thu 11:00 Apathetic or Empathetic? Evaluating LLMs' Emotional Alignments with Humans
Jen-Tse Huang · Man Ho LAM · Eric John Li · Shujie Ren · Wenxuan Wang · Wenxiang Jiao · Zhaopeng Tu · Michael R Lyu
Workshop
Auto-Enhance: Towards a Meta-Benchmark to Evaluate AI Agents' Ability to Improve Other Agents
Samuel Brown · Basil Labib · Codruta Lugoj · Sai Sasank Y
Workshop
Rethinking Artistic Copyright Infringements in the Era of Text-to-Image Generative Models
Mazda Moayeri · Samyadeep Basu · Sriram Balasubramanian · Priyatham Kattakinda · Atoosa Chegini · Robert Brauneis · Soheil Feizi
Workshop
Dimensions of Generative AI Evaluation Design
Alex Dow · Jennifer Wortman Vaughan · Solon Barocas · Chad Atalla · Alexandra Chouldechova · Hanna Wallach
Workshop
Approximations may be all you need: Towards Pre-training LLMs with Low-Rank Decomposition and Optimizers
Namrata Shivagunde · Mayank Kulkarni · Giannis Karamanolakis · Jack FitzGerald · Yannick Versley · Saleh Soltan · Volkan Cevher · Jianhua Lu · Anna Rumshisky
Workshop
Had enough of experts? Elicitation and evaluation of Bayesian priors from large language models
David Antony Selby · Kai Spriestersbach · Yuichiro Iwashita · Dennis Bappert · Archana Warrier · Sumantrak Mukherjee · Muhammad Asim · Koichi Kise · Sebastian Vollmer
Poster
Wed 11:00 Navigating the Maze of Explainable AI: A Systematic Approach to Evaluating Methods and Metrics
Lukas Klein · Carsten Lüth · Udo Schlegel · Till Bungert · Mennatallah El-Assady · Paul Jaeger
Workshop
Sat 15:45 A shared standard for valid measurement of generative AI systems' capabilities, risks, and impacts
Alexandra Chouldechova · Chad Atalla · Solon Barocas · A. Feder Cooper · Emily Corvi · Alex Dow · Jean Garcia-Gathright · Nicholas Pangakis · Stefanie Reed · Emily Sheng · Dan Vann · Matthew Vogel · Hannah Washington · Hanna Wallach
Workshop
THaMES: An End-to-End Tool for Hallucination Mitigation and Evaluation in Large Language Models
Mengfei Liang · Archish Arun · Zekun Wu · CRISTIAN VILLALOBOS · Jonathan Lutch · Emre Kazim · Adriano Koshiyama · Philip Treleaven
Workshop
Sat 15:45 Evaluating Generative AI Systems is a Social Science Measurement Challenge
Hanna Wallach · Meera Desai · Nicholas Pangakis · A. Feder Cooper · Angelina Wang · Solon Barocas · Alexandra Chouldechova · Chad Atalla · Su Lin Blodgett · Emily Corvi · Alex Dow · Jean Garcia-Gathright · Alexandra Olteanu · Stefanie Reed · Emily Sheng · Dan Vann · Jennifer Wortman Vaughan · Matthew Vogel · Hannah Washington · Abigail Jacobs
Poster
Fri 11:00 ConMe: Rethinking Evaluation of Compositional Reasoning for Modern VLMs
Irene Huang · Wei Lin · Muhammad Jehanzeb Mirza · Jacob Hansen · Sivan Doveh · Victor Butoi · Roei Herzig · Assaf Arbelle · Hilde Kuehne · Trevor Darrell · Chuang Gan · Aude Oliva · Rogerio Feris · Leonid Karlinsky
Workshop
Evaluating Generative AI Systems is a Social Science Measurement Challenge
Hanna Wallach · Meera Desai · Nicholas Pangakis · A. Feder Cooper · Angelina Wang · Solon Barocas · Alexandra Chouldechova · Chad Atalla · Su Lin Blodgett · Emily Corvi · Alex Dow · Jean Garcia-Gathright · Alexandra Olteanu · Stefanie Reed · Emily Sheng · Dan Vann · Jennifer Wortman Vaughan · Matthew Vogel · Hannah Washington · Abigail Jacobs