Search All 2024 Events
 

28 Results

Workshop
Sat 15:45 A Framework for Evaluating LLMs Under Task Indeterminacy
Luke Guerdan · Hanna Wallach · Solon Barocas · Alexandra Chouldechova
Workshop
Sat 15:45 A shared standard for valid measurement of generative AI systems' capabilities, risks, and impacts
Alexandra Chouldechova · Chad Atalla · Solon Barocas · A. Feder Cooper · Emily Corvi · Alex Dow · Jean Garcia-Gathright · Nicholas Pangakis · Stefanie Reed · Emily Sheng · Dan Vann · Matthew Vogel · Hannah Washington · Hanna Wallach
Workshop
Sat 15:45 Evaluating Generative AI Systems is a Social Science Measurement Challenge
Hanna Wallach · Meera Desai · Nicholas Pangakis · A. Feder Cooper · Angelina Wang · Solon Barocas · Alexandra Chouldechova · Chad Atalla · Su Lin Blodgett · Emily Corvi · Alex Dow · Jean Garcia-Gathright · Alexandra Olteanu · Stefanie Reed · Emily Sheng · Dan Vann · Jennifer Wortman Vaughan · Matthew Vogel · Hannah Washington · Abigail Jacobs
Workshop
Flood Prediction in Kenya - Leveraging Pre-Trained Models to Generate More Validation Data in a Sparse Observation Settings
Alim Karimi · David Quispe · Hammed Akande · Nicole Mongare · Valerie Brosnan · Asbina Baral
Workshop
Multimodal Auto Validation For Self-Refinement in Web Agents
Ruhana Azam · Tamer Abuelsaad · Aditya Vempaty · Ashish Jagmohan
Workshop
Generating and Validating Agent and Environment Code for Simulating Realistic Personality Profiles with Large Language Models
Nathan Cloos · M Ganesh Kumar · Adam Manoogian · Christopher Cueva · Shawn Rhoads
Workshop
Sun 14:00 Invited talk: Valid scientific inference with neural density estimators and generative models
Ann Lee
Competition
Sun 10:45 Compilation and Validation of the Weather Event Dataset
Aleksandra Gruca
Workshop
Sat 12:00 Statistically Valid Information Bottleneck via Multiple Hypothesis Testing
Amirmohammad Farzaneh · Osvaldo Simeone
Workshop
Sat 15:45 Estimating and Correcting for Misclassification Error in Empirical Textual Research
Jonathan Choi