16 Results
Poster
|
Thu 16:30 |
GTBench: Uncovering the Strategic Reasoning Capabilities of LLMs via Game-Theoretic Evaluations Jinhao Duan · Renming Zhang · James Diffenderfer · Bhavya Kailkhura · Lichao Sun · Elias Stengel-Eskin · Mohit Bansal · Tianlong Chen · Kaidi Xu |
|
Workshop
|
Sat 15:45 |
A shared standard for valid measurement of generative AI systems' capabilities, risks, and impacts Alexandra Chouldechova · Chad Atalla · Solon Barocas · A. Feder Cooper · Emily Corvi · Alex Dow · Jean Garcia-Gathright · Nicholas Pangakis · Stefanie Reed · Emily Sheng · Dan Vann · Matthew Vogel · Hannah Washington · Hanna Wallach |
|
Workshop
|
Sat 15:45 |
Evaluating Generative AI Systems is a Social Science Measurement Challenge Hanna Wallach · Meera Desai · Nicholas Pangakis · A. Feder Cooper · Angelina Wang · Solon Barocas · Alexandra Chouldechova · Chad Atalla · Su Lin Blodgett · Emily Corvi · Alex Dow · Jean Garcia-Gathright · Alexandra Olteanu · Stefanie Reed · Emily Sheng · Dan Vann · Jennifer Wortman Vaughan · Matthew Vogel · Hannah Washington · Abigail Jacobs |