Search All 2023 Events
 

188 Results

Page 1 of 16
Workshop
Sat 8:50 Self-Evaluation Improves Selective Generation in Large Language Models
Jie Ren · Yao Zhao · Tu Vu · Peter Liu · Balaji Lakshminarayanan
Workshop
Robust Fitted-Q-Evaluation and Iteration under Sequentially Exogenous Unobserved Confounders
David Bruns-Smith · Angela Zhou
Poster
Tue 15:15 Self-Evaluation Guided Beam Search for Reasoning
Yuxi Xie · Kenji Kawaguchi · Yiran Zhao · James Xu Zhao · Min-Yen Kan · Junxian He · Michael Xie
Workshop
Re-evaluating Retrosynthesis Algorithms with Syntheseus
Krzysztof Maziarz · Austin Tripp · Guoqing Liu · Megan J Stanley · Shufang Xie · Piotr Gaiński · Philipp Seidl · Marwin Segler
Workshop
Knowledge-based in silico models and dataset for the comparative evaluation of mammography AI
Elena Sizikova · Niloufar Saharkhiz · Diksha Sharma · Miguel Lago · Berkman Sahiner · Jana Delfino · Aldo Badano
Workshop
MCU: A Task-centric Framework for Open-ended Agent Evaluation in Minecraft
Haowei Lin · Zihao Wang · Jianzhu Ma · Yitao Liang
Workshop
Structure-based and leakage-free data splits for rigorous protein function evaluation
Charlotte Rochereau · Mohammed AlQuraishi · Arthur Valentin · Gergo Nikolenyi
Workshop
Paper 44: Evaluating ChatGPT-generated Textbook Questions using IRT
Shreya Bhandari · Yunting Liu · Zachary Pardos
Workshop
SCIBENCH: Evaluating College-Level Scientific Problem-Solving Abilities of Large Language Models
Xiaoxuan Wang · Ziniu Hu · Pan Lu · Yanqiao Zhu · Jieyu Zhang · Satyen Subramaniam · Arjun Loomba · Shichang Zhang · Yizhou Sun · Wei Wang
Workshop
Carpe Diem: On the Evaluation of World Knowledge in Lifelong Language Models
Yujin Kim · Jaehong Yoon · Seonghyeon Ye · Sung Ju Hwang · Se-Young Yun
Workshop
An International Consortium for AI Risk Evaluations
Ross Gruetzemacher · Alan Chan · Štěpán Los · Kevin Frazier · Simeon Campos · Matija Franklin · José Hernández-Orallo · James Fox · Christin Manning · Philip M Tomei · Kyle Kilian
Workshop
Evaluating AI-guided Design for Scientific Discovery
Michael Pekala · Elizabeth Pogue · Alexander New · Gregory Bassen · Janna Domenico · Tyrel McQueen · Christopher Stiles