| Type | Time | Title | Authors |
|---|---|---|---|
| Workshop | | On the Efficiency of NLP-Inspired Methods for Tabular Deep Learning | Soheila Samiee · Anton Thielmann |
| Workshop | | Uncertainty as a criterion for SOTIF evaluation of deep learning models in autonomous driving systems | Ho Suk |
| Poster | Wed 11:00 | Evaluating the design space of diffusion-based generative models | Yuqing Wang · Ye He · Molei Tao |
| Workshop | | Rethinking Backdoor Detection Evaluation for Language Models | Jun Yan · Wenjie Mo · Xiang Ren · Robin Jia |
| Workshop | | Worse than Random? An Embarrassingly Simple Probing Evaluation of Large Multimodal Models in Medical VQA | Qianqi Yan · Xuehai He · Xiang Yue · Xin Eric Wang |
| Poster | Fri 11:00 | SETLEXSEM CHALLENGE: Using Set Operations to Evaluate the Lexical and Semantic Robustness of Language Models | Nicholas Dronen · Bardiya Akhbari · Manish Digambar Gawali |
| Workshop | Sun 10:35 | Invited Talk by Christoph Bergmeir - Fundamental limitations of foundational forecasting models: The need for multimodality and rigorous evaluation | |
| Workshop | | CausalBench: A Comprehensive Benchmark for Evaluating Causal Reasoning Capabilities of Large Language Models | ZEYU WANG |
| Tutorial | Tue 9:30 | Evaluating Large Language Models - Principles, Approaches, and Applications | Bo Li · Irina Sigler · Yuan Xue |
| Expo Talk Panel | Wed 16:30 | EUREKA: Evaluating and Understanding Large Foundation Models | Besmira Nushi · Vidhisha Balachandran · Neel Joshi |
| Workshop | | Development and bilingual evaluation of Japanese medical large language model within reasonably low computational resources | Issey Sukeda |
| Workshop | | Cascaded to End-to-End: New Safety, Security, and Evaluation Questions for Audio Language Models | Luxi He · Xiangyu Qi · Inyoung Cheong · Prateek Mittal · Danqi Chen · Peter Henderson |