188 Results
Poster | Thu 15:00 | On Evaluating Adversarial Robustness of Large Vision-Language Models | Yunqing Zhao · Tianyu Pang · Chao Du · Xiao Yang · Chongxuan LI · Ngai-Man (Man) Cheung · Min Lin
Poster | Tue 15:15 | High Precision Causal Model Evaluation with Conditional Randomization | Chao Ma · Cheng Zhang
Oral | Wed 14:15 | BEDD: The MineRL BASALT Evaluation and Demonstrations Dataset for Training and Benchmarking Agents that Solve Fuzzy Tasks | Stephanie Milani · Anssi Kanervisto · Karolis Ramanauskas · Sander Schulhoff · Brandon Houghton · Rohin Shah
Workshop | Fri 14:40 | EGraFFBench: Evaluation of Equivariant Graph Neural Network Force Fields for Atomistic Simulations | Vaibhav Bihani · UTKARSH PRATIUSH · Sajid Mannan · Tao Du · Zhimin Chen · Santiago Miret · Matthieu Micoulaut · Morten Smedskjaer · Sayan Ranu · N M Anoop Krishnan
Oral | Thu 8:30 | Evaluating Post-hoc Explanations for Graph Neural Networks via Robustness Analysis | Junfeng Fang · Wei Liu · Yuan Gao · Zemin Liu · An Zhang · Xiang Wang · Xiangnan He
Workshop | | CoDBench: A Critical Evaluation of Data-driven Models for Continuous Dynamical Systems | Priyanshu Burark · Karn Tiwari · Meer Mehran Rashid · Prathosh AP · N M Anoop Krishnan
Workshop | | Structure-based and leakage-free data splits for rigorous protein function evaluation | Charlotte Rochereau · Mohammed AlQuraishi · Arthur Valentin · Gergo Nikolenyi
Workshop | | Trustworthy LLMs: a Survey and Guideline for Evaluating Large Language Models' Alignment | Yang Liu · Yuanshun (Kevin) Yao · Jean-Francois Ton · Xiaoying Zhang · Ruocheng Guo · Hao Cheng · Yegor Klochkov · Muhammad Faaiz Taufiq · Hang Li