Search All 2024 Events
 

340 Results

Poster
Thu 11:00 Bag of Tricks: Benchmarking of Jailbreak Attacks on LLMs
Zhao Xu · Fan Liu · Hao Liu
Poster
Thu 11:00 When LLMs Meet Cunning Texts: A Fallacy Understanding Benchmark for Large Language Models
Yinghui Li · Qingyu Zhou · Yuanzhen Luo · Shirong Ma · Yangning Li · Hai-Tao Zheng · Xuming Hu · Philip S Yu
Poster
Thu 16:30 PertEval: Unveiling Real Knowledge Capacity of LLMs with Knowledge-Invariant Perturbations
Jiatong Li · Renjun Hu · Kunzhe Huang · Yan Zhuang · Qi Liu · Mengxiao Zhu · Xing Shi · Wei Lin
Poster
Thu 16:30 NYU CTF Bench: A Scalable Open-Source Benchmark Dataset for Evaluating LLMs in Offensive Security
Minghao Shao · Sofija Jancheska · Meet Udeshi · Brendan Dolan-Gavitt · Haoran Xi · Kimberly Milner · Boyuan Chen · Max Yin · Siddharth Garg · Prashanth Krishnamurthy · Farshad Khorrami · Ramesh Karri · Muhammad Shafique
Poster
Wed 11:00 Can LLMs Solve Molecule Puzzles? A Multimodal Benchmark for Molecular Structure Elucidation
Kehan Guo · Bozhao Nan · Yujun Zhou · Taicheng Guo · Zhichun Guo · Mihir Surve · Zhenwen Liang · Nitesh Chawla · Olaf Wiest · Xiangliang Zhang
Poster
Wed 16:30 Benchmarking LLMs via Uncertainty Quantification
Fanghua Ye · Mingming Yang · Jianhui Pang · Longyue Wang · Derek Wong · Emine Yilmaz · Shuming Shi · Zhaopeng Tu
Poster
Fri 16:30 Trace is the Next AutoDiff: Generative Optimization with Rich Feedback, Execution Traces, and LLMs
Ching-An Cheng · Allen Nie · Adith Swaminathan
Poster
Fri 11:00 UniTox: Leveraging LLMs to Curate a Unified Dataset of Drug-Induced Toxicity from FDA Labels
Jacob Silberg · Kyle Swanson · Elana Simon · Angela Zhang · Zaniar Ghazizadeh · Scott Ogden · Hisham Hamadeh · James Zou
Poster
Fri 11:00 StackEval: Benchmarking LLMs in Coding Assistance
Nidhish Shah · Zulkuf Genc · Dogu Araci
Poster
Wed 16:30 RepLiQA: A Question-Answering Dataset for Benchmarking LLMs on Unseen Reference Content
Joao Monteiro · Pierre-André Noël · Étienne Marcotte · Sai Rajeswar Mudumba · Valentina Zantedeschi · David Vazquez · Nicolas Chapados · Chris Pal · Perouz Taslakian
Poster
Fri 16:30 Me, Myself, and AI: The Situational Awareness Dataset (SAD) for LLMs
Rudolf Laine · Bilal Chughtai · Jan Betley · Kaivalya Hariharan · Mikita Balesni · Jérémy Scheurer · Marius Hobbhahn · Alexander Meinke · Owain Evans
Poster
Fri 16:30 QuanTA: Efficient High-Rank Fine-Tuning of LLMs with Quantum-Informed Tensor Adaptation
Zhuo Chen · Rumen Dangovski · Charlotte Loh · Owen Dugan · Di Luo · Marin Soljacic