

Search All 2024 Events

23 Results

Page 1 of 2
Workshop
Plentiful Jailbreaks with String Compositions
Brian Huang
Workshop
Does Refusal Training in LLMs Generalize to the Past Tense?
Maksym Andriushchenko · Nicolas Flammarion
Poster (Thu 16:30)
Robust Prompt Optimization for Defending Language Models Against Jailbreaking Attacks
Andy Zhou · Bo Li · Haohan Wang
Workshop
AutoDefense: Multi-Agent LLM Defense against Jailbreak Attacks
Yifan Zeng · Yiran Wu · Xiao Zhang · Huazheng Wang · Qingyun Wu
Workshop
Infecting LLM Agents via Generalizable Adversarial Attack
Weichen Yu · Kai Hu · Tianyu Pang · Chao Du · Min Lin · Matt Fredrikson
Workshop
Jailbreaking Large Language Models with Symbolic Mathematics
Emet Bethany · Mazal Bethany · Juan Nolazco-Flores · Sumit Jha · Peyman Najafirad
Workshop (Sun 16:50)
Contributed Talk 6: Infecting LLM Agents via Generalizable Adversarial Attack
Weichen Yu · Kai Hu · Tianyu Pang · Chao Du · Min Lin · Matt Fredrikson
Poster (Thu 11:00)
Bag of Tricks: Benchmarking of Jailbreak Attacks on LLMs
Zhao Xu · Fan Liu · Hao Liu
Workshop
LLM Defenses Are Not Robust to Multi-Turn Human Jailbreaks Yet
Nathaniel Li · Ziwen Han · Ian Steneker · Willow Primack · Riley Goodside · Hugh Zhang · Zifan Wang · Cristina Menghini · Summer Yue
Poster (Thu 11:00)
Tree of Attacks: Jailbreaking Black-Box LLMs Automatically
Anay Mehrotra · Manolis Zampetakis · Paul Kassianik · Blaine Nelson · Hyrum Anderson · Yaron Singer · Amin Karbasi
Workshop (Sun 11:05)
Contributed Talk 3: LLM Defenses Are Not Robust to Multi-Turn Human Jailbreaks Yet
Nathaniel Li · Ziwen Han · Ian Steneker · Willow Primack · Riley Goodside · Hugh Zhang · Zifan Wang · Cristina Menghini · Summer Yue