Poster

Attack via Overfitting: 10-shot Benign Fine-tuning to Jailbreak LLMs

Zhixin Xie ⋅ Xurui Song ⋅ Jun Luo
2025 Poster

Abstract
