firstbacksecondback
1 Results
Workshop
|
What Features in Prompts Jailbreak LLMs? Investigating the Mechanisms Behind Attacks Nathalie Kirch · Severin Field · Stephen Casper |