39 Results
Workshop | Evaluating Interventional Reasoning Capabilities of Large Language Models | Tejas Kasetty · Divyat Mahajan · Gintare Karolina Dziugaite · Alexandre Drouin · Dhanya Sridhar
Workshop | Conditioned Language Policy: A General Framework For Steerable Multi-Objective Finetuning | Kaiwen Wang · Rahul Kidambi · Ryan Sullivan · Alekh Agarwal · Christoph Dann · Andrea Michi · Marco Gelmi · Yunxuan Li · Raghav Gupta · Kumar Avinava Dubey · Alexandre Rame · Johan Ferret · Geoffrey Cideron · Le Hou · Hongkun Yu · Amr Ahmed · Aranyak Mehta · Leonard Hussenot · Olivier Bachem · Edouard Leurent
Poster | Wed 11:00 | Rainbow Teaming: Open-Ended Generation of Diverse Adversarial Prompts | Mikayel Samvelyan · Sharath Chandra Raparthy · Andrei Lupu · Eric Hambro · Aram Markosyan · Manish Bhatt · Yuning Mao · Minqi Jiang · Jack Parker-Holder · Jakob Foerster · Tim Rocktäschel · Roberta Raileanu