Search All 2024 Events
 

2 Results

Workshop
Steering Large Language Models using Conceptors: Improving Addition-Based Activation Engineering
Joris Postmus · Steven Abreu
Workshop
Towards Reliable Evaluation of Behavior Steering Interventions in LLMs
Itamar Pres · Laura Ruis · Ekdeep S Lubana · David Krueger