3 Results
Workshop | Towards Reliable Evaluation of Behavior Steering Interventions in LLMs | Itamar Pres · Laura Ruis · Ekdeep S Lubana · David Krueger
Workshop | Steering Large Language Models using Conceptors: Improving Addition-Based Activation Engineering | Joris Postmus · Steven Abreu
Workshop | Language decoding from human brain activity via contrastive learning | Matteo Ferrante · Nicola Toschi · Alexander Huth