NeurIPS 2024

3 Results

Workshop		Towards Reliable Evaluation of Behavior Steering Interventions in LLMs		Itamar Pres · Laura Ruis · Ekdeep S Lubana · David Krueger
Workshop		Steering Large Language Models using Conceptors: Improving Addition-Based Activation Engineering		Joris Postmus · Steven Abreu
Workshop		Language decoding from human brain activity via contrastive learning		Matteo Ferrante · Nicola Toschi · Alexander Huth