A Mechanistic Lens on Mode Connectivity
Ekdeep S Lubana · Eric Bigelow · Robert Dick · David Krueger · Hidenori Tanaka

With the rise of pretrained models, fine-tuning has become increasingly important. However, naive fine-tuning often does not eliminate a model's sensitivity to spurious cues. To understand and address this limitation, we study the geometry of neural network loss landscapes through the lens of mode connectivity. We tackle two questions: 1) Are models trained on different distributions mode-connected? 2) Can we fine-tune a pretrained model to switch modes? We define a notion of mechanistic similarity based on shared invariances and show that linearly connected modes are mechanistically similar. We find that naive fine-tuning yields linearly connected solutions and hence is unable to induce relevant invariances. We also propose and validate a method of "mechanistic fine-tuning" based on these insights.
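
As a rough illustration of what "linearly connected" means here, the sketch below measures the loss barrier along the straight line between two parameter vectors. It is not the authors' code or method: the `loss_barrier` helper, the one-dimensional `toy_loss`, and the number of interpolation points are all assumptions chosen for illustration, and real models would require evaluating a network loss on data at each interpolated point.

```python
import numpy as np

def loss_barrier(loss_fn, theta_a, theta_b, num_points=11):
    """Largest excess loss on the straight line between two parameter vectors.

    The excess is measured against the linear interpolation of the two
    endpoint losses; a value near zero suggests the endpoints are linearly
    connected under this (hypothetical) criterion.
    """
    alphas = np.linspace(0.0, 1.0, num_points)
    path_losses = np.array(
        [loss_fn((1 - a) * theta_a + a * theta_b) for a in alphas]
    )
    baseline = (1 - alphas) * path_losses[0] + alphas * path_losses[-1]
    return float(np.max(path_losses - baseline))

def toy_loss(theta):
    # Hypothetical double-well loss with minima at theta = -1 and theta = +1.
    return float(np.sum((theta ** 2 - 1.0) ** 2))

# Two points in the same basin: no barrier along the line between them.
same_basin = loss_barrier(toy_loss, np.array([0.9]), np.array([1.1]))

# The two distinct minima: the line passes over the bump at theta = 0.
across_basins = loss_barrier(toy_loss, np.array([-1.0]), np.array([1.0]))

print(same_basin, across_basins)
```

In this toy setting the barrier is zero within a basin and positive across basins, which is the kind of distinction the linear-connectivity question in the abstract turns on.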

Author Information

Ekdeep S Lubana (University of Michigan; CBS, Harvard University)
Eric Bigelow (Harvard University)
Robert Dick (University of Michigan)
David Krueger (Mila, University of Montreal)
Hidenori Tanaka (Harvard University)

More from the Same Authors