Robust fine-tuning of zero-shot models
Mitchell Wortsman · Gabriel Ilharco · Jong Wook Kim · Mike Li · Hanna Hajishirzi · Ali Farhadi · Hongseok Namkoong · Ludwig Schmidt
Event URL: https://openreview.net/forum?id=x4-czw5UxFX

Large pre-trained models such as CLIP offer consistent accuracy across a range of data distributions when performing zero-shot inference (i.e., without fine-tuning on a specific dataset). Although existing fine-tuning approaches substantially improve accuracy in-distribution, they also reduce out-of-distribution robustness. We address this tension by introducing a simple and effective method for improving robustness: ensembling the weights of the zero-shot and fine-tuned models (WiSE-FT). Compared to standard fine-tuning, WiSE-FT provides large accuracy improvements out-of-distribution, while matching or improving in-distribution accuracy. On ImageNet (in-distribution) and five derived distribution shifts, WiSE-FT improves out-of-distribution accuracy by 2 to 10 percentage points (pp) while increasing in-distribution accuracy by nearly 1 pp relative to standard fine-tuning. WiSE-FT achieves similarly large robustness improvements (2 to 15 pp) on a diverse set of six further distribution shifts, and in-distribution accuracy gains of 0.8 to 3.3 pp compared to standard fine-tuning on seven commonly used transfer learning datasets. These improvements come at no additional computational cost during fine-tuning or inference.
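The weight ensembling described in the abstract can be illustrated with a minimal sketch. It assumes that ensembling means a linear interpolation between the two sets of weights, controlled by a mixing coefficient. The function name `interpolate_weights`, the parameter `alpha`, and the dict-of-lists weight format are hypothetical choices for illustration, not the authors' code.

```python
from typing import Dict, List

# Hypothetical weight format: parameter name -> flat list of floats.
Weights = Dict[str, List[float]]


def interpolate_weights(zero_shot: Weights, fine_tuned: Weights, alpha: float = 0.5) -> Weights:
    """Linearly interpolate two weight sets of the same architecture.

    alpha = 0.0 returns the zero-shot weights, alpha = 1.0 the fine-tuned
    weights. The mixing coefficient `alpha` is an assumption of this sketch.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if zero_shot.keys() != fine_tuned.keys():
        raise ValueError("both models must have the same parameter names")

    merged: Weights = {}
    for name, zs_values in zero_shot.items():
        ft_values = fine_tuned[name]
        if len(zs_values) != len(ft_values):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # Element-wise convex combination of the two parameter vectors.
        merged[name] = [(1.0 - alpha) * z + alpha * f for z, f in zip(zs_values, ft_values)]
    return merged


# Toy usage with made-up numbers (not real model weights).
zero_shot = {"layer.weight": [0.0, 2.0]}
fine_tuned = {"layer.weight": [1.0, 4.0]}
merged = interpolate_weights(zero_shot, fine_tuned, alpha=0.5)
```

Because the result is a single set of weights with the same shape as either input, inference on the merged model costs the same as inference on one model, which is consistent with the abstract's claim of no additional computational cost.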

Author Information

Mitchell Wortsman (University of Washington, Allen Institute for Artificial Intelligence)
Gabriel Ilharco (Department of Computer Science, University of Washington)
Jong Wook Kim (OpenAI)

Jong Wook Kim is a member of technical staff at OpenAI, where he worked on GPT-2 output detection, Jukebox, and CLIP. His research interests include representation learning and generative modeling of audio and music, as well as their applications to multimodal deep learning. Prior to OpenAI, he completed a Ph.D. in music technology at NYU, which focused on automatic music transcription. He also worked as a research scientist intern at Pandora and Spotify, and as a software engineer at Kakao and NCSOFT.

Mike Li (Columbia University)
Hanna Hajishirzi (University of Washington)
Ali Farhadi (University of Washington, Allen Institute for Artificial Intelligence)
Hongseok Namkoong (Stanford University)
Ludwig Schmidt (University of Washington)
