Pre-training on massive unlabeled datasets greatly improves accuracy under distribution shifts. As a first step toward understanding this, we study a popular pre-training method, contrastive learning, in the unsupervised domain adaptation (UDA) setting where we only have labeled data from a source domain and unlabeled data from a target domain. We begin by showing on 4 benchmark datasets that out-of-the-box contrastive pre-training (even without large-scale unlabeled data) is competitive with other UDA methods. Intuitions from classical UDA methods such as domain adversarial training focus on bringing the domains together in feature space to improve generalization from source to target. Surprisingly, we find that contrastive pre-training learns features that are very far apart between the source and target domains. How then does contrastive learning improve robustness to distribution shift? We develop a conceptual model for contrastive learning under domain shifts, where data augmentations form connections between classes and domains that can be far apart. We propose a new measure of connectivity (the relative connection strengths between same and different classes across domains) that governs the success of contrastive pre-training for domain adaptation in a simple example and strongly correlates with our results on benchmark datasets.
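To make the setup concrete, below is a minimal sketch, in PyTorch, of the pipeline the abstract describes: SimCLR-style contrastive pre-training on pooled unlabeled source and target images, followed by a linear head trained on labeled source features only and then evaluated on the target domain. The encoder, the `augment` callable, and the data loaders are illustrative placeholders, not the authors' code or hyperparameters.

```python
# A minimal sketch (assumptions: a SimCLR-style setup with a toy encoder, a user-supplied
# `augment` callable, and placeholder data loaders). Not the authors' implementation;
# it only illustrates the pipeline in the abstract: contrastive pre-training on pooled
# unlabeled source + target images, then a linear head trained on labeled source data only.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SmallEncoder(nn.Module):
    """Toy conv encoder standing in for a ResNet backbone."""

    def __init__(self, feat_dim=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.proj = nn.Linear(64, feat_dim)  # projection head used only for the contrastive loss

    def forward(self, x, project=True):
        h = self.conv(x)  # representation used later by the linear probe
        return self.proj(h) if project else h


def info_nce(z1, z2, temperature=0.1):
    """NT-Xent loss on two augmented views of the same batch."""
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)          # (2N, d), unit norm
    sim = z @ z.t() / temperature                               # cosine similarities
    n = z1.shape[0]
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(mask, float("-inf"))                  # exclude self-similarity
    # Positives: view i in the first half pairs with view i in the second half, and vice versa.
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)


def contrastive_pretrain(encoder, unlabeled_loader, augment, epochs=10, lr=1e-3):
    """Pre-train on unlabeled images from both domains pooled together (no labels used)."""
    opt = torch.optim.Adam(encoder.parameters(), lr=lr)
    for _ in range(epochs):
        for x in unlabeled_loader:            # x: image batch drawn from source + target
            loss = info_nce(encoder(augment(x)), encoder(augment(x)))
            opt.zero_grad()
            loss.backward()
            opt.step()


def linear_probe(encoder, source_loader, num_classes, epochs=10, lr=1e-2):
    """Fit a linear classifier on frozen features using labeled source data only."""
    head = nn.Linear(64, num_classes)         # 64 = encoder feature width above
    opt = torch.optim.Adam(head.parameters(), lr=lr)
    encoder.eval()
    for _ in range(epochs):
        for x, y in source_loader:
            with torch.no_grad():
                h = encoder(x, project=False)
            loss = F.cross_entropy(head(h), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return head                               # evaluate head(encoder(x_target)) on the target domain
```

Note the design choice that mirrors the UDA setting: class labels are never used during pre-training, and target labels are never used at all; only the frozen features and a source-trained linear head carry over to the target domain.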
Author Information
Kendrick Shen (Stanford University)
Robert Jones (Stanford University)
Ananya Kumar (Stanford University)
Sang Michael Xie (Stanford University)
Percy Liang (Stanford University)

Percy Liang is an Assistant Professor of Computer Science at Stanford University (B.S. from MIT, 2004; Ph.D. from UC Berkeley, 2011). His research spans machine learning and natural language processing, with the goal of developing trustworthy agents that can communicate effectively with people and improve over time through interaction. Specific topics include question answering, dialogue, program induction, interactive learning, and reliable machine learning. His awards include the IJCAI Computers and Thought Award (2016), an NSF CAREER Award (2016), a Sloan Research Fellowship (2015), and a Microsoft Research Faculty Fellowship (2014).
More from the Same Authors
- 2020 : In-N-Out: Pre-Training and Self-Training using Auxiliary Information for Out-of-Distribution Robustness
  Robert Jones
- 2021 Spotlight: Why Do Pretrained Language Models Help in Downstream Tasks? An Analysis of Head and Prompt Tuning
  Colin Wei · Sang Michael Xie · Tengyu Ma
- 2021 : Simple Baselines Are Strong Performers for Differentially Private Natural Language Processing
  Xuechen (Chen) Li · Florian Tramer · Percy Liang · Tatsunori Hashimoto
- 2021 : Ensembles and Cocktails: Robust Finetuning for Natural Language Generation
  John Hewitt · Xiang Li · Sang Michael Xie · Benjamin Newman · Percy Liang
- 2021 : Calibrated Ensembles: A Simple Way to Mitigate ID-OOD Accuracy Tradeoffs
  Ananya Kumar · Aditi Raghunathan · Tengyu Ma · Percy Liang
- 2021 : Extending the WILDS Benchmark for Unsupervised Adaptation
  Shiori Sagawa · Pang Wei Koh · Tony Lee · Irena Gao · Sang Michael Xie · Kendrick Shen · Ananya Kumar · Weihua Hu · Michihiro Yasunaga · Henrik Marklund · Sara Beery · Ian Stavness · Jure Leskovec · Kate Saenko · Tatsunori Hashimoto · Sergey Levine · Chelsea Finn · Percy Liang
- 2022 : Surgical Fine-Tuning Improves Adaptation to Distribution Shifts
  Yoonho Lee · Annie Chen · Fahim Tajwar · Ananya Kumar · Huaxiu Yao · Percy Liang · Chelsea Finn
- 2022 : Fine-Tuning without Distortion: Improving Robustness to Distribution Shifts
  Percy Liang · Ananya Kumar
- 2022 Poster: Beyond Separability: Analyzing the Linear Transferability of Contrastive Representations to Related Subpopulations
  Jeff Z. HaoChen · Colin Wei · Ananya Kumar · Tengyu Ma
- 2022 Poster: Picking on the Same Person: Does Algorithmic Monoculture lead to Outcome Homogenization?
  Rishi Bommasani · Kathleen A. Creel · Ananya Kumar · Dan Jurafsky · Percy Liang
- 2021 Poster: Why Do Pretrained Language Models Help in Downstream Tasks? An Analysis of Head and Prompt Tuning
  Colin Wei · Sang Michael Xie · Tengyu Ma
- 2020 Poster: Self-training Avoids Using Spurious Features Under Domain Shift
  Yining Chen · Colin Wei · Ananya Kumar · Tengyu Ma
- 2019 Poster: Unlabeled Data Improves Adversarial Robustness
  Yair Carmon · Aditi Raghunathan · Ludwig Schmidt · John Duchi · Percy Liang
- 2019 Poster: Verified Uncertainty Calibration
  Ananya Kumar · Percy Liang · Tengyu Ma
- 2019 Spotlight: Verified Uncertainty Calibration
  Ananya Kumar · Percy Liang · Tengyu Ma