
Workshop: Distribution shifts: connecting methods and applications (DistShift)

Extending the WILDS Benchmark for Unsupervised Adaptation

Shiori Sagawa · Pang Wei Koh · Tony Lee · Irena Gao · Sang Michael Xie · Kendrick Shen · Ananya Kumar · Weihua Hu · Michihiro Yasunaga · Henrik Marklund · Sara Beery · Ian Stavness · Jure Leskovec · Kate Saenko · Tatsunori Hashimoto · Sergey Levine · Chelsea Finn · Percy Liang


Machine learning systems deployed in the wild are often trained on a source distribution that differs from the target distribution on which they are deployed. Unlabeled data can be a powerful source of leverage for mitigating these distribution shifts, as it is frequently much more available than labeled data. However, existing distribution shift benchmarks for unlabeled data do not reflect many scenarios that arise naturally in real-world applications. In this work, we introduce U-WILDS, which augments the WILDS benchmark of in-the-wild distribution shifts with curated unlabeled data that would be realistically obtainable in deployment. U-WILDS contains 8 datasets spanning a wide range of applications (from histology to wildlife conservation), tasks (classification, regression, and detection), and modalities (photos, satellite images, microscope slides, text, and molecular graphs). We systematically benchmark contemporary methods that leverage unlabeled data, including domain-invariant, self-training, and self-supervised methods, and show that their success on the shifts in U-WILDS is limited. To facilitate the development of methods that can work reliably on real-world distribution shifts, we provide an open-source package containing all of the relevant data loaders, model architectures, and methods.
