Workshop: Workshop on Distribution Shifts: Connecting Methods and Applications

A Learning Based Hypothesis Test for Harmful Covariate Shift

Tom Ginsberg · Zhongyuan Liang · Rahul Krishnan


Quickly and accurately identifying covariate shift at test time is a critical and often-overlooked component of safe machine learning systems deployed in high-risk domains. In this work, we give an intuitive definition of harmful covariate shift (HCS) as a change in distribution that may weaken the generalization of a classification model. To detect HCS, we use the discordance between classifiers trained to agree on training data and disagree on test data. We derive a loss function for training these models and show that their disagreement rate and entropy are powerful discriminative statistics for HCS. Empirically, we demonstrate the ability of our method to detect harmful covariate shift with statistical certainty on a variety of high-dimensional datasets. Across numerous domains and modalities, we show state-of-the-art performance compared to existing methods, particularly when the number of observed test samples is small.
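
The two statistics named in the abstract, disagreement rate and entropy, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names are hypothetical, the loss used to train the disagreeing classifiers is not shown, and the rank-based empirical p-value against a set of null statistics is an assumed calibration step that the abstract does not specify.

```python
import numpy as np


def disagreement_rate(base_preds, other_preds):
    """Fraction of samples on which two classifiers predict different labels."""
    base_preds = np.asarray(base_preds)
    other_preds = np.asarray(other_preds)
    return float(np.mean(base_preds != other_preds))


def mean_entropy(probs, eps=1e-12):
    """Average Shannon entropy (in nats) of per-sample class probabilities.

    probs has shape (n_samples, n_classes); eps guards against log(0).
    """
    probs = np.asarray(probs, dtype=float)
    per_sample = -np.sum(probs * np.log(probs + eps), axis=1)
    return float(np.mean(per_sample))


def empirical_p_value(observed, null_stats):
    """One-sided empirical p-value: how often a null statistic is at least
    as large as the observed one (assumed calibration, with +1 smoothing)."""
    null_stats = np.asarray(null_stats, dtype=float)
    return float((1 + np.sum(null_stats >= observed)) / (1 + null_stats.size))


# Hypothetical predictions from a base classifier and a second classifier
# trained to disagree with it on unlabeled test samples.
rate = disagreement_rate([0, 1, 1, 0], [0, 1, 0, 1])

# Hypothetical disagreement rates measured on held-out in-distribution data.
p = empirical_p_value(rate, [0.1, 0.2, 0.05, 0.6])
```

In this sketch a small p-value would indicate that the observed disagreement is unusually high relative to in-distribution data, which is the kind of evidence a hypothesis test for shift relies on.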
