Workshop: Distribution shifts: connecting methods and applications (DistShift)

Exploiting Causal Chains for Domain Generalization

Olawale Salaudeen · Sanmi Koyejo


Invariant Causal Prediction provides a framework for domain (or out-of-distribution) generalization, predicated on the assumption that causal mechanisms are invariant across the data distributions of interest. Accordingly, the Invariant Risk Minimization (IRM) objective has been proposed to learn this stable structure, given sufficiently many training distributions. Unfortunately, recent work has identified limitations of IRM when it is applied to data-generating mechanisms that differ from those considered in its formulation. This work considers a generative process with causal (predecessor) and anticausal (successor) features, in which environment-specific exogenous factors influence all features but the target is free of direct environment-specific influences. We show empirically that IRM fails under this data-generating process. Instead, we propose a target conditioned representation independence (TCRI) constraint, which enforces the mediating effect of the observed target on the causal chain of latent features we aim to identify. We show that this approach outperforms both Empirical Risk Minimization (ERM) and IRM.
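To make the data-generating process concrete, the following is a minimal simulation sketch, not the authors' experimental setup. It assumes a linear-Gaussian chain (causal feature, then target, then anticausal feature) in which a hypothetical `env_scale` parameter sets the environment-specific exogenous noise on both features, while the noise on the target is the same in every environment. The function names, the noise scales, and the use of partial correlation as a linear stand-in for conditional independence are all illustrative assumptions; the abstract does not specify how the TCRI constraint is implemented.

```python
import numpy as np


def sample_environment(n, env_scale, rng):
    """Draw n samples from one hypothetical environment.

    Assumed chain: z_c -> y -> z_a. `env_scale` perturbs both features;
    the target mechanism (y given z_c) is identical in every environment.
    """
    z_c = env_scale * rng.normal(size=n)       # causal (predecessor) feature
    y = z_c + rng.normal(size=n)               # target: invariant mechanism
    z_a = y + env_scale * rng.normal(size=n)   # anticausal (successor) feature
    return z_c, y, z_a


def ols_slope(x, y):
    """Slope of the least-squares line predicting y from x."""
    x_centered = x - x.mean()
    return float(x_centered @ (y - y.mean()) / (x_centered @ x_centered))


def residual(target, given):
    """Part of `target` not linearly explained by `given`."""
    g = given - given.mean()
    t = target - target.mean()
    return t - (g @ t) / (g @ g) * g


def partial_corr(a, b, given):
    """Correlation of a and b after linearly removing `given` from both."""
    return float(np.corrcoef(residual(a, given), residual(b, given))[0, 1])


rng = np.random.default_rng(0)
for env_scale in (0.5, 2.0):
    z_c, y, z_a = sample_environment(200_000, env_scale, rng)
    print(
        f"env_scale={env_scale}: "
        f"slope(y|z_c)={ols_slope(z_c, y):.3f}, "
        f"slope(y|z_a)={ols_slope(z_a, y):.3f}, "
        f"corr(z_c, z_a)={np.corrcoef(z_c, z_a)[0, 1]:.3f}, "
        f"partial_corr(z_c, z_a | y)={partial_corr(z_c, z_a, y):.3f}"
    )
```

Under these assumptions, the slope of the target on the causal feature stays the same across environments while the slope on the anticausal feature shifts with `env_scale`. The two features are marginally correlated but approximately uncorrelated once the target is conditioned on, which is the mediating role of the target that a conditional-independence constraint could exploit.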
