Common explanations for shortcut learning assume that the shortcut improves prediction only under the training distribution. Thus, models trained in the typical way, by minimizing log-loss using gradient descent, which we call default-ERM, should utilize the shortcut. However, even when the stable feature determines the label in the training distribution and the shortcut provides no additional information, as in perception tasks, default-ERM exhibits shortcut learning. Why are such solutions preferred when the loss can be driven to zero using the stable feature alone? By studying a linear perception task, we show that default-ERM’s preference for maximizing the margin, even without overparameterization, leads to models that depend more on the shortcut than on the stable feature. This insight suggests that default-ERM’s implicit inductive bias toward max-margin solutions may be unsuitable for perception tasks. Instead, we consider inductive biases toward uniform margins. We show that uniform margins guarantee sole dependence on the perfect stable feature in the linear perception task, and we suggest alternative loss functions, termed margin control (MARG-CTRL), that encourage uniform-margin solutions. MARG-CTRL techniques mitigate shortcut learning on a variety of vision and language tasks, showing that changing inductive biases can remove the need for complicated shortcut-mitigating methods in perception tasks.
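To make the uniform-margin idea concrete, here is a minimal sketch of a margin-controlling loss in PyTorch. It augments the standard log-loss with a penalty that pulls every example's margin toward a common target, removing the incentive to inflate margins that drives default-ERM toward the shortcut. The function name `marg_ctrl_loss` and the hyperparameters `lam` and `target` are illustrative assumptions, not the paper's exact MARG-CTRL formulations.

```python
import torch
import torch.nn.functional as F

def marg_ctrl_loss(logits, labels, lam=0.1, target=1.0):
    """Binary log-loss plus a uniform-margin penalty (illustrative sketch).

    `lam` and `target` are hypothetical hyperparameters; labels are in
    {0, 1}, and the margin of an example is y * f(x) with y in {-1, +1}.
    """
    y = 2.0 * labels.float() - 1.0           # map {0, 1} -> {-1, +1}
    margins = y * logits.squeeze(-1)         # per-example margin y * f(x)
    log_loss = F.softplus(-margins).mean()   # log(1 + exp(-y * f(x)))
    # Penalize deviation from a common target margin, so gradient descent
    # cannot keep growing margins toward the max-margin solution.
    uniformity = ((margins - target) ** 2).mean()
    return log_loss + lam * uniformity
```

The squared penalty is just one simple way to encode the inductive bias; the point of MARG-CTRL is the bias toward uniform margins itself rather than any particular penalty form.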
Author Information
Aahlad Manas Puli (New York University)
Lily Zhang (New York University)
Yoav Wald (New York University)
Rajesh Ranganath (New York University)
More from the Same Authors
- 2021 Spotlight: Offline RL Without Off-Policy Evaluation
  David Brandfonbrener · Will Whitney · Rajesh Ranganath · Joan Bruna
- 2021: Learning Invariant Representations with Missing Data
  Mark Goldstein · Adriel Saporta · Aahlad Puli · Rajesh Ranganath · Andrew Miller
- 2021: Learning to Accelerate MR Screenings
  Raghav Singhal · Mukund Sudarshan · Angela Tong · Daniel Sodickson · Rajesh Ranganath
- 2021: Individual treatment effect estimation in the presence of unobserved confounding based on a fixed relative treatment effect
  Wouter van Amsterdam · Rajesh Ranganath
- 2021: Quantile Filtered Imitation Learning
  David Brandfonbrener · Will Whitney · Rajesh Ranganath · Joan Bruna
- 2022: Malign Overfitting: Interpolation and Invariance are Fundamentally at Odds
  Yoav Wald · Gal Yona · Uri Shalit · Yair Carmon
- 2023 Poster: Causal-structure Driven Augmentations for Text OOD Generalization
  Amir Feder · Yoav Wald · Claudia Shi · Suchi Saria · David Blei
- 2022 Poster: In the Eye of the Beholder: Robust Prediction with Causal User Modeling
  Amir Feder · Guy Horowitz · Yoav Wald · Roi Reichart · Nir Rosenfeld
- 2021 Poster: Inverse-Weighted Survival Games
  Xintian Han · Mark Goldstein · Aahlad Puli · Thomas Wies · Adler Perotte · Rajesh Ranganath
- 2021 Poster: Offline RL Without Off-Policy Evaluation
  David Brandfonbrener · Will Whitney · Rajesh Ranganath · Joan Bruna
- 2021 Poster: On Calibration and Out-of-Domain Generalization
  Yoav Wald · Amir Feder · Daniel Greenfeld · Uri Shalit
- 2020 Poster: Deep Direct Likelihood Knockoffs
  Mukund Sudarshan · Wesley Tansey · Rajesh Ranganath
- 2020 Poster: General Control Functions for Causal Effect Estimation from IVs
  Aahlad Puli · Rajesh Ranganath
- 2020 Poster: X-CAL: Explicit Calibration for Survival Analysis
  Mark Goldstein · Xintian Han · Aahlad Puli · Adler Perotte · Rajesh Ranganath
- 2020 Poster: Causal Estimation with Functional Confounders
  Aahlad Puli · Adler Perotte · Rajesh Ranganath
- 2019 Poster: Globally Optimal Learning for Structured Elliptical Losses
  Yoav Wald · Nofar Noy · Gal Elidan · Ami Wiesel
- 2019 Poster: Energy-Inspired Models: Learning with Sampler-Induced Distributions
  Dieterich Lawson · George Tucker · Bo Dai · Rajesh Ranganath
- 2017 Poster: Robust Conditional Probabilities
  Yoav Wald · Amir Globerson