Towards Last-Layer Retraining for Group Robustness with Fewer Annotations
Tyler LaBonte · Vidya Muthukumar · Abhishek Kumar

Tue Dec 12 08:45 AM -- 10:45 AM (PST) @ Great Hall & Hall B1+B2 #726

Empirical risk minimization (ERM) of neural networks is prone to over-reliance on spurious correlations and poor generalization on minority groups. The recent deep feature reweighting (DFR) technique achieves state-of-the-art group robustness via simple last-layer retraining, but it requires held-out group and class annotations to construct a group-balanced reweighting dataset. In this work, we examine this impractical requirement and find that last-layer retraining can be surprisingly effective with no group annotations (other than for model selection) and only a handful of class annotations. We first show that last-layer retraining can greatly improve worst-group accuracy even when the reweighting dataset has only a small proportion of worst-group data. This implies a "free lunch" where holding out a subset of training data to retrain the last layer can substantially outperform ERM on the entire dataset with no additional data, annotations, or computation for training. To further improve group robustness, we introduce a lightweight method called selective last-layer finetuning (SELF), which constructs the reweighting dataset using misclassifications or disagreements. Our experiments present the first evidence that model disagreement upsamples worst-group data, enabling SELF to nearly match DFR on four well-established benchmarks across vision and language tasks with no group annotations and less than 3% of the held-out class annotations.
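The core ideas in the abstract — retraining only the last layer on a small reweighting dataset, and (in SELF) building that dataset from points where two model checkpoints disagree — can be illustrated with a minimal sketch. Everything below is hypothetical: the synthetic "features" stand in for a frozen pretrained backbone, the two checkpoint predictions are simulated, and the last layer is a logistic regression fit by gradient descent. It is not the paper's implementation, only an illustration of the selection-then-retrain pattern.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical frozen features from a pretrained backbone, with binary labels.
n, d = 200, 5
features = rng.normal(size=(n, d))
labels = (features[:, 0] > 0).astype(int)

# Simulated predictions from an early and a final checkpoint; disagreement
# points (the idea behind SELF's reweighting set) tend to lie near the
# decision boundary, where minority-group examples concentrate.
preds_early = (features[:, 0] + 0.5 * rng.normal(size=n) > 0).astype(int)
preds_final = (features[:, 0] + 0.1 * rng.normal(size=n) > 0).astype(int)
idx = np.flatnonzero(preds_early != preds_final)
if idx.size == 0:  # fall back to a small random sample if no disagreements
    idx = rng.choice(n, size=10, replace=False)
X, y = features[idx], labels[idx]

# Retrain only the last layer on the selected subset: logistic regression
# by plain gradient descent (the backbone/features stay frozen).
w, b, lr = np.zeros(d), 0.0, 0.1
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    w -= lr * (X.T @ (p - y)) / len(y)
    b -= lr * np.mean(p - y)

# Evaluate the retrained last layer on the full dataset.
acc = np.mean(((features @ w + b) > 0).astype(int) == labels)
```

The key design point mirrored here is that only `w` and `b` (the last layer) are updated, so selection plus retraining costs far less than retraining the whole network.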

Author Information

Tyler LaBonte (Georgia Institute of Technology)

I am a second-year PhD student in Machine Learning at the Georgia Institute of Technology advised by Jake Abernethy and Vidya Muthukumar. I completed my BS in Applied and Computational Mathematics at the University of Southern California, where I was a Trustee Scholar and Viterbi Fellow. My work is generously supported by the DoD NDSEG Fellowship. I am interested in advancing our scientific understanding of deep learning using both theory and experimentation. My current focus is characterizing the generalization phenomena of overparameterized neural networks and developing provable algorithms for efficient, accurate, and robust learning. I also enjoy applying mathematically-justified techniques to large-scale computer vision problems. The ultimate goal of my research is to enable the safe and trusted deployment of deep learning systems in high-consequence applications such as medicine, defense, and energy.

Vidya Muthukumar (Georgia Institute of Technology)
Abhishek Kumar (Google DeepMind)
