Recent work shows that deep neural networks (DNNs) first learn clean samples and then memorize noisy samples. Early stopping can therefore be used to improve performance when training with noisy labels. It was also shown recently that the training trajectory of DNNs can be approximated in a low-dimensional subspace using PCA, and that DNNs can then be trained in this subspace with similar or better generalization. Prior work combined these two observations to further boost the generalization performance of vanilla early stopping on noisy-label datasets. In this paper, we probe this finding further under different real-world and synthetic label noise. First, we show that the prior method is sensitive to the early stopping hyper-parameter. Second, we investigate the effectiveness of PCA for approximating the optimization trajectory under noisy label information. We propose to estimate the low-rank subspace through robust and structured variants of PCA, namely Robust PCA and Sparse PCA. We find that the subspace estimated through these variants can be less sensitive to early stopping and can outperform PCA, achieving lower test error when trained on noisy labels.
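The subspace idea in the abstract can be illustrated with a minimal NumPy sketch. It assumes the trajectory is available as a matrix of flattened weight snapshots (one row per checkpoint), estimates the subspace with plain PCA via an SVD, and restricts a gradient step to that subspace. The function names `estimate_subspace` and `projected_step` are hypothetical and chosen only for illustration; this is not the authors' implementation, and the Robust PCA and Sparse PCA variants studied in the paper are not shown.

```python
import numpy as np


def estimate_subspace(snapshots, k):
    """Return k orthonormal directions (as rows) from PCA of a weight trajectory.

    snapshots: array of shape (T, D), one flattened weight vector per checkpoint
               (assumed layout, not specified by the source).
    """
    centered = snapshots - snapshots.mean(axis=0, keepdims=True)
    # Rows of vt are the principal directions of the centered snapshots.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:k]


def projected_step(weights, grad, basis, lr):
    """Take one gradient step restricted to the span of the rows of `basis`."""
    grad_in_subspace = basis.T @ (basis @ grad)
    return weights - lr * grad_in_subspace
```

In this sketch the snapshots would be collected up to the early stopping point, so the choice of that point determines which directions the subspace captures.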
Vasu Singla (University of Maryland)
I am a third-year graduate student at the University of Maryland, interested in adversarial robustness.
Shuchin Aeron (Tufts University)
Toshiaki Koike-Akino (MERL)
Matthew Brand (Mitsubishi Electric Research Labs)
Ye Wang (Mitsubishi Electric Research Labs)