What Can We Learn from Unlearnable Datasets?
Pedro Sandoval-Segura · Vasu Singla · Jonas Geiping · Micah Goldblum · Tom Goldstein

Thu Dec 14 03:00 PM -- 05:00 PM (PST) @ Great Hall & Hall B1+B2 #1610
Event URL: https://github.com/psandovalsegura/learn-from-unlearnable

In an era of widespread web scraping, unlearnable dataset methods have the potential to protect data privacy by preventing deep neural networks from generalizing. Beyond the practical limitations that make their use unlikely, we report several findings that call into question their ability to safeguard data. First, it is widely believed that neural networks trained on unlearnable datasets only learn shortcuts, simpler rules that are not useful for generalization. In contrast, we find that networks can in fact learn useful features that can be reweighted for high test performance, suggesting that image protection is not assured. Unlearnable datasets are also believed to induce shortcut learning through linear separability of the added perturbations. We provide a counterexample, demonstrating that linear separability of perturbations is not a necessary condition. To emphasize why linearly separable perturbations should not be relied upon, we propose an orthogonal projection attack which allows learning from unlearnable datasets published in ICML 2021 and ICLR 2023. Our proposed attack is significantly less complex than recently proposed techniques.
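
The abstract does not spell out how the orthogonal projection attack works, so the following is only a minimal sketch of the general idea of projecting a set of directions out of the data, not the authors' implementation. The function name `project_out`, the toy data, and the choice to use the true class-wise patterns as the directions to remove are all assumptions made for illustration; a real attacker would not know these patterns and would have to estimate them, for example from the weights of a linear model fit to the perturbed images.

```python
import numpy as np


def project_out(X, W):
    """Remove from each row of X its component in the span of the rows of W.

    X: (n, d) array of flattened images.
    W: (k, d) array of directions to remove (hypothetical input; in practice
       these would have to be estimated rather than given).
    """
    # Reduced QR gives Q of shape (d, k) with orthonormal columns whose span
    # contains every row of W.
    Q, _ = np.linalg.qr(W.T)
    # Subtract the projection onto span(Q): X - (X Q) Q^T.
    return X - (X @ Q) @ Q.T


# Toy data: random "images" plus one fixed perturbation per class.
rng = np.random.default_rng(0)
n, d, k = 300, 64, 3
labels = rng.integers(0, k, size=n)
clean = rng.normal(size=(n, d))
class_patterns = rng.normal(size=(k, d))  # assumed class-wise perturbations
poisoned = clean + class_patterns[labels]

# Illustration only: project out the true patterns, which a real attacker
# would not know and would need to estimate.
recovered = project_out(poisoned, class_patterns)
```

In this toy setting, `recovered` has no component left along any of the class-wise patterns, and it matches what you get by applying the same projection to the clean data. The price is that the clean data also loses whatever signal it had along those `k` directions.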

Author Information

Pedro Sandoval-Segura (University of Maryland)

I am currently a PhD student at the University of Maryland at College Park, where I am advised by Prof. David Jacobs and Prof. Tom Goldstein. I am broadly interested in computer vision and deep learning research. Lately, my research focuses on adversarial examples, adversarial training, and data poisoning.

Vasu Singla (University of Maryland)

I am a 5th-year grad student at the University of Maryland, interested in the security and privacy of ML systems.

Jonas Geiping (ELLIS Institute & MPI Intelligent Systems, Tübingen AI Center)

Jonas is a postdoctoral researcher at UMD. His background is in Mathematics, more specifically in mathematical optimization and its applications to deep learning. His current focus is on designing more secure and private ML systems, especially for federated learning, and on understanding fundamental phenomena behind generalization.

Micah Goldblum (New York University)

Tom Goldstein (University of Maryland)
