Timezone: »
Transformation invariances are present in many real-world problems. For example, image classification is usually invariant to rotation and color transformation: a rotated car in a different color is still identified as a car. Data augmentation, which adds the transformed data into the training set and trains a model on the augmented data, is one commonly used technique to build these invariances into the learning process. However, it is unclear how data augmentation performs theoretically and what the optimal algorithm is in presence of transformation invariances. In this paper, we study PAC learnability under transformation invariances in three settings according to different levels of realizability: (i) A hypothesis fits the augmented data; (ii) A hypothesis fits only the original data and the transformed data lying in the support of the data distribution; (iii) Agnostic case. One interesting observation is that distinguishing between the original data and the transformed data is necessary to achieve optimal accuracy in setting (ii) and (iii), which implies that any algorithm not differentiating between the original and transformed data (including data augmentation) is not optimal. Furthermore, this type of algorithms can even ``harm'' the accuracy. In setting (i), although it is unnecessary to distinguish between the two data sets, data augmentation still does not perform optimally. Due to such a difference, we propose two combinatorial measures characterizing the optimal sample complexity in setting (i) and (ii)(iii) and provide the optimal algorithms.
Author Information
Han Shao (Toyota Technological Institute at Chicago)
Omar Montasser (Toyota Technological Institute at Chicago)
Avrim Blum (Toyota Technological Institute at Chicago)
More from the Same Authors
-
2021 Spotlight: Excess Capacity and Backdoor Poisoning »
Naren Manoj · Avrim Blum -
2021 : One for One, or All for All: Equilibria and Optimality of Collaboration in Federated Learning »
Richard Phillips · Han Shao · Avrim Blum · Nika Haghtalab -
2021 : On classification of strategic agents who can both game and improve »
Saba Ahmadi · Hedyeh Beyhaghi · Avrim Blum · Keziah Naggita -
2021 : The Strategic Perceptron »
Saba Ahmadi · Hedyeh Beyhaghi · Avrim Blum · Keziah Naggita -
2021 : One for One, or All for All: Equilibria and Optimality of Collaboration in Federated Learning »
Richard Phillips · Han Shao · Avrim Blum · Nika Haghtalab -
2021 : On classification of strategic agents who can both game and improve »
Saba Ahmadi · Hedyeh Beyhaghi · Avrim Blum · Keziah Naggita -
2021 : The Strategic Perceptron »
Saba Ahmadi · Hedyeh Beyhaghi · Avrim Blum · Keziah Naggita -
2022 : Certifiable Robustness Against Patch Attacks Using an ERM Oracle »
Kevin Stangl · Avrim Blum · Omar Montasser · Saba Ahmadi -
2023 Poster: Provably Efficient Personalized Multi-Objective Decision Making via Comparative Feedback »
Han Shao · Lee Cohen · Avrim Blum · Yishay Mansour · Aadirupa Saha · Matthew Walter -
2023 Poster: Strategic Classification under Unknown Personalized Manipulation »
Avrim Blum · Omar Montasser · Han Shao -
2022 Panel: Panel 2C-2: Agreement-on-the-line: Predicting the… & A Theory of… »
Han Shao · Christina Baek -
2022 Panel: Panel 1A-4: Hardness of Noise-Free… & Adversarially Robust Learning:… »
Aravind Gollakota · Omar Montasser -
2022 : Panel »
Meena Jagadeesan · Avrim Blum · Jon Kleinberg · Celestine Mendler-Dünner · Jennifer Wortman Vaughan · Chara Podimata -
2022 Poster: Boosting Barely Robust Learners: A New Perspective on Adversarial Robustness »
Avrim Blum · Omar Montasser · Greg Shakhnarovich · Hongyang Zhang -
2022 Poster: Adversarially Robust Learning: A Generic Minimax Optimal Learner and Characterization »
Omar Montasser · Steve Hanneke · Nati Srebro -
2021 Poster: Excess Capacity and Backdoor Poisoning »
Naren Manoj · Avrim Blum -
2020 Workshop: Workshop on Dataset Curation and Security »
Nathalie Baracaldo · Yonatan Bisk · Avrim Blum · Michael Curry · John Dickerson · Micah Goldblum · Tom Goldstein · Bo Li · Avi Schwarzschild -
2020 Poster: Reducing Adversarially Robust Learning to Non-Robust PAC Learning »
Omar Montasser · Steve Hanneke · Nati Srebro -
2020 Poster: Beyond Perturbations: Learning Guarantees with Arbitrary Adversarial Test Examples »
Shafi Goldwasser · Adam Tauman Kalai · Yael Kalai · Omar Montasser -
2020 Spotlight: Beyond Perturbations: Learning Guarantees with Arbitrary Adversarial Test Examples »
Shafi Goldwasser · Adam Tauman Kalai · Yael Kalai · Omar Montasser -
2020 Session: Orals & Spotlights Track 24: Learning Theory »
Avrim Blum · Steve Hanneke -
2020 Poster: Online Learning with Primary and Secondary Losses »
Avrim Blum · Han Shao -
2018 Poster: On preserving non-discrimination when combining expert advice »
Avrim Blum · Suriya Gunasekar · Thodoris Lykouris · Nati Srebro -
2017 Poster: Collaborative PAC Learning »
Avrim Blum · Nika Haghtalab · Ariel Procaccia · Mingda Qiao -
2014 Poster: Learning Optimal Commitment to Overcome Insecurity »
Avrim Blum · Nika Haghtalab · Ariel Procaccia -
2014 Poster: Learning Mixtures of Ranking Models »
Pranjal Awasthi · Avrim Blum · Or Sheffet · Aravindan Vijayaraghavan -
2014 Poster: Active Learning and Best-Response Dynamics »
Maria-Florina F Balcan · Christopher Berlind · Avrim Blum · Emma Cohen · Kaushik Patnaik · Le Song -
2014 Spotlight: Learning Mixtures of Ranking Models »
Pranjal Awasthi · Avrim Blum · Or Sheffet · Aravindan Vijayaraghavan -
2010 Spotlight: Trading off Mistakes and Don't-Know Predictions »
Amin Sayedi · Avrim Blum · Morteza Zadimoghaddam -
2010 Poster: Trading off Mistakes and Don't-Know Predictions »
Amin Sayedi · Morteza Zadimoghaddam · Avrim Blum -
2009 Workshop: Clustering: Science or art? Towards principled approaches »
Margareta Ackerman · Shai Ben-David · Avrim Blum · Isabelle Guyon · Ulrike von Luxburg · Robert Williamson · Reza Zadeh -
2009 Poster: Tracking Dynamic Sources of Malicious Activity at Internet Scale »
Shobha Venkataraman · Avrim Blum · Dawn Song · Subhabrata Sen · Oliver Spatscheck -
2009 Spotlight: Tracking Dynamic Sources of Malicious Activity at Internet Scale »
Shobha Venkataraman · Avrim Blum · Dawn Song · Subhabrata Sen · Oliver Spatscheck -
2008 Workshop: New Challanges in Theoretical Machine Learning: Data Dependent Concept Spaces »
Maria-Florina F Balcan · Shai Ben-David · Avrim Blum · Kristiaan Pelckmans · John Shawe-Taylor