Adversarial Auto-Augment with Label Preservation: A Representation Learning Principle Guided Approach
Kaiwen Yang · Yanchao Sun · Jiahao Su · Fengxiang He · Xinmei Tian · Furong Huang · Tianyi Zhou · Dacheng Tao

Tue Nov 29 09:00 AM -- 11:00 AM (PST) @ Hall J #101

Data augmentation is a critical contributing factor to the success of deep learning, but it relies heavily on prior domain knowledge, which is not always available. Recent works on automatic data augmentation learn a policy to form a sequence of augmentation operations, but these operations are still pre-defined and restricted to a limited set of options. In this paper, we show that the objective of a prior-free, autonomous data augmentation can be derived from a representation learning principle that aims to preserve the minimum sufficient information of the labels. Given an example, the objective aims to create a distant "hard positive example" as the augmentation while still preserving the original label. We then propose a practical surrogate for the objective that can be optimized efficiently and integrated seamlessly into existing methods for a broad class of machine learning tasks, e.g., supervised, semi-supervised, and noisy-label learning. Unlike previous works, our method does not require training an extra generative model; instead, it leverages the intermediate-layer representations of the end-task model to generate data augmentations. In experiments, our method consistently brings non-trivial improvements in both efficiency and final performance on the three aforementioned learning tasks, whether or not it is combined with pre-defined augmentations, including on medical images, where domain knowledge is unavailable and existing augmentation techniques perform poorly. Code will be released publicly.
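To make the "distant hard positive" idea concrete, the following is a minimal NumPy sketch and not the authors' surrogate objective or released code. It assumes a hypothetical two-layer toy model (`W1`, `W2`, and the function names are illustrative), pushes the intermediate representation of an augmented input away from that of the original by gradient ascent, and accepts a step only if the model still predicts the original label.

```python
import numpy as np

def features(W1, x):
    # Intermediate-layer representation of the toy model (assumed: tanh layer).
    return np.tanh(W1 @ x)

def predict(W1, W2, x):
    # Class predicted by the toy end-task model.
    return int(np.argmax(W2 @ features(W1, x)))

def hard_positive(x, y, W1, W2, steps=100, lr=0.1, seed=0):
    """Illustrative only: move x so its features drift away from the
    original features, keeping the model's prediction equal to y."""
    rng = np.random.default_rng(seed)
    h0 = features(W1, x)
    # Small random start, since the distance gradient is zero exactly at x.
    x_aug = x + 1e-3 * rng.standard_normal(x.shape)
    if predict(W1, W2, x_aug) != y:
        return x.copy()
    for _ in range(steps):
        h = features(W1, x_aug)
        # Gradient of ||h - h0||^2 with respect to x_aug, through tanh.
        grad = W1.T @ (2.0 * (h - h0) * (1.0 - h ** 2))
        candidate = x_aug + lr * grad
        if predict(W1, W2, candidate) == y:
            x_aug = candidate   # label preserved: accept the step
        else:
            lr *= 0.5           # label would flip: shrink the step instead
    return x_aug

# Toy usage with hand-picked weights (assumptions, not from the paper).
W1 = np.eye(2)
W2 = np.eye(2)
x = np.array([1.0, 0.0])
y = predict(W1, W2, x)
x_aug = hard_positive(x, y, W1, W2)
```

The hard accept/reject check stands in for label preservation here; the paper instead describes a surrogate that can be optimized efficiently, which this sketch does not attempt to reproduce.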

Author Information

Kaiwen Yang (University of Science and Technology of China)
Yanchao Sun (University of Maryland, College Park)
Jiahao Su (University of Maryland)
Fengxiang He (JD.com Inc)

Fengxiang He received his BSc in statistics from the University of Science and Technology of China and his MPhil and PhD in computer science from the University of Sydney. He is currently an algorithm scientist at JD Explore Academy, JD.com Inc., where he leads the trustworthy AI team. His research interests are in the theory and practice of trustworthy AI, including deep learning theory, privacy-preserving ML, decentralized learning, and their applications. He has published in prominent journals and conferences, including TNNLS, TMM, TCSVT, ICML, NeurIPS, ICLR, CVPR, and ICCV. He serves as an area chair for AISTATS, BMVC, and ACML, and is the lead author of several standards on trustworthy AI.

Xinmei Tian (University of Science and Technology of China)
Furong Huang (University of Maryland)
Tianyi Zhou (University of Maryland, College Park)

Tianyi Zhou (https://tianyizhou.github.io) is a tenure-track assistant professor of computer science at the University of Maryland, College Park. He received his Ph.D. from the School of Computer Science & Engineering at the University of Washington, Seattle. His research interests are in machine learning, optimization, and natural language processing (NLP). His recent work studies curriculum learning that combines high-level human learning strategies with model training dynamics to create a hybrid intelligence. Applications include semi/self-supervised learning, robust learning, reinforcement learning, meta-learning, and ensemble learning. He has published more than 80 papers and is a recipient of the Best Student Paper Award at ICDM 2013 and the 2020 IEEE Computer Society TCSC Most Influential Paper Award.

Dacheng Tao (University of Technology, Sydney)

More from the Same Authors