
Conditional Adapters: Parameter-efficient Transfer Learning with Fast Inference
Tao Lei · Junwen Bai · Siddhartha Brahma · Joshua Ainslie · Kenton Lee · Yanqi Zhou · Nan Du · Vincent Zhao · Yuexin Wu · Bo Li · Yu Zhang · Ming-Wei Chang

Wed Dec 13 08:45 AM -- 10:45 AM (PST) @ Great Hall & Hall B1+B2 #319

We propose Conditional Adapter (CoDA), a parameter-efficient transfer learning method that also improves inference efficiency. CoDA generalizes beyond standard adapter approaches to enable a new way of balancing speed and accuracy using conditional computation. Starting with an existing dense pretrained model, CoDA adds sparse activation together with a small number of new parameters and a light-weight training phase. Our experiments demonstrate that the CoDA approach provides an unexpectedly efficient way to transfer knowledge. Across a variety of language, vision, and speech tasks, CoDA achieves a 2x to 8x inference speed-up compared to the state-of-the-art Adapter approaches with moderate to no accuracy loss and the same parameter efficiency.
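The conditional-computation idea in the abstract can be illustrated with a minimal NumPy sketch: a learned router scores each token, only the top-k tokens pass through the expensive frozen pretrained layer, and a cheap low-rank adapter is applied to every token. All names, shapes, and the simple top-k router below are illustrative assumptions, not the paper's actual architecture or API.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, seq_len, k = 16, 4, 8, 2          # hidden size, adapter rank, tokens, routed tokens

x = rng.standard_normal((seq_len, d))    # token representations
router_w = rng.standard_normal(d)        # router: one score per token
down = rng.standard_normal((d, r))       # low-rank adapter (down-projection)
up = rng.standard_normal((r, d))         # low-rank adapter (up-projection)
W_heavy = rng.standard_normal((d, d))    # stand-in for a frozen pretrained layer

def coda_forward(x):
    scores = x @ router_w                # score every token
    topk = np.argsort(scores)[-k:]       # select the k highest-scoring tokens
    out = x + (x @ down) @ up            # cheap adapter path, applied to all tokens
    out[topk] += x[topk] @ W_heavy       # expensive path, selected tokens only
    return out

y = coda_forward(x)
```

Because the heavy layer runs on only k of the seq_len tokens, its cost scales with k rather than the full sequence length, which is the source of the speed/accuracy trade-off described above.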

Author Information

Tao Lei (Google)
Junwen Bai (Google)

I'm a Research Scientist at Google. I received my PhD degree from the Department of Computer Science at Cornell University in 2022, advised by Prof. Carla P. Gomes. I received my Bachelor's degree in 2017 from Shanghai Jiao Tong University, where I spent four years in the ACM Honored Class. I am interested in the general areas of machine learning and language technology, with a research focus on sequence representation learning and probabilistic modeling, often in low-supervision scenarios. I have developed scalable and general machine learning methods for real-world problems including automatic speech recognition, climate change, and scientific discovery.

Siddhartha Brahma (IBM Research AI)

I am a research staff member at IBM Research AI. I am interested in all aspects of deep learning and its application to AI. My present work is focused on improving the state of the art in text classification, relation embedding, entity resolution, and semi-supervised rule learning using neural models. I am also fascinated by generative models and reinforcement learning. I have a PhD from EPFL, an MA from Princeton, and a BTech from IIT Kharagpur, and have worked at Google. I won the best paper award at ACM MobiHoc 2013 and the President of India Gold Medal from IIT Kharagpur. In the past, I have worked on algorithms for wireless networks, information retrieval, and theoretical computer science.

Joshua Ainslie (Google)
Kenton Lee (Google Research)
Yanqi Zhou (Google Deepmind)
Nan Du (Apple/AIML)
Vincent Zhao (Augment Computing)
Yuexin Wu (Google)
Bo Li (Google)
Yu Zhang (Google)
Ming-Wei Chang (Google DeepMind)
