Active Learning Over Multiple Domains in Natural Language Tasks
Shayne Longpre · Julia Reisler · Edward Huang · Yi Lu · Andrew Frank · Nikhil Ramesh · Chris DuBois
Event URL: https://openreview.net/forum?id=3623QadW7Gj

Studies of active learning traditionally assume the target and source data stem from a single domain. However, in realistic applications, practitioners often require active learning with multiple sources of out-of-distribution data, where it is unclear a priori which data sources will help or hurt the target domain. We survey a wide variety of techniques in active learning (AL), domain shift detection (DS), and multi-domain sampling to examine this challenging setting for question answering and sentiment analysis. Among 18 acquisition functions from 4 families of methods, we find H-Divergence methods, and particularly our proposed variant DAL-E, yield effective results, averaging 2-3% improvements over the random baseline. Our findings yield the first comprehensive analysis of both existing and novel methods for practitioners faced with multi-domain active learning for natural language tasks.

Author Information

Shayne Longpre (Massachusetts Institute of Technology)
Julia Reisler (Apple)

Julia is a Machine Learning Engineer in Apple's AI/ML division. She graduated from Caltech with a BS in computer science. Her research interests include dataset shift and how it relates to active learning and fairness/bias.

Edward Huang (Apple)
Yi Lu (Forethought)
Andrew Frank (Apple)
Nikhil Ramesh
Chris DuBois (Apple)

More from the Same Authors

  • 2023 Workshop: Instruction Tuning and Instruction Following
    Qinyuan Ye · Yizhong Wang · Shayne Longpre · Yao Fu · Daniel Khashabi
  • 2022 Poster: The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset
    Hugo Laurençon · Lucile Saulnier · Thomas Wang · Christopher Akiki · Albert Villanova del Moral · Teven Le Scao · Leandro Von Werra · Chenghao Mou · Eduardo González Ponferrada · Huu Nguyen · Jörg Frohberg · Mario Šaško · Quentin Lhoest · Angelina McMillan-Major · Gerard Dupont · Stella Biderman · Anna Rogers · Loubna Ben allal · Francesco De Toni · Giada Pistilli · Olivier Nguyen · Somaieh Nikpoor · Maraim Masoud · Pierre Colombo · Javier de la Rosa · Paulo Villegas · Tristan Thrush · Shayne Longpre · Sebastian Nagel · Leon Weber · Manuel Muñoz · Jian Zhu · Daniel Van Strien · Zaid Alyafeai · Khalid Almubarak · Minh Chien Vu · Itziar Gonzalez-Dios · Aitor Soroa · Kyle Lo · Manan Dey · Pedro Ortiz Suarez · Aaron Gokaslan · Shamik Bose · David Adelani · Long Phan · Hieu Tran · Ian Yu · Suhas Pai · Jenny Chim · Violette Lepercq · Suzana Ilic · Margaret Mitchell · Sasha Alexandra Luccioni · Yacine Jernite