
Few-Shot Learning Evaluation in Natural Language Understanding
Subhabrata Mukherjee · Xiaodong Liu · Guoqing Zheng · Saghar Hosseini · Hao Cheng · Ge Yang · Christopher Meek · Ahmed Awadallah · Jianfeng Gao
Event URL: https://openreview.net/forum?id=VhIIQBm00VI

Most recent progress in natural language understanding (NLU) has been driven, in part, by benchmarks such as GLUE, SuperGLUE, SQuAD, etc. In fact, many NLU models have now matched or exceeded "human-level" performance on many tasks in these benchmarks. Most of these benchmarks, however, give models access to relatively large amounts of labeled data for training. As such, the models are provided far more data than required by humans to achieve strong performance. That has motivated a line of work that focuses on improving few-shot learning performance of NLU models. However, there is a lack of standardized evaluation benchmarks for few-shot NLU, resulting in different experimental settings in different papers. To help accelerate this line of work, we introduce CLUES, a benchmark for evaluating the few-shot learning capabilities of NLU models. We demonstrate that while recent models reach human performance when they have access to large amounts of labeled data, there is a huge gap in performance in the few-shot setting for most tasks. We also demonstrate differences between alternative model families and adaptation techniques in the few-shot setting. Finally, we discuss several principles and choices in designing the experimental settings for evaluating the true few-shot learning performance and suggest a unified standardized approach to few-shot learning evaluation. We aim to encourage research on NLU models that can generalize to new tasks with a small number of examples. Code and data for CLUES are available at https://github.com/microsoft/CLUES.

Author Information

Subhabrata Mukherjee (Microsoft Research)

I am a senior scientist at Microsoft Research (MSR) working at the intersection of natural language understanding, deep learning and transfer learning. My current research is focused on making AI accessible to all with two major themes: (1) Scaling deep and large-scale natural language understanding models to scenarios with limited computational resources leveraging techniques like self-supervised, weakly supervised and curriculum learning, data augmentation, knowledge distillation, etc. (2) Building trustworthy AI for mitigating misinformation and bias to provide fair and equitable information access to all. Prior to joining MSR, I was leading the information extraction efforts to build the Amazon Product Knowledge Graph, an authoritative knowledge graph for all products in the world. I graduated summa cum laude from the Max Planck Institute for Informatics, Germany with a PhD in 2017. I was awarded the 2018 SIGKDD Doctoral Dissertation Runner-up Award for my thesis on credibility analysis and misinformation.

Xiaodong Liu (Microsoft)
Guoqing Zheng (Carnegie Mellon University)
Saghar Hosseini (Microsoft Research)
Hao Cheng (Microsoft)
Ge Yang (Microsoft Research)
Christopher Meek (Microsoft Research)
Ahmed Awadallah (Microsoft Research)

I am passionate about using AI and Machine Learning to create intelligent user experiences that connect people to information. I lead a research and incubation team in Microsoft Research Technologies. Our work at the Language and Information Technologies team is focused on creating language understanding and user modeling technologies to enable intelligent experiences in multiple products. Our work has been shipped in several products such as Bing, Cortana, Office 365, and Dynamics 365. I have hands-on experience building and shipping state-of-the-art ML/AI algorithms. I also have experience building and managing world-class teams of scientists and engineers. My research interests are at the intersection of machine learning, language understanding, and information retrieval. A key part of my work involves using Machine Learning to model large-scale text and user behavior data with applications to intelligent assistants, search, user modeling, quality evaluation, recommendation and personalization. I received my Ph.D. from the Department of Computer Science and Engineering at the University of Michigan, Ann Arbor. I have invented, published, and patented new approaches in language understanding, information retrieval and machine learning. I have published 60+ peer-reviewed papers in these areas and I am an inventor on 20+ (granted and pending) patents.

Jianfeng Gao (Microsoft Research, Redmond, WA)
