While paraphrasing is a promising approach to data augmentation for classification tasks, its effect on named entity recognition (NER) has not been investigated systematically, owing to the difficulty of preserving span-level labels. In this paper, we use simple strategies to annotate entity spans in generated paraphrases, and we compare established and novel paraphrasing methods in NLP — back translation, specialized encoder-decoder models such as Pegasus, and GPT-3 variants — for their effectiveness in improving downstream NER performance across different levels of gold annotations and paraphrasing strength on five datasets. We also analyze the quality of the generated paraphrases in terms of entity preservation and language quality. We find that the choice of paraphraser greatly impacts NER performance: one of the larger GPT-3 variants is exceedingly capable of generating high-quality paraphrases, improving performance in most cases and not hurting it in the others, while other paraphrasers show more mixed results. We also find that inline auto-annotations generated by the larger GPT-3 variant are strictly better than heuristic-based annotations. For most datasets, the benefits of paraphrasing diminish as the amount of gold annotation increases. Finally, although the larger GPT-3 variants score well on both entity preservation and human evaluation of language quality, these two metrics do not necessarily correlate with downstream performance for other paraphrasers.
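One of the simple annotation strategies the abstract alludes to can be illustrated as follows: after a sentence is paraphrased, each gold entity's surface form is re-located in the paraphrase by string matching and its label is re-attached. This is a minimal, hypothetical sketch of that heuristic idea — the function name, span format, and matching rule are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch of heuristic span re-annotation after paraphrasing.
# Assumption: an entity's label transfers if its surface form reappears
# verbatim in the paraphrase; entities that are reworded are dropped.

def annotate_paraphrase(paraphrase, gold_entities):
    """Return (start, end, label) character spans for each gold entity
    string found verbatim in the paraphrase."""
    spans = []
    for surface, label in gold_entities:
        start = paraphrase.find(surface)  # first verbatim occurrence, or -1
        if start != -1:
            spans.append((start, start + len(surface), label))
    return spans

# Example: "Bob" was paraphrased away, so its annotation is lost —
# the kind of span-preservation failure the paper's analysis measures.
spans = annotate_paraphrase(
    "Yesterday Alice flew to Paris.",
    [("Alice", "PER"), ("Paris", "LOC"), ("Bob", "PER")],
)
```

Such string matching silently drops reworded entities and can mislabel coincidental substring matches, which is one motivation the abstract gives for preferring inline auto-annotations produced by the paraphraser itself.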
Saket Sharma (JPMorgan Chase & Co.)
Aviral Joshi (Carnegie Mellon University)
Namrata Mukhija (New York University)
I'm a second-year M.S. in Computer Science student at [New York University, Courant School of Mathematical Sciences](https://www.courant.nyu.edu/). My primary research area is Natural Language Processing, focused on methods for building and studying fair language models. At NeurIPS, I'm presenting my work on assessing the effect of paraphrasing as a data augmentation tool for improving downstream Named Entity Recognition in low- versus high-resource scenarios (done at the [J.P. Morgan Machine Learning Center of Excellence, New York](https://www.jpmorgan.com/technology/applied-ai-and-ml) as an AI & ML Research intern). At NYU, I work with Dr. He He and Dr. Chen Zhao on multi-modal reasoning. I previously interned at [Microsoft Research, India](https://www.microsoft.com/en-us/research/lab/microsoft-research-india/) under [Dr. Kalika Bali](https://www.microsoft.com/en-us/research/people/kalikab/) and [Dr. Monojit Choudhury](https://www.microsoft.com/en-us/research/people/monojitc/), developing a [framework](https://arxiv.org/abs/2110.07444) for prioritizing research for low-resource language communities, understanding the different [dilemmas](https://dl.acm.org/doi/10.1145/3530190.3534792) faced by technologists, their origin and complexity, and involving low-resource language communities in the development of language technologies. The work was published at [ACM COMPASS](https://dl.acm.org/doi/abs/10.1145/3530190.3534792). Before that, I was a Software Engineer 2 at [Microsoft](https://www.microsoft.com/), where I worked on products including [PowerPoint](https://www.microsoft.com/en-in/microsoft-365/powerpoint), [Fluid Framework](https://fluidframework.com/), and [Unified Service Desk](https://docs.microsoft.com/en-us/dynamics365/unified-service-desk/admin/overview-unified-service-desk?view=dynamics-usd-4.1).
Yiyun Zhao (JPMorgan Chase & Co.)
Hanoz Bhathena (JPMorgan Chase & Co.)
Prateek Singh (JPMorgan Chase & Co.)
Sashank Santhanam (Apple)
Pritam Biswas (Columbia University)