We introduce MARGE, a pre-trained sequence-to-sequence model learned with an unsupervised multi-lingual multi-document paraphrasing objective. MARGE provides an alternative to the dominant masked language modeling paradigm, where we self-supervise the reconstruction of target text by retrieving a set of related texts (in many languages) and conditioning on them to maximize the likelihood of generating the original. We show it is possible to jointly learn to do retrieval and reconstruction, given only a random initialization. The objective noisily captures aspects of paraphrase, translation, multi-document summarization, and information retrieval, allowing for strong zero-shot performance on several tasks. For example, with no additional task-specific training we achieve BLEU scores of up to 35.8 for document translation. We further show that fine-tuning gives strong performance on a range of discriminative and generative tasks in many languages, making MARGE the most generally applicable pre-training method to date.
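The retrieval step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not say how relatedness is scored, so cosine similarity over fixed toy vectors, the softmax weighting, and the names `cosine` and `retrieve` are all assumptions made for illustration. In the model itself, retrieval is learned jointly with reconstruction rather than computed from fixed vectors.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def retrieve(target_vec, candidate_vecs, k):
    """Return (index, weight) pairs for the k candidates most similar to the target.

    Hypothetical helper: the scoring function is an assumption, not taken
    from the abstract.
    """
    scored = sorted(
        ((cosine(target_vec, vec), idx) for idx, vec in enumerate(candidate_vecs)),
        reverse=True,
    )[:k]
    # Softmax over the retained similarity scores gives normalized weights.
    exps = [math.exp(score) for score, _ in scored]
    total = sum(exps)
    return [(idx, e / total) for (_, idx), e in zip(scored, exps)]


# Toy document embeddings, made up for illustration.
target = [1.0, 0.0]
candidates = [[0.9, 0.1], [0.0, 1.0], [1.0, 0.0]]
evidence = retrieve(target, candidates, k=2)
```

Here `evidence` holds the indices of the two most similar candidates with weights that sum to one. A reconstruction model would then condition on those documents when scoring the target text.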
Author Information
Mike Lewis (Facebook AI Research)
Marjan Ghazvininejad (Facebook AI Research)
Gargi Ghosh (Facebook)
Armen Aghajanyan (Facebook)
Sida Wang (Facebook AI Research)
Luke Zettlemoyer (University of Washington and Facebook)
More from the Same Authors
- 2021: Panel Discussion
  Pascal Poupart · Ali Ghodsi · Luke Zettlemoyer · Sameer Singh · Kevin Duh · Yejin Choi · Lu Hou
- 2021: Toward Efficient Training of Large Language Models with Balanced Conditional Compute
  Luke Zettlemoyer
- 2021 Poster: Luna: Linear Unified Nested Attention
  Xuezhe Ma · Xiang Kong · Sinong Wang · Chunting Zhou · Jonathan May · Hao Ma · Luke Zettlemoyer
- 2021 Poster: SILG: The Multi-domain Symbolic Interactive Language Grounding Benchmark
  Victor Zhong · Austin W. Hanjie · Sida Wang · Karthik Narasimhan · Luke Zettlemoyer
- 2020: Invited talk - De-noising Sequence-to-Sequence Pre-training
  Luke Zettlemoyer
- 2020 Poster: Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
  Patrick Lewis · Ethan Perez · Aleksandra Piktus · Fabio Petroni · Vladimir Karpukhin · Naman Goyal · Heinrich Küttler · Mike Lewis · Wen-tau Yih · Tim Rocktäschel · Sebastian Riedel · Douwe Kiela
- 2019 Poster: Hierarchical Decision Making by Generating and Following Natural Language Instructions
  Hengyuan Hu · Denis Yarats · Qucheng Gong · Yuandong Tian · Mike Lewis
- 2017: End-to-end Learning for Broad Coverage Semantics: SRL, Coreference, and Beyond
  Luke Zettlemoyer
- 2008 Poster: Multi-Agent Filtering with Infinitely Nested Beliefs
  Luke Zettlemoyer · Brian Milch · Leslie Kaelbling
- 2008 Spotlight: Multi-Agent Filtering with Infinitely Nested Beliefs
  Luke Zettlemoyer · Brian Milch · Leslie Kaelbling