SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems
Alex Wang · Yada Pruksachatkun · Nikita Nangia · Amanpreet Singh · Julian Michael · Felix Hill · Omer Levy · Samuel Bowman

Wed Dec 11 04:55 PM -- 05:00 PM (PST) @ West Ballroom A + B

In the last year, new models and methods for pretraining and transfer learning have driven striking performance improvements across a range of language understanding tasks. The GLUE benchmark, introduced a little over one year ago, offers a single-number metric that summarizes progress on a diverse set of such tasks, but performance on the benchmark has recently surpassed the level of non-expert humans, suggesting limited headroom for further research. In this paper we present SuperGLUE, a new benchmark styled after GLUE with a new set of more difficult language understanding tasks, a software toolkit, and a public leaderboard. SuperGLUE is available at https://super.gluebenchmark.com.

Author Information

Alex Wang (New York University)
Yada Pruksachatkun (New York University)
Nikita Nangia (New York University)
Amanpreet Singh (Facebook)
Julian Michael (University of Washington)
Felix Hill (Google DeepMind)
Omer Levy (Facebook AI Research)
Samuel Bowman (New York University)
